Abstract
This article focuses on the development of deep learning models and algorithms specifically designed for Uzbek language processing in the IT field. A comprehensive approach involving data collection, preprocessing, model selection, and evaluation was employed. Experiments were conducted with RNN, LSTM, and transformer-based models such as BERT and GPT, with the transformer models yielding superior results. Key challenges included limited datasets and the complex morphological structure of Uzbek. The findings suggest that fine-tuned transformer models, especially when combined with language-specific preprocessing, can significantly improve performance on language understanding tasks for low-resource languages.
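To make the described approach concrete, the following is a minimal sketch of fine-tuning a pretrained multilingual transformer on an Uzbek text-classification task, in the spirit of the abstract. It is not the authors' exact pipeline: the checkpoint `bert-base-multilingual-cased`, the toy labeled sentences, and the binary-classification setup are all illustrative assumptions, since the article does not specify a checkpoint, task, or dataset.

```python
# A minimal sketch, not the article's exact pipeline: fine-tuning a pretrained
# multilingual transformer on Uzbek text. The checkpoint and the toy data below
# are assumptions for illustration only.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-multilingual-cased"  # assumed; the article names no checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Hypothetical labeled Uzbek sentences; real work would use a curated corpus.
texts = ["Bu kitob juda qiziqarli.", "Xizmat sifati yomon edi."]
labels = [1, 0]

# Subword tokenization helps with Uzbek's agglutinative morphology: rare
# inflected word forms are split into smaller pieces the model has seen before.
enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

class UzbekDataset(torch.utils.data.Dataset):
    """Wraps tokenized inputs and labels for the Trainer API."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item
    def __len__(self):
        return len(self.labels)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="uzbek-finetune",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=UzbekDataset(enc, labels),
)
trainer.train()
```

The same pattern extends to other tasks (token classification, machine translation) by swapping the model head; the subword tokenization step is the language-specific preprocessing the abstract highlights for morphologically rich, low-resource languages.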
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
References
- Ahmadaliyev, S. (2020). Uzbek Language Processing: Challenges and Advances.
- Karimov, N. (2018). Morphological Complexity of the Uzbek Language.
- Rakhmonov, I. (2019). Neural Networks for Low-Resource Languages: The Case of Uzbek.
- Tursunov, D. (2017). Tokenization Strategies for Agglutinative Languages.
- Muminov, A. (2021). Machine Translation and Deep Learning for Uzbek Language Processing.
- Vaswani, A., et al. (2017). Attention Is All You Need.
- Devlin, J., et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
- Bojanowski, P., et al. (2017). Enriching Word Vectors with Subword Information.
- Tiedemann, J. (2012). Parallel Data, Tools and Interfaces in OPUS.
- Cho, K., et al. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
- Mikolov, T., et al. (2013). Efficient Estimation of Word Representations in Vector Space.
- Peters, M. E., et al. (2018). Deep Contextualized Word Representations.