
The Herald of the Siberian State University of Telecommunications and Information Science

Vol 19, No 4 (2025)
https://doi.org/10.55648/1998-6920-2025-19-4

Pages 18-27
Abstract

This article presents a study on the creation of an emotional lexicon of the Uzbek
language, taking into account its cultural and linguistic specificities. The emotional vocabulary
of Uzbek reflects a complex interplay between language, national identity, and collective
emotional experience. The purpose of the research is to develop a culturally sensitive
framework for identifying and categorizing emotional expressions in Uzbek across diverse
communicative contexts. The study analyzes a wide range of traditional sources, including
folklore, proverbs, and classical literature, alongside modern discourse materials such as media
texts, online forums, and social networks. Particular attention is given to dialectal variations and
culturally embedded connotations, such as the double meaning of words like “achchiq” (bitter
taste and emotional bitterness) and “suyanmoq” (physical leaning and emotional reliance).
Methodologically, the research integrates corpus linguistics, sociolinguistic surveys, and AI-
based sentiment analysis to ensure both empirical validity and cultural depth. As a result, the
study proposes a prototype of an Uzbek emotional lexicon that captures emotional polarity,
intensity, and contextual usage. The practical applications of this lexicon include improving
human-computer interaction (e.g., chatbots), enriching language learning tools, and supporting
sociolinguistic and affective computing research. The findings underscore the necessity of a
standardized, culturally informed affective lexicon in Uzbek for preserving linguistic richness
and enhancing emotional nuance in digital communication. This work contributes to the broader
field of Turkic affective linguistics and emphasizes the importance of integrating cultural
semantics into computational models.
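As a rough illustration of the kind of entry such a lexicon could hold, the sketch below models one headword with polarity, intensity, and context-dependent senses; the field names and numeric values are assumptions for illustration, not the authors' actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class EmotionEntry:
    """One hypothetical entry of an Uzbek emotional lexicon."""
    lemma: str              # headword (Latin-script Uzbek)
    polarity: float         # -1.0 (negative) ... +1.0 (positive)
    intensity: float        # 0.0 (weak) ... 1.0 (strong)
    senses: dict = field(default_factory=dict)  # context label -> gloss

# "achchiq" carries both a literal and an emotional sense, so the entry
# records context-dependent glosses alongside an overall polarity score.
achchiq = EmotionEntry(
    lemma="achchiq",
    polarity=-0.6,
    intensity=0.7,
    senses={"taste": "bitter (literal)", "emotion": "bitterness, resentment"},
)
print(achchiq.lemma, achchiq.polarity, achchiq.senses["emotion"])
```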

Pages 3-17
Abstract

This paper presents a comparative analysis of machine learning (SVM), deep learning
(LSTM), and transformer-based (BERT) models for sentiment classification in Uzbek texts,
enhanced by Named Entity Recognition (NER). The study addresses the challenge of accurately
detecting sentiment in morphologically complex languages with limited resources, focusing on
Uzbek, a Turkic language with rich agglutinative structures. A dataset of 10,000 user-generated
comments from social platforms was annotated using a hybrid approach: manual labeling for
sentiment (positive, negative, neutral) and a CRF-based NER system to identify entities (e.g.,
brands, locations, public figures). The integration of NER features aimed to resolve contextual
ambiguities, such as distinguishing between "I love Samarkand’s history" (positive) and
"Samarkand’s traffic is unbearable" (negative). Experimental results demonstrate that BERT,
fine-tuned on Uzbek text, achieved the highest accuracy (90.2%) by leveraging contextualized
embeddings to align entities with sentiment. LSTM showed competitive performance (85.1%)
in sequential pattern learning but required extensive training data. SVM, while computationally
efficient, lagged at 78.3% accuracy due to its inability to capture nuanced linguistic
dependencies. The findings emphasize the critical role of NER in low-resource languages for
disambiguating sentiment triggers and propose practical guidelines for deploying BERT in real-
world applications, such as customer feedback analysis. Limitations, including data scarcity and
computational costs, are discussed to inform future research on optimizing lightweight models
for Uzbek NLP tasks.
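A minimal sketch of the transformer-based part of this setup is given below. The checkpoint name is a multilingual placeholder (the study's own fine-tuned Uzbek model is not specified here), the three labels follow the annotation scheme mentioned above, and the NER step is indicated only in comments.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; in the described setup a model fine-tuned on Uzbek
# text would be loaded instead.
MODEL = "bert-base-multilingual-cased"
LABELS = ["negative", "neutral", "positive"]

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=len(LABELS))
model.eval()

def classify(text: str) -> str:
    """Return a sentiment label for a single user comment."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

# Entity spans found by a separate NER step (e.g. "Samarkand") can be marked
# up in the input so the classifier sees which entity the sentiment targets.
print(classify("Samarkand's traffic is unbearable"))  # real inputs would be Uzbek comments
```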

Pages 28-47
Abstract

The relevance of integrated data quality management tasks is increasing in the context
of growing volume, variety, and criticality of data used. Despite this, organizations still have
significant gaps in understanding the interconnections between data quality, process quality, and
information systems. The purpose of this study is to systematically analyze existing
methodologies and concepts for data quality management and to identify key
challenges in their practical implementation. The paper presents a review of scientific literature,
outlines standard elements and a framework for an integrated approach to data quality. A high-
level data flow scheme based on the Data Lakehouse architecture has been developed, reflecting
the interaction of system components. The necessity of developing new methods and algorithms
for optimizing big data quality, which go beyond traditional paradigms focused on structured
data, is substantiated. Key problems that are often overlooked in practice are identified and
systematized, and criteria for the successful implementation of an integrated data quality
management approach are formulated.
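To make the notion of an integrated quality check more concrete, here is a minimal sketch of a rule-based quality gate over a batch of records; the chosen metrics (completeness, key uniqueness) and the data layout are illustrative assumptions, not elements of the reviewed methodologies.

```python
def quality_report(records, required_fields, key_field):
    """Compute simple data-quality metrics for a batch of records."""
    total = len(records)
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    unique_keys = len({r.get(key_field) for r in records})
    return {
        "completeness": complete / total if total else 0.0,  # share of fully filled records
        "uniqueness": unique_keys / total if total else 0.0,  # share of distinct keys
    }

rows = [
    {"id": 1, "name": "sensor-a", "value": 3.2},
    {"id": 2, "name": "", "value": 1.7},          # incomplete record
    {"id": 2, "name": "sensor-b", "value": 0.9},  # duplicate key
]
print(quality_report(rows, required_fields=("name", "value"), key_field="id"))
```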

Pages 48-62
Abstract

The problem of studying the plane stress state by the photoelasticity method is
considered. The technique is based on solving equilibrium equations. The boundary conditions
for them are set based on the recorded interference pattern. A uniform grid is applied to it. For
each node, the order of the interference band to which it belongs is determined. To date, the
problem of automating this procedure has not been fully solved. To address it, an algorithm
has been developed and implemented in software that determines whether a node belongs to a
band based on the color of its surrounding area. The algorithm is based on
testing the statistical hypothesis that the sample belongs to a given distribution according to the
Pearson criterion. To do this, the brightness histograms in all three color channels of the band of
each order are quantitatively compared with the corresponding histograms constructed for the
area in the vicinity of the node under consideration. The application of the method to the data
taken at the PPU-7 installation has shown its effectiveness. In particular, the following results
were obtained for interference patterns from simple objects (disk, plate). Of the 208 points
(nodes of the rectangular grid) for which the interference band was determined, approximately
95% were correctly classified. Moreover, in some cases there were no incorrectly classified
pixels at all. Pixels left unclassified, because the hypothesis that the chromaticity of their
neighborhood matched the color gamut of any of the bands was rejected, accounted for 5-10%.
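A minimal sketch of the described classification step is given below: the per-channel brightness histograms of a node's neighborhood are compared with the reference histograms of each interference band using Pearson's chi-square statistic, and the node is assigned to the best-fitting band or left unclassified if every hypothesis is rejected. The bin count, significance level, and the acceptance rule across channels are simplifying assumptions for illustration.

```python
import numpy as np
from scipy.stats import chi2

BINS = 16      # brightness bins per color channel (assumption)
ALPHA = 0.05   # significance level (assumption)

def channel_hists(pixels):
    """Per-channel brightness histograms of an (N, 3) RGB pixel array."""
    pixels = np.asarray(pixels)
    return [np.histogram(pixels[:, c], bins=BINS, range=(0, 256))[0] for c in range(3)]

def pearson_stat(observed, expected):
    """Chi-square statistic of an observed histogram against a scaled reference."""
    expected = expected.astype(float) * observed.sum() / max(expected.sum(), 1)
    mask = expected > 0
    return float(((observed[mask] - expected[mask]) ** 2 / expected[mask]).sum())

def classify_node(neigh_pixels, band_pixels_by_order):
    """Return the band order whose color distribution fits the neighborhood, or None."""
    obs = channel_hists(neigh_pixels)
    critical = chi2.ppf(1 - ALPHA, df=BINS - 1)
    best_order, best_stat = None, float("inf")
    for order, ref_pixels in band_pixels_by_order.items():
        ref = channel_hists(ref_pixels)
        stat = sum(pearson_stat(o, r) for o, r in zip(obs, ref))
        # Simplified acceptance rule: the summed statistic over the three
        # channels must stay below three per-channel critical values.
        if stat < 3 * critical and stat < best_stat:
            best_order, best_stat = order, stat
    return best_order
```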

Pages 63-91
Abstract

This work is part of a series of articles reflecting the ATMO model (analysis of
territorial multisectoral objects). Here, the previously described product and financial
components of this model, implementing the Leontief input-output scheme, are complemented
by the trade balance model. For each sector of the region, this block shows the share of
supplies of manufactured products and of transit imports within a given territorial system.
These supplies are backed by existing product reserves. These commodity flows are then
balanced by supplies to other regions outside this system and by exports abroad. A special feature
of the model is that it simultaneously accounts for the incoming flows of domestic and
foreign products into the modeled region from the cluster under consideration, thereby forming an
import indicator for that region. This value is used in the production balance block, which is also
included in the ATMO model, thereby closing the complex of blocks included in it. In the trade
balance block, commodity flows to other regions are determined by supply (transit) coefficients,
according to which exports (imports) from a particular region are distributed into a cluster of
regions for a particular sector. They are constructed as shares of the distribution of supplies of
regional and foreign-made products across the regions of the corresponding cluster, based on a
retrospective analysis of similar supplies in the past. Expressions are constructed to reflect the
relationship between investments in transport infrastructure and the costs of inter-regional
supplies of goods and equipment. The targets of this block define the development of the
regional system toward increasing exports of products produced in the regions. This development,
in turn, rests on the conditions for industrial growth created by the growing demand for
deliveries of products outside each region and for export.
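As an illustrative reading of these supply (transit) coefficients, the relations below express them as retrospective shares and use them to distribute a region's outflow across the cluster; the notation is introduced here for clarity and is not the ATMO model's own.

```latex
\[
  a_{i}^{r \to q} \;=\; \frac{\bar{S}_{i}^{\,r \to q}}{\sum_{q' \in K} \bar{S}_{i}^{\,r \to q'}},
  \qquad
  S_{i}^{r \to q}(t) \;=\; a_{i}^{r \to q}\, E_{i}^{r}(t),
\]
where $\bar{S}_{i}^{\,r \to q}$ is the retrospectively observed supply of sector-$i$ products
from region $r$ to region $q$ of the cluster $K$, $a_{i}^{r \to q}$ is the resulting supply
coefficient (the shares over $K$ sum to one), and $E_{i}^{r}(t)$ is the total outflow of
sector-$i$ products from region $r$ to be distributed across the cluster in period $t$.
```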

Pages 92-109
Abstract

The article presents a study of errors arising in non-contact measurement of fuel assembly head height variations using the parallax-shift method under conditions of arc-shaped motion of a television camera. The primary sources of error are examined, including radiation-induced and noise-related distortions of the video stream, contour detection inaccuracies,
subpixel deviations of circle centers, geometric imperfections of camera motion, and variations in camera orientation. A model describing the influence of these factors on the final height measurement accuracy is developed. The sensitivity of height estimations to key system
parameters is analyzed. The study demonstrates that the method maintains metrological reliability under typical operational deviations characteristic of nuclear power plant environments.
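For intuition only, the sketch below uses the plain pinhole, linear-translation form of the parallax relation and its first-order sensitivity to a sub-pixel disparity error; the paper's actual arc-shaped camera motion and error model are more involved, and all numbers here are assumptions.

```python
def range_from_parallax(f_px: float, baseline_mm: float, disparity_px: float) -> float:
    """Distance to a feature from its pixel shift between two camera positions (pinhole model)."""
    return f_px * baseline_mm / disparity_px

def range_error(f_px: float, baseline_mm: float, disparity_px: float, disparity_err_px: float) -> float:
    """First-order range error caused by a disparity error: |dZ| = Z * |dd| / d."""
    z = range_from_parallax(f_px, baseline_mm, disparity_px)
    return z * disparity_err_px / disparity_px

z = range_from_parallax(f_px=2000.0, baseline_mm=150.0, disparity_px=40.0)
dz = range_error(2000.0, 150.0, 40.0, 0.25)   # effect of a 0.25 px subpixel deviation
print(round(z, 1), "mm distance;", round(dz, 2), "mm error")
```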

Pages 110-119
Abstract

As the complexity of information systems increases and the requirements for
information security become more stringent, there arises a need not only for the collection and
analysis of monitoring data but also for ensuring their immutability over time. Traditional
monitoring systems do not provide protection against unauthorized modifications of metric
history, which limits their applicability in contexts that require transparent auditing. This work
proposes to address this issue using blockchain technology. The proposed approach involves
daily extraction of key monitoring metrics for the previous day, followed by hashing of the
aggregated data. The resulting hash is transmitted to a smart contract deployed on the public
Ethereum blockchain, where it is stored along with a timestamp. For verification, a software
module was developed that retrieves data from the monitoring system and compares their hash
with the one stored on the blockchain. A prototype implementation has achieved full automation
of the process of recording and subsequently verifying aggregated metrics. The verification
procedure reliably detects any data tampering by comparing hashes. While the proposed method
does not eliminate the risk of recording tampered data in the event of a trusted party being
compromised, it does ensure a transparent and immutable history of recordings, allowing for
retrospective detection of violations. A key advantage is its independence from the
organization's internal infrastructure and the ability to perform verification using open tools.
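A minimal sketch of the daily hashing and verification steps described above is given below; the metric layout is an assumption, and the interaction with the Ethereum smart contract is only indicated in comments rather than reproduced.

```python
import hashlib
import json

def daily_hash(metrics: dict) -> str:
    """Deterministic SHA-256 of the aggregated monitoring metrics for one day."""
    canonical = json.dumps(metrics, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def verify(metrics: dict, stored_hash: str) -> bool:
    """Recompute the hash from the monitoring system and compare it with the on-chain value."""
    return daily_hash(metrics) == stored_hash

# Aggregated metrics for the previous day (illustrative values).
yesterday = {"date": "2025-10-14", "cpu_avg": 41.7, "alerts": 3, "uptime_pct": 99.98}

h = daily_hash(yesterday)
# In the described scheme, h is sent to a smart contract on the public Ethereum
# blockchain together with a timestamp; verification later retrieves that stored
# hash and calls verify() against freshly exported monitoring data.
print(h, verify(yesterday, h))
```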



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1998-6920 (Print)