The Herald of the Siberian State University of Telecommunications and Information Science

Solving the problem of rumors classification in online news

https://doi.org/10.55648/1998-6920-2025-19-3-122-138

Abstract

The article considers an approach, based on production rules, to classifying rumors in
online news. Unverified information published on news sites is a form of information noise
and can, in some cases, cause significant harm to readers.
The problem is non-trivial and relevant, and it has no standard solution.
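
To illustrate what a production-rule classifier can look like in general, the following is a minimal sketch in Python. It is not the authors' rule base: the rule names, the regular expressions, the English marker phrases and the labels `rumor` / `not_rumor` are all assumptions made for this example, and the first-match strategy is only one possible way to resolve rules.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A production rule: IF condition(text) THEN assign label."""
    name: str
    condition: Callable[[str], bool]
    label: str


# Hypothetical marker patterns (assumptions, not taken from the article).
UNNAMED_SOURCE = re.compile(
    r"\b(sources? (say|says|said|claim|claims)"
    r"|according to (rumou?rs|unnamed sources))\b",
    re.IGNORECASE,
)
HEDGING = re.compile(
    r"\b(allegedly|reportedly|rumou?red|unconfirmed)\b",
    re.IGNORECASE,
)

# Illustrative rule base; a real one would be built from linguistic analysis.
RULES: List[Rule] = [
    Rule("unnamed_source", lambda t: bool(UNNAMED_SOURCE.search(t)), "rumor"),
    Rule("hedging_marker", lambda t: bool(HEDGING.search(t)), "rumor"),
]


def classify(text: str, rules: List[Rule] = RULES, default: str = "not_rumor") -> str:
    """Return the label of the first rule whose condition holds, else the default."""
    for rule in rules:
        if rule.condition(text):
            return rule.label
    return default


if __name__ == "__main__":
    print(classify("Sources say the minister will resign next week."))
    print(classify("The ministry published its annual budget report on Monday."))
```

Keeping each rule as a named condition-label pair makes it possible to report which rule fired, which is one of the usual reasons for preferring production rules when the decision has to be explained.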

About the Authors

Alexander Dmitrievich Khudobin
Voronezh State University
Russian Federation

A second-year master's student at Voronezh State University, Faculty of Applied Mathematics, Informatics and Mechanics, majoring in Applied Informatics



Irina Evgenievna Voronina
Voronezh State University
Russian Federation
Doctor of Technical Sciences, Associate Professor, Professor of the Department of Software and Information Systems Administration, Voronezh State University



For citations:


Khudobin A.D., Voronina I.E. Solving the problem of rumors classification in online news. The Herald of the Siberian State University of Telecommunications and Information Science. 2025;19(3):122-138. (In Russ.) https://doi.org/10.55648/1998-6920-2025-19-3-122-138


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1998-6920 (Print)