Sentiment Analysis of Short Russian Texts in Social Media | Vestnik Tomskogo gosudarstvennogo universiteta. Ekonomika – Tomsk State University Journal of Economics. 2019. № 47. DOI: 10.17223/19988648/47/17

Sentiment Analysis of Short Russian Texts in Social Media

The rapid growth in the popularity of social media (Twitter, Facebook, etc.) has increased interest in the problem of sentiment analysis. Sentiment analysis is the automatic extraction of the emotional component of a text, e.g., the emotional evaluation of the topics, objects, or events under consideration. The volume of accumulated data and the rate at which new data arrive make manual analysis impractical for interested people and companies, so developing tools that extract the relevant data is an important task. In this study, the authors propose an approach to sentiment analysis of short Russian texts based on vector representations. The study used a self-prepared corpus of 112 thousand short Russian texts, labeled using markers. Three algorithms were compared: decision tree, multilayer perceptron, and logistic regression. The best model achieves a classification accuracy of 76.2%, which is a high quality indicator for the sentiment analysis task and allows the approach to be used in marketing research or in monitoring audience loyalty to a particular topic or brand.
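The marker-based labeling mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which markers were used, so the emoticon sets below are purely an assumption, and the function names `label_by_markers` and `build_corpus` are hypothetical. The sketch assigns a label only when a text contains markers of exactly one polarity, and strips the markers from the text so that a classifier trained later cannot simply memorize them.

```python
# Hypothetical marker sets: the abstract does not specify the actual markers.
POSITIVE_MARKERS = (":)", ":-)", "=)", ":D")
NEGATIVE_MARKERS = (":(", ":-(", "=(")


def label_by_markers(text):
    """Return (clean_text, label), or None if markers are absent or conflict."""
    has_pos = any(m in text for m in POSITIVE_MARKERS)
    has_neg = any(m in text for m in NEGATIVE_MARKERS)
    if has_pos == has_neg:  # no marker at all, or both polarities present
        return None
    # Remove the markers so they do not leak the label into the features.
    clean = text
    for m in POSITIVE_MARKERS + NEGATIVE_MARKERS:
        clean = clean.replace(m, " ")
    clean = " ".join(clean.split())  # collapse leftover whitespace
    return clean, ("positive" if has_pos else "negative")


def build_corpus(texts):
    """Keep only the texts that can be labeled unambiguously."""
    labeled = (label_by_markers(t) for t in texts)
    return [item for item in labeled if item is not None]
```

Under these assumptions, `build_corpus` drops texts that carry no marker or markers of both polarities, which is one simple way to keep the automatically labeled corpus unambiguous.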

Keywords

sentiment analysis, natural language processing, machine learning, supervised learning, unsupervised learning, data analysis

Authors

Name | Organization | E-mail
Bogdanov Aleksandr L. | Tomsk State University | bogdanov.al@mail.tsu.ru
Dulya Ivan S. | Tomsk State University | idulya7@gmail.com
Total: 2
