The Multi-Parameter Analysis of Linguistic Data in the Information System Semograf (On the Example of the Study of Social Network Users' Speech) | Vestnik Tomskogo gosudarstvennogo universiteta. Filologiya – Tomsk State University Journal of Philology. 2020. № 64. DOI: 10.17223/19986645/64/1


The aim of this article is to demonstrate the capabilities of the information system Semograph (http://semograph.org) as a tool for text content analysis within a network approach to the organization of linguistic research. Semograph can be used to analyze text data, create and annotate language and text corpora, conduct, process and analyze psycholinguistic and sociolinguistic experiments, develop classifiers and thesauri, and solve other problems that arise in the analysis of language material. Semograph implements the principles of a full research cycle, network distribution of research participants, multi-user operation and methodological pluralism. The possibilities of network organization of work in Semograph are illustrated by a multiparametric analysis of the speech behavior, social parameters and psychological characteristics of users of the social network VKontakte. The automatically collected material comprises 18,126 utterances by 340 users who completed the Big Five Inventory (BFI) psychological questionnaire, which measures the expression of five personality traits (extraversion vs. introversion, agreeableness vs. antagonism, conscientiousness vs. lack of direction, neuroticism vs. emotional stability, openness vs. closedness to experience). To analyze the text material, a multi-level hierarchical classifier was developed in which each expert linguist creates and develops a separate classification branch; the same material is thus considered by different experts from different points of view, yielding a multiparametric linguistic classification. This classification and user metadata (gender, psychological characteristics, etc.) provide the basis for constructing, by means of interactive visual analytics, a model of the interrelations between linguistic parameters of speech and socio-psychological characteristics of a person. The article demonstrates these interrelations using examples such as differences in the use of role and spatial deixis by extraverts and introverts, and in the use of abusive and obscene lexical units by users with strong closedness or openness to experience. The resulting model shows that the speech variability of texts stems from the interaction of the informants' psychological and gender characteristics rather than from the isolated effect of any one of these factors. Overall, the article demonstrates that Semograph makes it possible, on the one hand, to analyze large arrays of texts with linguistic and extralinguistic annotations and, on the other hand, to apply a network model of research organization; together these offer advantages in constructing models of fragments of linguistic and sociocultural reality.
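The kind of comparison described in the abstract (linguistic categories set against user metadata) can be sketched in a few lines of Python. This is only an illustrative sketch, not Semograph's actual data model or API: the function `category_shares`, the record layout, and the toy annotations and user attributes below are all hypothetical and are not taken from the study's data.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (user_id, classifier category assigned by an expert).
ANNOTATIONS = [
    ("u1", "role deixis"),
    ("u1", "role deixis"),
    ("u1", "spatial deixis"),
    ("u2", "spatial deixis"),
    ("u2", "spatial deixis"),
    ("u3", "role deixis"),
    ("u3", "spatial deixis"),
]

# Hypothetical user metadata (gender and a binned BFI trait score).
USERS = {
    "u1": {"gender": "f", "extraversion": "high"},
    "u2": {"gender": "m", "extraversion": "low"},
    "u3": {"gender": "f", "extraversion": "low"},
}


def category_shares(annotations, users, attributes):
    """Return, for each group of users defined by `attributes`,
    the share of each classifier category among that group's utterances."""
    counts = defaultdict(Counter)
    for user_id, category in annotations:
        # A group is the tuple of the user's values for the chosen attributes.
        group = tuple(users[user_id][attr] for attr in attributes)
        counts[group][category] += 1

    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {cat: n / total for cat, n in counter.items()}
    return shares


# One factor at a time, then two factors jointly.
by_trait = category_shares(ANNOTATIONS, USERS, ("extraversion",))
by_gender_and_trait = category_shares(ANNOTATIONS, USERS, ("gender", "extraversion"))

for group, dist in sorted(by_gender_and_trait.items()):
    print(group, dist)
```

Grouping by a tuple of attributes rather than a single one makes it possible to inspect combinations such as gender together with a trait, which is the sort of factor interaction the abstract refers to.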


Keywords

semantic graph modeling, visual analytics, multiparameter analysis, information system Semograph, social network services, network science

Authors

Name	Organization	E-mail
Belousov Konstantin I.	Perm State University	belousovki@gmail.com
Erofeeva Elena V.	Perm State University	elener-ofee@gmail.com
Baranov Dmitriy A.	Perm State University	baranov@semograph.com
Zelyanskaya Natalya L.	Perm State University	zelyanskaya@gmail.com
Shchebetenko Sergei A.	Higher School of Economics	shebetenko@rambler.ru

Total: 5
