Authorship identification of the text on the lexical level (formal-quantitative model)
The article develops a way to characterize the individuality of an author's speech: a formal-quantitative model for identifying a personal text on the lexical level. A personal text contains language characteristics, that is, variants of the behaviour of a linguistic personality, which can be expressed through quantitative indicators. In the present article the lexical level is considered in terms of the frequency of lexeme usage in the personal text. Qualitative interpretation of the formal-quantitative data yields specific indicators of how lexemes are distributed in speech and reveals the variety of preferences of different linguistic personalities, which can help to determine the author of a text. The article thus sets out the types of language personal identification. The following hypothesis is formulated: the text contains individual lexical characteristics which, expressed through contrasting qualitative data, can serve as identifiers of the text. At the first stage the suggested model is presented in two variants, which rest on different interpretations of the term rank. The working formulae are $\Delta R_r = R_{1F} - R_{2F}$ (the rank is understood as the rank of a group of words) and $\Delta R = R_1 - R_2$ (the rank of each individual word). Every word is analysed by means of these formulae within four frequency-contrastive tables. A diagram is built from the comparison of lexeme ranks in each frequency-contrastive table, and a conclusion on lexeme frequency is drawn. Both approaches have proved usable in the attribution of texts. The conditions described (the genre of the text and the approximate number of word usages) must be taken into account, because no other conditions have been verified. At the second stage the vocabulary of the texts is compared with a modern frequency dictionary of the Russian language.
The dictionary serves as an "absolute" index of word distribution: it shows which words are closer to this index and are therefore more "standard", and which words deviate from it. The study addresses a theoretical and methodological problem of text identification (language personal analysis), a problem directly relevant to various kinds of expert activity. Language personal analysis becomes the methodological basis for identification examination. Within forensic linguistics this work is analogous to linguistic identification examination, whose purpose is to establish the identity of objects. The work has identified some regularities in the lexical-quantitative structures of texts belonging to the same author or to different authors. This makes it possible to use the described methods of personal text identification in authorship examination.
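The two readings of rank described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the tokenizer, the alphabetical tie-breaking, the restriction to words shared by both texts, the toy English sentences, and all function names are assumptions made only for illustration. It computes the rank difference (first-text rank minus second-text rank) once with an individual rank for each word and once with a shared rank for each group of equally frequent words.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Count lowercase word tokens (a simplistic tokenizer, assumed here)."""
    return Counter(re.findall(r"\w+", text.lower()))


def word_ranks(freqs):
    """Rank of each word: 1 for the most frequent word, 2 for the next, and so on.
    Ties are broken alphabetically (an assumption, not taken from the article)."""
    ordered = sorted(freqs, key=lambda w: (-freqs[w], w))
    return {w: i for i, w in enumerate(ordered, start=1)}


def group_ranks(freqs):
    """Rank of a group of words: all words with the same frequency share one rank."""
    distinct = sorted(set(freqs.values()), reverse=True)
    rank_of_freq = {f: i for i, f in enumerate(distinct, start=1)}
    return {w: rank_of_freq[f] for w, f in freqs.items()}


def rank_differences(ranks1, ranks2):
    """Rank in text 1 minus rank in text 2, for every word present in both texts."""
    common = ranks1.keys() & ranks2.keys()
    return {w: ranks1[w] - ranks2[w] for w in sorted(common)}


# Toy texts, purely hypothetical.
text1 = "the cat saw the dog and the dog saw the cat"
text2 = "a dog and a cat and a bird saw the dog"

f1, f2 = word_frequencies(text1), word_frequencies(text2)
delta_word = rank_differences(word_ranks(f1), word_ranks(f2))
delta_group = rank_differences(group_ranks(f1), group_ranks(f2))

print(delta_word)
print(delta_group)
```

Under the same assumptions, the second stage could reuse `rank_differences` by passing ranks taken from a reference frequency dictionary as the second argument, so that values near zero would mark words used close to the "standard" distribution.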
Keywords
forensic linguistics, quantitative linguistics, formal-quantitative model of text identification, personal text, language personal analysis, identification linguistics
Authors
| Name | Organization | Email |
| Napreenko Galina V. | Kemerovo State University | vila1991@mail.ru |
References
Authorship identification of the text on the lexical level (formal-quantitative model) | Vestnik Tomskogo gosudarstvennogo universiteta – Tomsk State University Journal. 2014. № 379. DOI: 10.17223/15617793/379/3