METHODOLOGY AND MATHEMATICS OF KEY WORDS
Keyword selection is now a standard comparative procedure based on the analysis of two text corpora: a reference corpus and a target corpus. In this paper we review the different meanings the term “keyword” has had in the history of linguistics and then give a statistical definition of the term, one that presupposes corpus analysis. We go on to consider the linguistic and methodological potential of statistical support in the form of keywords. Turning to methodology, we examine in detail the mathematical procedures and models that underlie corpus analysis and the identification of important words in a text. They make the results reliable and allow a large body of language data to be analyzed, which was impossible in the pre-corpus era. Our linguistic database was processed with the corpus manager WordSmith Tools 6.0, a software package for the analysis of corpus texts. This software identifies keywords by means of the log-likelihood and chi-square tests. The relationship formulated by G. Zipf between the number of words in a corpus and their frequency shows why corpus methods matter for determining the relative frequency of words against a significance criterion. The main practical goal of the research is to show possible ways of using corpus statistics to select professionally relevant vocabulary for students majoring in economics and in social and political sciences. The article gives an example of compiling a specialized, professionally oriented corpus of 2 million word tokens for these majors. It also discusses the choice of reference corpus, for which we were able to use, for the first time, the text database of the BNC (British National Corpus), with its 100 million word tokens. This became possible because WordSmith Tools is compatible with the BNC.
The article highlights the considerable linguistic and didactic potential of computer language corpora, designed by the staff of the department and the university, for teaching a foreign language for professional purposes. Authentic corpus examples can be used, together with corpus tools, to compile a lexical minimum and linguistic and didactic materials. At the same time, the visual clarity of the resulting concordances makes possible the so-called ‘condensed reading’ of authentic usage, which leads to intensive acquisition of the most probable lexical and grammatical collocations and helps prevent interference. The paper concludes that corpus procedures are worth using in teaching and presents examples of their use in designing corpus-supported linguistic and teaching materials.
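As a rough illustration of the log-likelihood and chi-square criteria named in the abstract, the sketch below scores a single word from its frequency in a target and a reference corpus. It is a minimal sketch under stated assumptions, not the computation performed by WordSmith Tools: the function name `keyness`, the two-term form of the log-likelihood statistic, and the word frequencies in the usage lines are all assumptions made for illustration. Only the corpus sizes (2 million and 100 million tokens) come from the abstract.

```python
import math


def keyness(freq_target, size_target, freq_ref, size_ref):
    """Return (log_likelihood, chi_square) for one word.

    Hypothetical helper: compares a word's frequency in a target corpus
    with its frequency in a reference corpus.
    """
    total = size_target + size_ref
    freq_total = freq_target + freq_ref

    # Expected frequencies if the word were equally common in both corpora.
    exp_target = size_target * freq_total / total
    exp_ref = size_ref * freq_total / total

    # Two-term log-likelihood (an assumption: other variants sum over
    # all four cells of the contingency table).
    ll = 0.0
    for observed, expected in ((freq_target, exp_target), (freq_ref, exp_ref)):
        if observed > 0:
            ll += observed * math.log(observed / expected)
    ll *= 2

    # Pearson chi-square on the 2x2 table (word vs. all other tokens),
    # without continuity correction.
    a, b = freq_target, freq_ref
    c, d = size_target - freq_target, size_ref - freq_ref
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = total * (a * d - b * c) ** 2 / denom if denom else 0.0

    return ll, chi2


# Hypothetical frequencies; corpus sizes follow the abstract.
ll, chi2 = keyness(200, 2_000_000, 1000, 100_000_000)
print(f"log-likelihood = {ll:.2f}, chi-square = {chi2:.2f}")
```

With one degree of freedom, values above 3.84 correspond to p < 0.05, so a word whose scores exceed that threshold would be a keyword candidate. Both statistics are zero when the word has the same relative frequency in the two corpora.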
Keywords
corpus-based methodology, professionally oriented teaching, foreign language, chi-square procedure, log-likelihood, keywords, corpus-informed teaching, ESP teaching, second language acquisition, chi-square, log-likelihood test
Authors
Name | Organization | Email |
Gorina O.G. | National Research University “Higher School of Economics” | gorina@bk.ru |
METHODOLOGY AND MATHEMATICS OF KEY WORDS | Open and distance education. 2017. № 2(66). DOI: 10.17223/16095944/66/6