WORD and low frequency vocabulary in dictionaries of the text editor | Vestnik Tomskogo gosudarstvennogo universiteta – Tomsk State University Journal. 2018. № 435. DOI: 10.17223/15617793/435/5

WORD and low frequency vocabulary in dictionaries of the text editor

This article examines problems in the computerized spell checking of Russian-language texts. The spell-checking engine built into the Microsoft Word® 2016™ text editor (2018 modification) is investigated and evaluated. It is shown that including some obsolete and low-frequency words in the internal (system) dictionaries sometimes not only fails to improve the speller's work but also causes errors and typos to be missed. Many flaws and gaps of previous MS Word versions have been fixed in MS Word 2016. Nevertheless, the computerized analysis of word agreement, both in phrases and in standalone word combinations, raises even more questions, especially when compared with the earlier Orfo™-based spellers. Even the detection of spelling errors (the most developed area of analysis) and the suggestion of possible corrections are still far from perfect. For words that the system does not recognize and therefore underlines, the spell checker's suggestion module offers possible corrections (no more than three options in the 2017 edition) or a revision of the phrase. It cannot always propose the normative spelling of a word, especially if that spelling differs from the underlined one by several letters. The list of options often includes the word split in two by a space, with no regard for whether the two resulting words agree with each other. The article gives examples of words that are used quite frequently in modern texts but are unknown to the WinWord system dictionary; such words should be accepted without remark rather than flagged as mistakes. At the same time, there is no reason to keep in the system dictionary rare, low-frequency short lexical units that coincide with the beginnings and endings of more commonly used words, because they may appear when a word is accidentally split by a space.
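The effect of short rare entries on accidental word splits can be illustrated with a minimal sketch. It does not reproduce MS Word's engine: the function name `undetected_splits` and the toy English word lists `COMMON` and `RARE` are assumptions made purely for illustration, standing in for the Russian system dictionary discussed above.

```python
# Toy illustration (not MS Word's algorithm): a dictionary-lookup speller
# accepts any token found in its word list, so an accidental space inside a
# word goes unnoticed whenever BOTH halves happen to be dictionary entries.

# Hypothetical word lists, assumed for this example only.
COMMON = {"to", "get", "her", "ether", "together"}
RARE = {"tog"}  # a rare short word that matches the beginning of "together"


def undetected_splits(word, dictionary):
    """Return every (left, right) split of `word` whose halves are both
    dictionary entries, i.e. splits a word-level speller would not flag."""
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in dictionary and word[i:] in dictionary
    ]


without_rare = undetected_splits("together", COMMON)
with_rare = undetected_splits("together", COMMON | RARE)
print(without_rare)
print(with_rare)
```

In this toy setting, every accidental split of "together" is caught while only the common words are listed, but adding the single rare entry "tog" makes the split "tog ether" pass silently. This is the kind of trade-off the abstract describes for rare short units in the system dictionary.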
The article contains examples of specially constructed phrases with errors: transposition of letters within a word, hyphaeresis (omission of a letter) or gemination (doubling of a letter), and the splitting or merging of words. All the words that result from these errors are present in, or generated by, the system dictionary. The word forms in these phrases do not agree with one another; however, MS Word is unable to detect syntax errors of this type. Similar phrases can also be used to test the spell checkers of other MS Word versions, both earlier and later ones. The author provides a list of rare words that MS Word accepts as correct despite a significant chance that each is a misspelling of a more commonly used word. It is advisable to remove some 'specific' rare words from the internal system dictionaries, or to deactivate them for the time being, until the spell checker has more information about the contexts in which these words can be used. Many of the flaws described by the author and by other Internet users have recently been eliminated from the MS Word text editor, but the content of its Russian system dictionary and the recommendations of the spell checker's suggestion module still leave many questions.
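The error types listed above can be used to generate test cases mechanically. The following is a minimal sketch under stated assumptions, not the author's procedure: the helper names `error_variants` and `silent_errors` and the toy English dictionary `TOY_DICTIONARY` are hypothetical. It produces the single-error variants of a word (adjacent transposition, omission, doubling) and keeps those that are themselves dictionary entries, which a word-level speller therefore cannot flag.

```python
# Toy sketch: generate one-error variants of a word and report the ones that
# are valid dictionary entries ("real-word" errors invisible to a speller
# that checks each token in isolation).

# Hypothetical dictionary, assumed for this example only.
TOY_DICTIONARY = {"form", "from", "for", "fort"}


def error_variants(word):
    """All strings obtained from `word` by exactly one typing error of the
    kinds considered here."""
    variants = set()
    for i in range(len(word) - 1):
        # transposition of two adjacent letters
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    for i in range(len(word)):
        # omission of one letter
        variants.add(word[:i] + word[i + 1:])
        # doubling of one letter
        variants.add(word[:i + 1] + word[i] + word[i + 1:])
    variants.discard(word)  # swapping identical letters reproduces the word
    return variants


def silent_errors(word, dictionary):
    """Variants of `word` that the dictionary accepts as correct words."""
    return sorted(v for v in error_variants(word) if v in dictionary)


print(silent_errors("form", TOY_DICTIONARY))
```

For "form", the transposition "from" and the omission "for" both survive a dictionary lookup, so only a check of agreement with neighbouring words could reveal them. Running the same idea against a real word list would give candidate phrases for testing a particular spell checker.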


Keywords

Microsoft Word, WinWord, computer spell checking, text editor, speller, obsolete vocabulary, spelling mistakes, normative spelling, archaisms, Russian language

Authors

Name | Organization | E-mail
Lavoshnikova Elina K. | Lomonosov Moscow State University | elavoshnikova@mail.ru
