An approach to recognizing named entities, using technological terms as an example, on a limited training sample
The paper addresses the problem of named entity recognition, using technological terms as an example; a named entity is a word or phrase denoting an object or phenomenon of a certain category. Automatic recognition of technological terms allows companies to optimize business processes. Recognizing named entities on a limited training sample is a non-trivial task. The current standard approaches to named entity recognition are conditional random fields (CRF) and bidirectional long short-term memory networks (Bi-LSTM). The paper proposes an approach that combines a statistical model (CRF) with a neural network model (Bi-LSTM-CRF). The main advantage of the CRF stage is that, at the cost of only a slight increase in training time, it provides additional information to the subsequent Bi-LSTM-CRF model, allowing it to learn more effectively from a limited sample. Two techniques are used to map text into a feature space: extracting syntactic properties of words for the statistical model, and converting text into vectors with the SciBERT language model. The work demonstrates a significant improvement in the quality of technological term recognition, achieved by combining statistical and neural network machine learning models and by using a domain-oriented language model for the vector representation of scientific texts. This improved recognition quality, measured by the F1-score metric, by 12% when training on 800 texts compared with the traditional approach.
Keywords
technology term recognition,
named entity recognition,
model combination,
Bi-LSTM (bidirectional long short-term memory),
CRF (conditional random field)
Authors
Kulnevich Alexey Dmitrievich | National Research Tomsk State University | kulnevich94@mail.ru |
Koshechkin Alexander Alekseevich | National Research Tomsk State University | kaa1994g@mail.ru |
Karev Svyatoslav Vasilyevich | National Research Tomsk State University | svyatoslav.karev@live.ru |
Zamyatin Alexander Vladimirovich | National Research Tomsk State University | avzamyatin@inbox.ru |
References
Nadeau D., Sekine S. A survey of named entity recognition and classification // Lingvisticae Investigationes. 2007. V. 30, № 1. P. 3-26.
Marrero M. et al. Named entity recognition: fallacies, challenges and opportunities // Computer Standards & Interfaces. 2013. V. 35, № 5. P. 482-489.
Korkontzelos I. et al. Boosting drug named entity recognition using an aggregate classifier // Artificial Intelligence in Medicine. 2015. V. 65, № 2. P. 145-153.
Lafferty J., McCallum A., Pereira F.C.N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. 2001. URL: https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers
Schuster M., Paliwal K.K. Bidirectional recurrent neural networks // IEEE Transactions on Signal Processing. 1997. V. 45, № 11. P. 2673-2681.
Li J., Sun A., Han J., Li C. A Survey on Deep Learning for Named Entity Recognition // IEEE Transactions on Knowledge and Data Engineering. 2020. DOI: 10.1109/TKDE.2020.2981314
Hossari M., Dev S., Kelleher J.D. TEST: A Terminology Extraction System for Technology Related Terms // ICCAE 2019, February 23-25, 2019. URL: https://arxiv.org/pdf/1812.09541.pdf
Chiu J.P.C., Nichols E. Named Entity Recognition with Bidirectional LSTM-CNNs // arXiv preprint arXiv:1511.08308. 2016. URL: https://arxiv.org/pdf/1511.08308.pdf
Pennington J., Socher R., Manning C.D. GloVe: Global Vectors for Word Representation // Proc. of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014. P. 1532-1543. URL: https://nlp.stanford.edu/pubs/glove.pdf
Wang S., Zhou W., Jiang C. A survey of word embeddings based on deep learning // Computing. 2020. V. 102, № 3. P. 717-740.
Wang Y. et al. From static to dynamic word representations: a survey // International Journal of Machine Learning and Cybernetics. 2020. V. 11, № 4. P. 1-20.
Devlin J. et al. BERT: Pre-training of deep bidirectional transformers for language understanding // arXiv preprint arXiv:1810.04805. 2018. URL: https://arxiv.org/pdf/1810.04805.pdf
Beltagy I., Lo K., Cohan A. SciBERT: a pretrained language model for scientific text // Proc. of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019. DOI: 10.18653/v1/D19-1371
Tenney I. et al. What do you learn from context? Probing for sentence structure in contextualized word representations // arXiv preprint arXiv:1905.06316. 2019. URL: https://arxiv.org/pdf/1905.06316.pdf
Tenney I., Das D., Pavlick E. BERT rediscovers the classical NLP pipeline // Proc. of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. P. 4593-4601.
Huang Z., Xu W., Yu K. Bidirectional LSTM-CRF models for sequence tagging // arXiv preprint arXiv:1508.01991. 2015. URL: https://arxiv.org/pdf/1508.01991.pdf
Vaswani A. et al. Attention is all you need // arXiv preprint arXiv:1706.03762. 2017. URL: https://arxiv.org/pdf/1706.03762.pdf
arXiv.org: a service for the free distribution of articles in physics, mathematics, computer science, and other fields. URL: https://arxiv.org/ (accessed: 22.10.2020).
Stenetorp P. et al. BRAT: a web-based tool for NLP-assisted text annotation // Proc. of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics. 2012. P. 102-107.