Approach to improving the performance of software processes for processing and storing large volumes of geomagnetic data | Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaja tehnika i informatika – Tomsk State University Journal of Control and Computer Science. 2020. № 50. DOI: 10.17223/19988605/50/3

Approach to improving the performance of software processes for processing and storing large volumes of geomagnetic data

This paper discusses how to increase the computational speed of software processes for the analytical processing of large volumes of geomagnetic data, which result from continuous monitoring of geomagnetic field parameters by a large number of distributed ground magnetic stations and observatories. A comparative review is given of the existing geomagnetic data architecture (the IAGA-2002 format specified by the International Association of Geomagnetism and Aeronomy) and of popular data formats, and arguments are presented for improving the approach to organizing the results of geomagnetic observations. To solve this problem, a new hybrid format for long-term storage of geomagnetic data is proposed. It consists of three interrelated components, uses referential integrity rules to combine the relational, hierarchical and columnar data models used to describe metadata and geomagnetic data, defines a POSIX-component addressing structure, and combines textual and binary formats for representing information. The main purpose of the proposed architecture is to increase the reactivity of software tools for analytical processing of geomagnetic data, on the one hand, and to reduce the amount of physical memory required, on the other. The proposed hybrid format is compared with the existing approach to describing geomagnetic observation data (IAGA-2002) and with other common formats for representing large volumes of structured and semi-structured data (XML, JSON, Avro, etc.). The criteria for evaluating the effectiveness of the hybrid format were the reactivity of software data processing and the amount of disk space required to store the data.
The experiment, conducted on sets of heterogeneous geomagnetic data, showed that the proposed format provides a significant increase in computing performance (about 4 times) and also significantly reduces the costs associated with physical storage of the data (approximately 5 times).
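
As a rough illustration of why combining a columnar layout with a binary representation can reduce storage compared with a line-oriented text layout, the following minimal Python sketch encodes the same synthetic one-minute samples both ways. It is not the authors' hybrid format and does not reproduce the exact IAGA-2002 layout: the sample values, the text line layout, the 4-byte header, the float32 precision and the function names are all assumptions made for this example. Timestamps are assumed to be implicit in the binary form (fixed start time and sampling step), which is a further simplification.

```python
import struct

# Synthetic one-minute samples for one day: (minute of day, X, Y, Z) in nT.
# These values are invented for illustration only.
samples = [
    (m, 15000.0 + 0.1 * m, -200.0 + 0.05 * m, 48000.0 - 0.02 * m)
    for m in range(1440)
]


def encode_text_rows(rows):
    """Row-oriented text: one fixed-width line per sample (hypothetical layout)."""
    lines = [
        f"2020-01-01 {m // 60:02d}:{m % 60:02d}:00.000 {x:9.2f} {y:9.2f} {z:9.2f}"
        for m, x, y, z in rows
    ]
    return ("\n".join(lines) + "\n").encode("ascii")


def encode_binary_columns(rows):
    """Column-oriented binary: a sample count, then X, Y, Z as float32 arrays."""
    n = len(rows)
    header = struct.pack("<I", n)  # 4-byte little-endian sample count
    columns = [struct.pack(f"<{n}f", *(r[i] for r in rows)) for i in (1, 2, 3)]
    return header + b"".join(columns)


def decode_column(blob, index):
    """Read a single component (0 = X, 1 = Y, 2 = Z) without parsing the others."""
    (n,) = struct.unpack_from("<I", blob, 0)
    offset = 4 + index * 4 * n  # skip the header and the preceding columns
    return list(struct.unpack_from(f"<{n}f", blob, offset))


text_blob = encode_text_rows(samples)
binary_blob = encode_binary_columns(samples)

print(len(text_blob), len(binary_blob))
print(decode_column(binary_blob, 2)[:3])
```

In this sketch a reader that needs only one component can jump straight to its column by offset, whereas the text form has to be parsed line by line. The actual gains reported in the abstract come from the authors' own format and experiment, not from this example.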


Keywords

big data, analytical processing, software reactivity, geomagnetic data

Authors

Name | Organization | E-mail
Vorobev Andrei V. | Ufa State Aviation Technical University | geomagnet@list.ru
Vorobeva Gulnara R. | Ufa State Aviation Technical University | gulnara.vorobeva@gmail.com
Total: 2
