The choice of technology for parallelization of the numerical solution of the convective-diffusion equation on a hybrid multiprocessor computer system
This work compares three technologies for building parallel algorithms, using the numerical solution of the nonstationary 3D convection-diffusion equation as an example. This equation underlies many models in continuum mechanics, hydrometeorology, ecology, and other fields. In operational modeling of atmospheric or hydrological processes, forecast results must be obtained in the shortest possible computing time. Multiprocessor and multicore hardware offers a substantial opportunity to speed up the numerical solution. The aim of this work is to carry out computational experiments on a hybrid computing system in order to determine which parallel computing technologies (MPI, OpenMP, OpenACC) are most promising for producing efficient parallel programs that solve the nonstationary 3D convection-diffusion equation. For MPI, the main parallelization approach for distributed-memory computer systems, a two-dimensional decomposition of the grid domain into subdomains was chosen. Calculations on the TSU Cyberia computing cluster showed that this method parallelizes the explicit-implicit computational algorithm with sufficiently high efficiency. OpenMP (Open Multi-Processing) technology is used for multiprocessor (multicore) computer systems with shared memory. The results showed that the OpenMP program is inferior to the MPI program in terms of speedup; in absolute terms, however, the run time of the OpenMP program is slightly shorter for each parallel run (the difference decreases as the number of processes increases).
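The abstract does not say which two directions are split or how grid nodes are assigned to processes. The following is a minimal sketch of one common arrangement, assuming the x and y directions are split over a px-by-py process grid while each process keeps the full z extent. The function names, the row-major rank numbering, and the grid size are illustrative assumptions, and no MPI calls are made; the sketch only shows the index arithmetic.

```python
def block_range(n, parts, idx):
    """Half-open range [start, stop) of block `idx` when n grid points are
    split into `parts` contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    start = idx * base + min(idx, extra)
    stop = start + base + (1 if idx < extra else 0)
    return start, stop


def subdomain(rank, px, py, nx, ny, nz):
    """Index ranges (x, y, z) owned by `rank` on a px-by-py process grid.

    Assumption (not from the source): ranks are numbered row-major and only
    x and y are decomposed, so every process holds whole columns in z.
    """
    if not 0 <= rank < px * py:
        raise ValueError("rank outside the process grid")
    ix, iy = divmod(rank, py)
    return block_range(nx, px, ix), block_range(ny, py, iy), (0, nz)


if __name__ == "__main__":
    # Hypothetical 128 x 128 x 128 grid on a 16 x 8 grid of 128 processes.
    for rank in (0, 1, 127):
        print(rank, subdomain(rank, 16, 8, 128, 128, 128))
```

In an MPI program each process would additionally exchange one or more layers of boundary values with its neighbors in x and y at every time step; that communication is omitted here.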
The parallel program built with OpenACC technology ran in about 3 seconds on one computing node of the TSU Cyberia cluster, which is significantly shorter than the execution time of the sequential program and several times shorter than that of the MPI and OpenMP programs with 12 or 16 processes on the cluster's computing nodes. Calculations on structured difference grids with more than 2 million nodes showed that the highest-performing program for the problem in question is obtained with MPI technology: on 128 processes (10-20 CPUs connected by a computer network), it runs almost 90 times faster than the sequential program. More "low-end" options are OpenACC technology for NVIDIA graphics processors (1 CPU + 1 GPU on one computing node), which speeds up calculations almost 30 times under the same conditions, and OpenMP technology (2 CPUs on one computing node), which speeds them up 10 times.
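The speedup figures above can be related to parallel efficiency through the standard definitions, where $T_1$ is the run time of the sequential program and $T_p$ is the run time on $p$ processes. The abstract does not report efficiency itself, so the value below is only a rough estimate derived from the quoted speedup of almost 90 on 128 processes:

```latex
S_p = \frac{T_1}{T_p}, \qquad
E_p = \frac{S_p}{p}, \qquad
E_{128} \approx \frac{90}{128} \approx 0.70
```

A comparable estimate for the OpenMP and OpenACC results is not attempted, because the abstract does not state the number of threads or GPU cores behind the reported speedups of 10 and almost 30.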
Keywords
parallelization, parallel computing, MPI, OpenMP, OpenACC, convective-diffusion equation, explicit-implicit difference schemes
Authors
Name | Organization | Email |
Semenov Evgenii Vitalevich | National Research Tomsk State University | semyonov@math.tsu.ru |
Starchenko Alexander Vasilevich | National Research Tomsk State University | starch@math.tsu.ru |
Danilkin Evgeniy Alexandrovich | National Research Tomsk State University | ugin@math.tsu.ru |
Prohanov Sergey Anatolevich | National Research Tomsk State University | viking@mail.tsu.ru |
The choice of technology for parallelization of numerical the solution of the convective-diffusion equation on hybrid multiprocessor computer system | Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaja tehnika i informatika – Tomsk State University Journal of Control and Computer Science. 2020. № 51. DOI: 10.17223/19988605/51/13