Optimization of collective communication algorithms for hierarchical distributed computer systems
Parallel algorithms and programs for distributed computer systems are developed in the message-passing model, in which parallel processes interact by sending and receiving messages. There are two kinds of communications: point-to-point and collective. Point-to-point communication is communication between two processes. Collective communications are divided into several types: one-to-all broadcast, all-to-all broadcast, and all-to-one reduce. In a one-to-all broadcast, a message from the root process is transmitted to all processes. In an all-to-all broadcast, each process sends a message to all processes and receives a message from each process. An all-to-one reduce collects messages from all processes in a single process. Analysis of the use of collective communications in parallel algorithms and programs shows that about 80% of communication time is spent in collective communications.

In MPI libraries and parallel programming systems (for example, in Partitioned Global Address Space languages), collective communications are based on the ring, recursive doubling, and J. Bruck's algorithms, as well as on algorithms that order processes in trees of various types. These algorithms rely on the assumption that the communication channels between processor cores of a distributed computer system are homogeneous. However, modern systems are multi-architectural and have a hierarchical structure; in such systems, the communication time between processor cores depends on the location of the cores in the system.

This paper proposes a method for optimizing collective communication algorithms in hierarchical distributed computer systems. The main idea of the method is to allocate intensively interacting processes to the same compute node. This ensures that such processes communicate via the node's shared memory and, as a consequence, reduces communication time. The algorithms are implemented in the topology-aware communication library TopoMPI.
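As a rough illustration of one of the algorithms named above (not code from the paper or from TopoMPI), the all-to-all broadcast via recursive doubling can be simulated in a few lines: in round k, rank i exchanges its entire buffer with partner i XOR 2^k, so p processes complete the exchange in log2(p) rounds. The function name and the set-based simulation are hypothetical.

```python
def recursive_doubling_allgather(p):
    """Simulate all-to-all broadcast among p ranks (p a power of two).

    Each rank starts with only its own rank id. In round k, rank i
    exchanges its full buffer with partner i ^ 2**k. Returns the final
    per-rank buffers and the number of communication rounds.
    """
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    buf = [{i} for i in range(p)]  # initial data: each rank knows itself
    rounds = 0
    step = 1
    while step < p:
        # All ranks exchange simultaneously with partner i ^ step.
        new = [buf[i] | buf[i ^ step] for i in range(p)]
        buf = new
        step <<= 1
        rounds += 1
    return buf, rounds
```

After log2(p) rounds every rank holds the data of all p ranks, which is why recursive doubling is attractive on homogeneous networks; the paper's point is that on hierarchical systems the cost of each round depends on where the partners are placed.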
The proposed versions of the recursive doubling and Bruck's algorithms are 1.1-4 times faster than their original versions. The overhead of the method is small and is compensated by the reduction in the execution time of the developed algorithms.
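The effect of placement can be sketched with a small back-of-the-envelope model (my illustration, not the paper's method): under block placement, where ranks 0..c-1 share the first node, ranks c..2c-1 the second, and so on, a recursive-doubling exchange with partner i XOR step stays inside a node exactly when step < c, so the first log2(c) rounds go through shared memory. The helper below is hypothetical and assumes power-of-two process and core counts.

```python
def internode_rounds(p, cores_per_node):
    """Count intra- and inter-node rounds of recursive doubling on p ranks,
    assuming block placement of ranks onto nodes of cores_per_node cores.

    With block placement and power-of-two sizes, rank i and its partner
    i ^ step share a node exactly when step < cores_per_node, so every
    exchange in a given round has the same locality.
    """
    for n in (p, cores_per_node):
        assert n > 0 and n & (n - 1) == 0, "sizes must be powers of two"
    assert p % cores_per_node == 0
    intra = inter = 0
    step = 1
    while step < p:
        if step < cores_per_node:
            intra += 1  # partners differ only in low bits: same node, shared memory
        else:
            inter += 1  # partners lie in different blocks: network traffic
        step <<= 1
    return intra, inter
```

For 16 processes on nodes of 4 cores this gives 2 shared-memory rounds and 2 network rounds; placing intensively interacting processes on the same node is what converts network rounds into the cheaper shared-memory ones.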
Keywords
distributed computer systems, parallel programming, message passing, collective communication operations

Authors
Name | Organization | E-mail
Kurnosov Mikhail G. | Siberian State University of Telecommunications and Information Sciences | mkurnosov@gmail.com