Optimization of collective communication algorithms for hierarchical distributed computer systems
Parallel algorithms and programs for distributed computer systems are developed in the message-passing model, in which parallel processes interact by sending and receiving messages. There are two kinds of communications: point-to-point and collective. Point-to-point communication is communication between two processes. Collective communications are divided into several types: one-to-all broadcast, all-to-all broadcast, and all-to-one reduce. In a one-to-all broadcast, a message from the root process is transmitted to all processes. In an all-to-all broadcast, each process sends a message to all processes and receives a message from each process. An all-to-one reduce collects messages from all processes in a single process. Analysis of the use of collective communications in parallel algorithms and programs shows that about 80% of communication time is spent in collective communications.

In MPI libraries and parallel programming systems (for example, in Partitioned Global Address Space languages), collective communications are based on the ring, recursive doubling, and J. Bruck's algorithms, as well as on algorithms that order processes in trees of various types. These algorithms rely on the assumption that the communication channels between processor cores of a distributed computer system are homogeneous. However, modern systems are multi-architectural and have a hierarchical structure; in such systems, the communication time between processor cores depends on the location of the cores in the system.

This paper proposes a method for optimizing collective communication algorithms in hierarchical distributed computer systems. The main idea of the method is to allocate intensively interacting processes to the same compute node. This ensures that such processes communicate via the node's shared memory and, as a consequence, reduces communication time. The algorithms are implemented in the topology-aware communication library TopoMPI.
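As a rough illustration of one of the algorithms named above (not code from the paper or from TopoMPI), the all-to-all broadcast via recursive doubling can be simulated in a few lines: in round k, rank i exchanges its entire buffer with partner i XOR 2^k, so p processes complete the exchange in log2(p) rounds. The function name and the set-based simulation are hypothetical.

```python
def recursive_doubling_allgather(p):
    """Simulate all-to-all broadcast among p ranks (p a power of two).

    Each rank starts with only its own rank id. In round k, rank i
    exchanges its full buffer with partner i ^ 2**k. Returns the final
    per-rank buffers and the number of communication rounds.
    """
    assert p > 0 and p & (p - 1) == 0, "p must be a power of two"
    buf = [{i} for i in range(p)]  # initial data: each rank knows itself
    rounds = 0
    step = 1
    while step < p:
        # All ranks exchange simultaneously with partner i ^ step.
        new = [buf[i] | buf[i ^ step] for i in range(p)]
        buf = new
        step <<= 1
        rounds += 1
    return buf, rounds
```

After log2(p) rounds every rank holds the data of all p ranks, which is why recursive doubling is attractive on homogeneous networks; the paper's point is that on hierarchical systems the cost of each round depends on where the partners are placed.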
The proposed versions of the recursive doubling and Bruck's algorithms are 1.1-4 times faster than their original versions. The overhead of the method is small and is compensated by the reduction in the execution time of the developed algorithms.
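The effect of placement can be sketched with a small back-of-the-envelope model (my illustration, not the paper's method): under block placement, where ranks 0..c-1 share the first node, ranks c..2c-1 the second, and so on, a recursive-doubling exchange with partner i XOR step stays inside a node exactly when step < c, so the first log2(c) rounds go through shared memory. The helper below is hypothetical and assumes power-of-two process and core counts.

```python
def internode_rounds(p, cores_per_node):
    """Count intra- and inter-node rounds of recursive doubling on p ranks,
    assuming block placement of ranks onto nodes of cores_per_node cores.

    With block placement and power-of-two sizes, rank i and its partner
    i ^ step share a node exactly when step < cores_per_node, so every
    exchange in a given round has the same locality.
    """
    for n in (p, cores_per_node):
        assert n > 0 and n & (n - 1) == 0, "sizes must be powers of two"
    assert p % cores_per_node == 0
    intra = inter = 0
    step = 1
    while step < p:
        if step < cores_per_node:
            intra += 1  # partners differ only in low bits: same node, shared memory
        else:
            inter += 1  # partners lie in different blocks: network traffic
        step <<= 1
    return intra, inter
```

For 16 processes on nodes of 4 cores this gives 2 shared-memory rounds and 2 network rounds; placing intensively interacting processes on the same node is what converts network rounds into the cheaper shared-memory ones.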
Keywords
distributed computer systems, parallel programming, message passing, collective communication operations

Authors
Name | Organization | E-mail
Kurnosov Mikhail G. | Siberian State University of Telecommunications and Information Sciences | mkurnosov@gmail.com