Modeling network contention effects on process allocation in computer systems discrete function and automatons | Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaja tehnika i informatika – Tomsk State University Journal of Control and Computer Science. 2019. № 47. DOI: 10.17223/19988605/47/11

Modeling network contention effects on process allocation in computer systems discrete function and automatons

The interconnection networks of modern high-performance distributed computer systems have at least a two-level hierarchical organization. The first level is a switch-based network (InfiniBand, Ethernet). The second level is the shared memory of SMP/NUMA compute nodes. In such systems, the communication time between processors depends on their placement in the system. In this paper, we present a benchmark for estimating message passing time when MPI processes share communication channels. We analyze how communication network performance degrades when message passing queues form in computer systems with NUMA/SMP compute nodes. We consider three levels of the communication environment: the shared memory of a compute node, the processor interconnect in NUMA nodes, and the network interconnect between nodes (InfiniBand and Gigabit Ethernet).

Resource and job management systems (RJMS) form a subsystem of p processor cores. If the computer system consists of multiprocessor nodes, this allocation problem has many solutions. For example, a symmetric subsystem of rank eight can be formed in three ways: one compute node with eight processor cores (1x8), two nodes with four cores each (2x4), or four nodes with two cores each (4x2). The completion time of collective communication operations differs across these subsystems. Therefore, developing algorithms that allocate nodes with regard to the message passing structure of the target program is of practical interest. The authors have developed software that predicts the execution time of the All-to-all operation on a given subsystem of nodes. The software uses experimental estimates of the performance degradation of MPI_Send/MPI_Recv operations when a set of processes uses a communication channel simultaneously.
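
To make the allocation choice concrete, the following Python sketch enumerates the symmetric subsystem shapes for a given number of cores and compares them with a toy contention model. It is not the authors' software. The function names, the block placement of ranks on nodes, the pairwise-exchange All-to-all schedule, the per-node core bounds, and every timing value are assumptions chosen only for illustration; in practice the slowdown function would come from measured MPI_Send/MPI_Recv degradation data.

```python
from collections import Counter


def symmetric_subsystems(p, min_cores=2, max_cores=8):
    """List (nodes, cores_per_node) shapes with nodes * cores_per_node == p.

    The bounds on cores per node are assumptions: nodes are taken to be
    multiprocessor (at least 2 cores) with at most max_cores cores each.
    """
    return [(p // c, c)
            for c in range(max_cores, min_cores - 1, -1)
            if p % c == 0]


def alltoall_estimate(nodes, cores, t_shm, t_net, slowdown):
    """Toy estimate of All-to-all time under a pairwise-exchange schedule.

    Assumptions (not taken from the paper):
      - rank r runs on node r // cores (block placement);
      - at step s, rank r sends one message to rank (r + s) % p;
      - k messages leaving one node at the same time share its network
        link, and each takes t_net * slowdown(k);
      - an intra-node message takes t_shm, with no contention modeled;
      - a step lasts as long as its slowest message.
    """
    p = nodes * cores
    total = 0.0
    for step in range(1, p):
        outgoing = Counter()  # network messages leaving each node
        has_local = False
        for src in range(p):
            dst = (src + step) % p
            if src // cores == dst // cores:
                has_local = True
            else:
                outgoing[src // cores] += 1
        step_time = t_shm if has_local else 0.0
        if outgoing:
            k = max(outgoing.values())
            step_time = max(step_time, t_net * slowdown(k))
        total += step_time
    return total


if __name__ == "__main__":
    # Hypothetical timings in arbitrary units; linear slowdown is a placeholder.
    for n, c in symmetric_subsystems(8):
        t = alltoall_estimate(n, c, t_shm=1.0, t_net=10.0, slowdown=lambda k: k)
        print(f"{n}x{c}: estimated All-to-all time = {t}")
```

With the default bounds, `symmetric_subsystems(8)` yields exactly the three shapes named in the abstract. The ranking of shapes produced by `alltoall_estimate` depends entirely on the placeholder timings and slowdown function, so it should not be read as a result about any real system.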

Keywords

parallel multiprogramming, organization of functioning, computer systems, collective communications, network contention, computer clusters

Authors

Name | Organization | E-mail
Peryshkova Eugene N. | Siberian State University of Telecommunications and Information Sciences; Rzhanov Institute of Semiconductor Physics of SB RAS | e.peryshkova@gmail.com
Kurnosov Mikhail G. | Siberian State University of Telecommunications and Information Sciences; Rzhanov Institute of Semiconductor Physics of SB RAS | mkurnosov@gmail.com
