Efficient algorithms have been developed for the solution of many graph problems. The solution is performed with associative parallel processors. At present there are no widely used associative architectures. Nevertheless, the recent development of the GPUs has made it possible to implement the associative parallel models with no significant efficiency loss. It enables to use the associative algorithms in practice. The implementation of the abstract model of the associative parallel data processor is given (the STAR-machine) with the GPUs by means of the CUDA technology. The performance is being analysed as well as the efficiency of the solution of the graph problems. The test case is the Warshall algorithm for finding the transitive closure of a directed graph. A graph with 5000 nodes was processed by the sequential Warshall algorithm for 884,622 s, by the associative parallel version for 64,454 s (speed-up is 13 times) and by the GPU-adopted associative parallel version for 0,372 s (speed-up is 2 378 times).
Download file
Counter downloads: 312
- Title Solution of graph problems by means of the STAR-machine being implemented on GPUs
- Headline Solution of graph problems by means of the STAR-machine being implemented on GPUs
- Publesher
Tomsk State University
- Issue Prikladnaya Diskretnaya Matematika - Applied Discrete Mathematics 3(33)
- Date:
- DOI 10.17223/20710410/33/9
Keywords
ассоциативный параллельный процессор, вертикальная обработка данных, SIMD, GPU, ориентированный граф, транзитивное замыкание, associative parallel processor, vertical data processing, SIMD, GPU, directed graph, adjacency matrix, transitive closureAuthors
References
Potter J.L. Associative Computing: a Programming Paradigm for Massively Parallel Computers. Boston: Perseus Publishing, 1991. 304 p.
Rudolph J. A. A production implementation of an associative array processor - STARAN // AFIPS'72. N.Y.: ACM, 1972. P. 229-241.
Foster C. C. Content Addressable Parallel Processors. N. Y.: John Wiley & Sons, 1976. 233 p.
Higuchi T., Furuya T., Handa K., et al. IXM2: a parallel associative processor // ISCA'91. Proc. 18th Annual Intern. Symp. Computer Architecture. N. Y.: ACM, 1991. P. 22-31.
Непомнящая А. Ш., Владыко М. А. Сравнение моделей ассоциативного вычисления // Программирование. 1997. №6. С. 41-50.
Nepomniaschaya A. Sh. Language STAR for associative and parallel computation with vertical data processing // Parallel Computing Technologies. Singapore: World Scientific, 1991. P. 258-265.
Potter J., Baker J., Scott S., et al. ASC: An Associative-Computing Paradigm // Computer. 1994. No. 27(11). P. 19-25.
Nepomniaschaya A. Sh. Basic associative parallel algorithms for vertical processing systems // Bulletin of the Novosibirsk Computing Center. 2009. Ser. Comp. Science. No. 9. P. 63-77.
Nepomniaschaya A. Sh. Constructions used in associative parallel algorithms for undirected graphs. Part 1 // Bulletin of the Novosibirsk Computing Center. 2013. Ser. Comp. Science. No. 35. P. 69-83.
Nepomniaschaya A. Sh. Constructions used in associative parallel algorithms for directed graphs // LNCS. 2015. V.9251. P. 201-209.
Nepomniaschaya A. Sh. Solution of path problems using associative parallel processors // Intern. Conf. Parallel and Distributed Systems. Seoul: IEEE Computer Society Press, 1997. P. 610-617.
Nepomniaschaya A. Sh. and Dvoskina M. A. A simple implementation of Dijkstra's shortest path algorithm on associative parallel processors // Fundamenta Informaticae. IOS Press, 2000. V. 43. P. 227-243.
Nepomniaschaya A. Sh. An associative version of the Bellman - Ford algorithm for finding the shortest paths in directed graphs // LNCS. 2001. V.2127. P. 285-292.
Nepomniaschaya A. Sh. Efficient implementation of Edmonds' algorithm for finding optimum branchings on associative parallel processors // Proc. 8th Intern. Conf. Parallel and Distributed Systems (ICPADS'01). KyongJu City: IEEE Computer Society Press, 2001. P. 3-8.
Непомнящая А. Ш. Сравнение алгоритмов Прима - Дейкстры и Краскала с помощью ассоциативного параллельного процессора // Кибернетика и системный анализ. 2000. №2. С. 19-27.
Nepomniaschaya A. Sh. Representation of the Gabow algorithm for finding smallest spanning trees with a degree constraint on associative parallel processors // LNCS. 1996. V. 1123. P. 813-817.
Nepomniaschaya A. Sh. Associative parallel algorithms for dynamic edge update of minimum spanning trees // LNCS. 2003. V.2763. P. 141-150.
Nepomniaschaya A. Sh. Associative parallel algorithm for dynamic reconstructing a minimum spanning tree after deletion of a vertex // LNCS. 2005. V. 3606. P. 151-173.
Непомнящая А. Ш. Ассоциативный параллельный алгоритм для динамической обработки минимального каркаса после добавления к графу новой вершины // Кибернетика и системный анализ. 2006. №1. С. 19-31.
Nepomniaschaya A. Sh. Efficient implementation of the Italiano algorithms for updating the transitive closure on associative parallel processors // Fundamenta Informaticae. IOS Press, 2008. V. 89. No. 2-3. P. 313-329.
Nepomniaschaya A. Sh. Associative version of the Ramalingam decremental algorithm for dynamic updating the single-sink shortest paths subgraph // LNCS. 2009. V. 5698. P. 257-268.
Непомнящая А. Ш. Ассоциативная версия алгоритма Рамалингама для динамической обработки подграфа кратчайших путей после добавления к графу новой дуги // Кибернетика и системный анализ. 2012. №3. C. 45-57.
Mingxian J. Associative operations from MASC to GPU // PDPTA-15 - The 21st Intern. Conf. on Parallel and Distributed Processing Techniques and Applications. Las Vegas: CSREA Press, 2015. P. 388-393.
http://www.dis.uniroma1.it/challenge9/download.s - 9th DIMACS Implementation Challenge - Shortest Paths. 2010.
http://www.nvidia.ru/object/tesla-server-gpus-ru.html - Высокопроизводительные вычисления для серверов| Графические процессоры Tesla|NVIDIA. 2016.
Harish P. and Narayanan P. Accelerating large graph algorithms on the GPU using CUDA // LNCS. 2007. V. 4873. P. 197-208.
Katz G. J. and Kider Jr. T. All-pairs shortest-paths for large graphs on the GPU // Proc. 23rd ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware. Sarajevo: Eurographics Association, 2008. P. 47-55.
Lund B. and Smith J. W. A Multi-Stage CUDA Kernel for Floyd - Warshall. arXiv preprint arXiv:1001.4108

Solution of graph problems by means of the STAR-machine being implemented on GPUs | Prikladnaya Diskretnaya Matematika - Applied Discrete Mathematics. 2016. № 3(33). DOI: 10.17223/20710410/33/9
Download full-text version
Counter downloads: 1057