Improving vertex-frontier based GPU breadth-first search
Source journal: Journal of Central South University (English Edition), 2014, Issue 10
Authors: YANG Bo(杨博) LU Kai(卢凯) GAO Ying-hui(高颖慧) XU Kai(徐凯) WANG Xiao-ping(王小平) CHENG Zhi-quan(程志权)
Pages: 3828-3836
Key words: breadth-first search; GPU; graph traversal; vertex frontier
Abstract: Breadth-first search (BFS) is an important kernel for graph traversal and is used by many graph processing applications. Extensive studies have been devoted to boosting the performance of BFS. As the most effective solution, GPU acceleration achieves the state-of-the-art result of 3.3×10^9 traversed edges per second on an NVIDIA Tesla C2050 GPU. A novel vertex-frontier based GPU BFS algorithm is proposed, and its main features are three-fold. Firstly, to obtain a better workload balance for irregular graphs, a virtual-queue task decomposition and mapping strategy is introduced for vertex frontier expansion. Secondly, a global duplicate detection scheme is proposed to remove duplicate vertices from the vertex frontier effectively. Finally, a GPU-based bottom-up BFS approach is employed to process large frontiers. The experimental results demonstrate that the algorithm achieves a 10% improvement over the state-of-the-art method on diverse graphs. In particular, it exhibits a 2-3 times speedup over the state-of-the-art on low-diameter and scale-free graphs on an NVIDIA Tesla K20c GPU, reaching a peak traversal rate of 11.2×10^9 edges/s.
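To make the terms in the abstract concrete, the following is a minimal sequential sketch of a level-synchronous, vertex-frontier BFS that switches to a bottom-up step when the frontier is large. It is not the authors' GPU algorithm: the virtual-queue mapping is not modeled, the function name `bfs_levels` and the parameter `bottom_up_fraction` are hypothetical, the switching threshold is an arbitrary illustration, and the graph is assumed to be undirected and stored in CSR form (`row_offsets`, `col_indices`).

```python
from typing import List


def bfs_levels(row_offsets: List[int], col_indices: List[int],
               source: int, bottom_up_fraction: float = 0.05) -> List[int]:
    """Return the BFS level of every vertex (-1 if unreachable).

    Assumes an undirected graph in CSR form. `bottom_up_fraction` is a
    hypothetical tuning knob, not a value taken from the paper.
    """
    n = len(row_offsets) - 1
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0

    while frontier:
        depth += 1
        next_frontier = []
        if len(frontier) > bottom_up_fraction * n:
            # Bottom-up step: every unvisited vertex searches its own
            # neighbor list for a parent that is in the current frontier.
            in_frontier = [False] * n
            for u in frontier:
                in_frontier[u] = True
            for v in range(n):
                if level[v] != -1:
                    continue
                for e in range(row_offsets[v], row_offsets[v + 1]):
                    if in_frontier[col_indices[e]]:
                        level[v] = depth
                        next_frontier.append(v)
                        break  # one parent is enough
        else:
            # Top-down step: expand the neighbors of each frontier vertex.
            # The level check drops vertices already discovered, which is
            # the sequential analogue of duplicate removal. On a GPU many
            # threads may discover the same vertex at once, so a dedicated
            # duplicate detection scheme is needed there.
            for u in frontier:
                for e in range(row_offsets[u], row_offsets[u + 1]):
                    v = col_indices[e]
                    if level[v] == -1:
                        level[v] = depth
                        next_frontier.append(v)
        frontier = next_frontier

    return level
```

Setting `bottom_up_fraction` to `1.0` forces the top-down step at every level, and setting it to `0.0` forces the bottom-up step. Both produce the same levels, which is a convenient check when experimenting with the switching point.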
YANG Bo(杨博)¹,², LU Kai(卢凯)¹,², GAO Ying-hui(高颖慧)³, XU Kai(徐凯)¹,², WANG Xiao-ping(王小平)¹,², CHENG Zhi-quan(程志权)⁴
(1. Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China;
2. College of Computer, National University of Defense Technology, Changsha 410073, China;
3. Department of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China;
4. Avatar Science Company, Guangzhou 510001, China)