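GU Guotai1, SUN Lupeng2, ZHANG Hongyan2, XIAO Han2
1. Henan Press and Publishing School, Zhengzhou 450044, Henan, China; 2. School of Information Science and Technology, Zhengzhou Normal University, Zhengzhou 450044, Henan, China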
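Abstract: To address the heavy computation, long running time, and low efficiency of the enumeration sorting algorithm on large-scale data, a method is proposed that uses GPU parallel computing to speed up large-scale data processing. Under CUDA, the serial and parallel portions of the enumeration sorting algorithm are analyzed, and the algorithm is optimized at both fine and coarse granularity. The reading and storage of the data to be sorted are optimized according to the structural characteristics of the CPU and GPU, and the kernel assigns one GPU thread to each comparison operation so as to make full use of the GPU's computing power. Experimental results show that when the number of elements to be sorted exceeds 40 000, sorting on the GPU is about 3 times faster than on the CPU, and the speedup keeps growing as the data size increases. The results are of significance for improving the efficiency of large-scale numerical computation.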
An enumeration sorting algorithm based on GPU and its parallelization
GU Guotai1, SUN Lupeng2, ZHANG Hongyan2, XIAO Han2
1. Henan Press and Publishing School, Zhengzhou 450044, Henan, China; 2. School of Information Science and Technology, Zhengzhou Normal University, Zhengzhou 450044, Henan, China
Abstract: To address the large amount of computation, long computing time, and low computational efficiency of enumeration sorting on large amounts of data, a method of using GPU parallel computing to improve the processing speed was proposed. Under CUDA, the serial and parallel portions of the enumeration sorting algorithm were analyzed, and the algorithm was optimized at both fine and coarse granularity. According to the structural characteristics of the CPU and GPU, the reading and storage of the data to be sorted were optimized. The kernel assigned one GPU thread to each comparison operation so as to make full use of the GPU's computing power. The experimental results showed that when the size of the dataset to be sorted was larger than 40 000, sorting on the GPU was about 3 times faster than on the CPU, and the speedup grew as the data size increased. This study is of practical significance for large-scale numerical calculation.
Key words: enumeration sorting; graphics processing unit; parallel computing; data processing; performance optimization
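
As a rough illustration of the enumeration (rank) sort that the abstract refers to, the following serial Python sketch computes each element's rank by counting the elements that must precede it. It is not the authors' CUDA kernel: the function name `enumeration_sort` and the index-based tie-breaking rule are assumptions made here for illustration. The point it is meant to show is that every (i, j) comparison is independent of the others, which is the property that a one-thread-per-comparison mapping would rely on. How the paper accumulates the per-comparison results on the GPU is not stated in the abstract.

```python
def enumeration_sort(values):
    """Return a sorted copy of `values` using enumeration (rank) sort.

    Illustrative serial sketch only; the name and the tie-breaking rule
    are assumptions, not taken from the paper.
    """
    n = len(values)
    ranks = [0] * n
    for i in range(n):
        for j in range(n):
            # Each (i, j) pair is one independent comparison.
            # Ties are broken by index so equal keys get distinct ranks.
            if values[j] < values[i] or (values[j] == values[i] and j < i):
                ranks[i] += 1
    # The rank of an element is its final position in the sorted output.
    result = [None] * n
    for i in range(n):
        result[ranks[i]] = values[i]
    return result


if __name__ == "__main__":
    print(enumeration_sort([3, 1, 2, 3, 0]))
```

The double loop performs n × n comparisons, so the serial cost grows quadratically with the input size, consistent with the abstract's remark that the algorithm becomes expensive on large inputs.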