An Efficient Implementation of Double Precision 1-D FFT for GPUs Using CUDA
Yanjun Liu, Licai Guo, Bin Luo, Xingyi Zhang

Abstract The Fast Fourier Transform (FFT) is a well-known and widely used tool in
many scientific and engineering fields. CUFFT, NVIDIA's FFT library
included in the CUDA toolkit, supports double-precision FFTs.
However, the implementation of CUFFT is not very efficient.
In this paper, we implement an efficient double-precision
Cooley-Tukey algorithm for GPUs using CUDA. Several programming
techniques are employed to exploit the hardware characteristics:
on-chip shared memory utilization, removal of redundant computation,
and coalescing of global memory accesses.
Experiments show that our 1-D FFT is as fast as CUFFT; furthermore,
for small input sizes it is more than twice as fast as CUFFT.
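
For readers unfamiliar with the Cooley-Tukey algorithm named above, the following is a minimal host-side reference sketch in C++ of an iterative radix-2 variant in double precision. It is not the authors' GPU implementation, which the abstract does not describe in detail; the function name `fft_radix2`, the alias `cd`, and the restriction to power-of-two lengths are assumptions made for illustration. The twiddle factors are computed once per stage and reused across all butterflies of that stage, which is only a loose CPU analogue of avoiding redundant computation.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;  // hypothetical alias for a double-precision complex sample

// In-place forward FFT using the iterative radix-2 Cooley-Tukey scheme.
// Assumption: a.size() is a power of two (other lengths are not handled).
void fft_radix2(std::vector<cd>& a) {
    const std::size_t n = a.size();

    // Reorder the input into bit-reversed index order.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }

    const double pi = std::acos(-1.0);

    // log2(n) stages; each stage merges pairs of length-(len/2) transforms.
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const std::size_t half = len / 2;

        // Twiddle factors for this stage, computed once and reused.
        std::vector<cd> w(half);
        for (std::size_t k = 0; k < half; ++k) {
            w[k] = std::polar(1.0, -2.0 * pi * static_cast<double>(k) /
                                       static_cast<double>(len));
        }

        // Butterfly: combine element k of the lower half with element k of the upper half.
        for (std::size_t i = 0; i < n; i += len) {
            for (std::size_t k = 0; k < half; ++k) {
                const cd u = a[i + k];
                const cd v = a[i + k + half] * w[k];
                a[i + k] = u + v;
                a[i + k + half] = u - v;
            }
        }
    }
}
```

Calling `fft_radix2` on a length-4 unit impulse `{1, 0, 0, 0}` leaves every output bin equal to 1, which is a quick sanity check. A GPU version would additionally have to map the butterflies of each stage onto threads and arrange its memory accesses accordingly, which this sketch does not attempt.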