跳到主要內容

臺灣博碩士論文加值系統

(3.95.131.146) 您好!臺灣時間:2021/07/26 04:29
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

我願授權國圖
: 
twitterline
研究生:黃勝偉
研究生(外文):Sheng-Wei Huang
論文名稱:在SMP叢集環境上以混合式程式模式建置快速傅利葉轉換
論文名稱(外文):Implementing Fast Fourier Transform Using Hybrid Programming Model for Cluster of SMPs
指導教授:翁添雄
指導教授(外文):Tien-Hsiung Weng
學位類別:碩士
校院名稱:靜宜大學
系所名稱:資訊工程學系碩士班
學門:電算機學門
學類:電算機一般學類
論文種類:學術論文
論文出版年:2010
畢業學年度:98
語文別:英文
論文頁數:41
中文關鍵詞:叢集計算混合式MPIOpenMP位元反轉
外文關鍵詞:Bit-ReversalOpenMPMPIHybridCluster computing
相關次數:
  • 被引用被引用:0
  • 點閱點閱:314
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
現今的平行硬體架構通常是在共享式記憶體,或是分散式記憶體,或是分散式共享記憶體的系統上,執行平行的程式碼。而考慮到價格以及計算效能的提升比例下,分散式系統和叢集環境變得越來越普遍,甚至是透過高速的網路將擁有多核心的高效能計算處理器給串連起來。因此,在本研究中所提的實驗方法,是針對這些設備來設計快速傅立葉轉換平行演算法,在實驗過程中我們實作了MPI以及OpenMP的混合式快速傅立葉轉換平行演算法。在演算法當中,OpenMP是在每個運算節點的內部運作的,因此需要特別的去設計,而在使用OpenMP平行話語言時,可透過omp for指令輕易的將迴圈平行化,但是在演算法的設計上,是採用SPMD的方式來實作,所以並不單單只使用omp for指令來達到迴圈的平行化。除了OpenMP的部分,整個演算法皆使用SPMD的方式去設計。在論文的最後,會將和只使用MPI指令的快速傅立葉演算法以及目前最常被用來使用的快速傅立葉演算法FFTW作比較並分析琪實驗數據。其實驗結果可發現混合式快速傅立葉轉換演算法的執行效能比單純只使用MPI指令的演算法時還要好,表式更可充分的利用實驗環境上的資源。最後我們也會在未來努力的改善其演算法,使其執行效能能更加提升。
Current parallel hardware architectures commonly used to execute parallel code can be broadly categories into shared-memory (SM), distributed-memory (DM), and distributed shared-memory (DSM) systems. DM systems and clusters have been popular because of their price/performance ratio and scalability. Even the high performance computing machine is a multi-core node interconnected by a high speed network. To write a parallel FFT code on these machines, in this thesis, we implemented a hybrid MPI and OpenMP parallel programming for FFT. The parallel MPI FFT code uses mainly SPMD and master-worker programming styles at the first level of parallelism. OpenMP is used to exploit second level of parallelism within a node, most other implementation, at this level of parallelism, a naïve loop level parallelization by adding omp for directive of OpenMP has been used. We implemented this level of parallelism within a node using an OpenMP SPMD style instead. Hence, our implementation style is SPMD within SPMD. We present our implementation and discuss the experimental results. We compare the results with MPI version of FFTW. Our hybrid model show promising and it will have better performance when our pure MPI FFT code is further improved.
Chinese Abstract iii
English Abstract iv
Acknowlegments v
Contents… vi
List of Table vii
List of Figures viii

CHAPTER 1: INTRODUCTION 1
CHAPTER 2: RELATED WORK 2
2.1 The Fast Fourier Transform 2
2.2 MPI 3
2.3 OpenMP 3
2.4 Hybrid Model 4
2.5 Related Research 5
CHAPTER 3: PARALLEL FFT DESIGN
- HYBRID APPROACH 8
CHAPTER 4: EXPERIMENTAL SETUP AND RESULTS 22
CHAPTER 5: CONCLUSIONS AND FUTURE WORK 28
REFERENCES 30
RELATED PUBLICATIONS 33
1.Bader, David A. and Virat Agarwal. FFTC: Fastest Fourier Transform for the IBM Cell Broadband Engine. The 14th International Conference on High Performance Computing (HiPC 2007), LNCS 4873, pp. 172-184, 2007.
2.Balaji, Pavan and Buntinas, Darius and Goodell, David and Gropp, William and Kumar, Sameer and Lusk, Ewing and Thakur, Rajeev and Träff, Jesper Larsson. MPI on a Million Processors. In Proceedings of the 16th European PVM/MPI Users'' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes In Computer Science; Vol. 5759, pp. 20 - 30 , 2009.
3.Bollman, D. Seguel, J. and Feo, J. Fast Digit-Index Permutations. Scientific Progress, vol. 5, no. 2, pp. 137-146, 1996.
4.Cooley, J. W., and Tukey, J. W. An algorithm for the machine calculation of complex Fourier series. In Math. Computation, 19, pp. 297–301, 1965.
5.Dinan, James. Balaji,Pavan. Lusk, Ewing. Sadayappan, P. Thakur, Rajeev. Hybrid Parallel Programming with MPI and Unified Parallel C. In Proceedings of the 7th ACM international conference on Computing frontiers, pp. 177-186, 2010.
6.Frigo, M., Leiserson, C.E., Randall, K.H. The Implementation of the Cilk-5 Multithreaded Language. ACM SIGPLAN ''98 Conference on Programming Language Design and Implementation, pp.212–223, 1998
7.Frigo, M. and Johnson, S.G. The design and implementation of fftw3. Proceeding of the IEEE, vol. 93, issue 2, pp.216-231, 2005.
8.Karp, A. H. Bit Reversal on Uniprocessors. SIAM Review, Vol. 38,pp. 289-307, 1996.
9.Krawezik, Geraud. and Cappello, Franck. Performance Comparison of MPI and three OpenMP Programming Styles on Shared memory Multiprocessors. In Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, pp. 118-127, 2003
10.Liu, Z., Chapman, B., Wen, Y., Huang L., Weng TH., and Hernandez, O. Analyses for the Translation of OpenMP Codes into SPMD Style with Array Privatization. In Proceedings of International Workshop on OpenMP Applications and Tools (WOMPAT), pp. 244-259, 2003.
11.Liu, Zhenying, Huang, Lei. M,Chapman,Barbara.,Weng, Tien-hsiung. Efficient Implementation of OpenMP for Clusters with Implicit Data Distribution, LNCS 3349, pp. 121-136, 2005..
12.Lokhmotov, A., Mycroft, A. Optimal bit-reversal using vector permutations. ACM Symposium on the 19th Parallel Algorithms and Architectures, pp.198–199, 2007.
13.Lusk, Ewing. and Chan, Anthony. Early Experiments with the OpenMP/MPI Hybrid Programming Model, LNCS, 5004, pp. 36-47, 2010.
14.MPI Forum. MPI: A message-passing interface standard. Technical Report UT-CS-94-230, University of Tennessee, Knoxville, 1996.
15.OpenMP Architecture Review Board. Fortran 2.0 and C/C++ 2.0 Specifications. At www.openmp.org
16.Rodriguez, J.Jeffrey. An improved Bit-reversal algorithm for the fast Fourier transform. In proceedings of International Conference on Acoustics, Speech, and Signal Processing, vol.3, pp.1407-1410, 1988.
17.Rubio, M., Gómez, P., Drouiche, K. A new superfast bit reversal algorithm. International Journal of Adaptive Control and Signal Processing, vol. 16, issue 10, pp.703-707, 2002.
18.Sterling, Thomas. Messina, Paul. and Smith, H, Paul. Enabling Technologies for Petaflops Computing. MIT Press, 1995.
19.Seguel, J., Bollman, D., Feo, J. A Framework for the Design and Implementation of FFT Permutation Algorithms, IEEE Transactions on Parallel and Distributed Systems, vol.11, no.7, pp.625-635, 2000.
20.Takahashi, D., Sato, M., and Boku, T. An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors. WOMPAT 2003, LNCS 2716, pp. 99-108, 2003
21.Takahashi, Daisuke. A Hybrid MPI/OpenMP Implementation of a Parallel 3-D FFT on SMP Clusters, LNCS 3911, pp. 970-977, 2006.
22.Wallcraft, A. J. SPMD OpenMP vs. MPI for Ocean Models. Proceedings of First European Workshops on OpenMP (EWOMP’99), Lund, Sweden, 1999.
23.Zhang, Z., Zhang, X. Fast Bit-Reversals on Uniprocessors and Shared-Memory Multi-processors, SIAM Journal on Scientific Computing, vol.22, no.6, pp.2113-2134, 2000.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊