National Digital Library of Theses and Dissertations in Taiwan

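Graduate student: 張力
Graduate student (English name): Li Chang
Thesis title: 優化於軟體定義無線電中圖形處理器加速球面解碼之效能
Thesis title (English): GPU-Aware Sphere Decoder Implementation in Software-Defined Radio
Advisor: 徐慰中
Oral defense date: 2016-07-23
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Publication year: 2017
Academic year of graduation: 105
Language: English
Number of pages: 39
Keywords (Chinese): 軟體定義無線電; 圖形處理器; 球面解碼; 多輸入多輸出
Keywords (English): Software-defined radio; Graphics processing unit; Sphere decoding; Multi-input multi-output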
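As mobile communication standards grow more complex and general-purpose processors grow more powerful, implementing communication algorithms in general-purpose programming languages on general-purpose processors has become feasible, which makes software-defined radio an attractive and highly flexible solution. This thesis examines baseband processing implemented on a graphics processing unit (GPU) in software-defined radio. Within that pipeline, sphere decoding, a multi-input multi-output (MIMO) detection algorithm, behaves less regularly at run time than the other stages and, by Amdahl's law, may become the system bottleneck. The implementation must therefore be optimized for the GPU hardware architecture in order to use the performance the GPU provides. Targeting the factors that make sphere decoding run inefficiently on the GPU, we apply implementation-level optimizations and propose a divergence reduction method. In medium-to-low SNR environments, the divergence reduction method alone improves execution time by 1.6x on average over the unoptimized implementation. Combined with the implementation-level optimizations, the overall improvement over the unoptimized implementation is 2-3.5x.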
Modern communication protocols are becoming more complicated. As general-purpose processors grow more powerful, software-defined radio offers an attractive alternative thanks to its high flexibility. Graphics processing units (GPUs) let software-defined radio exploit a massively parallel computing paradigm. While most baseband processing algorithms are highly parallel, the sphere decoder is less regular than the other components because of its depth-first search nature. According to Amdahl's law, the efficiency of a software-defined LTE system implementation may therefore be limited by the sphere decoding stage. In this thesis, we propose a preprocessing stage that significantly improves the warp execution efficiency of the sphere decoding algorithm on the GPU. With it, the sphere decoder runs 1.6x faster on average in a middle-to-low SNR environment. Together with memory-hierarchy-related optimizations, the overall performance of the sphere decoder improves by 2-3.5 times.
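To illustrate why reordering inputs can raise warp execution efficiency, here is a minimal sketch of a toy cost model. It is not the thesis's preprocessing stage, whose details this record does not give. It assumes that each work item's search length is known in advance, that a warp has 32 lanes, and that a warp runs in lock step until its slowest lane finishes. The function names and the workload numbers are hypothetical.

```python
# Toy SIMT cost model. All names and numbers are illustrative assumptions,
# not values taken from the thesis.
WARP_SIZE = 32


def warp_cost(work):
    """Total lock-step iterations: each warp runs as long as its slowest lane."""
    return sum(max(work[i:i + WARP_SIZE]) for i in range(0, len(work), WARP_SIZE))


def warp_efficiency(work):
    """Fraction of lane-iterations that do useful work (1.0 means no idle lanes)."""
    return sum(work) / (WARP_SIZE * warp_cost(work))


# Hypothetical workload: one expensive search per warp, the rest cheap.
work = [100 if i % WARP_SIZE == 0 else 1 for i in range(1024)]

unsorted_cost = warp_cost(work)         # every warp waits for its one slow lane
sorted_cost = warp_cost(sorted(work))   # slow items end up sharing a single warp

print(unsorted_cost, sorted_cost)
print(warp_efficiency(work), warp_efficiency(sorted(work)))
```

In this model, grouping items of similar cost into the same warp removes the idle lanes. A real decoder cannot know the search length exactly beforehand, so any practical reordering would have to rely on an estimate derived from the input data.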
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
1 Introduction
2 Background and Motivation
2.1 Overview of MIMO Detection Algorithms
2.1.1 Maximum Likelihood Detection
2.2 Graphics Processing Unit
2.2.1 Overview of CUDA
2.2.2 GPU Memory Hierarchy
2.2.3 Irregular Program
2.3 MIMO Detection on GPU
3 Memory-aware Implementation and Divergence Reduction Preprocessing for Sphere Decoding
3.1 Schnorr-Euchner Sphere Decoding Algorithm
3.2 Memory-aware Implementation
3.2.1 Minimize the Main Memory Access
3.3 Divergence Issue in Schnorr-Euchner Sphere Decoding
3.3.1 Observation of Input Data Pattern
3.3.2 Full Frame Data Size
3.3.3 Divergence Reduction Through Reorder
3.4 Proposed Preprocessing Stages
4 Experiments
4.1 Simulation Setup
4.2 Results
4.3 Limitation
5 Conclusions
Bibliography