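Author: 楊得鑫
Author (English): Te-Shin Yang
Thesis title (original): VLIWDSP架構之增進指令並行度之向量化運算機制
Thesis title (English): Improving ILP with the Vectorized Computing Mechanism in VLIW DSP Architecture
Advisor: 邱日清
Advisor (English): Jih-Ching Chiu
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Electrical Engineering, graduate program (電機工程學系研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: English
Number of pages: 84
Keywords (Chinese, translated): instruction-level parallelism (指令並行度); vector computing (向量運算)
Keywords (English): VLIW; vector computing; instruction level parallelism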
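Abstract (Chinese, translated): Modern DSP processor designs often use a VLIW architecture to raise instruction-level parallelism and thereby improve performance. Two bottlenecks limit that parallelism: whether the hardware resources suffice to process all parallel instructions at the same time, and dependences between instructions that prevent them from executing in parallel. This thesis designs DVBTDSP, a VLIW computing core targeted at the FFT algorithm, and uses software pipelining to reschedule instruction loops so that FFT butterfly operations achieve the best instruction-level parallelism. In addition, to provide a smooth data flow, the thesis exploits the characteristics of FFT vector operations to improve the modulo addressing mechanism of conventional DSPs, so that originally discrete vectors can be treated as one new continuous vector, which avoids the pipeline stalls caused by vector interruptions. According to the simulation and analysis results, this architecture needs only half the computation time of the C6200 for FFT, and its performance on other algorithms such as FIR, IIR, and DCT is no worse than that of the C6200.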
Abstract (English): To improve performance in real-time applications, current digital signal processors use VLIW architectures to increase the degree of instruction-level parallelism (ILP). Two factors limit ILP: whether there are enough hardware resources to execute all parallel instructions at once, and the dependences between instructions. This thesis designs a VLIW processing core called DVBTDSP, tailored to the FFT algorithm, and uses software pipelining to schedule the loop so that FFT butterfly operations reach the highest degree of ILP. Furthermore, to provide a smooth data stream for pipelined operation, we design a mechanism that improves modulo addressing by collecting discrete vectors into one continuous vector. The simulation results show that the DVBTDSP needs half the execution time of the C6200 for FFT processing and is also competitive with the C6200 on the FIR, IIR, and DCT algorithms.
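The abstract's two main ideas, FFT butterfly loops and addressing that presents separated sub-vectors as one continuous stream, can be illustrated with a small Python sketch. This is not the DVBTDSP design, whose details are not given in this record. The function names `modulo_address`, `butterfly_indices`, and `fft` are hypothetical, and the flat index generator is only one assumed way of turning the separated butterfly groups of a radix-2 stage into a single uninterrupted sequence.

```python
import cmath


def modulo_address(base, length, offset):
    """Conventional DSP modulo (circular) addressing: wrap offset inside a buffer."""
    return base + offset % length


def butterfly_indices(n, half):
    """Yield (top, bottom, twiddle_exponent) for one radix-2 stage.

    Assumption for illustration: a single counter k walks all n/2 butterflies,
    so the separated groups of a stage appear as one continuous index stream.
    """
    for k in range(n // 2):
        group, j = divmod(k, half)
        top = group * 2 * half + j
        yield top, top + half, j * (n // (2 * half))


def fft(x):
    """Iterative radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    a = [0j] * n
    for i, v in enumerate(x):
        # Place each input sample at its bit-reversed position.
        rev = int(format(i, "b").zfill(bits)[::-1], 2) if bits else 0
        a[rev] = complex(v)
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) for k in range(max(n // 2, 1))]
    half = 1
    while half < n:
        for top, bottom, e in butterfly_indices(n, half):
            t = twiddle[e] * a[bottom]
            a[top], a[bottom] = a[top] + t, a[top] - t
        half *= 2
    return a
```

Within a stage the butterflies are independent of one another, which is what lets a software-pipelined VLIW schedule overlap the loads, multiplies, and adds of successive iterations. A flat index stream of this kind keeps that loop free of group-boundary breaks.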
Abstract (Chinese)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 The Development of DSP and Vector Processors
1.2 Standard DSP Architecture
1.3 Motivation and Goal
Chapter 2 Survey
2.1 VLIW
2.2 Basic Compiler ILP
2.3 Vector Processors
2.4 Current DSP processor with vector computing (VFP, C3x, C6x)
Chapter 3 Design of an Instruction Pipeline Decoder
3.1 The Characteristics of ARM Instruction Set
3.1.1 Instruction types
3.1.2 Multi-cycle instruction
3.1.3 Instruction stream
3.1.4 Forwarding controller
3.2 A Single Instruction Pipeline Decoder Design
3.2.1 Architecture
3.2.2 Resolution unit
3.3 Decoder design in VLIW DSP architecture
Chapter 4 Vectorized computing algorithm in VLIW architecture
4.1 FFT algorithm with DSP processing
4.2 Vectorized code scheduling
4.3 Circular Index Register setting instructions
4.4 Conditional load instruction
4.5 Modulo addressing mode
4.6 The Architecture of DVBTDSP
4.7 Super Element Architecture
4.7.1 ALUL
4.7.2 ALUR & MUL
4.7.3 Load
4.7.4 Store
4.7.5 Register File
Chapter 5 Verification and Analysis result
5.1 Verification environment
5.2 Synthesis results
5.3 Analysis results
Chapter 6 Conclusions and Future Work
Appendix
Reference