National Digital Library of Theses and Dissertations in Taiwan

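Graduate student: 陳俊佑
Graduate student (English name): Jiun-You Chen
Thesis title: 具備捨入機制之MIPS32浮點協同處理器之實作
Thesis title (English): An Implementation of MIPS32 Floating-Point Co-processor with Rounding Mechanism
Advisor: 朱守禮
Degree: Master's
Institution: Chung Yuan Christian University (中原大學)
Department: Graduate Institute of Information and Computer Engineering (資訊工程研究所)
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2009
Graduation academic year: 97 (ROC calendar)
Language: Chinese
Pages: 67
Keywords (Chinese): Rounding; FPALU; floating point; Count Leading Zero; floating-point co-processor
Keywords (English): Count Leading Zero; FPALU; Rounding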
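Abstract (translated from Chinese)
With the rapid growth of multimedia applications, today's embedded handheld devices need more computing power to handle complex computation, and the floating-point capability required for multimedia processing is especially important. In the past, most embedded systems had no hardware floating-point capability; when floating-point computation was needed, it could only be emulated with integer functions. This not only lengthened program execution time but also raised the total energy needed to complete the work. In view of this, this study uses the Verilog hardware description language to design a floating-point co-processor that supports the complete MIPS32 floating-point instruction set. The co-processor implements 52 floating-point instructions, including floating-point arithmetic logic, branch, comparison, memory access, conversion, and move instructions, and conforms to the IEEE 754 single- and double-precision standards. It includes a fast multi-cycle floating-point arithmetic logic unit (FPALU) and a fast rounding mechanism, both developed in this work, together with a dedicated exception-handling mechanism that covers the exceptional conditions required by IEEE 754 and MIPS32. These speed up floating-point operations and thereby raise overall system performance.
During design, the MIPS32 software simulator SPIM served as the reference for verifying the functionality of the floating-point processor. Functional verification and synthesizable-syntax checking were carried out with the Verilog simulator ModelSim from Mentor Graphics and with Novas nLint. After integration with the MIPS32 integer processor developed in our laboratory, applications were compiled in the hardware simulation model (Simulation Model) using the previously developed simulated I/O library and the MIPS SDE Lite/GCC cross compiler, together with the Softfloat library, to produce test files with and without floating-point instructions. These files were used to verify the co-processor's functionality and to measure the difference in execution time of the same program under hardware floating-point operation and software-emulated floating-point operation.
After functional verification, the RTL Verilog floating-point unit was synthesized with Synopsys Design Compiler in a TSMC 0.13 μm process technology. The design was also implemented on the FPGA of an ARM Integrator development board, and hardware/software co-verification was performed to confirm that the hardware functions correctly.
The experimental results show that the floating-point co-processor developed in this study reaches an operating frequency of 113.3 MHz in TSMC 0.13 μm process technology. The results also confirm that, in a computer system equipped with the hardware floating-point unit, programs that require floating-point operations gain 823% or more in performance, which demonstrates the correctness of the design and its practical value.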
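The abstract notes that, without floating-point hardware, programs must emulate floating point with integer functions. As a minimal illustration of what that involves (a Python sketch, not the thesis's Verilog and not Softfloat's actual code; the function names `unpack_single` and `classify` are hypothetical), the following unpacks an IEEE 754 single-precision word into its sign, exponent, and fraction fields using only integer shifts and masks:

```python
import struct


def unpack_single(x: float):
    """Split a value, stored as IEEE 754 single precision, into its three fields.

    Returns (sign, biased_exponent, fraction) as plain integers.
    """
    # Reinterpret the 32-bit float bit pattern as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, without the hidden leading 1
    return sign, exponent, fraction


def classify(x: float) -> str:
    """Name the IEEE 754 value class from the exponent and fraction fields."""
    _, exponent, fraction = unpack_single(x)
    if exponent == 0xFF:
        return "nan" if fraction else "infinity"
    if exponent == 0:
        return "subnormal" if fraction else "zero"
    return "normal"
```

A software library has to run steps like these, plus alignment, normalization, and rounding, for every operation, which is the overhead a hardware unit removes.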
Abstract
With the rapid growth of multimedia applications, modern embedded handheld devices need more computing power to process complex problems, especially floating-point computation for multimedia workloads. In the past, most embedded systems did not contain a hardware floating-point unit, so applications that required floating-point capability had to emulate it with integer functions, which costs more execution time and energy. Accordingly, this study designs a hardware floating-point co-processor that supports the complete MIPS32 floating-point instruction set. The co-processor implements 52 floating-point instructions, including floating-point arithmetic logic, branch, comparison, memory access, conversion, and move instructions, and is compliant with the IEEE 754 single- and double-precision standards. It contains a fast multi-cycle floating-point arithmetic logic unit (FPALU) and a fast rounding mechanism. To handle the exceptional conditions required by IEEE 754 and MIPS32, this study also proposes a dedicated exception-handling mechanism. These mechanisms accelerate floating-point operations and improve overall system performance.
During development, this study adopts the MIPS32 software simulator SPIM as the golden model for verifying the functionality of the floating-point co-processor. The Verilog design is simulated with Mentor Graphics ModelSim and linted with Novas nLint to check its functional correctness and synthesizable syntax. After the co-processor is integrated with the MIPS32 integer processor developed by our lab, this study evaluates, in the corresponding Simulation Model, the performance difference between benchmarks built with and without floating-point instructions. The benchmarks are generated by compiling the same application with our simulated I/O routines and the MIPS SDE Lite/GCC cross compiler, together with the Softfloat library.
After functional verification, this study synthesizes the proposed RTL Verilog floating-point co-processor with Synopsys Design Compiler using a TSMC 0.13 μm technology library, and then performs hardware/software co-verification by downloading the design into the FPGA of an ARM Integrator board and verifying its correctness.
According to the experimental results, the proposed floating-point co-processor achieves 113.3 MHz in TSMC 0.13 μm technology. The results also show that the speedup reaches 820% when benchmarks that require floating-point operations run on a computer system that includes the proposed co-processor. These results demonstrate the correctness and functionality of the design.
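The abstract names a fast rounding mechanism compliant with IEEE 754 but does not describe it on this page. As a generic illustration only (not the thesis's design), the sketch below applies the four IEEE 754-1985 rounding modes to an integer significand magnitude that carries extra low-order bits. The function name `round_significand` and the mode strings are assumptions made for this example.

```python
def round_significand(sig: int, extra_bits: int, negative: bool, mode: str) -> int:
    """Drop `extra_bits` low-order bits from the magnitude `sig`.

    mode is one of "nearest_even", "toward_zero", "toward_positive",
    "toward_negative". The result can carry out into one more bit, in which
    case the caller has to renormalize and adjust the exponent.
    """
    if extra_bits == 0:
        return sig
    kept = sig >> extra_bits
    discarded = sig & ((1 << extra_bits) - 1)
    if discarded == 0:
        return kept  # exact result, no rounding needed in any mode
    half = 1 << (extra_bits - 1)
    if mode == "nearest_even":
        # Round up above the halfway point; on a tie, round to the even value.
        round_up = discarded > half or (discarded == half and (kept & 1) == 1)
    elif mode == "toward_zero":
        round_up = False
    elif mode == "toward_positive":
        # Increasing the magnitude moves toward +infinity only for positive values.
        round_up = not negative
    elif mode == "toward_negative":
        round_up = negative
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return kept + 1 if round_up else kept
```

For example, `round_significand(0b1010, 2, False, "nearest_even")` is an exact tie and keeps the even value `0b10`, while `0b1110` ties upward to `0b100`.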
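Contents
Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1. Motivation
1.2. Thesis Organization
Chapter 2 Background
2.1. MIPS Floating-Point Instruction Architecture
2.1.1 Evolution of MIPS Floating-Point Instructions
2.1.2 MIPS32 Floating-Point Instruction Set
2.2. The IEEE 754 Standard
2.2.1 Introduction to IEEE 754
2.2.2 IEEE 754 Basic Values
2.2.3 IEEE 754 Exceptions
2.3. Simulation Model
2.4. DSP Stone
Chapter 3 Related Work
3.1. Rounding Mechanisms for Floating-Point Multipliers
3.2. A Dual-Path Single-Precision Five-Stage Pipelined Floating-Point Adder
3.3. A Floating-Point Multiplier with Integrated Rounding
Chapter 4 Implementation and Design Flow of the MIPS32 Floating-Point Co-processor
4.1. Rounding Mechanism
4.2. Count Leading Zero Mechanism
4.3. Internal System Architecture
4.3.1 Floating-Point Instruction Set
4.3.2 Relationship Between the CPU and the FP Co-processor
4.4. Floating-Point Processor
4.4.1 Floating-Point Register File (FPR)
4.4.2 Control Unit
4.4.3 Floating-Point Arithmetic Logic Unit
Chapter 5 Design Verification and Experimental Data
5.1. Hardware and Software Verification
5.1.1 Software Simulation Verification
5.1.2 Hardware/Software Co-verification
5.2. Analysis of Results
Chapter 6 Conclusion
References
Appendix A Comparison of MIPS Floating-Point Instruction Sets
Appendix B MIPS32 Floating-Point Instruction Set
Appendix C Design Compiler Synthesis Data (CPU and FP Co-processor)
Appendix D Design Compiler Synthesis Data (CPU Only)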

List of Figures
Figure 2.1  IEEE 754 floating-point representation
Figure 2.2  The two IEEE 754 data types
Figure 3.1  Binary notation
Figure 4.1  Rounding diagram
Figure 4.2  Rounding formula
Figure 4.3  Count Leading Zero with a 4-bit priority-encoding mechanism
Figure 4.4  Relationship between the CPU and the FP Co-processor
Figure 4.5  Architecture of the complete FP Co-processor design, shown in Debussy
Figure 4.7  Architecture of the control unit, shown in Debussy
Figure 4.8  Diagram of the FPALU
Figure 4.9  Datapath of the floating-point adder
Figure 4.10 Staged processing in the adder
Figure 4.11 Datapath of the floating-point subtractor
Figure 4.12 Staged processing in the subtractor
Figure 4.13 Datapath of the floating-point multiplier
Figure 4.14 Datapath of the floating-point divider
Figure 4.15 Non-Restoring Square Root algorithm
Figure 4.16 Datapath of the original 32-bit unsigned square-root unit
Figure 4.17 Datapath of the modified 32-bit unsigned square-root unit
Figure 4.18 Datapath of the floating-point square root
Figure 5.1  ModelSim simulation test
Figure 5.2  SPIM instruction simulation test
Figure 5.3  Verification results from checking the design with nLint
Figure 5.4  C code of Complex_multiply
Figure 5.5  Simulation model results for Complex_multiply
Figure 5.6  C code of Biquad_one_section
Figure 5.7  Simulation model results for Biquad_one_section
Figure 5.8  C code of Mat1x3
Figure 5.9  Simulation model results for Mat1x3
Figure 5.10 ARM Integrator results for Complex_multiply
Figure 5.11 ARM Integrator results for Biquad_one_section
Figure 5.12 ARM Integrator results for Mat1x3

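List of Tables
Table 2.1 Data types supported by each version of the MIPS instruction set
Table 2.2 Instruction classes in MIPS32 Release 1
Table 3.1 Replacing the IEEE 754 rounding modes with RI and RZ
Table 4.1 Outputs for exceptional cases in floating-point addition, subtraction, and multiplication
Table 4.2 Exception handling and outputs of the divider
Table 5.1 Performance gain of each testbench when using the floating-point processor
Table 5.2 Comparison of Design Compiler results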
References
[1]MIPS Technologies, Inc., “MIPS32 Architecture For Programmers Volume I: Introduction to the MIPS32™ Architecture”, June 2003
[2]MIPS Technologies, Inc., “MIPS32 Architecture For Programmers Volume II: The MIPS32™ Instruction Set”, June 2003
[3]MIPS Technologies, Inc., “MIPS32 Architecture For Programmers Volume III: The MIPS32™ Privileged Resource Architecture”, 2003
[4]Dominic Sweetman, “See MIPS Run”, 2nd Edition, Morgan Kaufmann
[5]IEEE, “IEEE Standard for Binary Floating-point Arithmetic”, ANSI/IEEE Standard, no. 754, 1985
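[6]陳侑谷, “Design of a MIPS R2000 Processor and Development of Its Hardware/Software Co-verification Flow Based on the ARM Integrator”, M.S. thesis, Graduate Institute of Information and Computer Engineering, Chung Yuan Christian University, 2007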
[7]V. Zivojnovic, J. Martinez, C. Schläger and H. Meyr, “DSPstone: A DSP-Oriented Benchmarking Methodology”, Proc. of ICSPAT'94 - Dallas, Oct. 1994.
[8]M. Gok, “A novel IEEE rounding algorithm for high-speed floating-point multipliers”, Integration, the VLSI Journal Volume 40, Issue 4, July 2007.
[9]Sheetal A. Jain, “Low-Power Single-Precision IEEE Floating-Point Unit”, M.Eng. Thesis, Massachusetts Institute of Technology, May 2003.
[10]N. Quach, N. Takagi, M. Flynn, “Systematic IEEE rounding on high-speed floating-point multipliers”, IEEE Trans. VLSI Syst. 12 (2004) 511–519.
[11]N. T. Quach, N. Takagi, and M. J. Flynn, “On Fast IEEE Rounding,” Stanford Univ., Stanford, CA, Tech. Rep. CSL-TR-91-459, Mar. 1991.
[12]W. Park, T. Han, S. Kim, and S. Yang, “Floating point multiplier performing IEEE rounding and addition in parallel,” J. Syst. Architecture, vol. 45, no. 14, pp. 1195–1207, 1999.
[13]G. Even and P.-M. Seidel, “A comparison of three rounding algorithms for IEEE floating-point multiplication,” IEEE Trans. Computers, vol. 49, pp. 638–650, July 2000.
[14]N. Burgess, S. Knowles, “Efficient implementation of rounding units”, Thirty-Third Asilomar Conference, Volume 2, pp.1489-1493, 1999
[15]David A. Patterson and John L. Hennessy, “Computer Organization & Design”, 2nd Edition, Morgan Kaufmann.
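[16]陳信宏, “Design of a Six-Stage Pipelined MIPS32 Processor with a High-Speed Interrupt Handling Mechanism and Its Verification Environment”, M.S. thesis, Graduate Institute of Information and Computer Engineering, Chung Yuan Christian University, 2007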
[17]Yamin Li and Wanming Chu, “Implementation of Single Precision Floating Point Square Root on FPGAs”, 1997
[18]John R. Hauser, “Handling Floating-Point Exceptions in Numeric Programs”, ACM Transactions on Programming Languages and Systems 18:2 (March 1996), pp. 139-174.
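[19]許志男, “Design of a Working Verification Environment for a MIPS32 Processor”, M.S. thesis, Graduate Institute of Information and Computer Engineering, Chung Yuan Christian University, 2009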