National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

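Graduate student: 賴鈺仁
Graduate student (romanized): Yu-ren Lai
Thesis title (Chinese): 以單指令派發亂序執行之指令管道設計適用於嵌入式系統之超純量雙核心架構
Thesis title (English): Design of the Superscalar Dual-Core Architecture using Single-Issue Out-of-Order Instruction Pipe for Embedded System
Advisor: 邱日清
Advisor (romanized): Jih-chin Chiu
Degree: Master's
Institution: National Sun Yat-sen University (國立中山大學)
Department: Department of Electrical Engineering (graduate program)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2009
Academic year of graduation: 97
Language: Chinese
Number of pages: 77
Keywords (translated from Chinese): out-of-order execution, embedded system, superscalar, dual-core, single-issue
Keywords (English): Dual-Core, Superscalar, Embedded System, Out-of-Order, Single-Issue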
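Embedded systems now serve increasingly diverse applications. Their design must emphasize low power consumption and reduced design complexity, but flexible and appropriately scaled computing performance is equally important in processor architecture. As microprocessor systems have evolved and process technology has advanced, integrating multiple cores into a single processor has become easier. Although a multi-core architecture delivers good multi-threaded performance, it cannot effectively improve single-threaded performance, so it gives little support to existing programs: code benefits only when it is well scheduled and runs in a suitable operating environment.
This thesis considers multi-core processor design for embedded systems and discusses how a multi-core architecture can effectively improve single-threaded performance while remaining compatible with existing programs. To reduce the complexity of the multi-core design problem, it uses a dual-core architecture to discuss how to construct instruction execution paths shaped by data dependences and control dependences. The design therefore focuses on:
1. Building a simple out-of-order execution core.
2. Designing an instruction analyzer capable of dynamic scheduling.
3. Providing a mechanism for sharing operands across cores.
4. Committing instructions with synchronization detection between the two cores.
Each core in this architecture is a single-issue out-of-order instruction pipe. As instructions are fetched, the instruction analyzer writes instruction tags that reflect the data dependences among them, schedules the instructions dynamically, and dispatches them to the two cores. Inside a core, an instruction uses its tag to obtain the operands it needs from the other core, so operand data flows between the two cores and instructions can reach maximal parallelism on the dual-core architecture. Within a core, instructions execute in a dataflow-driven manner but complete in order, which preserves correct program execution. Based on the ARM instruction set, the thesis implements and examines how to build the instruction interaction control mechanism between the two cores and how to improve the performance of a single thread on this architecture.
For performance evaluation, a behavioral model of the superscalar dual-core architecture was written in software and functionally verified by trace-driven simulation. The MediaBench suite was used as the benchmark for performance simulation. Compared with a single-core five-stage pipelined architecture, the simulation results show an average performance improvement of more than 1.4 times.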
With the improvement in VLSI technology, integrating multiple processor cores on a single chip has become easier, and more and more users run applications on multi-core architectures. A multi-core system performs very well on multi-threaded applications, but it gains no performance on single-threaded applications. This thesis proposes a multi-core architecture that enhances single-threaded performance in embedded systems, and focuses on four points:
1. Construct a simple out-of-order execution core.
2. Design a dynamically scheduled instruction analyzer.
3. Design a mechanism for sharing operands between two cores.
4. Design a mechanism for committing instructions synchronously between two cores.
Each core is a single-issue out-of-order instruction pipe. First, the instruction analyzer fetches instructions and generates instruction dependence tags by detecting the dependences among the fetched instructions; it then schedules the instructions dynamically and dispatches them to the cores. Within a core, the tag tells an instruction where to obtain its required operands, which allows data to be shared between the two cores. Instructions execute in a data-driven manner but complete in order, which preserves the correctness of program order. Based on the ARM instruction set, this thesis explores how to implement interaction control mechanisms between the two cores and how to accelerate a single thread on the dual-core architecture.
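The tagging and dispatch step described above can be illustrated with a minimal Python sketch. It is not the thesis's design: the `Instr` record, the tag layout (producer index plus producer core), and the dispatch policy (follow the first producer's core, otherwise pick the less loaded core) are all assumptions made for illustration, and commit is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    """A simplified register instruction (hypothetical format, not ARM encoding)."""
    text: str
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()
    core: int = -1  # filled in by analyze()
    # One tag per source register: (producer index, producer core),
    # or None when no earlier instruction in the list writes that register.
    tags: Dict[str, Optional[Tuple[int, int]]] = field(default_factory=dict)


def analyze(program: List[Instr]) -> List[Instr]:
    """Tag data dependences, then dispatch each instruction to core 0 or 1."""
    last_writer: Dict[str, int] = {}  # register -> index of its latest producer
    load = [0, 0]                     # instructions dispatched to each core so far
    for idx, ins in enumerate(program):
        for reg in ins.srcs:
            producer = last_writer.get(reg)
            ins.tags[reg] = (
                None if producer is None else (producer, program[producer].core)
            )
        # Assumed policy: follow the first producer's core, else balance the load.
        producer_cores = [t[1] for t in ins.tags.values() if t is not None]
        if producer_cores:
            ins.core = producer_cores[0]
        else:
            ins.core = 0 if load[0] <= load[1] else 1
        load[ins.core] += 1
        if ins.dest is not None:
            last_writer[ins.dest] = idx
    return program


def remote_operands(ins: Instr) -> List[str]:
    """Source registers whose producer was dispatched to the other core."""
    return [r for r, t in ins.tags.items() if t is not None and t[1] != ins.core]


if __name__ == "__main__":
    demo = analyze([
        Instr("MOV r0, #1", "r0"),
        Instr("MOV r1, #2", "r1"),
        Instr("ADD r2, r0, r0", "r2", ("r0",)),
        Instr("ADD r3, r1, r1", "r3", ("r1",)),
        Instr("ADD r4, r2, r3", "r4", ("r2", "r3")),
    ])
    for i in demo:
        print(i.text, "-> core", i.core, "remote:", remote_operands(i))
```

In this sketch the two independent chains land on different cores, and the final `ADD` needs one operand from the other core, which is the situation the cross-core operand sharing mechanism is meant to handle.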
We wrote a simulation model of the proposed architecture in C as our trace-driven simulation framework and selected the MediaBench suite for the experiments. According to the simulation results, the architecture achieves an average performance speedup of 40% compared with a five-stage pipelined architecture.
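The Chinese abstract reports the same result as an improvement of 1.4 times, and the two figures agree if speedup is taken as the ratio of baseline to new execution time. The short sketch below shows that arithmetic; the cycle counts are invented placeholders, not measurements from the thesis.

```python
def speedup(baseline_cycles: int, new_cycles: int) -> float:
    """Speedup as the ratio of baseline cycles to cycles on the new design."""
    return baseline_cycles / new_cycles


def percent_improvement(ratio: float) -> float:
    """Express a speedup ratio as a percentage gain over the baseline."""
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical cycle counts chosen only to illustrate a 1.4x ratio.
    ratio = speedup(baseline_cycles=1400, new_cycles=1000)
    print(f"speedup {ratio:.2f}x, improvement {percent_improvement(ratio):.0f}%")
```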
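Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Motivation
1-2 Research Objectives
1-3 Thesis Organization
Chapter 2 Related Work
2-1 Single-Core Architectures
2-2 Superscalar Processors
2-2-1 Statically Scheduled Superscalar Processors
2-2-2 Dynamically Scheduled Superscalar Processors
2-2-3 Pre-Execution Superscalar Processors
2-3 Multi-Core Architectures
Chapter 3 Design of the Single-Issue Out-of-Order Superscalar Dual-Core Architecture
3-1 Single-Issue Out-of-Order Superscalar Dual-Core Architecture
3-2 Design of the Instruction Analyzer
3-3 Design of the Single-Issue Out-of-Order Core Architecture
3-3-1 Fetch Stage
3-3-2 Data Stage
3-3-3 Memory Stage
3-3-4 Commit Stage
Chapter 4 Simulation and Analysis
4-1 Architecture Verification
4-1-1 Example Program: Sum of Odd Numbers and Sum of Even Numbers
4-1-2 Example Program: Matrix Multiplication
4-1-3 Comparison of the Example Programs on Other Architectures
4-2 Performance Simulation
4-2-1 Simulation Environment
4-2-2 Simulator Implementation
4-2-3 Benchmark Programs
4-3 Analysis of Simulation Results
Chapter 5 Conclusion
References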