臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

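Author: 陳昀徽 (Yun-Huei Chen)
Thesis title: 瞬間指令完成計數:一個同步多線程的提取引擎
Thesis title (English): ICC: A Simultaneous Multithreading Fetch Engine
Advisor: 謝忠健 (Jong-Jiann Shieh)
Degree: Master's
Institution: 大同大學 (Tatung University)
Department: 資訊工程學系(所) (Department of Computer Science and Engineering)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: English
Number of pages: 49
Chinese keywords: 提取策略; 同步多線程; 提取單元
English keywords: Fetch Policy; Fetch Unit; Simultaneous Multithreading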
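Abstract (translated from the Chinese):
Simultaneous multithreading (SMT) is a technique that allows instructions from different independent applications or threads to be issued in the same cycle. The fetch unit has long been regarded as the main bottleneck of SMT, and many earlier studies have proposed fetch policies to improve fetch efficiency and overall performance.
In this thesis, we propose a new fetch policy called Instantaneous Commit Count (ICC). It counts the number of instructions each thread commits in every clock cycle and uses this information to decide which threads to fetch from in the next cycle. We also combine this fetch policy with branch mechanisms called fetch bias (提取偏向, FB) and fetch gating with prioritization (提取閘控優選, FGAP) to build a more efficient fetch unit. Simulation results show that overall performance improves by about 13%, while issue queue usage and the number of wrong-path instructions fetched both decrease. In addition, we present the state of load balance, a topic that earlier related work has not discussed in detail.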
Abstract (English original):
Simultaneous Multithreading (SMT) is a technique that permits multiple instructions from multiple independent applications or threads to issue each cycle. Because the fetch unit has been identified as one of the major bottlenecks of the SMT architecture, prior works have proposed several fetch schemes to enhance fetching efficiency and overall performance.
In this thesis, we propose a novel fetch policy called Instantaneous Commit Count (ICC), which counts each thread's retired instructions every cycle and then selects which threads to fetch from in the next cycle. We also combine this scheme with branch mechanisms, named FB and FGAP, to construct an effective fetch unit. Simulation results show that overall performance improves by about 13% in speedup, that the required issue queue size is reduced, and that fewer wrong-path instructions are fetched. Furthermore, we show the state of load balance, which prior works have not discussed in detail.
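The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the per-cycle commit counts are turned into a fetch order, so the rule used here (threads that retired more instructions this cycle are favored, with ties going to the lower thread ID) is an assumption, as are the function name `select_fetch_threads`, the parameter `num_to_fetch`, and the sample counts. It is not the thesis's actual implementation.

```python
from typing import List


def select_fetch_threads(commit_counts: List[int], num_to_fetch: int = 2) -> List[int]:
    """Pick the thread IDs to fetch from in the next cycle.

    commit_counts[tid] is the number of instructions thread `tid`
    retired in the current cycle.

    Assumption (not stated in the abstract): a higher commit count means
    higher fetch priority; ties are broken by the lower thread ID.
    """
    ranked = sorted(
        range(len(commit_counts)),
        key=lambda tid: (-commit_counts[tid], tid),
    )
    return ranked[:num_to_fetch]


# Hypothetical per-thread commit counts for one cycle (made-up values).
counts = [1, 4, 0, 3]
chosen = select_fetch_threads(counts, num_to_fetch=2)
print(chosen)
```

With these made-up counts, threads 1 and 3 retired the most instructions, so they would be the two threads chosen for the next fetch cycle under the assumed rule.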
ENGLISH ABSTRACT
CHINESE ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
1.1 Simultaneous Multithreading Architecture
1.2 Bottlenecks of SMT
1.3 The Thesis Organization
CHAPTER 2 RELATED WORKS
2.1 Fetch Policy
2.2 Branch Prediction Mechanism
2.2.1 Biased Branch Filter
2.2.2 Confidence Estimator
2.2.3 Integrated Branch Prediction Mechanism
CHAPTER 3 FETCH POLICY ON SMT
3.1 Base Fetch Scheme
3.2 Our Proposed Fetch Policy
3.3 Dynamically Speculative Controlled Fetch Policy
CHAPTER 4 METHODOLOGY
CHAPTER 5 SIMULATION RESULTS
5.1 Experiment Results
5.2 Load Balance on SMT
CHAPTER 6 CONCLUSIONS
REFERENCE
APPENDIX A
A.1 Original ICOUNT Code
A.2 Modified ICOUNT Code