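Graduate student: 陳永志 (Chen, Yung-Chih)
Thesis title: Energy Efficiency Measurement of Tolerating Load Miss Latency Techniques - Using Wattch Power Model
Thesis title (Chinese): 容忍資料讀取失誤延遲技術的能耗效率實測-使用Wattch功耗模型
Advisor: 鍾崇斌 (Chung, Chung-Ping)
Oral defense committee: 鍾崇斌 (Chung, Chung-Ping); 范倫達 (Van, Lan-Da); 謝萬雲 (Shieh, W. Y.)
Oral defense date: 2017-07-13
Degree: Master's
Institution: National Chiao Tung University (國立交通大學)
Department: College of Computer Science, Computer Science Program (資訊學院資訊學程)
Discipline: Computer Science (電算機學門)
Field: General Computer Science (電算機一般學類)
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 105 (ROC calendar; 2016-2017)
Language: English
Pages: 84
Keywords (translated from the Chinese): energy efficiency; power model; data load miss latency
Keywords (English): Wattch; Energy Efficiency; SimpleScalar; Preserving Buffer; Load Miss Latency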
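Abstract (translated from the Chinese):
Although modern processors may differ in architectural design, they are usually built from many common basic hardware units, so the energy consumption of these units can be simulated and evaluated with one common power model.
By applying the Wattch power model to different techniques for tolerating data load miss latency, this study measures the performance and energy efficiency of these techniques in order to confirm how computer architecture designs differ in computing performance and energy efficiency.
From the measured results we observe that these techniques achieve good energy efficiency improvements on benchmarks that incur a higher ratio of LLC data load miss instructions and more stalled execution cycles; on other benchmarks, energy efficiency may degrade to a degree that varies by technique.
We also find that, compared with designs that extend the instruction window hardware, Preserving Buffer based designs offer better hardware extension feasibility at an affordable cost in energy efficiency.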
Abstract (English):
Although modern processors differ in architectural design, they are usually built from a common set of basic hardware units, so the energy consumption of these units can be simulated and evaluated with a single common power model.
By applying the Wattch power model to the performance simulators of several techniques for tolerating load miss latency, this study measures the performance and energy efficiency of these techniques to confirm how different computer architecture designs differ in both.
The measured results show that latency-tolerant techniques improve energy efficiency on benchmarks that have a higher LLC data load miss ratio and more blocking cycles; on other benchmarks, these techniques may degrade energy efficiency.
We also find that the PB-based (Preserving Buffer) design offers good hardware extension feasibility, at an affordable cost in energy efficiency, compared with the EIW-based (Extend Instruction Window) design.
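The record does not state which energy efficiency metric the thesis adopts. As a rough illustration only, the sketch below assumes two widely used metrics, the energy-delay product and energy times delay squared (ET^2), and compares a latency-tolerant design against a baseline. The `Run` type, the function names, and every number are hypothetical placeholders, not results from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One simulated run; the values used below are hypothetical placeholders."""
    energy_joules: float  # total energy, e.g. as reported by a power model
    cycles: int           # execution time in CPU cycles


def edp(run: Run) -> float:
    """Energy-delay product; lower is better."""
    return run.energy_joules * run.cycles


def et2(run: Run) -> float:
    """Energy times delay squared; weights delay more heavily than EDP."""
    return run.energy_joules * run.cycles ** 2


def improvement(baseline: Run, design: Run, metric) -> float:
    """Relative improvement of `design` over `baseline`; positive means better."""
    return 1.0 - metric(design) / metric(baseline)


# Hypothetical numbers: the design costs 10% more energy in both cases,
# but only the memory-bound workload gets a large cycle reduction.
baseline = Run(energy_joules=10.0, cycles=1_000_000)
memory_bound = Run(energy_joules=11.0, cycles=800_000)
compute_bound = Run(energy_joules=11.0, cycles=990_000)

print(f"memory-bound EDP improvement:  {improvement(baseline, memory_bound, edp):+.3f}")
print(f"compute-bound EDP improvement: {improvement(baseline, compute_bound, edp):+.3f}")
```

With these made-up inputs the memory-bound case comes out positive and the compute-bound case negative, which mirrors the qualitative pattern the abstract describes: extra hardware pays off only where load misses dominate execution time.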
1. Introduction
1.1 Observation of Tolerating Load Miss Latency Techniques
1.2 Energy Efficiency Measurement and Evaluation of the Computer Architecture
1.3 Achievement
1.4 Organization of this Thesis
2. Background
2.1 Wattch Power Model Introduction
2.2 Energy Efficiency Metric Study
2.3 Tolerating Data Load Miss Latency Techniques
2.3.1 Direct Extend Instruction Window Size (EIW)
2.3.2 Dynamic Instruction Window Resizing
2.3.3 Runahead Execution
2.3.4 Preserving Buffer Design
2.3.4.1 Preserving Buffer (PB1T)
2.3.4.2 Concurrent Two-thread Preserving Buffer (PB2T)
3. Measurement of Performance and Energy
3.1 The Measuring Procedure of the Computer Architecture Energy Efficiency
3.2 Implementation of specific design’s power+performance simulator by using the Wattch power model
3.2.1 Identify the main different hardware components of different tolerating load miss latency techniques
3.2.2 Power Model Construction of additional components for different tolerating load miss latency techniques
3.2.2.1 EIW (Extend Instruction Window)
3.2.2.2 DIWR (Dynamic Instruction Window Resizing)
3.2.2.3 RA (Run-Ahead Execution)
3.2.2.4 PB1T (Preserving Buffer one Thread)
3.2.2.5 PB2T (Concurrent Two-thread Preserving Buffer)
3.3 Measurement of Different Evaluation Metrics
3.3.1 Measurement of Overall Performance
3.3.2 Measurement of Memory Level Parallelism Improvement
3.3.2.1 Measure MLP Improvement of EIW based design
3.3.2.2 Measure MLP Improvement of RA based design
3.3.3 Measurement of Total Energy
3.3.4 Measurement of Total Energy with Power Gating
3.3.5 Measurement of Energy Efficiency
3.4 Workloads Selection and Analysis for Latency-Tolerant Techniques
3.5 A Tool for Helping Execution Result Analysis
4. Experimental Results
4.1 Methodology
4.1.1 Benchmark and Baseline Configurations
4.2 Performance and energy efficiency Measurement Result for each design
4.2.1 EIW (Extend Instruction Window)
4.2.2 DIWR (Dynamic Instruction Window Resizing)
4.2.3 RA (Run-Ahead Execution)
4.2.4 PB1T (Preserving Buffer one Thread)
4.2.5 PB2T (Concurrent Two-thread Preserving Buffer)
4.3 Measurement Result Comparison between latency-tolerant techniques
4.3.1 Performance Improvement Comparison
4.3.2 Energy Consumption Increase Comparison
4.3.3 Energy Efficiency Improvement Comparison
4.4 Analysis of the benchmarks that do not match the expected results
4.4.1 Measurement of the CPU cycle intervals between two LLC data load miss instructions
4.4.2 Measurement of the number of blocking CPU cycles caused by each LLC data load miss instruction
5. Conclusion and Future Work
6. Reference
[1] Suzanne Marion Rivoire, “Models and metrics for energy-efficient computer systems,” Stanford University, Stanford, CA, 2008.
[2] Doug Burger and Todd M. Austin, “The SimpleScalar tool set, version 2.0,” ACM SIGARCH Computer Architecture News, 25(3):13–25, June 1997.
[3] David Brooks, Vivek Tiwari, and Margaret Martonosi, “Wattch: a framework for architectural-level power analysis and optimizations,” in ACM SIGARCH Computer Architecture News, Volume 28, Issue 2, 2000.
[4] Yuya Kora, Kyohei Yamaguchi, and Hideki Ando, “MLP-aware dynamic instruction window resizing for adaptively exploiting both ILP and MLP,” in Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, December 07-11, 2013.
[5] O. Mutlu, J. Stark, C. Wilkerson, and Y. N. Patt, “Runahead Execution: An alternative to very large instruction windows for out-of-order processors,” in HPCA ’03: Proceedings of the 9th International Symposium on High-Performance Computer Architecture. Washington, DC, USA: IEEE Computer Society, 2003, p. 129.
[6] W. Y. Li, C.-L. Huang, and C.-P. Chung, “Tolerating Load Miss-Latency by Extending Effective Instruction Window with Low Complexity,” in Parallel Processing (ICPP), 2011 International Conference on. IEEE, 2011, pp. 83–92.
[7] M. Horowitz, T. Indermaur, and R. Gonzalez, “Low-power digital design,” in Proceedings of the IEEE Symposium on Low Power Electronics, 1994.
[8] Kirk W. Cameron, Rong Ge, Xizhou Feng, Drew Varner, and Chris Jones, “High-performance, Power-aware Distributed Computing Framework” (poster), in Proceedings of the 16th ACM/IEEE International Conference on High Performance Computing and Communications (SC 2004), 2004.
[9] Alain J. Martin, Mika Nyström, and Paul I. Pénzes, “ET^2: A Metric For Time and Energy Efficiency of Computation,” California Institute of Technology, Pasadena, CA, 2001.
[10] W. Y. Li and C.-P. Chung, “Concurrent Latency-Tolerant Execution and Visualized Analysis,” National Chiao Tung University, Department of Computer Science (in press).
[11] S. Thoziyoor, J. Ahn, M. Monchiero, J. Brockman, and N. Jouppi, “A Comprehensive Memory Modeling Tool and its Application to the Design and Analysis of Future Memory Hierarchies,” in ISCA, 2008.
[12] S. Palacharla, N. P. Jouppi, and J. E. Smith, “Complexity-effective superscalar processors,” in ISCA ’97: Proceedings of the 24th annual international symposium on Computer architecture. New York, NY, USA: ACM Press, 1997, pp. 206–218.