National Digital Library of Theses and Dissertations in Taiwan

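Author: Li, Jing Cheng (黎晉丞)
Title: An Efficient Architecture for Resolving the Aging Problem of Snoop Filter (解決探聽過濾器過時化問題的高效架構)
Advisor: Chang, Shih Chieh (張世杰)
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Field: Engineering
Discipline: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2015
Graduation academic year: 103 (ROC calendar)
Language: English
Pages: 42
Keywords (Chinese): 探聽式一致性協定; 探聽式過濾器; 過濾器復興
Keywords (English): Snoop-based coherence protocol; Snoop filter; Filter rejuvenation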
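Cache coherence is the mechanism that keeps shared data held in caches consistent. Among coherence schemes, the snoop-based coherence protocol is widely used in multiprocessor system-on-chip applications because of its simplicity. To respond to each snoop request, the cache controller performs a cache tag lookup, checking the request against the tags of the cache lines to determine whether the data is present in the cache. Previous studies report that, because the amount of data shared among different nodes is limited, about 90% of snoop requests are redundant. These redundant requests waste energy, since each one still forces the cache controller to perform a tag lookup. Snoop filters were therefore proposed to screen out useless snoop requests. A snoop filter must compress the addresses of all data the cache has loaded into the filter. Because of this compression, the filter can make wrong decisions, known as false positives: a false-positive request passes the filter and reaches the cache, and only the tag lookup reveals that the request was redundant. Over time, the large amount of compressed data accumulated in the filter raises the probability of false positives, so an inefficient, aged filter causes many wasted tag lookups.
To address the problems caused by an inefficient aging filter, IBM proposed a method for refreshing the filter, with the refresh triggered when a cache wrap occurs. If cache wraps occur too far apart, the filter starts to lose efficiency, and the refresh may even fail to achieve its purpose. We find that in some applications (SPLASH-2), cache wraps take a long time to occur while the filter's false-positive probability rises. This thesis therefore focuses on how to refresh an aged filter rather than on how to design a filter, and proposes a filter rejuvenation technique to solve the problems caused by an inefficient aging filter.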

The snoop-based coherence protocol is very popular in multiprocessor systems because of its simplicity. In a snoop-based system, many cache tag lookups are needed to serve snoop requests. However, it has been shown that about 90% of snoop requests are useless, so the corresponding cache lookups are redundant. The snoop filter scheme was proposed to reduce these unnecessary cache lookups. It is known, however, that the efficiency of a snoop filter decreases with time: an aging filter fails to filter out unnecessary requests. To solve the problem of an aging snoop filter, [8] proposed a novel way to rejuvenate it so that the filter regains high efficiency. We observe that in several real designs, [8] fails to achieve effective rejuvenation. In this paper, we focus on how to rejuvenate a snoop filter rather than on how to design the snoop filter itself. We propose four filter rejuvenation techniques for an aging snoop filter. Our experimental results show that the proposed techniques, when working together, reduce the number of unnecessary requests to 62.23% and the energy consumption to 67.58% on average. In the best case, the number of unnecessary requests is reduced to approximately 30% compared with [8].
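
The aging and rejuvenation behavior described in the abstract can be illustrated with a small Python model. This is a minimal sketch under stated assumptions, not the thesis's architecture or the exact design of [8]: the filter is modeled as a plain Bloom filter, the cache is direct-mapped, a cache wrap is taken to mean that every line has been replaced since the last rejuvenation, and all names (`BloomFilter`, `SnoopedCache`, `run`) and the hash constants are hypothetical.

```python
import random


class BloomFilter:
    """Tiny Bloom filter over integer line addresses (illustrative only)."""

    def __init__(self, num_bits, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, addr):
        # Hypothetical hash family: multiply by distinct odd constants.
        return [((addr * (2 * k + 3) * 2654435761) >> 7) % self.num_bits
                for k in range(self.num_hashes)]

    def add(self, addr):
        for p in self._positions(addr):
            self.bits[p] = True

    def may_contain(self, addr):
        return all(self.bits[p] for p in self._positions(addr))


class SnoopedCache:
    """Direct-mapped cache guarded by a snoop filter (assumed model)."""

    def __init__(self, num_lines, filter_bits, rejuvenate):
        self.num_lines = num_lines
        self.filter_bits = filter_bits
        self.rejuvenate = rejuvenate
        self.tags = [None] * num_lines
        self.replaced = [False] * num_lines  # replaced since the last wrap?
        self.history = BloomFilter(filter_bits)  # fills before the last wrap
        self.active = BloomFilter(filter_bits)   # fills since the last wrap
        self.false_positives = 0

    def fill(self, addr):
        """Load a line into the cache and record its address in the filter."""
        idx = addr % self.num_lines
        self.tags[idx] = addr
        self.active.add(addr)
        self.replaced[idx] = True
        if self.rejuvenate and all(self.replaced):
            # Cache wrap: every cached line was filled since the last wrap,
            # so `active` already covers the whole cache and the older
            # `history` contents can be discarded safely.
            self.history = self.active
            self.active = BloomFilter(self.filter_bits)
            self.replaced = [False] * self.num_lines

    def snoop(self, addr):
        """Return True if the snooped line is present in the cache."""
        if not (self.history.may_contain(addr)
                or self.active.may_contain(addr)):
            return False  # filtered out: no tag lookup needed
        hit = self.tags[addr % self.num_lines] == addr
        if not hit:
            self.false_positives += 1  # wasted tag lookup
        return hit


def run(rejuvenate, num_lines=64, filter_bits=1024, steps=4000):
    """Stream through memory locally while remote cores snoop other lines."""
    cache = SnoopedCache(num_lines, filter_bits, rejuvenate)
    rng = random.Random(0)
    for step in range(steps):
        cache.fill(step)
        cache.snoop(rng.randrange(1_000_000, 2_000_000))  # never cached here
    return cache.false_positives


if __name__ == "__main__":
    print("false positives, aging filter:      ", run(rejuvenate=False))
    print("false positives, rejuvenated filter:", run(rejuvenate=True))
```

In this model, the bits set in either filter of the rejuvenating configuration are always a subset of the bits set in the never-cleared filter, so on the same workload the rejuvenating configuration cannot report more false positives than the aging one. The sketch also hints at why a slow cache wrap hurts: if even one line is never replaced, `all(self.replaced)` never becomes true and the filter keeps aging.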
CONTENTS
CHINESE ABSTRACT III
ABSTRACT VI
CONTENTS VII
LIST OF TABLES IX
LIST OF FIGURES X
Chapter 1 INTRODUCTION 1
Chapter 2 BACKGROUND AND MOTIVATION 6
2.1 Snoop Filter 6
2.1.1 Removing addresses from snoop filter 8
2.2 Filter Rejuvenation 8
2.3 Motivation 10
2.3.1 Slow cache wrap 10
2.3.2 Stubborn set 12
Chapter 3 FILTER REJUVENATION TECHNIQUE 14
3.1 Architecture Assumption 14
Chapter 4 FOUR TYPES OF REJUVENATION TECHNIQUES 15
4.1 Type 0 rejuvenation technique 15
4.2 Type 1 rejuvenation technique 16
4.3 Type 2 rejuvenation technique 17
4.3.1 Self-invalidation of a cache line 18
4.3.2 Problems of self-invalidation 18
4.3.3 No stubborn line in L1 19
4.3.4 L1 cache wrap detection 21
4.3.5 Reduce redundant learnings 23
4.4 Type 3 rejuvenation technique 23
4.4.1 L1 cache hardware modification 24
4.5 Cache wrap condition 26
Chapter 5 INTEGRATION FOR FOUR FILTER REJUVENATION TECHNIQUES 28
5.1 Type 0 state 28
5.2 Type 1 state 28
5.3 Type 2 state or type 3 state 29
Chapter 6 EXPERIMENTAL RESULTS 31
6.1 Architectural Simulation Setup 31
6.2 Filter rejuvenation analysis 32
6.2.1 False positive analysis 33
6.2.2 Filter learning analysis 35
Chapter 7 RELATED WORK 37
Chapter 8 CONCLUSIONS 40
REFERENCE 41

LIST OF TABLES
Table 1: Conditions of cache wrap for all four techniques 26
Table 2: SPLASH-2 benchmark characteristics 31
Table 3: Multicore architecture modeled for SESC 32
Table 4: Number of cache wraps 34


LIST OF FIGURES
Figure 1: Cache system with a snoop filter 2
Figure 2: Snoop filter false positive rate with cache wrap segment 10
Figure 3: Two stubborn set examples 11
Figure 4: Memory architecture assumption 13
Figure 5: Type 1 rejuvenation technique architecture 17
Figure 6: Type 2 rejuvenation technique architecture 20
Figure 7: Type 3 rejuvenation technique architecture 24
Figure 8: Two kinds of the state machines to integrate our filter rejuvenation techniques 29
Figure 9: SPLASH-2 benchmarks with our filter rejuvenation techniques 36
Figure 10: Snoop based system with source-based snoop filters 38
Figure 11: Snoop based system with destination-based snoop filters 49

REFERENCES
[1] E. Atoofian and A. Baniasadi, “Using supplier locality in power-aware interconnects and caches in chip multiprocessors,” J. Systems Architecture 54(5): 507-518, 2008.
[2] E. Atoofian, A. Baniasadi and K. Aasaraai, “Speculative supplier identification for reducing power of interconnects in snoopy cache coherence protocols,” CF 2007: 259-266.
[3] M. Blumrich, V. Salapura and A. Gara, “Exploring the architecture of a stream register-based snoop filter,” 2011.
[4] A. Moshovos, “RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence,”
[5] A. Moshovos, G. Memik, B. Falsafi and A. Choudhary, “JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers,” HPCA, 2001.
[6] J. Nilsson, A. Landin and P. Stenstrom, “The Coherence Predictor Cache: A Resource-Efficient and Accurate Coherence Prediction Infrastructure,” IPDPS, 2003.
[7] J. Renau et al. SESC simulator, January 2005. http://sesc.sourceforge.net.
[8] V. Salapura, M. A. Blumrich and A. Gara, “Design and implementation of the Blue Gene/P snoop filter,” HPCA, 2008.
[9] V. Salapura, M. Blumrich and A. Gara, “Improving the accuracy of snoop filtering using stream registers,” MEDEA, 2007.
[10] J. P. Singh, W.-D. Weber and A. Gupta, “SPLASH: Stanford Parallel Applications for Shared-Memory,” Computer Architecture News, 1992.
[11] D. Tarjan, S. Thoziyoor and N. P. Jouppi, “CACTI 4.0,” Technical report, Compaq Research Lab, 2006.
[12] R. Ulfsnes, “A survey of low power design techniques for cache coherency in multiprocessor memory systems,” Semester project NTNU, 2012.
[13] S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta, “The SPLASH-2 Programs: Characterization and Methodological Considerations,” in 22nd International Symposium on Computer Architecture (ISCA), 1995.
[14] D. H. Woo, M. Ghosh, E. Ozer, S. Biles and H.-H. S. Lee, “Reducing Energy of Virtual Cache Synonym Lookup using Bloom Filters,” CASES, 2006.
