(3.238.249.17) 您好!臺灣時間:2021/04/12 13:03
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

我願授權國圖
: 
twitterline
研究生:吳佳倫
研究生(外文):Chia-Lun Wu
論文名稱:自適應布穀鳥過濾器: 避免非必要的硬碟操作
論文名稱(外文):Adaptive Cuckoo Filters: Avoiding Trips to Disk
指導教授:陳銘憲陳銘憲引用關係
指導教授(外文):Ming-Syan Chen
口試委員:吳尚鴻陳建錦彭文志帥宏翰
口試委員(外文):Shang-Hung WuChen-Chin ChenWen-Chih PengHong-Han Shuai
口試日期:2016-07-21
學位類別:碩士
校院名稱:國立臺灣大學
系所名稱:電機工程學研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2016
畢業學年度:104
語文別:英文
論文頁數:32
中文關鍵詞:布隆過濾器布穀鳥過濾器自適應
外文關鍵詞:Bloom filtersCuckoo filtersAdaptive
相關次數:
  • 被引用被引用:0
  • 點閱點閱:90
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
布隆過濾器已經長時間地在各個領域被用來做快速的近似集合成員測試因為它具有簡單但有效的設計。近年來,一種新型態的過濾器 – 布穀鳥過濾器被提出,布穀鳥過濾器支援刪除元素的動作,而且不管在時間上或空間上都比標準的布隆過濾器以及能支援刪除動作的布隆過濾器變形在實務上有更好的表現。然而,在OLTP 資料庫的使用情境中,物件的存取分佈通常是高度不均勻的,也就是某些物件被存取的比其他物件多很多,而在這樣的存取模式中會讓傳統的布隆過濾器以及布穀鳥過濾器無法達到它們理論上該有的偽陽性比率,因為他們都是在假定物件的存取分佈是均勻的情況所設計。
在這篇論文中,我們基於傳統的布穀鳥過濾器而設計了一種新的資料結構 – 自適應性布穀鳥過濾器。我們採用一系列小的布穀鳥過濾器,而在存取分佈不均勻時動態調整部分布穀鳥過濾器的大小以達到比傳統布穀鳥過濾器更低的偽陽性比率。這篇論文展示詳盡的實驗結果,包含ACF 的空間使用,偽陽性比率與速度效能。此外,我們也介紹ACF 如何被應用在一個物聯網資料庫引擎中並且在真實的資料存取模式中有很好的表現。

Bloom filters have been used for fast approximate set membership tests in various areas in a long history because of its compact and simple design. Recently, a newly proposed data structure - Cuckoo filter supports dynamic deletion of elements and has practically better performance in both time and space than Bloom filter and its variants. However, in the scenario of OLTP databases, the access workload is often skewed and will make both Bloom filter and Cuckoo filter fail to achieve their target false positive rate which is calculated in the assumption that the workload is uniform distributed.
In this thesis, we present a new data structure called Adaptive Cuckoo Filters (ACF) which can exploit the skewed access pattern and dynamically adjust the size of a list of cuckoo filters to achieve smaller false positive rate than a single cuckoo filter. This thesis also shows the results of comprehensive experiments covering space, precision and speed of ACF. Furthermore, we show how ACF can be applied to an IoT database engine and achieve better performance in real workload.

口試委員會審定書 - i
Acknowledgements - ii
中文摘要 - iii
ABSTRACT - iv
CONTENTS - v
LIST OF FIGURES - vi
Chapter 1 Introduction - 1
Chapter 2 Application Background - 3
Chapter 3 Related Work - 4
3.1 Bloom Filters and Variants - 4
3.2 Cuckoo Filter - 5
3.3 Split Bloom Filters - 6
Chapter 4 Adaptive Cuckoo Filters - 8
4.1 Overview - 8
4.2 Data Structure - 10
4.3 Operations - 11
4.4 Adaptive Strategy - 14
4.5 Negative Cache - 17
Chapter 5 Experiments - 18
5.1 Benchmark Environment - 18
5.2 Software/Hardware Settings - 19
5.3 Experiment 1: False Positives - 20
5.4 Experiment 2: Speed - 22
5.5 Experiment 3: Grow/Shrink Buckets - 23
5.6 Experiment 4: Grow Threshold - 25
5.7 Experiment 5: Negative Cache - 27
5.8 Experiment 6: Combinations - 29
Chapter 6 Conclusion - 30
Bibliography - 31

[1] A. Rousskov and D. Wessels. 1998. Cache digests. In Computer Netw. ISDN Syst., vol. 30, no. 22–23, pp. 2155–2168.
[2] A. Malik and P. Lakshman. 2010. Cassandra: a decentralized structured storage system. In SI-GOPS Operating System Review, vol. 44, no. 2.
[3] A. Eldawy, J. J. Levandoski, and P. Larson. 2014. Trekking through Siberia: Managing cold da-ta in a memory-optimized database. In Proc. Int. Conf. Very Large Data Bases, pp. 931–942.
[4] B. H. Bloom. 1970. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426.
[5] Bruck, J., Gao, J., and Jiang, A. 2006. Weighted Bloom Filter. In IEEE International Symposium on Information Theory.
[6] Bin Fan, Dave G. Andersen, Michael Kaminsky, and Michael D. Mitzenmacher. 2014. Cuckoo filter: Practically better than Bloom. In Proc. 10th ACM Int. Conf. Emerging Networking Ex-periments and Technologies (CoNEXT ’14), pages 75–88, 2014. doi:10.1145/2674005. 2674994.
[7] C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. 2013. Hekaton: SQL Server’s memory-optimized OLTP engine. In SIGMOD, pages 1–12.
[8] D. N. Simha, M. Lu, and T.-c. Chiueh. 2012. An update aware storage system for low-locality update intensive workloads. In Proceedings of the International conference on Architectural Support for Programming Languages and Operating Systems, pages 375–386. ACM.
[9] F. Putze, P. Sanders, and S. Johannes. 2007. Cache-, hash- and space efficient bloom filters. In Experimental Algorithms, pages 108–121. Springer Berlin / Heidelberg.
[10] J. K. Mullin. 1990. Optimal semijoins for distributed database systems. In IEEE Trans. Soft-ware Eng., 16(5):558–560.
[11] L. Fan, P. Cao, J. Almeida, and A. Z. Broder. 1998. Summary cache: A scalable wide-area Web cache sharing protocol. In Proc. ACM SIGCOMM, Vancouver, BC, Canada.
[12] L. Sidirourgos and P.Å. Larson, Adjustable and Updatable Bloom Filters. Available from the authors.
[13] N.P.Jouppi. 1990. Improving direct-mapped cache performance by the addition of a small ful-ly-associative cache and prefetch buffers. In 17th Annual International Symposium on Computer Architecture. doi:10.1109/ISCA.1990.134547
[14] Zhong, M., Lu, P., Shen, K., and Seiferas, J. 2008. Optimizing Data Popularity Conscious Bloom Filters. In Proceedings of the 27th ACM symposium on Principles of Distributed Compu-ting.

QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關論文
 
無相關期刊
 
無相關點閱論文
 
系統版面圖檔 系統版面圖檔