National Digital Library of Theses and Dissertations in Taiwan

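Graduate student: Chi-Tsun Liao (廖啟村)
Thesis title: A Framework of Distributed Snapshots for Hadoop HBase
Thesis title (Chinese): Hadoop HBase的分散式快照架構
Advisor: Hung-Chang Hsiao (蕭宏章)
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Computer Science and Information Engineering, master's and doctoral program
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2013
Graduation academic year: 101
Language: English
Number of pages: 32
Keywords: HBase; distributed snapshots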
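Although Apache HBase™ is already an excellent distributed big data store, its support for restoring system state remains inflexible; for example, HBase cannot be restored to its state at a specified moment in the past. This thesis focuses on implementing a more flexible recovery mechanism in HBase, in four stages. First, HFiles and update command logs are recorded. Second, vector clocks are used to find consistent distributed snapshots. Third, a bulk load process reads the HFiles to rebuild HBase. Fourth, the update command logs covering the interval between the time of the backed-up HFiles and the time to which the table contents are to be restored are replayed. Finally, we use an application to demonstrate that the modified HBase can be restored to any distributed snapshot.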
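The vector-clock step can be illustrated with a minimal sketch. It is not taken from the thesis: the names `VectorClock`, `happened_before`, and `is_consistent_cut`, and the server IDs `rs1` and `rs2`, are assumptions made for illustration. The check applies the standard vector-clock condition for a consistent cut: no server in the snapshot may have observed more events from server `i` than server `i` itself has included.

```python
from typing import Dict

# Hypothetical representation: server id -> number of events seen from that server.
VectorClock = Dict[str, int]


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """Return True if the event stamped `a` causally precedes the event stamped `b`."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def is_consistent_cut(cut: Dict[str, VectorClock]) -> bool:
    """`cut` maps each server to the clock of its last event included in the snapshot.

    The cut is consistent when, for every pair of servers (i, j), server j has
    not seen more of server i's events than server i has included itself.
    """
    for i, clock_i in cut.items():
        for clock_j in cut.values():
            if clock_j.get(i, 0) > clock_i.get(i, 0):
                return False
    return True


# rs2 has seen one event from rs1, and rs1 includes two: consistent.
good = {"rs1": {"rs1": 2, "rs2": 0}, "rs2": {"rs1": 1, "rs2": 3}}
# rs2 has seen two events from rs1, but rs1 includes only one: inconsistent.
bad = {"rs1": {"rs1": 1, "rs2": 0}, "rs2": {"rs1": 2, "rs2": 3}}
```

With these inputs, `is_consistent_cut(good)` holds and `is_consistent_cut(bad)` does not, because the second cut would contain the receipt of a message whose sending event lies outside the snapshot.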
Apache Hadoop HBase™ is an emerging distributed, persistent key-value data store that can accommodate a large volume of data arriving rapidly from a variety of sources. Although the data objects stored in HBase are precious, HBase cannot perform parallel recovery, that is, it cannot consistently recover historical data objects that are stored concurrently across multiple storage servers. This study presents a framework for implementing a data recovery scheme in HBase. The framework consists of four components: (1) distributed snapshots represented by event logs gathered from internal (system) and external (client) operations, (2) a global time labeling scheme for correlated events, (3) a bulk load process for bootstrapping an HBase cluster with a given snapshot, and (4) a forward replaying mechanism for bringing the system precisely to any specified time instant in the past. We enhance HBase so that it is capable of performing parallel recovery, and we demonstrate our prototype implementation with performance results. In addition, an application that tracks multiple clients' locations is demonstrated on top of the prototype.
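Components (3) and (4) can be pictured with a small sketch. It is an illustration under simplifying assumptions, not the thesis's implementation: a plain dictionary stands in for the table state loaded from a snapshot, scalar integer timestamps stand in for the global time labels, and `LogEntry` and `restore` are hypothetical names. The example rows mimic the client-location application mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class LogEntry:
    ts: int                      # assumed scalar time label of the update
    op: str                      # "put" or "delete"
    row: str                     # row key, e.g. a client id
    value: Optional[str] = None  # new cell value for "put"


def restore(base: Dict[str, str], base_ts: int,
            log: Iterable[LogEntry], target_ts: int) -> Dict[str, str]:
    """Rebuild the table as of `target_ts` from a snapshot taken at `base_ts`."""
    if target_ts < base_ts:
        raise ValueError("target time precedes the snapshot")
    table = dict(base)  # stands in for bulk loading the snapshot
    # sorted() is stable, so entries with equal labels keep their log order.
    for entry in sorted(log, key=lambda e: e.ts):
        if base_ts < entry.ts <= target_ts:  # replay only the gap
            if entry.op == "put":
                table[entry.row] = entry.value
            elif entry.op == "delete":
                table.pop(entry.row, None)
            else:
                raise ValueError(f"unknown operation: {entry.op}")
    return table


snapshot = {"client-a": "Tainan"}
events = [
    LogEntry(9, "put", "client-z", "stale"),   # already covered by the snapshot
    LogEntry(11, "put", "client-b", "Taipei"),
    LogEntry(12, "put", "client-a", "Kaohsiung"),
    LogEntry(13, "delete", "client-b"),
]
state_at_12 = restore(snapshot, 10, events, 12)
```

Replaying forward from a snapshot, rather than undoing updates from the current state, means recovery needs only the snapshot and the log entries that fall between the snapshot time and the target time.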
ABSTRACT (IN CHINESE)
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Solutions in State-of-the-Art Products for Recovery
1.2 Research Issues
1.3 Our Proposal and Contributions
1.4 Roadmap
CHAPTER 2 RELATED WORK
2.1 Apache Hadoop
2.2 Apache HBase
2.3 Lamport Timestamps
2.4 Vector Clock
CHAPTER 3 OUR PROPOSED FRAMEWORK
3.1 State Gathering
3.1.1 HFile replication
3.1.2 Event log
3.2 Distributed Snapshots
3.2.1 Moving region
3.2.2 The implementation of the mark vector clock
3.2.3 Consistent distributed snapshots
3.3 Bulk Load and Log Replay
3.3.1 Bulk load process
3.3.2 Log replay process
CHAPTER 4 EVALUATION
4.1 System Deployment
4.2 Experiment
CHAPTER 5 APPLICATION OF THE CONSISTENT SNAPSHOTS
CHAPTER 6 SUMMARY
REFERENCES
[1] Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana Yerneni. “PNUTS: Yahoo!'s hosted data serving platform.” Proc. VLDB Endow. 1, 2, 2008, pp. 1277-1288.
[2] Cassandra. http://cassandra.apache.org/
[3] CDH. http://www.cloudera.com/content/cloudera/en/products/cdh.html
[4] Colin Fidge. “Timestamps in Message-Passing Systems that Preserve the Partial Ordering.” In Proceedings of the 11th Australian Computer Science Conference, February 1988, pp. 55-66.
[5] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. “Bigtable: a distributed storage system for structured data.” In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7 (OSDI '06), Vol. 7. USENIX Association, Berkeley, CA, USA, 2006, pp. 15-15.
[6] Friedemann Mattern. “Virtual Time and Global States of Distributed Systems.” In M. Cosnard et al., editor, Proceedings of the Workshop on Parallel and Distributed Algorithms, 1989, pp. 215-226.
[7] Hadoop. http://hadoop.apache.org/
[8] HBase. http://hbase.apache.org/
[9] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. “Spanner: Google's globally-distributed database.” In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation (OSDI '12). USENIX Association, Berkeley, CA, USA, 2012, pp. 251-264.
[10] Joseph M. Hellerstein, Michael Stonebraker, and James Hamilton. “Architecture of a Database System.” Now Publishers Inc., Hanover, MA, USA, 2007. Chapter 7.
[11] Leslie Lamport. “Time, clocks, and the ordering of events in a distributed system.” Communications of the ACM 21 (7), 1978, pp. 558-565.
[12] MongoDB. http://www.mongodb.org/
[13] Özalp Babaoğlu and Keith Marzullo. “Consistent global states of distributed systems: fundamental concepts and mechanisms.” In Distributed Systems (2nd Ed.), Sape Mullender (Ed.). ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1993, pp. 55-96.
[14] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. “The Google file system.” In Proceedings of the nineteenth ACM symposium on Operating systems principles (SOSP '03). ACM, New York, NY, USA, 2003, pp. 29-43.