National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

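Graduate student: Chih-Yuan Cho (卓志遠)
Thesis title: Performance Comparison of Hadoop Distributed File System and Ceph
Thesis title (original): Hadoop分散式檔案系統與Ceph效能比較
Advisor: Chao-Tung Yang (楊朝棟)
Oral defense committee: Kuan-Chou Lai (賴冠州), Kuo-Chen Hung (洪國禎), Jung-Chun Liu (劉榮春), Wen-Chung Shih (時文中)
Oral defense date: 2014-06-30
Degree: Master's
Institution: Tunghai University (東海大學)
Department: Department of Computer Science (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2014
Graduation academic year: 102 (ROC calendar)
Language: Chinese
Number of pages: 68
Keywords (Chinese): 雲端運算, 分散式檔案系統, Hadoop, Ceph, 雲端儲存
Keywords (English): Cloud Computing, Distributed File System, Hadoop, Ceph, Cloud Storage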
Cloud computing and services let users access a wide range of services at any time, from anywhere, on demand, and from any device. It is a model that provides convenient, on-demand access to computing resources offered over a network, including networks, servers, storage, applications, and services. The spread of cloud services has produced very large volumes of data, and the storage, processing, and analysis of massive data have become a key research direction. Storing and processing data at this scale is not feasible without distributed computing and a distributed file system. This thesis uses open-source software to compare two well-known distributed file systems, Hadoop (HDFS) and Ceph, analyzing their file upload and download performance, their ability to transfer small and large files, and their fault tolerance. Across 60 transfer tests with different file sizes, Ceph was clearly better than Hadoop in only 2; in the remaining tests Hadoop performed better. The results indicate that Hadoop, which is already deployed in industry, is currently the more stable and better-performing system, whereas Ceph is not yet recommended for production use and still has considerable room to grow.
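The comparison described in the abstract (timing uploads and downloads at several file sizes, then counting how often one system clearly beats the other) can be sketched as follows. This is a minimal illustration, not the thesis's actual test harness: the function names, the `transfer` callable, and the 5% margin used to decide a "clear" win are all assumptions introduced here, and no measured data from the thesis is reproduced.

```python
import time
from typing import Callable, Dict, Sequence


def time_transfer(transfer: Callable[[], None]) -> float:
    """Return the wall-clock seconds taken by one transfer.

    `transfer` is a hypothetical zero-argument callable that performs a
    single upload or download (for example, by invoking a client command).
    """
    start = time.perf_counter()
    transfer()
    return time.perf_counter() - start


def throughput_mb_s(size_bytes: int, seconds: float) -> float:
    """Convert a transfer size and duration into megabytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes / 1e6 / seconds


def tally_wins(times_a: Sequence[float], times_b: Sequence[float],
               margin: float = 0.05) -> Dict[str, int]:
    """Count paired trials in which one system is clearly faster.

    A system wins a trial only if its time is lower than the other's by
    more than `margin` (an assumed 5% threshold); otherwise it is a tie.
    """
    if len(times_a) != len(times_b):
        raise ValueError("both systems need the same number of trials")
    wins = {"a": 0, "b": 0, "tie": 0}
    for ta, tb in zip(times_a, times_b):
        if ta < tb * (1 - margin):
            wins["a"] += 1
        elif tb < ta * (1 - margin):
            wins["b"] += 1
        else:
            wins["tie"] += 1
    return wins
```

In use, each list passed to `tally_wins` would hold one timing per test case (for instance, one per file size and direction), with system "a" and system "b" measured under the same conditions.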
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Thesis Organization
2 Background
2.1 Shared Storage
2.2 Direct Attached Storage (DAS)
2.3 Network Attached Storage (NAS)
2.4 Network File System (NFS)
2.5 Distributed File System (DFS)
2.6 Google File System (GFS)
2.7 Hadoop Distributed File System (HDFS)
2.8 Ceph Distributed File Storage System
2.9 Comparison of Hadoop and Ceph
2.10 Virtualization
3 System Design and Implementation
3.1 HDFS Architecture in Hadoop
3.2 Ceph FileSystem Architecture in Ceph
3.3 System Installation
4 Experimental Environment and Results
4.1 Experimental Environment
4.2 Experimental Method
4.3 Experimental Results
5 Conclusions and Future Work
References
Appendix A Pre-installation Environment Preparation
Appendix B Hadoop HDFS Installation Steps
Appendix C Ceph Installation Steps