(34.226.234.102) 您好!臺灣時間:2021/05/12 10:00
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

: 
twitterline
研究生:嚴智民
研究生(外文):Yen, Chi Ming
論文名稱:基於服務品質與設備異質性考量之 高效能雲端資料備份演算法
論文名稱(外文):An Efficient Replication Algorithm for Cloud Computing Systems with QoS and Heterogeneous Considerations
指導教授:林振緯
指導教授(外文):Jenn-Wei Lin
口試委員:郭斯彥王國禎連國珍林振緯
口試委員(外文):Sy-Yen KuoKuo-Chen WangBrian LienJenn-Wei Lin
口試日期:2011.07.11
學位類別:碩士
校院名稱:輔仁大學
系所名稱:資訊工程學系
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2011
畢業學年度:99
語文別:中文
論文頁數:47
中文關鍵詞:雲端運算資料複製服務質量Hadoop設備異質性圖形問題
外文關鍵詞:Cloud computingdata replicationQoSdevice heterogeneitygraph problem
相關次數:
  • 被引用被引用:0
  • 點閱點閱:288
  • 評分評分:系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔系統版面圖檔
  • 下載下載:14
  • 收藏至我的研究室書目清單書目收藏:1
在近年來廣受歡迎的雲端運算領域當中,伺服器端需要處理大量的資料,這些數目龐大的資料必須被存放在雲端儲存中心當中以備使用,要如何去確保當雲端儲存中心故障發生時,資料的可用性還可以保持穩定的效能水平,這是一個非常重要並且實用的議題,在這一個議題當中,我們提到了雲端運算中資料的複製問題,在我們所熟知的計算機研究領域當中,資料複製是一個廣泛被探討以及研究的技術,不管是發展迅速的網際網路技術,還是現行的資料庫領域,資料複製的議題都是一項關鍵性的技術,非常適合學術單位來進行深入研究。
本篇論文所提出的方法,是實作一個有效率的資料副本演算法來給目前廣受歡迎的雲端技術使用,在其中我們特別考慮各種應用程式的QoS需求,例如資料回應時間。還有各種硬體設備的異質性特徵,例如不同的傳輸率、資料存取時間、儲存容量…等等。
我們所提出的演算法當中分兩個階段來進行:第一部分主要在說明資料副本的需求是透過一個存在權重的二分圖來說明,這一個二分圖當中考慮了應用程式QoS需求以及各種硬體設備的異質性特徵;在第二部分,在前述所提到的具有權重的二分圖將會被延伸為流量圖,利用所延伸的流量圖,我們將資料的副本問題由二分圖轉變成我們熟知的最小流量演算法來解決。
而根據上述所得到的最小流量演算法的解答,使我們所提出的演算法對於資料副本的需求可以得到最佳的副本放置法。
最後,本篇論文所提出的方法進行了模擬實驗,證明了我們提出的方法當中的有效性,在減少副本放置的成本以及回復時間下的兩項目的當中可以有良好的效率。

In cloud computing, there is a huge amount of data in storage devices. It is an important issue how to ensure the data continues to be available when failures occur in storage devices. Data replication is a widely used technique. This thesis presents an efficient data replication algorithm for cloud computing systems. The algorithm considers the QoS requirements of applications (e.g. response time) and the heterogeneous characteristics of devices (e.g. different transmission rates, data access time, storage capacities, etc). There are two stages in the proposed algorithm. In the pre-processing stage, data replication requests are modeled as a weighted bipartite graph for considering QoS requirements and device heterogeneities. In the post-processing stage, the weighted bipartite graph is extended as a flow graph. Using the flow graph, the data replication problem can be optimally solved by transforming this problem to the well-known minimum cost flow problem. Finally, this thesis performs simulation experiments to demonstrate the effectiveness of the proposed algorithm in the replication cost and recovery time.
第一章 緒論………………………………………………………………………..1                     
1.1 研究背景簡介…………………………………………………………….1
1.2 研究動機與目的………………………………………………………….5
第二章 背景知識……………………………………………………………….....10               
2.1 雲端運算技術………………………………………………………......10
2.2 Mapreduce……………………………………..………….…………......11
2.3 Hadoop…………………………………………………………………..11
2.4 HDFS………………………………………………………………........13
2.4.1 HDFS預設策略…………………………………………………..14
2.5 副本放置演算法……………………………………………………….15
2.6 最小成本流量演算法………………………………………………….16
第三章 問題描述……………………………………………………………........18
3.1 基本想法……………………………………………………………….18
3.2 細節技術討論………………………………………………………….18
3.2.1 硬碟成本考量…………………………………………………..21
3.2.2 區域網路與交換機成本考量…………………………………..22
3.2.3 傳輸成本與成本矩陣…………………………………………..22
3.3單一資料區塊的副本放置法……………………………………………23
3.3.1 分流問題…………………………………………………………25
3.3.2 資料區塊的不可分割性…………………………………………26
3.4 多個資料區塊的副本放置法…………………………………………...27
3.4.1 資料區塊的大小…………………………………………………30
3.5簡化該問題到最小成本流量演算………………………………………32
第四章 模擬實驗與結果…………………………………………………………37
4.1 模擬實驗相關數據……………………………………………………...37
4.2 模擬結果………………………………………………………………...38
第五章 結論………………………………………………………………………42
第六章 參考文獻…………………………………………………………………44


附表目錄
表1.Google與Hadoop架構比較……………………………………….……...…13
表2.放置策略的步驟………………………………………………..................…16
表3. Summary of Notations……………………………………….……...........…25



附圖目錄
圖1.1美國國家標準與技術局所制定的三種服務的模式……..……………….3
圖1.2 hadoop平台的儲存策略………………………..………………………….7
圖1.3大型資料儲存中心……………………………..…………………………..7
圖2.1 MapReduce架構…………………………………………………………..12
圖2.2 hadoop預設策略………………………………………………………….15
圖2.3副本放置…………………………………………………………………..16
圖2.4最小成本流量圖範例……………………………………….………………17
圖3.1系統樹狀圖……………………………………………………………….20
圖3.2樹狀結構轉化成流量問題………….……………………………………21
圖3.4 hop數……………………………………………………………………..23
圖3.5備份Block大小………………………………………………………….24
圖3.6網路流量圖………………………..……………………………………….24
圖3.7備份資料流(a)(b)…………………..………………………………………25
圖3.8 block分散………………………………………………………………….26
圖3.9分割問題…………………………...………………………………………27
圖3.10儲存容量………………………………………………………………….28
圖3.11放置結果……………………………………..…………………………..28
圖3.12 A.B兩點各自的儲存容量………………………………………………29
圖3.13 A.B兩點的成本以及儲存容量………………………………………….29
圖3.14樹狀結構當中的網路流量……………………………………………….30
圖3.15剩餘可放置的容量……………………………………………………….30
圖3.16 0-1 Knapsack Problem……………………………………………………32
圖4.1副本放置成本……………………………………………………………..39
圖4.2副本回復時間……………………………………………………………..40

[1]Cloud Computing Wikipedia. [Online]. Available: http://en.wikipedia.org/wiki/
Cloud_computing
[2]Apache Hadoop Project. [Online]. Available: http://hadoop.apache.org/
[3]Apache HBase Project. [Online]. Available: http://hbase.apache.org/
[4]NCHC Cloud Computing Research Group. [Online]. Available: http://trac.nchc.org.tw/cloud/
[5]Hadoop Wiki. [Online]. Available: http://wiki.apache.org/hadoop/
[6]Hadoop API. [Online]. Available: http://hadoop.apache.org/common/docs/r0.20.2/api/index.htm
[7]Hbase API [Online]. Available: http://hbase.apache.org/docs/current/api/index.html
[8]J. Dean, S. Ghemawat, “MapReduce: Simplified Data Processing on
Large Clusters,” Proc. Symposium on Operating Systems Design and Implementation, San Francisco CA,
Dec.2004.
[9]Q. Zheng, “Improving MapReduce Fault Tolerance in the Cloud,” IEEE International Symposium on Parallel
and Distributed Processing, Atlanta, USA, Apr. 2010.
[10]I. Shadi, J. Hai, L. Lu, L. Qi, S. Wu, and X. H. Shi, “Evaluating MapReduce on Virtual Machines: The
Hadoop Case,” Proc. International Conference on Cloud Computing, Beijing, China, 2009.
[11]A. Gates, O. Natkovich, S. Chopra, P. Kamath, S. Narayanam, C. Olston, B. Reed, S. Srinivasan, and U.
Srivastava. “Building a High-Level Dataflow System on top of MapReduce: The Pig Experience,” Proc. of
Very Large Data Bases, vol. 2, no. 2, pp. 141-142, 2009.
[12]A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu,P. Wyckoff, R. Murthy, “Hive – A
Warehousing Solution Over a MapReduceFramework,” Proc. Very Large Data Bases, vol. 2, no. 2, pp. 1626-
1629, Aug. 2009.
[13]S. Ghemawat, H. Gobioff, S. T. Leung, “The Google File System,” Proc. Symposium on Operating Systems
Principles, pp.29-43, Lake George, New York, USA, Oct. 2003.
[14]F. Chang, J. Dean, S. Ghemawat, and W. C. Hsieh, “Bigtable: A Distributed Storage System for Structured
Data,” Proc. Symposium on Operating Systems Design and Implementation, pp. 1-14, Seattle, WA, USA, Nov.
2006.
[15]K. Shvachko, H. Kuang, S. Radia, and R. Chansler,”The Hadoop Distributed File System,” IEEE Mass Storage
Systems and Technologies, pp.1-10, Incline Villiage, Nevada, USA, May 2010.
[16]F. P. Junqueira, B. C. Reed. “The Life and Times of a Zookeeper,” Proc. ACM Symposium on Principles of
Distributed Computing, Calgary, Canada, 10-12 Aug. 2009.
[17]A. Iosup, S. Ostermann, M. N. Yigitbasi, and R. Prodan, “Performance Analysis of Cloud Computing
Services for Many-Tasks Scientific Computing,” IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 6, pp.
931-945, June 2011.
[18]R. Maggiani, “Cloud Computing is Changing How We Communicate,” Proc. IEEE International Professional
Communication, Waikiki, United states, July2009.
[19]Q. Wei, B. Veeravalli, and L. Zeng, “CDRM: A Cost-effective Dynamic Replication Management Scheme for
Cloud Storage Cluster,” IEEE International Conference on Cluster Computing, pp.188-196, Heraklion,
Crete, Greece, Sept. 2010.
[20]C. Harvesf, and M. Blough, “Replica Placement for Route Diversity in Tree-Based Routing Distributed Hash
Tables,” IEEE Trans. Dependable and Secure Computing, vol. 8, no. 3, pp. 419-433, June 2011.
[21]W. Rao, L. Chen, and G. Wang, “Optimal Resource Placement in Structured Peer-to-Peer Networks,” IEEE
Trans. Parallel Distrib. Syst., vol. 21, no. 7, pp. 1011-1026, July 2010.
[22]V. S. Agneeswaran and D. Janakiram, “Node-Capability-Aware Replica Management for Peer-to-Peer Grids,”
IEEE Trans. Systems, vol. 39, no. 4, pp. 807-818, July 2009.
[23]A. Benoit, V. R. Sonigo, and Y. Robert, “Replica Placement and Access Policies
in Tree Networks,” IEEE Trans. Parallel Distrib. Syst., vol. 19, no. 12, pp.1614-1627, Dec. 2008.
[24]X. Tang, H. Chi, and S. T. Chanson, “Optimal Replica Placement Under TTL-Based Consistency,” IEEE Trans.
Parallel Distrib. Syst., vol. 18, no. 3, pp. 351-363, Mar. 2007.
[25]X. Zhang, D. He, D. H.C. Du, and Y. Lu, “Object Placement in Parallel Tape Storage Systems,” Proc.
International Conference on Parallel Processing, pp. 101-108, Columbus, Ohio, USA, Aug.2006.
[26]K. Kalpakis, K. Dasgupta, and O. Wolfson, “Optimal Placement of Replicas in Trees with Read, Write, and
Storage Costs,” IEEE Trans. Parallel Distrib. Syst., vol. 12, no. 6, pp.628-637, June 2001.
[27]A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S.
Sengupta, “VL2: A Scalable and Flexible Data Center Network,” Proc. ACMSIGCOMM Conference on Data
Communication, 2009.
[28]K. Li, L. Li, and T. Tesfazghi, “Adaptive and Cost-Optimal Parallel Algorithm for the 0-1 Knapsack
Problem,” Proc. International Euro micro Conference on Parallel, pp. 537-544, Ayia Napa, Cypru, Feb.
2011.
[29]K. Kato, M. Sakawa, “Genetic Algorithms With Decomposition Procedures
for Multidimensional 0-1 Knapsack Problems With Block Angular Structures,”IEEE Trans. Systems, vol. 33,
no. 3, pp. 410-419, June 2003.
[30]Z. Wang, J. Li, and H. Xie, ” A Solution of Linear Programming with Fuzzy
Variable,” Proc. Seventh International Conference on Fuzzy Systems andKnowledge Discovery, pp. 801-
805,Yantai, China, Aug. 2010.
[31]A. Kumar, K. Devi, S.P. Yadav, “Method to Solve Linear Programming
Problems Using Vague Sets,” Proc. IEEE International Advance Computing Conference, pp. 359-363, Patiala,
India, Mar. 2009.
[32]X. Meng, V. Pappas, and L. Zhang, “Improving the Scalability of Data Center Networks with Traffic-aware
Virtual Machine Placement,” Proc. IEEE INFOCOM, pp. 1-9, San Diego, CA, USA, 15-19 Mar. 2010.
[33]X. Yan, J. Yang, and Z. Zhang, “An Outer Bound for Multisource Multisink Network Coding with Minimum
Cost Consideration,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2373-2385, June 2006.

QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
系統版面圖檔 系統版面圖檔