跳到主要內容

臺灣博碩士論文加值系統

(216.73.216.81) 您好!臺灣時間:2025/10/07 00:43
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:王閭威
研究生(外文):Wang, Lyu Wei
論文名稱:BiFennel:針對大數據的快速二部圖劃分算法
論文名稱(外文):BiFennel: Fast Bipartite Graph Partitioning Algorithm for Big Data
指導教授:鍾葉青鍾葉青引用關係
指導教授(外文):Yeh Ching, Chung
學位類別:碩士
校院名稱:國立清華大學
系所名稱:資訊系統與應用研究所
學門:電算機學門
學類:系統設計學類
論文種類:學術論文
論文出版年:2015
畢業學年度:103
語文別:英文
論文頁數:36
中文關鍵詞:二部圖圖劃分圖計算PowerGraph
外文關鍵詞:Bipartite graphGraph partitioningGraph processingPowerGraph
相關次數:
  • 被引用被引用:0
  • 點閱點閱:236
  • 評分評分:
  • 下載下載:6
  • 收藏至我的研究室書目清單書目收藏:2
在雲端計算被廣泛應用的今日,社交網絡分析、生物信息網絡、語義分析類的諸多應用對於快速處理億萬節點圖的能力有著迫切的需求,這也是分佈式高性能計算領域的研發重點之一。而許多諸如音樂、電影網絡的推薦系統,LDA主題模型等問題都可以通過二部圖的建模、計算進行解決。而圖劃分作為圖計算預處理最重要的環節,雖然已經是一個很成熟的技術,然而許多經典算法因為需要迭代多次,時間複雜度過大,因此應用到大規模圖劃分時時間過長。近年來也湧現出一些速度較快的圖劃分算法,然而這些算法并沒有針對二部圖進行針對性的優化。本文提出的BiFennel算法是一種新二部圖劃分算法,通過降低節點複製率,平衡圖數據負載的方式,進而降低迭代圖計算部分時間以及網路負載。最後,本文將BiFennel算法應用于圖計算平台PowerGraph [10]。實驗證明,相比於目前最好的Aweto算法,使用BiFennel後,圖引擎整體執行時間最高降低1.69倍,網路負載減少最高達到2.32倍。另外,當圖規模增大或是集群中節點數增多時,BiFennel的性能並沒有降低。
Cloud computing is widely utilized in today’s internet service, which severely
requires the ability of processing graphs of billion vertices rapidly in the situation
such as social network analyze, bio-informational network analyze and semantic
processing. Therefore, graph processing has a significant role in the research and
development of high-performance computing. However, many problems such as
music and movie recommendation web, LDA topic model can be solved by modeling
these data into bipartite graph and computing it with graph processing engines. As the
most important step in preprocessing, graph partitioning is a relatively mature
technology. However, most of classic graph partitioning algorithms require iterative
calculation for several times, which causes huge time complexity especially adapting
to big data. There are some rapid algorithms these years, which cannot be used in
bipartite graph directly. Therefore, this thesis proposed a new bipartite graph
partitioning algorithm, BiFennel, which effectively decrease graph processing time
and network loading by reducing vertex replication factor and maintaining work
balance. Afterwards, this thesis applies BiFennel to a popular graph engine called
PowerGraph and proves it gets up to 1.69 times overall runtime speedup of graph
processing and 2.32 times network load reduction comparing to the best competitor,
Aweto. Besides, the algorithm doesn’t meet a degradation when graph scale increases,
so does its scalability.
List of Figures 4
List of Algorithm 5
Chapter 1 Introduction 6
Chapter 2 Related Work 11
2.1 Graph Engines 11
2.1.1 Pregel (Apache Giraph) 11
2.1.2 GraphX 13
2.1.3 PowerGraph 14
2.2 Graph Partitioning Algorithms 15
2.2.1 METIS 16
2.2.2 Fennel 18
2.2.3 BiCut 19
2.2.4 Aweto 21
Chapter 3 Proposed Algorithm and Implementation 22
3.1 BiFennel Algorithm 22
3.2 Sample of BiFennel 25
3.3 Implementation in PowerGraph 27
Chapter 4 Performance Evaluation 29
4.1 Environment Settings 29
4.2 Partitioning Performance on Different Graph Data 30
4.3 Performance on Different Algorithms 31
4.4 Scalability 33
Chapter 5 Conclusions and Future Work 34
REFERENCE 35

1. Sergey B., Larry P. "The anatomy of a large-scale hypertextual web search engine. " Computer Networks and ISDN Systems, 30 (1998): 5-20.
2. Yu, Gu. "Large scale graph data processing on cloud computing environments." Chinese Journal of Computers, 34.10 (2010): 1753-1765.
3. Malewicz G., Austern M. H., et al. "Pregel: a system for large-scale graph processing. " In SIGMOD 10(2010): 135–146. 

4. http://finance.chinanews.com/it/2014/01-16/5745005.shtml
5. http://tech.qq.com/a/20140725/000288.htm
6. Yucheng L., Bickson D., et al. "Distributed GraphLab: a framework for machine learning and data mining in the cloud." Proceedings of the VLDB Endowment 5.8 (2012): 716-727.
7. Xin, Reynold S., et al. "Graphx: A resilient distributed graph system on spark." First International Workshop on Graph Data Management Experiences and Systems. ACM, 2013.
8. Tsourakakis C. E., Gkantsidis C., et al. "Fennel: Streaming graph partitioning for massive scale graphs. " Tech. Rep. 175918, Microsoft, 2012.
9. Spark. http://spark.apache.org
10. Gonzalez, Joseph E., et al. "PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs." OSDI. Vol. 12. No. 1. 2012.
11. Andreev, Konstantin, and Harald Racke. "Balanced graph partitioning." Theory of Computing Systems 39.6 (2006): 929-939.
12. Chen, Rong, et al. "Bipartite-oriented distributed graph partitioning for big learning." Proceedings of 5th Asia-Pacific Workshop on Systems. ACM, 2014.
13. Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Communications of the ACM 51.1 (2008): 107-113.
14. Valiant, Leslie G. "A bridging model for parallel computation." Communications of the ACM 33.8 (1990): 103-111.
15. BSP: https://en.wikipedia.org/wiki/Bulk_synchronous_parallel
16. Apache Giraph: http://giraph.apache.org
17. Xin, Reynold S., et al. "Graphx: A resilient distributed graph system on spark." First International Workshop on Graph Data Management Experiences and Systems. ACM, 2013.
18. Kernighan, Brian W., and Shen Lin. "An efficient heuristic procedure for partitioning graphs." Bell system technical journal 49.2 (1970): 291-307.
19. Bi, T., et al. "Efficient multiway graph partitioning method for fault section estimation in large-scale power networks." IEE Proceedings-Generation, Transmission and Distribution 149.3 (2002): 289-294.
20. Karypis, George, and Vipin Kumar. "Metis-unstructured graph partitioning and sparse matrix ordering system, version 2.0." (1995).
21. KONECT. http://konect.uni-koblenz.de/
22. Bershad, Brian N., Edward D. Lazowska, and Henry M. Levy. "PRESTO: A system for object‐oriented parallel programming." Software: Practice and Experience 18.8 (1988): 713-732.

連結至畢業學校之論文網頁點我開啟連結
註: 此連結為研究生畢業學校所提供,不一定有電子全文可供下載,若連結有誤,請點選上方之〝勘誤回報〞功能,我們會盡快修正,謝謝!
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top