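Author: 林卿安
Author (romanized): Ching-An Lin
Thesis title: 基於映射化簡機制的模組平行化重構方法
Thesis title (English): Module Parallelized Refactoring Method Based on Map Reduce Structure
Advisor: 楊浩青
Advisor (romanized): Haw-Ching Yang
Degree: Master's
Institution: National Kaohsiung First University of Science and Technology
Department: Master's Program, Graduate Institute of Electrical Engineering
Field: Engineering
Discipline: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar)
Language: Chinese
Pages: 69
Keywords (Chinese): Map-Reduce (映射化簡); Hadoop; parallel processing (平行處理); software refactoring (軟體重構)
Keywords (English): software refactoring; Map-Reduce; Hadoop; parallel processing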
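In a material production management system, module functions such as collecting material behavior, computing material behavior characteristics, building material production models, and generating material reports all require substantial computing power. In practice, however, the system often cannot finish within its time limit for the following reasons: 1) growth in data volume; 2) uneven allocation of computing resources; 3) lack of support in the original module functions.
This study proposes a module parallelized refactoring method based on the Map-Reduce architecture that makes effective use of existing computing resources to speed up computation. During software refactoring, data are batched into objects to refactor the procedures in which multiple threads access the database, which effectively reduces the number of database accesses; the Map-Reduce mechanism increases the capacity to process multiple modules in parallel. During parallel computation, multi-node synchronous computation is invoked when needed, according to the dependency order between data items and the different modules, so that each module's tasks can be completed within the time limit.
In application, the proposed method was used to refactor the modules of a material control system in a Hadoop environment based on the Map-Reduce architecture. With some modules dependent on one another, dynamically invoking 17 computing nodes distributed across 4 servers shortened the required computing time from about 380 minutes on a single server to 15 minutes, which shows that the proposed method is both scalable and timely.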
Material production management systems need substantial computing power to complete module tasks such as material behavior collection, material behavioral characteristic calculation, material production model building, and material report generation. However, the following factors can prevent such systems from finishing these tasks within their time constraints: 1) increased data volume; 2) unbalanced loads on computing resources; and 3) functions not supported by the original modules.
This study proposes a module parallelized refactoring method, based on the Map-Reduce architecture, that achieves fast computation by using existing computing resources. In software refactoring, data are batched into objects to refactor the original procedures in which multiple threads access the database, which reduces the number of database accesses. The Map-Reduce based architecture improves the capability to process modules in parallel. During execution, several nodes are invoked for parallel computing when needed, according to the data items and the dependency order of the modules, so that module tasks can be finished in time.
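The ideas in this paragraph (one batched read, a map step that groups data by item, and modules run in dependency order with independent modules in parallel) can be illustrated with a minimal sketch. It is not the thesis's implementation and does not use Hadoop: a thread pool stands in for computing nodes, and the records, module names (`collect`, `characterize`, `report`), and dependency table are all hypothetical. It assumes Python 3.9 or later for `graphlib`.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical records, assumed to be fetched in ONE batch
# rather than with one database query per item.
RECORDS = [
    {"item": "A", "qty": 3},
    {"item": "B", "qty": 5},
    {"item": "A", "qty": 4},
]

def map_by_item(records):
    """Map step: group the batched records by item key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["item"]].append(rec["qty"])
    return dict(groups)

# Hypothetical modules. Each reduces the grouped data per item and may
# read the results of modules it depends on from `done`.
def collect(groups, done):
    return {item: sum(qtys) for item, qtys in groups.items()}

def characterize(groups, done):
    return {item: sum(qtys) / len(qtys) for item, qtys in groups.items()}

def report(groups, done):
    return {item: (done["collect"][item], done["characterize"][item])
            for item in groups}

MODULES = {"collect": collect, "characterize": characterize, "report": report}
# Each module maps to the set of modules that must finish before it starts.
DEPENDS_ON = {"collect": set(), "characterize": set(),
              "report": {"collect", "characterize"}}

def run_modules(records, max_workers=4):
    """Run modules in dependency order; modules that are ready together run in parallel."""
    groups = map_by_item(records)
    sorter = TopologicalSorter(DEPENDS_ON)
    sorter.prepare()
    done = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()  # modules whose dependencies are all done
            futures = {name: pool.submit(MODULES[name], groups, done)
                       for name in ready}
            for name, future in futures.items():
                done[name] = future.result()
                sorter.done(name)
    return done

if __name__ == "__main__":
    print(run_modules(RECORDS)["report"])
```

Here `collect` and `characterize` have no dependencies, so they are submitted together, while `report` waits for both. In a real Hadoop deployment the grouping and per-key work would be expressed as map and reduce tasks across nodes rather than as threads in one process.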
The proposed method was applied in a Hadoop environment, which is based on the Map-Reduce structure, to refactor the modules of a material control system. With some modules dependent on one another, the average time to execute the modules was reduced from about 380 min (using one server) to 15 min (dynamically invoking 17 computing nodes across 4 servers). This result demonstrates the timeliness and applicability of the proposed method.
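As a rough check on the reported figures, the speedup and the speedup per node can be worked out directly. The symbols $T_1$, $T_{17}$, $S$, and $E$ are introduced here for illustration only and are not the thesis's notation.

```latex
\[
S = \frac{T_1}{T_{17}} = \frac{380\ \text{min}}{15\ \text{min}} \approx 25.3,
\qquad
E = \frac{S}{17} \approx 1.49
\]
```

A per-node ratio above 1 would be consistent with the gain coming from both the added nodes and the refactoring itself (for example, fewer database accesses through batching). The abstract does not break the improvement down, and the parallelism available on the baseline server is not stated, so this reading is only an assumption.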
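Table of Contents
Chinese Abstract
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Review of Related Technologies
2.1 Comparison of RDBMS and MapReduce Structures
2.2 Distributed Data Access Methods in the Hadoop Environment
2.2.1 Sqoop
2.2.2 Hive
2.2.3 Impala
2.3 Comparison of Database Access Efficiency in the Hadoop Environment
Chapter 3 Refactored System Processing Architecture
3.1 System Refactoring Method
3.2 Parallelized System Architecture
3.2.1 Use Case
3.2.2 Module Parallelized Execution Architecture
3.2.3 Module Parallel Processing Flow
Chapter 4 Benefit Analysis of the Module Parallelized Processing Architecture
4.1 Case Study
4.1.1 System Setup
4.1.2 Case Deployment
4.2 Case Efficiency Analysis
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Research Directions
References
Appendix
A. Sqoop
B. Hive
C. Impala
D. Cloudera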
