National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)


Detailed Record

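Graduate student: Yu-Chieh Lo (羅宇傑)
Thesis title: The Design and Implementation of a Grid-Computing Environment for Mining Sequential Patterns
Thesis title (original): 格網運算環境於序列型樣探勘之設計與實作
Advisors: Chih-Hung Wu (吳志宏), Shing-Hwang Doong (董信煌)
Degree: Master's
Institution: 樹德科技大學 (Shu-Te University)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Sub-discipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 95 (ROC calendar)
Language: Chinese
Number of pages: 107
Keywords (translated from Chinese): data mining; sequential pattern mining; distributed processing; loosely coupled processing; grid computing
Keywords (English): Data Mining; Mining Sequential Patterns; Distributed Processing; Loosely Coupled Parallelism; Grid Computing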
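This thesis presents the design and implementation of a grid-computing environment for sequential pattern mining. The study implements an Apriori-like sequential pattern mining algorithm in a grid-computing environment, then verifies and analyzes its mining performance and results. Compared with other sequential pattern mining algorithms, an Apriori-like algorithm goes through a large amount of repetitive and recursive data processing and computation during mining, so it does not execute efficiently. However, with only minor changes to the mining procedure, an Apriori-like algorithm can be adapted to loosely coupled distributed processing and its tasks can be distributed across a grid-computing environment. The proposed environment defines two types of grid nodes, computing grid nodes and data grid nodes. All grid nodes are implemented with Globus Toolkit, and the distributed mining program developed in this study is installed and configured on each node. Grid service procedures are triggered by users or by remote grid nodes and return their mining results to the grid master, so that the nodes cooperate to complete the mining task. The environment is distributed mainly across two different university campus networks, with 16 grid nodes installed and configured. Each grid node is an independent computer, and each computer has different hardware components, so that the setup reflects a realistic grid-computing deployment. Finally, the experimental results and performance evaluation show that the grid-computing environment provides a highly flexible, high-performance computing platform suitable for sequential pattern mining on large databases.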
This thesis presents the design and implementation of a grid-computing environment for mining sequential patterns. An Apriori-like algorithm for mining sequential patterns is deployed in the proposed environment. Apriori-like algorithms do not perform as well as other sequential pattern mining algorithms, but their loosely coupled processing makes them easier to adapt to distributed processing in a grid-computing environment. Two types of grids are designed in the proposed environment: the computing grid and the data grid. Every grid node carries a full implementation of the Apriori-like mining algorithm, wrapped by Globus Toolkit. Grid services are invoked by users or by other grids and respond to the invoking side, so that the mining task is completed cooperatively. Sixteen computers, each equipped with different hardware components, serve as grid nodes and are distributed across two WANs. The experimental results show that the proposed grid-computing environment provides a flexible and efficient platform for mining sequential patterns from large datasets.
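The loosely coupled property described in the abstract can be illustrated with a small sketch. It is not the thesis's program: it assumes sequences of single items, simulates grid nodes as in-memory partitions, and uses hypothetical names (`local_counts`, `generate_candidates`, `mine`). Each simulated node counts candidate patterns only in its own partition, and a master sums the counts before generating the next level of candidates, which is one common way to distribute an Apriori-like algorithm.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def local_counts(partition, candidates):
    """Work for one (hypothetical) grid node: count candidates in local data only."""
    counts = Counter()
    for seq in partition:
        for cand in candidates:
            if is_subsequence(cand, seq):
                counts[cand] += 1
    return counts


def generate_candidates(frequent):
    """Join frequent k-patterns with matching overlap into (k+1)-candidates."""
    return sorted({a + (b[-1],) for a in frequent for b in frequent
                   if a[1:] == b[:-1]})


def mine(partitions, min_support):
    """Level-wise mining; the master sums per-partition counts each round."""
    items = sorted({item for part in partitions for seq in part for item in seq})
    candidates = [(item,) for item in items]
    result = {}
    while candidates:
        total = Counter()
        for part in partitions:  # each call could run on a separate node
            total.update(local_counts(part, candidates))
        frequent = {c: n for c, n in total.items() if n >= min_support}
        result.update(frequent)
        candidates = generate_candidates(sorted(frequent))
    return result


if __name__ == "__main__":
    # Toy data: two partitions standing in for two data-holding nodes.
    partitions = [
        [("a", "b", "c"), ("a", "c")],
        [("a", "b", "c"), ("b", "c")],
    ]
    for pattern, support in sorted(mine(partitions, min_support=3).items()):
        print(pattern, support)
```

Because the nodes exchange only candidate lists and count tables, never raw sequences, the per-node work is independent within each level. The repeated database scans per level are also visible here, which is the inefficiency the abstract attributes to Apriori-like algorithms.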
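Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Section 1 Research Motivation
Section 2 Research Objectives
Section 3 Thesis Organization
Chapter 2 Research Background
Section 1 Sequential Pattern Mining Methods
Section 2 Distributed Processing
Section 3 Grid Computing
Chapter 3 Research Design and Methods
Section 1 Research Framework and Process
Section 2 Analysis of Sequential Pattern Mining Based on Loosely Coupled Processing
Section 3 Problem Discussion and Design
Section 4 Sequential Pattern Mining Based on Grid Computing
Chapter 4 System Design and Implementation
Section 1 System Design
Section 2 Middleware
Section 3 System Implementation
Chapter 5 Experiment Design and Analysis
Section 1 Experiment Requirements and Environment
Section 2 Experiment Design
Section 3 Experiment Analysis and Discussion
Chapter 6 Conclusions and Discussion
Section 1 Conclusions
Section 2 Discussion
Section 3 Future Research and Suggestions
References
Related Publications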