National Digital Library of Theses and Dissertations in Taiwan

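Author: Zin-Long Lin (林志隆)
Thesis title: A Data Mining Approach Based on Dynamic Clustering for Numerical Attributes (基於動態叢集之數值型屬性資料挖掘方法)
Advisor: Been-Chian Chien (錢炳全)
Degree: Master's
Institution: I-Shou University (義守大學)
Department: Department of Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2000
Graduation academic year: 88 (ROC calendar)
Language: Chinese
Pages: 50
Keywords (Chinese): dynamic clustering (動態叢集); data mining (資料挖掘); numerical attributes (數值型屬性)
Keywords (English): Dynamic Clustering; Data Mining; Numerical data; Fuzzy C-means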
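Data mining has been a very active research area in recent years. The most typical data mining algorithm is the Apriori algorithm proposed by Agrawal. Because Apriori can handle only item (categorical) data, many researchers have proposed improved algorithms that enable it to handle numerical data. Their approach is to partition the numerical data into several intervals, map these intervals to distinct items, and then apply an Apriori-like algorithm to find association rules. A key issue for algorithms that mine association rules from numerical data in this way is how to partition the data so that the resulting interval sizes are ones users consider reasonable.
This thesis proposes a dynamic clustering algorithm. In this algorithm, we define two criteria for clustering numerical data: "relative inter-connectivity" and "relative closeness". By setting a parameter a according to how strongly each of the two criteria is required, a reasonable clustering result is obtained automatically. The data are then partitioned according to this clustering result to obtain correct and reasonable intervals. The thesis also proposes a fuzzy clustering algorithm based on Fuzzy C-means, which quickly yields fuzzy clusters and mitigates the poor efficiency of Fuzzy C-means. With this fuzzified result, a fuzzy mining algorithm can quickly find fuzzy association rules. Experimental results show that the algorithm quickly produces reasonable clusters on one-dimensional numerical data, which can then be used for further data mining applications.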
Data mining has been an active and important research topic in recent years. One of the earliest data mining algorithms is the Apriori algorithm proposed by Agrawal. Apriori can process only categorical data, so many researchers have modified it to handle numerical data when mining association rules from large databases. These methods usually partition the numerical data into several equal intervals and map the intervals to distinct items; an Apriori-like algorithm is then used to find association rules. However, it is difficult for users to decide on a suitable partition with reasonable intervals.
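The equal-interval preprocessing described above can be sketched as follows. This is a minimal illustration and is not taken from the thesis: the function name `equal_width_items`, the attribute name `age`, and the label format are assumptions made for the example.

```python
def equal_width_items(values, k, name="age"):
    """Map each numeric value to an item label for one of k equal-width intervals.

    Hypothetical helper for illustration; not the thesis's own code.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    items = []
    for v in values:
        # Clamp the index so the maximum value is assigned to the last interval.
        idx = min(int((v - lo) / width), k - 1) if width > 0 else 0
        left = lo + idx * width
        items.append(f"{name}=[{left:g},{left + width:g})")
    return items


# Each record's numeric value becomes a categorical item that an
# Apriori-like algorithm could then count.
print(equal_width_items([20, 25, 30, 40, 60], 4))
```

With skewed data, equal-width intervals such as these can leave some intervals nearly empty and others crowded, which is one way to see why choosing the partition by hand is difficult.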
In this thesis, we propose a dynamic clustering algorithm to solve the interval partitioning problem. We define two main characteristics for clustering numerical data: relative inter-connectivity and relative closeness. Given a suitable parameter a that sets the relative importance of relative closeness and relative inter-connectivity, the proposed algorithm generates a reasonable clustering result automatically. The clustering result also provides a good approximation for the fuzzy C-means algorithm, so that the K fuzzy clusters can be obtained efficiently from the cluster centers estimated by the clustering algorithm. Experimental results show that the proposed algorithm performs well in both clustering quality and speed.
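To illustrate how fuzzy clusters might follow from estimated cluster centers, the sketch below applies the standard fuzzy C-means membership formula to one-dimensional data with fixed centers. It is not the thesis's algorithm: the function name `fcm_memberships`, the fuzzifier default m = 2, and the idea that the centers come from an earlier clustering step are assumptions made for the example.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard fuzzy C-means membership of scalar x in each cluster.

    The centers are assumed to be given (for example, by a prior
    clustering step); m > 1 is the usual fuzzifier.
    """
    dists = [abs(x - c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster
    # (shared equally if several centers coincide).
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exp = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((di / dk) ** exp for dk in dists) for di in dists]


# Memberships of the value 2.0 in clusters centered at 0.0 and 10.0.
print(fcm_memberships(2.0, [0.0, 10.0]))
```

The memberships always sum to 1, and a value closer to one center receives a correspondingly larger share for that cluster.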
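Chapter 1 Introduction
1.1 The Data Mining Problem
1.2 Motivation
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Data Mining Algorithms
2.1.1 Types of Association Rules
2.1.2 Fuzzy Association Rules
2.2 Clustering Algorithms
2.2.1 Partitioning Algorithms
2.2.2 Hierarchical Algorithms
2.2.3 Grid-Based Algorithms
2.2.4 Density-Based Algorithms
Chapter 3 Dynamic Clustering Algorithm
3.1 Dynamic Clustering Algorithm
3.1.1 Basic Concepts
3.1.2 Notation for the Dynamic Clustering Algorithm
3.1.3 The Dynamic Clustering Algorithm
3.2 Membership Function Generation Algorithm I
3.3 Membership Function Generation Algorithm II
3.4 Experiments
Chapter 4 Mining Algorithm
4.1 Notation for the Fuzzy Mining Algorithm
4.2 The Fuzzy Mining Algorithm
4.3 An Example of the Fuzzy Mining Algorithm
Chapter 5 Conclusions and Future Work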
[1] Rakesh Agrawal, Tomasz Imielinski, and Arun Swami. Mining Association Rules between Sets of Items in Large Databases. In Proc. of the ACM SIGMOD Conference on Management of Data, pages 207-216, Washington, D. C., May 1993
[2] R. Agrawal and R. Srikant, Fast Algorithms for Mining Association Rules in Large Databases, Proc. 20th Int’l Conf. Very Large Data Bases, pp. 487-499, September 1994.
[3] Wai-ho Au, and Keith C.C. Chan. An Effective Algorithm for Discovering Fuzzy Rules in Relational Databases. The 1998 IEEE International Conference on Fuzzy Systems Proceedings
[4] S. Brin et al., Dynamic Itemset Counting and Implication Rules for Market Basket Data, Proc. ACM SIGMOD Int’l Conf. Management of Data, ACM Press, New York, 1997, pp. 255-264.
[5] K.C.C. Chan, and W.H. Au. Mining Fuzzy Association Rules. In Proc. of the 6th ACM Int’l Conf. On Information and Knowledge Management (CIKM’ 97), Las Vegas, Nevada, Nov. 1997.
[6] M.-S. Chen, J. Han and P. S. Yu, Data Mining: An Overview from Database Perspective, IEEE Trans. on Knowledge and Data Engineering, Vol. 8, No. 6, pp. 866-883, December 1996.
[7] M. Ester, H. Kriegel, J. Sander, and X. Xu. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining, AAAI Press, Menlo Park, Calif., 1996, pp. 226-231.
[8] Usama Fayyad, and Paul Stolorz, Data mining and KDD: Promise and challenges. Future Generation Computer Systems 13(1997), 99-115
[9] Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. CURE: An Efficient Clustering Algorithm for Large Databases. SIGMOD 98, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA.
[10] J. Han, and Y. Fu. Discovery of multiple-level association rules from large databases. Proc. of International Conference on Very Large Databases, Zurich, Switzerland, pp. 420-431, September 1995.
[11] K. Hirota and W. Pedrycz, Linguistic data mining and fuzzy modeling, IEEE International conference on Fuzzy systems, Vol.2, 1996, pp. 1488-1496.
[12] Hinneburg A., Keim D.A.: An Efficient Approach to Clustering in Multimedia Databases with Noise, Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York, AAAI Press, 1998.
[13] Ta-Jung Huang, and Shyue-Liang Wang. Answering Null Queries in Fuzzy Relational Databases. 1997 Proceedings of the 8th International Conference on Information Management.
[14] Chan-Sheng Kuo, A Study of Fuzzy Data Mining Algorithms for Quantitative Value. Master thesis, Business Administration, I-Shou University, Taiwan, 1999.
[15] L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: an Introduction to Cluster Analysis. John Wiley & Sons, 1990.
[16] George Karypis, Eui-hong Han, and Vipin Kumar, “Chameleon: Hierarchical Clustering Using Dynamic Modeling”, IEEE Computer: Special Issue on Data Analysis and Mining, vol. 32, no. 8, pp 68-75, August 1999.
[17] R. Krishnapuram, and J. M. Keller. A possibilistic approach to clustering, IEEE Transactions on Fuzzy Systems, Vol. 1, No. 2, May 1993, pp. 98-110.
[18] Lee-Hyong Lee, and Hyung Lee-Kwang, An Extension of Association Rules Using Fuzzy Sets. IFSA ’97
[19] R.T. Ng and J. Han. Efficient and Effective Clustering Methods for Spatial Data Mining. In Proceedings of the 20th VLDB Conference, pages 144-155, 1994.
[20] Vincent Ng, and John Lee. Quantitative Association Rules over Incomplete Data. 1998 IEEE Conf. on SMC. Vol. 3, 2821-2826.
[21] Jong Soo Park, Ming-Syan Chen, Philip S. Yu, An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
[22] J. S. Park, M. S. Chen, and P. S. Yu. Using a Hash-Based Method with Transaction Trimming for Mining Association Rules. IEEE Transactions on Knowledge and Data Engineering, Vol. 9, No. 5, September/October 1997.
[23] Witold Pedrycz. Fuzzy set technology in knowledge discovery. Fuzzy Sets and Systems 98.
[24] R. Srikant and R. Agrawal. Mining Generalized Association Rules. In Proceedings of the 21st International Conference on Very Large Data Bases, pages 407-419, September 1995.
[25] R. Srikant, and R. Agrawal, Mining Quantitative Association Rules in Large Relational Tables. In Proc. of 1996 ACM SIGMOD Int’l Conf. on Management of Data, Montreal, Canada, June 1996, pp. 1-12.
[26] Takahiko Shintani, and Masaru Kitsuregawa, Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy. SIGMOD ’98, Seattle, WA, USA
[27] G. Sheikholeslami, Surojit Chatterjee, Aidong Zhang: WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases. VLDB'98, Proceedings of 24th International Conference on Very Large Data Bases, August 24-27, 1998, New York City, New York, USA.
[28] A. Savasere, E. Omiecinski, and S. Navathe, An Efficient Algorithm for Mining Association Rules in Large Databases, Proc. 21st Int’l Conf. Very Large Data Bases, Morgan Kaufmann, San Francisco, 1995, pp. 432-444.
[29] Wei Wang, Jiong Yang, and Richard Muntz. STING: A statistical information grid approach to spatial data mining. In Proceedings of the 23rd VLDB Conference, pages 186-195, Athens, Greece, 1997.
[30] X. Xu, M. Ester, H.-P. Kriegel, J. Sander. A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases. In Proc. IEEE International Conference on Data Engineering, 1998.
[31] Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: An Efficient Data Clustering Method for Very Large Databases. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 103-114, Montreal, Canada, 1996.