National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed record

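Graduate student: 王勝玄
Graduate student (English): Sheng-Hsuan Wang
Thesis title: 自動分群處理混合型資料之自組映射圖
Thesis title (English): Clustering of Self-Organizing Map on Mixed Data
Advisor: 許中川
Advisor (English): Chung-Chian Hsu
Degree: Master's
Institution: National Yunlin University of Science and Technology
Department: Master's Program, Department of Information Management
Discipline: Computer science
Sub-discipline: General computer science
Thesis type: Academic thesis
Year of publication: 2005
Graduation academic year: 93 (ROC calendar)
Language: English
Number of pages: 87
Keywords (Chinese, translated): self-organizing map; mixed-data clustering; clustering algorithm; visualization-based clustering technique
Keywords (English): Self-organizing map; Clustering; Concept hierarchy; Clustering of the SOM; Visualization-induced SOM; Mixed-data clustering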
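Clustering aims to group similar data together so that similarity within each cluster is high and differences between clusters are large. The self-organizing map (SOM) is a visualization-based clustering technique that projects high-dimensional data onto a low-dimensional space (usually a two-dimensional plane) while preserving the cluster structure of the original data. The visualization-induced SOM (ViSOM) improves on the conventional SOM, which cannot faithfully reflect the cluster structure of the original data on the two-dimensional plane. Even so, neither of these two conventional maps can reasonably represent categorical data, nor can they faithfully reflect the cluster structure of mixed data on a two-dimensional map. This study therefore proposes a new visualization-based clustering technique to address these problems. We use concept hierarchies to express the similarity between categorical values reasonably and to present the cluster structure on the two-dimensional map as faithfully as possible. In addition, the trained map is clustered automatically according to the neighborhood relations of its units. Experiments on two synthetic and two real datasets evaluate the quality of both manual and automatic clustering. The results demonstrate that the proposed method handles mixed data effectively and reveals the cluster structure more clearly.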
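As a rough illustration of clustering a trained map by neighborhood relations, the sketch below merges adjacent map units whose prototype vectors lie closer than a threshold, so that each connected group of units becomes one cluster. This record does not give the thesis's actual procedure; the function name `cluster_map`, the 4-neighbor grid, and the use of Euclidean distance on numeric prototypes are all assumptions made for the example.

```python
from collections import deque
import math


def cluster_map(weights, threshold):
    """Label the units of a rows x cols map (hypothetical helper).

    Two grid-adjacent units join the same cluster when the Euclidean
    distance between their prototype vectors is below `threshold`.
    """
    rows, cols = len(weights), len(weights[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # unit already belongs to a cluster
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first growth of the current cluster
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < rows and 0 <= nj < cols
                    if inside and labels[ni][nj] == -1:
                        if math.dist(weights[i][j], weights[ni][nj]) < threshold:
                            labels[ni][nj] = next_label
                            queue.append((ni, nj))
            next_label += 1
    return labels


# Toy 2 x 3 map: the left four units are close together, the right two are far away.
toy_weights = [
    [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
    [(0.0, 0.1), (0.1, 0.1), (5.0, 5.1)],
]
toy_labels = cluster_map(toy_weights, threshold=1.0)
```

For mixed data, the Euclidean distance in this sketch would need to be replaced by a distance that also covers categorical attributes.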
The visualization-induced SOM (ViSOM) is a nonlinear multidimensional projection method that extends the self-organizing map (SOM). It addresses two drawbacks of the SOM: the cluster structure may not be apparent on the map, and the nodes often spread across the 2-D map. The objective of the ViSOM is to preserve the data structure as well as the topology as faithfully as possible. Even so, it cannot reasonably express the distance or similarity between categorical values, and so cannot preserve the structure of categorical data. In this study, an extended ViSOM (EViSOM) is proposed to overcome these shortcomings. We use concept hierarchies to define and calculate the distance between categorical values, and we preserve the structure of mixed data as well as the topology of the trained EViSOM map as faithfully as possible. In addition, we perform clustering on the output map generated by the network and evaluate the clustering result. Experimental results on two synthetic and two real datasets demonstrate that the proposed algorithm clusters mixed data better than the traditional SOM and ViSOM. The proposed algorithm also reveals the cluster structure more clearly and yields better clustering quality than the traditional approaches, under both manual and automatic clustering of the trained map.
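To make the idea of a concept-hierarchy distance concrete, the sketch below measures the distance between two categorical values as the number of links on the tree path that joins them through their lowest common ancestor, and then combines it with numeric differences into one mixed-data distance. The hierarchy, the attribute names, the unit link weights, and the Euclidean-style combination are all assumptions for illustration; the thesis's exact definition is not given in this record.

```python
import math

# Hypothetical concept hierarchy for a "drink" attribute, stored as child -> parent.
PARENT = {
    "Coke": "Carbonated", "Pepsi": "Carbonated",
    "Latte": "Coffee", "Mocha": "Coffee",
    "Carbonated": "Any", "Coffee": "Any",
}


def path_to_root(value):
    """Return the nodes from `value` up to the root of the hierarchy."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def hierarchy_distance(a, b):
    """Count the links on the tree path between a and b (unit link weights assumed)."""
    steps_from_b = {node: i for i, node in enumerate(path_to_root(b))}
    for steps_from_a, node in enumerate(path_to_root(a)):
        if node in steps_from_b:  # first shared node is the lowest common ancestor
            return steps_from_a + steps_from_b[node]
    raise ValueError("values do not share a hierarchy")


def mixed_distance(x, y, numeric_keys, categorical_keys):
    """Combine numeric and hierarchy distances in a Euclidean-style sum (an assumption)."""
    total = 0.0
    for key in numeric_keys:
        total += (x[key] - y[key]) ** 2
    for key in categorical_keys:
        total += hierarchy_distance(x[key], y[key]) ** 2
    return math.sqrt(total)


record_a = {"price": 1.0, "drink": "Coke"}
record_b = {"price": 4.0, "drink": "Latte"}
d = mixed_distance(record_a, record_b, ["price"], ["drink"])
```

Under this sketch, Coke and Pepsi are two links apart while Coke and Latte are four links apart, so siblings under the same concept count as more similar than values from different branches. In practice the numeric attributes would likely need normalizing so that neither attribute type dominates the sum.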
Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
1. Introduction
    1.1. Motivation
    1.2. Objective
    1.3. Limitation
    1.4. Organization
2. Background
    2.1. SOM
    2.2. ViSOM
    2.3. Problems of the conventional approaches
    2.4. Clustering of the SOM
    2.5. Concept hierarchies
    2.6. GSOM
3. Research framework
    3.1. Extended ViSOM
    3.2. Clustering of extended ViSOM
        3.2.1. Global clustering validity index for different clustering algorithms
        3.2.2. DBSCAN for clustering of SOM
    3.3. Evaluating clustering results
4. Experiments
    4.1. Synthetic categorical data
    4.2. Synthetic mixed data
    4.3. Real mixed dataset Adult
        4.3.1. Training results
        4.3.2. Manual clustering of trained map
        4.3.3. Automatic clustering of trained map by using distance threshold
        4.3.4. Automatic clustering of trained map by using CDbw index
        4.3.5. Automatic clustering of trained map by using DBSCAN algorithm
    4.4. Real mixed dataset POS
        4.4.1. Training results
        4.4.2. Manual clustering of trained map
        4.4.3. Automatic clustering of trained map by using distance threshold
        4.4.4. Automatic clustering of trained map by using CDbw index
        4.4.5. Automatic clustering of trained map by using DBSCAN algorithm
5. Conclusions and Future Work
    5.1. Conclusions
    5.2. Future work
References
Appendix
    Appendix A: Tables of Adult dataset
        Manual clustering
        Automatic clustering by using distance threshold
        Automatic clustering by using CDbw index
        Automatic clustering by using DBSCAN algorithm
    Appendix B: Tables of POS dataset
        Manual clustering
        Automatic clustering by using distance threshold
        Automatic clustering by using CDbw index
        Automatic clustering by using DBSCAN algorithm