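Graduate student: 柯國隆
Graduate student (English name): Kuo-Lung Ke
Thesis title: 主題導向式自我組織圖之研究
Thesis title (English): Research on Topic Oriented Self-organizing Map
Advisor: 楊新章
Advisor (English name): Hsin-Chang Yang
Degree: Master's
Institution: 國立高雄大學 (National University of Kaohsiung)
Department: Department of Information Management, Master's Program
Discipline: Computer Science
Sub-discipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2010
Graduation academic year: 98 (ROC calendar)
Language: Chinese
Number of pages: 75
Keywords (translated from Chinese): neural networks; document clustering; self-organizing map
Keywords (English): document clusters; topic-oriented self-organizing map; growing hierarchical self-organizing map; Reuters-21578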
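When people encounter many unknown or ill-defined things, they naturally group similar ones together. Neural networks have long been used to simulate the workings of the human brain; among them, the self-organizing map (SOM) is the most widely used model, and it mainly simulates the brain's tendency to group like with like. Although similar items can be grouped into a cluster, the implicit knowledge contained in that cluster is not immediately apparent: the cluster must be analyzed thoroughly before a suitable topic name can be assigned to represent it. This thesis describes a method for discovering more implicit knowledge from a collection of documents. We develop an adaptive self-organizing map model suited to document data, the topic-oriented self-organizing map, which clusters the documents and builds a hierarchical structure. The main difference from the traditional self-organizing map is that both the lateral expansion of the map and the construction of the vertical hierarchy are driven by the topics of the clusters. The resulting structure gives deeper insight into the content of the document collection. We built our experiments on the Reuters document collection (Reuters-21578) to discover implicit knowledge among documents. The experiments show that our method can develop, for text documents, a document structure that matches their content.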
Text document clustering is a basic operation in text processing and is widely applied in data visualization, theme identification, text summarization, hierarchy generation, and similar tasks. However, unless the clusters are properly labeled with topics, users will find it inconvenient to locate a document after clustering. Moreover, hierarchical relationships exist between document clusters. In this work, we propose an adaptive self-organizing map model, the topic-oriented self-organizing map (TOSOM), which adaptively expands the map laterally and hierarchically according to the topics of the clusters, rather than according to the data distribution as in traditional adaptive self-organizing maps such as the growing hierarchical self-organizing map (GHSOM). We conducted experiments on the Reuters-21578 dataset and obtained promising results.
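The abstract does not give the TOSOM growth rules, so they are not reproduced here. As background only, the following is a minimal sketch of the standard fixed-size self-organizing map that adaptive variants such as GHSOM and TOSOM build on, applied to toy binary term vectors. The function names, the four-word vocabulary, the decay schedules, and the "highest-weight term" labeling are all illustrative assumptions, not the thesis's method.

```python
import math
import random


def best_matching_unit(weights, v):
    """Return (row, col) of the unit whose weight vector is closest to v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))  # squared Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a basic rows x cols SOM; weights[r][c] is a list of floats."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)   # assumed shrinking neighborhood
        for v in vectors:
            br, bc = best_matching_unit(weights, v)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])  # pull unit toward the input
    return weights


def label_unit(weight_vector, vocab):
    """Illustrative label: the vocabulary term with the largest weight."""
    return vocab[max(range(len(vocab)), key=lambda i: weight_vector[i])]


# Toy data (made up): two documents about oil, two about grain.
vocab = ["oil", "crude", "wheat", "grain"]
docs = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
som = train_som(docs, rows=2, cols=2)
for doc in docs:
    r, c = best_matching_unit(som, doc)
    print(doc, "-> unit", (r, c), "label:", label_unit(som[r][c], vocab))
```

A topic-driven model would replace the fixed `rows` and `cols` with growth decisions made per cluster; how TOSOM makes those decisions is described in the thesis itself, not in this record.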
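Table of Contents
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Adaptive Self-Organizing Maps
2.1.1 Self-Organizing Map
2.1.2 Growth of the Self-Organizing Map
2.1.3 Hierarchy Generation
2.1.4 Growing Hierarchical Self-Organizing Map
2.2 Document Clustering Algorithms
2.2.1 K-means Algorithm
2.2.2 Bayesian Classification
2.2.3 Neural Network Classification
2.3 Text Mining
2.4 SOM-Based Document Clustering Algorithms
Chapter 3 Research Method and Experimental Design
3.1 Document Preprocessing
3.1.1 Word Segmentation
3.1.2 Keyword Selection
3.1.3 Document Vector Generation
3.2 Document Clustering
3.2.1 SOM Training
3.2.2 Topic-Oriented Self-Organizing Map
Chapter 4 Experimental Results and Evaluation
4.1 Experimental Procedure
4.1.1 Preprocessing
4.1.2 Data Clustering
4.2 Experimental Evaluation
Chapter 5 Conclusions and Discussion
5.1 Conclusions
5.2 Future Research Directions
References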