National Digital Library of Theses and Dissertations in Taiwan

Author: 陳丁溫 (Ding-Wen Chen)
Title: A Multilingual Information Retrieval Approach Based on Growing Hierarchical Self-Organizing Maps
Title (original): 應用增長層級式自我組織映射圖於多國語言資訊檢索
Advisor: 吳美宜 (Mei-Yi Wu)
Degree: Master's
Institution: 長榮大學 (Chang Jung Christian University)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Sub-discipline: General Computer Science
Type: Academic thesis
Year of publication: 2007
Language: Chinese
Keywords: multilingual information retrieval, text clustering, neural network, growing hierarchical self-organizing maps

Abstract:

With the growing volume of multilingual text on the Internet, multilingual information retrieval has become an important research topic. This thesis describes a method we developed for discovering knowledge in multilingual documents. We collected Chinese and English news articles from Taiwan Panorama magazine; the test corpus contains 976 pairs of Chinese-English parallel documents.

In this study, we adopt a neural-network text clustering method, the growing hierarchical self-organizing map (GHSOM), to help discover relationships among multilingual documents. We conducted experiments on the Chinese-English bilingual parallel corpus to uncover relationships between documents. The experimental results show that our multilingual text mining approach can capture conceptual relationships among documents written in different languages.
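To make the clustering idea concrete, here is a minimal sketch of a plain self-organizing map, not the GHSOM used in the thesis (no map growth and no hierarchy). It assumes each parallel pair is represented by one normalized term-count vector over a combined Chinese-English vocabulary; that representation, the toy documents, and every function name and parameter below are illustrative assumptions, not taken from the thesis.

```python
import math
import random

# Hypothetical toy corpus: (pre-segmented Chinese tokens, English tokens) per pair.
PAIRS = [
    (["茶", "文化", "茶", "農"], ["tea", "culture", "tea", "farmer"]),
    (["茶", "農", "高山"], ["tea", "farmer", "mountain"]),
    (["電腦", "晶片", "產業"], ["computer", "chip", "industry"]),
    (["晶片", "產業", "出口"], ["chip", "industry", "export"]),
]

def build_vocab(pairs):
    """Map every Chinese and English token to one index in a shared vocabulary."""
    tokens = sorted({tok for zh, en in pairs for tok in zh + en})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(tokens, vocab):
    """Term-count vector scaled to unit length (all zeros if no token is known)."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, vec):
    """Index of the map unit whose weight vector is closest to vec."""
    return min(range(len(weights)), key=lambda u: sq_dist(weights[u], vec))

def quantization_error(weights, data):
    """Mean distance from each input vector to its best matching unit."""
    return sum(
        math.sqrt(sq_dist(weights[best_matching_unit(weights, v)], v)) for v in data
    ) / len(data)

def train_som(data, rows=2, cols=2, epochs=200, seed=0):
    """Train a rows x cols SOM; epochs=0 returns the random initial weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                              # decaying learning rate
        radius = max(0.1, max(rows, cols) / 2 * (1.0 - frac))  # shrinking neighborhood
        for vec in rng.sample(data, len(data)):              # shuffled pass over the data
            br, bc = divmod(best_matching_unit(weights, vec), cols)
            for u in range(rows * cols):
                r, c = divmod(u, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))   # Gaussian neighborhood
                weights[u] = [w + lr * h * (x - w) for w, x in zip(weights[u], vec)]
    return weights

vocab = build_vocab(PAIRS)
data = [vectorize(zh + en, vocab) for zh, en in PAIRS]
initial = train_som(data, epochs=0)
trained = train_som(data)

for i, vec in enumerate(data):
    print("pair", i, "-> unit", best_matching_unit(trained, vec))

# A monolingual query can be placed on the same map as the bilingual pairs.
query = vectorize(["tea", "farmer"], vocab)
print("English query -> unit", best_matching_unit(trained, query))
```

Because both languages share one vector space in this sketch, a query in either language lands on a map unit whose associated pairs carry text in both languages. GHSOM would additionally add rows or columns to a map, and expand individual units into child maps, when their quantization error stays high.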

Table of contents:

Acknowledgments
Chinese Abstract
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Problem Domain
1.4 Thesis Organization
Chapter 2 Related Work
2.1 Multilingual Information Retrieval
2.1.1 Dictionary-Based Approaches
2.1.2 Thesaurus-Based Approaches
2.1.3 Corpus-Based Approaches
2.2 Text Clustering Algorithms
2.2.1 k-Nearest Neighbor Classifier
2.2.2 Naïve Bayesian Classifier
2.2.3 Neural Network Classifier
Chapter 3 Research Method and Experimental Design
3.1 Document Preprocessing
3.1.1 Word Segmentation
3.1.2 Feature Selection
3.1.3 Vector Space Model
3.2 Text Clustering
3.2.1 Self-Organizing Map (SOM)
3.2.2 Growing Hierarchical Self-Organizing Map (GHSOM)
Chapter 4 Experimental Results
4.1 Experimental Procedure
4.1.1 Preprocessing
4.1.2 Text Clustering
4.1.3 Retrieval Interface
4.2 Experimental Evaluation
Chapter 5 Conclusions and Future Work
5.1 Chapter Review
5.2 Conclusions
5.3 Future Work
References
