
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

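Author: Chin-Chang Chu (朱慶章)
Thesis title: A Text Mining Approach for Construction of a Web Page Navigation Structure (應用文本探勘於網頁導覽架構建立之研究)
Advisor: Hsin-Chang Yang (楊新章)
Degree: Master's
Institution: Chang Jung Christian University (長榮大學)
Department: Master's Program, Department of Information Management
Discipline: Computer Science
Sub-discipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2005
Graduation academic year: 93 (ROC calendar, 2004-2005)
Language: Chinese
Number of pages: 71
Keywords (Chinese): 文本探勘; 自我組織圖; 主題地圖; 網頁目錄
Keywords (English): Text Mining; Self-Organizing Map; Topic Maps; Web Directories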
The World Wide Web (WWW) contains an enormous number of web pages, so retrieving the information a user needs from it has become an arduous task. The WWW community uses several models for retrieving web pages. One of the most widely used is to let users browse a predefined hierarchy of web directories until they reach their goal. These web directories are compiled or classified folders of web pages, usually organized into a hierarchical structure. Classifying web pages into the proper directories and organizing the directory hierarchy are generally done manually by human experts. In this work, we propose a method that applies a text mining technique to a set of web pages in order to create web directories automatically and organize them into a hierarchy. The resulting structure is then represented in the Topic Maps standard format for convenient exchange. The method uses the self-organizing map learning algorithm to perform the text mining, and its advantage is that it requires no human intervention during the construction of the web directories and their hierarchy. The experimental results show that our method can produce comprehensible and reasonable web directories and hierarchies.
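As an illustration of the self-organizing map step named in the abstract, the following is a minimal sketch rather than the thesis's implementation. The grid size, decay schedules, Gaussian neighbourhood, function names (`train_som`, `best_matching_unit`), and the toy page vectors are all assumptions made for this example; the record does not give the actual parameters.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(weights, key=lambda unit: math.dist(weights[unit], vector))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on equal-length numeric vectors.

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map unit, initialised with random values in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for unit, w in weights.items():
                grid_dist = math.dist(unit, bmu)
                if grid_dist <= radius:
                    # Units near the winner on the grid move towards the input.
                    influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += lr * influence * (v[i] - w[i])
    return weights


# Hypothetical binary keyword vectors for four web pages.
page_vectors = {
    "page_a": [1, 0, 0, 0, 1],
    "page_b": [1, 0, 0, 0, 1],
    "page_c": [0, 1, 1, 1, 0],
    "page_d": [0, 1, 1, 1, 0],
}

weights = train_som(list(page_vectors.values()), rows=2, cols=2)

# Pages that share a best-matching unit form one candidate directory.
clusters = {}
for name, vec in page_vectors.items():
    clusters.setdefault(best_matching_unit(weights, vec), []).append(name)
print(clusters)
```

Grouping pages by their best-matching unit is one plausible way to read candidate directories off a trained map; how the thesis derives the hierarchy from the map is described in its Chapter 3 and is not reproduced here.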
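Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Text Mining
2.2 Artificial Neural Networks
2.3 Self-Organizing Map
2.4 Topic Maps
2.4.1 XTM
2.4.2 Omnigator
Chapter 3 Methodology
3.1 Preprocessing
3.1.1 Chinese Word Segmentation
3.1.2 Keyword Selection
3.1.3 Vector Space Model
3.2 Self-Organizing Map Training
3.3 Feature Map Generation
3.4 Automatic Generation of Web Directories
3.4.1 Generating the Directory Hierarchy
3.4.2 Generating Directories
3.5 Topic Maps
Chapter 4 Empirical Analysis
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References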
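The preprocessing steps listed under Chapter 3 (word segmentation, keyword selection, vector space model) can be sketched roughly as follows. This is a hedged illustration only: whitespace splitting stands in for Chinese word segmentation, the document-frequency threshold `min_df` is a hypothetical selection rule, and binary term weights are an assumption, since this record does not state which segmenter, selection criterion, or weighting scheme the thesis uses.

```python
from collections import Counter


def select_keywords(tokenized_pages, min_df=2):
    """Keep terms that occur in at least `min_df` pages (hypothetical rule)."""
    doc_freq = Counter()
    for tokens in tokenized_pages:
        doc_freq.update(set(tokens))  # count each term once per page
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


def to_binary_vector(tokens, vocabulary):
    """Encode a page as a 0/1 vector over the selected keywords."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Toy pages; real input would be segmented Chinese web page text.
pages = [
    "text mining web directory",
    "web directory hierarchy",
    "neural network map",
    "self organizing map neural network",
]

tokenized = [page.split() for page in pages]   # stand-in for word segmentation
vocabulary = select_keywords(tokenized)
vectors = [to_binary_vector(tokens, vocabulary) for tokens in tokenized]

print(vocabulary)
print(vectors)
```

Vectors of this kind are the sort of input a self-organizing map is trained on; a weighted scheme such as term frequency could replace the binary encoding without changing the overall shape of the pipeline.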