
National Digital Library of Theses and Dissertations in Taiwan

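Author: 林紘靖
Author (English): Hun-Ching Lin
Thesis title (Chinese): 以模糊正規概念分析法進行自動化文件分類
Thesis title (English): Automatic Document Classification Using Fuzzy Formal Concept Analysis
Advisor: 李昇暾
Advisor (English): Sheng-tun Li
Degree: Master's
Institution: National Cheng Kung University
Department: Institute of Information Management
Discipline: Computer Science
Subdiscipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2009
Graduation academic year: 97 (ROC calendar)
Language: Chinese
Pages: 54
Keywords (Chinese): 資訊擷取 (information retrieval); 模糊正規化概念分析 (fuzzy formal concept analysis); 模糊邏輯 (fuzzy logic); 正規化概念分析 (formal concept analysis); 文件分類 (document classification)
Keywords (English): fuzzy FCA; fuzzy logic; FCA; document categorization; information retrieval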
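Abstract (translated from the Chinese):
With the spread of computers, the rapid growth of the Internet, and the arrival of the knowledge era, the number of digital documents has grown explosively. Entering a keyword into a search engine often returns tens of thousands of results, and finding the documents that meet a need in a large document database has become an increasingly time-consuming task. Researchers have therefore begun to study how to find relevant documents in very large data sets, and automatically classifying and organizing digital documents has become an important issue in managing document databases.
In recent years, research applying Formal Concept Analysis (FCA) to information retrieval has flourished. However, because traditional FCA has difficulty representing imprecise information such as document classification (Tho et al., 2006), some researchers have proposed Fuzzy FCA (FFCA), which combines fuzzy theory with FCA (Burusco and Fuentes-Gonzales, 1994). Since then, research on fuzzy FCA has developed rapidly. This study uses information retrieval techniques to let a machine analyze a document collection automatically and select the most important keywords in the whole collection. It then uses information retrieval techniques to assign fuzzy membership degrees, and classifies the documents with Fuzzy Formal Concept Analysis (FFCA). The study performs the classification by reasoning over the concept lattice produced by FCA, in an attempt to find uses for the concept lattice beyond representing the structure of domain knowledge. We hope this work improves and supports research on FCA-based classification and on FCA itself.
The results confirm that the concept lattice produced by FCA, supplemented with fuzzy mathematics, does yield accurate classification. The method also gives relatively stable results when classifying across multiple categories.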
Abstract (English):
As computers become widespread, the Internet develops, and the knowledge era arrives, the number of digital documents is growing explosively. A search engine query often returns a huge number of results, and it is becoming more and more time-consuming to find the required documents in a large database. Researchers have therefore begun to study how to retrieve the required documents from very large collections, and automatic categorization of documents has become an important issue in managing document databases.
In recent years, more and more research has applied formal concept analysis (FCA) to information retrieval. However, classical FCA has difficulty representing the fuzzy information involved in document categorization (Tho et al., 2006), so some researchers have combined fuzzy theory with FCA to form fuzzy FCA (Burusco and Fuentes-Gonzales, 1994). Research on fuzzy FCA has grown rapidly since then. This study analyzes documents with information retrieval techniques to find the most important keywords of a given data set, assigns fuzzy membership degrees to them, and then categorizes the documents with fuzzy FCA. The categorization is computed by reasoning over the concept lattice produced by the FCA process, in order to find an application of the concept lattice beyond representing domain knowledge. We hope this work is helpful to research on document categorization using FCA and to FCA itself.
The results show that categorization using the concept lattice combined with fuzzy logic is accurate. The results are also relatively stable when classifying across multiple categories.
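The pipeline described in the abstract (select keywords with information retrieval weights, turn the weights into fuzzy membership degrees, then build formal concepts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's actual algorithm: it assumes whitespace tokenization, plain TF-IDF as the keyword weight, normalization by the maximum weight to obtain memberships in [0, 1], and a simple α-cut to obtain a crisp context before enumerating concepts. The function names `tfidf_memberships`, `alpha_cut`, and `concepts` are hypothetical, and the classification step (finding the most similar concept and the keyword-category association matrix) is not reproduced.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_memberships(docs, top_k):
    """Pick the top_k keywords by total TF-IDF and scale weights into [0, 1]."""
    # Assumption: whitespace tokenization; Chinese text would need segmentation.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })

    # Rank terms by their summed TF-IDF weight; break ties alphabetically.
    totals = Counter()
    for row in weights:
        totals.update(row)
    keywords = sorted(totals, key=lambda t: (-totals[t], t))[:top_k]

    # Assumption: membership degree = weight divided by the largest weight.
    max_w = max((row.get(k, 0.0) for row in weights for k in keywords),
                default=0.0) or 1.0
    membership = [{k: row.get(k, 0.0) / max_w for k in keywords}
                  for row in weights]
    return keywords, membership


def alpha_cut(membership, keywords, alpha):
    """Keep, for each document, the keywords whose membership is >= alpha."""
    return [frozenset(k for k in keywords if row[k] >= alpha)
            for row in membership]


def concepts(context, keywords):
    """Enumerate formal concepts (extent, intent) by closing every keyword subset.

    Exponential in the number of keywords, so only suitable for small examples.
    """
    found = set()
    for size in range(len(keywords) + 1):
        for attrs in combinations(keywords, size):
            extent = frozenset(i for i, have in enumerate(context)
                               if have.issuperset(attrs))
            if extent:
                intent = frozenset.intersection(*(context[i] for i in extent))
            else:
                intent = frozenset(keywords)  # empty extent: all attributes
            found.add((extent, intent))
    return found


if __name__ == "__main__":
    docs = [
        "fuzzy logic fuzzy sets",
        "concept lattice concept analysis",
        "fuzzy concept lattice",
    ]
    keywords, mu = tfidf_memberships(docs, top_k=2)
    context = alpha_cut(mu, keywords, alpha=0.5)
    for extent, intent in sorted(concepts(context, keywords),
                                 key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
```

Cutting at a threshold discards the graded memberships once the context is built. A fuzzy FCA approach would typically carry the membership degrees into the concepts themselves, which is where similarity between a new document and a concept could be computed.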
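Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Steps and Process
1.4 Thesis Organization
1.5 Research Limitations
Chapter 2 Literature Review
2.1 Information Retrieval
2.1.1 Inverse Conformity Frequency (逆向一致頻率)
2.1.2 Uniformity (一致性)
2.1.3 Term Frequency-Inverse Document Frequency (TF-IDF)
2.1.4 Document Classification Techniques
2.2 Fuzzy Logic
2.2.1 Fuzzy α-Cuts
2.2.2 Standard Fuzzy Operators
2.2.3 Fuzzy Composition
2.2.4 Fuzzy Set Similarity
2.3 Formal Concept Analysis
2.3.1 Formal Context
2.3.2 Concept Lattice
2.3.3 Concept Learning
2.3.4 Fuzzy Formal Concept Analysis
Chapter 3 Research Method
3.1 Concept Learning
3.1.1 Feature Extraction
3.1.2 Constructing the Concept Lattice
3.1.3 Constructing the Keyword-Category Association Matrix
3.2 Classifying New Documents
3.2.1 Finding the Most Similar Concept
3.2.2 Determining the Best Category
Chapter 4 Experiments and Analysis
4.1 Experimental Method
4.2 Experimental Results
4.3 Sensitivity Analysis
4.4 Comparison of Experimental Results
4.5 Analysis of the Classification Methods
4.5.1 K-NN
4.5.2 Decision Trees and Decision Tables
4.5.3 Neural Networks (Support Vector Machines, SVM)
4.5.4 Bayesian Networks
4.5.5 Fuzzy Formal Concept Analysis
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Application Value
5.3 Future Work
References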
Abebe, A. J., Guinot, V. and Solomatine, D. P. (2000). "Fuzzy alpha-cut vs. Monte Carlo techniques in assessing uncertainty in model parameters".
Ankerst, M., Ester, M. and Kriegel, H. P. (2000). "Towards an effective cooperation of the user and the computer for classification".
Becker, P. (2005). "Using intermediate representation systems to interact with concept lattices".
Brunato, M. and Battiti, R. (2005). "Statistical learning theory for location fingerprinting in wireless LANs." Computer Networks, 47(6): 825-845.
Burusco, A. and Fuentes-Gonzales, R. (1994). "The study of the L-fuzzy concept lattice." Mathware & Soft Computing, 3: 209-218.
Cole, R., Eklund, P. and Amardeilh, F. (2003). "Browsing Semi-structured Texts on the web using Formal Concept Analysis".
Cole, R. J. (2000). The Management and Visualisation of Document Collections Using Formal Concept Analysis. School of Information Technology, Griffith University.
Cross, V. (2003). "Uncertainty in the automation of ontology matching". Paper presented at the ISUMA.
Eklund, P. and Wormuth, B. (2005). "Restructuring help systems using formal concept analysis".
El-Naqa, I., Yang, Y., Wernick, M. N., Galatsanos, N. P. and Nishikawa, R. M. (2002). "A support vector machine approach for detection of microcalcifications." IEEE Transactions on Medical Imaging, 21(12): 1552-1563.
Everts, T. J., Park, S. S. and Kang, B. H. (2006). "Using formal concept analysis with an incremental knowledge acquisition system for web document management".
Foo, S. and Li, H. (2004). "Chinese word segmentation and its effect on information retrieval." Information Processing and Management, 40(1): 161-190.
Frank, E., Hall, M., Trigg, L., Holmes, G. and Witten, I. H. (2004). Data mining in bioinformatics using Weka (Vol. 20, pp. 2479-2481): Oxford Univ Press.
Huang, Y. L. (1998). "A Theoretic and Empirical Research of Cluster Indexing for Mandarin Chinese Full Text Document." The Journal of Library and Information Science, 24: 1023-2125.
Hurtado, J. E. (2004). "An examination of methods for approximating implicit limit state functions from the viewpoint of statistical learning theory." Structural Safety, 26(3): 271-293.
Kim, M. and Compton, P. (2004). "Evolutionary document management and retrieval for specialized domains on the web." International Journal of Human-Computer Studies, 60(2): 201-241.
Klir, G. J. and Yuan, B. (1995). "Fuzzy sets and fuzzy logic: theory and applications": Prentice Hall Upper Saddle River, NJ.
Kretschmann, E., Fleischmann, W. and Apweiler, R. (2001). Automatic rule generation for protein annotation with the C4.5 data mining algorithm applied on SWISS-PROT (Vol. 17, pp. 920-926): Oxford Univ Press.
Lee, H. M., Chen, C. M. and Hwang, C. W. (2000). "A Neural Network Document Classifier with Linguistic Feature Selection." Lecture Notes in Computer Science: 555-560.
Luo, J., Savakis, A. E., Etz, S. P. and Singhal, A. (2000). "On the application of Bayes networks to semantic understanding of consumer photographs".
Mahadevan, S. and Rebba, R. (2005). "Validation of reliability computational models using Bayes networks." Reliability Engineering and System Safety, 87(2): 223-232.
Manevitz, L. M. and Yousef, M. (2002). "One-class svms for document classification." The Journal of Machine Learning Research, 2: 139-154.
Nie, J. Y. and Ren, F. (1999). "Chinese Information Retrieval: Using Characters or Words?" Information Processing & Management, 35(4): 443-462.
Pawlak, Z. (2002). "Rough sets, decision algorithms and Bayes' theorem." European Journal of Operational Research, 136(1): 181-189.
Perez, M. S., Sanchez, A., Herrero, P., Robles, V. and Pena, J. M. (2005). "Adapting the weka data mining toolkit to a grid based environment." Lecture Notes in Computer Science, 3528: 492-497.
Priss, U. (2006). "Formal concept analysis in information science." Annual Review of Information Science and Technology, 40: 521.
Quan, T. T., Hui, S. C. and Cao, T. H. (2004). "A fuzzy FCA-based approach for citation-based document retrieval".
Quinlan, J. R. (1987). "Generating production rules from decision trees".
Salton, G. and Buckley, C. (1988). "Term Weighting Approaches in Automatic Text Retrieval." Information Processing and Management, 24(5): 513-523.
Tho, Q. T., Hui, S. C., Fong, A. C. M. and Cao, T. H. (2006). "Automatic Fuzzy Ontology Generation for Semantic Web." IEEE Transactions on Knowledge and Data Engineering: 842-856.
Wang, T. Y. and Chiang, H. M. (2007). "Fuzzy support vector machine for multi-class text categorization." Information Processing and Management, 43(4): 914-929.
Wille, R. (2005). "Formal Concept Analysis as Mathematical Theory of Concepts and Concept Hierarchies." Lecture Notes in Computer Science, 3626: 1.
Wille, R. (1982). "Restructuring lattice theory: an approach based on hierarchies of concepts".
Wolff, K. E. (1993). "A first course in formal concept analysis." SoftStat, 93.
Zadeh, L. A. (1965). "Fuzzy sets." Information and control, 8(3): 338-353.