
National Digital Library of Theses and Dissertations in Taiwan

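Author: 王誌鴻
Author (romanized): Zhi-Hong Wang
Thesis title: 發展知識本體機制於中文新聞解析與分類
Thesis title (English): Developing Ontological Mechanisms for Chinese News Analysis and Classification
Advisor: 戚玉樑
Advisor (romanized): Yu-Liang Chi
Degree: Master's
Institution: Chung Yuan Christian University (中原大學)
Department: Graduate Institute of Information Management
Discipline: Computing
Field: General computing
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 76
Keywords (translated from Chinese): Formal Concept Analysis; Ontology; Resource Description Framework; Semantics; Concept Map; Web Ontology Language
Keywords (English): Formal Concept Analysis; Concept Mapping; Semantics; Web Ontology Language; Ontology; Resource Description Framework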
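This study develops an ontology-centered approach to semantic analysis and scenario-based classification of news: a scenario-analysis mechanism is built into the knowledge structure so that the same news item is interpreted differently under different scenarios, which aids information classification. Common document-analysis methods are still limited to human recognition of vocabulary, part-of-speech judgment, and synonym mapping, and they fall short in understanding the deeper implications of document content. Existing keyword thesauri offer only standardized explanations of individual terms and do not help interpret the semantics of a document as a whole; document analysis should be able to take into account the scenario of the person who needs the information. In the knowledge-acquisition step, this study therefore first collects the relevant elements and attributes, or any existing classification structures, and applies Formal Concept Analysis (FCA), with experts identifying the explicit and implicit relationships between elements and attributes; a concept map then integrates the relationships among all elements, attributes, and classification structures. To construct the rules for classifying news under different scenarios, the process is divided into two stages, a filtering-and-parsing stage and a scenario-classification stage, which raise the data level to the semantic level. The former uses the Resource Description Framework (RDF) to improve on existing ways of expressing term semantics; the latter uses the Web Ontology Language (OWL), which can incorporate description logic to help express knowledge for different scenarios. Together they lift the data format used for parsing news vocabulary from a plain data level to a semantic level capable of knowledge representation. Finally, taking "IC components" in the electronics industry as an example, the study captures human experts' knowledge about the major IC-component manufacturers under different scenarios, develops an ontology mechanism for parsing Chinese news headlines, and builds the subsequent semantic classification as affected by different scenarios, as an empirical application.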
Common approaches to document analysis have been limited to human understanding of vocabulary, part-of-speech judgment, and synonym mapping, and so fall short of grasping the deeper implications of a text. Building on an ontology-based knowledge classification structure, this research aims to analyze news semantics and classify news by scenario. Integrating a scenario-analysis mechanism into the knowledge structure allows the same news item to be read differently under different scenarios, which benefits information classification.
We first collected the relevant knowledge elements, their attributes, and any existing classification structures. Using Formal Concept Analysis (FCA), experts then analyzed the explicit and implicit relationships between elements and attributes, and a concept map integrated the relationships among all elements, attributes, and classification structures. To raise the analysis of news content from the data level to the semantic level, the research uses a two-stage process: a filtering-and-parsing stage based on the Resource Description Framework (RDF) and a scenario-classification stage based on the Web Ontology Language (OWL). The former improves the expression of term semantics, and the latter adds description logic to help express knowledge under different scenarios.
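As a rough illustration of the Formal Concept Analysis step, the Python sketch below enumerates the formal concepts of a tiny cross table of objects and attributes. The vendor and attribute names are hypothetical and are not data from the thesis, and the brute-force enumeration is only one simple way to compute concepts; it is not presented as the procedure the author followed.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it has.
# Illustrative values only; not taken from the thesis.
CONTEXT = {
    "VendorA": {"makes_dram", "makes_flash"},
    "VendorB": {"makes_dram"},
    "VendorC": {"makes_flash", "makes_panel"},
}


def common_attributes(objects, context):
    """Attributes shared by every object in `objects`.

    For an empty collection this returns all attributes in the context.
    """
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing each object subset.

    Exponential in the number of objects, so suitable for tiny examples only.
    """
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each pair printed is a formal concept: a maximal set of objects together with exactly the attributes they all share. Ordering these pairs by set inclusion of their extents gives the concept lattice that an expert would then inspect for explicit and implicit relationships.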
We used “IC components” in the electronics industry as a case study, capturing experts' knowledge about the major IC-component manufacturers under different scenarios. This knowledge was then used to parse Chinese news headlines with the ontology mechanism and to build a semantic classification that reflects the influence of different scenarios, serving as an empirical application.
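The two-stage idea in the abstract, filtering headlines first and then classifying them per scenario, can be sketched in plain Python. This is a minimal sketch under stated assumptions: the dictionaries stand in for the RDF filtering vocabulary and the OWL scenario rules, and every vendor name, synonym, product, and class label is hypothetical. Only the two scenario names echo the module-maker and handset scenarios listed in the thesis figures.

```python
# Stage 1 stand-in for the filtering ontology: term or synonym -> canonical vendor.
# All entries are hypothetical.
SYNONYM_OF = {
    "VA": "VendorA",
    "VendorA": "VendorA",
    "VB": "VendorB",
    "VendorB": "VendorB",
}

# Hypothetical background knowledge about what each vendor produces.
VENDOR_PRODUCTS = {
    "VendorA": {"dram", "handset"},
    "VendorB": {"dram"},
}

# Stage 2 stand-in for scenario-specific classification rules.
# Each scenario maps a vendor's product set to a hypothetical class label.
SCENARIO_RULES = {
    "module_maker": lambda products: "supplier" if "dram" in products else "unrelated",
    "handset": lambda products: "competitor" if "handset" in products else "unrelated",
}


def filter_headline(tokens):
    """Stage 1: return the canonical vendors named in an already-segmented headline."""
    return sorted({SYNONYM_OF[tok] for tok in tokens if tok in SYNONYM_OF})


def classify(vendor, scenario):
    """Stage 2: label a vendor from the point of view of one scenario."""
    return SCENARIO_RULES[scenario](VENDOR_PRODUCTS[vendor])


def classify_headline(tokens, scenario):
    """Run both stages; headlines with no known vendor yield an empty result."""
    return {vendor: classify(vendor, scenario) for vendor in filter_headline(tokens)}


if __name__ == "__main__":
    headline = ["VA", "raises", "dram", "prices"]
    for scenario in SCENARIO_RULES:
        print(scenario, classify_headline(headline, scenario))
```

The point of the sketch is that the same headline receives a different label depending on the scenario passed in, which is the behavior the abstract attributes to the scenario-classification stage. A description-logic reasoner over an OWL ontology would derive such labels from class definitions instead of hand-written functions.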
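Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Problem
1.3 Research Objectives
Chapter 2 Literature Review
2.1 Research on Document Analysis
2.2 Research on Document Classification
2.3 Ontology-Based Knowledge Systems
Chapter 3 Research Design
3.1 Design of the Filtering Ontology
3.2 Design of the Classification Ontology
3.3 Interpreting Aggregated News in a Classification System with an Ontology Mechanism
Chapter 4 System Implementation
4.1 Knowledge Acquisition
4.2 Implementing the Filtering Mechanism
4.3 Applying the Classification Mechanism to the Electronics Industry
4.4 Implementing the Ontology Mechanism in a News Analysis and Classification System
4.5 System Evaluation
Chapter 5 Conclusion
5.1 Research Contributions
5.2 Research Suggestions
References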

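List of Figures
Figure 2-1 Result returned by the Chinese word segmentation service
Figure 2-2 Seven steps for building an ontology
Figure 3-1 Design of the filtering ontology
Figure 3-2 Design of the classification ontology
Figure 3-3 Architecture for interpreting aggregated news in a classification system with an ontology mechanism
Figure 3-4 Concept map: example for a specific industry
Figure 3-5 RSS document
Figure 3-6 Word segmentation flow for Chinese news headlines
Figure 3-7 Matching segmentation results against the filtering ontology
Figure 4-1 Hierarchy of the filtering ontology
Figure 4-2 FCA cross table
Figure 4-3 ConExp dialog box
Figure 4-4 ConExp line diagram
Figure 4-5 Concept map of the electronics industry
Figure 4-6 Example RSS document
Figure 4-7 Filtering ontology for IC components
Figure 4-8 RSS items to be processed
Figure 4-9 News filtering process
Figure 4-10 Scenario classification, illustrated with the manufacturer Samsung Electronics
Figure 4-11 Material categories
Figure 4-12 Building the T-Box of the classification ontology in Protégé
Figure 4-13 Reasoning results for the classification ontology in Protégé
Figure 4-14 JSP page for setting synonyms
Figure 4-15 JSP page after RSS news filtering
Figure 4-16 Classification of the manufacturer Samsung Electronics in the module-maker scenario
Figure 4-17 Classification of the manufacturer Samsung Electronics in the handset scenario
Figure 4-18 Classification of the manufacturer Mosel Vitelic (茂矽) in the module-maker scenario
Figure 4-19 Classification of the manufacturer Mosel Vitelic (茂矽) in the handset scenario
Figure 4-20 System evaluation screen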

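List of Tables
Table 2-1 Research on text mining
Table 2-2 Summary of Chinese word segmentation tools
Table 2-3 Current research on automatic classification techniques: relational-database-based
Table 2-4 Current research on automatic classification techniques: ontology-based
Table 2-5 Definitions of ontology by scholars over the years
Table 2-6 Knowledge representation methods
Table 2-7 XML-based ontology language specifications
Table 2-8 OWL specification versions (W3C, 2004)
Table 3-1 OWL axiom restrictions
Table 3-2 Types and syntax of description logic
Table 3-3 Part of the part-of-speech tag set of the Academia Sinica Balanced Corpus
Table 4-1 Data sources: RSS news channels
Table 4-2 Word segmentation process
Table 4-3 Attributes of the electronics classification categories
Table 4-4 Logical conditions describing manufacturer concepts
Table 4-5 Logical conditions for scenario classification concepts
Table 4-6 System development environment and tools
Table 4-7 Data sources: RSS channels
Table 4-8 Confusion matrix
Table 4-9 Evaluation results of the filtering mechanism
Table 4-10 Evaluation results of classification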
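[Chinese-language references]
[1]Chi, Yu-Liang (戚玉樑), “An ontology-based procedure for building knowledge bases and its applications,” Journal of Information, Technology and Society (資訊科技與社會), 5(1), 2005, pp. 1-18.
[2]Chi, Yu-Liang (戚玉樑), “Collaborative knowledge acquisition and knowledge representation procedures for building the conceptual structure of an ontology,” Journal of Information Management (資訊管理學報), 13(2), 2006, pp. 193-212.
[3]Tseng, Yuen-Hsien (曾元顯), “A study of automatic keyword extraction techniques,” Newsletter of the Library Association of China (中國圖書館學會會訊), 5(3), 1997, p. 106.
[4]Tseng, Yuen-Hsien (曾元顯), “A study of factors affecting the effectiveness of automatic document topic classification,” Bulletin of the Library Association of China (中國圖書館學會會報), No. 68, 2002, pp. 62-83.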

[English-language references]
[1]Alani, H., Kim, S., Millard, D.E., Weal, M.J., Hall, W., Lewis, P.H., and Shadbolt, N.R., “Automatic Ontology-Based Knowledge Extraction from Web Documents,” IEEE Intelligent Systems, 18(1), 2003, pp. 14-21.
[2]Borst, W. N., “Construction of Engineering Ontologies for Knowledge Sharing and Reuse,” PhD Thesis, University of Twente, 1997.
[3]Blosseville, M. J., Hebrail, G., Monteil, M. G., and Penot, N., “Automatic document classification: natural language processing, statistical analysis, and expert system techniques used together,” Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1992, pp. 51-58.
[4]Caldas, C. H., and Soibelman, L., “Automating Hierarchical Document Classification for Construction Management Information Systems,” Automation in Construction, 12(4), 2003, pp. 359-406.
[5]Chen, H., Chau, M., and Zeng, D., “CI Spider: a tool for competitive intelligence on the Web,” Decision Support Systems, 34(1), 2002, pp. 1-17.
[6]Chandrasekaran, B., Josephson, J. R., and Benjamins, V. R., “What Are Ontologies, and Why Do We Need Them?,” IEEE Intelligent Systems, 14(1), 1999, pp. 20-26.
[7]Chang, Y.-M., and Noh, Y.-H., “Developing a specialized directory system by automatically classifying Web documents,” Journal of Information Science, 29(2), 2003, pp. 117-126.
[8]Daconta, M.C., Obrst, L.J., and Smith, K.T., The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Indiana: Wiley, 2003.
[9]Decker, S., Melnik, S., Harmelen, F. V., Fensel, D., Klein, M., Broekstra, J., Erdmann, M., and Horrocks, I., “The Semantic Web: The Roles of XML and RDF,” Internet Computing, 4(4), 2000, pp. 63-74.
[10]Fensel, D., Horrocks, I., Harmelen, F. V., McGuinness, D., and Patel-Schneider, P.F., “OIL: Ontology Infrastructure to Enable the Semantic Web,” IEEE Intelligent Systems, 16(2), 2001, pp. 38-45.
[11]Dumais, S., and Chen, H., “Hierarchical Classification of Web Content,” in proceedings of the 23rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2000, pp.256-263.
[12]Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., “The KDD Process for Extracting Useful Knowledge from Volumes of Data,” Communications of the ACM, 39, 1996b, pp. 27-34.
[13]Feldman, R., and Dagan, I., “Knowledge Discovery in Textual Databases (KDT),” Proceedings of the First ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1995, pp. 112-117.
[14]Feldman, R., Klösgen, W., Ben-Yehuda, Y., and Kedar, G., “Pattern Based Browsing in Document Collections,” Proceedings of First European Symposium on Principles of Data Mining and Knowledge Discovery, 1997, pp. 112-122.
[15]Feldman, R., and Dagan, I., “Mining Text Using Keyword Distributions,” Journal of Information Systems, 10(3), 1998, pp. 281-300.
[16]Fensel, D., McGuinness, D.L., Ng, W.K., Schulten, E., Lim, E., and Yan, G., “Ontologies and Electronic Commerce,” IEEE Intelligent Systems, 16(1), 2001, pp. 8-14.
[17]Gillam, L., Tariq, M., and Ahmad, K., “Terminology and the construction of ontology,” Terminology, 11(1), 2005, pp. 55-81.
[18]Guarino, N., “Formal ontology, conceptual analysis and knowledge representation,” International Journal of Human-Computer Studies, 45(5), 1995, pp. 625-640.
[19]Gruber, T.R., “A translation approach to portable ontologies,” Knowledge Acquisition, 5(2), 1993, pp. 199-200.
[20]Gomez-Perez, A., Fernandez-Lopez, M., and Corcho, O., Ontological Engineering: with Examples from the Areas of Knowledge Management, e-Commerce and the Semantic Web, Springer, New York, 2004.
[21]Huhns, M. N., and Stephens, L. M., “Personal Ontologies,” IEEE Internet Computing, 3(5), 1999, pp. 85-87.
[22]Huhns, M. N., and Singh, M.P., “Ontologies for Agents,” IEEE Internet Computing, 1(6), 1997, pp. 81-83.
[23]Horrocks, I., Patel-Schneider, P. F., and Harmelen, F. V., “From SHIQ and RDF to OWL: the making of a Web Ontology Language,” Web Semantics: Science, Services and Agents on the World Wide Web, 1(1), 2003, pp. 7-26.
[24]Joachims, T., “A Statistical Learning Model of Text Classification for Support Vector Machines,” in proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001, pp. 128-136.
[25]Joachims, T., “Text Categorization with Support Vector Machines: Learning with Many Relevant Features,” in proceedings of 10th European ECML Conference on Machine Learning, 1998, pp.137-142.
[26]Kuo, Y. H., and W, M. H., “Web Document Classification based on Hyperlinks and Document Semantics,” PRICAI Workshop on Text and Web Mining, 2000.
[27]Kim, H.L., Kim, H.G., and Park, K.M., “Posters Ontalk: Ontology-Based Personal Document Management System,” the 13th International World Wide Web Conference on Alternate Track Papers & Posters, May 17-22, 2004, pp. 420-421.
[28]Lam, W., and Ho, C. Y., “Using A Generalized Instance Set for Automatic Text Categorization,” in proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998, pp. 81-89.
[29]Larkey, L., and Croft, W., “Combining classifiers in Text Categorization,” in proceedings of the 19th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1996, pp.289-297.
[30]Lee, C. S., Chen, Y. J., and Jian, Z. W., “Ontology-based Fuzzy Event Extraction Agent for Chinese E-news Summarization,” Expert Systems with Applications, 25(3), 2003, pp. 431-447.
[31]Lehmann, F. (ed.), Semantic Networks in Artificial Intelligence, Elsevier Science Inc, New York, USA, 1992.
[32]López de Vergara, J.E., Villagrá, V.A., and Berrocal, J., “Applying the Web Ontology Language to management information definitions,” IEEE Communications Magazine, 42(7), 2004, pp. 68-74.
[33]Maedche, A., Motik, B., Stojanovic, L., Studer, R., and Volz, R., “Ontologies for Enterprise Knowledge Management,” IEEE Intelligent Systems, 18(2), 2003, pp. 26-33.
[34]Marwick, A. D., “Knowledge Management Technology,” IBM Systems Journal, 40(4), 2001, pp. 814-830.
[35]Martin, P., and Eklund, P., “Embedding Knowledge in Web Documents,” Computer Networks, 31(11-16), 1999, pp. 1403-1419.
[36]Minsky, M., “A Framework for Representing Knowledge,” in Winston, P.J. (ed.), The Psychology of Computer Vision, McGraw-Hill, New York, 1975.
[37]Moffat, M., “RSS: a primer for publishers and content providers,” New Review of Information Networking, 9(1), 2003, pp. 123-144.
[38]Noy, N. F., and McGuinness, D. L., “Ontology Development 101: A Guide to Creating Your First Ontology,” Technical Report KSL-01-05, Stanford Knowledge Systems Laboratory, 2001.
[39]Neches, R., Fikes, R., Finin, T., Gruber, T., Patil, R., Senator, T., and Swartout, W., “Enabling technology for knowledge sharing,” AI Magazine, 12(3), 1991, pp. 36-56.
[40]Oh-Woog, K., and Lee, J-H., “Web Page Classification Based on K-Nearest Neighbor Approach,” in proceedings of the 5th International Workshop on Information Retrieval, 2000, pp. 9-15.
[41]O’Leary, D. E., “Different Firms, Different Ontologies, and No One Best Ontology,” IEEE Intelligent Systems, 15(5), 2000, pp. 72-78.
[42]Romey, W. D., Inquiry Techniques for Science Education, Prentice-Hall, Englewood Cliffs, New Jersey, 1968.
[43]Salton, G., and McGill, M. J., Introduction to Modern Information Retrieval, McGraw-Hill Book Co, New York, 1983.
[44]Sannomiya, T., Amagasa, T., Yoshikawa, M., and Uemura, S., “A Framework for Sharing Personal Annotations on Web Resources Using XML,” Information Technology for Virtual Enterprises, 2001. ITVE 2001. Proceedings. Workshop, Jan. 29-30, 2001, pp. 40-48.
[45]Shewhart, M., and Wasson, M., “Monitoring a newsfeed for hot topics,” Proceedings of the 5th Int’l Conf. on Knowledge Discovery and Data Mining, 1999, pp. 402-404.
[46]Singh, L., Scheuermann, P., and Chen, B., “Generating Association Rules from Semi-Structured Documents Using an Extended Concept Hierarchy,” ACM CIKM, 1997, pp. 193-200.
[47]Singh, L., Chen, B., Haight, R., and Scheuermann, P., “An Algorithm for Constrained Association Rule Mining in Semi-structured Data,” PAKDD-99, 1999, pp.148-158.
[48]Siolas, G., and Alche, F. d., “Support Vector Machines based on a Semantic Kernel for Text Categorization, ” in proceedings of the IEEE-INNS-ENNS International Joint Conference on , 2000, pp.205-209.
[49]Negash, S., “Business Intelligence,” Communications of the Association for Information Systems, 13, 2004, pp. 177-195.
[50]Song, M. H., Lim, S. Y., Kang, D. J., and Lee, S. J., “Automatic Classification of Web Pages Based on the Concept of Domain Ontology,” Web Information Systems Engineering, 2002. WISE 2002. Proceedings of the Third International Conference on, 2005, pp. 182-191.
[51]Soo, V.W., Lee, C.Y., Li, C.C., Chen, S.L., and Chen, C.C., “Automated Semantic Annotation and Retrieval Based on Sharable Ontology and Case-Based Learning Techniques,” Digital Libraries, 2003. Proceedings. 2003 Joint Conference, May 27-31, 2003, pp. 61-72.
[52]Stamou, G., Osseenbruggen, J., Pan, Z., and Schreiber, G., “Multimedia Annotations on the Semantic Web,” IEEE Multimedia, 13(1), 2006, pp. 86-90.
[53]Studer, R., Benjamins, V. R., and Fensel, D., “Knowledge Engineering: Principles and methods,” Data & Knowledge Engineering, 25(1-2), 1998, pp.161-197.
[54]Swartout, B., Patil, R., Knight, K., and Russ, T., “Toward distributed use of large-scale ontologies,” Ontological Engineering, AAAI-97 Spring Symposium Series, 1997, pp. 138-148.
[55]Taghva, K., Borsack, J., Coombs, J., Condit, A., Lumos, S., and Nartker, T., “Ontology-based Classification of Email,” Information Technology, 2003, pp.194-198.
[56]Tam, A.M., and Leung, C.H.C., “Structured Natural-Language Descriptions for Semantic Content Retrieval of Visual Materials,” Journal of the American Society for Information Science and Technology, 52(11), 2000, pp. 930-937.
[57]Uschold, M., and Gruninger, M., “Ontologies: Principles, Methods and Applications,” The Knowledge Engineering Review, 11(2), 1996, pp. 93-136.
[58]van der Vet, P. E., and Mars, N. J. I., “Bottom-up construction of ontologies,” IEEE Transactions on Knowledge and Data Engineering, 10(4), 1998, pp. 513-526.
[59]W3C Recommendation, “OWL: Web Ontology Language Overview,” 2004, Online Available at: http://www.w3c.org/TR/owl-features/
[60]W3C, “OWL: Web Ontology Language Use Cases and Requirements,” 2004, Online Available at: http://www.w3c.org/TR/webont-req/
[61]Wolstencroft, K., Lord, P., Tabernero, L., Brass, A., and Stevens, R., “Protein Classification Using Ontology Classification,” Bioinformatics, 2006.
[62]Wuthrich, B., Cho, V., Leung, S., Permunetilleke, D., Sankaran, K., Zhang, J., “Daily Stock Market Forecast from Textual Web Data,” IEEE International Conference on SMC, 3, 1998, pp. 2720-2725.
[63]Yang, Y., and Liu, X., “A re-examination of text categorization methods,” in proceedings of the 22nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1999, pp. 42-49.
[64]Zhuang, Y.T., Pan, Y.H., and Rui, Y., “Using Semantic Association to Support Content-Based Video Queries,” Journal of computer research & development, 36(5), 1999, pp. 613-616.
Electronic full text: access is restricted to the on-campus system and IP range of the author's university.