National Digital Library of Theses and Dissertations in Taiwan

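Graduate student: 張家豪
Graduate student (English): Chia-Hao Chang
Thesis title: 網際網路搜尋資訊之涵意探究及其變化偵測
Thesis title (English): Concepts Extraction and Change Detection from Navigated Information over the Internet
Advisor: 張德民
Advisor (English): Te-Min Chang
Degree: Master's
Institution: National Sun Yat-sen University
Department: Department of Information Management
Discipline: Computer Science
Subdiscipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: English
Number of pages: 63
Keywords (Chinese, translated): Spreading Activation Theory; Concept Change Tracking; Concept Extraction; Internet
Keywords (English): Spreading Activation Theory; Concept Change Detection; Concepts Extraction; Internet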
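The emergence of the Internet has made the communication of information worldwide much easier. The Internet interconnects information around the world, and users can query the information they need through Internet search engines. Although search engines help users collect information, users cannot distill from the large volume of results the concepts those results contain. Moreover, information on the Internet changes over time, which makes it even harder for users to track changes in a topic's concepts and what those changes mean. This research therefore proposes a two-stage incremental method that, for a topic of interest to the user, searches for related information and derives a concept structure graph representing the topic, and then uses spreading activation theory to detect concept changes over time and identify the meaning of those changes.
Experiments were then conducted to verify the applicability of the proposed method. Experiment I evaluates the output of the first stage; validation by domain experts shows high precision and recall. Experiment II evaluates the concept changes tracked by the method; validation by domain experts likewise shows a high agreement rate. These experiments demonstrate the practicality of the proposed method in real cases. With the help of this method, users can therefore readily understand the concepts of the topics they are interested in and learn how those concepts change over time.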
The emergence of the Internet has made global information communication much easier than before. Users can navigate to the information they want over the Internet by means of search engines. Even though search engines can help users search for a specified topic in a basic way, users usually cannot grasp what the navigated results mean as a whole. In addition, information on the Internet keeps changing. Users cannot even keep track of the changes, let alone comprehend what those changes mean. Consequently, this research proposes a two-stage incremental approach. The first stage derives a concept structure that represents the main concepts of the search results. The second stage tracks how those concepts change over time, based on spreading activation theory.
Experiments are conducted to examine the feasibility of the proposed approach. The first experiment evaluates the results of the first stage. It shows that recall and precision, measured against the results of human experts, are quite satisfactory. The second experiment examines the concept changes reported by the entire proposed approach. It shows that domain experts agree with the results to a high degree. Both experiments support the feasibility of the proposed approach in real applications. That is, by applying the proposed approach, users can readily focus on the topic they are interested in and follow how it develops over time.
Keywords: Internet, Concepts Extraction, Concept Change Detection, Spreading Activation Theory.
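The first stage described in the abstract builds a concept structure from search results, and the table of contents names a co-occurrence graph as one of its steps. The record does not give the thesis's actual formulas, so the following is only a minimal sketch of one common way to count term co-occurrence across documents. The function name `cooccurrence_counts`, the whitespace tokenization, and the sample documents are all assumptions made for illustration, not the author's method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of terms, how many documents contain both.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    which is an assumption and not the preprocessing used in the thesis.
    """
    counts = Counter()
    for text in documents:
        # Deduplicate terms per document and sort so each pair has one canonical order.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


# Illustrative documents, invented for this sketch.
docs = [
    "search engine results",
    "search engine ranking",
    "concept change detection",
]
graph = cooccurrence_counts(docs)
print(graph[("engine", "search")])
```

Each key of `graph` can be read as an edge of a co-occurrence graph and each count as its weight; a clustering step would then group strongly connected terms into candidate concepts.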
CHAPTER 1 INTRODUCTION
1.1 OVERVIEW
1.2 OBJECTIVE OF THE RESEARCH
1.3 ORGANIZATION OF THE THESIS
CHAPTER 2 LITERATURE REVIEW
2.1 INFORMATION RETRIEVAL
(1) Boolean retrieval model
(2) Vector space model
(3) Probabilistic model
2.2 TEXT MINING
(1) Text Categorization
(2) Document Clustering
(3) Concept Extraction
2.3 CLUSTERING TECHNIQUES
2.4 SPREADING ACTIVATION THEORY
CHAPTER 3 PROPOSED APPROACH
3.1 CONCEPT EXTRACTION STAGE
Step 1: Preprocessing documents
Step 2: Establishing the co-occurrence graph
Step 3: Calculating co-occurrence frequency
Step 4: Clustering features
Step 5: Extracting concepts
3.2 CONCEPT CHANGE DETECTION STAGE
Step 1: Collecting features over time
Step 2: Changing the activation strengths of links
Step 3: Analyzing the change of clusters
Step 4: Detecting concept changes
CHAPTER 4 EXPERIMENTS AND RESULTS
4.1 EXPERIMENT I ON CONCEPTS EXTRACTION PERFORMANCE
4.2 EXPERIMENT II ON THE OVERALL PERFORMANCE
4.2.1 Experiment II.1
4.2.2 Experiment II.2
CHAPTER 5 CONCLUSIONS
5.1 CONCLUDING REMARKS
5.2 FUTURE WORK
REFERENCE
APPENDIX A THE EXPERIMENT RESULTS OF CONCEPT EXTRACTION
APPENDIX B PREDEFINED KEYWORDS BY EXPERTS
APPENDIX C DETAIL INFORMATION OF INFORMATION CHANGE
APPENDIX D SUMMARY OF RESULT CONCEPTS
APPENDIX E DETAIL INFORMATION OF CONCEPT CHANGE DETECTION
APPENDIX F SUMMARY OF RESULT CONCEPTS
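The concept change detection stage outlined above relies on spreading activation over links whose strengths change with time. This record does not state the thesis's update rule, so the following is only a generic sketch of spreading activation on a weighted, undirected term graph. The function `spread_activation`, the `decay` and `steps` parameters, the weight-proportional split, and the sample weights are all assumptions made for illustration.

```python
def spread_activation(weights, seeds, decay=0.5, steps=2):
    """Propagate activation from seed terms along weighted, undirected links.

    weights: dict mapping (term_a, term_b) -> positive link strength.
    seeds:   dict mapping term -> initial activation level.
    Hypothetical rule: at each step a node passes decay * level to its
    neighbors, split in proportion to link strength.
    """
    # Build an adjacency map so each link can be followed in both directions.
    neighbors = {}
    for (a, b), w in weights.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w

    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        gained = {}
        for node, level in frontier.items():
            links = neighbors.get(node, {})
            total = sum(links.values())
            for other, w in links.items():
                gained[other] = gained.get(other, 0.0) + decay * level * w / total
        # Accumulate what each node received; only that amount spreads next step.
        for node, gain in gained.items():
            activation[node] = activation.get(node, 0.0) + gain
        frontier = gained
    return activation


# Illustrative link strengths, invented for this sketch.
links = {("engine", "search"): 2, ("engine", "ranking"): 1}
one_step = spread_activation(links, {"search": 1.0}, steps=1)
two_steps = spread_activation(links, {"search": 1.0}, steps=2)
print(one_step)
```

Under these assumptions, running the same spread on graphs collected at two different times and comparing the activation levels of each term is one simple way to flag terms whose importance has shifted.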