National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

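Graduate student: Jyun-Da Chen (陳俊達)
Thesis title: Event Episode Discovery: An Enhanced Temporal-based Approach
Thesis title (Chinese): 以改良式時間為基礎之事件階段偵測技術
Advisor: Chih-Ping Wei (魏志平)
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Technology Management
Discipline: Business and Management
Sub-discipline: Other Business and Management
Thesis type: Academic thesis
Graduation academic year: 96 (ROC calendar)
Language: English
Number of pages: 56
Chinese keywords: 發現子事件 (sub-event discovery); 事件偵測 (event detection); 事件演化 (event evolution); 文件探勘 (text mining); 文件分群 (document clustering)
English keywords: Event Episode Discovery; Event Detection; Event Evolution; Text Mining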
When performing environment scanning, organizations typically deal with numerous events and topics concerning their core business, relevant technical standards, competitors, and markets, among many others, and each event or topic to be monitored or tracked is generally associated with many news documents. To reduce the resulting information overload and fatigue, it is essential to develop an effective event episode discovery mechanism that organizes all news documents pertaining to an event of interest. In this thesis, we propose a new metric, referred to as TF×Enhanced-IDFTempo, and develop an event episode discovery technique that uses this metric for both feature selection and document representation. Using the traditional TF×IDF and TF×IDFTempo as performance benchmarks, our empirical evaluation results suggest that the proposed TF×Enhanced-IDFTempo technique outperforms both benchmarks in cluster recall and cluster precision and comes closer to the true number of episodes.
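For orientation, the traditional TF×IDF benchmark named in the abstract can be sketched as follows. This is a minimal illustration only. It assumes raw term frequency and idf = log(N / df), which is one common variant and may differ from the exact weighting used in the thesis. It does not implement TF×IDFTempo or TF×Enhanced-IDFTempo, whose definitions are not given in this record. The function name `tf_idf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes raw term frequency and idf = log(N / df); other variants exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy corpus of tokenized news documents (not from the thesis).
docs = [
    ["merger", "talks", "begin"],
    ["merger", "approved", "merger"],
    ["market", "reacts"],
]
weights = tf_idf(docs)
```

Terms that occur in every document receive a weight of zero under this variant, which is why such a weighting can double as a simple feature selection criterion: low-weight terms are dropped before documents are clustered.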
Abstract
Chinese Abstract
Acknowledgments
Table of Contents
LIST OF FIGURES
LIST OF TABLES
1 Introduction
1.1 Background and Motivation
1.2 Research Objective
1.3 Organization of the Thesis
2 Problem Definition and Literature Review
2.1 Terminology
2.2 Formulation of Event Episode Discovery Problem
2.3 Literature Review
2.4 Overview and Analysis of TF×IDFTempo
2.5 Related Research with Event-based Streams
3 Methodology: TF×Enhanced-IDFTempo for Event Episode Discovery
3.1 New Metric: Enhanced-IDFTempo
3.2 System Overview
4 Experiments and Evaluation
4.1 Data Collection
4.2 Evaluation Design
4.3 Parameter Tuning Experiments and Results
4.4 Comparative Evaluation Results
4.5 Effects of Time Decaying
4.6 In-depth analyses of Effects of Enhanced-IDFTempo
4.7 Analysis of Capability in Reaching True Number of Episodes
5 Conclusion and Future Work
6 Reference