
National Digital Library of Theses and Dissertations in Taiwan


Detailed Record

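Author: 朱瑞琪
Author (English): Zhu-Chi Chu
Title: The Summarization of Chinese News Articles by Temporal or Themed Sequences
Title (Chinese): 摘要中文新聞之報導-以時間或主題排序
Advisor: 林福仁
Advisor (English): Fu-ran Lin
Degree: Master's
Institution: National Tsing Hua University
Department: Institute of Technology Management
Discipline: Business and Management
Field: Other Business and Management
Thesis type: Academic thesis
Graduation academic year: 96 (ROC calendar)
Language: English
Number of pages: 82
Keywords: text summarization; intra-paragraph; inter-paragraph; temporal; themed; news topic summarization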
Most summarization systems can extract important sentences, but few consider readability. This thesis proposes a summarization system that takes sentence coherence into account and orders sentences according to features of the news, helping readers comprehend news topics.
The proposed summarization system has three major components. First, the event clustering module identifies events with a Self-Organizing Map (SOM) and, within each event, identifies episodes with Chameleon. Second, the intra-paragraph sequencing module extracts the features of every event in a news topic and selects a composition strategy (temporal, themed, or hybrid) to compose the sentences of an event into a paragraph. Third, the inter-paragraph sequencing module calculates the topic temporal dependence and, based on that feature, orders the paragraphs either temporally or by theme.
Experimental results show that different users may prefer summaries composed by different methods. This indicates a need for a mechanism that can order sentences by different methods and choose a suitable one (temporal sequence, themed sequence, or both) according to the event's features.
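The three composition strategies named in the abstract (temporal, themed, hybrid) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Sentence` record, its `theme` label, the three function names, and the tie-breaking rules are all assumptions made for illustration, and the sample data is made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Sentence:
    text: str
    published: date  # date of the source article (assumed to be available)
    theme: str       # hypothetical theme label, e.g. from a clustering step


def order_temporal(sentences: List[Sentence]) -> List[Sentence]:
    """Temporal strategy: earliest sentence first."""
    return sorted(sentences, key=lambda s: s.published)


def order_themed(sentences: List[Sentence]) -> List[Sentence]:
    """Themed strategy: keep sentences of the same theme together.

    Themes appear in order of first occurrence; because sorted() is
    stable, sentences inside a theme keep their input order.
    """
    first_seen: Dict[str, int] = {}
    for s in sentences:
        first_seen.setdefault(s.theme, len(first_seen))
    return sorted(sentences, key=lambda s: first_seen[s.theme])


def order_hybrid(sentences: List[Sentence]) -> List[Sentence]:
    """Hybrid strategy (assumed rule): themes ordered by their earliest
    date, and sentences inside each theme ordered by date."""
    earliest: Dict[str, date] = {}
    for s in sentences:
        if s.theme not in earliest or s.published < earliest[s.theme]:
            earliest[s.theme] = s.published
    return sorted(sentences, key=lambda s: (earliest[s.theme], s.theme, s.published))


# Made-up sample data, used only to show how the orderings differ.
sample = [
    Sentence("a", date(2008, 1, 2), "X"),
    Sentence("b", date(2008, 1, 1), "Y"),
    Sentence("c", date(2008, 1, 3), "Y"),
    Sentence("d", date(2008, 1, 4), "X"),
]

if __name__ == "__main__":
    for strategy in (order_temporal, order_themed, order_hybrid):
        print(strategy.__name__, [s.text for s in strategy(sample)])
```

The same input yields a different sentence order under each strategy, which is why a selection step driven by event features, as the abstract describes, is needed on top of the orderings themselves.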
Table of Contents
Table of Figures
Table of Tables
1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Thesis Framework
2 Literature Review
2.1 Summarization System
2.1.1 Typical
2.1.2 Storyline-based
2.1.3 Graph-based
2.1.4 Ontology-based
2.1.5 Relationship-based
2.1.6 Chinese-summarization
2.2 Self-organizing Maps (SOM)
2.3 Chameleon
2.4 Summarization by Informative and Event Words
3 System Framework and Methodology
3.1 Definition
3.2 Research Framework
3.3 Preprocess
3.4 Event and Episode Identification
3.5 Extract Event Features
3.6 Ordering Inter-paragraph and Intra-paragraph
4 System Implementation and Results
4.1 System Implementation
4.2 News Topic Summarization Results
5 Experimental Design and Results
5.1 Experimental Design
5.2 Experimental Results
5.2.1 Intra-paragraph Results
5.2.2 Inter-paragraph Results
5.3 Discussions
6 Conclusion and Future Work
6.1 Conclusion
6.2 Research Limitation
6.3 Future Work
References
Appendix A. Examples of NPD
Appendix B. Summarization Results
Appendix C. Snapshot of the User Interface in Experimentation
Aone, C., M. E. Okurowski, et al. (1997). "A Scalable Summarization System Using Robust NLP." In Proceedings of the Workshop on Intelligent Scalable Text Summarization at the 35th Meeting of the Association for Computational Linguistics and the 8th Conference of the European Chapter of the Association for Computational Linguistics (pp. 66-73).
Bollegala, D., N. Okazaki, and M. Ishizuka (2006). "A bottom-up approach to sentence ordering for multi-document summarization." Proceedings of COLING/ACL.
Chandrasekaran, B., J. R. Josephson, and V. R. Benjamins (1999). "What Are Ontologies, and Why Do We Need Them?" IEEE Intelligent Systems 14(1): 20-26.
Chen, H. H., et al. (2003). "A summarization system for Chinese news from multiple sources." Journal of the American Society for Information Science and Technology 54(13): 1224-1236.
Goldstein, J., V. Mittal, et al. (2000). "Multi-document summarization by sentence extraction." NAACL-ANLP 2000 Workshop on Automatic summarization - Volume 4.
Gong, Y. and X. Liu (2001). "Generic text summarization using relevance measure and latent semantic analysis." Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval.
Gruber, T. (1992). "Ontology Definition." from http://www-ksl.Stanford.edu/kst/what-is-an-ontology.html.
Guha, S., R. Rastogi, et al. (2000). "Rock: A robust clustering algorithm for categorical attributes." Information Systems 25(5): 345-366.
Guha, S., R. Rastogi, et al. (2001). "Cure: an efficient clustering algorithm for large databases." Information Systems 26(1): 35-58.
Han, J. and M. Kamber (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann.
Harabagiu, S. and F. Lacatusu (2005). "Topic themes for multi-document summarization." Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval: 202-209.
Hsueh, J. F. (2003). "Learning ontology from Web documents for supporting Web query."
Karypis, G., E. H. Han, V. Kumar (1999). "Chameleon: A hierarchical clustering algorithm using dynamic modeling." IEEE Computer 32(8): 68-75.
Kohonen, T. (2001). Self-Organizing Maps, Springer.
Kuo, J.-J. and H.-H. Chen (2008). "Multidocument Summary Generation: Using Informative and Event Words." ACM Transactions on Asian Language Information Processing (TALIP) 7(1): 1-23.
Lin, F. and C. H. Liang (2008). "Storyline-based summarization for news topic retrospection." Decision Support Systems 45(3): 473-490.
McKeown, K., R. J. Passonneau, et al. (2005). "Do summaries help?" Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval.
Mihalcea, R. and P. Tarau (2005). "A Language Independent Algorithm for Single and Multiple Document Summarization." In Proceedings of IJCNLP2005.
Okazaki, N., Y. Matsuo, and M. Ishizuka (2004). "Improving chronological sentence ordering by precedence relation." Proceedings of 20th International Conference on Computational Linguistics (COLING 04): 750–756.
Radev, D. R., H. Jing, et al. (2004). "Centroid-based summarization of multiple documents." Information Processing and Management 40(6).
Radev, D. R., Z. Zhang, J. Otterbacher (2004). "Cross-document relationship classification for text summarization." Association for Computational Linguistics.
Sahay, S. Study and Implementation of CHAMELEON algorithm for Gene Clustering, www-static.cc.gatech.edu/~ssahay/7001Report.pdf.
Salton, G. and C. Buckley (1988). "Term-weighting approaches in automatic text retrieval." Information Processing and Management: an International Journal 24(5): 513-523.
Salton, G. and M. J. McGill "Introduction to Modern Information Retrieval."
Tan, P. N., M. Steinbach, V. Kumar (2005). Introduction to Data Mining, Addison-Wesley Longman Publishing Co., Inc. Boston, MA, USA.
Van Rijsbergen, C. J. (1979). Information Retrieval, Butterworth-Heinemann Newton, MA, USA.
Wan, X. and J. Yang (2007). "CollabSum: exploiting multiple document clustering for collaborative single document summarizations." Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval.
Wang, G. B. D. and Z. D. Y. Zhu (2005). "Automatic Chinese Summarization Method Based on the HowNet and Clustering Algorithm." Journal of Chinese Information Processing.
Wang, G. B. D. and Z. D. Y. Zhu (2005). "Automatic Chinese Text Summarization System Based on Conceptual Vector Space Model." Journal of Chinese Information Processing.
Wu (吳家威), G. J. W. and J. L. Liou (劉昭麟) (2002). "An Ontology-Based Article Summarization System." 2002 民生電子研討會論文集: 41-46.