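Author: 張立樸
Author (English): Li-Pu Chang
Thesis title: 利用貝氏機率模組擷取中文名詞
Thesis title (English): A Study on Chinese Named Entity Extraction by Naive Bayes
Advisor: 蔡志忠
Advisor (English): Jyh-Jong Tsay
Degree: Master's
Institution: National Chung Cheng University (國立中正大學)
Department: Graduate Institute of Computer Science and Information Engineering (資訊工程研究所)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2001
Graduation academic year: 89
Language: English
Pages: 67
Keywords (translated from Chinese): Named Entity Extraction; Bayes' Theorem; Grammatical Analysis; Information Retrieval; Data Mining; Information Extraction
Keywords (English): Named Entity Extraction; Naive Bayes; Grammatical Inference; Information Retrieval; Data Mining; Information Extraction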
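In this thesis, we study Chinese named entity extraction, an area within information retrieval. We address the problem with a probabilistic model based on Bayes' theorem, and then combine this model with a grammatical inference method called Alergia. We further propose several new methods to address the shortcomings of the existing model. In experiments on person-name extraction, we achieve about 90% accuracy on news and campus web pages.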
In this thesis, we study the problem of named entity extraction, which is a subproblem of information extraction (IE). We solve the problem with Naive Bayes and its variants, and then combine Naive Bayes with a grammatical inference method called Alergia. We propose several new methods to address the drawbacks of the original method. In our experiments, these approaches achieve about 90% F1 measure on Web pages and about 80% F1 measure on the CNA Chinese newspaper.
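As a rough illustration of the two ideas the abstract names, the sketch below trains a Naive Bayes classifier that labels a candidate string as a person name or not, and computes the F1 measure over sets of extracted strings. It is not the thesis's model: the per-character features, add-one smoothing, the label names `name` and `other`, the helper functions `train`, `predict`, and `f1_score`, and the toy training strings are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count classes and per-class characters from (text, label) pairs."""
    class_counts = Counter()
    char_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for ch in text:
            char_counts[label][ch] += 1
            vocab.add(ch)
    return class_counts, char_counts, vocab


def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    class_counts, char_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        # Log prior of the class.
        score = math.log(n / total)
        # Add-one smoothing; the extra +1 reserves mass for unseen characters.
        denom = sum(char_counts[label].values()) + len(vocab) + 1
        for ch in text:
            score += math.log((char_counts[label][ch] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two sets of strings."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy training data (hypothetical, for illustration only).
examples = [
    ("張立樸", "name"), ("蔡志忠", "name"), ("陳信希", "name"),
    ("資訊檢索", "other"), ("資料挖礦", "other"), ("新聞網頁", "other"),
]
model = train(examples)
print(predict(model, "張志忠"))
print(predict(model, "資料檢索"))
print(f1_score({"a", "b", "c"}, {"b", "c", "d"}))
```

Treating each character as conditionally independent given the class is the "naive" assumption; a realistic extractor would also need context features and far more training data than this toy set.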
1. Introduction
2. Naive Bayes
3. Grammatical Inference
4. Modifications of Naive Bayes
5. Experiment
6. Related Work
7. Conclusion
Ricardo Baeza-Yates, Berthier Ribeiro-Neto. Modern Information Retrieval. New York, NY: ACM Press, 1999.
D. Freitag. Machine Learning for Information Extraction in Informal Domains. Ph.D. Dissertation, Carnegie Mellon University, 1999.
D. Freitag. Using grammatical inference to improve precision in information extraction. In Notes of the ICML-97 Workshop on Automata Induction, Grammatical Inference, and Language Acquisition, 1997.
Manabu Sassano, Takehito Utsuro. Named Entity Chunking Techniques in Supervised Learning for Japanese Named Entity Recognition. Proceedings of the 18th International Conference on Computational Linguistics, pp. 705-711, August 2000.
Tom M. Mitchell. Machine Learning. The McGraw-Hill Companies, Inc. 1997.
Central News Agency. http://www.cna.com.tw
C.J. van Rijsbergen. Information Retrieval. Butterworths, Inc., Boston, 1979.
D.D. Lewis and W.A. Gale. A sequential algorithm for training text classifiers. In Proceedings of the 17th International Conference on Research and Development in Information Retrieval, July 1994.
R.C. Carrasco and J. Oncina. Learning stochastic regular grammars by means of a state merging method. In R.C. Carrasco and J. Oncina, editors, Grammatical Inference and Applications: Second International Colloquium, ICGI-94. Springer-Verlag, September 1994.
W. Hoeffding. Probability inequalities for sums of bounded random variables. American Statistical Association Journal, 58:13-30, 1963.
Hsin-Hsi Chen and Guo-Wei Bian. Proper Name Extraction from Web Pages for Finding People in Internet. Proceedings of ROCLING X, Taipei, Taiwan, August 22-24, 1997.
T.R. Leek. Information Extraction Using Hidden Markov Models. Master's thesis, Department of Computer Science, University of California, San Diego, 1997.
L. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 1989.
D.M. Bikel, S. Miller, R. Schwartz, and R. Weischedel. Nymble: a High-Performance Learning Name-Finder. In Proceedings of the Fifth Conference on Applied Natural Language Processing, pages 194-201, 1997.
N. Kushmerick. Wrapper Induction for Information Extraction. PhD thesis, University of Washington, 1997. Tech Report UW-CSE-97-11-04.
Ryszard S. Michalski, Ivan Bratko, Miroslav Kubat. Machine Learning and Data Mining: Methods and Applications. John Wiley and Sons Ltd., 1998.