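Graduate student: 王中楚 (Chung-Chu Wang)
Thesis title: 情緒語料庫建構技術之研究-以財經新聞為例
Thesis title (English): Sentiment Corpus Constructed Research-A Case Study Based on Financial News
Advisor: 柯淑津 (Sue-Jin Ker)
Committee members: 許見章 (Chien-Chang Hsu), 吳怡瑾 (I-Chin Wu)
Oral defense date: 2013-01-28
Degree: Master's
Institution: Soochow University (東吳大學)
Department: Department of Information Management (資訊管理學系)
Discipline: Computer Science
Category: General Computer Science
Thesis type: Academic thesis
Publication year: 2013
Graduation academic year: 101 (ROC calendar)
Language: Chinese
Pages: 38
Keywords: financial news (財經新聞), data mining (資料探勘), sentiment words (情緒詞彙), chi-square (卡方計算), Categorical Proportional Difference (類別比例差異)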
The volume of information available today keeps growing, which makes it difficult for investors to keep track of accurate financial news. Many researchers have therefore begun mining large collections of financial reports, hoping that automated systems can extract the information that is useful to users.
To provide investors with accurate investment information, this study obtains one year of financial news from United Daily News and builds a sentiment corpus on financial topics. Important sentiment words are selected for the corpus according to their information content and a chi-square statistic, and the Categorical Proportional Difference (CPD) method is used to adjust the weights of the sentiment words. The main purpose is to compute a sentiment score for each article and classify the article's sentiment, and then to test whether the selected words are in fact important financial sentiment words.
The results can be presented visually, listing a company's scores over selected periods. The main contribution is that investors can observe a company's score and use it to anticipate the company's near-term trend. In our experiments, applying the sentiment corpus to automatic article classification achieved an accuracy of 73.8%.
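The selection, weighting, and scoring steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes only two classes ("pos" and "neg"), uses a signed two-class form of CPD as the word weight, sums weights to score an article, and takes 3.84 (the chi-square critical value at p = 0.05 with one degree of freedom) as a hypothetical cutoff. The function names and the toy English tokens are invented for the example; the actual corpus consists of segmented Chinese news text.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 word/class contingency table.

    a: positive docs containing the word, b: negative docs containing it,
    c: positive docs without the word,    d: negative docs without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def cpd(a, b):
    """Signed proportional difference in [-1, 1] (assumed two-class form):
    +1 if the word occurs only in positive docs, -1 if only in negative docs."""
    return 0.0 if a + b == 0 else (a - b) / (a + b)


def build_lexicon(docs, threshold=3.84):
    """docs: list of (tokens, label) pairs, label being 'pos' or 'neg'.
    The threshold is a hypothetical cutoff, not a value from the thesis."""
    n_pos = sum(1 for _, label in docs if label == "pos")
    n_neg = len(docs) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in docs:
        # Count document frequency, so each word counts once per document.
        (pos_df if label == "pos" else neg_df).update(set(tokens))
    lexicon = {}
    for word in set(pos_df) | set(neg_df):
        a, b = pos_df[word], neg_df[word]
        # Keep only words whose class association passes the cutoff.
        if chi_square(a, b, n_pos - a, n_neg - b) >= threshold:
            lexicon[word] = cpd(a, b)
    return lexicon


def score(tokens, lexicon):
    """Article score: sum of the weights of the lexicon words it contains."""
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(tokens, lexicon):
    s = score(tokens, lexicon)
    return "pos" if s > 0 else "neg" if s < 0 else "neutral"


# Toy corpus (invented): four positive and four negative articles.
docs = [(["profit", "rise", "market"], "pos")] * 4 + \
       [(["loss", "fall", "market"], "neg")] * 4
lexicon = build_lexicon(docs)
print(sorted(lexicon.items()))
print(classify(["profit", "market"], lexicon))
```

In this toy corpus "market" appears in every article, so it carries no class information and is filtered out, while words confined to one class receive a weight of +1 or -1.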

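Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Feature Selection and Sentiment Analysis
2.1.1 Antonym Detection
2.1.2 Weight Calculation
2.2 Corpora
2.2.1 Corpus Construction
Chapter 3 Sentiment Corpus Construction Method
3.1 Sentiment Word Collection
3.1.1 Description of the Raw Data
3.1.2 Manual Annotation Support System
3.1.3 Sentiment Words
3.2 Sentiment Word Filtering
3.2.1 Part-of-Speech Filtering
3.2.2 Chi-Square Test
3.3 Sentiment Corpus Construction
3.3.1 Article Structure
3.3.2 Tag Description
3.3.3 Sentiment Word Tags
Chapter 4 Sentiment Corpus Application Method
4.1 Quantifying Word Sentiment
4.1.1 Text Sentiment Classification
4.1.2 Weight Calculation and Adjustment
4.2 News on Listed and OTC Companies
Chapter 5 Performance Analysis of Sentiment Corpus Applications
5.1 Experimental Design
5.2 Experimental Data
5.3 Performance Analysis
5.4 Error Analysis
Chapter 6 Conclusions and Future Work
References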

