Author (English): Yu Ran
Thesis title (English): Semi-supervised method for Improving Stance Classification on Insufficient Labeled Chinese Newspaper
Advisor (English): Shou-De Lin
Oral defense committee (English): Hsin-Hsi Chen, Pu-Jen Cheng
Keywords (English): Stance Classification, Semi-supervised Learning, Ladder Network, Deep Learning
We aim to develop an intelligent program that classifies the stance of Chinese news articles on several controversial topics, based on previously crawled data. The difficulty of this problem is that labeled news is insufficient, so the model cannot learn enough knowledge. Wei-Ming's work focuses mainly on feature division and feature clustering to reduce the feature dimension and achieve higher accuracy with a supervised method. We instead aim to make full use of unlabeled data and to use deep learning representation vectors as features, in order to obtain results beyond Wei-Ming's method. We first use the paragraph vector as the feature of each news article and compare it with the word feature and the dependency feature. We then apply semi-supervised methods, namely self-learning and the ladder network, with the paragraph vector feature. We obtain a better result on topic 2 with self-learning, and results on the other three topics that also exceed Wei-Ming's method.
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Traditional Sentiment Analysis
2.2 Senior’s Method
2.3 Semi-supervised Method
Chapter 3 Data Preprocessing and Feature Extraction
3.1 Data Cleaning and Processing
3.2 Feature Extraction
Chapter 4 Methodology
4.1 Supervised Method
4.2 Semi-supervised Method
4.2.1 Self-learning
4.2.2 Denoising Autoencoder
4.2.3 Ladder Network
Chapter 5 Experiments
5.1 Dataset
5.1.1 Chinese Newspaper Dataset
5.2 Evaluation Methods
5.3 Comparing Paragraph Vector Feature and Word-based Feature and Dependency Feature
5.4 Comparing State-of-the-art Semi-supervised Method
Chapter 6 Conclusion and Future Work
REFERENCE
