Graduate student (English name): Hsin-Hsuan Sung
Thesis title (English): Finding Documents Related to Taiwan in the Veritable Records of Qing Using Relevance Feedback
Keywords (English): Veritable records of the Qing dynasty; digital humanities; THDL; Taiwanese history; text mining; Information Retrieval
“The Veritable Records of Qing” is a comprehensive set of historical records. It is a chronologically arranged collection of important matters, covering the emperor’s day-to-day activities and memorials, including the submission or appointment of significant officials, imperial decrees, demographic information, cargo delivery, and expeditions. It was compiled by order of the emperors and follows a strict structure. It is therefore a valuable source for historians who conduct research on the Qing dynasty. However, when scholars work with “The Veritable Records of Qing”, extracting the small portion of entries relevant to a given topic from such a large body of records can be a problem.
Once these historical records are digitized, scholars can use keyword search to find relevant historical texts. However, relevant texts that do not contain the keywords used cannot be found by such a tool.
In this research, a method for finding relevant historical texts is proposed. Instead of relying on keyword search, it computes the level of relevance between texts. Starting from texts of interest selected by the researcher, the method computes the level of relevance between the selected texts and the potential texts of interest. After the computation, the potential texts of interest are listed in ranked order. Researchers can then choose the texts they are interested in and submit their choices. Using these feedback texts, the method proceeds to the next iteration and finds texts that are even more likely to be of interest to the researchers.
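The iteration described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the character-bigram terms, the smoothed TF-IDF weighting, the cosine similarity, the centroid profile, and every function name are assumptions chosen for illustration, and the four short strings are invented toy entries, not quotations from the records.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Split a text into overlapping two-character terms (assumed tokenization)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    counts = [Counter(char_bigrams(d)) for d in docs]
    df = Counter()
    for c in counts:
        df.update(c.keys())
    n = len(docs)
    # Smoothed IDF, so a term found in every document keeps a nonzero weight.
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in c.items()}
        for c in counts
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def centroid(vectors):
    """Average several sparse vectors into one profile vector."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def rank_candidates(vectors, selected):
    """Rank every unselected document by similarity to the selected ones."""
    profile = centroid([vectors[i] for i in sorted(selected)])
    scored = [
        (i, cosine(profile, v)) for i, v in enumerate(vectors) if i not in selected
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented toy entries; index 0 plays the role of the researcher's seed text.
docs = [
    "臺灣府知府奏報米價",
    "臺灣鎮總兵奏報營務",
    "澎湖水師總兵巡洋",
    "山西巡撫奏報雨水",
]
vectors = tfidf_vectors(docs)

selected = {0}
round1 = rank_candidates(vectors, selected)

# Simulated feedback: the researcher accepts the top-ranked candidate.
selected.add(round1[0][0])
round2 = rank_candidates(vectors, selected)
```

In this toy run, entry 2 does not contain the term "臺灣" and has no overlap with the seed, so its first-round score is zero. Once entry 1 is accepted as feedback, the term it shares with entry 2 enters the profile and entry 2 receives a positive score in the next round, which is the kind of effect that keyword search alone cannot produce.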
In the 1990s, scholars manually retrieved the texts considered relevant to “Taiwan” from “Veritable Records of Qing” and edited them into “Veritable Records of Qing-Taiwan Selection”. In this research, this edition and “Veritable Records of Qing” are used to examine the performance of different relevance algorithms on general historical records. Next, a system based on a relevance feedback algorithm is proposed to provide users or researchers with an interface for searching for relevant texts in large collections of historical records. Finally, the research uses “Veritable Records of Qing-Taiwan Selection” as an example to find additional relevant historical texts in “Veritable Records of Qing” that were not selected for that edition.
The research is divided into two parts. The first part discusses the method proposed to match the two digitized historical records mentioned above. It also introduces different ways of computing the level of relevance between texts, along with benchmarks of how these methods perform on the two historical records.
The second part introduces the relevance feedback system, which is based on the best-performing method in the experiments. Finally, after testing by historians, the texts found through this method are analyzed and observed.
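Acknowledgements
Chinese Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Literature Review
1.3 Research Direction
1.4 Thesis Structure
Chapter 2 Introduction to and Preparation of the Research Data
2.1 Veritable Records of Qing
2.2 Veritable Records of Qing-Taiwan Selection
2.3 Data Preparation
Chapter 3 Preliminary Experiments
3.1 Evaluation of Entry Relevance Algorithms
3.2 Experimental Procedure
3.3 Analysis of Experimental Results
Chapter 4 Experimental Design for Finding Relevant Entries with User Feedback
4.1 Architecture Overview
4.2 Procedure for Searching Relevant Entries
4.3 User Operations
4.4 Experimental Results and Observations
4.5 Cross-Validation
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References
Appendix
1.1 Distribution of Entry Counts by Year
1.2 Term Frequencies and Analysis Charts for Personal and Place Names
1.3 Number of Entries Containing the Term "臺灣" (Taiwan) Supplemented by This Study
1.4 Taiwan-Related Entries Supplemented by This Study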
