Graduate student (English name): Huei-Chen Wu
Thesis title (English): A Study on Multi-layered Automatic Book Classification System Using Data Mining
Advisor (English name): June-Jei Kuo
Oral defense committee (English names): Ming-Jiu Hwang, Shun-Der Ryan Chen
Keywords (English): multi-layered automatic book classification system, voting strategy, classifier, data mining
Cataloging books is the core and foundation of library management at all levels. Most librarians are trained only in library and information science, yet they are responsible for cataloging works from every field of knowledge. Lacking background knowledge in those fields, librarians find cataloging increasingly difficult. Moreover, with recent rapid progress in every field, the volume of publications is growing quickly, which further increases the cataloging workload. As a result, cataloging quality, such as high inter-consistency and intra-consistency of library classification, cannot be maintained.
Thus, this thesis addresses the shortcomings of traditional single-layer book classification systems and combines the strengths of various classifiers to propose a two-layer book classification system based on a voting strategy. The collection of dissertations from National Chung Hsing University and online-bookstore bibliographic records serve as the training and test corpora, and the classification codes of each dissertation are used as the gold standard. Each dissertation contains several content parts, such as the title, authors, and cited papers. To understand how the choice of content parts affects classification, various combinations of parts are studied and the best combination is recommended. Likewise, to obtain the best classification performance, combinations of classifiers for the multi-layered book classification system are studied and the best combination is recommended. Finally, the experimental results show that the proposed multi-layered book classification system outperforms traditional single-layer book classification systems.
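
The voting idea can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say which classifiers are combined, how the layers are wired together, or how ties are resolved. The sketch assumes that a first layer of base classifiers each proposes a class label and that a second layer picks the majority label, with ties going to the earliest classifier. The names `keyword_classifier`, `majority_vote`, and `classify_two_layer`, the keyword rules, and the class labels are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

# A classifier maps a bibliographic text (e.g. a title) to a class label.
Classifier = Callable[[str], str]


def keyword_classifier(keywords: Dict[str, str], default: str) -> Classifier:
    """Build a toy base classifier (hypothetical stand-in for a trained model).

    It returns the label of the first keyword found in the text,
    or `default` when no keyword matches.
    """
    def classify(text: str) -> str:
        lowered = text.lower()
        for word, label in keywords.items():
            if word in lowered:
                return label
        return default
    return classify


def majority_vote(labels: List[str]) -> str:
    """Return the most frequent label; ties go to the label that appears first."""
    if not labels:
        raise ValueError("no labels to vote on")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label always has the top count")


def classify_two_layer(text: str, first_layer: List[Classifier]) -> str:
    """Layer 1: every base classifier votes. Layer 2: combine votes by majority."""
    votes = [clf(text) for clf in first_layer]
    return majority_vote(votes)


if __name__ == "__main__":
    # Hypothetical keyword rules and labels, for illustration only.
    base_classifiers = [
        keyword_classifier({"data mining": "computer science"}, "general"),
        keyword_classifier({"classification": "library science"}, "general"),
        keyword_classifier({"mining": "computer science"}, "general"),
    ]
    title = "A Study on Multi-layered Automatic Book Classification System Using Data Mining"
    print(classify_two_layer(title, base_classifiers))
```

In practice the toy keyword rules would be replaced by trained classifiers. The tie-breaking rule here is only one possible choice, since the abstract does not describe one.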

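Abstract (Chinese)
Abstract (English)
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Research Objectives and Questions
Section 3 Research Scope and Limitations
Section 4 Definitions of Terms
Chapter 2 Literature Review
Section 1 Document Representation Methods
Section 2 Related Work on Classifier Construction Methods
Section 3 Methods for Evaluating Classifier Performance
Section 4 Factors Affecting Classification Performance
Section 5 Related Work on Automatic Book Classification
Chapter 3 Research Design and Implementation
Section 1 Research Framework
Section 2 Research Subjects
Section 3 Research Tools
Section 4 Classification Module Workflow
Chapter 4 Corpus Analysis and Experiments
Section 1 Pilot Experiment: Theses and Dissertations Dataset
Section 2 Pilot Experiment: Small Online-Bookstore Bibliographic Dataset
Section 3 Main Experiment: Online-Bookstore Bibliographic Data
Section 4 Effectiveness Evaluation and Discussion
Chapter 5 Conclusions and Future Research Directions
Section 1 Conclusions
Section 2 Future Research Directions
References
Appendix 1 Chinese Stop Words
Appendix 2 English Stop Words
Appendix 3 Original Form of the Bibliographic Data
Appendix 4 Bibliographic Records After Processing by the Chinese Word Segmentation System
Appendix 5 Feature Values After Software Conversion: Content File Example
Appendix 6 Feature Values After Software Conversion: Binary Format Example
Appendix 7 Steps for Automatic Document Classification Using WEKA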

