Author (English): Kuen-Jhe Shih
Thesis title (English): Mining a Novel Biometrics to Improve the Accuracy of Personal Authentication in Free Text
Advisor (English): Cheng-Jung Tsai
Keywords (English): data mining; clustering; bioinformatics; keystroke dynamics; free text; classifier
The Internet brings many conveniences to people's lives, but it also gives computer viruses a fast and easy channel for spreading. Because people cannot verify one another's true identity online, and because usernames and passwords are easily cracked or stolen, cybercrime has risen sharply in recent years. Recently, some researchers have applied keystroke dynamics to free-text identification, and the results show that keystroke dynamics can indeed improve the accuracy of personal authentication in free text. To improve this accuracy further, this study uses clustering, a data mining technique, to analyze how users type and constructs a new biometric called the Keystroke Clusters Map (KC-Map). Because a KC-Map is the result of cluster analysis rather than the user's actual keystroke data, it is not suitable for the traditional statistical classifiers used for authentication. To solve this problem, the study also proposes a classifier suited to KC-Maps, the Keystroke Clusters Map Similarity Classifier (KCMS classifier). Experimental results show that combining KC-Map with the KCMS classifier improves the accuracy of personal authentication in free text by a factor of up to 1.27.
In addition, a major problem with current free-text identification is that users must be trained for several months. Such a long training time makes free-text identification impractical. Another motivation of this study is therefore to explore whether the training time can be shortened to an acceptable range. Experimental results show that users need only about 20 minutes of training to achieve good identification accuracy.
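
The abstract does not spell out how a KC-Map is built or how the KCMS classifier scores it, so the following is only an illustrative sketch of the general idea of clustering keystrokes into a per-key map and comparing two maps by similarity. Every detail here is an assumption rather than the thesis's method: the use of mean key hold time as the feature, the plain 1-D k-means, the choice of three clusters, the agreement-ratio similarity, and the function names `kmeans_1d`, `build_kc_map`, and `map_similarity`.

```python
from collections import defaultdict
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means (assumed technique); returns centroids in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # An empty group keeps its previous centroid.
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def build_kc_map(events, k=3):
    """Map each key to a cluster index (0 = shortest mean hold time).

    `events` is a list of (key, hold_time_ms) pairs; this input format is hypothetical.
    """
    per_key = defaultdict(list)
    for key, hold_ms in events:
        per_key[key].append(hold_ms)
    key_means = {key: mean(times) for key, times in per_key.items()}
    centroids = kmeans_1d(list(key_means.values()), k)
    return {
        key: min(range(k), key=lambda c: abs(m - centroids[c]))
        for key, m in key_means.items()
    }


def map_similarity(map_a, map_b):
    """Fraction of shared keys that fall in the same cluster in both maps."""
    shared = map_a.keys() & map_b.keys()
    if not shared:
        return 0.0
    return sum(map_a[key] == map_b[key] for key in shared) / len(shared)


# Hypothetical enrollment and probe samples, for illustration only.
enrolled = build_kc_map([("a", 80), ("a", 84), ("s", 120), ("s", 118),
                         ("d", 160), ("d", 158), ("f", 82)])
probe = build_kc_map([("a", 85), ("s", 115), ("d", 150), ("f", 90)])
print(map_similarity(enrolled, probe))
```

Because the map stores only cluster membership and not raw timings, a classifier that expects numeric timing features has nothing to work with, which is consistent with the abstract's point that a dedicated similarity-based classifier is needed. An accept/reject decision would compare the similarity score against a threshold, whose value would have to be tuned on real data.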

