National Digital Library of Theses and Dissertations in Taiwan

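Graduate student: 劉正康
Graduate student (English): Zheng-Kang Liu
Thesis title: 針對注音輸入法所設計的敲鍵特徵
Thesis title (English): Keystroke Dynamics Feature Designed for Zhuyin Input Method
Advisor: 陳和麟
Advisor (English): Ho-Lin Chen
Oral defense date: 2017-07-31
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Electrical Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2017
Graduation academic year: 105 (ROC calendar)
Language: English
Pages: 59
Keywords (Chinese): 生物特徵測量學; 敲鍵特徵; 型樣辨識; 計算機安全性
Keywords (English): Biometrics; Keystroke Dynamics; Pattern Recognition; Computer Security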
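Biometrics is the discipline of identifying people by their physiological or behavioral characteristics. Biometric traits are usually divided into physiological and behavioral types. Among behavioral biometrics, keystroke dynamics has become the most popular research topic because it is low-cost, easy to deploy, and unobtrusive.

Traditionally, the digraph has been treated as the canonical feature in keystroke dynamics research. However, the vast majority of earlier studies focused on English input, and whether the digraph suits Chinese text entered through a Zhuyin input method remains open to question.

This thesis proposes a new feature called tonal separation, which records the hold time and release-press time of every key from the first Zhuyin symbol to the first tone mark. The idea comes from how most Zhuyin input methods work: a typical Zhuyin input method converts the typed Zhuyin symbols into the predicted Chinese character when the user enters a tone mark. To compare the strengths and weaknesses of the two features, we ran an experiment that trains a model with each feature separately. The thesis uses AUC (area under the ROC curve) to compare the performance of tonal separation and the digraph under the same model.

In the experimental data, we chose "ㄉㄜ˙" and "ㄋㄨㄥˋ" as the target patterns. For "ㄉㄜ˙", tonal separation achieves a higher AUC than every digraph corresponding to the pattern. For "ㄋㄨㄥˋ", the AUC of tonal separation is about the same as that of the second corresponding digraph, while the first and third digraphs score slightly lower than tonal separation. In summary, tonal separation performs better overall than the digraph.
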
Biometrics is the practice of identifying people by distinctive aspects of their biology or behavior. Biometric traits are commonly divided into physiological and behavioral categories. Among behavioral biometrics, keystroke dynamics is the most popular research topic because it is low-cost, easy to deploy, and unobtrusive.

Traditionally, the digraph has been regarded as the standard feature in keystroke dynamics research. However, most prior studies focused on text typed in English. Whether the digraph is suitable for analyzing Chinese text typed through a Zhuyin input method editor (IME) remains an open question.

In this thesis, we propose a new feature called tonal separation, derived from the mechanics of the Zhuyin IME, an input method based on a phonetic system for Chinese. Tonal separation records the hold time of every key, and the release-press time between consecutive keys, from the first Zhuyin symbol to the first tone mark. The idea follows from how most Zhuyin IMEs operate: they convert the typed symbols into a predicted Chinese character whenever a tone mark is entered. To compare the two features, we conducted an experiment in which models were trained separately on each feature. We compare tonal separation with the digraph under the same model using AUC, the area under the ROC curve.
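The feature definition above can be sketched in Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the `KeyEvent` record, the `TONE_MARKS` set, and both function names are hypothetical, timestamps are assumed to be in milliseconds, and a digraph is assumed here to consist of two hold times plus the release-press time between them.

```python
from typing import List, NamedTuple


class KeyEvent(NamedTuple):
    """One keystroke (hypothetical record layout)."""
    key: str      # Zhuyin symbol or tone mark produced by the key
    press: int    # key-down timestamp in milliseconds
    release: int  # key-up timestamp in milliseconds


# Assumption: only the explicitly typed tone marks are listed; how the
# first tone is entered depends on the IME and is left out of this sketch.
TONE_MARKS = {"ˊ", "ˇ", "ˋ", "˙"}


def tonal_separation(events: List[KeyEvent]) -> List[int]:
    """Hold times and release-press times from the first key
    up to and including the first tone mark."""
    features: List[int] = []
    for i, ev in enumerate(events):
        if i > 0:
            # release-press time: previous key released -> this key pressed
            features.append(ev.press - events[i - 1].release)
        # hold time: this key pressed -> this key released
        features.append(ev.release - ev.press)
        if ev.key in TONE_MARKS:
            return features
    raise ValueError("no tone mark found in the key sequence")


def digraphs(events: List[KeyEvent]) -> List[List[int]]:
    """One [hold, release-press, hold] vector per adjacent key pair."""
    return [
        [a.release - a.press, b.press - a.release, b.release - b.press]
        for a, b in zip(events, events[1:])
    ]


# Made-up timings for the four keys of the pattern ㄋㄨㄥˋ.
sample = [
    KeyEvent("ㄋ", 0, 80),
    KeyEvent("ㄨ", 150, 220),
    KeyEvent("ㄥ", 300, 390),
    KeyEvent("ˋ", 450, 520),
]
print(tonal_separation(sample))  # one vector covering the whole syllable
print(digraphs(sample))          # three separate two-key vectors
```

Under these assumptions a four-key pattern such as ㄋㄨㄥˋ yields a single tonal separation vector of seven timings but three separate digraph vectors, which is why a pattern can be compared against a first, second, and third digraph.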


In our experiment, we chose (ㄉㄜ˙) and (ㄋㄨㄥˋ) as the target patterns, from which samples were extracted using both tonal separation and the digraph. For (ㄉㄜ˙), the AUC of the tonal separation samples is higher than that of every corresponding digraph sample set. For (ㄋㄨㄥˋ), the AUC of tonal separation is about the same as that of the second corresponding digraph, and slightly higher than those of the first and third. In summary, tonal separation outperforms the digraph overall.
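For readers unfamiliar with the metric, the AUC used in this comparison can be computed directly from classifier scores. The sketch below uses the rank-based definition (the probability that a randomly chosen genuine sample scores higher than a randomly chosen impostor sample, with ties counted as one half). The function name `auc` and the labels and scores are made up for illustration. They are not data or results from the thesis.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the genuine user, 0 for an impostor.
    scores: classifier score, higher meaning "more likely genuine".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    # Count positive/negative pairs ranked correctly; a tie counts half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative scores from two hypothetical models on the same six samples.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
scores_b = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2]
print(auc(labels, scores_a), auc(labels, scores_b))
```

An AUC of 1.0 means every genuine sample is ranked above every impostor sample, and 0.5 corresponds to random guessing, so the model with the larger value separates the two classes better across all decision thresholds.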
Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract

1 Introduction
1.1 Motivation
1.2 Related Work
1.3 Problem Description
1.4 Outline of Thesis

2 Machine Learning
2.1 Basic Procedure of Learning Tasks
2.2 Types of Learning
2.3 Random Forests

3 Biometrics and Keystroke Dynamics
3.1 Biometrics: Human Characteristics
3.1.1 Performance Metrics for a Biometrics System
3.1.2 Confusion Matrix and Derivative Performance Metrics
3.1.3 Receiver Operating Characteristic (ROC) and Area Under an ROC Curve (AUC)
3.2 Keystroke Dynamics
3.2.1 Types of Keystroke Dynamics
3.2.2 Common Features

4 Standard Chinese Phonetic System
4.1 Typical Non-romanization System for Chinese: Zhuyin
4.2 Typical Romanization System for Chinese: Pinyin
4.3 Overview of Chinese Input Method (IME) and Keyboard Layout
4.3.1 Chinese Input Method Editor (IME)
4.3.2 Keyboard Layout for Chinese-speaking Countries

5 Tonal Separation and Assessment
5.1 Feature: Digraph and Tonal Separation
5.2 Experiment Settings
5.2.1 Data Collection
5.2.2 Data Preprocessing
5.2.3 Model Training and Model Assessment
5.3 Results and Discussion

6 Conclusion and Future Work

Bibliography