National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Author: Chen, Nai-Wen (陳艿玟)
Title: Feature Weighting for k-Nearest Neighbor Classifiers Using Differential Evolution Algorithms (針對k近鄰分類器的特徵權重使用差分進化算法)
Advisor: Juan, Yee-Tsong (阮議聰)
Committee members: Lai, Zone-Chang; Liaw, Yi-Ching; Huang, Tsung-Jen (賴榮滄、廖怡欽、黃崇仁)
Degree: Master
Institution: National Taiwan Ocean University (國立臺灣海洋大學)
Department: Department of Computer Science and Engineering (資訊工程學系)
Discipline: Engineering
Academic field: Electrical engineering and computer science
Document type: Academic thesis
Year of publication: 2016
Graduation academic year: 104 (2015–2016)
Language: Chinese
Number of pages: 27
Keywords (Chinese): 特徵權重、巨量資料、機器學習、差分進化算法、k-近鄰算法
Keywords (English): Feature Weighting; Big Data; Machine Learning; Differential Evolution; k-Nearest Neighbors
Abstract (translated from the Chinese): Since the Industrial Revolution, people have sought to replace manual labor with machines to save labor, time, and cost, and machine learning has therefore long been a popular research topic in many fields. In recent years, rapid advances in computer hardware and software have made the data collected in practice ever larger, more complex, and faster-changing, forming so-called big data. Big data comprise massive and/or high-dimensional data, posing serious obstacles to the understanding and practical application of today's data. Machine learning can learn iteratively from data, allowing computers to uncover insights hidden in massive data without explicit knowledge, and machine learning techniques have been widely applied to mine valuable information and support sound decision making. When handling high-dimensional big data, determining the importance of features is a crucial issue for reducing the high complexity of computation and data storage.

This thesis proposes a method that determines feature importance and feature weights by combining the differential evolution (DE) algorithm with the k-nearest neighbor (kNN) algorithm. DE is a heuristic optimization algorithm that mimics biological evolution, repeatedly applying mutation, crossover, and selection operations to find an optimal solution. kNN is a simple nonparametric classification algorithm that nevertheless performs remarkably well in many application domains: an object's k nearest neighbors are selected and a majority vote among them determines the object's class. In the proposed method, the feature weights are first determined by the DE algorithm and then evaluated using the classification accuracy of the kNN algorithm as the measure of their effectiveness. Experimental results on six UCI datasets show that the proposed method outperforms the five compared methods in prediction.

Since the Industrial Revolution, people have sought to replace human workers with machines for the savings in labor, time, and cost. With the advances in hardware and software technology in recent years, the data collected in practice are becoming larger, faster-changing, and more complex. Big Data, which comprise large-scale and/or high-dimensional data, pose serious obstacles to data interpretation and application. As a result, machine learning has become a popular research topic in many fields of study. Machine learning, which iteratively learns from data, allows computers to find hidden insights in data without explicit knowledge, and machine learning techniques have been widely applied to mine valuable information and support the decision-making process. In dealing with high-dimensional Big Data, determining feature importance is a key issue for reducing the high complexity of computation and data storage.

This thesis presents a method that determines feature importance and feature weights by integrating the Differential Evolution (DE) algorithm with the k-Nearest Neighbors (kNN) algorithm. DE, a heuristic optimization algorithm, mimics biological evolution, applying mutation, crossover, and selection operations to find an optimal solution. kNN is a simple classification algorithm that nevertheless works remarkably well in practice across various fields. In the proposed method, the feature weights and the k value for kNN are first chosen by the DE algorithm and then evaluated by the classification accuracy of the kNN algorithm. Experimental results on six UCI datasets show that, with appropriate DE parameters, the proposed method achieves better overall accuracy and outperforms the six compared approaches.
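As a rough illustration of the pipeline described in the abstract, the following minimal sketch uses a classic DE/rand/1/bin loop to search for a per-feature weight vector, with the accuracy of a feature-weighted kNN classifier as the fitness function. This is not the thesis's code: the population size, F, CR, dataset, and all helper names are illustrative assumptions, and k is fixed here for brevity rather than tuned by DE.

```python
# Hypothetical sketch of DE-based feature weighting for kNN.
# All parameter values and names are illustrative, not from the thesis.
import math
import random

def weighted_knn_accuracy(weights, train, test, k=3):
    """Classify each test point by majority vote of its k nearest
    training points under a feature-weighted Euclidean distance."""
    correct = 0
    for x, y in test:
        dists = sorted(
            (math.sqrt(sum(w * (a - b) ** 2
                           for w, a, b in zip(weights, x, tx))), ty)
            for tx, ty in train
        )
        votes = [label for _, label in dists[:k]]
        if max(set(votes), key=votes.count) == y:
            correct += 1
    return correct / len(test)

def de_feature_weights(train, test, dim, pop_size=20, F=0.5, CR=0.9,
                       generations=50):
    """DE/rand/1/bin over weight vectors in [0, 1]^dim, maximizing
    the weighted-kNN accuracy (used directly as the fitness)."""
    pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [weighted_knn_accuracy(ind, train, test) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct vectors other than i.
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = random.randrange(dim)  # at least one gene from the mutant
            trial = [
                min(1.0, max(0.0, pop[a][j] + F * (pop[b][j] - pop[c][j])))
                if (random.random() < CR or j == j_rand) else pop[i][j]
                for j in range(dim)
            ]
            # Selection: keep the trial vector only if it is no worse.
            f = weighted_knn_accuracy(trial, train, test)
            if f >= fit[i]:
                pop[i], fit[i] = trial, f
    best = max(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]
```

On a toy dataset where only one feature is informative, DE tends to push the noise feature's weight down, which is exactly the "feature importance" signal the method is after.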

Abstract (Chinese) i
Abstract ii
Table of Contents iii
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
1.1 Background and Motivation 1
1.2 Thesis Organization 3
Chapter 2 Related Work 4
Chapter 3 System Architecture 7
3.1 Overview of the System Flow 7
3.2 The DE Algorithm (Differential Evolution) 8
3.2.1 Initialization of DE Parameter Vectors 9
3.2.2 DE Mutation Using Difference Vectors 9
3.2.3 DE Crossover Operation 10
3.2.4 DE Selection Operation 14
3.2.5 DE Iteration 14
3.3 The k-Nearest Neighbor Algorithm 14
3.4 Euclidean Distance 15
3.5 Pearson Distance 15
Chapter 4 Experiments and Analysis 17
4.1 Test Datasets 17
4.2 Evaluation Method 18
4.3 Classifier Parameter Settings 19
4.4 Comparison of Prediction Performance 20
Chapter 5 Conclusions 23
References 24
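Sections 3.4 and 3.5 of the outline name the two distance measures used for the kNN neighborhood. As a rough illustration (not the thesis's code), they can be written as:

```python
# Illustrative implementations of the two distances named in the outline.
import math

def euclidean_distance(x, y):
    """Standard Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_distance(x, y):
    """1 minus the Pearson correlation coefficient: near 0 when the
    vectors are strongly positively correlated, up to 2 when they are
    perfectly anti-correlated. Undefined for constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)
```

Euclidean distance depends on the scale of each feature (which is why feature weighting matters), while Pearson distance is invariant to linear rescaling of either vector.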


