Graduate student: 潘咨亦
Graduate student (English name): Tzu-Yi Pan
Thesis title: 權重型基因法則於肝癌質譜資料特徵選取之應用
Thesis title (English): Application of Gene Weighted Genetic Algorithm to Feature Selection with SELDI-TOF MS Hepatoma Data
Advisor: 姚立德
Oral defense committee: 阮雪芬, 黃宣誠, 陳文輝, 曾傳蘆
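Oral defense date: 2007-07-27
Degree: Master's
Institution: National Taipei University of Technology
Department: Graduate Institute of Mechatronic Engineering
Discipline: Engineering
Field: Mechanical Engineering
Thesis type: Academic thesis
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 131
Keywords (Chinese): SELDI-TOF MS; 支援向量機; 特徵選取; 基因法則
Keywords (English): SELDI-TOF MS; Support Vector Machine; Feature Selection; Genetic Algorithm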
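Clinical hepatoma peak data from patients in Taiwan, acquired by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) and processed with the Ciphergen Protein Chip software supplied by the instrument manufacturer, are considerably difficult to analyze for the presence or absence of cancer. Although the software greatly reduces both the volume of data to be analyzed and the noise, it introduces other problems. This thesis therefore first proposes, from a biomedical perspective, transforming the data into feature vectors defined on a common basis. These vectors then serve as the basis for analysis, in which feature selection is combined with a support vector machine to classify the data.
Applying the genetic algorithm (GA) to feature selection is an approach developed in recent years. The Gene Weighted Genetic Algorithm (GWGA) proposed in this thesis assigns a crossover weight to each gene in a chromosome and performs crossover probabilistically. This changes the crossover scheme of the traditional genetic algorithm in order to address its tendency to become trapped in local optima, and it can also suitably reduce the number of features.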
Clinical SELDI-TOF MS (Surface Enhanced Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry) data obtained from patients in Taiwan are very difficult to analyze. The data had been pre-processed with the software supplied by the instrument manufacturer. Although this pre-processing reduces the volume of data and the noise, it introduces other problems. This thesis therefore proposes transforming the data into feature vectors on a common basis, derived from a biomedical point of view. Feature selection is then combined with a support vector machine (SVM) to classify these feature vectors.
Among the different categories of feature selection algorithms, the genetic algorithm (GA) is a rather recent development. The Gene Weighted Genetic Algorithm (GWGA) proposed in this thesis assigns a weight to each gene of a chromosome and performs the crossover operation probabilistically according to these weights. Compared with the traditional genetic algorithm, this change reduces the tendency to become trapped in local optima.
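The abstract states only that each gene carries a crossover weight and that crossover is carried out probabilistically; it does not give the actual operator or the weight update rule. The following is a minimal sketch of one possible reading, not the thesis's implementation. It assumes a chromosome is a bit mask over features (1 = feature selected) and that the weight of gene i is the probability that the two parents exchange that gene. The names `weighted_crossover` and `selected_features` are hypothetical.

```python
import random


def weighted_crossover(parent_a, parent_b, weights, rng=random):
    """Return two children built from two parent chromosomes.

    Assumption (not from the thesis): weights[i] is the probability
    that gene i is exchanged between the two parents.
    """
    if not (len(parent_a) == len(parent_b) == len(weights)):
        raise ValueError("parents and weights must have the same length")
    child_a, child_b = list(parent_a), list(parent_b)
    for i, w in enumerate(weights):
        # rng.random() lies in [0.0, 1.0), so w = 0 never swaps
        # and w = 1 always swaps.
        if rng.random() < w:
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b


def selected_features(chromosome):
    """Indices of the features switched on in a bit-mask chromosome."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]
```

Calling `weighted_crossover(a, b, weights, random.Random(0))` with a seeded generator makes a run reproducible. How the weights are updated between generations is covered in Chapter 6 of the thesis and is deliberately left out here, since the record does not describe it.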
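Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Literature Review
1.3 Thesis Organization
Chapter 2 SELDI-TOF MS Principles and Data Characteristics
2.1 SELDI-TOF MS Technology
2.2 Advantages of SELDI-TOF MS
2.3 SELDI-TOF MS Raw Mass Spectrum Data
2.4 SELDI-TOF MS Pre-processed Data
2.5 Sources of the SELDI-TOF MS Data
Chapter 3 Feature Generation
3.1 Characteristics of SELDI-TOF MS Peak Data
3.2 Feature Generation from SELDI-TOF MS Peak Data
Chapter 4 Theory of Feature Selection
4.1 Definition of the Feature Selection Problem
4.2 Search Strategies for Feature Subsets
4.2.1 Complete Search
4.2.2 Sequential Search
4.2.3 Random Search
4.3 Feature Selection Procedure
4.3.1 Stopping Criterion
4.3.2 Result Validation
Chapter 5 Design of the Genetic Algorithm for Feature Selection
5.1 Design of the Genetic Algorithm for Feature Selection
5.2 Chromosome Encoding, Population, and Parents
5.3 Reproduction of Parent Chromosomes
5.4 Crossover
5.5 Mutation
5.6 Fitness Function
5.7 Ranking and Stopping Criterion
Chapter 6 Design of the Gene Weighted Genetic Algorithm for Feature Selection
6.1 Design Concept of GWGA
6.2 Ranking Design in GWGA
6.3 Weight Update Design in GWGA
6.4 Use of the Updated Weights
6.4.1 Design of Crossover in GWGA Using the Updated Weights
6.4.2 Design of Immigration and Extinction in GWGA Using the Updated Weights
6.5 Mutation in GWGA
6.6 Fitness Function in GWGA
6.7 Stopping Criterion of GWGA
6.8 Program Flow of GWGA
6.8.1 GWGA with Updated Weights Applied to Crossover
6.8.2 GWGA with Updated Weights Applied to Immigration and Extinction
Chapter 7 Application of GWGA to SELDI-TOF MS Hepatoma Data
7.1 Pre-processing of SELDI-TOF MS Hepatoma Data
7.1.1 Feature Vectors of the Hepatoma Mass Spectrum Data
7.1.2 Normalization of the SELDI-TOF MS Hepatoma Feature Vector Data
7.1.3 Cross-Validation
7.2 Analysis of Hepatoma Data with the Crossover Mechanism
7.3 Analysis of Hepatoma Data with the GWGA Crossover Mechanism
7.4 Analysis of Hepatoma Data with GWGA Immigration and Extinction
7.5 Comparison of Experimental Results on Hepatoma Data
7.6 Analysis Results of GWGA on Ovarian Cancer Data
Chapter 8 Conclusions and Future Work
8.1 Conclusions
8.2 Future Work
References
About the Author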