National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

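Graduate student: Chong-Cheng Lee (李中丞)
Thesis title: Design of Optimal Classifiers for Microarray Data Analysis (用於微陣列資料分析的最佳分類器設計)
Advisors: Shiow-Fen Hwang (黃秀芬), Shinn-Yin Ho (何信瑩)
Degree: Master's
Institution: Feng Chia University (逢甲大學)
Department: Graduate Institute of Information Engineering (資訊工程所)
Discipline: Engineering
Field: Electrical and Information Engineering
Thesis type: Academic thesis
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 54
Keywords (Chinese): 基因演算法, 微陣列, 基因選取
Keywords (English): Microarray, Genetic Algorithm, Gene Selection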
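A microarray is a technology that can rapidly generate large amounts of gene expression data. By analyzing microarray gene expression data, we can identify the key genes associated with a disease and use them as a basis for later diagnosis. However, because the number of human genes is very large, screening them one by one is both too time-consuming and ineffective. Classifier design and feature selection techniques from the information sciences can achieve this goal automatically.

In feature gene selection, the number of genes tested on a microarray is very large while the number of samples is usually small, so picking out the feature genes that contribute to classification accuracy is a very difficult challenge. In the microarray analysis literature, many earlier studies have proposed feature gene selection methods that reduce dimensionality to ease analysis. Among the studies that focus on multi-class microarray data, one combined a genetic algorithm (GA) with maximum likelihood classification (MLHD) to filter the test genes (the GA/MLHD method) and obtained good performance.

This thesis therefore proposes combining an intelligent genetic algorithm (IGA) with the MLHD classifier, and improves the fitness evaluation function and encoding scheme proposed in that earlier work, in order to maximize classifier accuracy and minimize the number of genes required. The aim is to provide a gene selection method suited to multi-class microarray data. Experiments were carried out on 11 commonly used microarray datasets. The results show that the proposed IGA/MLHD method not only achieves higher accuracy than GA/MLHD but also requires fewer feature genes. In addition, the selection frequencies of the feature genes show that the genes chosen by IGA/MLHD are more stable. IGA/MLHD is therefore clearly superior to GA/MLHD in classifier accuracy, number of feature genes, and stability.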
Microarrays are a very useful technique for producing massive amounts of gene expression data. We attempt to find the genes relevant to a particular disease by analyzing microarray gene expression data. However, because the number of human genes is very large and most human genes are not relevant to a particular disease, the computational cost will be high and the classification accuracy will likely be low if all human genes are screened one by one. Microarray expression data usually have a large number of features but a small number of samples, which makes selecting relevant genes from all test genes a great challenge. One existing efficient method for identifying relevant genes and effectively discriminating sample classes is the hybrid approach based on a genetic algorithm and maximum likelihood classification (GA/MLHD). In this thesis, a method based on an intelligent genetic algorithm (IGA), called IGA/MLHD, with control genes and an improved fitness function is proposed to determine the minimal number of relevant genes and identify these genes while simultaneously maximizing classification accuracy. The experimental results show that IGA/MLHD is superior to the existing GA/MLHD method in terms of the number of selected genes, classification accuracy, and stability.
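To make the idea of scoring a candidate gene subset concrete, the following is a minimal sketch, not the thesis's actual method. It assumes that MLHD can be approximated by a linear discriminant with a pooled covariance matrix, that a gene subset is encoded as a binary mask, and that fitness is accuracy minus a penalty on the fraction of genes selected. The names `mlhd_fit`, `mlhd_predict`, `fitness`, and `gene_weight` are hypothetical; the thesis's own fitness function, encoding, and IGA operators are not given in this record.

```python
import numpy as np


def mlhd_fit(X, y):
    """Estimate class means and the inverse pooled covariance (assumed MLHD form)."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Subtract each sample's own class mean, then pool the within-class scatter.
    centered = X - means[np.searchsorted(classes, y)]
    pooled = centered.T @ centered / max(len(y) - len(classes), 1)
    # Pseudo-inverse guards against a singular matrix when genes outnumber samples.
    return classes, means, np.linalg.pinv(pooled)


def mlhd_predict(model, X):
    """Assign each sample to the class with the largest linear discriminant score."""
    classes, means, inv_cov = model
    proj = means @ inv_cov  # shape: (n_classes, n_genes)
    # Score for class q: mu_q^T S^-1 x - 0.5 * mu_q^T S^-1 mu_q
    scores = X @ proj.T - 0.5 * np.sum(proj * means, axis=1)
    return classes[np.argmax(scores, axis=1)]


def fitness(mask, X_train, y_train, X_test, y_test, gene_weight=0.1):
    """Hypothetical fitness: test accuracy minus a penalty on the share of genes used."""
    selected = np.flatnonzero(mask)
    if selected.size == 0:
        return 0.0  # an empty gene subset cannot classify anything
    model = mlhd_fit(X_train[:, selected], y_train)
    predictions = mlhd_predict(model, X_test[:, selected])
    accuracy = np.mean(predictions == y_test)
    return accuracy - gene_weight * selected.size / mask.size
```

A genetic algorithm would evolve a population of such masks and keep those with higher fitness. The penalty weight trades accuracy against subset size; its value here is arbitrary and would need tuning on real data.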
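Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables

Chapter 1 Introduction
1.1 Introduction to Microarrays
1.2 Microarray Sample Classification
1.3 Microarray Gene Selection
1.4 Research Objectives
1.5 Thesis Organization

Chapter 2 Related Work
2.1 Microarray Sample Classifiers
2.2 Gene Selection
2.3 Genetic Algorithms
2.4 Intelligent Genetic Algorithm (IGA)

Chapter 3 Optimal Classifier Design
3.1 Design Method
3.2 Analysis of the Design Method

Chapter 4 Performance Analysis and Comparison
4.1 Microarray Datasets
4.2 Analysis of the Performance Improvement of IGA/MLHD
4.3 Performance Comparison of GA/MLHD and IGA/MLHD
4.4 10-Fold Cross-Validation Experiments

Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work

References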