National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

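Author: 江素倩
Author (English): Chiang, Su-Chien
Title: 藉由基因表現資料預測基因多重功能
Title (English): Prediction of Multiple Gene Functions Using Gene Expression Data
Advisor: 陳春賢
Advisor (English): Chen, Chun-Hsien
Degree: Master's
Institution: Chang Gung University (長庚大學)
Department: Graduate Institute of Information Management
Discipline: Computer Science
Field: General Computer Science
Thesis type: Academic thesis
Year of publication: 2005
Graduation academic year: 93 (ROC calendar)
Language: Chinese
Keywords (Chinese): 基因功能預測; 多重功能預測; 基因功能分類架構; 分類問題; 倒傳遞類神經網路
Keywords (English): gene function prediction; multiple function prediction; classification scheme; classification problem; backpropagation neural network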
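In the post-genomic era, genome research has shifted its emphasis from sequencing to functional genomics, which aims to understand gene function and operating mechanisms. Understanding the roles that genes play, and in turn the interactions among their products, is an important task for realizing genomic medicine. This thesis takes yeast as its subject and uses backpropagation neural networks to analyze DNA microarray gene expression data. Combined with the gene functional classification scheme defined by MIPS, gene function prediction is treated as a multi-class classification problem, and the neural network classifiers are trained by supervised learning with the scaled conjugate gradient algorithm. The thesis examines how classifiers with different neural network architectures behave during training. Finally, following the MIPS functional classification scheme, the prediction of multiple gene functions is treated as multiple binary classification problems: a classifier model is trained for each functional category, and the classification ability of the different architectures is discussed in terms of classification accuracy. The study thus examines the classification ability of various backpropagation neural network architectures on the problem of predicting multiple gene functions, and offers an approach to gene function analysis that differs from cluster analysis, which can only assign a gene to a single function.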
After the completion of the Human Genome Project, genome researchers have shifted their attention from structural genomics to functional genomics. In the post-genomic era, predicting gene functions is one of the most important tasks. To realize genomic medicine, the first important step is to understand the functions of genes and to infer their biological mechanisms, such as the interactions among genes. In this thesis, we analyze microarray gene expression data of yeast with backpropagation neural networks to predict the functions of yeast genes. Unsupervised learning (cluster analysis) is widely used to identify genes with similar functions, but it assigns each gene to only one cluster, so each gene is effectively treated as having a single function. We therefore take a supervised learning approach: the prediction of gene functions is treated as a classification problem, and well-trained classifiers can be used to predict the functions of a novel gene. The training and testing data are gene expression data for 2426 ORFs of the yeast Saccharomyces cerevisiae across 156 experiments, and the MIPS functional classification scheme provides the annotation of gene functions. Various neural network classifier architectures are designed and evaluated by their classification ability.
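The decomposition described in the abstract, one binary classifier per functional category so that a gene can receive several functions at once, can be sketched roughly as follows. This is an illustrative toy, not the thesis's implementation: it assumes plain stochastic gradient descent in place of the scaled conjugate gradient algorithm, uses 2 inputs and 3 categories instead of 156 expression values and 17 categories, and every name in it (`BinaryMLP`, `train_binary_classifiers`, `predict_functions`, the toy data) is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BinaryMLP:
    """One n_in-h-1 feed-forward network with sigmoid units (hypothetical helper)."""

    def __init__(self, n_in, n_hidden, rng):
        # Each weight row stores its bias as the last element.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # zip stops at len(x), so the trailing bias is added separately.
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w1]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.w2[-1])
        return hidden, out

    def train_step(self, x, target, lr):
        # One backpropagation update for the squared error 0.5 * (out - target)^2.
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas use the output weights as they were before this update.
        delta_hid = [delta_out * self.w2[j] * h * (1.0 - h)
                     for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.w2[-1] -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x):
                row[i] -= lr * delta_hid[j] * v
            row[-1] -= lr * delta_hid[j]


def train_binary_classifiers(X, label_sets, n_classes, n_hidden=4,
                             epochs=2000, lr=0.5, seed=0):
    """Train one binary network per category (the (n_in-h-1) x n_classes layout)."""
    rng = random.Random(seed)
    nets = []
    for c in range(n_classes):
        net = BinaryMLP(len(X[0]), n_hidden, rng)
        # Target is 1 if the gene is annotated with category c, else 0.
        targets = [1.0 if c in labels else 0.0 for labels in label_sets]
        for _ in range(epochs):
            for x, t in zip(X, targets):
                net.train_step(x, t, lr)
        nets.append(net)
    return nets


def predict_functions(nets, x, threshold=0.5):
    """Return every category whose classifier output reaches the threshold."""
    return {c for c, net in enumerate(nets) if net.forward(x)[1] >= threshold}


# Toy data (assumption): 2 "expression" values per gene, 3 categories.
# Category 0 follows the first input, category 1 the second, category 2 is
# unused, so the last gene carries two functions at once.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
label_sets = [set(), {1}, {0}, {0, 1}]
nets = train_binary_classifiers(X, label_sets, n_classes=3)
predicted = [predict_functions(nets, x) for x in X]
```

Because each category has its own network and its own threshold, a gene can be assigned zero, one, or several categories, which is the property that single-cluster assignment lacks. Per-category accuracy can then be computed by comparing `predicted` with `label_sets` one category at a time.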
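Advisor's Recommendation Letter
Oral Defense Committee Certification
Chang Gung University Authorization Form
Acknowledgments
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Assumptions and Limitations
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Genes and Gene Function
2.2 Classification Schemes for Gene Function Annotation
2.3 Gene Expression Data
2.3.1 DNA Microarrays
2.3.2 Gene Expression Data Analysis
2.4 Related Work on Predicting Gene Function with Supervised Methods
2.5 Backpropagation Neural Networks
2.5.1 Basic Neural Network Architecture
2.5.2 Perceptron
2.5.3 Backpropagation Neural Networks
Chapter 3 Methods
3.1 Gene Expression Data
3.2 Analysis of the MIPS Gene Functional Classification Scheme
3.3 Processing of Experimental Data
3.4 Experimental Procedure and Backpropagation Neural Network Architectures
3.5 Analysis Software Used in This Study
Chapter 4 Results and Discussion
4.1 257-Class Multi-Output Classifier Architecture: 156-h-257
4.2 17-Class Multi-Output Classifier Architecture: 156-h-17
4.3 17-Class Single-Output Binary Classifier Architecture: (156-h-1)×17
Chapter 5 Conclusions and Future Work
References
Appendix 1: MIPS Yeast Classification Scheme (259 classes)
Appendix 2: Eisen's & Spellman's Experimental Data
Appendix 3: Eisen's Experimental Data
Appendix 4: Spellman's Experimental Data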

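List of Figures
Figure 2-1 Basic nucleotide structure
Figure 2-2 Pentose structure
Figure 2-3 Base structure
Figure 2-4 DNA double helix and nucleotide structure
Figure 2-5 Correspondence among DNA, mRNA, and peptide chain sequences
Figure 2-6 Steps of gene expression
Figure 2-7 Supervised and unsupervised analysis
Figure 2-8 Biological neuron
Figure 2-9 Neural network computing unit
Figure 2-10 Single-layer feed-forward network
Figure 2-11 Multi-layer feed-forward network
Figure 3-1 Distribution of gene functions (259 classes)
Figure 3-2 Distribution of gene functions (19 classes)
Figure 3-3 Distribution of multiple gene functions (259 classes)
Figure 3-4 Distribution of multiple gene functions (19 classes)
Figure 3-5 Neural network inputs and outputs
Figure 3-6 Network input data
Figure 3-7 Network output data
Figure 3-8 Schematic of the 156-h-257 classifier network architecture
Figure 3-9 Schematic of the 156-h-17 classifier network architecture
Figure 3-10 Schematic of the (156-h-1)×17 classifier network architecture
Figure 3-11 Experimental flowchart
Figure 4-1 Network inferred output versus target output
Figure 4-2 Number of hidden layers versus the network convergence measure (MSE)
Figure 4-3 Convergence with 8 and 80 hidden-layer neurons
Figure 4-4 Sample distribution of the training data set
Figure 4-5 Sample distribution of the testing data set
Figure 4-6 Number of hidden-layer neurons versus accuracy for individual classifiers
Figure 4-7 Sample size versus accuracy (Eisen's & Spellman's)
Figure 4-8 Sample size versus accuracy (Eisen's)
Figure 4-9 Sample size versus accuracy (Spellman's)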

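List of Tables
Table 2-1 Standard genetic code
Table 2-2 Organisms annotated by FunCat
Table 3-1 Sources of gene expression data
Table 3-2 Spellman's data set
Table 3-3 Eisen's data set
Table 3-4 Functional category structure
Table 4-1 Hidden-layer test convergence data
Table 4-2 Number of hidden-layer neurons and accuracy
Table 4-3 Number of hidden-layer neurons and accuracy (Eisen's)
Table 4-4 Number of hidden-layer neurons and accuracy (Spellman's)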