臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

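Graduate student: 吳信宏 (Hsin-Hung Wu)
Thesis title: 用序列結構排列法來預測蛋白質GO分類功能
Thesis title (English): Predict Gene Ontology Functions Using Sequence-Structure Alignment Method
Advisor: 許文龍 (Wen-Lung Hsu)
Degree: Master's
Institution: 中華大學 (Chung Hua University)
Department: Master's Program, Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and computer engineering
Thesis type: Academic thesis
Year of publication: 2006
Graduation academic year: 94 (ROC calendar)
Language: English
Pages: 58
Keywords (Chinese): 蛋白質功能預測 (protein function prediction); 蛋白質功能 (protein function); 蛋白質 (protein)
Keywords (English): Protein Function; Predict Protein Function; Protein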
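Abstract in Chinese (translated)
Bioinformatics data have grown steadily since the launch of the Human Genome Project. Many genomes and proteomes have been sequenced, yet their functions remain unknown, so methods for predicting function are important.
A protein's function is closely related to its structure. This thesis takes protein structure into account and develops a sequence-structure alignment method for predicting protein function. The method uses a protein's primary-structure sequence and its secondary-structure sequence to build a statistical hidden Markov model (HMM), trains the model on proteins of known function, and uses the resulting probabilities as the basis for prediction. Functional classes in the experiments follow the Gene Ontology classification, and the protein data are taken from the Protein Data Bank.
Two prediction systems are proposed on the basis of the sequence-structure alignment method. The first scores an unknown protein against the model of each class, computes an overall expected value, and uses that value to decide which class the protein belongs to; its accuracy is 63%. A protein may, however, have one function or several. The second system therefore also builds, for each class, a model of the proteins that do not belong to that class. It can detect one or more functions of an unknown protein, and its accuracy rises to 81%.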
Abstract
As the Human Genome Project (HGP) has progressed, more and more biological data have become available. Many proteins and genes have been sequenced, but their functions remain unknown, so methods for predicting function have become important.
The molecular structure of a protein is known to be closely related to its function. In this thesis, we develop a sequence-structure alignment method to predict protein function. The method uses the protein sequence and its secondary structure to build a statistical hidden Markov model (HMM), which is trained on existing databases. Functions are classified according to the Gene Ontology (GO), and the protein data files are collected from the Protein Data Bank (PDB).
Two prediction algorithms are proposed. The first builds a separate HMM for each GO function class and then computes the overall probability that a protein belongs to each class. Its accuracy reaches 63%. However, a protein may have more than one function. The second algorithm therefore builds an additional HMM for each class from the data that do not belong to that class, which allows it to predict multiple functions for an unknown protein. Its accuracy rises to 81%.
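The difference between the two decision rules in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the record does not give the HMM topology or training details, so a first-order Markov chain over joint (residue, secondary-structure) symbols with add-one smoothing stands in for the HMM. The class labels `binding` and `catalysis`, the toy sequences, the alphabet size of 60 (20 residues times 3 structure states) and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

ALPHABET_SIZE = 60  # assumed: 20 amino acids x 3 secondary-structure states


def pair_symbols(sequence, structure):
    """Zip residues with secondary-structure labels into joint symbols."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return list(zip(sequence, structure))


class ChainModel:
    """Stand-in for a per-class HMM: a smoothed first-order Markov chain."""

    def __init__(self):
        self.transitions = Counter()  # (previous, current) -> count
        self.totals = Counter()       # previous -> outgoing transition count

    def train(self, examples):
        """Count symbol-to-symbol transitions over (sequence, structure) pairs."""
        for sequence, structure in examples:
            symbols = pair_symbols(sequence, structure)
            for prev, cur in zip(symbols, symbols[1:]):
                self.transitions[(prev, cur)] += 1
                self.totals[prev] += 1
        return self

    def log_score(self, sequence, structure):
        """Sum of log transition probabilities with add-one smoothing."""
        symbols = pair_symbols(sequence, structure)
        score = 0.0
        for prev, cur in zip(symbols, symbols[1:]):
            numerator = self.transitions[(prev, cur)] + 1
            denominator = self.totals[prev] + ALPHABET_SIZE
            score += math.log(numerator / denominator)
        return score


def predict_single(models, sequence, structure):
    """Rule 1: return the single class whose model scores highest."""
    return max(models, key=lambda c: models[c].log_score(sequence, structure))


def predict_multi(models, neg_models, sequence, structure):
    """Rule 2: return every class whose model beats its 'not this class' model."""
    return sorted(
        c for c in models
        if models[c].log_score(sequence, structure)
        > neg_models[c].log_score(sequence, structure)
    )


# Invented toy training data: class label -> list of (sequence, structure).
training = {
    "binding": [("AAAA", "HHHH"), ("AAAAA", "HHHHH")],
    "catalysis": [("GGGG", "EEEE"), ("GGGGG", "EEEEE")],
}

# One model per class, plus one model per class trained on all other classes.
models = {c: ChainModel().train(data) for c, data in training.items()}
neg_models = {
    c: ChainModel().train(
        [ex for other, data in training.items() if other != c for ex in data]
    )
    for c in training
}

print(predict_single(models, "AAA", "HHH"))
print(predict_multi(models, neg_models, "AAA", "HHH"))
```

With this toy data both rules assign the helical poly-alanine query to the hypothetical class `binding`. The second rule makes an independent accept-or-reject decision per class, which is what lets it return several classes, or none, for a single protein.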
Table of Contents
Abstract in Chinese
Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Significance of protein
1-2 Motivation and purpose
1-3 Dissertation organization
Chapter 2 Research Background
2-1 The overview of protein structure
2-1-1 Primary structure
2-1-2 Secondary structure
2-1-3 Tertiary structure
2-1-4 Quaternary structure
2-2 Methods to predict protein function
2-2-1 Sequence alignment
2-2-2 Phylogenetic profiles
2-2-3 Artificial neural networks
2-3 Gene ontology
2-4 Hidden Markov model
Chapter 3 Sequence-Structure Alignment Method
3-1 Data set
3-2 Training and predicting
3-2-1 Using hidden Markov model
3-2-2 Calculate every variable's frequency of passing through
3-2-3 Calculate variable's probability
3-2-4 Probability translates into log value
3-2-5 Predicting method
3-2-6 An example for predicting
3-3 Test
Chapter 4 Performance of Our Approach
4-1 Data source and our database
4-2 Training
4-3 Performance
Chapter 5 Conclusion and Future Work
Reference
Appendix A