National Digital Library of Theses and Dissertations in Taiwan

Author: Tsu-Sheng Wang (王祖勝)
Thesis title: Evaluation of Homology Modeling for Global Sequences and Application to the Structure Genomics of Xanthomonas campestris pv. campestris
Thesis title (original): 全序列同源模擬法的評估及其在 Xanthomonas campestris pv. campestris 結構基因體的應用
Advisor: Ping-Chiang Lyu (呂平江)
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Life Science
Discipline: Life sciences
Field: Biology
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Number of pages: 61
Keywords: Global Sequences; Homology Modeling; Structure Genomics
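As genome projects are completed one after another, the next task is to determine the three-dimensional structures of the annotated protein sequences. Obtaining large numbers of structures experimentally is time-consuming and impractical, so computers can be used to predict structures or to screen target proteins. Among computational methods, homology modeling is the more accurate. Earlier studies examined only how the similarity of local (fragment) sequence alignments relates to the structural similarity of the α-carbon atoms; they did not consider how the similarity of global (full-sequence) alignments relates to the structural similarity of the various atom sets, so we examine that relationship here.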
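We selected sample sequences from the Protein Data Bank, aligned them pairwise with global sequence alignment, predicted structures by homology modeling, and compared each predicted structure with its experimental structure. We found that, for homology modeling, it is best to choose a template whose sequence is longer than the target sequence. If the sequence similarity is below 30%, one can look for a protein with a similar function or from the same family to use as the template. When the sequence similarity exceeds 60%, the predicted structures are about equally reliable. From the comparisons over different atom sets, we conclude that the sequence similarity between the predicted protein and the template protein must be at least 38% for the predicted structure to be trustworthy.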
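The pairwise global alignment step described above can be illustrated with a minimal Needleman-Wunsch sketch. This is not the thesis's actual procedure: the table of contents lists ALIGN and FASTA34 as the alignment tools, whereas the function names `global_align` and `percent_identity`, the simple match/mismatch/gap scores, and the choice of alignment length as the identity denominator are all assumptions made here for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment with a linear gap penalty (illustrative scores)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical aligned positions as a percentage of the full alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

Because the whole of both sequences is aligned end to end, unmatched tails count against the identity, which is one way a global alignment can give a different figure from a local one for the same pair.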
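In addition, we predicted structures for the newly sequenced Xanthomonas campestris pv. campestris genome. Requiring a template sequence similarity above 30%, we predicted 86 protein structures from 1291 annotated sequences; by the criteria obtained above, 39 of these predictions can be trusted.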

As more completely sequenced genomes become available, the next important task is to determine the three-dimensional structures of the annotated sequences. However, large-scale determination of protein structures by X-ray crystallography and nuclear magnetic resonance spectroscopy is time-consuming and impractical. Computational methods can speed up structure determination, and among them homology modeling is the most accurate.
Previous studies focused only on the relation between local sequence alignments and the RMSD (root mean square deviation) of the α-carbon atoms. We instead examine the relation between global sequence alignments and the RMSD of other atom sets: all atoms, backbone atoms, and side-chain atoms.
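To make the per-atom-set RMSD concrete, the following sketch computes it for all atoms, backbone atoms, side-chain atoms, and α-carbons. It assumes the two structures have already been superimposed (no fitting step is performed here), that atoms are keyed by a hypothetical `(residue_index, atom_name)` pair, and that the backbone consists of the atoms named N, CA, C and O. The thesis's own comparison tools may differ.

```python
import math

# Assumed backbone atom names (PDB-style naming).
BACKBONE = {"N", "CA", "C", "O"}


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def rmsd_by_atom_set(model, reference):
    """RMSD per atom set over the atoms present in both structures.

    model, reference: dict mapping (residue_index, atom_name) -> (x, y, z),
    assumed to be in the same, already superimposed, coordinate frame.
    """
    shared = sorted(set(model) & set(reference))
    atom_sets = {
        "all": shared,
        "backbone": [k for k in shared if k[1] in BACKBONE],
        "side_chain": [k for k in shared if k[1] not in BACKBONE],
        "c_alpha": [k for k in shared if k[1] == "CA"],
    }
    return {name: rmsd([model[k] for k in keys], [reference[k] for k in keys])
            for name, keys in atom_sets.items() if keys}
```

Side-chain atoms are usually placed less accurately than the backbone, so reporting the atom sets separately shows where a model deviates rather than averaging the error away.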
First, we took samples from the PDB (Protein Data Bank, http://www.rcsb.org/pdb/). Second, we performed pairwise global sequence alignments and predicted the three-dimensional structures by homology modeling. Finally, we calculated the RMSD between the predicted structures and the original structures. The results indicate that better predictions are obtained when the template sequence is longer than the target sequence. If the sequence identity is low, one can select as templates related proteins that have similar functions or belong to the same family. When the identity is > 60%, the RMSD between the templates and the target is almost the same, that is, the predicted structures are about equally reliable. In addition, the minimum identity needed to obtain good structures, with RMSD below 3.5 Å, is 38 to 45% depending on the atom set. Furthermore, we predicted structures for the whole genome of Xanthomonas campestris pv. campestris and obtained 86 reasonable structures from 1291 annotated sequences.
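The selection guidance in the abstract can be written down as a small triage function. This is only a sketch of the reported rules of thumb: the 30% figure comes from the Chinese abstract, 38% is the lower end of the reported 38 to 45% range, and the function name `assess_model` and the band labels are hypothetical.

```python
def assess_model(identity_pct: float, template_len: int, target_len: int) -> dict:
    """Triage a target/template pair using the identity bands reported in the abstract."""
    if identity_pct > 60:
        band = "high"        # reliability reported as roughly constant above 60%
    elif identity_pct >= 38:
        band = "reliable"    # at or above the lowest reported reliability cutoff
    elif identity_pct >= 30:
        band = "uncertain"   # meets the modeling threshold but not the reliability cutoff
    else:
        band = "low"         # abstract suggests a same-family or same-function template
    return {
        "band": band,
        "template_longer": template_len > target_len,
        "seek_family_template": identity_pct < 30,
    }
```

For example, `assess_model(35.0, 300, 250)` places the pair in the "uncertain" band: above the 30% threshold used for the genome-wide run, but below the 38% identity at which the abstract considers a model trustworthy.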

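List of Figures
List of Tables
Chinese Abstract
Abstract
Acknowledgments
Chapter 1. Introduction
How protein structures are determined
Motivation
Homology modeling
The six steps of building a protein molecular model
Step 1: Select the template protein sequence
Step 2: Align the amino acid sequences of the target and the template
Step 3: Build the backbone of the core regions of the target sequence
Step 4: Generate the structures of the loops between structurally conserved regions
Step 5: Refine the target protein structure
Step 6: Check and validate the three-dimensional structure
Limitations of homology modeling
Applications of homology modeling
Websites related to homology modeling
Chapter 2. Materials and Methods
Materials
Protein amino acid sequences
Software
(1) FASTA34
(2) ALIGN
(3) MODELLER
(4) MMTSB
(5) AMBER
Methods
Obtaining protein structures and sequences
Filtering protein sequences and building the database
Aligning protein sequences and building the alignment database
Filtering the sequence alignment data
Building protein structures
Comparing protein structures
Energy minimization
Predicting three-dimensional structures for the Xanthomonas campestris pv. campestris genome
Chapter 3. Results and Discussion
Protein structure and sequence samples
Sequence alignment
Filtering the sequence alignment samples
Probability of errors in homology modeling
Relationship between sequence similarity and structural similarity
Energy minimization of structures
Errors in the MODELLER program
Structure prediction for the Xanthomonas campestris pv. campestris genome
Chapter 4. Conclusions
Chapter 5. References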

