National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

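Author: Chia-Jen Lin (林家仁)
Thesis title (Chinese): 高效率整合性蛋白質資料庫的研究-演化分析法的應用ETiprot
Thesis title (English): An Efficiently Integrated Protein Database- the Application of Evolutionary Trace Method
Advisor: Sheh-Yi Sheu (許世宜)
Degree: Master's
Institution: National Yang-Ming University (國立陽明大學)
Department: Institute of Health Informatics and Decision Making (衛生資訊與決策研究所)
Discipline: Medicine and Health
Field: Public Health
Thesis type: Academic thesis
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar)
Language: Chinese
Keywords (Chinese): 生物資訊; 活性區; 演化分析法
Keywords (English): bioinformatics; active site; evolutionary trace method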
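Protein-protein interactions play an important role in the functioning of living organisms. Research on protein active sites covers locating and predicting catalytic centers, the interactions between ligands and the protein surface, and the changes in the protein's three-dimensional structure that accompany those interactions. Understanding these mechanisms would greatly benefit drug design, site-directed amino acid mutagenesis, and the determination of molecular reaction pathways. In the post-genomic era, rapidly annotating the locations of protein active sites from large volumes of protein information has become a most important problem.
In recent years, as biotechnology has matured, the volume of protein structure and sequence data has grown rapidly. With the fast development of structural bioinformatics, many specialized, function-oriented databases of protein information have appeared, covering protein structures and sequences, protein families, and classifications. Because this large body of data is scattered across multiple databases, biologists who want to run large-scale, cross-referenced bioinformatics computations often spend a great deal of time on data processing. This thesis therefore integrates protein-related information and offers biologists a fast, automated processing procedure, so as to achieve maximum synergy.
The evolutionary trace method has produced good results in many earlier active-site prediction studies. Its theoretical basis is that, within a protein family, residues in the active region of the structure mutate over the course of evolution with a far lower probability than residues in other regions. Exploiting this property, the protein sequences belonging to one family can be put through multiple sequence alignment; the short conserved segments obtained, once mapped onto the structure, are relatively more likely to be part of the active site, which permits a reasonable prediction of the protein's active region.
This thesis integrates resources on protein structure, sequence, classification, and protein families, uses a database system to link the related data effectively, and provides a web query interface for lookup. The database also serves as the data source for active-site prediction: an automated evolutionary trace analysis pipeline was designed to compute functionally and structurally conserved regions of proteins, together with a web query interface and a value-added database system, to support the needs of structural bioinformatics in protein structure research.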
Protein-protein interactions play a central role in biology. Studies of protein functional sites include their discovery and prediction, their interfaces with ligands, and the conformational switches that occur on binding. Once the mechanism is understood, drug design, the engineering of protein mimetics, and the elucidation of molecular pathways through site-directed mutagenesis can be carried out far more effectively. In the post-genomic era, efficiently annotating protein functional sites from massive amounts of protein information has become one of the most important problems.
With the recent maturation of biotechnology and the rapid growth of protein structure and sequence data, a number of specialized, function-oriented databases have been set up alongside the fast development of structural bioinformatics, covering protein structure, sequence, family, and classification. However, these data are stored in different databases, so biologists may spend a great deal of time on data processing and must integrate several databases before they can perform large-scale bioinformatics computations. We therefore propose to integrate protein-related information and to provide biologists with a fast, automated processing procedure that achieves the best synergy.
The evolutionary trace method has performed well in predicting many active sites. It rests on the hypothesis that, within a protein family, residues in the active site of the structure mutate during evolution at a much lower rate than residues elsewhere. Following this strategy, the protein sequences of a family are subjected to multiple sequence alignment, and the conserved regions, once mapped onto the structure, are more likely to form the active site, which allows the functional region of the protein to be predicted on reasonable grounds.
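As a rough illustration of the idea in this paragraph, the sketch below labels each column of a multiple sequence alignment in the spirit of the evolutionary trace method: a column is "conserved" if every sequence shares one residue, "class-specific" if each subgroup is invariant but subgroups differ, and "neutral" otherwise. It is a simplified sketch, not the thesis's pipeline: the function name `trace_positions`, the toy sequences, and the pre-assigned groups are all assumptions made for illustration. In the published method the groups come from partitioning a phylogenetic tree, a step that is skipped here.

```python
from typing import Dict, List


def trace_positions(groups: Dict[str, List[str]]) -> List[str]:
    """Label each alignment column as 'conserved', 'class-specific', or 'neutral'.

    `groups` maps a (hypothetical) subfamily name to its aligned sequences.
    All sequences must already be aligned to the same length.
    """
    all_seqs = [s for seqs in groups.values() for s in seqs]
    length = len(all_seqs[0])
    if any(len(s) != length for s in all_seqs):
        raise ValueError("all aligned sequences must have the same length")

    labels = []
    for col in range(length):
        # Set of residues observed in this column, one set per group.
        per_group = [{s[col] for s in seqs} for seqs in groups.values()]
        # A group is invariant if it shows exactly one residue and no gap.
        invariant_within = all(len(r) == 1 and "-" not in r for r in per_group)
        if not invariant_within:
            labels.append("neutral")
        elif len(set.union(*per_group)) == 1:
            labels.append("conserved")
        else:
            labels.append("class-specific")
    return labels


# Toy alignment with made-up sequences, for illustration only.
toy_groups = {
    "group_a": ["MKGDA", "MKGEA"],
    "group_b": ["MRGDA", "MRGSA"],
}
labels = trace_positions(toy_groups)
for index, label in enumerate(labels, start=1):
    print(index, label)
```

Columns labelled "conserved" or "class-specific" are the candidates that would then be mapped onto the three-dimensional structure to see whether they cluster.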
This thesis integrates protein structures, sequences, classifications, protein families, and other related resources. We used a database system to consolidate the relationships among these data sources, developed a web interface for searching them, and used the resulting database as the data source for predicting protein active sites. We built an automated analysis pipeline based on the evolutionary trace method to compute functionally and structurally conserved regions of proteins, and developed a web search interface and a value-added database system to meet the needs of protein structure research in structural bioinformatics.
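The kind of cross-database linkage described here can be sketched with a small relational schema. The example below is only an assumed layout: the table names (`structure`, `sequence`, `family`), the identifiers, and the helper `family_members` are hypothetical, and Python's built-in `sqlite3` stands in for whatever database server the thesis actually used, so that the example is self-contained. It shows how a structure identifier could be resolved to the sequences of its whole family, which is the input an alignment step would need.

```python
import sqlite3

# In-memory database with three hypothetical tables, one per integrated source.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE structure (structure_id TEXT PRIMARY KEY, accession TEXT);
    CREATE TABLE sequence  (accession TEXT PRIMARY KEY, residues TEXT);
    CREATE TABLE family    (accession TEXT, family_id TEXT);

    -- Made-up placeholder records, for illustration only.
    INSERT INTO structure VALUES ('STRUCT1', 'SEQ1');
    INSERT INTO sequence  VALUES ('SEQ1', 'MKGDA'), ('SEQ2', 'MKGEA'), ('SEQ3', 'MTTTT');
    INSERT INTO family    VALUES ('SEQ1', 'FAM1'), ('SEQ2', 'FAM1'), ('SEQ3', 'FAM2');
    """
)


def family_members(connection, structure_id):
    """Return (accession, residues) for every sequence that shares a family
    with the sequence linked to the given structure."""
    query = """
        SELECT s.accession, s.residues
        FROM structure AS st
        JOIN family   AS f1 ON f1.accession = st.accession
        JOIN family   AS f2 ON f2.family_id = f1.family_id
        JOIN sequence AS s  ON s.accession  = f2.accession
        WHERE st.structure_id = ?
        ORDER BY s.accession
    """
    return connection.execute(query, (structure_id,)).fetchall()


members = family_members(conn, "STRUCT1")
for accession, residues in members:
    print(accession, residues)
```

Keeping the cross-references in join tables means a single query replaces the manual lookups across separate sources that the abstract describes as time-consuming.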