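Graduate student: Cheng, Hung-Yi (鄭弘翊)
Thesis title: Internal Repeat Segmentation and Subtype Classification of α-solenoid Proteins (α螺旋蛋白質內部重複單元切割與分類)
Advisor: Pai, Tun-Wen (白敦文)
Committee members: Hsu, Hui-Huang (許輝煌); Chang, Hao-Teng (張顥騰)
Oral defense date: 2015-11-16
Degree: Master's
Institution: National Taiwan Ocean University
Department: Department of Computer Science and Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2015
Graduation academic year: 104 (ROC calendar)
Language: English
Number of pages: 40
Keywords: Alpha-solenoid; Tandem repeat structure; Secondary structure; Structural biology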
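Abstract (translated from the Chinese): Protein structures with internal repeats are widely distributed across protein families. The various types of basic repeat unit are functionally diverse and take part in the interactions and reactions of different biological functions across species. The α-solenoid tandem repeat is a common type of repeat protein structure, but the sequence similarity between any two repeat units within the same tandem repeat structure is very low, so sequence-comparison methods cannot automatically segment each basic repeat unit or classify it correctly. Based on the geometric features and secondary structure information of the basic repeat units, this thesis develops an automatic identification system for α-solenoid tandem repeat structures that performs automatic segmentation, repeat structure classification, and functional annotation. The system uses Psi and Alpha dihedral angle features to determine the extent of the α-helix segments in each basic repeat unit, and computes the intersection angle between the virtual vectors of two neighboring α-helix segments to confirm the composition of each basic repeat unit. Once internal basic repeat units are confirmed, the lengths of the two constituent α-helix segments, their geometric curvature features, and the relative positions of neighboring repeat units are used to identify the structural subclass of the α-solenoid tandem repeat. To evaluate the prediction system, we analyzed three test datasets: 923 α-solenoid repeat structures from the RepeatsDB database, 905 from the SMART/Pfam databases, and 166 from the CATH database. For automatically determining whether a structure is an α-solenoid repeat, the system reached a recall of 94.24%, a precision of 76.16%, a specificity of 99.76%, and an accuracy of 99.71%. For automatic segmentation of basic repeat units, it reached a recall of 94.20%, a precision of 94.66%, a specificity of 96.73%, and an accuracy of 95.62%. For automatic classification of α-solenoid repeat structures, it reached an average recall of 81.76%, an average precision of 82.46%, an average specificity of 96.06%, and an average accuracy of 93.38%. This is the first system that, using structural information alone, can simultaneously identify four highly similar α-solenoid repeat structures, automatically segment all basic repeat units within a repeat structure, and annotate their geometric positions in detail. Its online real-time identification and user-friendly visual design let structural biologists quickly compare the shared features and differences of the various α-solenoid repeat structures, which is of practical help for subsequent protein structure classification and biological experiments.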
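The two geometric quantities named in the abstract, a dihedral angle and the included angle between the vectors of neighboring helices, can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's implementation: this record gives no thresholds or axis-fitting procedure. The "Alpha" angle is assumed here to be the virtual dihedral formed by four consecutive Cα atoms, and `helix_axis` is a hypothetical helper that approximates a helix direction from Cα centroids.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points.

    Psi uses the backbone atoms N(i), CA(i), C(i), N(i+1).
    ASSUMPTION: "Alpha" is taken as the virtual dihedral of four
    consecutive CA atoms; the record does not define it.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _norm(b1)
    b1 = tuple(x / n1 for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def helix_axis(ca_coords, k=4):
    """Hypothetical axis estimate: centroid of the last k CA atoms
    minus centroid of the first k CA atoms."""
    def centroid(pts):
        return tuple(sum(p[i] for p in pts) / len(pts) for i in range(3))
    return _sub(centroid(ca_coords[-k:]), centroid(ca_coords[:k]))

def included_angle_deg(u, v):
    """Unsigned angle in degrees between two vectors."""
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Toy coordinates (not real protein data): two straight segments
# running in opposite directions, as in an antiparallel helix pair.
helix_a = [(0.0, 0.0, float(i)) for i in range(8)]
helix_b = [(5.0, 0.0, float(7 - i)) for i in range(8)]
pair_angle = included_angle_deg(helix_axis(helix_a), helix_axis(helix_b))
```

In a pipeline of this kind, residues whose angles fall in a helical range would be grouped into candidate helix elements, and a pair of neighboring elements would be accepted as a repeat unit when `pair_angle` falls within a chosen window. The actual ranges used by the thesis are not stated in this record.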
Tandem repeat structures are widely distributed among all classes of proteins. The various repetitive structural units are functionally diverse and play important roles in protein interactions and biological responses in different organisms. One of the most common types of protein repeat structure is the α-solenoid tandem repeat, in which any two internal repeat units of a structure share low sequence similarity. Consequently, a segmentation and classification system for α-solenoid repeats cannot rely mainly on sequence alignment. To support comprehensive analysis of repeat unit segmentation, subclass identification, and functional annotation of such repeat structures, we developed an automatic identification system based on geometric characteristics and secondary structure information. The Psi and Alpha dihedral angles were used to locate candidate α-helix elements, and the included angle between the vectors defined by neighboring α-helix elements was calculated to construct fundamental repeat units. The lengths of the helix elements, their geometric curvature, and the relative positions of neighboring repeat units were then used to classify the subtypes of α-solenoid tandem repeats. To evaluate the system, we used three datasets: 923 α-solenoid repeats from the RepeatsDB database, 905 from the CATH database, and 166 from the SMART/Pfam databases. For identifying α-solenoid repeats, the system achieved a recall of 94.24%, a precision of 76.16%, a specificity of 99.76%, and an accuracy of 99.71%. For internal repeat unit segmentation of the identified repeats, it achieved a recall of 94.20%, a precision of 94.66%, a specificity of 96.73%, and an accuracy of 95.62%.
For subtype classification, the system achieved a recall of 81.76%, a precision of 82.46%, a specificity of 96.06%, and an accuracy of 93.38%. This is the first comprehensive classification system that identifies four different subtypes of α-solenoid repeats while also providing fundamental internal repeat segmentation and geometric annotation. Its online recognition service and user-friendly interface can help structural biologists efficiently compare the common and unique features of different α-solenoid tandem repeat subtypes, which benefits protein classification and annotation and may also support biological experiments.
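The four measures reported in the abstract follow the standard confusion-matrix definitions. The sketch below shows how they relate; the counts are hypothetical and are not taken from the thesis. They are chosen only to show that very high specificity and accuracy can coexist with noticeably lower precision when negatives far outnumber positives, which is the pattern seen in the detection results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Recall, precision, specificity, and accuracy from confusion-matrix counts."""
    def ratio(num, den):
        # Return NaN rather than raising when a denominator is zero.
        return num / den if den else float("nan")
    return {
        "recall": ratio(tp, tp + fn),        # TP / (TP + FN)
        "precision": ratio(tp, tp + fp),     # TP / (TP + FP)
        "specificity": ratio(tn, tn + fp),   # TN / (TN + FP)
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }

# HYPOTHETICAL counts: 100 true repeat structures among 10,000 candidates.
m = binary_metrics(tp=94, fp=30, tn=9870, fn=6)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

With these made-up counts, only 30 of 9,900 negatives are misclassified, so specificity stays above 99%, yet those 30 false positives are a sizeable fraction of the 124 predicted positives, which pulls precision down to about 76%.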
Table of Contents
Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Rapid growth of protein structures
1.2 Periodic/non-periodic classification of protein structures
1.2.1 Major categories for repeat structures
1.3 Characteristics of α-solenoid repeat structures
1.3.1 HEAT repeat structures
1.3.2 ANK repeat structures
1.3.3 ARM repeat structures
1.3.4 TPR repeat structures
1.4 Structural database for tandem repeats
2 Materials and Methods
2.1 Datasets
2.2 System configuration
2.3 Segmentation module
2.3.1 Segmentation processes for candidate alpha helix elements
2.3.2 Recovering processes for identified alpha helix elements
2.3.3 Analysis of identified repeat units
2.4 Classification module
2.4.1 ANK repeat characteristics
2.4.2 ARM repeat characteristics
2.4.3 TPR repeat characteristics
2.4.4 HEAT repeat characteristics
3 Results
3.1 Predicted results of the proposed system
3.2 Performance evaluation
3.2.1 Performance of automatic segmentation module
3.2.2 Performance of automatic classification module
3.3 Performance of α-solenoid repeat detection
3.4 System Comparison with ANKPred
4 An on-line system of ARCS
5 Conclusions
References


