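Graduate student: 孫江曼
Graduate student (English): Chiang-Man Sun
Thesis title: 基於蛋白質二級結構間距離矩陣之多重結構排比
Thesis title (English): Distance Matrix Analysis of Mutual Secondary Structure Pairs for Multiple Structure Alignment
Advisor: 白敦文
Advisor (English): Tun-Wen Pai
Degree: Master's
Institution: National Taiwan Ocean University (國立臺灣海洋大學)
Department: Department of Computer Science and Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Year of publication: 2009
Graduation academic year: 97
Language: English
Number of pages: 38
Keywords (Chinese): 距離矩陣; 區域搜尋; 多重蛋白質結構排比; 角度距離圖; 疊代精煉演算法
Keywords (English): distance matrix; local region search; multiple protein structure alignment; angle-distance map; iterative refinement algorithm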
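In biology, researchers analyze protein sequences, structures, and surfaces to understand protein-protein interactions and evolutionary relationships. Because protein sequence and structure databases are growing rapidly, using bioinformatics algorithms to help analyze the relationships among proteins has become an urgent and important problem. The traditional approach compares protein sequences, but studies have found that some proteins with low sequence similarity are nevertheless highly similar once they fold into structures. Sequence alignment is ill-suited to such cases, so this thesis proposes a multiple protein structure alignment algorithm based on protein secondary structure. The algorithm is built on matrix features formed by the relative spatial distances and angles between secondary structure segments, and it uses the similarity of these matrices as the basis for identifying conserved characteristics among proteins. First, a best-fit vector transformation converts each secondary structure segment into a vector. Geometric features are then computed for every pair of vectors, including the distance matrix, the distance, and the angle, and the features of each pair of secondary structure vectors are recorded as one feature point on an angle-distance map. Next, region filtering and matrix comparison select the three best candidate point pairs between protein structures. The selected secondary structure pairs determine a vector translation and rotation that place the protein structures in suitable initial positions, and an iterative refinement algorithm then produces the best multiple protein structure alignment. The algorithm was validated on the SCOP, Homstrad, and SABmark databases, and protein families with low sequence similarity were selected for comparison with other well-known systems. The experimental results show that fast and accurate alignments are obtained even when sequence similarity is below 20%.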
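As an illustration of the angle-distance map described above, the following minimal sketch computes one (angle, distance) feature point for each pair of secondary structure vectors. It is not the thesis implementation: the representation of each segment as a (start, end) pair of 3D points, the use of midpoint-to-midpoint distance, and the names `pair_feature` and `angle_distance_map` are all assumptions made for this example.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def midpoint(a, b):
    return tuple((x + y) / 2.0 for x, y in zip(a, b))

def pair_feature(sse_a, sse_b):
    """Return (angle in degrees, midpoint distance) for two SSE vectors.

    Assumption: each SSE is already reduced to a (start, end) pair of
    3D points; the thesis' own vector fitting step is not reproduced here.
    """
    dir_a = sub(sse_a[1], sse_a[0])
    dir_b = sub(sse_b[1], sse_b[0])
    cos_t = dot(dir_a, dir_b) / (norm(dir_a) * norm(dir_b))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    angle = math.degrees(math.acos(cos_t))
    dist = norm(sub(midpoint(*sse_a), midpoint(*sse_b)))
    return angle, dist

def angle_distance_map(sses):
    """Map each unordered SSE index pair (i, j) to its feature point."""
    return {
        (i, j): pair_feature(sses[i], sses[j])
        for i in range(len(sses))
        for j in range(i + 1, len(sses))
    }

# Two hypothetical, perpendicular segments.
sses = [((0, 0, 0), (10, 0, 0)), ((0, 5, 0), (0, 5, 10))]
features = angle_distance_map(sses)
```

Because the features depend only on relative geometry within one protein, they are unchanged by rigid motion of the whole structure, which is what makes such feature points comparable across proteins before any superposition.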
In biology, researchers study protein interactions and evolutionary relationships by analyzing protein sequences, structures, and surfaces. However, the number of protein sequences and resolved structures continues to grow exponentially, so designing bioinformatics algorithms that analyze the relationships among proteins efficiently and effectively has become one of the most important issues. The traditional method for protein comparison is sequence alignment, but a number of protein sets have low sequence identity yet share functional or structural similarities, and sequence alignment alone cannot resolve such cases. Hence, this study developed a multiple structure alignment system based on secondary structure information. The main approach of the proposed system is to characterize the mutual correlation of secondary structure element (SSE) pairs within a protein structure. The algorithm exploits local matching through distance matrix matching criteria to extract suitable candidate SSE pairs from each protein. Similarity scores of the compared distance matrices of mutual SSE pairs are calculated and ranked to select three representative points as key anchors for multiple structure alignment. Based on these three key anchor points, translation and rotation transformations are performed to obtain an initial alignment, which is followed by an iterative refinement procedure that seeks an optimal solution. The method was verified on the SCOP, Homstrad, and SABmark benchmark databases. Several cases with low sequence identity were compared with well-known protein structure alignment tools. The results showed that, for these low-sequence-similarity protein sets, the average number of aligned residues increased and the root-mean-square deviation (RMSD) decreased.
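The initial placement step, translating and rotating one structure onto another from three anchor points, can be sketched as follows. This is an illustrative simplification rather than the thesis' procedure: it builds an orthonormal frame from each anchor triple and maps one frame onto the other, which matches the first anchor exactly instead of minimizing a least-squares error. The function names `frame`, `superpose`, and `rmsd` are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    return scale(a, 1.0 / math.sqrt(dot(a, a)))

def frame(p0, p1, p2):
    """Right-handed orthonormal frame from three non-collinear points."""
    e1 = unit(sub(p1, p0))
    v = sub(p2, p0)
    e2 = unit(sub(v, scale(e1, dot(v, e1))))  # Gram-Schmidt step
    return e1, e2, cross(e1, e2)

def superpose(mobile_anchors, target_anchors):
    """Return a function that moves mobile coordinates onto the target."""
    fm, ft = frame(*mobile_anchors), frame(*target_anchors)
    om, ot = mobile_anchors[0], target_anchors[0]

    def transform(p):
        d = sub(p, om)
        local = [dot(d, e) for e in fm]  # coordinates in the mobile frame
        out = ot
        for c, e in zip(local, ft):      # rebuild them in the target frame
            out = add(out, scale(e, c))
        return out

    return transform

def rmsd(xs, ys):
    """Root-mean-square deviation between two equal-length point lists."""
    total = sum(dot(sub(x, y), sub(x, y)) for x, y in zip(xs, ys))
    return math.sqrt(total / len(xs))

# Hypothetical data: the mobile points are the target points rotated
# 90 degrees about the z axis and shifted by (5, 5, 5).
target = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
mobile = [(5, 5, 5), (5, 6, 5), (4, 5, 5), (5, 5, 6)]
transform = superpose(mobile[:3], target[:3])
moved = [transform(p) for p in mobile]
```

With noisy anchors the frame-based placement is only a starting point, and an iterative refinement stage would then re-estimate the correspondence and the transformation to lower the RMSD.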
Abstract (Chinese)
Abstract
Acknowledgments
List of figures
List of tables
1. Introduction
2. Protein structure alignment
3. System architectures
4. System module description
4.1. Data Preprocessing
4.2. Vector Transformation
4.3. Intra-relationship analysis
4.4. Target protein determination
4.5. Inter-relationship analysis
4.6. Constrained multiple structure feature alignment
4.7. Iterative refinement techniques
5. Experiment results
6. Conclusions
7. References