(3.80.6.131) 您好!臺灣時間:2021/05/15 01:51
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果

詳目顯示:::

我願授權國圖
: 
twitterline
研究生:陳尹
研究生(外文):Yin Chen
論文名稱:以橢圓群集為基礎的蛋白質結構比對演算法之研究
論文名稱(外文):A Protein Structure Comparison Method Based on Hyper-Ellipsoidal Clusters
指導教授:歐陽彥正歐陽彥正引用關係
指導教授(外文):Yen-Jen Oyang
學位類別:碩士
校院名稱:國立臺灣大學
系所名稱:資訊工程學研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2004
畢業學年度:92
語文別:中文
論文頁數:53
中文關鍵詞:二級結構動態規劃橢圓群集幾何雜湊蛋白質結構比對
外文關鍵詞:secondary structuredynamic programmingalignmentgeometric hshingprotein structure comparison
相關次數:
  • 被引用被引用:2
  • 點閱點閱:87
  • 評分評分:
  • 下載下載:0
  • 收藏至我的研究室書目清單書目收藏:0
本論文介紹一個以橢圓群集為基礎的二階段式蛋白質結構比對演算法,二階段式的設計是為了增加比對的速度。第一階段利用啟發式演算法找出粗略比對,以減少結構比對的次數;第二階段利用修正演算法,使得粗略比對修正為最佳比對。橢圓群集則為嶄新的概念,除了利用二級結構的ㄠ袹憿Bβ折板資訊之外,更加上了蛋白質結構當中其餘彎曲部分的資訊,藉以增加比對的準確率,並獲得具有生物意義的結果。
本論文研究的演算法,針對各種類型的蛋白質結構,皆能有效地找出結構的對應結果。本論文採取三組不同的資料集,針對一對多、多對多進行比對實驗,藉此檢驗演算法在結構比對上表現。當中並以十個困難個案這組基準資料集,與Dali、CE、VAST、ProSup、FLASH等演算法進行比較,發現本文演算法的確在整體比對、局部比對之中,皆能得到更好的比對效果。同時我們不僅計算出各個比對結果的相似度,並呈現出蛋白結構實際的對應結果,確認演算法能夠找出蛋白質之間具有生物意義的共同結構。
本論文所提出的演算法之時間複雜度,決定於比對演算法的選擇,利用動態規劃以及幾何雜湊時,分別為O(n4)和O(n3)。與其他結構比對演算法相比,除FLASH之外,本論文之結構比對演算法時間複雜度均相等或低於,其餘著名的以胺基酸片段為基礎的比對演算法。
This thesis reports a study on a two-stage protein structural alignment algorithm based on hyper-ellipsoidal clusters. The design of the two-stage algorithm is aimed at improving the efficiency of protein structural alignment without trading off analysis accuracy. In the first stage of the proposed approach, hyper-ellipsoidal clusters are employed to model the substructures of random coils as well as the ?helix and β-sheet structures. Due to this practice, the number of possible alignments of two protein tertiary structures to be examined in the first stage of analysis is substantially reduced and, as a result, the efficiency of the alignment operation is greatly improved. In the second stage of analysis, a refinement algorithm is invoked to fine-tune the alignment outputted by the first stage. The main distinction of the approach proposed in this thesis, in comparison with the existing approaches, is that the structural information of random coils is exploited so that the accuracy of analysis is not traded for efficiency. This thesis also reports the experiments conducted to evaluate the performance of the proposed approach. Experimental results reveal that protein structure alignment based on the hyper-ellipsoidal clusters generally achieves higher accuracy in both global alignment and local alignment than the existing algorithms.
第一章 緒論 - 1 -
1.1蛋白質結構比對的興起 - 1 -
1.2 研究動機及目的 - 1 -
1.3 論文架構 - 2 -
第二章 蛋白質結構比對之相關研究 - 3 -
2.1 定義 - 3 -
2.1.1 剛體重疊 - 3 -
2.1.2 相似度 - 3 -
2.1.3 二級結構 - 4 -
2.1.4 整體比對、局部比對 - 4 -
2.2 相似度 - 4 -
2.2.1對應殘基數 - 4 -
2.2.2 Root Mean Square Error - 5 -
2.2.3 P value - 6 -
2.3 蛋白質結構比對的類型 - 6 -
2.3.1 內容型 - 6 -
2.3.2 非順序型 - 7 -
2.3.3 順序型 - 7 -
2.4 剛體重疊 - 8 -
2.4.1 直接比對 - 9 -
2.4.2 兩階段式比對 - 9 -
2.5 以胺基酸序列片段為基礎的蛋白質結構比對演算法 - 10 -
2.5.1 ProSup - 11 -
2.5.2 FLASH - 11 -
2.5.3 討論 - 13 -
2.6 幾何雜湊 - 14 -
2.7 複雜度分析 - 16 -
第三章 以橢圓模型為基礎的蛋白質結構比對演算法 - 17 -
3.1 前言 - 17 -
3.2 演算法簡介 - 17 -
3.1演算法第一步驟:分群 - 18 -
3.3.1 以二級結構為基礎的分群 - 18 -
3.3.2 以橢圓群集為基礎的分群 - 20 -
3.3.3 群集之幾何意義 - 23 -
3.2演算法第二步驟:結構比對 - 26 -
3.2.1結構比對演算法 - 27 -
3.2.2評分方式 - 28 -
3.3演算法第三步驟:修正 - 32 -
3.3.1修正演算法 - 32 -
3.3.2最小平方法 - 34 -
3.3.3討論 - 34 -
3.4複雜度分析 - 34 -
第四章 實驗結果 - 36 -
4.1 實驗一 - 36 -
4.1.1資料集 - 36 -
4.1.2實驗數據 - 36 -
4.1.3 實驗結果 - 37 -
4.2 實驗二 - 37 -
4.2.1 資料集 - 37 -
4.2.2 實驗數據 - 37 -
4.2.3 實驗結果 - 39 -
4.3 實驗三 - 40 -
4.3.1 資料集 - 41 -
4.3.2 實驗數據 - 42 -
4.3.3實驗結果 - 42 -
第五章 結論與展望 - 47 -
5.1 結論 - 47 -
5.2 展望 - 49 -
參考文獻 - 51 -
[1] C. Branden and J. Tooze, Introduction to protein structure. Second edition, Garland Publishing, New York. 1999
[2] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne: The Protein Data Bank. Nucleic Acids Research, vol. 28, pp. 235-242, 2000
[3] H. Frigui, A Scalable Clustering Approach Based on the Synchronization of Pulse-Coupled Oscillators. To appear in Data Mining and Knowledge Discovery.
[4] C. Branden and J. Tooze, Introduction to Protein Structure, Second edition, Garland Publishing, New York, 1999
[5] M. Zuker and R. L. Somorjai, The alignment of protein structures in three dimensions. Bull. Math. Biol., 51, pp. 55-78, 1989
[6] M. Levitt and M. Gerstein, A Unified Statistical framework for Sequence Comparison and Structure Comparison. Proc. Nat. Acad. Sci. USA, vol. 95, pp. 5913-5920, 1998
[7] N.P. Brown, C.A. Orengo and W.R. Taylor, A Protein Structure Comparison Methodology. Computers Chem. Vol. 20, No. 3, pp. 359-380, 1996
[8] S. Rackovsky and D.A. Goldstein, Protein Comparison and Classification:
a Differential Geometric Approach. Proc. Natl. Acad. Sci. USA, 85, pp. 777-781, 1988
[9] A.M. Lesk, Lex – A Lexical AnalyzerGgenerator. Comm. ACM, pp. 219, 1979
[10] A.M. Lesk, Protein Architecture, a Practical Approach. Oxford University Press, pp. 129-160, 1991
[11] S. Subbiah, D.V. Laurents and M. Levitt, structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Current Biology, vol. 3, pp. 141-148, 1993

[12] M. Gerstein and M. Levitt, Using iterative dynamic programming to obtain accurate pairwise and multiple alignments of protein structures. In Proceedings of the Fourth International Conference on Intelligent Systems for Molecular Biology, pp. 59-67, 1996
[13] R.H. Lathrop, The protein threading problem with sequence amino acid interaction preference is NP-complete. Protein Eng., vol. 7, pp. 1059-1068, 1994
[14] W. Kabsch and C. Sander, Dictionary of protein secondary structure: Pattern recognition of hydrogen bonded and geometrical features. Biopolumers, vol. 22, pp. 2577-2637, 1983
[15] P. Lackner, W.A. Koppensteiner, F.S. Domingues and M.J. Sippl, Automated Large Scale Evaluation of Protein Structure Predictions. Proteins, vol. 3, pp. 112-120, 1999
[16] S.C. Shin and Ming-Jing Hwang, Protein structure comparison by probability-based matching of secondary structure elements. Bioinformatics, vol. 19, pp. 735-741, 2003
[17] I. Gath and A.B. Geva, A robust competitive clustering algorithm with applications in computer vision. IEEE Trans. Patt. Analysis Math. Intell, vol 11, pp. 773-781, 1989
[18] J.C. Bezdek, J. Keller, R. Krishnapuram and N.R. Pal, Fuzzy models and algorithms for pattern recognition and image processing. Kluwer Academic Publishers, Boston. 1999
[19] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press. 1998
[20] R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification. Second Edition, Wiley-Interscience Publication. 2001
[21] S.B. Needleman anf C.D. Wunsch, A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol., vol. 48, pp. 443-453. 1970
[22] J.E. Haim, Geometric hashing: an overview. IEEE Comput. Sc. and Eng .vol. 4, pp. 10-21. 1997
[23] Zhengyou Zhang, Iterative point matching for registration of free-form curves. Robotique, Image et Vision. 1992
[24] D. Fisher, A. Eloffson, D. Rice and D. Eisenberg, Assessing the performance of fold recognition methods by means of comprehensive benchmark. Proceedings of the 1st pacific Symposium on Biocomputing. World scientific Publishing Company, Singapore, pp. 300-318. 1996
[25] L. Holm and C. sander, Dali: a network tool for protein structure comparison. Trends biochem. Sci., vol. 20, pp. 478-480. 1995
[26] I.N. Shindyalov and P.E. Bourne, Protein structure alignment by imcremental combinatorial extension (CE) of the optimal path. Protein Eng., vol. 11, pp. 739-747. 1998
[27] T. Madej, J.F. Gibrat and S.H. Btyant, Threading a database of protein cores. Proteins: struct. Funct. Genet., vol. 23, pp. 356-369. 1995
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊