
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

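Graduate student: I-Cheng Li (李亦宸)
Thesis title: Secondary Structure Prediction of Molecular Recognition Features in Intrinsically Disordered Proteins
Thesis title (Chinese): 無穩定構型蛋白質中Molecular Recognition Features之二級結構預測
Advisor: Chih-Hao Lu (陸志豪)
Degree: Master's
Institution: China Medical University (中國醫藥大學)
Department: Graduate Institute of Basic Medical Science, Master's Program (基礎醫學研究所碩士班)
Discipline: Medicine and Health
Field: Medicine
Thesis type: Academic thesis
Year of publication: 2014
Academic year of graduation: 102 (ROC calendar)
Language: Chinese
Number of pages: 36
Keywords (Chinese): 無穩定構型蛋白 (intrinsically disordered proteins); 基因演算法 (genetic algorithm)
Keywords (English): Molecular recognition features; Intrinsically disordered protein regions; Genetic Algorithm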
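Molecular recognition features (MoRFs) are binding sites located within intrinsically disordered protein regions (IDPs). On binding to a specific molecule or protein, these short segments change from a disordered state to a temporarily stable structure, and it is through this structural change that they carry out their functions in the cell. Although these sequences normally lack a stable structure, they play a very important role in many cellular signaling and regulatory processes and are associated with many genetic diseases. Identifying MoRFs is therefore very important for understanding and making use of these proteins, and many studies and methods for identifying such segments have already been developed. Because the structure of a protein is closely tied to its function, it is also important to develop a highly accurate method for predicting the structure of MoRFs. Knowing the structure of these sequences may help us understand more deeply the properties and applications of proteins that lack a stable conformation.
In our study, we used only the sequences of MoRF segments to build a method for predicting the secondary structure of several types of MoRFs. MoRF sequences were extracted from intrinsically disordered proteins, 28 different feature groups were built using five different ways of deriving features, and a genetic algorithm was then used to select the better features. The secondary structure of each sequence was predicted by analyzing these feature values. All sequences used in the experiments came from the Protein Data Bank.
Our method achieved true positive rates of 96.4% and 95% on the TRAINING421 and TEST419 datasets, respectively.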
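The true positive rate reported above is the fraction of segments of a given class that the predictor labels correctly, TP / (TP + FN). The following is a minimal sketch of that calculation; the function name, the class labels "H", "E" and "C", and the example data are illustrative assumptions and are not taken from the thesis.

```python
def true_positive_rate(y_true, y_pred, positive):
    """Return TP / (TP + FN) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actual class is `positive` and so is the prediction.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False negatives: actual class is `positive` but the prediction differs.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no examples of the positive class")
    return tp / (tp + fn)


# Hypothetical labels for five segments (H = helix, E = strand, C = coil).
actual = ["H", "H", "E", "C", "H"]
predicted = ["H", "E", "E", "C", "H"]
helix_tpr = true_positive_rate(actual, predicted, "H")  # 2 of 3 helices found
```

For a multi-class problem such as secondary structure prediction, the same function can be called once per class to obtain a per-class rate.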

Molecular recognition features (MoRFs) are short binding regions located in long intrinsically disordered protein regions (IDPs) that bind to proteins and carry out their functions via disorder-to-order transitions. Although these short regions lack stable structures in their natural state, they play critical roles in molecular signaling and regulation in the cell, and are associated with many human genetic diseases. Identifying MoRFs is therefore an important step in understanding the function of these proteins, and many studies and tools have been developed for this purpose. Because the structure of a protein is closely tied to its function, it is also necessary to develop a highly accurate method to predict the structures of MoRFs. In this research, we focus on developing a method to predict the structures of MoRFs using only sequence information. The sequences of intrinsically disordered proteins with MoRFs were used to generate features, and these features were then filtered by genetic algorithms. By analyzing the selected features, an accurate way to predict the structures of MoRFs might be found. In our study, the method achieved true positive rates of 0.964 and 0.950 on the TRAINING421 and TEST419 datasets, respectively.
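The abstract states that a genetic algorithm filters the generated features, and the table of contents lists selection, mutation and crossover operators. The sketch below shows one common way such a filter can be set up: a bit mask over the 28 feature groups evolved with tournament selection, single-point crossover and bit-flip mutation. The operator variants, all parameter values and the `toy_fitness` function are assumptions made for illustration; in the thesis the fitness would presumably come from the support vector machine's prediction performance, which is not reproduced here.

```python
import random

N_GROUPS = 28  # number of feature groups mentioned in the abstract


def toy_fitness(mask):
    # Hypothetical stand-in for classifier performance: pretend only the
    # first five groups are informative and every kept group has a small cost.
    return sum(mask[:5]) - 0.1 * sum(mask)


def tournament(population, fitness, rng, k=3):
    # Selection operator (assumed variant): best of k random individuals.
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    # Crossover operator (assumed variant): single cut point.
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(mask, rng, rate=0.02):
    # Mutation operator (assumed variant): flip each bit with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def select_features(fitness, n_bits=N_GROUPS, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = []
        for _ in range(pop_size - 1):
            a = tournament(population, fitness, rng)
            b = tournament(population, fitness, rng)
            children.append(mutate(crossover(a, b, rng), rng))
        # Elitism: carry the best mask forward so fitness never decreases.
        population = [best] + children
        best = max(population, key=fitness)
    return best


best_mask = select_features(toy_fitness)
kept_groups = [i for i, bit in enumerate(best_mask) if bit]
```

Here `kept_groups` lists the indices of the feature groups retained by the best mask. Replacing `toy_fitness` with a cross-validated classifier score would turn the sketch into a wrapper-style feature selector.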

Contents
Chinese Abstract
Abstract
Acknowledgements
Contents
1. Introduction
2. Material and Methods
2.1 Datasets
2.2 Feature extraction method
2.3 Support Vector Machines
2.4 Genetic algorithms
2.4.1 Selection operator
2.4.2 Mutation operator
2.4.3 Crossover operator
2.5 Performance measure
2.6 Procedure for secondary structure prediction
3. Results
3.1 Composition of MoRF sequences
3.2 Secondary structure prediction performance
3.3 Multiple secondary structures: complex
3.4 Secondary structure prediction performance after adjusting
3.5 Features selected from TRAINING421 tested on the TEST419 dataset
3.6 Model trained on TRAINING421 tested on TEST419
3.7 Comparison with previous work
4. Discussion
Tables
Figures
References



