National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

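Graduate student: 林聖富
Graduate student (English name): Shen-fu Lin
Thesis title: 整合支持向量機與遺傳演算法預測雙硫鍵鍵結情形
Thesis title (English): Integration of Support Vector Machine and Genetic Algorithm to Predict Disulfide Bonding Connectivity
Advisor: Dr. 陳玉菁
Advisor (English name): Yu-Ching Chen
Committee members: 胡文品, 朱彥煒, 陳玉菁
Committee members (English names): Wen-Pin Hu, Chu Y.W., Yu-Ching Chen
Oral defense date: 2012-07-23
Degree: Master's
Institution: Asia University (亞洲大學)
Department: Master's Program, Department of Biological and Medical Informatics (生物與醫學資訊學系碩士班)
Discipline: Engineering
Field: Biomedical Engineering
Thesis type: Academic thesis
Publication year: 2013
Graduation academic year: 100 (ROC calendar, i.e. 2011-2012)
Language: Chinese
Number of pages: 43
Chinese keywords: 雙硫鍵, 半胱胺酸, 支持向量機, 基因演算法, 雙硫鍵鍵結情形
English keywords: Disulfide, Cysteine, Support Vector Machine, Genetic Algorithm, Disulfide Connectivity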
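A disulfide bond is a covalent bond formed by oxidation of the SH groups of two cysteines. A cysteine can bond not only with a cysteine that is nearby in the sequence but also with one that is nearby in space. This property helps proteins fold and stabilizes protein structure, and disulfide bonds are also closely tied to the regulation of protein function. The more disulfide bonds a protein has, the more stable its internal structure and the harder that structure is to disrupt. This study uses a genetic algorithm to select features and combines it with a support vector machine (SVM) to predict the disulfide connectivity of proteins. The features include the amino acids surrounding each cysteine, the distance between cysteines along the sequence, evolutionary information from the protein sequence, physical properties of amino acids, and overall amino acid composition. Predictions were made for proteins that have two to five disulfide bonds and whose pairwise sequence similarity is below 30%; the overall accuracy of this study is between 62% and 67%.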
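To give a sense of the size of the prediction task for proteins with two to five disulfide bonds, the following minimal sketch enumerates every candidate connectivity pattern. It assumes that every cysteine takes part in exactly one bond (no free cysteines); the function name `connectivity_patterns` is hypothetical and is not taken from the thesis.

```python
from typing import Iterator, List, Tuple

Pattern = List[Tuple[int, int]]


def connectivity_patterns(cysteines: List[int]) -> Iterator[Pattern]:
    """Yield every way to pair up an even number of cysteines.

    Assumption for this sketch: all cysteines are bonded.
    """
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Pair the first cysteine with each possible partner, then recurse
    # on the cysteines that are still unpaired.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in connectivity_patterns(remaining):
            yield [(first, partner)] + tail


if __name__ == "__main__":
    for bonds in range(2, 6):
        cys = list(range(1, 2 * bonds + 1))  # C1, C2, ..., C(2B)
        count = sum(1 for _ in connectivity_patterns(cys))
        print(f"{bonds} bonds: {count} candidate patterns")
```

For B bonds the count is (2B-1)!!, so the number of candidates grows from 3 patterns at two bonds to 945 at five, which is one reason exact-pattern accuracy tends to fall as the number of bonds rises.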
A disulfide bond is formed between the SH groups of two cysteines, which are oxidized to form a covalent bond. A disulfide bond can form not only between cysteines that are adjacent in sequence but also between cysteines that are spatially close. Disulfide bonds help proteins fold, stabilize protein structure, and are closely related to the regulation of protein function. In general, the more disulfide bonds a protein has, the more stable its structure and the less easily it is disrupted.
In this study, I combine a genetic algorithm (GA) with a support vector machine (SVM) to predict disulfide connectivity. The genetic algorithm is used to select features, and the features I use include cysteine-cysteine coupling, cysteine spacing patterns, the position-specific scoring matrix (PSSM), AAindex, and amino acid content. The proteins whose disulfide connectivity I predict have two to five disulfide bonds, and the pairwise sequence similarity among them is at most 30%. The accuracy of this method is between 62% and 67%.
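The genetic-algorithm feature selection described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the names `select_features` and `toy_fitness` are hypothetical. In the thesis the fitness of a feature subset would presumably be the accuracy of an SVM trained on that subset; here a toy scoring function stands in so that the block runs with the standard library alone.

```python
import random
from typing import Callable, List

# Feature groups named in the abstract; the identifiers are illustrative.
FEATURE_GROUPS = ["cys_cys_coupling", "cys_spacing", "pssm", "aaindex", "aa_content"]

Mask = List[int]  # 1 = feature group used, 0 = feature group left out


def select_features(n_features: int, fitness: Callable[[Mask], float],
                    pop_size: int = 20, generations: int = 30,
                    mutation_rate: float = 0.1, seed: int = 0) -> Mask:
    """Search for a high-fitness feature mask (requires n_features >= 2)."""
    rng = random.Random(seed)

    def tournament(pop: List[Mask]) -> Mask:
        # Pick two random individuals and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: toggle each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(mask: Mask) -> float:
    """Stand-in for SVM accuracy: rewards two hypothetical useful groups."""
    useful = {1, 2}  # assumption made only for this demonstration
    hits = sum(mask[i] for i in useful)
    return hits - 0.5 * (sum(mask) - hits)


if __name__ == "__main__":
    best = select_features(len(FEATURE_GROUPS), toy_fitness)
    print(dict(zip(FEATURE_GROUPS, best)))
```

To move closer to the setup in the abstract, `toy_fitness` would be replaced by a function that trains an SVM on the selected feature groups and returns its cross-validated accuracy.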

Abstract (Chinese)
Abstract
Table of Contents
1. Introduction
2. Literature Review
3. Methods
3.1 Dataset
3.2 Support Vector Machine (SVM)
3.3 Genetic Algorithm (GA)
3.4 Four-Fold Cross-Validation
3.5 Features
3.5.1 Amino acid composition (COM)
3.5.2 Evolutionary information of amino acid composition (PSSMC)
3.5.3 Distance between each pair of cysteines (CON)
3.5.4 Amino acid environment around cysteines (POS)
3.5.5 Evolutionary amino acid environment around cysteines (PSSM)
3.5.6 ALPHA
3.5.7 Amino acid surface area (ASA)
3.5.8 HMMSTR
3.5.9 KA
3.5.10 PB
3.5.11 STR
3.5.12 TCO
3.5.13 SS
3.5.14 AAindex
3.6 Definitions of Qp and Qc
4. Experimental Results
5. Conclusion
6. Figures and Tables
Figure 1. Disulfide bonds in a protein. The sulfur atoms of two cysteines form one disulfide bond. This figure shows two disulfide bonds; the dashed line and the black line indicate the bonds formed. The disulfide connectivity of this protein is (C1-C3, C2-C4). C1, C2, C3, and C4 denote the first, second, third, and fourth cysteines in the protein sequence.
Figure 2. Number of protein sequences for each number of disulfide bonds
Figure 3. Illustration of the SVM hyperplane
Figure 4. SVM error term
Figure 5. Flowchart of feature selection with the genetic algorithm
Figure 6. Four-fold cross-validation procedure
Figure 7. Dihedral angle ALPHA
Table 1. Encoding of the twenty amino acids
Table 2. Encoding of the twenty amino acids with evolutionary information added
Table 3. Surface areas of the twenty amino acids
Table 4. Results of predicting disulfide connectivity with the support vector machine (SVM)
Table 5. Results of predicting disulfide connectivity with the genetic algorithm (GA)
Table 6. Results of predicting disulfide connectivity by combining the features selected by the genetic algorithm and then applying the genetic algorithm (GA) together with the support vector machine
References
