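Graduate student: 張家銘
Graduate student (English): Che-Ming Chang
Thesis title: 利用倒傳遞網路於蛋白質二級結構預測與分析
Thesis title (English): Prediction and Analysis of Protein Secondary Structures with the Back-propagation Neural Networks
Advisor: 陳重臣
Advisor (English): Jong-Chen Chen
Degree: Master's
Institution: National Yunlin University of Science and Technology (國立雲林科技大學)
Department: Master's Program, Department of Information Management
Discipline: Computer Science
Sub-discipline: General Computer Science
Thesis type: Academic thesis
Year of publication: 2003
Academic year of graduation: 91 (ROC calendar)
Language: Chinese
Number of pages: 65
Keywords (Chinese): 倒傳遞網路 (back-propagation network), 胺基酸 (amino acid), 蛋白質立體結構 (protein tertiary structure), 蛋白質二級結構 (protein secondary structure)
Keywords (English): protein tertiary structure, amino acid, back-propagation network, protein secondary structure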
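Understanding the three-dimensional (tertiary) structure of proteins aids the development of new drugs, which has made tertiary structure prediction one of the key research topics in proteomics. Previous approaches to determining protein tertiary structure fall broadly into two categories. The first obtains the tertiary structure directly from the protein's primary structure. Within this category, some methods determine the structure by chemical experiment, such as X-ray crystallography and nuclear magnetic resonance (NMR), while others determine it by theoretical calculation. Experimental methods can solve a protein's tertiary structure precisely, but they cannot be applied to every protein molecule and are very time-consuming, which is why theoretical calculation emerged. Structures determined by calculation, however, are less reliable than those determined experimentally. The second category predicts secondary structure from the primary structure. Given the difficulty of predicting tertiary structure directly from primary structure, these methods aim to reach the tertiary structure by way of secondary structure prediction and the understanding of secondary structure that it provides.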

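Protein secondary structure is made up of segments of the protein's amino acid sequence. A segment forms a secondary structure because certain amino acids within it attract one another through chemical forces such as hydrogen bonds, causing the segment to adopt a particular secondary structure.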

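However, no general rule has yet been established for which amino acids, at which positions and for which segment lengths, give rise to a particular secondary structure. Unlike previous work on secondary structure prediction, this study therefore seeks to identify, in the experimental data set, the key amino acids that cause a sequence segment to form a particular secondary structure (referred to as templates). By organizing these templates, it hopes to discover rules governing secondary structure formation and thereby contribute to the field of protein structure prediction.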

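The tool used in this study is the back-propagation network, a supervised learning algorithm that is widely applied across many fields. In addition to training back-propagation networks and examining how well they classify protein secondary structure data, this study takes the network model with the highest classification accuracy in the experiments and uses it to search for templates. The templates are then organized and the findings of the study are presented.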


Keywords: back-propagation network, amino acid, protein secondary structure, protein tertiary structure
Understanding protein tertiary structure facilitates the development of new medicines, so predicting tertiary structure is one of the important research topics in proteomics. Past methods for determining protein tertiary structure can be divided broadly into two approaches. The first goes directly from the protein primary structure to the tertiary structure. Within this approach, some methods use chemical experiments, e.g. X-ray crystallography and nuclear magnetic resonance (NMR), and others use theoretical calculation. Chemical experiments can resolve the tertiary structure precisely, but they are time-consuming and are not applicable to every protein molecule, which is why theoretical calculation appeared. Structures obtained experimentally are nevertheless more accurate than those obtained by theoretical calculation. The second approach predicts protein secondary structure from the primary structure, because predicting tertiary structure directly from primary structure is difficult. The tertiary structure is then predicted through the prediction and understanding of secondary structure.

Protein secondary structure is built from segments of the protein's amino acid sequence. A segment forms a secondary structure because certain amino acids in the segment establish chemical interactions such as hydrogen bonding. These amino acids attract each other, and the segment takes on a particular kind of secondary structure.

No rule has yet been induced for what segment length, and which amino acids at which positions, produce a particular protein secondary structure. This research therefore differs from past secondary structure prediction: it seeks to find, in the experimental data set, the key amino acids (templates) that make a segment form a particular secondary structure. By arranging these templates, it aims to discover rules about the formation of protein secondary structure and to contribute to the prediction of protein tertiary structure.
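
To make the idea of a sequence segment concrete, the sketch below cuts an amino acid sequence into fixed-length segments and turns each segment into a numeric vector that a network could take as input. It is an illustration only: the abstract does not describe the thesis's two encoding schemes, so the one-hot (orthogonal) encoding, the residue ordering in AMINO_ACIDS, and the function names are all assumptions, not the author's method.

```python
# Hypothetical illustration: one-hot encoding of sequence segments.
# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sliding_windows(sequence, width):
    """Return every contiguous segment of the given width."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def one_hot_window(segment):
    """Encode a segment as a flat 0/1 vector, 20 entries per residue.

    Residues outside the 20 standard letters are encoded as all zeros.
    """
    vector = []
    for residue in segment:
        bits = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            bits[index] = 1
        vector.extend(bits)
    return vector


if __name__ == "__main__":
    for segment in sliding_windows("ACDEF", 3):
        print(segment, sum(one_hot_window(segment)))
```

Under this assumed scheme, each position in the segment owns its own block of 20 inputs, so the weights a trained network assigns to a block indicate which residues matter at that position. That is one possible way to look for position-specific key residues of the kind the abstract calls templates.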

The instrument of this research is the back-propagation network, a supervised learning algorithm applied in many domains. The research not only trains back-propagation networks to observe how well they classify protein secondary structure data, but also takes the network model with the highest classification accuracy from the experiments and uses it in the template-finding experiment. Finally, the templates are arranged and the findings of the research are presented.
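
For readers unfamiliar with back-propagation, the following is a minimal sketch of a one-hidden-layer network trained by plain gradient descent on a squared-error loss. It is not the thesis's model: the table of contents indicates the experiments used MATLAB training functions, whereas the class name TinyBackpropNet, the layer sizes, the learning rate, and the toy logical-OR data here are assumptions chosen only to show the forward pass, the backward pass, and the weight update.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBackpropNet:
    """Hypothetical one-hidden-layer network with sigmoid units."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per unit; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        """Return (hidden activations, output activations) for input list x."""
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0])))
             for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        h, y = self.forward(x)
        # Output deltas for the loss E = 0.5 * sum((y - t)^2).
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas: output deltas propagated back through w2.
        d_hid = [hj * (1.0 - hj)
                 * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        # Gradient-descent updates (biases are the last weight in each row).
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v

    def loss(self, data):
        return sum(0.5 * sum((yk - tk) ** 2
                             for yk, tk in zip(self.forward(x)[1], t))
                   for x, t in data)


def train(net, data, epochs, lr=0.5):
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target, lr)


# Toy data (logical OR), used only to show that training runs; not protein data.
TOY_DATA = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]

if __name__ == "__main__":
    net = TinyBackpropNet(2, 3, 1)
    print("loss before:", net.loss(TOY_DATA))
    train(net, TOY_DATA, 2000)
    print("loss after:", net.loss(TOY_DATA))
```

For a secondary structure classifier, the inputs would be an encoded sequence segment and the outputs one unit per structure class. Those dimensions depend on the thesis's encoding and class definitions, which the abstract does not state.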


Keywords: back-propagation network, amino acid, protein secondary structure, protein tertiary structure
Table of Contents
Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Contributions
1.4 Research Limitations
Chapter 2. Literature Review
2.1 Amino Acids
2.2 Proteins
2.3 Protein Structure
2.3.1 Primary Structure of Proteins
2.3.2 Secondary Structure of Proteins
2.3.3 Tertiary Structure of Proteins
2.3.4 Quaternary Structure of Proteins
2.4 Protein Databases
2.5 Protein Structure Prediction
2.5.1 Experimental Methods
2.5.1.1 X-ray Crystallography
2.5.1.2 Nuclear Magnetic Resonance
2.5.2 Theoretical Models
2.5.2.1 Homology Modeling
2.5.2.2 Threading
2.5.2.3 Ab Initio Methods
2.5.3 Secondary Structure Prediction
2.5.3.1 Prediction Methods Based on Amino Acid Properties
2.5.3.2 Prediction Methods Based on Amino Acid Sequence Segments
2.5.3.3 Methods Based on Alignment and on Combining Multiple Prediction Tools
2.5.4 Websites for Protein Databases and Prediction Tools
2.6 Back-propagation Networks
2.7 Back-propagation Network Functions Provided by MATLAB
2.7.1 Conjugate Gradient Algorithms
2.7.1.1 Fletcher-Reeves Update (traincgf)
2.7.1.2 Polak-Ribiere Update (traincgp)
2.7.1.3 Powell-Beale Restarts (traincgb)
2.7.1.4 Charalambous Search
2.7.1.5 Scaled Conjugate Gradient Algorithm
2.7.2 Back-propagation with an Adaptive Learning Rate
2.7.2.1 Adaptive Learning Rate Algorithm: traingda
2.7.2.2 Adaptive Learning Rate Algorithm: traingdx
2.7.3 Resilient Back-propagation (trainrp)
2.7.4 One-Step Secant Method (trainoss)
Chapter 3. Research Framework and Experimental Analysis
3.1 Experimental Assumptions
3.2 Rationale for the Experiments
3.3 Problems Encountered in the Experiments
3.4 Purpose of the Experimental Design
3.5 Research Procedure
3.5.1 Obtaining the Proteins
3.5.2 Obtaining the Secondary Structure Data
3.5.3 Removing Duplicate and Mutually Exclusive Data
3.5.4 Training and Test Data Sets
3.5.5 Data Encoding and Preprocessing
3.5.5.1 First Encoding Scheme
3.5.5.2 Second Encoding Scheme
3.5.6 Training and Testing the Back-propagation Network
3.5.7 Evaluating Secondary Structure Classification
3.5.8 Selecting the Best-Performing Model
3.5.9 Back-propagation Network Training and Test Results
3.5.10 Model Stability Test
3.5.11 Model Stability Test Results
3.5.12 Finding Templates in the Data Set
3.5.13 How the Templates Were Organized
3.5.14 Research Contributions
3.5.15 Differences from and Comparison with Previous Studies
Chapter 4. Conclusions and Suggestions
4.1 Conclusions
4.2 Future Research Directions
References
Appendix 1. Summary Table of Templates
Appendix 2. Experimental Data Sets