跳到主要內容

臺灣博碩士論文加值系統

(216.73.216.81) 您好!臺灣時間:2025/10/05 05:08
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:鐘詩維
研究生(外文):Shih-Wei Chung
論文名稱:混合遺傳演算及動態規劃之方法解決多重序列排比的問題
論文名稱(外文):A Hybrid Approach for Multiple Sequence Alignment Using Genetic Algorithms and Heuristic Dynamic Programming
指導教授:賴智錦賴智錦引用關係
指導教授(外文):Chih-Chin Lai
學位類別:碩士
校院名稱:樹德科技大學
系所名稱:資訊管理研究所
學門:電算機學門
學類:電算機一般學類
論文種類:學術論文
論文出版年:2003
畢業學年度:91
語文別:中文
論文頁數:87
中文關鍵詞:多重序列排比遺傳演算法動態規劃法計算分子生物學
外文關鍵詞:Multiple Sequence AlignmentGenetic AlgorithmsDynamic ProgrammingComputational Molecular Biology
相關次數:
  • 被引用被引用:3
  • 點閱點閱:536
  • 評分評分:
  • 下載下載:26
  • 收藏至我的研究室書目清單書目收藏:2
多重DNA或是蛋白質序列的排比在計算分子生物學中,扮演著相當重要的角色。它不僅能夠找出序列之間的相似程度,更可以透過蛋白質的分析來發現其生物序列間的功能與結構。但是隨著序列的數量及長度的增加,找尋合適/最佳排比結果的困難度亦隨之增加。另外,因為某些分析工具無法達到生物分析人員之需求,使得研究者只好求助於其它序列編輯的工具,透過人工的方式對序列的結果進行修正。
為了克服多重序列排比之相關問題,本論文所提出之混合方式,乃結合遺傳演算法在解決複雜問題上的特殊能力,及動態規劃法在兩兩排比(pairwise alignment)上的準確性,共同合作找尋最佳化的排比品質。除了利用啟發式的動態規劃法來產生一條種子染色體外,並且運用自行開發及既有之遺傳運算元(operator)來進行序列的排比,最後再以重新修正(realignment)的功能進行序列排比後的調整,經過反覆的演化過程之後,期望能夠提高序列排比的品質,並且加強採用遺傳演算法為主之多序列排比技術。
在實驗的過程中,將分為人工資料、真實資料及統整性分析,並將本論文所提之方法,與各類不同之軟體進行比較。從實際的結果顯示,所提方法的確適合應用在解決多重序列排比的問題。
Multiple sequence alignment (MSA) is a fundamental problem in the study of computational molecular biology and refers to the search for maximal similarity overall DNA /protein sequences. Unfortunately, existed multiple sequence alignment algorithms work well for sequences with high similarity but do not scale well when either the length or the number of the sequences is large. In this paper, we view the multiple sequence alignments problem as an optimization problem and propose a hybrid approach to solve it. The hybrid approach combines genetic algorithm and heuristic dynamic programming method.
Genetic Algorithms (GAs) has been used in a wide variety of applications to find solutions for hard optimization problems. The purpose of the heuristic dynamic programming is used to provide a seed for the initial population used in GAs. We design five kinds of problem-dependent mutation operators and simple realignment rules used in evolution process. The proposed approach has been test on several sets of artificially generated sequences and real biological sequences. The experimental results are compared with a variety of MSA methods to illustrate the effectiveness of the proposed approach.
中文摘要 i
英文摘要 ii
誌 謝 iii
目 錄 iv
圖目錄 vi
表目錄 vii
第一章 導論 1
1.1 研究背景 1
1.2 研究動機與目的 3
1.3 論文架構 4
第二章 文獻探討 5
2.1 兩條與多條序列排比 5
2.2 序列資料格式 18
2.3 遺傳演算法 20
2.3.1 遺傳演算法在多重序列排比上的應用 23
第三章 系統架構與方法 26
3.1 系統架構與流程 26
3.2 問題描述與染色體編碼 26
3.3 啟發式初始化染色體 28
3.4 評估函數 31
3.5 遺傳演算法各類運算元 32
3.5.1 選擇與菁英政策 32
3.5.2 交配 33
3.5.3 突變 34
3.5.4 重新修正 46
第四章 實驗結果與分析 50
4.1 測試資料 51
4.2 實驗結果 54
4.2.1 人工序列資料實驗結果 56
4.2.2 真實序列資料實驗結果 61
4.2.3 實驗數據之統整性分析 65
4.2.4 演化收斂代數分析 69
4.3 啟發式動態規劃法選擇方式之驗證 71
4.4 時間複雜度分析 76
第五章 結論 80
參考文獻 81
附錄一 86
[1] 黃崑峰,利用叢集分類進行多重序列排列,國立中山大學,碩士論文,2001。
[2] 楊兵河,應用遺傳演算法解決蛋白質多重序列排比問題,國立中央大學,碩士論文,2001。
[3] 葉承銓,應用適應性基因演算法於資料分群的問題,樹德科技大學,碩士論文,2002。
[4] C. Gibas and P. Jambeck,生物資訊學電腦技術,林仲彥、李士傑、陳淑華、OSB-TW譯,美商歐萊禮台灣分公司,台北,2002。
[5] S.F. Altschul, T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D.J. Lipman, “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Research, vol. 25, no. 17, pp.3389-3402, 1997.
[6] E.L. Anson and E.W. Myers, “ReAligner: A program for refining DNA sequence multi-alignments,” Proc. of the First Conference on Computational Molecular Biology, pp.9-16, ACM-Press, 1997.
[7] D. Beasley, D.R. Bull, and R.R. Martin, “An Overview of Genetic Algorithms: Part1, Fundamentals,” University Computing, vol. 15, no. 2, pp.58-69, 1993.
[8] P. Bonizzoni and G.D. Vedova, “The complexity of multiple sequence alignment with SP-score that is a metric,” Theoretical Computer Science, vol. 259, no. 1, pp.63-79, 2001.
[9] K. Bucka-Lassen, O. Caprani, and J. Hein, “Combining many multiple alignments in one improved alignment,” Bioinformatics, vol. 15, no. 2, pp.122-130, 1999.
[10] W.L. Chan, C.C. Lim, W.K. Poon, J.M. Xu, L. Yuan, S. Zeng, and S. Zeng, Multiple Sequence Alignment, CS5238: Group Report 4, August, 2002. (http://www.comp.nus.edu.sg/~ksung/cs5238/group_list.htm)
[11] K. Chellapilla and G.B. Fogel, “Multiple sequence alignment using evolutionary programming,” Proc. of the 1999 Congress on Evolutionary Computation (CEC99), vol. 1, pp.445-452, 1999.
[12] M.E. Clamp, J.A. Cuff, and G.J. Barton, “Jalview: analysis and manipulation of multiple sequence alignments,” embnet.news, Hinxton, Cambridge, UK, vol. 15, no. 4, 1998. ( http://www.hgmp.mrc.ac.uk/embnet.news/vol5_4/ )
[13] F. Corpet, “Multiple sequence alignment with hierarchical clustering,” Nucleic Acids Research, vol. 16, no. 22, pp.10881-10890, 1998. (http://prodes.toulouse. inra.fr/multalin/multalin.html)
[14] M. Damsbo, Evolutionary Algorithms in Constrained Sequence Optimization, Maersk Mc-Kinney Moller Institute for Production Technology Odense University, Denmark, January, 1998.
[15] K. Deb, “Genetic algorithms in search and optimization: The technique and applications,” Proc. of International Workshop on Soft Computing and Intelligent Systems, Calcutta, India: Machine Intelligence Unit, Indian Statistical Institute, pp.58-87, 1998.
[16] G. Fuellen, “Multiple alignment,” Complexity International, Department of Computer Science and Biotechnology, University of Bielefeld, vol. 4, 1997. (http://www.csu.edu.au/ci/vol04/mulali/mulali.html)
[17] G.J. Gaskell, “Internet on-ramp - multiple sequence alignment tools on the web,” BioTechniques, vol. 29, pp.60-62, 2000.
[18] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA, 1989.
[19] W.N. Grundy, T.L. Bailey, C.P. Elkan, and M.E. Baker, “Meta-MEME: Motif-based hidden markov models of protein families,” Computer Applications in the Biological Sciences (CABIOS), vol. 13, pp.397-406, 1997.
[20] S. Henikoff and J.G. Henikoff, “Amino acid substitution matrices from protein blocks,” Proc. Natl. Acad. Sci., vol. 89, no. 22, pp.10915-10919, 1992.
[21] S. Henikoff, J.G. Henikoff, W.J. Alford, and S. Pietrokovski, “Automated construction and graphical presentation of protein blocks from unaligned sequences,” Gene, vol. 163, no. 2, pp.GC17-GC26, 1995.
[22] J. Heringa, “Local weighting schemes for protein multiple sequence alignment,” Computers & Chemistry, vol. 26, no. 5, pp.459-477, 2002.
[23] M. Hirosawa, M. Hoshida, and M. Ishikawa, “Protein sequence analysis using knowledge,” Proc. of the Twenty-Sixth Hawaii International Conference on System Sciences, pp.803-812, 1993.
[24] T. Jiang and P. Zhao, “A heuristic algorithm for blocked multiple sequence alignment,” Bio-Informatics and Biomedical Engineering(BIBE), Proc. IEEE International Symposium, pp.176-183, November, 2000.
[25] T. Lassmann and E.L.L. Sonnhammer, “Quality assessment of multiple alignment programs,” FEBS Letters, vol. 529, no. 1, pp.126-130, 2002.
[26] H.-P. Lenhof, B. Morgenstern, and K. Reinert, “An exact solution for the segment-tosegment multiple sequence alignment problem,” Bioinformatics, vol. 15, no. 3, pp.203-210, 1999.
[27] C.-M. Lin, Using Genetic Algorithms to Solve Multiple Sequence Alignments, Master thesis, Institute of Computer Science and Information Engineering, National Central University, 2000.
[28] P.W. Lord, J.N. Sellyey, and T.K. Attwood, “CINEMA-MX: A modular multiple alignment editor,” Bioinformatics, vol. 18, no. 10, pp.1402-1403, 2002.
[29] B. Manthey, “Non-approximability of weighted multiple sequence alignment,” Theoretical Computer Science, vol. 296, no. 1, pp.179-192, 2003.
[30] M.A. McClure, T.K. Vasi, and W.M. Fitch, “Comparative analysis of multiple protein-sequence alignment methods,” Molecular Biology and Evolution, vol. 11, no. 4, pp.571-592, 1994.
[31] B. Morgenstern, A. Dress, and T. Werner, “Multiple DNA and protein sequence alignment based on segment-to-segment comparison,” Nucleic Acids Research, vol. 93, no. 22, pp.12098-12103, 1996.
[32] B. Morgenstern, “DIALIGN 2:improvement of the segment-tosegment approach to multiple sequence alignment,” Bioinformatics, vol. 15, no. 3, pp.211-218, 1999.
[33] C. Notredame and D.G. Higgins, “SAGA: sequence alignment by genetic algorithm,” Nucleic Acids Research, vol. 24, no. 8, pp.1515-1524, 1996.
[34] C. Notredame, E.A. O’Brien, and D.G. Higgins, “RAGA: RNA sequence alignment by genetic algorithm,” Nucleic Acids Research, vol. 25, no. 22, pp.4570-4580, 1997.
[35] C. Notredame, L. Holm, and D.G. Higgins, “COFFEE: an objective function for multiple sequence alignments,” Bioinformatics, vol. 14, no. 5, pp.407-422, 1998.
[36] C. Notredame, D.G. Higgins, and J. Heringa, “T-COFFEE: A novel method for fast and accurate multiple sequence alignment,” Journal of Molecular Biology, vol. 302, no. 1, pp.205-217, 2000.
[37] C. Notredame, “Recent progress in multiple sequence alignment: a survey,” Pharmacogenomics, vol. 3, no. 1, pp.131-144, 2002.
[38] S.W. Perrey, J. Stoye, V. Moulton, and A.W.M. Dress, On Simultaneous versus Iterative Multiple Sequence Alignment, Universität Bielefeld, Forschungsschwerpunkt Mathematisierung - Strukturbildungsprozesse, MaterialienPreprints 111, 1997.
[39] K. Reinert, J. Stoye, and T. Will, “An iterative method for faster sum-of-pairs multiple sequence alignment,” Bioinformatics, vol. 16, no. 9, pp.808-814, 2000.
[40] A.B. Robinson and L.R. Robinson, “Distribution of glutamine and asparagines residues and their near neighbors in peptides and proteins,” Proc. Natl. Acad. Sci., vol. 88, no. 20, pp.8880-8884, 1991.
[41] J. Setubal and J. Meidanis, Introduction to Computational Molecular Biology, PWS, Boston, MA, 1997.
[42] J.S. Sim and K. Park, “The consensus string problem for a metric is NP-complete,” Proc. of the 10th Australasian Workshop on Combinatorial Algorithms (AWOCA), Perth, Australia, pp.107-113, August 1999.
[43] J. Stoye, “Multiple sequence alignment with the divide-and-conquer method,” Gene, vol. 211, no. 2, pp.GC45-GC56, 1998.
[44] W.R. Taylor, G. Saelensminde, and I. Eidhammer, “Multiple protein sequence alignment using double-dynamic programming,” Computers & Chemistry, vol. 24, no. 1, pp.3-12, 2000.
[45] J.D. Thompson, D.G. Higgins, and T.J. Gibson, “CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp.4673-4680, 1994.
[46] J.D. Thompson, T.J. Gibson, F. Plewniak, F. Jeanmougin, and D.G. Higgins, “The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools,” Nucleic Acids Research, vol. 25, no. 24, pp.4876-4882, 1997.
[47] J.D. Thompson, F. Plewniak, and O. Poch, “A comprehensive comparison of multiple sequence alignment programs,” Nucleic Acids Research, vol. 7, no. 13, pp.2682-2690, 1999(a).
[48] J.D. Thompson, F. Plewniak, and O. Poch, “Balibase: a benchmark alignment database for the evaluation of multiple alignment programs,” Bioinformatics, vol. 15, no. 1, pp.87-88, 1999(b).
[49] R. Thomsen, G. B. Fogel, and T. Krink, “A clustal alignment improver using evolutionary algorithms,” Proc. of the Fourth Congress on Evolutionary Computation (CEC2002), vol. 1, pp.121-126, 2002.
[50] M. Tompa, Lecture Notes on Biological Sequence Analysis, Department of Computer Science and Engineering University of Washington, Seattle, Washington, Winter, 2000. (http://www.cs.washington.edu/education/courses/527/00wi/)
[51] Z. Weng, Protein and DNA Sequence Analysis, Boston University, Fall, 2002. (http://sullivan.bu.edu/be561/)
[52] L.D. Wheeler, D.M. Church, A.E. Lash, D.D. Leipe, T.L. Madden, J.U. Pontius, G.D. Schuler, L.M. Schriml, T.A. Tatusova, L. Wagner, and B.A. Rapp, “Database resources of the National Center for Biotechnology Information: 2002 update,” Nucleic Acids Research, vol. 30, no. 1, pp.13-16, 2002.
[53] C. Zhang and A.K.C. Wong, “Toward efficient multiple molecular sequence alignment: A system of genetic algorithm and dynamic programming,” IEEE Trans. on Systems, Man, and Cybernetics-part B: Cybernetics, vol. 27, no. 6, pp.565-581, 1997.
[54] Y. Zhang, Sequence Alignment Methods, Department of Computer Science & Engineering University of Minnesota, Twin Cities, October, 2002.
[55] C.M. Zmasek and S.R. Eddy, “ATV: display and manipulation of annotated phylogenetic trees,” Bioinformatics, vol. 17, no. 4, pp.383-384, 2001.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top