跳到主要內容

臺灣博碩士論文加值系統

(3.231.230.177) 您好!臺灣時間:2021/08/04 03:25
字體大小: 字級放大   字級縮小   預設字形  
回查詢結果 :::

詳目顯示

: 
twitterline
研究生:彭永興
研究生(外文):Yung-Hsing Peng
論文名稱:巢狀弧標記之RNA二級結構比對
論文名稱(外文):The Comparison of RNA Secondary Structures with Nested Arc Annotation
指導教授:楊昌彪楊昌彪引用關係
指導教授(外文):Chang-Biau Yang
學位類別:碩士
校院名稱:國立中山大學
系所名稱:資訊工程學系研究所
學門:工程學門
學類:電資工程學類
論文種類:學術論文
論文出版年:2004
畢業學年度:92
語文別:英文
論文頁數:61
中文關鍵詞:計算生物動態規劃核醣核酸二級結構比對
外文關鍵詞:Computational BiologyRNASecondary StructuresComparisonDynamic Programming
相關次數:
  • 被引用被引用:0
  • 點閱點閱:79
  • 評分評分:
  • 下載下載:2
  • 收藏至我的研究室書目清單書目收藏:0
近年來,RNA的結構比對已經變成生物資訊研究領域的一個重要問題。一般而言,用巢狀弧標記(arc annotation)的方式來表示RNA的結構是個普遍被接受的表示方式。而要比較兩個RNA的結構可以有很多種比較法,比如tree edit distance, longest arc-preserving common subsequence以及stem-based alignment等等。然而,這些比對方法因為時間複雜度較高,所以只能適用於較短的序列。在本論文中,我們提出了一個較精簡的方法來比較兩個RNA的結構,此方法的時間複雜度為O(mn),m及n分別為這兩個RNA結構的序列長度。我們的方法是將RNA的結構先轉換成所謂的「物件序列」,再將這些物件序列進行比對進而得到原來兩個RNA的共同子結構。我們從RNase P Database中下載了118個RNA的結構來針對我們的比對方法進行測試。對於任兩個結構,我們試著用我們的方法進行結構比對,同時也用LCS的方式來進行序列比對,試圖辨識某兩個RNA是否屬於同一個家族。從實驗的結果發現,我們的結構比對法在家族辨識上比傳統的序列比對法擁有更高的準確率以及更快的速率。因此,我們的方法更有生物上的意義,而且在時間複雜度方面也更有效率。
In recent years, RNA structural comparison becomes a crucial problem in bioinformatic research. Generally, it is a popular approach for representing the RNA secondary
structures with arc-annotation sets. Several methods can be used to compare two RNA structures, such as tree edit distance, longest arc-preserving common subsequence
(LAPCS) and stem-based alignment. However, these methods may be helpful only for small RNA structures because of their high time complexity. In this thesis, we propose
a simplified method to compare two RNA structures in O(mn)time, where m and n are the lengths of the two RNA sequences, respectively. Our method transforms the RNA
structures into specific sequences called object sequences, then compare these object sequences to find their common substructures. We test our comparison method with 118 RNA structures obtained from RNase P Database. For any two structures, we try to identify whether they are in the same family by both structure comparison and sequence comparison. In our experiment, we find that our method for comparing RNA structures can yield better hit rates and is faster than the traditional method to compare the RNA sequences. Therefore, our approach for comparing RNA secondary structures is more sensitive in biology and more efficient in time complexity.
ABSTRACT . . . . . . . . . . . . . . . . . . . . 0

Chapter 1. Introduction . . . . . . . . . . . . 1
1.1 DNA . . . . . . . . . . . . . . . . . . . . 1
1.2 RNA . . . . . . . . . . . . . . . . . . . . 2
1.3 RNA Structure Representation . . . . . . . . 2
1.4 Pseudoknots in RNA Structures . . . . . . . 5

Chapter 2. Preliminaries . . . . . . . .. . . . 7
2.1 Related Research on RNA Structure . . . . . 7
2.2 Dynamic Programming . . . . . . . . . . . . 8
2.3 The Longest Common Subsequence Problem . . . 10

Chapter 3. Related Algorithms . . . . . . . . . 14
3.1 Hamming Distance . . . . . . . . . . . . . . 14
3.2 Tree Edit Distance . . . . . . . . . . . . . 16
3.3 Longest Arc-Preserving Common Subsequence .. 18
3.4 Stem-Based Alignment .. . . . . . . . . . . 20

Chapter 4. Our Method . . . . . . . . . . . . . 24
4.1 Merging Arcs into an Object . . . . . .. . . 24
4.2 The Object Sequence . . . . . . . . . . . . 25
4.3 LCS in Object Sequences . . . . . . . . . . 27
4.4 Our Match Function . . . . . . . . . . . . . 27
4.5 OLCS with Different Depths . . . . . . . . . 30
4.6 The Additional Sequence Alignment Score . . 30
4.7 Time Complexity and Space Complexity . . . . 31

Chapter 5. Experimental Results . . . . . .. . . 35
5.1 Definition of Similarity . . . . . . . . . . 35
5.2 Family Relationship . . . . . . .. . . . . . 36
5.3 Distribution of Similarity . . . . . . . . . 36
5.4 Family identification . . . . . . . . . . . 37

Chapter 6. Conclusions . . . . . . . . . . . . . . . . . . 57

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . 59
[1] “Paul Gardner’s Homepage.” http://www.massey.ac.nz/ ppgardne/.
[2] “PseudoBase.” http://wwwbio.leidenuniv.nl/ Batenburg/PKB.html.
[3] “RNase P Database.” http://www.mbio.ncsu.edu/RNaseP/.
[4] J. Alber, J. Gramm, J. Guo, and R. Niedermeier, “Computing the similarity of two
sequences with nested arc annotations,” Theoretical Computer Science, pp. 337–
358, 2004.
[5] M. Andronescu, A. P. Fejes, F. Hutter, H. H. Hoos, and A. Condon, “A new algorithm
for RNA secondary structure design,” Journal of Molecular Biology, pp. 607–624,
2004.
[6] K. Charter, J. Schaeffer, and D. Szafron, “Sequence alignment using FastLSA,”
Mathematics and Engineering Techniques in Medicine and Biological Sciences,
pp. 239–245, 2000.
[7] Y. Ding and C. E. Lawrence, “A statistical sampling algorithm for RNA secondary
structure prediction,” Nucleic Acids Research, pp. 7280–7301, 2003.
[8] S. Dulucq and L. Tichit, “RNA secondary structure comparison: exact analysis of
the zhang-shasha tree edit algorithm,” Theoretical Computer Science, pp. 471–484,
2003.
[9] S. R. Eddy, “A memory-efficient dynamic programming algorithm for optimal alignment
of a sequence to an RNA secondary structure,” BMC Bioinformatics, pp. 3–18,
2002.
[10] P. Evans, “Algorithms and complexity for annotated sequence analysis.” citeseer.
ist.psu.edu/evans99algorithms.html, Department of Computer Science, University
of Victoria, 1999.
59
[11] W. Gruner, R. Giegerich, D. Storthmann, C. Reidys, J. Weber, I. L. Hofacker, P. F.
Stadler, and P. Schuster, “Analysis of RNA sequence structure maps by exhaustive
enumeration,” Santa Fe Institute Working Papers, 1995.
[12] I. L. Hofacker, M. Fekete, and P. F. Stadler, “Secondary structure prediction for
aligned RNA sequences,” Santa Fe Institute Working Papers, 2001.
[13] I. L. Hofacker, W. Fontana, P. F. Stadler, L. S. Bonhoeffer, M. Tacker, and P. Schuster,
“Fast folding and comparison of RNA secondary structures,” Santa Fe Institute
Working Papers, 1993.
[14] J. Jansson and A. Lingas, “A fast algorithm for optimal alignment between similar
ordered trees,” Proceedings of the 12th Annual Symposium on Combinatorial
Pattern Matching, pp. 232–240, 2001.
[15] T. Jiang, G.-H. Lin, B. Ma, and K. Zhang, “A general edit distance between RNA
structures,” Journal of Computational Biology, pp. 371–388, 2002.
[16] V. Juan and C. Wilson, “RNA secondary structure prediction based on free energy
and phylogenetic analysis,” Journal of Molecular Biology, pp. 935–947, 1999.
[17] R. C. T. Lee, S. S. Tseng, R. C. Chang, and Y. T. Tsai, Introduction to the Design
and Analysis of Algorithms. Unalis Corporation, first ed., 1999.
[18] G.-H. Lin, Z.-Z. Chen, T. Jiang, , and J. Wen, “The longest common subsequence
problem for sequences with nested arc annotations,” International Colloquium on
Automata, Languages and Programming, pp. 444–455, 2001.
[19] G.-H. Lin, B. Ma, and K. Zhang, “Edit distance between two RNA structures,” Research
in Computational Molecular Biology, pp. 211–220, 2001.
[20] T. Liu and B. Schmidt, “Parallel RNA sequence-structure alignment,” High Performance
Computational Biology, 2004.
[21] Y. Sakakibara, M. Brown, R. Hughey, I. S. Mian, K. Sjolander, R. C. Underwood,
and D. Haussler, “Stochastic context-free grammars for tRNA modeling,” Nucleic
Acids Research, pp. 5112–5120, 1994.
[22] J. T. Wang, B. A. Shapiro, D. Shasha, K. Zhang, and K. M. Currey, “An algorithm
for finding the largest approximately common substructures of two trees,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, 1998.
60
[23] M.-Y.Wu, C.-B. Yang, and K.-S. Huang, “RNA secondary structure alignment based
on stem representation,” Proceedings of the 21st Workshop on CombinationalMathematics
and Computation Theory, Taichung, Taiwan, pp. 60–69, 2004.
[24] K. Zhang and D. Shasha, “Simple fast algorithms for the editing distance between
trees and related problems,” SIAM Journal on Computing, pp. 1245–1262, 1989.
[25] K. Zhang, L. Wang, and B. Ma, “Computing similarity between RNA structures,”
Combinatorial Pattern Matching, pp. 281–293, 1999.
QRCODE
 
 
 
 
 
                                                                                                                                                                                                                                                                                                                                                                                                               
第一頁 上一頁 下一頁 最後一頁 top
無相關期刊