



Graduate student (English name): Shu-ying Sue
Thesis title (English): The Design and Implementation of Algorithm for Aligning ESTs to Genome with the Low-Frequency and High-Density Index
Advisor (English name): Fang-rong Hsu
Keywords (English): sequence alignment; repetitive sequence; expressed sequence tag
With the development of computer and information technology, much biological research can be facilitated by software, which speeds up both the research itself and the analysis of biological data. In addition, the completion of the Human Genome Project (HGP) has promoted the development of related research. The resulting sequence data need to be analyzed and explored to discover the underlying mechanisms of life, so tools that assist this analysis are needed. To meet this requirement, we aim to provide a method for aligning ESTs (expressed sequence tags) to the genome. The human genome, however, contains repetitive sequences that make up one-tenth of the genome. Most previous work handles these repetitive sequences poorly, or cannot handle them at all. Our research therefore aims to handle both the repetitive and the unique sequences in the genome so that all ESTs can be aligned to the correct regions, and the results can then be used for further research and analysis. In addition, dbEST contains 7,678,812 human ESTs, and aligning all of them takes a long time. We therefore provide different strategies, for aligning a single EST and for aligning all ESTs in dbEST to the genome, that save time while producing results of acceptable accuracy. We consider the low-frequency and high-density (LFHD) index problem in order to locate each EST on the genome. We then propose a heuristic algorithm and use MUGUP to check our results on different test sets of ESTs.
Acknowledgement
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Organization
Chapter 2 Relevant Researches
2.1 Spliced Alignment Problem
2.2 Researches on Mapping and Alignment
2.2.1 SSAHA
2.2.2 EST_GENOME
2.2.3 SIM4
2.2.4 Spidey
2.2.5 BLAT
2.2.6 SQUALL
2.2.7 UM Method
2.2.8 MUGUP
2.2.9 GMAP
2.3 Discussion
Chapter 3 Low Frequency and High Density Index
3.1 Definition
3.2 Idea of Selecting LFHD Index
3.3 The Algorithm for Selecting the LFHD Index
3.4 LFHD Index with Different K-values
Chapter 4 Aligning a Single EST to Genome
4.1 Overview for the EST to Genome Alignment
4.2 Test Sets
4.3 Alignment Results
Chapter 5 Aligning the Entire ESTs in dbEST to Genome
5.1 Problem Statement
5.2 Strategy
Chapter 6 Conclusion
References
[1] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, Vol. 215, pp. 403-410, 1990.
[2] K. M. Chao, J. Zhang, J. Ostell, and W. Miller, “A tool for aligning very similar DNA sequences,” Computer Applications in the Biosciences, Vol. 13, pp. 75-80, 1997.
[3] L. Y. Chen, S. H. Lu, E. S. Shih, and M. J. Huang, “Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences,” Genome Research, Vol. 12, Issue 7, pp. 1106-1111, July 2002.
[4] L. Florea, G. Hartzell, Z. Zhang, G. M. Rubin, and W. Miller, “A computer program for aligning a cDNA sequence with a genomic DNA sequence,” Genome Research, Vol. 8, Issue 9, pp. 967-974, September 1998.
[5] F. R. Hsu and J. F. Chen, “Aligning ESTs to Genome Using Multi-Layer Unique Markers,” IEEE Computational Systems Bioinformatics Conference ’03, pp. 564-567, 2003.
[6] F. R. Hsu, H. Y. Chang, Y. L. Lin, Y. T. Tsai, H. L. Peng, Y. T. Chen, C. F. Chen, C. Y. Cheng, C. H. Liu, and M. Y. Shih, “Genome-Wide Alternative Splicing Events Detection through Analysis of Large Scale ESTs,” Fourth IEEE Symposium on Bioinformatics and Bioengineering (BIBE 2004), pp. 310-316, May 2004.
[7] X. Huang, “On global sequence alignment,” Computer Applications in the Biosciences, Vol. 10, pp. 227-235, 1994.
[8] W. J. Kent, “BLAT – The BLAST-Like Alignment Tool,” Genome Research, Vol. 12, Issue 4, pp. 656-664, April 2002.
[9] E. W. Myers and W. Miller, “Optimal alignments in linear space,” Computer Applications in the Biosciences, Vol. 4, pp. 11-17, March 1988.
[10] R. Mott, “EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA,” Bioinformatics, Vol. 13, no. 4, pp. 477-478, 1997.
[11] W. Miller and E. W. Myers, “A file comparison program,” Software-Practice Experience, Vol. 15, pp. 1025-1040, 1985.
[12] Z. Ning, A. J. Cox, and J. C. Mullikin, “SSAHA: A Fast Search Method for Large DNA Databases,” Genome Research, Vol. 11, pp. 1725-1729, October 2001.
[13] S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, Vol. 48, Issue 3, pp. 443-453, March 1970.
[14] J. Ogasawara and S. Morishita, “Fast and sensitive algorithm for aligning ESTs to human genome,” IEEE Computer Society Bioinformatics Conference, Vol. 1, pp. 43-53, 2002.
[15] S. Schwartz, W. Miller, C. M. Yang, and R. C. Hardison, “Software tools for analyzing pairwise alignments of long sequences,” Nucleic Acids Research, Vol. 19, pp. 4663-4667, 1991.
[16] T. E. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, Vol. 147, Issue 1, pp. 195-197, March 1981.
[17] J. Usuka, W. Zhu, and V. Brendel, “Optimal spliced alignment of homologous cDNA to a genomic DNA template,” Bioinformatics, Vol. 16, no. 3, pp. 203-211, 2000.
[18] S. J. Wheelan, D. M. Church, and J. M. Ostell, “Spidey: a tool for mRNA-to-genomic alignments,” Genome Research, Vol. 11, Issue 11, pp. 1952-1957, November 2001.
[19] T. D. Wu and C. K. Watanabe, “GMAP: a genomic mapping and alignment program for mRNA and EST sequences,” Bioinformatics, Vol. 21, no. 9, pp. 1859-1875, February 2005.