
National Digital Library of Theses and Dissertations in Taiwan

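Graduate student: 蘇淑瑛 (Shu-ying Sue)
Thesis title: The Design and Implementation of Algorithm for Aligning ESTs to Genome with the Low-Frequency and High-Density Index
Thesis title (Chinese): 使用低頻高密度索引對齊表現序列標籤至基因體之演算法設計與實作
Advisor: 許芳榮 (Fang-rong Hsu)
Degree: Master's
Institution: Feng Chia University
Department: Graduate Institute of Information Engineering
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 56
Keywords: expressed sequence tag; sequence alignment; repetitive sequence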
Advances in computing have allowed software to accelerate research and data analysis in many areas of biology, and the completion of the Human Genome Project (HGP) has further stimulated related work. The resulting sequence data must be analyzed to uncover the underlying mechanisms of life, so tools that assist this analysis are needed. To meet this need, we aim to provide a method for aligning human expressed sequence tags (ESTs) to the genome. Repetitive sequences make up about one-tenth of the human genome, and most earlier methods either align ESTs to these repetitive regions inaccurately or cannot handle them at all. Our work therefore aims to handle both the repetitive and the unique sequences in the genome, so that every EST can be aligned to the region it belongs to and the results can support further analysis and research.
In addition, dbEST currently holds 7,678,812 human ESTs, and aligning that many sequences to the genome is very time-consuming. We therefore propose different strategies that save time while keeping accuracy within an acceptable range: one for aligning a single EST and one for aligning all human ESTs in dbEST to the genome. We consider the problem of selecting a low-frequency and high-density (LFHD) index on the genome to support locating ESTs on the genome, propose a heuristic algorithm for it, and, in combination with MUGUP, test it on different sets of ESTs.
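The idea of an index that is both low-frequency and high-density can be illustrated with a toy sketch. This record does not give the thesis's definition of the LFHD index or its heuristic, so everything below is an assumption made for illustration: "low frequency" is read as "the k-mer occurs at most `max_freq` times in the genome", "high density" as "the gaps between indexed positions are small", and the function names `build_low_frequency_index`, `largest_gap`, and `candidate_offsets` are hypothetical. The sketch does not reproduce MUGUP and does not handle spliced alignment.

```python
from collections import Counter, defaultdict


def build_low_frequency_index(genome, k, max_freq):
    """Map every k-mer that occurs at most max_freq times to its start positions.

    Assumption: this frequency cutoff stands in for the thesis's
    "low frequency" criterion, which is not specified in this record.
    """
    n_kmers = len(genome) - k + 1
    counts = Counter(genome[i:i + k] for i in range(n_kmers))
    index = defaultdict(list)
    for i in range(n_kmers):
        kmer = genome[i:i + k]
        if counts[kmer] <= max_freq:
            index[kmer].append(i)
    return dict(index)


def largest_gap(index, genome_length):
    """Largest distance between neighbouring indexed positions (and the ends).

    Assumption: a small value is used here as a proxy for "high density".
    """
    positions = sorted(p for hits in index.values() for p in hits)
    bounds = [0] + positions + [genome_length]
    return max(b - a for a, b in zip(bounds, bounds[1:]))


def candidate_offsets(est, index, k):
    """Count, for each genome offset, how many EST k-mers support it."""
    votes = Counter()
    for j in range(len(est) - k + 1):
        for pos in index.get(est[j:j + k], ()):
            votes[pos - j] += 1
    return votes


if __name__ == "__main__":
    genome = "ACGTACGTTTGACCA"  # toy sequence, not real data
    index = build_low_frequency_index(genome, k=4, max_freq=1)
    print(largest_gap(index, len(genome)))
    print(candidate_offsets("TTGACC", index, k=4).most_common(1))
```

On this toy input the repeated k-mer `ACGT` is left out of the index, and all three k-mers of the query `TTGACC` vote for genome offset 8. Lowering `max_freq` removes more repetitive k-mers but widens the gaps between indexed positions, which is the trade-off that the phrase "low frequency and high density" suggests.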
Acknowledgement
Abstract (in Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Organization
Chapter 2 Relevant Researches
2.1 Spliced Alignment Problem
2.2 Researches on Mapping and Alignment
2.2.1 SSAHA
2.2.2 EST_GENOME
2.2.3 SIM4
2.2.4 Spidey
2.2.5 BLAT
2.2.6 SQUALL
2.2.7 UM Method
2.2.8 MUGUP
2.2.9 GMAP
2.3 Discussion
Chapter 3 Low Frequency and High Density Index
3.1 Definition
3.2 Idea of Selecting LFHD Index
3.3 The Algorithm for Selecting the LFHD Index
3.4 LFHD Index with Different K-values
Chapter 4 Aligning a Single EST to Genome
4.1 Overview for the EST to Genome Alignment
4.2 Test Sets
4.3 Alignment Results
Chapter 5 Aligning the Entire ESTs in dbEST to Genome
5.1 Problem Statement
5.2 Strategy
Chapter 6 Conclusion
References
[1] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, Vol. 215, pp. 403-410, 1990.
[2] K. M. Chao, J. Zhang, J. Ostell, and W. Miller, “A tool for aligning very similar DNA sequences,” Computer Applications in the Biosciences, Vol. 13, pp. 75-80, 1997.
[3] L. Y. Chen, S. H. Lu, E. S. Shih, and M. J. Huang, “Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences,” Genome Research, Vol. 12, Issue 7, pp. 1106-1111, July 2002.
[4] L. Florea, G. Hartzell, Z. Zhang, G. M. Rubin, and W. Miller, “A computer program for aligning a cDNA sequence with a genomic DNA sequence,” Genome Research, Vol. 8, Issue 9, pp. 967-974, September 1998.
[5] F. R. Hsu and J. F. Chen, “Aligning ESTs to Genome Using Multi-Layer Unique Markers,” IEEE Computational Systems Bioinformatics Conference ’03, pp. 564-567, 2003.
[6] F. R. Hsu, H. Y. Chang, Y. L. Lin, Y. T. Tsai, H. L. Peng, Y. T. Chen, C. F. Chen, C. Y. Cheng, C. H. Liu, and M. Y. Shih, “Genome-Wide Alternative Splicing Events Detection through Analysis of Large Scale ESTs,” IEEE Fourth Bioinformation Symposium on Bioinformatics and Bioengineering (BIBE'2004), pp. 310-316, May 2004.
[7] X. Huang, “On global sequence alignment,” Computer Applications in the Biosciences, Vol. 10, pp. 227-235, 1994.
[8] W. J. Kent, “BLAT – The BLAST-Like Alignment Tool,” Genome Research, Vol. 12, Issue 4, pp. 656-664, April 2002.
[9] E. W. Myers and W. Miller, “Optimal alignments in linear space,” Computer Applications in the Biosciences, Vol. 4, pp. 11-17, March 1988.
[10] R. Mott, “EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA,” Bioinformatics, Vol. 13, no. 4, pp. 477-478, 1997.
[11] W. Miller and E. W. Myers, “A file comparison program,” Software: Practice and Experience, Vol. 15, pp. 1025-1040, 1985.
[12] Z. Ning, A. J. Cox, and J. C. Mullikin, “SSAHA: A Fast Search Method for Large DNA Databases,” Genome Research, Vol. 11, pp. 1725-1729, October 2001.
[13] S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, Vol. 48, Issue 3, pp. 443-453, March 1970.
[14] J. Ogasawara and S. Morishita, “Fast and sensitive algorithm for aligning ESTs to human genome,” IEEE Computer Society Bioinformatics Conference, Vol. 1, pp. 43-53, 2002.
[15] S. Schwartz, W. Miller, C. M. Yang, and R. C. Hardison, “Software tools for analyzing pairwise alignments of long sequences,” Nucleic Acids Research, Vol. 19, pp. 4663-4667, 1991.
[16] T. E. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, Vol. 147, Issue 1, pp. 195-197, March 1981.
[17] J. Usuka, W. Zhu, and V. Brendel, “Optimal spliced alignment of homologous cDNA to a genomic DNA template,” Bioinformatics, Vol. 16, no. 3, pp. 203-211, 2000.
[18] S. J. Wheelan, D. M. Church, and J. M. Ostell, “Spidey: a tool for mRNA-to-genomic alignments,” Genome Research, Vol. 11, Issue 11, pp. 1952-1957, November 2001.
[19] T. D. Wu and C. K. Watanabe, “GMAP: a genomic mapping and alignment program for mRNA and EST sequences,” Bioinformatics, Vol. 21, no. 9, pp. 1859-1875, February 2005.