
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

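Graduate student: 李昇達
Graduate student (English): Sheng Ta Lee
Thesis title: 使用CUDA技術於圖形處理器環境開發次世代定序技術之短序列比對工具 (Developing a short-read alignment tool for next-generation sequencing on GPUs using CUDA)
Thesis title (English): Develop RNA short reads alignment tool based on GPU with CUDA
Advisor: 林俊淵
Advisor (English): C. Y. Lin
Degree: Master's
Institution: Chang Gung University
Department: Department of Computer Science and Information Engineering
Discipline: Engineering
Field: Electrical and Computer Engineering
Thesis type: Academic thesis
Publication year: 2012
Graduation academic year: 100 (ROC calendar)
Pages: 41
Keywords (translated from Chinese): Compute Unified Device Architecture; next-generation sequencing; pattern matching; short reads
Keywords (English): CUDA; NGS; Pattern Matching; Short Reads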
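Chinese abstract (translated): In the post-genomic era, now that reference genomes have been sequenced for many species, re-sequencing individual genomes from high-throughput sequence reads is an important problem. Many next-generation sequencing technologies have been proposed over the past few years, and with them a series of tools for mapping the short reads they produce against a reference genome. FRESCO is a re-sequencing tool for short reads based on grouping, segmentation, and frequency counting; its distinguishing feature is that it aligns short reads without memory-intensive data structures such as hash tables or the Burrows-Wheeler transform. FRESCO offers a more flexible alignment method that makes the alignment results more accurate, but it requires a large amount of intensive computation. This thesis therefore proposes CUDA-FRESCO, which uses the graphics processing unit to reduce the time FRESCO spends on alignment. Compared with FRESCO, CUDA-FRESCO achieves a 63x speedup in sequence alignment and a 20x improvement in overall performance. A bottleneck in bulk data transfer was later found in CUDA-FRESCO; CUDA-FRESCO 2.0 was proposed to address it and, across different GPU models, achieves a 53x to 141x speedup in overall performance relative to FRESCO.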
English abstract: In the post-genomic era, with the reference genomes of many organisms already sequenced, an important problem is re-sequencing individual genomes from high-throughput reads. Many next-generation sequencing machines have been proposed in the last few years, and a series of re-sequencing tools have been developed for mapping short reads to a reference genome. FRESCO is a frequency-based re-sequencing tool that uses neither a hash look-up table algorithm nor the Burrows-Wheeler transform. FRESCO offers more flexibility in mapping and obtains satisfactory mapping results, but it is computation-intensive. This thesis therefore proposes CUDA-FRESCO, a tool that reduces the computation time of FRESCO by using graphics processing units with CUDA. Compared with FRESCO, CUDA-FRESCO achieved a 63x speedup for the mapping kernel and a 20x speedup for the overall computation time. Furthermore, we found that massive data transfer was a bottleneck in CUDA-FRESCO and proposed CUDA-FRESCO 2.0 to solve this problem; compared with FRESCO on different GPUs, CUDA-FRESCO 2.0 achieves 53x to 141x speedups for the overall computation time.
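As a rough illustration of what frequency-based mapping can look like, the Python sketch below filters candidate positions by comparing base-count vectors and then verifies the survivors by Hamming distance. This is a generic frequency-filter technique, not FRESCO's or CUDA-FRESCO's actual algorithm, which this record does not describe. The function names (`base_counts`, `candidate_positions`, `map_read`) and the substitution-only error model are assumptions made for the example.

```python
from collections import Counter

BASES = "ACGT"  # characters outside this alphabet (e.g. N) are ignored in the counts


def base_counts(seq):
    """Return the frequency vector [#A, #C, #G, #T] of a sequence."""
    counts = Counter(seq)
    return [counts[b] for b in BASES]


def candidate_positions(reference, read, max_mismatches):
    """Hypothetical frequency filter.

    One substitution changes at most two base counts by one each, so a window
    within `max_mismatches` substitutions of the read has an L1 distance of at
    most 2 * max_mismatches between the two frequency vectors. Windows above
    that bound cannot match and are discarded.
    """
    m = len(read)
    if m == 0 or m > len(reference):
        return []
    target = base_counts(read)
    window = base_counts(reference[:m])
    hits = []
    for start in range(len(reference) - m + 1):
        if start > 0:
            # Slide the window by one base: drop the left base, add the right one.
            left, right = reference[start - 1], reference[start + m - 1]
            if left in BASES:
                window[BASES.index(left)] -= 1
            if right in BASES:
                window[BASES.index(right)] += 1
        l1 = sum(abs(a - b) for a, b in zip(window, target))
        if l1 <= 2 * max_mismatches:
            hits.append(start)
    return hits


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(reference, read, max_mismatches):
    """Verify each filtered candidate with an exact Hamming-distance check."""
    m = len(read)
    return [p for p in candidate_positions(reference, read, max_mismatches)
            if hamming(reference[p:p + m], read) <= max_mismatches]
```

Each window's check depends only on the read and that window, so the windows (or the reads) could be assigned to separate GPU threads. That kind of data parallelism is what makes read mapping a plausible target for CUDA acceleration.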
Table of Contents

Acknowledgments
Chinese Abstract
English Abstract
Table of Contents
List of Figures
Chapter 1 - Introduction
Chapter 2 - Preliminary Concepts
Chapter 3 - CUDA-FRESCO
Chapter 4 - Experimental Results
Chapter 5 - Conclusions
Figures
References

List of Figures
Figure 1: The flowchart of the mapping phase in CUDA-FRESCO.
Figure 2: The computation time of the mapping kernel by FRESCO and CUDA-FRESCO.
Figure 3: The overall computation time by FRESCO, SOAP, and CUDA-FRESCO.
Figure 4: The overall computation time analysis for CUDA-FRESCO.
Figure 5: The result buffer usage without pre-gather.
Figure 6: The result buffer usage with pre-gather.
Figure 7: The overall computation time by FRESCO, CUDA-FRESCO, CUDA-FRESCO 2.0, and SOAP.
Figure 8: The overall computation time by FRESCO and CUDA-FRESCO 2.0 in different load/kernel.