National Digital Library of Theses and Dissertations in Taiwan

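Graduate student: 吳禕文
Graduate student (English name): I-Man Ng
Thesis title (Chinese): 以次世代定序結果基因體定位分析NP-mPB跳躍酶的插入點趨性
Thesis title (English): Transposition Preference Analysis of a Nucleolus-Predominant PiggyBac Transposase (NP-mPB) by Mapping Next Generation Sequencing-Determined Genomic Insertion Sites
Advisor: 陳中平
Advisor (English name): Chung-Ping Chen
Oral defense committee: 高成炎, 賴飛羆, 李盛安
Oral defense committee (English names): Cheng-yan Kao, Fei-Pei Lai, Sheng-An Li
Oral defense date: 2015-07-13
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Biomedical Electronics and Bioinformatics
Discipline: Engineering
Field: Biomedical Engineering
Thesis type: Academic thesis
Year of publication: 2015
Graduation academic year: 104 (ROC calendar)
Language: English
Number of pages: 69
Keywords (translated from Chinese): transposon; next generation sequencing; biomedical informatics
Keywords (English): PiggyBac; transposon; nucleolus organizer regions; next generation sequencing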
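Transposons (also called jumping genes) are DNA sequences that can copy themselves or "jump" out of chromosomal DNA and insert at another site, thereby affecting the regulation of genes at the insertion site. The PiggyBac transposon was isolated from the genome of the cabbage looper moth, and its system has been widely used as a genome manipulation tool in a variety of mammalian cell lines, with broad application in functional genomics and induced pluripotent stem cell research. Major features of the PiggyBac system include high transposition efficiency across species, relatively low insertion-site preference, and traceless excision from the genome. The subject of this study, the NP-mPB transposase, is a nucleolus-targeted PiggyBac transposase (PBase) that can be constructed by adding a signal peptide from the HIV-1 TAT protein to a mammalian codon-optimized PiggyBac transposase (mPB). Previous work found that NP-mPB effectively increases transposition efficiency. In addition, NP-mPB was built to modify mPB so that it directs the transposon into the nucleolus organizer regions (NORs): because NORs contain many copies of rDNA, gene disruption caused by insertion there is more likely to avoid affecting the normal function of other genes, which would support effective gene therapy. The aim of this study is to analyze next generation sequencing data, identify all NP-mPB and mPB insertion sites in the experiment, show their distribution across the mouse genome, and examine whether insertion is biased toward NORs or other genomic regions.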

PiggyBac is a popular transposon system used to deliver transgenes and explore unknown genomic territory. PiggyBac transposase (PBase) has been widely applied as a genomic manipulation tool in various mammalian cell lines and model organisms. Major features of the piggyBac system include high transposition efficiency in different species, relatively low insertion site preference, and seamless removal from the genome. These features allow its potential use in functional genomics in a wide range of organisms, such as plants, cattle, pigs, mice, rats, flies, yeast, and several non-model insects. A novel nucleolus-predominant PBase, NP-mPB, was constructed by adding a nucleolus-predominant (NP) signal peptide from the HIV-1 TAT protein to a mammalian codon-optimized PBase (mPB). The initial goal was to create a modified mPB that would increase transposition efficiency and direct transposition towards the nucleolus organizer regions (NORs), which contain several tandem copies of ribosomal DNA genes. Gene disruption at NORs is believed to be less harmful to the species. This research aims to analyze raw next generation sequencing (NGS) data from mouse ES cells transfected with mPB and NP-mPB. Insertion sites of the two PBs were identified by aligning the processed NGS data to the reference genome. Comparisons of the datasets reveal the transposition preferences of NP-mPB towards NORs and other genome regions.
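
The alignment step described in the abstract can be illustrated with a small Python sketch. It is not the thesis's pipeline: it assumes the aligner is BLAST (named in the thesis outline) writing the standard 12-column tabular output (`-outfmt 6`), and it assumes each pre-processed read begins at the transposon/genome junction, so that the subject coordinate of the read's first aligned base marks the insertion site. The subject names `chr1`, `chr2` and `rDNA`, the example rows, and all function names are hypothetical.

```python
from collections import Counter

# Column order of BLAST tabular output (-outfmt 6), the standard 12 fields.
FIELDS = ("qseqid sseqid pident length mismatch gapopen "
          "qstart qend sstart send evalue bitscore").split()


def parse_hits(lines):
    """Yield one dict per non-empty, non-comment tabular BLAST line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        row = dict(zip(FIELDS, line.split("\t")))
        for key in ("sstart", "send"):
            row[key] = int(row[key])
        row["bitscore"] = float(row["bitscore"])
        yield row


def best_hit_per_read(hits):
    """Keep the highest-bitscore hit for each read (first one wins on ties)."""
    best = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


def insertion_sites(best):
    """Collapse reads into unique (subject, position, strand) sites.

    Assumption: the read starts at the transposon/genome junction, so the
    subject coordinate of the read's first aligned base marks the site.
    """
    sites = set()
    for hit in best.values():
        strand = "+" if hit["sstart"] <= hit["send"] else "-"
        sites.add((hit["sseqid"], hit["sstart"], strand))
    return sites


def sites_per_subject(sites):
    """Count unique insertion sites on each subject sequence."""
    return Counter(subject for subject, _pos, _strand in sites)


# Hypothetical rows: read1 has two hits, read2 repeats read1's site,
# read3 aligns on the minus strand of a made-up rDNA reference.
example = [
    "read1\tchr1\t100.0\t50\t0\t0\t1\t50\t1000\t1049\t1e-20\t93.5",
    "read1\tchr2\t96.0\t50\t2\t0\t1\t50\t500\t549\t1e-15\t80.1",
    "read2\tchr1\t100.0\t50\t0\t0\t1\t50\t1000\t1049\t1e-20\t93.5",
    "read3\trDNA\t100.0\t40\t0\t0\t1\t40\t2040\t2001\t1e-18\t75.0",
]
sites = insertion_sites(best_hit_per_read(parse_hits(example)))
counts = sites_per_subject(sites)
print(sorted(counts.items()))
```

Running the same functions on the mPB and NP-mPB hit tables separately would give two site sets whose per-region counts can then be compared. Reads with several equally good hits, which are common in repetitive regions such as rDNA, need a more careful policy than the first-wins rule used here.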

Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Background
1.2 Motivation and Objectives
1.3 Significance
1.4 Restrictions
1.5 Research Procedure
1.6 Thesis Organization
Chapter 2 Extended Background and Basic Concepts
2.1 DNA Transposable Elements
2.1.1 Classification of Transposable Elements
2.1.2 Transposons-based Applications for Functional Genomics
2.1.3 PiggyBac Transposition System
2.1.4 A Nucleolus-Predominant piggyBac Transposase NP-mPB
2.2 Splinkerette-PCR method
2.3 NGS sequencing
2.4 Bioinformatics
2.4.1 System and Programming
2.4.2 Databases and Tools
Chapter 3 Data Pre-processing of Raw NGS Data
3.1 NGS Raw Data
3.1.1 FASTQ and FASTA Format
3.1.2 Structure of Single End Reads
3.2 Data Pre-processing
3.2.1 Pre-processing pipeline
3.2.2 Primer and Transposon Sequences Removal
3.2.3 Trimming Adaptor Sequences
3.2.4 Read Length Selection
3.2.5 Grouping Repeat Reads
3.2.6 Unique Read Identifier
3.3 Data Quality
3.4 Data size
3.5 Methodology for Target Site Motif analysis
Chapter 4 Alignment for Insertion Sites
4.1 Sequence Alignment
4.1.1 Basic Local Alignment Search Tool
4.1.2 Output Format
4.2 Databases Selection
4.2.1 The Mouse Genome Database
4.2.2 Database for the nucleolar organizer regions (NORs)
4.2.3 Gene Annotation Database
4.3 BLAST Queries and Outputs
4.3.1 BLAST for Chromosomal Regions
4.3.2 BLAST for NORs
4.4 Process Flow for BLAST Results
4.4.1 BLAST output
4.4.2 Processing pipeline for BLAST Output
4.4.3 Total Amount of Repeat Sequences
4.4.4 Strategies for Multiple Matches
4.4.5 Plotting of Insertion Sites
Chapter 5 Statistical Evaluation
5.1 Unequal Variances t-test
5.2 Poisson Distribution for CIS Analysis
5.3 Multiple Hypothesis Correction
Chapter 6 Results and Discussion
6.1 Target Site Motif Preference of PiggyBac Transposon
6.2 Insertions Distributed Throughout Mouse Genome
6.2.1 Profile of mPB and NP-mPB Insertion sites
6.2.2 Insertions of mPB and NP-mPB at Intragenic Regions
6.2.3 Common Insertion Sites of Intragenic Regions
6.3 NP-mPB Mediates Insertion Preference towards NORs
Chapter 7 Conclusion and Future Work
7.1 Conclusions
7.2 Future Work
BIBLIOGRAPHY



