Author: Kai-Chiang Liang
Thesis title: Construct a gene-based repetitive element database
Advisor: Hsiao-Fang Sunny Sun
Keywords: database; untranslated region; promoter region; repetitive element
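More than 50% of the human genome is composed of repetitive sequences. Recent studies indicate that the complex diversity of organisms is not caused solely by protein-coding sequences; other parts of the genome also harbor information that regulates gene expression, and repetitive sequences have a place among them. The expression of many genes can be regulated by adjacent sequences, such as the 5’ untranslated region, the 3’ untranslated region, and the promoter region. Current research also indicates that repetitive sequences in these regions affect gene expression, but no systematic study has analyzed how the different types of repetitive sequences are distributed. The aim of this study is therefore to use bioinformatics tools to comprehensively analyze the locations and types of repetitive sequences in the human genome that may take part in regulating gene expression. We downloaded the sequences adjacent to all human genes from online databases and cross-matched them with the RepeatMasker software to identify whether they contain repetitive sequences. Interestingly, we found that tandem repetitive elements are preferentially located close to genes, whereas the number of interspersed repetitive elements tends to increase gradually with distance from genes. We then built these repetitive sequences, together with their distribution trends relative to genes, into a gene-based repetitive element database (Gene-Based Repetitive Element Database, GBRED). In addition, we designed a convenient web interface so that users can search for data on specific repetitive sequences or genes. To further confirm the influence of these repetitive sequences on gene regulation, we used a biological pathway analysis program to analyze groups of genes carrying different repetitive sequences. The results show that gene groups with tandem repetitive elements in the untranslated regions and the promoter region are highly associated with developmental pathways, whereas gene groups with interspersed repetitive elements in the 3’ untranslated region and the promoter region are more strongly associated with metabolic pathways. Finally, we found that different types of repetitive sequences show different distribution trends around genes, and that particular types of repetitive sequences may be associated with particular biological pathways. This may help scientists predict unknown gene functions and the biological pathways in which those genes participate. All of the data mentioned above are stored in GBRED.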
More than 50% of the human genome consists of repetitive elements. Recent studies imply that the complexity of living organisms is not simply a direct outcome of the number of coding sequences; regulatory information is also harbored in other parts of the genome, where repetitive elements may play a role. Most genes can be regulated by the sequences flanking their coding region, such as the 5’ untranslated region (5’ UTR), the 3’ untranslated region (3’ UTR), and the promoter region. Repetitive elements in these regions are now thought to play a role in gene expression. However, there has been no systematic survey of the types and locations of these elements. This study aims to examine thoroughly, by computational approaches, the repetitive elements in the human genome that may be involved in gene regulation. We downloaded the sequences flanking all human genes from internet resources and identified repetitive elements by cross-matching these sequences against the RepeatMasker database. Interestingly, we found that tandem repetitive elements are preferentially located close to genes, whereas interspersed repetitive elements tend to accumulate farther from genes. The annotation and distribution of the distinct classes of repetitive elements associated with each individual gene were then used to construct a gene-based repetitive element database (GBRED). Furthermore, we designed a user-friendly web interface that provides a search function for the repetitive elements associated with any particular gene or genes. To further characterize the role of these repetitive elements in gene regulation, pathway analysis programs were used to analyze the genes in the various repeat groups. Our data suggest that genes containing tandem repetitive elements in their UTRs and upstream 1000-bp region may be involved in developmental processes, whereas genes with interspersed repetitive elements in their 3’ UTR and upstream 1000-bp region are related to metabolic processes.
Finally, our data indicate that distinct classes of repetitive elements show different distributions in the human genome and may be associated with specific biological processes, which might provide hints for predicting unknown gene functions and the biological processes in which those genes are involved. All of the information described above can be obtained from GBRED.
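The region-by-region comparison of tandem and interspersed elements described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record layout, the gene names, the region key spellings `5_UTR` and `3_UTR`, and the grouping of RepeatMasker class labels into "tandem" and "interspersed" are all assumptions made for illustration (`flank_u1000` and `flank_d1000` are the region names used in the table of contents).

```python
from collections import Counter

# Assumed grouping of RepeatMasker class labels; the thesis may group them differently.
TANDEM = {"Simple_repeat", "Satellite"}
INTERSPERSED = {"SINE", "LINE", "LTR", "DNA"}

# Gene-relative regions; the two UTR keys are hypothetical spellings.
REGIONS = ("flank_u1000", "5_UTR", "3_UTR", "flank_d1000")


def repeat_group(rm_class):
    """Map a class label such as 'SINE/Alu' to 'tandem', 'interspersed', or 'other'."""
    top_level = rm_class.split("/")[0]
    if top_level in TANDEM:
        return "tandem"
    if top_level in INTERSPERSED:
        return "interspersed"
    return "other"


def tally_by_region(annotations):
    """Count repeat groups per region from (gene, region, class) records."""
    counts = {region: Counter() for region in REGIONS}
    for _gene, region, rm_class in annotations:
        counts[region][repeat_group(rm_class)] += 1
    return counts


def genes_with(annotations, region, group):
    """Return the sorted gene names that carry a given repeat group in a given region."""
    return sorted(
        {gene for gene, reg, rm_class in annotations
         if reg == region and repeat_group(rm_class) == group}
    )


# Made-up records for illustration only; these are not data from GBRED.
annotations = [
    ("GENE_A", "5_UTR", "Simple_repeat"),
    ("GENE_A", "flank_u1000", "SINE/Alu"),
    ("GENE_B", "3_UTR", "LINE/L1"),
    ("GENE_B", "5_UTR", "Simple_repeat"),
    ("GENE_C", "flank_d1000", "Low_complexity"),
]

region_counts = tally_by_region(annotations)
```

A gene list returned by `genes_with` is the kind of per-group input that a pathway analysis program would take. Keying every record by gene mirrors the gene-based organization the abstract attributes to the database.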
Abstract (in Chinese)
Abstract
Acknowledgements
Table of contents
List of tables
List of figures

1. Introduction
1.1. Repetitive elements in the human genome
1.2. Classes of repetitive elements
1.2.1. Tandem repetitive elements
1.2.2. Interspersed repetitive element
1.3. Functional roles of repetitive elements
1.4. Databases of repetitive elements on the internet
1.5. Objective of this study
2. Material and methods
2.1. Data sources
2.2. Repetitive elements annotation tool – RepeatMasker
2.3. Establish gene-based repetitive element database (GBRED)
2.3.1. Software
2.3.2. Database structure
2.4. Pathway analysis program
3. Result
3.1. Gene-based repetitive element database (GBRED)
3.2. Distribution of repetitive classes by regions
3.2.1. Distribution of repetitive classes in flank_u1000
3.2.2. Distribution of repetitive classes in 5’ UTR
3.2.3. Distribution of repetitive classes in 3’ UTR
3.2.4. Distribution of repetitive classes in flank_d1000
3.2.5. Summary of distribution of repetitive classes
3.3. Introduction of GBRED web interface
3.3.1. GBRED online-searching tool
3.3.2. The result of online-searching tool
3.3.3. Result page of GBRED
3.3.4. Graph table
3.4. Pathway analysis of each individual repetitive element gene groups
3.4.1. Genes contain tandem repetitive elements in UTR regions
3.4.2. Genes contain interspersed repetitive elements in UTR regions
3.4.3. Genes contain repetitive elements in promoter regions
3.4.4. Summary of pathway analysis
4. Discussion
4.1. The characteristic of GBRED
4.2. Distribution of repetitive classes by regions
4.3. Pathway analysis of each individual repetitive element gene groups
4.4. Conclusion
5. Reference
6. Appendix