National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

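Graduate student: 王旭川
Thesis title (translated from Chinese): Gene prediction and annotation of the Xanthomonas campestris pv. campestris genome sequence
Thesis title (original English): Gene prediction and annotation in Xanthomonas campestris pv. campestris
Advisor: 呂平江
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Life Science
Discipline: Life sciences
Field: Biology
Thesis type: Academic thesis
Year of publication: 2002
Academic year of graduation: 90 (ROC calendar)
Language: Chinese
Number of pages: 71
Keywords (translated from Chinese): gene annotation; gene prediction; bioinformatics; black rot of crucifers; Xanthomonas campestris pv. campestris
Keywords (original English): gene annotation; gene prediction; bioinformatics; black rot; Xanthomonas campestris pv. campestris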
Usage counts:
  • Cited by: 17
  • Views: 213
  • Downloads: 0
  • Saved to bibliography lists: 0
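Abstract (translated from Chinese)
Xanthomonas campestris pv. campestris is a Gram-negative plant pathogen that mainly infects cruciferous plants and causes black rot of crucifers. We are interested in using a structural genomics approach to understand the molecular and biochemical functions of the proteins of X. campestris pv. campestris. The approach begins with sequencing the X. campestris pv. campestris genome, continues with analysis of the genes contained in the sequence and large-scale expression of the proteins, and ends with determination of the protein structures by NMR and X-ray methods. The work in this thesis is to analyze, with bioinformatics methods, the large volume of DNA sequence produced during sequencing, to identify likely genes, and to produce a preliminary annotation, so that target proteins can later be selected and passed to the structural laboratories for structure determination.
The genome sequence consists of 575 10X contigs provided by the genome sequencing laboratory of Professor 蔡世峰 at National Yang-Ming University. We ran the gene prediction programs ORPHEUS and Glimmer on these contigs and obtained 3811 and 6327 ORFs (open reading frames), respectively. We then compared these ORFs against the GenBank non-redundant protein database to identify them. Next, we compared the ORFs against the COG (Clusters of Orthologous Groups) database and obtained 2819 COG hits; the matched ORFs were classified by the cellular function of their proteins. Because X. campestris pv. campestris is a plant pathogen, we also wanted to examine its pathogenicity genes: we compared the predicted ORFs with the pathogenicity genes of two other pathogens, Xylella fastidiosa and Ralstonia solanacearum, and found 487 ORFs that may be related to pathogenicity.
We compiled these analysis results into a preliminary X. campestris pv. campestris gene annotation database and set up a website for comparing, searching, and browsing the genes (http://xcc.life.nthu.edu.tw). The database can support research on X. campestris pv. campestris, including the discovery of proteins with unusual folds, genes of unknown function, and biochemical regulatory mechanisms specific to X. campestris pv. campestris.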
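The gene prediction step summarized in the abstract starts from open reading frames. As a rough illustration only, the following Python sketch scans all six reading frames of a sequence for start-to-stop ORFs. It is not the method used by ORPHEUS or Glimmer, which rely on statistical models of coding sequence; the function names, the set of start codons, and the default minimum length of 90 nt are assumptions made for this example.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial start codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 90) -> List[Tuple[str, int, int]]:
    """Naive six-frame ORF scan (hypothetical helper, not Glimmer/ORPHEUS).

    Returns (strand, start, end) tuples. Coordinates are 0-based and
    end-exclusive, measured on the strand that was scanned; the stop
    codon is included in the reported span.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first start codon seen, if any
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs
```

A scan like this over-predicts heavily on real contigs, which is one reason dedicated gene finders and database comparison are used to decide which ORFs are likely genes.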
Abstract
Xanthomonas campestris pv. campestris is a Gram-negative bacterium and one of the most important plant pathogens. It attacks cruciferous plants and causes agricultural losses worldwide. We annotated 575 contigs of X. campestris pv. campestris that were sequenced by the Genome Research Center at NYMU.
Glimmer 2.0 and ORPHEUS predicted 6327 and 3811 ORFs (open reading frames), respectively. We then ran a BLAST search against the GenBank nr (non-redundant) protein database; 5854 ORFs were similar to protein sequences in the database (sequence identity > 30%). For protein function assignment, we searched the COG (Clusters of Orthologous Groups) database and obtained 2819 hits among our ORFs (E-value cut-off = e-20). Pathogenesis is one of the most important issues in the study of plant pathogens. By comparing our sequences with the annotations of two other plant pathogens, Xylella fastidiosa and Ralstonia solanacearum, we identified 487 candidate genes that may be involved in pathogenesis. This result could be applied to studying the pathogenesis mechanisms of different plant pathogens, and these genes could be good targets for structure determination.
We collected all annotation results into an X. campestris pv. campestris gene annotation database (http://xcc.life.nthu.edu.tw). This database can support research on X. campestris pv. campestris, and our final goal is to select valuable target proteins for structural genomics.
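The two thresholds quoted in the abstract (sequence identity > 30% for the nr search, E-value cut-off of e-20 for the COG search) can be applied as a simple filter over BLAST hits. The sketch below is a minimal illustration, not the authors' pipeline. It assumes the hits are available as 12-column tab-separated BLAST output, reads "e-20" as 1e-20, and keeps the highest-scoring surviving hit per ORF; the function name `best_hits` and the column labels are chosen for this example.

```python
import csv
from typing import Dict, Iterable, Optional

# Assumption: 12-column tab-separated BLAST output in this column order.
FIELDS = ["query", "subject", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(lines: Iterable[str],
              min_identity: Optional[float] = None,
              max_evalue: Optional[float] = None) -> Dict[str, dict]:
    """Return the best-scoring hit per query that passes the thresholds.

    min_identity: keep hits with percent identity strictly above this value.
    max_evalue:   keep hits with E-value at or below this value.
    Either threshold may be None, in which case it is not applied.
    """
    best: Dict[str, dict] = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip comment lines and rows that do not have all 12 columns.
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        pident = float(hit["pident"])
        evalue = float(hit["evalue"])
        bits = float(hit["bitscore"])
        if min_identity is not None and pident <= min_identity:
            continue
        if max_evalue is not None and evalue > max_evalue:
            continue
        prev = best.get(hit["query"])
        if prev is None or bits > float(prev["bitscore"]):
            best[hit["query"]] = hit
    return best
```

Under these assumptions, `best_hits(lines, min_identity=30.0)` would correspond to the nr criterion and `best_hits(lines, max_evalue=1e-20)` to the COG criterion; counting the keys of the returned dictionary gives the number of ORFs with a qualifying hit.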
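Table of contents
Chapter 1 Introduction
Section 1 Introduction to Xanthomonas campestris pv. campestris
Section 2 Assembly of the Xanthomonas campestris pv. campestris genome sequence
Shotgun sequencing of microbial genomes
Assembly of the Xanthomonas campestris pv. campestris genome sequence
Section 3 Gene prediction
Finding ORFs
Database comparison
Computer-based gene prediction
Chapter 2 Sequence source
Chapter 3 Sequence analysis methods
Gene prediction software
ORPHEUS
Glimmer
tRNAscan-SE
RBSfinder
Sequence analysis software
PSORT
Sequence alignment software
BLAST (Basic Local Alignment Search Tool)
Chapter 4 Results and discussion
Section 1 Gene prediction
ORF
tRNA, rRNA
Section 2 Gene annotation
Section 3 Gene function
Section 3 Sequence search
Section 4 Pathogenicity gene analysis
Chapter 5 Conclusion
References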