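Graduate student: Cheng-Yang Lee (李正揚)
Thesis title: Evaluation and integration of somatic copy number detection tools for whole-exome sequencing data
Thesis title (original): 運用高通量全外顯子定序偵測複製數變異工具之評估與整合
Advisor: Tzu-Hao Chang (張資昊)
Degree: Master's
Institution: Taipei Medical University
Department: Graduate Institute of Medical Informatics
Discipline: Medicine and Health
Sub-discipline: Medical Technology and Laboratory Science
Thesis type: Academic thesis
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Pages: 70
Keywords: whole-exome sequencing; copy number variation; CNV tools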
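Copy number variations (CNVs) are genomic variants longer than 50 nucleotides in which a DNA sequence undergoes amplification, deletion, translocation, or insertion. Early studies indicated that CNVs are involved in nervous system function, regulation of cell growth, and metabolic regulation in healthy individuals, and that they are associated with several diseases, including autism, schizophrenia, and obesity; more recent studies have also shown that CNVs are closely linked to cancer. More than half of known CNV regions overlap protein-coding regions, and whole-exome sequencing (WES) targets exactly those regions while offering low cost, high accuracy, and modest computational requirements, so WES has become an indispensable tool in clinical applications. The accuracy of CNV detection has an important bearing on clinical diagnosis and prognosis. Many tools have been developed to detect CNVs in WES data, but each detection strategy has its own limitations; no single tool can yet detect all types of CNVs, and no platform yet provides an integrated analysis across detection tools. To address these two gaps, this study used WES data from TCGA (The Cancer Genome Atlas) to evaluate several tools that detect somatic copy number variations (somatic CNVs, SCNVs), and used a virtual machine (VM) to build an integrated analysis platform for SCNV detection tools. The evaluation found that ExomeCNV and VEGAWES detected SCNVs with higher accuracy, that EXCAVATOR could detect longer SCNV segments, and that VarScan2 needed more time for CNV detection. The results are summarized in a table so that users can find the detection tool suited to their own experimental design, and the platform built in this study can be run with a single simple command to detect somatic CNVs.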

Copy number variations (CNVs) are a form of structural variation that manifests as amplifications, deletions, translocations, and insertions in the genome, with segment sizes larger than 50 bp. Previous studies have reported that CNVs are associated with nervous system function, cellular development, and metabolism in healthy people, and that they are also linked to diseases such as autism, schizophrenia, and obesity. Recent studies have further uncovered an important role for CNVs in cancer. With the decreasing cost and high accuracy of next-generation sequencing, whole-exome sequencing (WES) has become a dominant method for identifying CNVs in both research and clinical settings. Because accurate identification of CNVs can affect clinical diagnosis and prognosis, substantial effort has been devoted to developing tools that detect CNVs from WES data. Each of these tools has its own limitations, however, and no single method can detect all kinds of CNV events. Accordingly, we evaluated as many detection tools as possible using WES data obtained from TCGA (The Cancer Genome Atlas) through CGHub, aiming for a thorough assessment of existing tools for detecting somatic copy number variations (somatic CNVs, SCNVs). We also constructed an integrated platform for CNV detection in a virtual machine (VM). The evaluation found that ExomeCNV and VEGAWES detected CNVs with higher accuracy, that EXCAVATOR favored large CNVs, and that VarScan2 needed more time to run CNV detection. We summarized all evaluation results in a table so that users can find the tools that fit their own experimental design. Finally, users can run the analysis pipeline built in this study with a single simple command to detect CNVs.
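
The abstract describes the accuracy evaluation only at a high level. As a rough illustration, and not the thesis's actual procedure, the Python sketch below scores one tool's calls against a set of reference segments (for example, segments derived from SNP array data) by base-pair overlap. The `Segment` type, the 50% overlap threshold, the function names, and the example coordinates are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    """A hypothetical CNV segment; coordinates are half-open [start, end)."""
    chrom: str
    start: int
    end: int
    kind: str  # "gain" or "loss"


def overlap_bp(a: Segment, b: Segment) -> int:
    """Base pairs shared by two segments of the same chromosome and type."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def is_supported(seg: Segment, others: List[Segment],
                 min_fraction: float = 0.5) -> bool:
    """True if at least min_fraction of seg is covered by segments in others.

    Assumes the segments in `others` do not overlap one another, so the
    per-segment overlaps can simply be summed.
    """
    length = seg.end - seg.start
    if length <= 0:
        return False
    covered = sum(overlap_bp(seg, other) for other in others)
    return covered / length >= min_fraction


def precision_recall(calls: List[Segment], reference: List[Segment],
                     min_fraction: float = 0.5) -> Tuple[float, float]:
    """Segment-level precision and recall of calls against a reference set."""
    supported_calls = sum(is_supported(c, reference, min_fraction) for c in calls)
    recovered_refs = sum(is_supported(r, calls, min_fraction) for r in reference)
    precision = supported_calls / len(calls) if calls else 0.0
    recall = recovered_refs / len(reference) if reference else 0.0
    return precision, recall


# Made-up example: one call matches a reference gain, one call has no support.
reference = [Segment("chr1", 100, 200, "gain"), Segment("chr2", 0, 1000, "loss")]
calls = [Segment("chr1", 120, 220, "gain"), Segment("chr3", 0, 50, "loss")]
print(precision_recall(calls, reference))  # (0.5, 0.5)
```

A segment-level score like this treats short and long events equally. A base-pair-level score would instead weight long segments more heavily, which matters when tools differ in the segment lengths they tend to report.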

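Title page
Taipei Medical University Master's/Doctoral Degree Examination Committee Approval Form
Taipei Medical University Application for Consent to Public Release of Electronic and Print Thesis Bibliographic Records
Taipei Medical University Degree Examination Confidentiality Agreement and Sign-in Sheet
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chinese Abstract
English Abstract
Chapter 1 Introduction
1.1 What are copy number variations
1.2 Methods for detecting CNVs using microarrays and sequencing technologies (WGS, WES)
1.2.1 Microarray-based methods for detecting CNVs
1.2.2 NGS-based methods for detecting CNVs
1.3 Research motivation and objectives
1.3.1 Research motivation
1.3.2 Research objectives
Chapter 2 Literature Review
Chapter 3 Materials and Methods
3.1 Collecting somatic CNV (SCNV) detection tools and building their analysis workflows
3.2 Collecting TCGA WES and SNP array data
3.3 Known gene gains and losses in OSCC
3.4 Building an overall performance evaluation model for SCNV detection tools
3.4.1 Evaluating each tool's SCNV detection performance
3.4.2 Evaluating each tool's accuracy
3.4.3 Building an evaluation table for the tools
3.5 Integrating SCNV detection tools in a VM environment and developing an automated SCNV analysis platform
Chapter 4 Analysis and Results
4.1 Distribution of SCNVs detected by SNP array and by each detection tool
4.1.1 Distribution of SCNVs detected by SNP array in each sample
4.1.2 Distribution of SCNVs detected by each tool in each sample
4.1.3 Distribution of SCNVs detected by SNP array and by each tool in each sample
4.2 Distribution of SCNV duplications and deletions detected by SNP array and by each tool
4.3 Length distribution of SCNVs detected by each tool
4.4 Comparison of common SCNVs detected by each tool
4.4.1 Common SCNVs in the SNP array data
4.4.2 Genes known to carry SCNVs in OSCC
4.5 Time required by each tool to detect SCNVs
4.6 Accuracy evaluation of each tool
4.6.1 Accuracy evaluation using SNP array data as the validation standard
4.6.2 Accuracy evaluation using genes known to carry SCNVs as the standard
4.7 Building an evaluation table for the tools
4.8 Integrating and building the SCNV analysis platform
4.8.1 Building the SCNV analysis platform
4.8.2 Output of the SCNV analysis platform
Chapter 5 Discussion
5.1 Collecting SCNV detection tools and building their analysis workflows
5.2 Developing a consistent performance evaluation model for SCNVs
5.2.1 Comparison of SCNV lengths detected by each tool
5.2.2 Proportions of SCNV duplications and deletions detected by each tool
5.2.3 Evaluation of each tool's ability to detect common SCNVs
5.2.4 Time required by each tool to detect SCNVs in 20 samples
5.2.5 Accuracy evaluation of SCNVs detected by each tool
5.3 Building and integrating the SCNV detection tool analysis platform using a VM
Chapter 6 Conclusions and Recommendations
References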


