臺灣博碩士論文加值系統 (National Digital Library of Theses and Dissertations in Taiwan)

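Graduate student: 賴偉榮
Graduate student (English): Wei-Jung Lai
Thesis title (Chinese): Crosstalk矩陣之選取及鹼基的判定
Thesis title (English): Crosstalk Matrix Selection and Base Determination
Advisor: 詹世煌
Advisor (English): Shih-Huang Chan
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Statistics (Master's and Doctoral Program)
Discipline: Mathematics and Statistics
Field: Statistics
Thesis type: Academic thesis
Year of publication: 2012
Graduation academic year: 100 (ROC calendar)
Language: English
Number of pages: 57
Keywords (translated from Chinese): next-generation sequencing; cross-talk matrix; base determination; quality score
Keywords (English): Next Generation Sequencing; Cross-talk Matrix; Base-calling; Quality Score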
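Abstract (translated from the Chinese): Research on next-generation sequencing has focused mainly on read assembly; few studies discuss how the bases A, C, G, and T in a sequence are determined or how the quality of those calls is assessed. Because the accuracy of base determination affects the subsequent read assembly and the analysis of its results, it carries considerable weight in biodiversity detection and downstream statistical analysis. Some researchers have proposed using a crosstalk matrix to improve the reliability of base determination and reduce the prediction error rate, and have used different quality scores to judge how well the base intensity values perform after transformation by the crosstalk matrix (Giddings 1993). In this thesis, we use the SN ratio to select the crosstalk matrix. Under the assumption that the bases are scattered uniformly, we partition the array into intervals of row width R and estimate a crosstalk matrix from each partition. Among the different partitions, the matrix corresponding to the largest SN ratio is taken as the optimal crosstalk matrix.
For the quality analysis, we build a model from DNA data of Miscanthus and use its parameters to simulate the scattering behavior of bases on a tile. We also use the $\sqrt{m^2+d^2}$ quality score proposed by Lawrence and Solovyev (1994) and the quality score based on the extremeness of an index distribution proposed by Kao (2011), and measure their correlation with the SN ratio. The simulation results show a positive correlation between the $\sqrt{m^2+d^2}$ quality score and the SN ratio.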

Abstract (English): The main focus of NGS data analysis is read assembly; relatively few authors have discussed base determination and quality scores for bases. Because the accuracy of base determination affects the subsequent read assembly, and hence the analysis, it is very important if qualitative findings are to be assured in biodiversity detection and downstream statistical analysis. The cross-talk matrix has been proposed by several authors, for example Giddings et al. (1993). Applying a cross-talk matrix enhances base-call reliability and reduces the prediction error rate. In this thesis, the SN ratio is proposed as the criterion for selecting the optimal number of rows used in estimating the cross-talk matrix.
For base quality, DNA data from Miscanthus are used to build a model and to estimate the model parameters used in simulating the scattering behavior of bases. We also use the $\sqrt{m^2+d^2}$ measure proposed by Lawrence and Solovyev (1994) and the extreme behavior of an index distribution suggested by Kao (2011) as quality scores. We measure the correlation between the SN ratio and each quality score, and the simulation shows that $\sqrt{m^2+d^2}$ is positively correlated with the SN ratio.
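The selection idea summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record does not give the thesis's definition of the SN ratio or its estimation procedure, so the `sn_ratio` formula, the function names, and the model "observed intensities = crosstalk matrix times true signal" (rows are dye channels, columns are bases) are all hypothetical stand-ins.

```python
# Sketch only: function names and the SN-ratio formula are illustrative
# assumptions, not the definitions used in the thesis.
from statistics import mean, pstdev

BASES = "ACGT"


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def call_base(observed, crosstalk):
    """Undo crosstalk (observed = crosstalk @ signal), then pick the strongest channel."""
    signal = solve(crosstalk, observed)
    best = max(range(len(BASES)), key=lambda i: signal[i])
    return BASES[best], signal


def sn_ratio(signals):
    """Hypothetical SN ratio: mean peak intensity over the spread of the other channels."""
    peaks = [max(s) for s in signals]
    rest = [v for s in signals for v in sorted(s)[:-1]]
    # The small constant avoids division by zero when correction is exact.
    return mean(peaks) / (pstdev(rest) + 1e-9)


def select_matrix(candidates, observations):
    """Return the key (e.g. a row width R) whose matrix yields the largest SN ratio."""
    def score(matrix):
        return sn_ratio([call_base(obs, matrix)[1] for obs in observations])
    return max(candidates, key=lambda r: score(candidates[r]))
```

In use, `candidates` would map each row width R to the crosstalk matrix estimated from that partition, and `select_matrix(candidates, observations)` would return the R whose corrected intensities score highest under whichever SN ratio definition is substituted for the placeholder.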

Contents

1 Introduction
1.1 Background
1.2 Motivation and Purpose
2 Literature Review
2.1 DNA Sequencing
2.2 Crosstalk Matrix
2.3 Base-Calling Quality Score
3 A New Crosstalk Index and Crosstalk Matrix
3.1 Index in Selecting Crosstalk Matrix: SN ratio
3.2 A New Estimating Crosstalk Matrix Method
4 Simulation
4.1 Data Structure
4.2 Simulation Process
4.3 Simulation Results
5 Conclusion and Discussion

Reference
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F


Giddings, M.C., Brumley, R.L. Jr, Haker, M. and Smith, L.M. (1993). "An adaptive, object oriented strategy for base calling in DNA sequence analysis." Nucleic Acids Res. 21(19):4530-4540.
Kao, K.H. (2011). Issues on Cross-Talk Matrix and Quality Measures for Second-Generation Sequence Call. Department of Statistics, National Cheng Kung University.
Lawrence, C.B. and Solovyev, V.V. (1994). "Assignment of position-specific error probability to primary DNA sequence data." Nucleic Acids Res. 22(7):1272-1280.
Li, L. and Speed, T.P. (1999). "An estimate of the crosstalk matrix in four-dye fluorescence-based DNA sequencing." Electrophoresis 20(7):1433-1442.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953). "Equation of State Calculations by Fast Computing Machines." J. Chem. Phys. 21(6):1087-1092.
Saiki, R.K., Scharf, S., Faloona, F., Mullis, K.B., Horn, G.T., Erlich, H.A. and Arnheim, N. (1985). "Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia." Science, New Series 230(4732):1350-1354.
Sanger, F., Nicklen, S. and Coulson, A.R. (1977). "DNA sequencing with chain-terminating inhibitors." Proc. Natl. Acad. Sci. USA 74(12):5463-5467.
大石正道 (2002). 圖解人類基因組的構造 [An Illustrated Guide to the Structure of the Human Genome]. Taipei: 世茂.
