
National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

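Graduate student: 林永泰
Graduate student (romanized): Lin, Yong-Tai
Thesis title: 基於跨步卷積預測第四位點胞嘧啶甲基化
Thesis title (English): DNA 4mC Site Prediction Method based on Strided Convolution
Advisor: 張天豪
Advisor (romanized): Chang, Tien-Hao
Oral defense committee: 吳謂勝, 劉宗霖, 陳倩瑜
Oral defense date: 2021-07-23
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Institute of Computer and Communication Engineering (電腦與通信工程研究所)
Discipline: Engineering
Subject category: Electrical and Information Engineering
Thesis type: Academic thesis
Publication year: 2021
Graduation academic year: 109 (ROC calendar)
Language: Chinese
Number of pages: 39
Keywords (Chinese): 第四位點胞嘧啶甲基化, 深度學習, 跨步卷積
Keywords (English): DNA N4-Methylcytosine, Deep Learning, Strided Convolution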
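N4-methylcytosine (4mC) is an important epigenetic modification involved in various biological processes, and accurately identifying 4mC sites helps researchers understand epigenetic functions and mechanisms. 4mC can be identified through biological experiments such as DNA methylation sequencing, but such experiments are cumbersome and expensive, so in recent years many studies have developed accurate computational methods for identifying 4mC sites.
Most previously proposed computational methods rely on machine learning, with input features designed from the physicochemical properties of DNA, which requires substantial biological expertise. This study proposes a model based on a convolutional neural network that learns the features of DNA sequences on its own. Compared with other existing prediction methods, the proposed model achieves the best Matthews correlation coefficients on nematode (C. elegans), fruit fly (D. melanogaster), Arabidopsis (A. thaliana), and E. coli: 0.530, 0.534, 0.566, and 0.120, respectively.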
DNA N4-methylcytosine (4mC) is an important epigenetic modification involved in various biological processes. Accurate identification of 4mC sites is essential to understanding its biological functions and mechanisms. 4mC can be identified through biological experiments such as DNA methylation sequencing, but these experiments are cumbersome and expensive. An accurate computational method for identifying 4mC sites in multiple species is therefore needed.
Most previously proposed computational methods use machine learning, with features designed mostly from the physical and chemical properties of DNA, which requires considerable domain expertise. This work proposes a deep learning model based on a strided convolution network, which allows the model to learn the features of DNA sequences on its own, without manually designed features.
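To make the idea of a strided convolution over a DNA sequence concrete, here is a minimal sketch in plain Python. The one-hot encoding, the example sequence, the filter weights, and the stride values are all illustrative assumptions; this record does not give the thesis's actual encoding or architecture.

```python
# Illustrative only: the encoding, filter, and stride are assumptions,
# not values taken from the thesis.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def strided_conv1d(x, kernel, stride):
    """Valid 1D cross-correlation of a single filter, moved `stride` positions per step.

    x:      list of L rows, each holding C channel values
    kernel: list of K rows, each holding C weights
    Returns floor((L - K) / stride) + 1 output values.
    """
    k = len(kernel)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        window = x[start:start + k]
        out.append(sum(w * v
                       for krow, xrow in zip(kernel, window)
                       for w, v in zip(krow, xrow)))
    return out

seq = "ACGTCCGA"
x = one_hot(seq)

# A hand-set, hypothetical filter that responds to the dinucleotide "CG".
cg_filter = [[0.0, 1.0, 0.0, 0.0],   # C at offset 0
             [0.0, 0.0, 1.0, 0.0]]   # G at offset 1

dense = strided_conv1d(x, cg_filter, stride=1)    # one output per position
strided = strided_conv1d(x, cg_filter, stride=2)  # every second position only

print(len(dense), len(strided))  # 7 4
```

With a stride of 2 the same filter produces 4 outputs instead of 7, so the layer shortens the sequence while applying its weights. The filter here is set by hand; in a trained network the weights would be learned from data.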
The proposed model is compared with existing 4mC prediction methods. According to our results, it achieves the best Matthews correlation coefficient on multiple species: C. elegans (0.530), D. melanogaster (0.534), A. thaliana (0.566), and E. coli (0.120).
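For reference, the Matthews correlation coefficient (MCC) reported above is computed from the four cells of a binary confusion matrix. The sketch below uses the standard definition; the function name and the counts are hypothetical and are not results from the thesis.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 (total disagreement) through 0 (chance level) to +1
    (perfect prediction). Returns 0.0 when the denominator is zero, a
    common convention for degenerate confusion matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for 100 positive and 100 negative test sequences.
mcc = matthews_corrcoef(tp=80, tn=70, fp=30, fn=20)
print(round(mcc, 3))  # 0.503
```

Unlike accuracy, MCC uses all four confusion-matrix cells, so a predictor that labels everything as one class scores 0 rather than appearing accurate on an imbalanced test set.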
Chapter 1 Introduction
Chapter 2 Related Work
2.1 DNA Methylation
2.2 DNA N4-Methylcytosine Prediction Methods
2.2.1 iDNA4mC
2.2.2 4mCPred
2.2.3 4mcPred-SVM
2.2.4 4mcPred-IFL
2.2.5 Meta-4mCpred
2.2.6 4mCCNN
2.2.7 DNA4mC-LIP
2.2.8 Comparison and Analysis of Web-Based N4-Methylcytosine Site Prediction Tools
2.3 Convolutional Neural Network (CNN)
2.3.1 Convolutional Layer
2.3.2 Strided Convolutional Layer
2.3.3 Fully Connected Layer
Chapter 3 Methods
3.1 Datasets
3.1.1 Chen Dataset
3.1.2 Manavalan Dataset
3.2 Data Encoding
3.3 Model Training and Testing Procedure
3.4 Model Architecture
3.5 Model Training Configuration
Chapter 4 Experimental Results
4.1 Performance Evaluation Metrics
4.2 Per-Species Performance Evaluation
4.2.1 Chen Dataset
4.2.2 Manavalan Dataset
4.3 Comparison of Strided Convolution and Pooling Layers
4.4 Cross-Species Prediction Analysis
4.5 Training the Model on Merged Species Samples
4.6 Training on the Manavalan Dataset and Testing on the Chen Dataset
Chapter 5 Conclusion
5.1 Conclusion
5.2 Future Work
References
[1] X. Cheng, "DNA modification by methyltransferases," Current opinion in structural biology, vol. 5, no. 1, pp. 4-10, Feb 1995.
[2] P. Modrich, "Mechanisms and biological effects of mismatch repair," Annual review of genetics, vol. 25, pp. 229-53, 1991.
[3] W. Chen, H. Yang, P. Feng, H. Ding, and H. Lin, "iDNA4mC: identifying DNA N4-methylcytosine sites based on nucleotide chemical properties," Bioinformatics (Oxford, England), vol. 33, no. 22, pp. 3518-3523, Nov 15 2017.
[4] W. He, C. Jia, and Q. Zou, "4mCPred: machine learning methods for DNA N4-methylcytosine sites prediction," Bioinformatics (Oxford, England), vol. 35, no. 4, pp. 593-601, Feb 15 2019.
[5] A. S. Nair and S. P. Sreenadhan, "A coding measure scheme employing electron-ion interaction pseudopotential (EIIP)," Bioinformation, vol. 1, no. 6, pp. 197-202, Oct 7 2006.
[6] L. Wei, S. Luan, L. A. E. Nagai, R. Su, and Q. Zou, "Exploring sequence-based features for the improved prediction of DNA N4-methylcytosine sites in multiple species," Bioinformatics (Oxford, England), vol. 35, no. 8, pp. 1326-1333, Apr 15 2019.
[7] L. Wei et al., "Iterative feature representations improve N4-methylcytosine site prediction," Bioinformatics (Oxford, England), vol. 35, no. 23, pp. 4930-4937, Dec 1 2019.
[8] B. Manavalan, S. Basith, T. H. Shin, L. Wei, and G. Lee, "Meta-4mCpred: A Sequence-Based Meta-Predictor for Accurate DNA 4mC Site Prediction Using Effective Feature Representation," Molecular therapy. Nucleic acids, vol. 16, pp. 733-744, Jun 7 2019.
[9] L. Wei et al., "Iterative feature representations improve N4-methylcytosine site prediction," Bioinformatics (Oxford, England), vol. 35, no. 23, pp. 4930-4937, Dec 1 2019.
[10] J. Khanal, I. Nazari, H. Tayara, and K. T. Chong, "4mCCNN: Identification of N4-Methylcytosine Sites in Prokaryotes Using Convolutional Neural Network," IEEE Access, vol. 7, pp. 145455-145461, 2019.
[11] Q. Tang et al., "DNA4mC-LIP: a linear integration method to identify N4-methylcytosine site in multiple species," Bioinformatics (Oxford, England), vol. 36, no. 11, pp. 3327-3335, Jun 1 2020.
[12] B. Manavalan, M. M. Hasan, S. Basith, V. Gosu, T. H. Shin, and G. Lee, "Empirical Comparison and Analysis of Web-Based DNA N4-Methylcytosine Site Prediction Tools," Molecular therapy. Nucleic acids, vol. 22, pp. 406-420, Dec 4 2020.
[13] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov 1998.