National Digital Library of Theses and Dissertations in Taiwan

Detailed Record

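Graduate student: 邱諄擇 (CHIU, CHUN-TSE)
Thesis title (translated from Chinese): Applying a Transformer Model with SMILES Molecular Fingerprint Input to the Prediction of Pure-Substance Thermodynamic Properties
Thesis title (English, as registered): Using Transformer Model with SMILES Fingerprint Input Predict Pure Component Thermodynamic Properties
Advisor: 康嘉麟 (KANG, JIA-LIN)
Oral defense committee: 汪上曉 (WONG, SHAN-HILL); 姚遠 (YAO, YUAN)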
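Oral defense date: 2022-07-21
Degree: Master's
Institution: National Yunlin University of Science and Technology (國立雲林科技大學)
Department: Department of Chemical and Materials Engineering
Field: Engineering
Discipline: Chemical Engineering
Thesis type: Academic thesis
Year of publication: 2022
Graduation academic year: 110 (ROC calendar)
Language: Chinese
Number of pages: 57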
Keywords (translated from Chinese): Sigma profile; Transformer; activity coefficient; molecular property prediction
Keywords (English): Sigma profile; Transformer; activity coefficient; molecular property
Abstract (Chinese)
ABSTRACT
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1. Research Background
1.2. Research Objectives
Chapter 2. Literature Review
2.1. Molecular Property Prediction
2.1.1. Sequence Neural Networks
2.1.2. Graph Neural Networks
2.2. Natural Language Processing
2.3. Sigma Profile
2.4. Conductor-like Screening Model
Chapter 3. Methodology
3.1. Data Preprocessing
3.1.1. K-mers
3.1.2. SMILES Numerical Encoding
3.1.3. Standardization
3.2. Model Architecture
3.2.1. Transformer Model Architecture
3.2.2. Attention Mechanism
3.2.3. COSMOSAC Activity Coefficient Loss
3.2.4. GDB9 Property Prediction Model
3.3. Loss Functions
3.3.1. Sigma Profile Prediction Model
3.3.2. GDB9 Property Prediction Model
3.4. Model Evaluation
Chapter 4. Case Studies
4.1. Sigma Profile Property Prediction
4.2. Activity Coefficient Prediction
4.3. Molecular Property Prediction
Chapter 5. Results and Discussion
5.1. HSPiP Sigma Profile Prediction Performance
5.1.1. Effect of K-mers on Prediction Performance
5.1.2. Effect of Adding gamma-loss on the Model
5.2. Comparison of Prediction Performance on the Gaussian09 Database
5.2.1. Effect of K-mers on Prediction Performance
5.2.2. Comparison of the Effect of Adding gamma-loss on Prediction Performance
5.2.3. Model Visualization Analysis
5.3. GDB9 Property Prediction Results
Chapter 6. Future Work
Chapter 7. Conclusions
References



Electronic full text (Internet release date: 2027-06-01)