Graduate student (English name): Wei-You Chen
Thesis title (English): Efficient Method for High-Throughput Virtual Screening and Machine Learning Based on SMILES: Design of New Additives for Lithium-Ion Batteries
Advisor (English name): Jyh-Chiang Jiang
Oral defense committee (English names): Bing-Joe Hwang, Ming-Kang Tsai
Keywords (English): High-Throughput Virtual Screening, HTS, Machine Learning, Lithium-Ion Batteries, Additives
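With the rapid development of computer science and theoretical computational methods, high-throughput computation and machine learning have become increasingly important tools for discovering novel materials. Combined with first-principles calculations in particular, these two tools have contributed substantially to the development of new battery materials, thermoelectric and piezoelectric materials, and organic optoelectronic materials. In this study, we use high-throughput virtual screening and machine learning to design novel electrolyte additives for lithium batteries, focusing on candidates with better electrochemical properties than ethylene sulfite. The simplified molecular-input line-entry system (SMILES) is used as the structural representation. Taking ethylene sulfite as the core structure and combining it with fluoro, chloro, and alkyl derivatives, we generate a database of up to 3,649,051 distinct chemical structures. The semiempirical PM6 method is used to calculate properties including the highest occupied molecular orbital (HOMO), the lowest unoccupied molecular orbital (LUMO), the electron affinity (EA), the dipole moment (μ), and the chemical hardness (η). In the high-throughput virtual screening, we select 6 novel additives from 7,260 structures according to two important electrochemical properties: (1) a lower LUMO and (2) a smaller HOMO-LUMO gap. In the machine learning approach, we use four models, namely random forests, decision trees, extra trees, and extreme gradient boosting (XGBoost), to predict the electrochemical properties, and we successfully screen 26 novel materials that outperform ethylene sulfite from the 3,649,051 structures. Finally, the 32 materials identified by high-throughput virtual screening and machine learning together are validated with higher-level DFT calculations. Based on these theoretical results, we expect the 32 screened materials to be promising additives for next-generation lithium batteries.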
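The library-generation step described above can be pictured with a small sketch. This is not the procedure used in the thesis, which is not detailed in this record. It assumes a hypothetical SMILES template for the ethylene sulfite ring (`O=S1OCCO1`) with four substitution sites on the two ring carbons, an illustrative substituent list (hydrogen, fluoro, chloro, methyl), and it ignores stereochemistry when removing duplicates.

```python
from itertools import product

# Hypothetical template (an assumption, not taken from the thesis):
# ethylene sulfite ring with four substitution sites on the ring carbons.
TEMPLATE = "O=S1OC({a})({b})C({c})({d})O1"

# Illustrative substituents: explicit hydrogen, fluoro, chloro, methyl.
SUBSTITUENTS = ["[H]", "F", "Cl", "C"]


def enumerate_library(substituents=SUBSTITUENTS):
    """Return one SMILES string per distinct substitution pattern.

    Two patterns are treated as the same structure if they differ only by
    swapping the two substituents on one carbon or by swapping the two
    carbons. Stereochemistry is deliberately ignored in this sketch.
    """
    seen = set()
    library = []
    for a, b, c, d in product(substituents, repeat=4):
        # Order-independent key: an unordered pair of unordered pairs.
        key = tuple(sorted([tuple(sorted((a, b))), tuple(sorted((c, d)))]))
        if key in seen:
            continue
        seen.add(key)
        library.append(TEMPLATE.format(a=a, b=b, c=c, d=d))
    return library


library = enumerate_library()
print(len(library), "structures, first:", library[0])
```

With four substituents and four sites this yields only a few dozen structures. A library of millions, as reported here, would need longer alkyl chains and many more substituent combinations, and in practice a cheminformatics toolkit would be used to canonicalize the SMILES strings instead of the hand-written symmetry key.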
Rapid advances in computer technology and theoretical methodology have made high-throughput screening (HTS) and machine learning (ML) powerful tools for materials discovery. Combined with first-principles calculations in particular, these methods have an effective track record of guiding advances in a variety of fields, including the design of new battery materials, thermoelectric materials, piezoelectric materials, and organic photovoltaic materials. In this study, we used both HTS and ML to screen for efficient electrolyte additives for lithium-ion batteries, focusing on candidates with better electrochemical properties than ethylene sulfite (ES). We used the simplified molecular-input line-entry system (SMILES) as the chemical notation to construct a virtual library of up to 3,649,051 distinct chemical structures, generated automatically from fluoro, chloro, and alkyl derivatives of ES. The selected descriptors, namely HOMO, LUMO, electron affinity (EA), dipole moment (μ), and chemical hardness (η), were calculated for the molecular structures in the virtual library using the semiempirical PM6 method. In the HTS part, we screened on two significant properties of electrolyte additives relative to the core ES system: 1) a lower LUMO level and 2) a smaller HOMO-LUMO gap. Among 7,260 structures, we identified 6 new materials as promising additives because their target properties are better than those of ES. In the ML part, we used four models, namely random forests, decision trees, extra trees, and extreme gradient boosting (XGBoost), to map the material descriptors to the target properties; XGBoost proved capable of predicting the target properties accurately. Out of the 3,649,051 structures in the ML database, we screened 26 additive structures that exhibit better target properties than ES.
In addition, all candidates screened by the HTS and ML methods were further analyzed with high-level DFT calculations, and our results indicate that the 32 screened structures in total are promising electrolyte additives for next-generation lithium-ion batteries.
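The two screening criteria stated in the abstract (a lower LUMO level and a smaller HOMO-LUMO gap than ES), together with the Pareto-optimal selection named in the table of contents, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Candidate` class, the function names, and every energy value are hypothetical placeholders, not results from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    homo: float  # eV, placeholder value
    lumo: float  # eV, placeholder value

    @property
    def gap(self) -> float:
        """HOMO-LUMO gap in eV."""
        return self.lumo - self.homo


def better_than_reference(c: Candidate, ref: Candidate) -> bool:
    """Both criteria from the abstract: lower LUMO and smaller gap."""
    return c.lumo < ref.lumo and c.gap < ref.gap


def pareto_front(cands):
    """Keep candidates not dominated when minimizing both LUMO and gap."""
    front = []
    for c in cands:
        dominated = any(
            o.lumo <= c.lumo and o.gap <= c.gap
            and (o.lumo < c.lumo or o.gap < c.gap)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return front


# Placeholder numbers chosen only to exercise the logic (NOT computed data).
reference = Candidate("ES", homo=-11.0, lumo=0.5)
pool = [
    Candidate("A", homo=-11.2, lumo=-0.5),
    Candidate("B", homo=-10.0, lumo=0.2),
    Candidate("C", homo=-12.0, lumo=0.0),  # gap larger than the reference
    Candidate("D", homo=-10.5, lumo=0.3),
    Candidate("E", homo=-10.0, lumo=0.8),  # LUMO higher than the reference
]

passing = [c for c in pool if better_than_reference(c, reference)]
leads = pareto_front(passing)
print([c.name for c in passing], [c.name for c in leads])
```

Filtering against the reference first and then taking the Pareto front keeps only candidates for which no other passing structure is at least as good on both properties and strictly better on one.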
Abstract
Acknowledgments
Chapter 1 Introduction
1.1 Motivation
1.2 Lithium-ion Battery
1.3 Lithium-ion Battery Material
1.3.1 Cathode
1.3.2 Anode
1.3.3 Electrolytes
Solvent
Salt
Additives
1.4 Solid Electrolyte Interface
1.4.1 Anode-Electrolyte Interface: SEI
1.4.2 Cathode-Electrolyte Interface
1.5 High Throughput Screening (HTS)
1.6 Machine Learning (ML)
1.7 Present Study
Chapter 2 Virtual Library
2.1 Methods
2.2 Results and discussion
2.3 Conclusion
Chapter 3 High Throughput Screening (HTS)
3.1 Methods and Computational Details
3.2 Results and Discussion
3.2.1 The static data of properties
3.2.2 Screening of Lead Structures: Pareto-optimal
3.2.3 Validation of Lead Structures by DFT
3.3 Conclusion
Chapter 4 Machine Learning
4.1 Methods and Computational Details
4.1.1 Random Forest (RF)
4.1.2 Decision Trees (DTs)
4.1.3 Extremely Randomized Trees (Extra-Trees)
4.1.4 Extreme gradient boosting (XGBoost)
4.2 Results and Discussion
4.2.1 Choice of ML models
4.2.2 Validation of ML model
4.2.3 Screening of Lead Structures: Pareto-optimal
4.2.4 Validation of Lead Structures by DFT
4.2.5 Comparison of HTS and ML methods
4.3 Conclusion
Chapter 5 Summary
References
Appendix
