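Accurately predicting material properties from crystal structure plays a key role in materials science. Once a candidate material has been identified, it must go through a series of experiments or extensive density functional theory (DFT) calculations, which can take hours, days, or even months depending on the complexity of the system. Accurately predicting the properties of interest before synthesis is therefore very useful for prioritizing simulation and experimental resources.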
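Composition-only predictive models help screen and identify candidate materials without structural input, but they cannot distinguish between the structural polymorphs of a given composition. Moreover, because different structures of the same composition can have drastically different properties, the predictions of composition-only models may deviate substantially from the true values. These shortcomings can be mitigated by including structure-based inputs in the training dataset. Structure-based models therefore offer greater potential than composition-based models for advancing discovery in materials science.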
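Vishu Gupta et al., from the Department of Electrical and Computer Engineering at Northwestern University (USA), proposed a framework for materials property prediction tasks. The framework combines advanced data mining techniques with a structure-aware graph neural network (GNN) to improve predictive performance for materials properties with sparse data. The researchers first use a deep learning architecture based on a structure-aware GNN to capture the underlying chemical information in existing large datasets that contain crystal structure information. The learned knowledge is then transferred to the sparse dataset to develop reliable and accurate target models. The authors evaluated the framework in cross-property and cross-materials-class scenarios using 115 datasets and found that the transfer learning models outperformed models trained from scratch in 104 cases (≈90%). The transfer learning models also showed additional performance benefits on extrapolation problems.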
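As a rough illustration of the pretrain-then-transfer idea described above, the Python sketch below pretrains a model on a large "source" dataset and reuses its learned features on a small "target" dataset. It is not the authors' code. The structure-aware GNN is replaced by a tiny one-hidden-layer network, the data are synthetic random vectors, and every name (`init_params`, `train`, `freeze_features`, `make_data`, and so on) is hypothetical. Freezing the features and refitting only the output layer is one common transfer variant; the text here does not say which variant the authors used.

```python
import numpy as np

def init_params(rng, n_in, n_hidden):
    """Random weights for a one-hidden-layer tanh network (stand-in for a GNN)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, 0.5, n_hidden),
        "b2": np.zeros(1),
    }

def predict(params, X):
    hidden = np.tanh(X @ params["W1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"][0]

def train(params, X, y, steps=300, lr=0.05, freeze_features=False):
    """Gradient descent on half the mean squared error; returns a new dict."""
    p = {k: v.copy() for k, v in params.items()}
    for _ in range(steps):
        hidden = np.tanh(X @ p["W1"] + p["b1"])
        err = (hidden @ p["w2"] + p["b2"][0] - y) / len(y)
        grad_w2 = hidden.T @ err
        grad_b2 = err.sum()
        if not freeze_features:
            # Backpropagate through tanh into the feature layer.
            d_hidden = np.outer(err, p["w2"]) * (1.0 - hidden ** 2)
            p["W1"] -= lr * (X.T @ d_hidden)
            p["b1"] -= lr * d_hidden.sum(axis=0)
        p["w2"] -= lr * grad_w2
        p["b2"] -= lr * grad_b2
    return p

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def make_data(rng, n, A, coef):
    """Synthetic property: a linear readout of shared hidden features tanh(X A)."""
    X = rng.normal(size=(n, A.shape[0]))
    return X, np.tanh(X @ A) @ coef

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))                    # hidden structure shared by both properties
X_src, y_src = make_data(rng, 2000, A, rng.normal(size=4))  # large source dataset
coef_tgt = rng.normal(size=4)
X_tgt, y_tgt = make_data(rng, 30, A, coef_tgt)              # sparse target dataset
X_test, y_test = make_data(rng, 500, A, coef_tgt)           # fixed target test set

# Baseline: train on the sparse target data from scratch.
scratch = train(init_params(rng, 5, 8), X_tgt, y_tgt)

# Transfer: pretrain on the source data, then refit only the output layer.
pretrained = train(init_params(rng, 5, 8), X_src, y_src)
transferred = train(pretrained, X_tgt, y_tgt, freeze_features=True)

print("scratch MAE :", mae(y_test, predict(scratch, X_test)))
print("transfer MAE:", mae(y_test, predict(transferred, X_test)))
```

Which of the two models scores better in this toy setting depends on the random seed and hyperparameters, so the printed numbers say nothing about the results reported in the paper.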
Fig. 3 Training curves for predicting formation energy in the JARVIS dataset for different training data sizes on a fixed test set.
Fig. 4 Prediction error analysis, with mean absolute error (MAE) as the error metric, for predicting formation energy in the JARVIS dataset using the best scratch (SC) and best transfer learning (TL) models.
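For reference, the MAE metric named in the Fig. 4 caption has the standard definition below. The symbols are notation chosen here, not taken from the paper: $y_i$ is the reference value (for example, a DFT formation energy), $\hat{y}_i$ is the model prediction, and $n$ is the number of test samples.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
```

A lower MAE means predictions are closer to the reference values on average, so comparing the SC and TL models amounts to comparing this quantity on the same test set.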
Editorial Summary
Accurate materials property prediction using crystal structure occupies a primary and often critical role in materials science. Upon identification of a candidate material, one has to go through either a series of hands-on experiments or intensive density functional theory calculations which can take hours to days to even months depending on the complexity of the system. Hence, the ability to accurately predict the properties of interest of the material prior to synthesis can be extremely useful to prioritize available resources for simulations and experiments. Although composition-only based predictive models can be helpful for screening and identifying potential material candidates without the need for structure as an input, they are by design not capable of distinguishing between structure polymorphs of a given composition. Further, composition-only based models could potentially have substantial errors in the predicted values as compared to ground truth, as different structure polymorphs of a given composition can have drastically different properties. These shortcomings can be mitigated by incorporating structure-based inputs, and hence structure-based modeling presents bigger opportunities than composition-based modeling to advance the discovery process in the field of materials science.
Vishu Gupta et al. from the Department of Electrical and Computer Engineering, Northwestern University, presented a framework for materials property prediction tasks that combines advanced data mining techniques with a structure-aware graph neural network (GNN) to improve the predictive performance of the model for materials properties with sparse data. They first applied a structure-aware GNN-based deep learning architecture to capture the underlying chemistry associated with the existing large data containing crystal structure information. The resulting knowledge learned was then transferred and used during training on the sparse dataset to develop reliable and accurate target models. The researchers evaluated the proposed framework in cross-property and cross-materials class scenarios using 115 datasets to find that transfer learning models outperform the models trained from scratch in 104 cases, i.e., ≈90%, with additional benefits in performance for extrapolation problems. The significant improvements gained by using the proposed framework are expected to be useful for materials science researchers to more gainfully utilize data mining techniques to help screen and identify potential material candidates more reliably and accurately for accelerating materials discovery. This article was recently published in npj Computational Materials 10: 1 (2024).
Original Abstract and Its Translation
Structure-aware graph neural network based deep transfer learning framework for enhanced predictive analytics on diverse materials datasets (基于结构感知图神经网络的深度迁移学习框架:应用于不同材料数据集的增强预测分析)
Vishu Gupta, Kamal Choudhary, Brian DeCost, Francesca Tavazza, Carelyn Campbell, Wei-keng Liao, Alok Choudhary & Ankit Agrawal
Abstract Modern data mining methods have demonstrated effectiveness in comprehending and predicting materials properties. An essential component in the process of materials discovery is to know which material(s) will possess desirable properties. For many materials properties, performing experiments and density functional theory computations are costly and time-consuming. Hence, it is challenging to build accurate predictive models for such properties using conventional data mining methods due to the small amount of available data. Here we present a framework for materials property prediction tasks using structure information that leverages graph neural network-based architecture along with deep-transfer-learning techniques to drastically improve the model’s predictive ability on diverse materials (3D/2D, inorganic/organic, computational/experimental) data. We evaluated the proposed framework in cross-property and cross-materials class scenarios using 115 datasets to find that transfer learning models outperform the models trained from scratch in 104 cases, i.e., ≈90%, with additional benefits in performance for extrapolation problems. We believe the proposed framework can be widely useful in accelerating materials discovery in materials science.
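Abstract (translation): Modern data mining methods have proven highly effective at understanding and predicting materials properties. An important step in materials discovery is knowing which materials will have desirable properties. For many materials properties, experiments and density functional theory calculations are expensive and time-consuming. Because little data is therefore available, building accurate predictive models for these properties with conventional data mining methods is very challenging. Here we present a framework for materials property prediction tasks that uses structure information, combining a graph neural network-based architecture with deep transfer learning techniques to significantly improve the model's predictive ability on diverse materials data (3D/2D, inorganic/organic, computational/experimental). We evaluated the proposed framework in cross-property and cross-materials-class scenarios using 115 datasets and found that transfer learning models outperform models trained from scratch in 104 cases (≈90%). Transfer learning models also show additional performance benefits on extrapolation problems. We believe the proposed framework can be widely applied to accelerate materials discovery in materials science.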
Original article by 计算搬砖工程师. If you repost it, please credit 华算科技 and cite the source: https://www.v-suan.com/index.php/2024/02/25/87f2fabbcf/