Large Language Models for Solid-State Synthesis Predictions and Explanations
Name
Prof. Joshua Schrier
Affiliation
Fordham University, USA
Abstract
In this talk, I will describe recent progress on the use of large language models (LLMs) to predict and explain the synthesizability of solid-state inorganic compounds (can it be made?) and to select precursors (how can it be made?). In our initial work, we examined the ability of LLMs to make predictions given only the chemical formula of the target compound, and benchmarked pre-trained and fine-tuned LLMs against recent (traditional) machine-learning approaches based on graph convolutional neural networks.[1] Surprisingly, fine-tuned LLMs solve these problems at levels comparable to the best traditional machine-learning approaches. The relative ease, speed, and quality of this LLM-based approach suggest both its broader adoption in chemical discovery and the use of methods like these as a general baseline when reporting the performance of more traditional chemical-space prediction methods. More recently, we have extended this approach to the prediction of specific polymorphs, in which the structure is represented by a plain-text description.[2] While fine-tuned LLMs are competitive with bespoke machine-learning methods, we find that better results can be achieved by training a model on the embedding vectors of the text descriptions. We also demonstrate how to use an LLM-based workflow to generate human-readable explanations of the factors governing synthesizability, extract the underlying physical rules, and assess the veracity of those rules. Through transfer learning, these text-based models can be adapted to specialized cases where less data exists, which we demonstrate for oxide perovskites.
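For readers who want a concrete starting point, here is a minimal sketch of how a fine-tuning dataset for the formula-only synthesizability task might be assembled. The chat-style JSONL layout is the one accepted by the OpenAI fine-tuning API; the prompt wording, the Yes/No label scheme, and the toy examples are illustrative assumptions, not the exact setup used in [1].

```python
# Sketch: format (chemical formula, synthesizability label) pairs as
# chat-style fine-tuning examples in the JSONL layout accepted by the
# OpenAI fine-tuning API. Prompt wording and labels are illustrative.
import json

# Hypothetical toy labels; a real dataset would be drawn from a curated
# source of synthesized and hypothetical compounds.
pairs = [("BaTiO3", "Yes"), ("Ba2Ti9O17", "No")]

with open("synthesizability_train.jsonl", "w") as f:
    for formula, label in pairs:
        example = {
            "messages": [
                {"role": "system",
                 "content": "You answer questions about solid-state synthesis."},
                {"role": "user",
                 "content": f"Can {formula} be synthesized? Answer Yes or No."},
                {"role": "assistant", "content": label},
            ]
        }
        f.write(json.dumps(example) + "\n")
```

The embedding-based variant mentioned in the abstract can likewise be sketched in a few lines: embed each plain-text structure description into a fixed-length vector, then fit a lightweight classifier on those vectors. The sentence-transformers model and logistic-regression classifier below are stand-ins chosen for illustration; reference [2] describes the actual embedding and model choices.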
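```python
# Sketch: train a classifier on embedding vectors of plain-text
# crystal-structure descriptions. The embedding model and classifier
# are illustrative stand-ins, not the authors' exact pipeline.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def train_synthesizability_classifier(descriptions, labels):
    """descriptions: list of plain-text structure descriptions;
    labels: 1 = synthesized, 0 = not (hypothetical labeling scheme)."""
    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedder
    X = embedder.encode(descriptions)                   # one vector per text
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.2, stratify=labels, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    return clf, auc
```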
[1] S. Kim, Y. Jung, J. Schrier, “Large Language Models for Inorganic Synthesis Predictions” J. Am. Chem. Soc. 146, 29, 19654–19659 (2024) doi:10.1021/jacs.4c05840
[2] S. Kim, J. Schrier, Y. Jung, “Explainable Synthesizability Prediction of Inorganic Crystal Structures using Large Language Models” ChemRxiv (2024) doi:10.26434/chemrxiv-2024-ltncz