

Environmental regulations have intensified demand for green solvents, but discovery is limited by the narrow coverage of Solvent Selection Guides (SSGs), which quantify solvent sustainability. By training a machine learning model on the GlaxoSmithKline SSG, GreenSolventDB, a database of sustainability metrics for 10,189 solvents, is developed. Integrated with Hansen solubility metrics, the pipeline identifies greener solvents with solubility similar to that of hazardous ones, accelerating green solvent discovery.

Abstract
Strict environmental regulations have intensified the demand for green solvents that can replace hazardous ones without compromising performance. Existing methods for estimating solvent sustainability rely on Solvent Selection Guides (SSGs), which assign scores based on environmental, health, safety, and waste (EHSW) criteria but cover as few as 200 solvents. Expanding these guides is tedious because it requires over 30 properties per solvent, many of which are often unavailable. Moreover, identifying greener alternatives within the limited SSG pool is challenging because conflicting criteria such as sustainability, cost, and performance must be balanced. To address these limitations, a data-driven pipeline is presented for assessing the sustainability of solvents and identifying greener substitutes. Three models are trained and evaluated on the GlaxoSmithKline Solvent Sustainability Guide (GSK SSG) to predict "greenness" metrics: a traditional Gaussian Process Regression (GPR) model, a fine-tuned GPT model (FT GPT), and a GPT model using in-context learning (ICL). GPR slightly outperforms the language-based GPT models and is therefore used to evaluate 10,189 solvents, forming GreenSolventDB, the largest public database of green solvent metrics. These predictions are combined with Hansen solubility parameter-based metrics to identify greener solvents whose solubility behavior is similar to that of hazardous solvents. The approach is validated through case studies on benzene and diethyl ether, in which the predicted alternatives align well with known greener substitutes. Building on this success, novel alternatives are proposed for the hazardous solvents listed in the GSK SSG. This framework for quantifying solvent sustainability and identifying greener substitutes is expected to significantly accelerate the discovery and adoption of environmentally friendly solvents.

Advanced Science, EarlyView.
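
The substitute-screening idea in the abstract (keep candidates that are greener than a hazardous solvent and close to it in Hansen space) can be illustrated with a minimal Python sketch. It uses the standard Hansen distance, Ra^2 = 4(dD1 - dD2)^2 + (dP1 - dP2)^2 + (dH1 - dH2)^2; whether the authors use exactly this metric is an assumption. The single composite greenness score, the cutoff of 5.0, the solvent names, and all numerical values are hypothetical placeholders, not data from GreenSolventDB.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Solvent:
    name: str
    delta_d: float    # dispersion parameter, MPa^0.5
    delta_p: float    # polar parameter, MPa^0.5
    delta_h: float    # hydrogen-bonding parameter, MPa^0.5
    greenness: float  # hypothetical composite score, higher = greener


def hansen_distance(a: Solvent, b: Solvent) -> float:
    """Standard Hansen distance Ra between two solvents."""
    return math.sqrt(
        4.0 * (a.delta_d - b.delta_d) ** 2
        + (a.delta_p - b.delta_p) ** 2
        + (a.delta_h - b.delta_h) ** 2
    )


def greener_substitutes(target, candidates, max_ra=5.0):
    """Return (Ra, solvent) pairs that are greener than the target
    and within max_ra of it, sorted by increasing distance."""
    hits = []
    for cand in candidates:
        if cand.greenness <= target.greenness:
            continue  # not greener than the solvent being replaced
        ra = hansen_distance(target, cand)
        if ra <= max_ra:
            hits.append((ra, cand))
    return sorted(hits, key=lambda pair: pair[0])


# Placeholder values chosen only to exercise the function.
target = Solvent("hazardous_target", 18.0, 1.0, 2.0, 3.0)
candidates = [
    Solvent("candidate_a", 17.0, 1.0, 2.0, 7.0),
    Solvent("candidate_b", 18.0, 4.0, 6.0, 8.0),
    Solvent("candidate_c", 18.0, 1.0, 2.5, 2.0),   # less green than target
    Solvent("candidate_d", 15.0, 9.0, 12.0, 9.0),  # too far in Hansen space
]

results = greener_substitutes(target, candidates)
for ra, solvent in results:
    print(f"{solvent.name}: Ra = {ra:.2f}, greenness = {solvent.greenness}")
```

In practice the greenness values would come from a trained model's predictions rather than from hand-entered numbers, and the distance cutoff would need to be chosen for the application at hand.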
Medical Journal
|15th Jan, 2026
|Nature Medicine's Advance Online Publication (AOP) table of contents.
Medical Journal
|15th Jan, 2026
|Wiley