PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation
What’s new (20 sec)
In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals.
Why it matters (2 min)
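- Dynamical stability remains a widespread weakness of current crystal generation models: the average dynamical-stability rate across all generated structures is only 25.83%, and the top-performing model, MatterGen, reaches just 41.0%.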
- Leveraging the recently developed MatterSim interatomic potential, which achieves DFT-level accuracy in phonon predictions across more than 10,000 materials, PhononBench enables efficient…
Go deeper (8 min)
Context
In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals. Leveraging the recently developed MatterSim interatomic potential, which achieves DFT-level accuracy in phonon predictions across more than 10,000 materials, PhononBench enables efficient large-scale phonon calculations and dynamical-stability analysis for 108,843 crystal structures generated by six leading crystal generation models. PhononBench reveals a widespread limitation of current generative models in ensuring dynamical stability: the average dynamical-stability rate across all generated structures is only 25.83%, with the top-performing model, MatterGen, reaching just 41.0%. Further case studies show that in property-targeted generation (illustrated here by band-gap conditioning with MatterGen), the dynamical-stability rate remains as low as 23.5% even at the optimal band-gap condition of 0.5 eV. In space-group-controlled generation, higher-symmetry crystals exhibit better stability (e.g., cubic systems achieve rates up to 49.2%), yet the average stability across all controlled generations is still only 34.4%. An important additional outcome of this…
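To make the headline metric concrete, here is a minimal Python sketch of how a dynamical-stability rate could be computed once phonon frequencies are available. A crystal is conventionally called dynamically stable when its phonon spectrum has no imaginary modes. Everything specific below is an assumption for illustration, not the paper's pipeline: the convention of encoding imaginary modes as negative frequencies in THz, the tolerance value, the function names, and the toy data are all hypothetical. The sketch also assumes that an average "across all generated structures" is pooled over structures, which differs from an unweighted mean of per-model rates when models contribute different numbers of structures.

```python
from typing import Dict, List

# Assumption: frequencies are in THz and imaginary modes are encoded as
# negative numbers (a common convention in phonon codes). The tolerance is a
# hypothetical choice to absorb numerical noise near the zone centre.
IMAG_TOL_THZ = -0.1


def is_dynamically_stable(frequencies_thz: List[float],
                          tol: float = IMAG_TOL_THZ) -> bool:
    """Return True if no phonon mode lies below the tolerance."""
    return min(frequencies_thz) >= tol


def stability_rate(structures: List[List[float]],
                   tol: float = IMAG_TOL_THZ) -> float:
    """Fraction of structures whose phonon spectrum has no imaginary mode."""
    if not structures:
        raise ValueError("need at least one structure")
    stable = sum(is_dynamically_stable(freqs, tol) for freqs in structures)
    return stable / len(structures)


def pooled_rate(per_model: Dict[str, List[List[float]]],
                tol: float = IMAG_TOL_THZ) -> float:
    """Stability rate over all structures from all models, pooled together."""
    pooled = [freqs for structs in per_model.values() for freqs in structs]
    return stability_rate(pooled, tol)


# Toy, made-up frequency lists (one inner list per structure); not real data.
toy = {
    "model_a": [[0.0, 1.2, 3.4], [-2.1, 0.5, 2.0]],
    "model_b": [[-0.05, 0.8, 1.9], [-0.6, 0.3, 1.1],
                [-0.3, 2.2, 4.0], [-1.5, 0.1, 0.9]],
}

if __name__ == "__main__":
    for name, structs in toy.items():
        print(name, stability_rate(structs))
    print("pooled", pooled_rate(toy))
```

In a real workflow the frequency lists would come from a phonon calculation on each relaxed structure. The choice of tolerance matters, because small negative frequencies near the zone centre are often numerical noise rather than a true instability.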
For builders
Scan the abstract and experiments; look for code, datasets, and evals.
Verify
Prefer primary announcements, papers, repos, and changelogs over reposts.