Methodology
Reference data
The DFT ground truth is the ASSYST campaign: ~7M structures stored in
the dft_surrogate_mlip_assyst Postgres database. Convention details are
captured in [our wiki](https://… link TBD).
Per-element offset fit
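Each potential's raw MLIP energy can be off the DFT scale by an
element-dependent constant. We fit per-element shifts Δμᵢ by minimising the
sum over all structures of the squared residual
[(E_MLIP − E_DFT) − Σᵢ Nᵢ · Δμᵢ]², where Nᵢ is the number of atoms of
element i in the structure. The solution follows from the normal equations
(AᵀA)Δμ = Aᵀb, with A the structures × elements matrix of counts Nᵢ and b
the vector of E_MLIP − E_DFT. Two flavours are computed: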
- global: simultaneous least-squares over all elements; reported in the
  _corrected_global columns
- unary: per-element mean over single-element structures only
The site's primary view uses the global correction.
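
As an illustration of the two flavours, here is a minimal NumPy sketch. The function names, the toy data, and the choice of `np.linalg.lstsq` (which solves the same least-squares problem as the normal equations but tolerates a rank-deficient count matrix) are assumptions made for this example, not the project's actual code. The unary variant is assumed to average the per-atom energy difference.

```python
import numpy as np


def fit_global_offsets(counts, e_mlip, e_dft):
    """Simultaneous least-squares shifts for all elements.

    counts[s, i] is the number of atoms of element i in structure s;
    e_mlip and e_dft are total energies per structure in eV.
    """
    A = np.asarray(counts, dtype=float)
    b = np.asarray(e_mlip, dtype=float) - np.asarray(e_dft, dtype=float)
    # Equivalent to solving (A^T A) mu = A^T b when A has full column rank.
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu


def fit_unary_offsets(counts, e_mlip, e_dft):
    """Per-element mean of dE/atom over single-element structures only.

    Assumption: the mean is taken over the per-atom energy difference.
    Elements without any unary structure get NaN.
    """
    A = np.asarray(counts, dtype=float)
    b = np.asarray(e_mlip, dtype=float) - np.asarray(e_dft, dtype=float)
    n_atoms = A.sum(axis=1)
    mu = np.full(A.shape[1], np.nan)
    for i in range(A.shape[1]):
        unary = (A[:, i] == n_atoms) & (n_atoms > 0)
        if unary.any():
            mu[i] = np.mean(b[unary] / n_atoms[unary])
    return mu


# Toy data (hypothetical): two elements, four structures, exact shifts.
counts = np.array([[4, 0], [0, 2], [2, 2], [1, 3]])
true_mu = np.array([0.5, -1.25])
e_dft = np.array([-10.0, -6.0, -14.0, -12.0])
e_mlip = e_dft + counts @ true_mu

mu_global = fit_global_offsets(counts, e_mlip, e_dft)
mu_unary = fit_unary_offsets(counts, e_mlip, e_dft)
```

On noise-free toy data both variants recover the same shifts. On real data they generally differ, because the global fit also uses multi-element structures.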
Outlier filter
Before fitting and before reporting per-element RMSE/MAE, structures with
|(E_MLIP − E_DFT) / n_atoms| ≥ 50 eV/atom are dropped. These are MLIP
numerical failures, not real reference offsets — without the filter a single
runaway structure can inflate a per-element RMSE by 10–100×.
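
The effect of the filter can be sketched in a few lines of plain Python. The helper names and the three toy rows below are hypothetical; only the 50 eV/atom threshold and the "greater than or equal is dropped" rule come from the text above.

```python
THRESHOLD_EV_PER_ATOM = 50.0


def keep_structure(e_mlip, e_dft, n_atoms, threshold=THRESHOLD_EV_PER_ATOM):
    """Return True if the per-atom energy error is strictly below the threshold."""
    return abs((e_mlip - e_dft) / n_atoms) < threshold


def rmse(values):
    """Root-mean-square of a non-empty list of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


# Toy rows (hypothetical): (E_MLIP, E_DFT, n_atoms); the last one is a runaway.
rows = [(-10.1, -10.0, 4), (-6.2, -6.0, 2), (4000.0, -8.0, 4)]

kept = [r for r in rows if keep_structure(*r)]

rmse_all = rmse([(m - d) / n for m, d, n in rows])
rmse_filtered = rmse([(m - d) / n for m, d, n in kept])
```

In this toy case the single runaway row dominates the unfiltered RMSE entirely, while the filtered value reflects only the two well-behaved structures.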
Metric definitions
| Metric | Definition |
|---|---|
| E_rmse_corrected | √mean((dE/atom − offset)²) over structures grouped by structure.element |
| E_mae_corrected | mean(\|dE/atom − offset\|), same grouping |
| F_rmse_comp | √mean(\|F_MLIP − F_DFT\|²) over atoms, grouped by atomic species |
| F_mae_comp | mean(\|F_MLIP − F_DFT\|), same |
| F_radial_rmse | √mean((\|F_MLIP\| − \|F_DFT\|)²) over atoms (magnitude-only error) |
| F_radial_mae | mean(\|\|F_MLIP\| − \|F_DFT\|\|), same |
| F_ang_med_deg | median angle(F_MLIP, F_DFT) per atomic species |
| S_rmse_voigt | √mean(δs²) over 6 Voigt components of stress, grouped by structure.element |
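
To make the force definitions concrete, the following NumPy sketch evaluates them for the atoms of a single species. It assumes that |·| on a force vector means the Euclidean norm and that atoms with a zero-magnitude force are skipped for the angle; the function name and toy forces are hypothetical, and the grouping by species is left to the caller.

```python
import numpy as np


def force_metrics(f_mlip, f_dft):
    """Force error metrics for one group of atoms; inputs have shape (n_atoms, 3)."""
    f_mlip = np.asarray(f_mlip, dtype=float)
    f_dft = np.asarray(f_dft, dtype=float)

    # Norm of the difference vector per atom (assumed meaning of |F_MLIP - F_DFT|).
    diff = np.linalg.norm(f_mlip - f_dft, axis=1)

    # Magnitude-only ("radial") error per atom.
    mag_mlip = np.linalg.norm(f_mlip, axis=1)
    mag_dft = np.linalg.norm(f_dft, axis=1)
    radial = mag_mlip - mag_dft

    # Angle between the two force vectors; undefined if either one is zero.
    denom = mag_mlip * mag_dft
    valid = denom > 0
    cos = np.sum(f_mlip * f_dft, axis=1)[valid] / denom[valid]
    angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

    return {
        "F_rmse_comp": float(np.sqrt(np.mean(diff ** 2))),
        "F_mae_comp": float(np.mean(diff)),
        "F_radial_rmse": float(np.sqrt(np.mean(radial ** 2))),
        "F_radial_mae": float(np.mean(np.abs(radial))),
        "F_ang_med_deg": float(np.median(angles)) if angles.size else float("nan"),
    }


# Toy forces (hypothetical): one atom rotated by 90 degrees, one atom doubled.
f_dft = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
f_mlip = [[0.0, 1.0, 0.0], [0.0, 4.0, 0.0]]
metrics = force_metrics(f_mlip, f_dft)
```

The toy atoms show why the radial and angular metrics are reported separately: the first atom has the right magnitude but the wrong direction, and the second has the right direction but the wrong magnitude.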
Element grouping caveat
The DB column structure.element is the marker assigned at ingest. It is
not consistent across binary datasets — Binary_Fe_Mo rows are labelled
Fe, while Binary_Fe_B rows are labelled B. We follow this convention to
match the upstream offset-fit CSVs. Force metrics use per-atom species
(atomic-number-derived) grouping, which is consistent.