33. MgO from Materials Project: vqfetch end-to-end
This tutorial exercises the v0.8.0 vqfetch external-data
integration: pull an MgO crystal structure from
Materials Project,
emit a regression PeriodicSpec + an executable input
script, and run a periodic SCF with full provenance preserved.
You will:
1. Install the [fetch] optional extra.
2. Use vqfetch mp --id mp-1265 to pull the MgO rocksalt structure.
3. Inspect the emitted SPEC and input-script files.
4. Run the SCF and verify the per-record provenance bundle in the SCF log header.
Background: see
user_guide/external_structures.md
for the full vqfetch CLI and provenance contract, and
docs/license.md
for the per-source licensing inventory.
Setup
vqfetch is part of the optional [fetch] extra:
pip install -e '.[fetch]'
This pulls in optimade, ase, beautifulsoup4, and lxml.
After install, the vqfetch console script is on $PATH.
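To confirm the install took effect, a small check along these lines can help. It is a sketch, not part of vibe-qc: the helper names are made up for illustration, and the only facts it relies on are the package list above and that beautifulsoup4 is imported as bs4.

```python
import importlib.util
import shutil

# Import names for the packages the [fetch] extra pulls in.
# beautifulsoup4 installs under the import name "bs4".
FETCH_MODULES = ["optimade", "ase", "bs4", "lxml"]


def missing_fetch_modules(modules=FETCH_MODULES):
    """Return the subset of `modules` that cannot be located for import."""
    return [m for m in modules if importlib.util.find_spec(m) is None]


def vqfetch_on_path():
    """Return the resolved path of the vqfetch console script, or None."""
    return shutil.which("vqfetch")


if __name__ == "__main__":
    print("missing modules:", missing_fetch_modules() or "none")
    print("vqfetch script: ", vqfetch_on_path() or "not found on $PATH")
```

An empty list and a resolved script path indicate the extra is installed in the active environment.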
Fetch + emit
Materials Project ID mp-1265 is the rocksalt MgO entry:
vqfetch mp --id mp-1265 --basis sto-3g --method rks-lda
# Output (one path per line; both written to disk):
# examples/regression/systems/periodic/mp_mp-1265.py
# examples/scf-mp_mp-1265.py
For a smoke test (recommended on first run), append --quick
to force sto-3g:
vqfetch mp --id mp-1265 --quick
What gets written
The emitted SPEC module is plain Python data that the regression suite can import directly:
# examples/regression/systems/periodic/mp_mp-1265.py
from examples.regression.core.spec import (
    PeriodicSpec, Provenance,
)

mp_mp_1265 = PeriodicSpec(
    id="mp_mp-1265",
    formula="MgO",
    lattice_vectors=[[0.0, 2.106, 2.106],
                     [2.106, 0.0, 2.106],
                     [2.106, 2.106, 0.0]],
    atoms=[("Mg", [0.0, 0.0, 0.0]),
           ("O", [2.106, 2.106, 2.106])],
    recommended_basis="sto-3g",
    provenance=Provenance(
        source_db="materials_project",
        source_id="mp-1265",
        source_url="https://next-gen.materialsproject.org/materials/mp-1265",
        license="CC-BY 4.0",
        fetched_at="2026-05-09T21:30:00+00:00",
    ),
)
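Before running anything, it can be worth checking that the emitted numbers describe the cell you expect. The sketch below is standard-library Python, independent of vibe-qc, with the values copied from the SPEC above. It computes the primitive-cell volume, the conventional cubic edge implied by an fcc primitive cell (V = a³/4), and the shortest Mg-O separation over neighbouring periodic images. The function names are illustrative only.

```python
import itertools
import math

# Values copied from the emitted SPEC.
LATTICE = [[0.0, 2.106, 2.106],
           [2.106, 0.0, 2.106],
           [2.106, 2.106, 0.0]]
MG_POS = [0.0, 0.0, 0.0]
O_POS = [2.106, 2.106, 2.106]


def cell_volume(lat):
    """Cell volume as the absolute triple product |a1 . (a2 x a3)|."""
    a1, a2, a3 = lat
    cross = [a2[1] * a3[2] - a2[2] * a3[1],
             a2[2] * a3[0] - a2[0] * a3[2],
             a2[0] * a3[1] - a2[1] * a3[0]]
    return abs(sum(x * y for x, y in zip(a1, cross)))


def min_image_distance(p, q, lat, reach=1):
    """Shortest distance from p to any image of q, scanning lattice
    translations with integer coefficients in [-reach, reach]."""
    best = math.inf
    for n in itertools.product(range(-reach, reach + 1), repeat=3):
        shift = [sum(n[i] * lat[i][k] for i in range(3)) for k in range(3)]
        image = [q[k] + shift[k] for k in range(3)]
        best = min(best, math.dist(p, image))
    return best


volume = cell_volume(LATTICE)
a_conv = (4.0 * volume) ** (1.0 / 3.0)  # fcc: V_primitive = a^3 / 4
d_mg_o = min_image_distance(MG_POS, O_POS, LATTICE)

print(f"primitive volume = {volume:.4f}")
print(f"conventional a   = {a_conv:.4f}")
print(f"nearest Mg-O     = {d_mg_o:.4f}")
```

For these vectors the conventional edge comes out as twice 2.106, i.e. 4.212, and the nearest Mg-O distance as 2.106 (half the cubic edge), both in the SPEC's length unit. That is the pattern expected for a rocksalt cell.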
The runnable script:
# examples/scf-mp_mp-1265.py
"""SCF on mp_mp-1265 — generated by vqfetch on 2026-05-09."""
from vibeqc import Atom, PeriodicSystem, run_periodic_job

cell = PeriodicSystem(
    lattice_vectors=[[0.0, 2.106, 2.106],
                     [2.106, 0.0, 2.106],
                     [2.106, 2.106, 0.0]],
    atoms=[Atom(12, [0.0, 0.0, 0.0]),
           Atom(8, [2.106, 2.106, 2.106])],
)

run_periodic_job(
    cell,
    basis="sto-3g",
    method="RKS",
    functional="LDA",
    output="mp_mp-1265",
)

# Provenance bundle (preserved in the SCF log header):
#   Source:  materials_project mp-1265
#   URL:     https://next-gen.materialsproject.org/materials/mp-1265
#   License: CC-BY 4.0
#   Fetched: 2026-05-09T21:30:00+00:00
Run the SCF
.venv/bin/python examples/scf-mp_mp-1265.py
Three output files in the working directory:
- mp_mp-1265.out: banner, SCF trace, energy breakdown, orbital table, and the source DB / ID / DOI / license recorded in the run header (search for “Provenance:” in the file).
- mp_mp-1265.molden: molecular orbitals (open with Avogadro or Jmol).
- mp_mp-1265.traj: ASE trajectory (single frame for a static SCF; multi-frame if optimize=True).
Reference: live planetx round-trip on 2026-05-09 produced
E = −950.4204308512 Ha in 13 SCF iters (~2h 20m on 16
cores at sto-3g; sub-minute on a laptop with
OMP_NUM_THREADS=4 and the same basis).
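To verify the provenance block without opening the log by hand, a short parser can pull the fields out. This is a sketch under a stated assumption: it supposes the header lists one "Key: value" field per line after the line containing “Provenance:”, and that the block ends at the first line without a key. The actual layout of the .out header is not specified here, so adjust the parsing to what your file contains. The function name read_provenance is illustrative, not a vibe-qc API.

```python
from pathlib import Path


def read_provenance(out_path, marker="Provenance:", max_lines=10):
    """Collect 'Key: value' lines following the first line containing `marker`.

    Assumed layout (not confirmed by the vibe-qc docs): one field per line
    after the marker, ending at the first line that has no 'key:' part.
    """
    lines = Path(out_path).read_text(encoding="utf-8").splitlines()
    for i, line in enumerate(lines):
        if marker in line:
            break
    else:
        return {}  # marker never found
    fields = {}
    for line in lines[i + 1 : i + 1 + max_lines]:
        # Tolerate leading whitespace and a leading '#' comment character.
        key, sep, value = line.strip().lstrip("#").partition(":")
        if not sep or not key.strip():
            break
        fields[key.strip()] = value.strip()
    return fields


# Example use after the SCF has finished:
#   for key, value in read_provenance("mp_mp-1265.out").items():
#       print(f"{key}: {value}")
```

Splitting on the first colon only keeps URLs and ISO timestamps intact in the value.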
Step up to a real basis + multi-k
The fetched cell is just data; you can re-run it at any basis or k-mesh by editing the input script:
run_periodic_job(
    cell,
    basis="pob-tzvp",      # solid-state-tuned triple-zeta
    method="RKS",
    functional="pbe",      # GGA workhorse
    kmesh=(4, 4, 4),       # multi-k via KRKS
    output="mp_mp-1265-pbe-tzvp-444",
)
This drives the v0.8.0 run_krks_periodic_gdf driver — see
Tutorial 34 for a multi-k worked example
that targets a published reference, and
user_guide/multi_k_scf.md
for the algorithm + scope caveats.
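If you want to check k-mesh convergence rather than run a single mesh, one option is to generate the keyword sets up front and loop over them. The helper below is a hypothetical convenience, not part of vibe-qc. It uses only the run_periodic_job keywords shown in this tutorial and mimics the output naming pattern of the example above.

```python
def kmesh_sweep(meshes, basis="pob-tzvp", functional="pbe", stem="mp_mp-1265"):
    """Build one run_periodic_job keyword dict per n x n x n k-mesh.

    Output names follow the pattern used in this tutorial,
    e.g. mp_mp-1265-pbe-tzvp-444 for n = 4.
    """
    short_basis = basis.removeprefix("pob-")  # requires Python 3.9+
    jobs = []
    for n in meshes:
        jobs.append({
            "basis": basis,
            "method": "RKS",
            "functional": functional,
            "kmesh": (n, n, n),
            "output": f"{stem}-{functional}-{short_basis}-{n}{n}{n}",
        })
    return jobs


# With vibe-qc installed and `cell` defined as in the generated script:
#   for kwargs in kmesh_sweep([2, 4, 6]):
#       run_periodic_job(cell, **kwargs)
for job in kmesh_sweep([2, 4, 6]):
    print(job["output"], job["kmesh"])
```

Writing each mesh to its own output stem keeps the runs from overwriting each other, so their energies can be compared afterwards.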
Try other databases + structures
# COD CIF (experimental geometry):
vqfetch cod --id 1011027 --basis pob-tzvp
# Federated OPTIMADE search by formula (default provider: MP):
vqfetch optimade --formula NaCl
# OPTIMADE search restricted to NOMAD:
vqfetch optimade --formula CaCO3 --provider nomad
# Canonical-set slug (round-trip-verified):
vqfetch canonical mgo_rocksalt
vqfetch canonical lih_rocksalt
vqfetch canonical si_diamond
Each subcommand caches its result on disk under the XDG cache
directory (~/.cache/vqfetch/), so re-running the same query is a
no-op cache hit, which is useful for offline or reproducible runs.
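To see what is cached, you can resolve the directory by the XDG rules and list its contents. The sketch below assumes vqfetch honours $XDG_CACHE_HOME in the standard way, which the text above suggests but does not spell out. The file layout inside the cache is not documented here, so the listing simply walks whatever is present. Both function names are illustrative.

```python
import os
from pathlib import Path


def vqfetch_cache_dir(environ=os.environ):
    """Resolve <cache root>/vqfetch following the XDG Base Directory rules:
    use $XDG_CACHE_HOME if it is set to an absolute path, else ~/.cache."""
    base = environ.get("XDG_CACHE_HOME", "")
    if base and Path(base).is_absolute():
        root = Path(base)
    else:
        root = Path.home() / ".cache"
    return root / "vqfetch"


def list_cache(cache_dir):
    """Return sorted (relative path, size in bytes) pairs for cached files."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []  # nothing fetched yet
    return sorted((str(p.relative_to(cache_dir)), p.stat().st_size)
                  for p in cache_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    cache = vqfetch_cache_dir()
    print("cache directory:", cache)
    for name, size in list_cache(cache):
        print(f"{size:>10d}  {name}")
```

Archiving this directory alongside your results is one way to make a later offline re-run use the same fetched records.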
What this tells you
- Provenance is non-negotiable. Every fetched record carries the source DB, ID, URL, original DOI (where available), license, and fetched timestamp, and they travel with the SCF result through to the .out file. No more “where did this geometry come from?” auditability gap.
- Open databases are first-class. vibe-qc treats COD / MP / NOMAD / OPTIMADE as input fixtures; you don’t need to wrangle CIFs by hand.
- The license matters. vqfetch surfaces the per-record license string so when you cite the result, you cite the source per its terms (Materials Project: cite per their terms; COD: CC0, so citation is appreciated but not legally required).
See also
- user_guide/external_structures.md: full vqfetch CLI + flags + cache + heuristic basis picker.
- user_guide/reference_data.md: vqfetch’s CCCBDB Phase 2 for experimental reference data.
- docs/license.md: per-source licensing.
- Tutorial 34 (LiH multi-k): pairs vqfetch with multi-k SCF.