MgO from the Materials Project: vqfetch end-to-end
This tutorial exercises the v0.8.0 vqfetch external-data
integration: pull an MgO crystal structure from
Materials Project,
emit a regression PeriodicSpec + an executable input
script, and run a periodic SCF with full provenance preserved.
You will:
Install the [fetch] optional extra.
Use vqfetch mp --id mp-1265 to pull the MgO rocksalt structure.
Inspect the emitted SPEC + input-script files.
Run the SCF and verify the per-record provenance bundle in the SCF log header.
Background: see
user_guide/external_structures.md
for the full vqfetch CLI and provenance contract, and
docs/license.md
for the per-source licensing inventory.
Setup
vqfetch is part of the optional [fetch] extra:
pip install -e '.[fetch]'
This pulls in optimade, ase, beautifulsoup4, and lxml.
After install, the vqfetch console script is on $PATH.
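A quick way to confirm the install took is to check that each dependency is importable and that the console script resolves. The sketch below is not part of vibe-qc; the helper name check_fetch_extra is hypothetical, and the only facts it relies on are the four package names listed above (beautifulsoup4 is imported as bs4).

```python
import importlib.util
import shutil

# Distribution name -> import name for the packages the [fetch] extra pulls in.
# beautifulsoup4 is the one whose import name differs from its distribution name.
FETCH_MODULES = {
    "optimade": "optimade",
    "ase": "ase",
    "beautifulsoup4": "bs4",
    "lxml": "lxml",
}


def check_fetch_extra():
    """Return {name: bool} for each dependency plus the vqfetch console script."""
    status = {
        dist: importlib.util.find_spec(module) is not None
        for dist, module in FETCH_MODULES.items()
    }
    # shutil.which returns None when the script is not on PATH.
    status["vqfetch (console script)"] = shutil.which("vqfetch") is not None
    return status


for name, ok in check_fetch_extra().items():
    print(f"{name:26s} {'ok' if ok else 'MISSING'}")
```

Any MISSING line usually means the editable install ran in a different environment from the one currently active.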
Fetch + emit
Materials Project ID mp-1265 is the rocksalt MgO entry:
vqfetch mp --id mp-1265 --basis sto-3g --method rks-lda
# Output (one path per line; both written to disk):
# examples/regression/systems/periodic/mp_mp-1265.py
# examples/scf-mp_mp-1265.py
For a smoke test (recommended on first run), append --quick
to force sto-3g:
vqfetch mp --id mp-1265 --quick
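If you drive the fetch from a script rather than the shell, the same invocation can be assembled as an argument list. This is a minimal sketch that uses only the flags shown above (--id, --basis, --method, --quick); the helper names are hypothetical, and it assumes that stdout carries exactly the one-path-per-line output described in the comments.

```python
import subprocess


def vqfetch_mp_command(mp_id, basis=None, method=None, quick=False):
    """Build the argv list for `vqfetch mp` from the flags shown in this tutorial."""
    cmd = ["vqfetch", "mp", "--id", mp_id]
    if basis is not None:
        cmd += ["--basis", basis]
    if method is not None:
        cmd += ["--method", method]
    if quick:
        cmd.append("--quick")
    return cmd


def emitted_paths(stdout):
    """Split captured stdout into paths, assuming one path per non-empty line."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def run_fetch(cmd):
    """Run the fetch and return the emitted paths (requires vqfetch on PATH)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return emitted_paths(result.stdout)


cmd = vqfetch_mp_command("mp-1265", basis="sto-3g", method="rks-lda")
print(" ".join(cmd))  # vqfetch mp --id mp-1265 --basis sto-3g --method rks-lda
```

Calling run_fetch(cmd) performs the actual fetch; passing a list instead of a shell string avoids quoting problems with IDs and formulas.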
What gets written
The emitted SPEC module is import-as-Python-data, ready for the regression suite:
# examples/regression/systems/periodic/mp_mp-1265.py
from examples.regression.core.spec import (
    PeriodicSpec, Provenance,
)

mp_mp_1265 = PeriodicSpec(
    id="mp_mp-1265",
    formula="MgO",
    lattice_vectors=[[0.0, 2.106, 2.106],
                     [2.106, 0.0, 2.106],
                     [2.106, 2.106, 0.0]],
    atoms=[("Mg", [0.0, 0.0, 0.0]),
           ("O", [2.106, 2.106, 2.106])],
    recommended_basis="sto-3g",
    provenance=Provenance(
        source_db="materials_project",
        source_id="mp-1265",
        source_url="https://next-gen.materialsproject.org/materials/mp-1265",
        license="CC-BY 4.0",
        fetched_at="2026-05-09T21:30:00+00:00",
    ),
)
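Because the SPEC is plain data, the geometry can be sanity-checked without vibe-qc installed. The sketch below copies the lattice vectors and positions from the SPEC above and converts the Cartesian positions to fractional coordinates in pure Python; for a two-atom rocksalt primitive cell the second atom should land at (1/2, 1/2, 1/2). The helper names are illustrative, and no length unit is assumed since the SPEC does not state one.

```python
# Lattice vectors and Cartesian positions copied from the emitted SPEC.
lattice = [[0.0, 2.106, 2.106],
           [2.106, 0.0, 2.106],
           [2.106, 2.106, 0.0]]
atoms = [("Mg", [0.0, 0.0, 0.0]),
         ("O", [2.106, 2.106, 2.106])]


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


a1, a2, a3 = lattice
volume = dot(a1, cross(a2, a3))  # signed volume of the primitive cell


def fractional(r):
    """Cartesian -> fractional coordinates via triple products with the lattice."""
    return [dot(r, cross(a2, a3)) / volume,
            dot(r, cross(a3, a1)) / volume,
            dot(r, cross(a1, a2)) / volume]


for symbol, position in atoms:
    print(symbol, [round(x, 6) for x in fractional(position)])

# For fcc primitive vectors of the form (0, a/2, a/2), the conventional cubic
# lattice constant is twice the non-zero component, and volume = a**3 / 4.
a_conventional = 2 * lattice[0][1]
print("conventional a:", a_conventional, "cell volume:", round(volume, 6))
```

A check like this catches a fetched structure that arrived in a different cell setting (for example the conventional 8-atom cell) before any SCF time is spent on it.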
The runnable script:
# examples/scf-mp_mp-1265.py
"""SCF on mp_mp-1265 — generated by vqfetch on 2026-05-09."""
from vibeqc import Atom, PeriodicSystem, run_periodic_job
cell = PeriodicSystem(
    3,
    [[0.0, 2.106, 2.106],
     [2.106, 0.0, 2.106],
     [2.106, 2.106, 0.0]],
    [Atom(12, [0.0, 0.0, 0.0]),
     Atom(8, [2.106, 2.106, 2.106])],
)

run_periodic_job(
    cell,
    basis="sto-3g",
    method="RKS",
    functional="LDA",
    output="mp_mp-1265",
)

# Provenance bundle (preserved in the SCF log header):
# Source: materials_project mp-1265
# URL: https://next-gen.materialsproject.org/materials/mp-1265
# License: CC-BY 4.0
# Fetched: 2026-05-09T21:30:00+00:00
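The trailing comment block is regular enough to read back mechanically, which is handy when auditing a directory of generated scripts. The parser below is a sketch, not a vibe-qc API: parse_provenance is a hypothetical helper, and it assumes the bundle keeps the "# Key: value" layout shown above, introduced by the "# Provenance bundle" line.

```python
from datetime import datetime

# Tail of the emitted script, copied from the listing above.
EMITTED_TAIL = """\
# Provenance bundle (preserved in the SCF log header):
# Source: materials_project mp-1265
# URL: https://next-gen.materialsproject.org/materials/mp-1265
# License: CC-BY 4.0
# Fetched: 2026-05-09T21:30:00+00:00
"""


def parse_provenance(text):
    """Collect '# Key: value' comment lines that follow the bundle marker."""
    fields = {}
    in_bundle = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# Provenance bundle"):
            in_bundle = True
            continue
        if not in_bundle:
            continue
        if not stripped.startswith("#"):
            break  # the first non-comment line ends the bundle
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = stripped.lstrip("# ").partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


provenance = parse_provenance(EMITTED_TAIL)
fetched = datetime.fromisoformat(provenance["Fetched"])  # timezone-aware
print(provenance["Source"], "|", provenance["License"], "|", fetched.date())
```

Parsing the timestamp with datetime.fromisoformat doubles as a validity check: a malformed value raises ValueError instead of passing silently.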
Run the SCF
Execute the emitted input script to run the periodic LDA SCF on the fetched MgO cell:
.venv/bin/python examples/scf-mp_mp-1265.py
The run writes three output files to the working directory:
mp_mp-1265.out: banner, SCF trace, energy breakdown, orbital table, and the source DB / ID / DOI / license recorded in the run header (search for “Provenance:” in the file).
mp_mp-1265.molden: molecular orbitals (open with Avogadro or Jmol).
mp_mp-1265.traj: ASE trajectory (single frame for static SCF; multi-frame if optimize=True).
Reference: live planetx round-trip on 2026-05-09 produced
E = −950.4204308512 Ha in 13 SCF iters (~2h 20m on 16
cores at sto-3g; sub-minute on a laptop with
OMP_NUM_THREADS=4 and the same basis).
Step up to a real basis + multi-k
The fetched cell is just data; you can re-run it at any basis or k-mesh by editing the input script:
run_periodic_job(
    cell,
    basis="pob-tzvp",        # solid-state-tuned triple-zeta
    method="RKS",
    functional="pbe",        # GGA workhorse
    kmesh=(4, 4, 4),         # multi-k via KRKS
    output="mp_mp-1265-pbe-tzvp-444",
)
This drives the v0.8.0 run_krks_periodic_gdf driver. See
LiH at multiple k-points: KRHF vs Peintinger 2013 for a multi-k worked example
that targets a published reference, and
user_guide/multi_k_scf.md
for the algorithm + scope caveats.
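To get a feel for what kmesh=(4, 4, 4) costs, it helps to enumerate the sampling points. The sketch below builds a uniform Gamma-centred grid in fractional reciprocal coordinates with equal weights. Whether vibe-qc uses a Gamma-centred or shifted grid, and whether it reduces the points by symmetry, is not stated here, so treat this as an illustration of the mesh size rather than of the driver's internals.

```python
from itertools import product


def gamma_centred_mesh(kmesh):
    """Uniform Gamma-centred k-points as fractional reciprocal coordinates."""
    n1, n2, n3 = kmesh
    return [(i / n1, j / n2, k / n3)
            for i, j, k in product(range(n1), range(n2), range(n3))]


points = gamma_centred_mesh((4, 4, 4))
weight = 1.0 / len(points)  # equal weights, no symmetry reduction

print(len(points))  # 64
print(points[0])    # (0.0, 0.0, 0.0), the Gamma point
```

Without symmetry reduction the number of k-points grows as the product of the mesh dimensions, so going from (4, 4, 4) to (8, 8, 8) multiplies the count by eight.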
Try other databases + structures
The same fetcher reaches other providers and canonical fixtures: COD for experimental CIFs, federated OPTIMADE search by formula, and the round-trip-verified canonical slugs:
# COD CIF (experimental geometry):
vqfetch cod --id 1011027 --basis pob-tzvp
# Federated OPTIMADE search by formula (default provider: MP):
vqfetch optimade --formula NaCl
# OPTIMADE search restricted to NOMAD:
vqfetch optimade --formula CaCO3 --provider nomad
# Canonical-set slug (round-trip-verified):
vqfetch canonical mgo_rocksalt
vqfetch canonical lih_rocksalt
vqfetch canonical si_diamond
Each subcommand caches its result on disk under the XDG cache
directory (~/.cache/vqfetch/), so re-running the same query is a
cache hit, which is useful for offline or reproducible runs.
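To find or clear the cache from a script, the directory can be resolved the way the XDG Base Directory specification describes: use XDG_CACHE_HOME when it is set and non-empty, otherwise fall back to ~/.cache. This is a sketch under the assumption that vqfetch follows that rule; vqfetch_cache_dir is a hypothetical helper, and the layout of files inside the directory is not described here.

```python
import os
from pathlib import Path


def vqfetch_cache_dir(environ=None):
    """Resolve the assumed vqfetch cache directory per the XDG cache rule."""
    env = os.environ if environ is None else environ
    base = env.get("XDG_CACHE_HOME")
    if not base:
        # Unset or empty: the XDG default is $HOME/.cache.
        base = os.path.join(os.path.expanduser("~"), ".cache")
    return Path(base) / "vqfetch"


cache_dir = vqfetch_cache_dir()
print(cache_dir, "(exists)" if cache_dir.is_dir() else "(not created yet)")
```

Passing an explicit mapping, for example vqfetch_cache_dir({"XDG_CACHE_HOME": "/tmp/cache"}), makes the helper easy to exercise without touching the real environment.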
What this tells you
Provenance is non-negotiable. Every fetched record carries the source DB, ID, URL, original DOI (where available), license, and fetched timestamp, and they travel with the SCF result through to the .out file. No more “where did this geometry come from?” auditability gap.
Open databases are first-class. vibe-qc treats COD / MP / NOMAD / OPTIMADE as input fixtures; you don’t need to wrangle CIFs by hand.
The license matters. vqfetch surfaces the per-record license string so when you cite the result, you cite the source per its terms (Materials Project: cite per their terms; COD: CC0, citation appreciated, not legally required).
See also
user_guide/external_structures.md: full vqfetch CLI + flags + cache + heuristic basis picker.
user_guide/reference_data.md: vqfetch’s CCCBDB Phase 2 for experimental reference data.
docs/license.md: per-source licensing.
LiH at multiple k-points: KRHF vs Peintinger 2013: pairs vqfetch with multi-k SCF.