Tutorial 47: The QVF file format, end to end

You will learn: what a .qvf archive is, the two ways vibe-qc produces one, the anatomy of its manifest, and how to write and read every section kind the format supports — structure, the six scalar fields, the full wavefunction, the seven spectra, electronic bands and density of states, optimisation and reaction paths, vibrations, and the provenance / citation metadata. By the end you will have produced three archives that between them exercise all 27 section kinds and validated every one.

This is the format companion to Tutorial 45, which walks the vibe-view GUI. Tutorial 45 shows what a routine job emits; this tutorial shows the whole format surface and the lower-level writer API you reach for when you want to put something in a .qvf that run_job does not emit on its own.

Prerequisites:

  • vibeqc installed and a working SCF setup (any Fundamentals tutorial).

  • Optional: vibe-view installed (for the read-back and GUI steps) — pip install -e vibe-view/ from the checkout root.

Time to complete: about 15 minutes.

1. What a .qvf is

A .qvf (Quantum Visualization Format) is a single ZIP archive that bundles everything a calculation produced for visualization — geometry, scalar fields, spectra, bands, trajectories, provenance — instead of scattering it across .cube, .xyz, .molden, .xsf and log files. One file you can hand to a colleague, attach to a paper’s SI, or open in vibe-view.

Inside, it is just a ZIP with three layers:

water.qvf
├── manifest.json              # the index: every section + member, with checksums
├── structure/structure.json   # JSON members (small, human-readable)
├── volumes/density.dat         # binary members (large numeric arrays)
├── volumes/density_grid.json
├── ...

  • manifest.json is the single source of truth. It lists every section (a unit of content like “the electron density” or “the IR spectrum”), and within each section the members (the files that carry the bytes), each with a path, a format (json or binary), and a sha256 checksum.

  • JSON members carry small structured data (atom lists, spectrum peaks, grid descriptors).

  • Binary members carry large numeric arrays (volumetric grids, MO coefficients, trajectory coordinates) as raw little-endian buffers, with their dtype and shape declared in the manifest.
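To make the three layers concrete, here is a minimal stdlib-only sketch that assembles a toy archive in memory. The member paths mirror the tree above and the spec fields (path, format, sha256, dtype, shape) are the ones this tutorial names, but the role names "grid" and "data", the "<f8" dtype spelling, and the content of the grid descriptor are assumptions made for illustration. The JSON Schema remains the authority, and this toy file is not claimed to pass validate_qvf.

```python
import hashlib
import io
import json
import struct
import zipfile


def member_spec(path, payload, fmt, **extra):
    """Describe one member: where it lives, how it is encoded, and its checksum."""
    spec = {"path": path, "format": fmt,
            "sha256": hashlib.sha256(payload).hexdigest()}
    spec.update(extra)  # binary members also declare dtype and shape
    return spec


# A 2x2x2 grid of float64 values packed as a raw little-endian buffer.
values = [0.1 * n for n in range(8)]
density_bytes = struct.pack("<8d", *values)
grid_bytes = json.dumps({"origin": [0.0, 0.0, 0.0]}).encode("utf-8")

manifest = {
    "sections": [{
        "id": "vol_dens_0",
        "kind": "volume.density",
        "members": {  # role names here are hypothetical
            "grid": member_spec("volumes/density_grid.json", grid_bytes, "json"),
            "data": member_spec("volumes/density.dat", density_bytes, "binary",
                                dtype="<f8", shape=[2, 2, 2]),
        },
    }],
}

# Layer 1 is the manifest; layers 2 and 3 are the JSON and binary members.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    zf.writestr("volumes/density_grid.json", grid_bytes)
    zf.writestr("volumes/density.dat", density_bytes)

with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as zf:
    names = sorted(zf.namelist())
print(names)
```

The point of the sketch is the division of labour: the manifest never holds the numbers themselves, only enough metadata (checksum, dtype, shape) for a reader to fetch and interpret each member.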

The canonical contract is the JSON Schema at python/vibeqc/output/formats/qvf_manifest.schema.json; the full specification (units, section kinds, the validation and extension models) is the QVF design doc.

The write-time validation gate

write_qvf validates the archive it just wrote against the canonical schema before returning, and if validation fails it deletes the file and raises. A .qvf on disk that came from vibe-qc is therefore guaranteed schema-valid. This is also why the showcase script in this tutorial is a useful bug-finder: writing one archive per section kind and getting no exception is the test that every writer path is sound.
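The gate follows a general "write, validate, delete on failure" pattern. The sketch below illustrates that pattern only; it is not write_qvf's implementation. The names write_checked, require_sections, and ValidationError are hypothetical, and the stand-in validator checks a single made-up rule rather than the real schema.

```python
import json
import os
import tempfile


class ValidationError(Exception):
    """Raised when a freshly written file fails its check."""


def write_checked(path, payload, validate):
    """Write payload as JSON, validate the file on disk, delete it on failure."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    errors = validate(path)
    if errors:
        os.remove(path)  # never leave an invalid file behind
        raise ValidationError("; ".join(errors))
    return path


def require_sections(path):
    """Hypothetical stand-in validator: the file must list at least one section."""
    with open(path, encoding="utf-8") as fh:
        doc = json.load(fh)
    return [] if doc.get("sections") else ["manifest has no sections"]


workdir = tempfile.mkdtemp()
good = write_checked(os.path.join(workdir, "good.json"),
                     {"sections": [{"id": "structure", "kind": "structure"}]},
                     require_sections)

bad_path = os.path.join(workdir, "bad.json")
try:
    write_checked(bad_path, {"sections": []}, require_sections)
    rejected = False
except ValidationError:
    rejected = True

print(os.path.exists(good), rejected, os.path.exists(bad_path))
```

Validating the file as written, rather than the in-memory payload, is what lets the caller trust whatever survives on disk.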

2. Two ways to produce a .qvf

a. The natural way — output_qvf=True

Any run_job / run_periodic_job call emits a .qvf when you pass output_qvf=True. The sections that appear are whatever the job computed: ask for cubes and you get volume.*; ask for a Hessian and you get vibrations + spectra.ir; optimise and you get a trajectory. This is what Tutorial 45 uses, and it is what you want 95% of the time:

from vibeqc import Atom, Molecule, run_job

mol = Molecule([
    Atom(8, [0.0,  0.00,  0.00]),
    Atom(1, [0.0,  1.43, -0.98]),
    Atom(1, [0.0, -1.43, -0.98]),
])

run_job(
    mol, basis="6-31g*", method="rhf",
    optimize=True,                 # -> trajectory
    hessian=True,                  # -> vibrations + spectra.ir
    write_cube=["density", "homo", "lumo"],  # -> volume.density + volume.orbital
    write_molden_file=True,        # -> wavefunction.gto
    write_population_file=True,    # -> atom_properties
    citations=True,                # -> citations
    output="water", output_qvf=True,
)

b. The explicit way — write_qvf(...)

To put something in a .qvf that run_job does not emit — a difference density you computed yourself, a UV/Vis spectrum from an external tool, hand-built reaction-path waypoints — call the writer directly. It takes an OutputPlan and a bag of context keyword arguments, one per section kind:

from vibeqc.output.formats.qvf import write_qvf, validate_qvf
from vibeqc.output.plan import OutputPlan

plan = OutputPlan.from_run_job_kwargs(
    output="custom", method="rhf", basis="sto-3g", functional=None
)

write_qvf(
    "custom", plan,
    molecule=mol,                              # -> structure
    volume_data={"Density": (rho, origin, span)},  # -> volume.density
    raman_data={"frequencies": [...], "intensities": [...]},  # -> spectra.raman
    # ... one kwarg per section you want ...
)

The rest of this tutorial walks every section kind and the context key that produces it. The full runnable version is examples/vibe_view/showcase_qvf_all_sections.py — it builds three archives covering all 27 kinds and validates each. Run it now and refer back to its output as you read:

~/path/to/vibeqc/.venv/bin/python examples/vibe_view/showcase_qvf_all_sections.py

3. Anatomy of the manifest

Open any archive and read its manifest — this is the consumer’s entry point:

import json
import zipfile

with zipfile.ZipFile("water.qvf") as zf:
    manifest = json.loads(zf.read("manifest.json"))

print("QVF v", manifest["qvf_version"])
print("from ", manifest["source"])           # program, version, calculation
print("provenance", manifest.get("provenance"))  # method, basis, energy, ...
for s in manifest["sections"]:
    print(f"  {s['id']:<16s} {s['kind']}")

The manifest root carries:

  • qvf_version: 1 for molecular archives, 2 once a periodic reaction path adds per-frame lattices (see § 8).

  • source — producer program, version, and a calculation string.

  • provenance — best-effort method / functional / basis / charge / multiplicity / SCF energy / convergence (assembled from the context you pass).

  • viewer_defaults — optional producer hints (which section to auto-open, default isovalue / colormap, camera bookmarks).

  • sections — the list. Each entry has an id, a kind, optional label, and a members map of role → member spec. A member spec is {path, format, sha256} for JSON and additionally {dtype, shape} for binary.

All units follow the format’s conventions (positions in Å, grids in bohr, energies in Hartree or eV per the units table).
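As a rough illustration of how a consumer might walk this structure, the sketch below counts sections per kind family and works out how many bytes each binary member should occupy from its declared dtype and shape. The manifest fragment is hypothetical: the role names, the spectra/ir.json path, and the dtype spellings in ITEMSIZE are assumptions, and the "..." checksums are left elided.

```python
from collections import Counter
from math import prod

# Hypothetical manifest fragment; ids, roles, and dtype strings are illustrative.
manifest = {
    "qvf_version": 1,
    "sections": [
        {"id": "structure", "kind": "structure", "members": {
            "atoms": {"path": "structure/structure.json", "format": "json",
                      "sha256": "..."}}},
        {"id": "vol_dens_0", "kind": "volume.density", "members": {
            "data": {"path": "volumes/density.dat", "format": "binary",
                     "sha256": "...", "dtype": "<f8", "shape": [40, 40, 40]}}},
        {"id": "ir", "kind": "spectra.ir", "members": {
            "peaks": {"path": "spectra/ir.json", "format": "json",
                      "sha256": "..."}}},
    ],
}

ITEMSIZE = {"<f4": 4, "<f8": 8, "<i4": 4, "<i8": 8}  # assumed dtype spellings


def kind_families(doc):
    """Count sections by the part of the kind before the first dot."""
    return Counter(s["kind"].split(".")[0] for s in doc["sections"])


def binary_sizes(doc):
    """Expected byte length of each binary member, from dtype and shape."""
    sizes = {}
    for section in doc["sections"]:
        for spec in section["members"].values():
            if spec["format"] == "binary":
                sizes[spec["path"]] = ITEMSIZE[spec["dtype"]] * prod(spec["shape"])
    return sizes


families = kind_families(manifest)
sizes = binary_sizes(manifest)
print(dict(families))
print(sizes)
```

Comparing the expected size with the actual length of the member is a cheap sanity check before reshaping a buffer.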

4. The section kinds, group by group

Every snippet below is a context kwarg to write_qvf. The showcase script assembles them into coherent archives; here they are isolated so you can see exactly what each kind needs.

4.1 Structure and connectivity

write_qvf(
    stem, plan,
    molecule=mol,                       # structure  (or system=PeriodicSystem)
    bonds_data=[(0, 1, 1.0), (0, 2, 1.0)],   # bonds — (i, j, order) triples
    symmetry_data={                     # structure.symmetry  (spglib-style)
        "space_group_number": 225,
        "space_group_symbol": "Fm-3m",
        "point_group": "m-3m",
    },
)

structure is the one section almost every archive has. Pass molecule= for a molecule or system= for a PeriodicSystem (which adds lattice vectors and pbc flags). bonds is an explicit connectivity table (omit it and the viewer infers bonds from covalent radii); structure.symmetry carries the spglib summary.

4.2 Scalar fields — the six volume.* kinds

All volumes share the same shape: a dict {label: (data_3d, origin, span)} where data_3d is a 3-D numpy array, origin is the grid anchor in bohr, and span is the 3×3 matrix of per-voxel step vectors in bohr.

write_qvf(
    stem, plan, molecule=mol,
    volume_data={"Electron density": (rho, origin, span)},   # volume.density
    mo_data=[...],                                           # volume.orbital
    spin_data={"Spin density": (s, origin, span)},           # volume.spin
    elf_data={"ELF": (elf, origin, span)},                   # volume.elf
    generic_volume_data={"RDG": (rdg, origin, span)},        # volume.generic
    diff_data={"Δρ": {                                       # volume.difference
        "data": diff, "origin": origin, "span": span,
        "operand_a": "vol_dens_0", "operand_b": "vol_dens_0",
        "description": "ρ(product) − ρ(reactant)",
    }},
)

  • volume.density / volume.spin / volume.elf differ only in meaning and the viewer’s default colormap (signed for spin and difference, sequential for density and ELF).

  • volume.generic is the escape hatch for any scalar field that is not one of the named kinds (reduced density gradient, electrostatic potential you computed yourself, …).

  • volume.difference can link to the two volume.* sections it was built from via operand_a / operand_b (section ids), so a viewer can offer “show me the operands”.

  • mo_data (the volume.orbital kind) is a list of per-orbital dicts; the qvf_mo_data(result, basis, molecule, indices) helper builds it for you from a finished SCF (pass the orbital indices to sample), and run_job(write_cube=[...]) produces it automatically.
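The (origin, span) convention used by every volume in this section can be sketched in a few lines. This example assumes that the rows of span are the three per-voxel step vectors and that voxel (i, j, k) corresponds to data_3d[i, j, k]; both are assumptions to check against the design doc. The helper names voxel_position and cubic_grid are hypothetical.

```python
def voxel_position(origin, span, index):
    """Cartesian position (bohr) of voxel (i, j, k): origin + i*a + j*b + k*c."""
    return [
        origin[axis] + sum(index[n] * span[n][axis] for n in range(3))
        for axis in range(3)
    ]


def cubic_grid(half_width, n_points):
    """Origin and span for a cube of side 2*half_width centred on zero."""
    step = 2.0 * half_width / (n_points - 1)
    origin = [-half_width] * 3
    # Assumed layout: row n of span is the step vector along grid axis n.
    span = [[step, 0.0, 0.0], [0.0, step, 0.0], [0.0, 0.0, step]]
    return origin, span


origin, span = cubic_grid(half_width=4.0, n_points=41)   # 0.2 bohr spacing
first = voxel_position(origin, span, (0, 0, 0))
centre = voxel_position(origin, span, (20, 20, 20))
last = voxel_position(origin, span, (40, 40, 40))
print(first, centre, last)
```

Because span is a full 3×3 matrix rather than three scalars, the same formula also covers non-orthogonal grids such as those aligned with a crystal cell.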

For the full molecular orbital set without pre-sampling every orbital, use the wavefunction section instead (next).

4.3 The wavefunction — wavefunction.gto

from vibeqc.output.formats.qvf import qvf_wf_data
write_qvf(stem, plan, molecule=mol,
          wf_data=qvf_wf_data(result, basis, mol))   # wavefunction.gto

This embeds the GTO basis (shells, exponents, contraction coefficients) plus the MO coefficient matrix, so a viewer can resample any orbital on demand — far cheaper than writing each orbital as its own volume.orbital. run_job(write_molden_file=True) emits it automatically. Unrestricted runs carry separate α/β coefficient members.
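To see why basis plus coefficients is enough to resample an orbital, recall that an MO is a linear combination of basis functions, ψ(r) = Σ_μ C_μ φ_μ(r). The sketch below evaluates such a sum for s-type contracted Gaussians only, using the standard STO-3G hydrogen contraction. It illustrates the mathematics, not the layout of the wavefunction.gto members, and the MO coefficients are made up rather than taken from an SCF.

```python
import math

# STO-3G hydrogen 1s contraction (exponents in bohr^-2, contraction coefficients).
H_EXPONENTS = (3.42525091, 0.62391373, 0.16885540)
H_COEFFS = (0.15432897, 0.53532814, 0.44463454)


def s_shell_value(point, centre, exponents, coeffs):
    """Value of a contracted s-type Gaussian at point (all lengths in bohr)."""
    r2 = sum((p - c) ** 2 for p, c in zip(point, centre))
    total = 0.0
    for alpha, d in zip(exponents, coeffs):
        norm = (2.0 * alpha / math.pi) ** 0.75   # normalised s primitive
        total += d * norm * math.exp(-alpha * r2)
    return total


def mo_value(point, centres, mo_coeffs):
    """psi(r) = sum over basis functions of C_mu * phi_mu(r)."""
    return sum(
        c * s_shell_value(point, centre, H_EXPONENTS, H_COEFFS)
        for c, centre in zip(mo_coeffs, centres)
    )


# H2 along z, one s function per atom; coefficients are illustrative only.
centres = [(0.0, 0.0, -0.7), (0.0, 0.0, 0.7)]
bonding = (0.55, 0.55)
antibonding = (1.2, -1.2)

mid_bonding = mo_value((0.0, 0.0, 0.0), centres, bonding)
mid_antibonding = mo_value((0.0, 0.0, 0.0), centres, antibonding)
print(mid_bonding, mid_antibonding)
```

A viewer doing this for real would loop the same sum over every grid point and handle higher angular momenta; the bonding combination is positive at the bond midpoint while the antibonding one has a node there.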

4.4 The spectra family — seven kinds

Each spectrum is a small JSON dict of peaks. They share a common {frequencies, intensities} core; the energy-axis kinds accept energies_ev and the writer maps it for you.

write_qvf(
    stem, plan, molecule=mol,
    # spectra.ir is emitted automatically from a Hessian (hessian_result=)
    raman_data={"frequencies": fr, "intensities": ri},          # spectra.raman
    uvvis_data={"energies_ev": e, "intensities": f},            # spectra.uvvis
    ecd_data={"energies_ev": e, "intensities": R},              # spectra.ecd
    vcd_data={"frequencies_cm1": fr, "intensities": dA},        # spectra.vcd
    nmr_data={"isotope": "1H", "chemical_shifts": [...]},       # spectra.nmr
    generic_spectrum_data={                                     # spectra.generic
        "section_id": "xps", "label": "XPS",
        "frequencies": [...], "intensities": [...],
    },
)

spectra.ir is special: it is derived from a HessianResult (pass hessian_result=, or use run_job(hessian=True)), not a hand-built dict — the writer computes the intensities. The other six accept producer-supplied numbers, which is what you want when the spectrum came from a method vibe-qc does not yet compute. spectra.generic lets you name your own axis for anything that does not fit (XPS, a fitted envelope, …).

Warning

The synthetic spectra in the showcase script are illustrative format-demo data, not computed observables. The format supports carrying these spectra; vibe-qc does not yet compute most of them. Do not cite numbers read out of a demo archive.
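External tools often report peaks in wavelength or wavenumber rather than eV, so a small conversion step is usually needed before building an energy-axis dict. The sketch below shows that step using the physical constants hc ≈ 1239.841984 eV·nm and 1 eV ≈ 8065.543937 cm⁻¹. The peak table is invented for illustration, the helper names are hypothetical, and only the energies_ev and intensities keys come from the snippet above.

```python
HC_EV_NM = 1239.841984       # h*c in eV*nm
EV_TO_CM1 = 8065.543937      # 1 eV expressed in cm^-1


def nm_to_ev(wavelength_nm):
    """Photon energy in eV for a wavelength in nm."""
    return HC_EV_NM / wavelength_nm


def cm1_to_ev(wavenumber_cm1):
    """Photon energy in eV for a wavenumber in cm^-1."""
    return wavenumber_cm1 / EV_TO_CM1


# Hypothetical peak table from an external program: (wavelength in nm, strength).
external_peaks = [(310.0, 0.02), (254.0, 0.41), (198.0, 0.87)]

uvvis_data = {
    "energies_ev": [nm_to_ev(nm) for nm, _ in external_peaks],
    "intensities": [strength for _, strength in external_peaks],
}
print([round(e, 3) for e in uvvis_data["energies_ev"]])
```

The resulting dict has the shape shown for uvvis_data above and could be passed to write_qvf in the same way.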

4.5 Electronic structure of solids — bands, dos.total, dos.projected

import vibeqc as vq

write_qvf(
    stem, plan, system=periodic_system,
    band_structure=vq.band_structure_hcore(periodic_system, basis, kpath),   # bands
    dos_data={                                                   # dos.total
        "energies": E, "dos": g, "n_spin": 1,
        "smearing": 0.1, "fermi_energy_ev": 0.0, "n_electrons": 20.0,
    },
    pdos_data={                                                  # dos.projected
        "energies": E, "projections": g_channels, "n_spin": 1,
        "channels": [{"atom_index": 1, "symbol": "O", "l": 1, "label": "O 2p"}],
    },
)

bands takes a real BandStructure (the cheapest is band_structure_hcore, which needs no SCF — see Tutorial 12). DOS arrays are [n_points] (or [2, n_points] for spin-polarized total DOS, [n_channels, n_points] for projected); energies are in eV with the Fermi level at 0. These are the periodic-only kinds — they need system= rather than molecule=.
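If the density of states comes from your own eigenvalues, one common way to fill the energies and dos arrays is Gaussian smearing. The sketch below is a generic illustration of that technique, not vibe-qc's DOS code. It assumes the smearing value is a Gaussian width in eV and that each level counts as one state (no spin degeneracy factor); the eigenvalues are invented, and the helper name gaussian_dos is hypothetical.

```python
import math


def gaussian_dos(levels_ev, fermi_ev, grid_ev, sigma_ev):
    """Total DOS on grid_ev from discrete levels, with energies shifted so E_F = 0."""
    prefactor = 1.0 / (sigma_ev * math.sqrt(2.0 * math.pi))
    shifted = [e - fermi_ev for e in levels_ev]
    return [
        sum(prefactor * math.exp(-0.5 * ((x - e) / sigma_ev) ** 2) for e in shifted)
        for x in grid_ev
    ]


# Illustrative eigenvalues (eV); not taken from a real calculation.
levels = [-12.4, -7.9, -7.6, -1.3, 2.8, 3.1]
fermi = -0.4
step = 0.02
grid = [-20.0 + step * n for n in range(1501)]     # -20 eV .. +10 eV

dos_data = {
    "energies": grid,
    "dos": gaussian_dos(levels, fermi, grid, sigma_ev=0.1),
    "n_spin": 1,
    "smearing": 0.1,
    "fermi_energy_ev": 0.0,   # energies were shifted, so the Fermi level sits at 0
}

# Each normalised Gaussian integrates to one state, so the area counts the levels.
area = sum(dos_data["dos"]) * step
print(round(area, 3))
```

Integrating the curve and comparing against the number of levels is a quick check that the grid is wide and fine enough for the chosen smearing.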

4.6 Paths and dynamics — trajectory, reaction.path, reaction.waypoints, vibrations

write_qvf(
    stem, plan, molecule=mol,
    trajectory_frames=frames,            # trajectory  (list of Molecule)
    trajectory_energies=[...],
    reaction_waypoints={                 # reaction.waypoints (annotates traj0)
        "trajectory_ref": "traj0",
        "waypoints": [{"frame_index": 0, "label": "start", "kind": "point"}],
    },
    reaction_path={                      # reaction.path (self-contained)
        "frames": rxn_frames,
        "energies": [...],
        "waypoints": [
            {"frame_index": 0, "label": "Reactant", "kind": "reactant"},
            {"frame_index": 2, "label": "TS", "kind": "transition_state"},
            {"frame_index": 4, "label": "Product", "kind": "product"},
        ],
    },
    hessian_result=hess,                 # vibrations  (+ spectra.ir)
)

  • trajectory is a stack of geometries (a geometry optimisation, an MD run); frames are Molecule objects, optionally with per-frame energies. run_job(optimize=True) emits one.

  • reaction.path is a self-contained path with typed waypoints (reactant / transition_state / intermediate / product / point) — exactly what an NEB run produces.

  • reaction.waypoints is the lightweight alternative: instead of re-storing coordinates, it annotates a trajectory already in the archive (the trajectory_ref must name a trajectory section in the same file, or the writer raises).

  • vibrations carries normal-mode displacements from a HessianResult; passing one also emits spectra.ir.
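Before handing hand-built waypoints to the writer, it can help to check them yourself. The sketch below is a hypothetical pre-flight check, not the writer's own validation: it tests that trajectory_ref names a trajectory section, that each frame_index is in range, and that each kind is one of the five listed above. The function name check_waypoints and the section list are made up for the example.

```python
ALLOWED_KINDS = {"reactant", "transition_state", "intermediate", "product", "point"}


def check_waypoints(spec, sections, n_frames):
    """Return a list of problems found in a reaction.waypoints-style dict."""
    problems = []
    trajectories = {s["id"] for s in sections if s["kind"] == "trajectory"}
    if spec["trajectory_ref"] not in trajectories:
        problems.append(f"unknown trajectory {spec['trajectory_ref']!r}")
    for wp in spec["waypoints"]:
        if not 0 <= wp["frame_index"] < n_frames:
            problems.append(f"frame {wp['frame_index']} out of range")
        if wp["kind"] not in ALLOWED_KINDS:
            problems.append(f"unknown kind {wp['kind']!r}")
    return problems


# Hypothetical archive contents: one structure and one five-frame trajectory.
sections = [{"id": "structure", "kind": "structure"},
            {"id": "traj0", "kind": "trajectory"}]
good = {"trajectory_ref": "traj0",
        "waypoints": [{"frame_index": 0, "label": "start", "kind": "point"}]}
bad = {"trajectory_ref": "traj9",
       "waypoints": [{"frame_index": 7, "label": "TS", "kind": "saddle"}]}

good_problems = check_waypoints(good, sections, n_frames=5)
bad_problems = check_waypoints(bad, sections, n_frames=5)
print(good_problems)
print(bad_problems)
```

Collecting every problem in a list, rather than stopping at the first, gives a fuller report when a path was assembled by hand.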

4.7 Provenance and metadata — atom_properties, scf_history, citations, viewer_defaults

write_qvf(
    stem, plan, molecule=mol,
    population_summary=pop,               # atom_properties (Mulliken/Löwdin)
    scf_history_data=[                    # scf_history
        {"iter": 0, "energy_eh": -74.9, "delta_e": 1.0, "diis_error": 0.5},
    ],
    bibtex_content=open("water.bibtex").read(),   # citations
    viewer_defaults={                     # manifest-root hints
        "auto_open": ["vol_dens_0"],
        "vol_dens_0": {"isovalue": 0.05, "colormap": "viridis"},
    },
)

atom_properties carries per-atom Mulliken / Löwdin charges; scf_history the per-iteration energy and DIIS error; citations the embedded BibTeX bundle (the same bytes as the .bibtex sidecar — see Tutorial 40). viewer_defaults is not a section but a manifest-root block of hints the viewer honours on load.

5. Run the showcase — three archives, 27 kinds

The script produces:

Archive           Built by            Kinds it contributes
----------------  ------------------  --------------------
water.qvf         real run_job        structure, volume.density, volume.orbital, wavefunction.gto, trajectory, vibrations, spectra.ir, atom_properties, scf_history, citations
spectroscopy.qvf  explicit write_qvf  volume.spin / elf / difference / generic, spectra.raman / uvvis / ecd / vcd / nmr / generic, structure.symmetry, bonds, reaction.path, reaction.waypoints, viewer_defaults
crystal.qvf       explicit write_qvf  bands, dos.total, dos.projected, periodic structure, structure.symmetry, periodic (v2) reaction.path

The final lines of its output are the coverage gate:

Coverage gate:
  kinds covered: 27/27
...
 All 27 section kinds written, validated, and round-tripped.

If a writer path regresses, the script dies at the offending write_qvf call (the validation gate) or at the coverage assertion — which is exactly the bug-finding behaviour we want from it.
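A coverage gate of this kind can be sketched as a set difference between the kinds you expect and the kinds the manifests actually list. The version below is an illustration, not the showcase script's code: the 27 kind names are the ones covered in section 4, while fake_manifest and coverage_gap are hypothetical helpers and the manifests are stand-ins built from the table above.

```python
EXPECTED_KINDS = {
    "structure", "structure.symmetry", "bonds",
    "volume.density", "volume.orbital", "volume.spin", "volume.elf",
    "volume.generic", "volume.difference",
    "wavefunction.gto",
    "spectra.ir", "spectra.raman", "spectra.uvvis", "spectra.ecd",
    "spectra.vcd", "spectra.nmr", "spectra.generic",
    "bands", "dos.total", "dos.projected",
    "trajectory", "reaction.path", "reaction.waypoints", "vibrations",
    "atom_properties", "scf_history", "citations",
}


def fake_manifest(kinds):
    """Stand-in for a manifest read from disk: only ids and kinds are filled in."""
    return {"sections": [{"id": f"s{n}", "kind": k} for n, k in enumerate(kinds)]}


def coverage_gap(expected, manifests):
    """Kinds that were expected but appear in none of the manifests."""
    written = set()
    for manifest in manifests:
        written.update(section["kind"] for section in manifest["sections"])
    return sorted(expected - written)


water = fake_manifest([
    "structure", "volume.density", "volume.orbital", "wavefunction.gto",
    "trajectory", "vibrations", "spectra.ir", "atom_properties",
    "scf_history", "citations"])
spectroscopy = fake_manifest([
    "structure", "volume.spin", "volume.elf", "volume.difference",
    "volume.generic", "spectra.raman", "spectra.uvvis", "spectra.ecd",
    "spectra.vcd", "spectra.nmr", "spectra.generic", "structure.symmetry",
    "bonds", "reaction.path", "reaction.waypoints"])
crystal = fake_manifest([
    "structure", "structure.symmetry", "bands", "dos.total",
    "dos.projected", "reaction.path"])

gap_all = coverage_gap(EXPECTED_KINDS, [water, spectroscopy, crystal])
gap_without_crystal = coverage_gap(EXPECTED_KINDS, [water, spectroscopy])
print(len(EXPECTED_KINDS), gap_all, gap_without_crystal)
```

Reporting the missing kinds by name, instead of only a count, points straight at the writer path that regressed.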

6. Validate and read back

You do not need vibe-view to consume a .qvf — it is a documented ZIP. The producer-side gate already validated it, but you can re-check:

from vibeqc.output.formats.qvf import validate_qvf
report = validate_qvf("crystal.qvf")
assert report["valid"], report["errors"]

To read it with nothing but the standard library — verifying checksums yourself, then pulling a binary array out — follow the QVF consumer reference. The essential move for a binary member is np.frombuffer(zf.read(m["path"]), dtype=m["dtype"]).reshape(m["shape"]). The showcase script’s _verify_one_checksum does the SHA-256 round-trip in a few lines.
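The read path can be sketched with the standard library alone. This is an illustration under assumptions, not the consumer reference: it builds a toy archive in memory so that it is self-contained, uses a hypothetical "data" role and "<f8" dtype string, and decodes with struct instead of NumPy. With NumPy available, the np.frombuffer call quoted above replaces the struct.unpack line.

```python
import hashlib
import io
import json
import struct
import zipfile
from math import prod


def toy_archive():
    """Build a tiny in-memory archive so the reader below has something to open."""
    raw = struct.pack("<6d", 0.0, 0.5, 1.0, 1.5, 2.0, 2.5)
    manifest = {"sections": [{"id": "vol_dens_0", "kind": "volume.density",
                              "members": {"data": {
                                  "path": "volumes/density.dat", "format": "binary",
                                  "sha256": hashlib.sha256(raw).hexdigest(),
                                  "dtype": "<f8", "shape": [1, 2, 3]}}}]}
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("manifest.json", json.dumps(manifest))
        zf.writestr("volumes/density.dat", raw)
    buf.seek(0)
    return buf


def read_verified(zf, spec):
    """Return a member's bytes after checking them against the manifest checksum."""
    payload = zf.read(spec["path"])
    if hashlib.sha256(payload).hexdigest() != spec["sha256"]:
        raise ValueError(f"checksum mismatch for {spec['path']}")
    return payload


with zipfile.ZipFile(toy_archive()) as zf:
    manifest = json.loads(zf.read("manifest.json"))
    spec = manifest["sections"][0]["members"]["data"]
    payload = read_verified(zf, spec)

count = prod(spec["shape"])
flat = struct.unpack(f"<{count}d", payload)   # "<f8" is little-endian float64
print(flat)
```

Verifying before decoding means a corrupted download fails loudly instead of producing a plausible-looking but wrong array.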

If vibe-view is installed, its reader gives you the same manifest plus classification of which kinds it can render:

from vibeview.qvf import QVFReader
reader = QVFReader("spectroscopy.qvf")
print(len(reader.manifest["sections"]), "sections")

7. Open it in vibe-view

vibe-view open examples/vibe_view/runs/qvf_showcase/spectroscopy.qvf

The startup banner lists every section as rendered / skipped, and the browser opens at http://127.0.0.1:8080. Walk the sidebar — each kind this tutorial wrote has its own panel. See Tutorial 45 for the panel-by-panel tour and the vibe-view user guide for every control.

8. Versioning, reserved kinds, and extensions

  • v1 vs v2. A molecular archive is qvf_version=1. The moment a reaction.path (or trajectory) carries PeriodicSystem frames, the writer bumps the archive to qvf_version=2, which adds per-frame lattice vectors and a dimensionality flag so a viewer can draw the cell and wrap atoms across boundaries. crystal.qvf in the showcase is a v2 archive; everything else is v1. v1 archives stay valid forever — v2 is a strict superset.

  • Reserved kinds. The design doc specifies several kinds the writer does not yet emit (volume.potential, volume.rdg, fermi_surface, phonon_bands, phonon_dos, equation_of_state). They are reserved so that when they land, existing consumers already know to expect them. Until then, use volume.generic / spectra.generic to carry the data.

  • Vendor extensions. A producer can write its own section under the x_<vendor>.* namespace. Consumers that do not understand it skip it (“skipped, vendor namespace”) rather than refusing the file — unless the section is flagged critical. The extension model is the governance for promoting a vendor kind to a canonical one.
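The skip-or-refuse rule for unknown sections can be sketched as a small classifier. This is an assumption-laden illustration, not vibe-view's logic: the "critical" key name, the "skipped, unknown kind" message, the x_acme.heatmap kind, and the abbreviated KNOWN_KINDS set are all hypothetical. Only the x_ prefix and the "skipped, vendor namespace" wording come from the text above.

```python
KNOWN_KINDS = {"structure", "volume.density", "spectra.ir"}  # abbreviated for the sketch


def classify_section(section, known=KNOWN_KINDS):
    """Decide what a consumer does with one manifest section."""
    kind = section["kind"]
    if kind in known:
        return "rendered"
    if section.get("critical", False):      # assumed flag name
        raise ValueError(f"cannot open: critical section {kind!r} not understood")
    if kind.startswith("x_"):
        return "skipped, vendor namespace"
    return "skipped, unknown kind"


sections = [
    {"id": "structure", "kind": "structure"},
    {"id": "acme0", "kind": "x_acme.heatmap"},   # hypothetical vendor kind
    {"id": "ph0", "kind": "phonon_bands"},       # reserved, not yet emitted
]
report = {s["id"]: classify_section(s) for s in sections}
print(report)

try:
    classify_section({"id": "acme1", "kind": "x_acme.mask", "critical": True})
    refused = False
except ValueError:
    refused = True
print(refused)
```

Skipping by default keeps old consumers usable as new kinds appear, while the critical flag lets a producer insist that a file must not be shown without a section the consumer cannot interpret.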

What next