Troubleshooting

When a vibe-qc run fails, the error message is almost always concrete and actionable, but knowing which knob to turn requires a bit of context. This page collects the errors users actually hit during a first month with the code, with the canonical fix for each. The homepage admonition carries the full v0.8.x known-issues list (the things that are bugs in vibe-qc rather than misconfigurations on your end); this page covers the recoverable situations.

For “I think I found a bug” cases see CONTRIBUTING.md § Filing a bug.

SCF didn’t converge

Symptom — molecular

  iter  85   -75.94731234   3.4e-04   2.1e-03   8
  iter  86   -75.94731201   3.3e-04   2.1e-03   8
  iter  87   -75.94731267   3.4e-04   2.1e-03   8
WARNING: SCF did NOT converge after 100 iterations
RuntimeError: SCF failed: max_iter (100) reached without convergence

The third column (dE) is bouncing instead of decreasing. The fourth column (\(\lVert\mathbf{e}\rVert_F\)) has plateaued. The DIIS subspace (the trailing integer) is full at 8.
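If you want to spot this pattern automatically (for example in a batch of jobs), the check can be sketched as below. This is a minimal sketch that assumes the iteration lines look exactly like the excerpt above; the function names, the 3-row window, and the 0.5 ratio are illustrative choices, not part of vibe-qc.

```python
import re

# Matches lines like "  iter  85   -75.94731234   3.4e-04   2.1e-03   8".
# The trailing DIIS-subspace column is optional (assumed layout, taken
# from the log excerpt in this page).
ITER_LINE = re.compile(
    r"^\s*iter\s+(\d+)\s+(-?\d+\.\d+)\s+(\S+)\s+(\S+)(?:\s+(\d+))?\s*$"
)

def parse_scf_log(text):
    """Return a list of (iteration, energy, dE, error_norm) tuples."""
    rows = []
    for line in text.splitlines():
        m = ITER_LINE.match(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), float(m.group(4))))
    return rows

def looks_stalled(rows, window=3, ratio=0.5):
    """True if the error norm shrank by less than `ratio` over the last `window` rows."""
    if len(rows) < window:
        return False
    errs = [r[3] for r in rows[-window:]]
    return min(errs) > ratio * max(errs)
```

Feeding the three lines from the symptom block to parse_scf_log and then looks_stalled flags the run as stalled, because the error norm sits at 2.1e-03 on all three iterations.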

Cause

DIIS is oscillating between two near-degenerate density solutions. Common triggers:

  • Stretched bond / dissociation-limit geometry (closed-shell reference becomes a poor description of the multireference state).

  • Transition-metal complex with ligand-field-degenerate d orbitals.

  • Broken-symmetry singlet biradical.

  • Wrong initial guess (e.g. AUTO picked SAP on a system where SAD starts closer to the true minimum).

Fix

As of v0.8.0 the default scf_accelerator is already EDIIS_DIIS (see the stiff-convergence tutorial), so if you see this on a v0.8.x build, that accelerator has already been tried and was not enough. The remaining levers, in order of bang-for-buck:

opts.level_shift   = 0.2          # Saunders-Hillier shift
opts.newton_threshold = 1.0       # Phase D2c Newton finalizer
opts.initial_guess = vq.InitialGuess.SAD  # if AUTO picked SAP
opts.max_iter      = 200          # last resort — buys time

If you specifically need the deprecated v0.7.x-style vanilla DIIS behaviour for parity work:

opts.scf_accelerator = vq.SCFAccelerator.DIIS

The full decision tree is in user_guide/scf_convergence.md § SCF accelerators.
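One way to apply the levers in the order given is an escalation loop that adds one setting at a time and stops at the first converged run. The sketch below is illustrative only: run_scf is a hypothetical callable standing in for however you launch a job, the options are held in a plain dict rather than a real options object, and "SAD" is a string stand-in for vq.InitialGuess.SAD. Applying the levers cumulatively is a design choice of this sketch.

```python
def escalate(run_scf, base_options):
    """Add convergence levers one at a time until run_scf reports success.

    run_scf is a hypothetical callable: it takes an options dict and
    returns True on convergence. Returns (final_options, levers_tried).
    """
    ladder = [
        ("level_shift", 0.2),        # Saunders-Hillier shift
        ("newton_threshold", 1.0),   # Newton finalizer
        ("initial_guess", "SAD"),    # stand-in for vq.InitialGuess.SAD
        ("max_iter", 200),           # last resort
    ]
    options = dict(base_options)     # never mutate the caller's dict
    tried = []
    if run_scf(options):
        return options, tried
    for key, value in ladder:
        options[key] = value
        tried.append(key)
        if run_scf(options):
            return options, tried
    raise RuntimeError(f"SCF still not converged after trying {tried}")
```

Recording which levers were needed (the second return value) makes it easier to put the minimal working set into the final input rather than carrying every setting forward.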

Symptom — periodic

  iter  18   -114.821342   1.2e-05   4.7e-04
  iter  19   -114.821339   1.1e-05   4.8e-04
  iter  20   -114.821345   1.2e-05   4.7e-04

For periodic SCF, oscillation in the last two digits usually signals a convergence-aid problem (a band crossing the Fermi level on a coarse k-mesh) rather than a hard SCF instability. The fix is Fermi-Dirac smearing:

opts.smearing_temperature = "metal"          # 0.005 Ha — see
                                             # tutorial 42
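To see why smearing helps, it can be useful to look at what Fermi-Dirac occupations do to levels near the Fermi energy. The NumPy sketch below is a generic textbook illustration, not vibe-qc's implementation: the function name is hypothetical, levels are treated as doubly occupied, and the chemical potential is found by simple bisection. The 0.005 Ha default mirrors the "metal" preset quoted above.

```python
import numpy as np

def fermi_occupations(eigenvalues, n_electrons, temperature=0.005):
    """Fermi-Dirac occupations (0..2 per level) summing to n_electrons.

    eigenvalues and temperature (k_B*T) are in Hartree. The chemical
    potential mu is located by bisection.
    """
    eps = np.asarray(eigenvalues, dtype=float)

    def occ(mu):
        # Clip the exponent so levels far from mu cannot overflow exp().
        x = np.clip((eps - mu) / temperature, -500.0, 500.0)
        return 2.0 / (1.0 + np.exp(x))

    lo, hi = eps.min() - 1.0, eps.max() + 1.0
    for _ in range(200):
        mu = 0.5 * (lo + hi)
        if occ(mu).sum() < n_electrons:
            lo = mu      # too few electrons: raise mu
        else:
            hi = mu      # too many (or exact): lower mu
    return occ(mu), mu

# Two degenerate levels at the Fermi energy share the last two electrons.
occupations, mu = fermi_occupations([-0.5, -0.1, -0.1, 0.3], n_electrons=4)
print(occupations, mu)
```

With integer occupations the SCF would have to pick one of the two degenerate levels and could flip between them from iteration to iteration; fractional occupations remove that discontinuity.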

If the periodic SCF lands at an impossible energy (over-bound by Madelung-scale shifts of ~0.5 Ha on H₂/STO-3G/30-bohr-box) the problem is something more fundamental — see CLAUDE.md § 7: that’s a bug, not a convergence-aid problem. File an issue with the input that reproduces it.

“Canonical orth dropped too many basis directions”

Symptom

RuntimeError: Canonical orthogonalisation dropped 17 of 234 basis
directions (overlap eigenvalues below 1e-7). The basis set is
nearly linearly dependent on this geometry/cell. See
docs/user_guide/linear_dependence.md.

Cause

The overlap matrix \(\mathbf{S}\) has near-zero eigenvalues — two or more basis-function combinations are nearly identical at the input geometry. Most common triggers:

  • Using a molecular basis set in a periodic calculation (e.g. def2-tzvp for solid NaCl). Diffuse functions overlap with periodic images. Use the pob-* family for periodic systems (see tutorial 16).

  • A vDZP / dhf-* / x2c-* basis without an explicit ECP center (the heavy-element basis is short by Z core electrons; without the ECP the SCF reaches for them via diffuse-function overlap).

  • Atoms placed at or very near the same coordinates by an upstream geometry-conversion step (CIF → POSCAR rounding, duplicated atoms from a faulty supercell expansion).
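The mechanism behind the error can be reproduced in a few lines of NumPy. This is a generic sketch of canonical orthogonalisation, not the vibe-qc routine; the function name is hypothetical and the 1e-7 threshold is simply the value quoted in the error message above.

```python
import numpy as np

def canonical_orthogonalizer(S, threshold=1e-7):
    """Return (X, n_dropped) such that X.T @ S @ X is the identity.

    Eigenvectors of the overlap matrix S whose eigenvalue is at or below
    `threshold` are discarded, which is what "dropped basis directions"
    refers to.
    """
    eigvals, eigvecs = np.linalg.eigh(S)
    keep = eigvals > threshold
    X = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return X, int(np.count_nonzero(~keep))

# Two normalised functions that are almost identical (overlap ~ 1).
S = np.array([[1.0, 1.0 - 1e-9],
              [1.0 - 1e-9, 1.0]])
X, n_dropped = canonical_orthogonalizer(S)
print(n_dropped, X.shape)
```

The two functions span essentially one direction, so one eigenvalue of S is about 1e-9 and that direction is discarded. Each trigger in the list above produces the same situation at scale: many near-duplicate combinations, many tiny eigenvalues.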

Fix

Pick the appropriate fix for the trigger:

# 1. Use a solid-state basis instead.
basis = vq.BasisSet(sysp.unit_cell_molecule(), "pob-tzvp")

# 2. Filter out diffuse primitives at the BasisSet stage.
basis = vq.make_basis(sysp.unit_cell_molecule(), "def2-tzvp",
                      exp_to_discard=0.05)

# 3. Wire ECP centers explicitly for vDZP / dhf-* / x2c-*
#    (see docs/user_guide/ecp.md § ECPCenter recipe).
opts.ecp_centers = [
    vq.ECPCenter(Z=78, xyz=[0.0, 0.0, 0.0]),   # Pt
    ...
]
opts.ecp_library = "ecp60mdf"

# 4. Auto-optimise periodic cutoffs jointly with screening.
opts.auto_optimize_truncation = True    # default on v0.7+

user_guide/linear_dependence.md walks through the v0.7 diagnostic stack (vq.eigs_preflight, vq.disambiguate_critical_overlap, auto_optimize_truncation) and the matching fix recipe in detail.

Memory abort before the SCF starts

Symptom

vibe-qc estimates this calculation will require ~218.4 GB of memory:
    ERI tensor      186.0 GB
    ...
Available on this machine: 7.2 GB. ABORTING.

InsufficientMemoryError: Set `options.memory_override = True` (or
pass `memory_override=True` to `run_job`) to proceed anyway.

Cause

The pre-flight estimator (see user_guide/memory.md) sized the dense four-index ERI tensor or the MO transformation buffers and added a 20% headroom; that number exceeds the machine’s available RAM.
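The quartic scaling is what makes this number grow so quickly. As a back-of-the-envelope sketch (assuming double precision, no permutational symmetry, decimal gigabytes, and only the ERI term; the real estimator also counts other buffers and may differ on each of these points):

```python
def dense_eri_gb(n_basis, headroom=0.20, bytes_per_value=8):
    """Rough size in GB (1e9 bytes) of a dense n^4 ERI tensor plus headroom.

    Assumptions of this sketch: 8-byte doubles, no symmetry packing,
    headroom applied as a simple multiplicative factor.
    """
    raw_bytes = n_basis ** 4 * bytes_per_value
    return raw_bytes * (1.0 + headroom) / 1e9

for n in (100, 250, 400):
    print(f"{n:4d} basis functions -> {dense_eri_gb(n):8.1f} GB")
```

Doubling the basis size multiplies the estimate by 16, which is why dropping from a triple-zeta to a double-zeta basis is the first fix to try.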

Fix — in order of severity

# 1. Smaller basis (always start here).
run_job(mol, basis="def2-svp", ...)

# 2. Density fitting — orders-of-magnitude smaller working set.
opts.density_fit = True
opts.aux_basis = vq.default_aux_basis_for(basis_name, kind="jk")

# 3. RIJCOSX — even smaller working set for large hybrid-DFT.
opts.density_fit = True
opts.cosx = True

# 4. Override the check (you accept the risk of swap-thrashing).
run_job(mol, ..., memory_override=True)

density_fit=True is the right answer for ~250-1000 basis functions; cosx=True is the right answer above ~1000. See user_guide/density_fitting.md.
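The rule of thumb above can be written down as a small helper. This is only a sketch: the function name is hypothetical, the thresholds are the approximate ones quoted in the text, and treating anything below ~250 basis functions as a plain (non-DF) run is an assumption of this example rather than something the guide states.

```python
def suggest_integral_strategy(n_basis):
    """Map a basis-function count onto the density_fit / cosx switches."""
    if n_basis > 1000:
        # Large systems: RIJCOSX.
        return {"density_fit": True, "cosx": True}
    if n_basis >= 250:
        # Mid-size systems: density fitting alone.
        return {"density_fit": True, "cosx": False}
    # Assumption of this sketch: small systems run without DF.
    return {"density_fit": False, "cosx": False}

print(suggest_integral_strategy(600))
```

The returned keys match the option names used in the fix list, so the dict can be copied onto the options by hand.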

Warning

The v0.7.3 DF integral-kernel SIGSEGV on auxiliary bases with l ≥ 1 shells is still open as of v0.8.0: only s-only auxiliary bases (sto-3g, sto-6g, 6-31g) are safe in the DF / ADFT path. Use the default aux_basis picker on a molecular run; periodic GDF has its own metric path that’s not affected.

KeyError: 'Ne' (or any other element) on basis load

Symptom

KeyError: "basis set 'pob-tzvp' has no entry for element 'Ne'"

Cause

The bundled .g94 for that basis set covers a subset of the periodic table that doesn’t include the requested element. Known gaps as of v0.8.0:

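| Basis         | Missing on main                                             | Mitigation                                                    |
|---------------|-------------------------------------------------------------|---------------------------------------------------------------|
| pob-tzvp      | Ne                                                          | Pull branch fix/pob-tzvp-add-ne (def2-TZVP Ne block borrowed) |
| pob-dzvp-rev2 | All of Si, Fe, transition metals (19-element coverage only) | Use pob-tzvp-rev2 instead for silicates / TM systems          |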

Fix

For Ne specifically, the workaround branch is documented in the homepage admonition § “Open in the v0.7.x maintenance window”. For other elements, either:

  • Switch to a basis that covers the element (def2-tzvp / cc-pvtz for molecular work; pob-tzvp-rev2 for periodic).

  • Add the element by hand — copy the matching block from BSE into a custom/*.g94 file (see user_guide/basis_sets.md § Custom basis sets) and rebuild the library with ./scripts/setup_basis_library.sh.

“DF gradient disagrees by ~115 mHa/bohr”

Symptom

Geometry optimisation drifts away from the true minimum when density_fit=True is on; or hand-written FD gradients disagree with compute_gradient by ~100 mHa/bohr on glycine / def2-TZVP.

Cause

Fixed in v0.8.0. The cause was a libint engine-state leak in the 3c-ERI gradient kernel (compute_3c_eri_gradient_weighted in cpp/src/df.cpp): two adjacent same-l heavy atoms (e.g. the carboxyl O=C-O-H oxygens in glycine and formic acid) caused the engine to leak derivative-buffer state across compute() calls.

Fix

Upgrade to vibe-qc ≥ v0.8.0. The regression guard is tests/test_df_gradient.py::test_df_rhf_gradient_hcooh_def2_tzvp_matches_direct. density_fit=True gradients are now safe at def2-TZVP-class basis sets.
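If you want to confirm the fix on your own build, a central finite-difference check is the standard tool. The sketch below is self-contained and uses a toy harmonic "energy" so that it runs anywhere; in practice you would replace toy_energy with a function that runs an SCF at the displaced geometry and compare against the analytic gradient. All names here are illustrative, and the 1e-3 bohr step is a common default rather than a vibe-qc recommendation.

```python
import numpy as np

def fd_gradient(energy, coords, step=1e-3):
    """Central-difference gradient of energy(coords) for every coordinate."""
    coords = np.asarray(coords, dtype=float)
    grad = np.zeros_like(coords)
    for idx in np.ndindex(coords.shape):
        plus, minus = coords.copy(), coords.copy()
        plus[idx] += step
        minus[idx] -= step
        grad[idx] = (energy(plus) - energy(minus)) / (2.0 * step)
    return grad

# Toy stand-in for an SCF energy: one harmonic bond between two atoms.
def toy_energy(xyz, k=0.5, r0=1.4):
    r = np.linalg.norm(xyz[0] - xyz[1])
    return 0.5 * k * (r - r0) ** 2

def toy_gradient(xyz, k=0.5, r0=1.4):
    d = xyz[0] - xyz[1]
    r = np.linalg.norm(d)
    g = k * (r - r0) * d / r
    return np.array([g, -g])

xyz = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.6]])
max_dev = np.abs(fd_gradient(toy_energy, xyz) - toy_gradient(xyz)).max()
print(f"max |FD - analytic| = {max_dev:.2e} Ha/bohr")
```

A healthy analytic gradient agrees with the finite-difference one to within the SCF convergence noise; a deviation on the order of 100 mHa/bohr, as in the symptom above, points at the gradient code rather than at the step size.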

“Analytic RHF gradient is wrong”

Symptom

Geometry optimisation walks to the wrong minimum on a system with f-shells AND ≥2 different second-row elements (e.g. CO, CH₂O, glycine + def2-TZVP). The gradient error can reach 161 mHa/bohr on glycine with a recent libint build.

Cause

Fixed in v0.8.0. The two_electron_gradient_contribution kernel (the direct 4-index ERI gradient) relied on libint’s internal DerivMapGenerator unscramble path, whose derivative-to-atom routing was buggy for high-l, mixed-l shell quartets. Fix C rewrites the kernel as a canonical 1/8 shell-quartet loop with an explicit l-canonical reorder before engine.compute().

Fix

Upgrade to vibe-qc ≥ v0.8.0. The regression guard is tests/test_gradient_f_bug.py::test_h2co_def2_tzvp_gradient_matches_pyscf. Direct analytic gradients with f-shells are now correct (post-fix H2CO/def2-tzvp matches PySCF to ~5e-11 Ha/bohr).

ImportError: cannot import name 'EEQOptions' from 'vibeqc._vibeqc_core'

Symptom

ImportError: cannot import name 'EEQOptions' from
'vibeqc._vibeqc_core' (/path/to/_vibeqc_core.cpython-...so)

Cause

You have a Python source tree from a newer commit but the compiled _vibeqc_core.so is from an older one, so the symbols the Python side expects are missing from the extension.

Fix

pip install -e . --no-build-isolation --force-reinstall

This rebuilds the C++ extension against the current Python sources. The rebuild takes ~30-90 s on macOS and about the same on Linux. If you’re working in a git worktree, use a per-worktree venv so the parent venv doesn’t get clobbered.

ModuleNotFoundError: No module named 'vibeqc'

Symptom

ModuleNotFoundError: No module named 'vibeqc'

Cause

You ran the script with the wrong Python — the system python3 instead of the venv-installed one.

Fix

Either give the full path to the venv’s interpreter:

~/path/to/vibeqc/.venv/bin/python my_input.py

or activate the venv first:

source ~/path/to/vibeqc/.venv/bin/activate
python my_input.py
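When it is unclear which interpreter a script is actually running under, a short standard-library probe settles it. This is a generic sketch (the helper name is illustrative); it only reports where Python looks, it does not import vibeqc itself.

```python
import importlib.util
import sys

def where_is(module_name="vibeqc"):
    """Report the running interpreter and where a module would load from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        # In a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # None means this interpreter cannot see the module at all.
        "module_origin": spec.origin if spec else None,
    }

print(where_is())
```

If module_origin is None and in_venv is False, the script is running under the system Python and one of the two fixes above applies.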

“Cell-list construction returned 0 cells”

Symptom

RuntimeError: cell-list construction returned 0 cells for cutoff
12.0 bohr on this lattice — check that the lattice matrix is full
rank and the cutoff is positive.

Cause

Either a rank-deficient lattice matrix (one or more all-zero lattice vectors) or a non-positive lattice_opts.cutoff_bohr. Common when a 1D-chain input has the two vacuum axes accidentally set to 0 instead of a wide separation:

sysp = vq.PeriodicSystem(
    dim=1,
    lattice=np.diag([4.0, 0.0, 0.0]),    # ← bug — vacuum axes are 0
    unit_cell=[...],
)

Fix

Set the non-periodic axes to a generous vacuum separation:

sysp = vq.PeriodicSystem(
    dim=1,
    lattice=np.diag([4.0, 30.0, 30.0]),  # 30 bohr vacuum decouples images
    unit_cell=[...],
)
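Both conditions named in the error message can be checked before the system is ever built. The helper below is a hypothetical pre-check written with NumPy only; it is not part of vibe-qc, and it assumes the lattice is passed as a 3x3 matrix as in the examples above.

```python
import numpy as np

def check_lattice(lattice, cutoff_bohr):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    lattice = np.asarray(lattice, dtype=float)
    if lattice.shape != (3, 3):
        problems.append(f"lattice has shape {lattice.shape}, expected (3, 3)")
    elif np.linalg.matrix_rank(lattice) < 3:
        problems.append("lattice matrix is not full rank "
                        "(is a vacuum axis set to 0?)")
    if cutoff_bohr <= 0:
        problems.append(f"cutoff must be positive, got {cutoff_bohr}")
    return problems

print(check_lattice(np.diag([4.0, 0.0, 0.0]), 12.0))    # the buggy input
print(check_lattice(np.diag([4.0, 30.0, 30.0]), 12.0))  # the fixed input
```

The first call reports the rank problem and the second returns an empty list.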

vibeqc-cite: manifest missing after a job

Symptom

$ vibeqc-cite output-h2o
error: manifest file not found: output-h2o.system

Cause

Either the job hasn’t finished writing the manifest (look for output-h2o.system.tmp), or the path is wrong, or the run was killed before the initial manifest landed.

Fix

  • Check output-h2o.system actually exists: ls output-h2o.*.

  • Try the stem with the .out suffix — vibeqc-cite output-h2o.out also works (the CLI normalises via .with_suffix(".system")).

  • For pre-v0.8.x runs there is no manifest at all, because the run predates the citation surface; vibeqc-cite failing to find one is the expected outcome. Re-run the SCF on a current build to get the bibliography.
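The suffix normalisation mentioned above can be mimicked with pathlib to see which file the CLI will look for. This is a sketch of the behaviour as described on this page, not the CLI's source; the helper names are hypothetical.

```python
from pathlib import Path

def manifest_path(arg):
    """Map a job stem or output file name onto the .system manifest path."""
    return Path(arg).with_suffix(".system")

def diagnose(arg):
    """Classify the manifest as 'found', 'still being written', or 'missing'."""
    manifest = manifest_path(arg)
    tmp = manifest.with_name(manifest.name + ".tmp")
    if manifest.exists():
        return "found"
    if tmp.exists():
        return "still being written"
    return "missing"

print(manifest_path("output-h2o"))      # output-h2o.system
print(manifest_path("output-h2o.out"))  # output-h2o.system
```

One pathlib detail worth knowing: with_suffix replaces everything after the last dot, so a stem that itself contains a dot (say run.v2) would be mapped to run.system. If the CLI really normalises this way, avoid dots in job stems.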

Things that are not errors

A handful of things look alarming but aren’t:

  • vibe-qc estimates this calculation will require ~218.4 GB, Proceeding (override) — you passed memory_override=True and the run is going through despite the estimate. Wait and see; the estimate’s 20% headroom plus the dense-ERI-tensor estimate is conservative.

  • canonical_orth dropped 2 of 234 basis directions with a small number — vibe-qc reports any drop; 1-3 dropped basis directions out of hundreds is typical of a tight cell or large basis. Only worry if the drop exceeds ~5% of the basis count.

  • # no citation route for basis 'foo' in .references — your basis isn’t in the routing table. The job ran fine; the bibliography is just missing one entry that you’ll need to add by hand. The fix is to extend database.toml in your next PR.

  • Used 4 OpenMP threads even though you set OMP_NUM_THREADS=16 — the system’s actual core count (or the cgroup limit on a cluster node) wins over the env var. Sanity-check with nproc or sysctl hw.physicalcpu (macOS).
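For the thread-count case, the same sanity check can be done from Python, which is handy inside a job script. The sketch uses only the standard library; note that os.sched_getaffinity exists on Linux only and reflects the CPU affinity mask, so a cgroup CPU quota (as opposed to a cpuset) will not show up here.

```python
import os

def usable_cpu_count():
    """Number of CPUs this process is allowed to run on (at least 1)."""
    if hasattr(os, "sched_getaffinity"):   # Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1             # macOS, Windows, others

requested = os.environ.get("OMP_NUM_THREADS")
print(f"OMP_NUM_THREADS={requested!r}, usable CPUs={usable_cpu_count()}")
```

If the usable count is lower than the value you requested, the smaller number is the one to expect in the log.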

Still stuck?

gitlab.peintinger.com/mpei/vibeqc/-/issues

When filing an issue, include:

  1. The full error traceback.

  2. The minimal Python script that reproduces it.

  3. The output-*.system manifest (carries the hardware + library versions vibe-qc needs to reproduce).

  4. The vibe-qc version (python -c "from vibeqc import VIBEQC_VERSION; print(VIBEQC_VERSION)").

The maintainer or release chat will triage. For security vulnerabilities follow SECURITY.md instead of opening a public issue.