vibe-qc bug-hunt brief

This is the binding operating manual for any chat or contributor running a bug hunt on a tagged vibe-qc release. If you’re a Claude Code chat that’s been told to bug-hunt vibe-qc, this is your policy document — read it end-to-end before doing any work.

Status: active for the v0.4.x patch line. Last updated 2026-04-27.

Canonical source is main. The release branch is fast-forward-only from a tag and won’t have this file until the next patch release rolls (v0.4.1 ships the v0.4.x version of this brief to release). Until then, a bug-hunt chat checked out on release should read the brief from main via:

git fetch origin main
git show origin/main:docs/bug_hunt_brief.md | less

Alternatively, consult the canonical copy at https://vibe-qc.com/bug_hunt_brief.html once the docs site auto-deploys the new file.


1. Scope (priority order)

You hunt for and fix:

  1. Wrong answers — energies, gradients, properties that disagree with PySCF / CRYSTAL / ORCA at validated tolerances. Highest priority.

  2. Crashes / segfaults — anything that exits without a Python traceback.

  3. Ergonomic dead-ends — confusing errors, missing input validation, footguns that lead users into invalid states.

  4. Documentation accuracy — code blocks in tutorials and docstrings that don’t actually run on a clean checkout.

Bias toward (1) before (2–4). If you find a (2–4) issue while hunting (1), drop a one-liner in your reply rather than getting derailed.

2. Branch model

Branch lock

You work on the release branch only. The v0.4.0 tag sits on release at commit 83e9ef5.

You never push to main. main is the engineering chat’s territory (currently on the v0.5 milestone). Bugs that exist on both branches still get fixed here on release; the engineering chat cherry-picks back.

3. Setup

git clone ssh://git@gitlab.peintinger.com:26/mpei/vibeqc.git
cd vibeqc
git checkout release
./scripts/setup_native_deps.sh
python3 -m venv .venv
.venv/bin/pip install -e '.[test]'
.venv/bin/python -c "from vibeqc.banner import print_banner; print_banner()"
# Must print:  Release v0.4.0

Test baseline before any other work: .venv/bin/pytest tests/ must report 879 passed / 25 skipped / 1 xfailed. If it reports anything else, stop and figure out why before touching any source.
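The baseline comparison can be automated. The sketch below parses a pytest summary line and compares it with the counts quoted above; the helper names are hypothetical and not part of vibe-qc, and the sample summary line is made up for illustration.

```python
import re

# Baseline stated in this brief for v0.4.0 on release.
BASELINE = {"passed": 879, "skipped": 25, "xfailed": 1}

# Outcomes that must match the baseline exactly (absent means zero).
TRACKED = ("passed", "failed", "error", "errors", "skipped", "xfailed", "xpassed")


def parse_summary(line):
    """Extract '<count> <outcome>' pairs from a pytest summary line."""
    return {name: int(count) for count, name in re.findall(r"(\d+) ([a-z]+)", line)}


def matches_baseline(line, baseline=BASELINE):
    """True only if every tracked outcome count equals the baseline count."""
    counts = parse_summary(line)
    return all(counts.get(key, 0) == baseline.get(key, 0) for key in TRACKED)


# Illustrative summary line (timing is invented).
sample = "==== 879 passed, 25 skipped, 1 xfailed in 412.07s (0:06:52) ===="
print(matches_baseline(sample))
```

Any extra failure, error, or unexpected pass makes the check return False, which is the cue to stop and investigate.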

4. Workflow per bug

  1. Reproduce minimally. Reduce to the smallest failing case.

  2. Pin it as a regression test. Add a test to tests/test_bug_repros_v04x.py (create the file if missing). The test must fail before your fix and pass after. Each test docstring includes a one-line description and links to the reproducing scenario.

  3. Fix. Edit the source. Keep the diff under ~300 lines and ideally limited to a single subsystem. If a fix needs to be bigger or touches multiple subsystems, stop and escalate (see § 6).

  4. Run the full suite. pytest tests/ must show one additional passing test (your new one) compared to baseline.

  5. Run sphinx-build -W if you touched any docs. This command must build clean:

    sphinx-build -W -b html docs /tmp/sb

  6. Commit. Use this format:

    fix(<area>): <one-line summary>
    
    <2-4 line problem description: what was wrong, what symptom>
    
    <2-4 line fix description: what changed, why this is the right fix>
    
    Found via: <e.g. "PySCF cross-check on ZnO / B3LYP / STO-3G">
    Reproducer: tests/test_bug_repros_v04x.py::test_<name>
    Affects main: yes / no / probably (engineering chat to confirm)
    
    Co-Authored-By: <your model identifier> <noreply@anthropic.com>
    
  7. Push to release. git push origin release. The branch is fast-forward-only; your fix appends to it.
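Before committing, the message can be checked against the format in step 6. This is a minimal sketch, assuming the four footers are mandatory and each starts its own line; the function name, the example subject, and the test name in the example are hypothetical.

```python
import re

# Footers required by the commit format in step 6.
REQUIRED_FOOTERS = ("Found via:", "Reproducer:", "Affects main:", "Co-Authored-By:")

# Subject line shape: fix(<area>): <one-line summary>
SUBJECT_RE = re.compile(r"^fix\([^)]+\): \S.*$")


def commit_message_problems(message):
    """Return a list of human-readable problems; empty means the format is met."""
    lines = message.strip().splitlines()
    problems = []
    if not lines or not SUBJECT_RE.match(lines[0]):
        problems.append("subject is not 'fix(<area>): <summary>'")
    for footer in REQUIRED_FOOTERS:
        if not any(line.startswith(footer) for line in lines):
            problems.append(f"missing footer '{footer}'")
    return problems


# Illustrative message; the subject and test name are invented placeholders.
EXAMPLE = """\
fix(ecp): example summary line

Problem description goes here.

Fix description goes here.

Found via: PySCF cross-check on ZnO / B3LYP / STO-3G
Reproducer: tests/test_bug_repros_v04x.py::test_example
Affects main: probably
Co-Authored-By: <your model identifier> <noreply@anthropic.com>
"""

print(commit_message_problems(EXAMPLE))
```

An empty list means the subject and all footers are present. The check does not judge whether the problem and fix descriptions are adequate.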

5. Tagging — you do NOT cut tags

After landing a fix, surface a status line in your reply:

Fix landed at <sha> on release. Ready to tag v0.4.{N+1}.

The user runs the actual tag commands (git tag -a vX.Y.Z, git push origin vX.Y.Z). You do not run git tag yourself, ever, under any circumstances.
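The status line can be generated rather than typed by hand. The sketch below only formats a string from a commit SHA and the latest existing tag; it never invokes git. Both function names are hypothetical, and the SHA in the usage line is a placeholder.

```python
import re


def next_patch_tag(latest_tag):
    """Return the tag one patch level above latest_tag, e.g. v0.4.0 -> v0.4.1."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", latest_tag)
    if match is None:
        raise ValueError(f"not a vX.Y.Z tag: {latest_tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"v{major}.{minor}.{patch + 1}"


def status_line(sha, latest_tag):
    """Format the status line that is surfaced to the user after a fix lands."""
    return f"Fix landed at {sha} on release. Ready to tag {next_patch_tag(latest_tag)}."


# Placeholder SHA for illustration only.
print(status_line("abc1234", "v0.4.0"))
```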

6. Hard escalation gates — STOP and ask before any of these

  • Push to main (any commit, ever).

  • Cut a new git tag (any tag, ever).

  • Make a change of more than 300 lines, or one that touches more than 5 files unrelated to the bug under investigation.

  • Add a new dependency (Python or system).

  • Bump the major or minor version in pyproject.toml. Patch bumps come from the user explicitly cutting a tag.

  • Force-push, rewrite history, delete branches, modify protected refs.

  • Run any operation you’re not sure is reversible.

When in doubt, escalate. The cost of a stop-and-ask is much lower than the cost of an unauthorized destructive action.
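The size gate can be checked mechanically from the output of `git diff --numstat`, which prints one tab-separated `added deleted path` row per file and `-` counts for binary files. The sketch below is a conservative proxy under one stated assumption: it counts every touched file, because a script cannot tell which files are unrelated to the bug. The function names and the sample paths are hypothetical.

```python
def diff_size(numstat):
    """Return (changed_lines, touched_files) from `git diff --numstat` output."""
    changed_lines = 0
    touched_files = 0
    for row in numstat.splitlines():
        if not row.strip():
            continue
        added, deleted, _path = row.split("\t", 2)
        touched_files += 1
        if added != "-":  # binary files report "-" for both counts
            changed_lines += int(added) + int(deleted)
    return changed_lines, touched_files


def needs_escalation(numstat, max_lines=300, max_files=5):
    """True if the diff exceeds either threshold from the escalation gates."""
    changed_lines, touched_files = diff_size(numstat)
    return changed_lines > max_lines or touched_files > max_files


# Made-up numstat output for illustration.
sample = "10\t2\tsrc/a.py\n5\t0\ttests/t.py\n"
print(diff_size(sample), needs_escalation(sample))
```

A True result means stop and ask. A False result does not clear the other gates, which still need human judgment.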

7. Hand-off protocol

Three categories of finding need to leave the bug-hunt chat:

Bug needs feature-level work (new C++ kernel, public API restructure, anything bigger than a fix) — write up your analysis as bug_repros/notes/<short-name>.md, commit + push, surface in your reply tagged engineering-chat-needed. The user routes to the engineering chat working on main.

Bug is in tutorial / docs / website prose (not code logic) — write up under bug_repros/notes/<short-name>.md, surface tagged docs-chat-needed. The user routes to the docs chat.

Bug exists on both branches (most fixes will) — land your fix on release with commit-message footer Affects main: yes. The engineering chat picks it up by scanning that footer.

The bug_repros/notes/ directory is a free-form scratch space; it ships with the repo so notes are durable but isn’t required to follow any particular structure.
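Scanning for the `Affects main:` footer could look like the sketch below. It operates on already-extracted (sha, message) pairs rather than calling git. Treating `probably` as a cherry-pick candidate is an assumption made here, and the function names and sample commits are hypothetical.

```python
import re

# Matches the footer line and captures its value.
FOOTER_RE = re.compile(r"^Affects main:\s*(yes|no|probably)\b", re.MULTILINE | re.IGNORECASE)


def affects_main(message):
    """Return 'yes', 'no', or 'probably', or None if the footer is absent."""
    match = FOOTER_RE.search(message)
    return match.group(1).lower() if match else None


def cherry_pick_candidates(commits):
    """Return the SHAs whose footer says the bug may also exist on main."""
    return [sha for sha, message in commits if affects_main(message) in ("yes", "probably")]


# Made-up commits for illustration.
commits = [
    ("aaa1111", "fix(scf): one\n\nAffects main: yes\n"),
    ("bbb2222", "fix(docs): two\n\nAffects main: no\n"),
    ("ccc3333", "fix(ecp): three\n\nAffects main: probably (engineering chat to confirm)\n"),
]
print(cherry_pick_candidates(commits))
```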

8. Reference values for spot-checks

These canaries should always pass. If any regress, the bug is upstream of whatever you were hunting — stop, investigate, and report.

| System | Spec | Expected | Pinned in |
| --- | --- | --- | --- |
| H₂ | STO-3G / RHF | −1.11675931 Ha | tests/test_periodic_rhf_ewald.py |
| H₂O | 6-31G* / RHF | −76.0107 Ha | tests/test_h2o.py |
| Zn²⁺ | LANL2DZ ECP / 6-31G / UHF | matches PySCF to µHa | tests/test_ecp_validation.py |
| NaCl | Madelung constant | −1.7475645946… | tests/test_madelung.py |
| H atom | any DFT functional / UKS | ⟨S²⟩ = 0.75 | tests/test_periodic_uks_ewald.py |
| Multi-k Ewald [1,1,1] | molecular-limit cell | matches Γ-only Ewald to 1e-10 Ha | tests/test_periodic_*_multi_k_ewald.py |
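For a quick manual spot-check outside the test suite, the numeric canaries can be compared with an absolute tolerance. This is a sketch, not the project's validation code: the reference values come from the table above, every tolerance is an assumption chosen to match the number of quoted digits, and the dictionary keys and function name are hypothetical. The ⟨S²⟩ reference is the exact doublet value S(S+1) with S = 1/2.

```python
import math

# name -> (reference value, absolute tolerance). All tolerances are assumed.
CANARIES = {
    "H2 STO-3G RHF energy / Ha": (-1.11675931, 1e-8),
    "H2O 6-31G* RHF energy / Ha": (-76.0107, 1e-4),
    "NaCl Madelung constant": (-1.7475645946, 1e-10),
    "H atom UKS <S^2>": (0.5 * (0.5 + 1.0), 1e-6),  # S(S+1) for S = 1/2
}


def canary_ok(name, value):
    """True if value agrees with the reference within its absolute tolerance."""
    reference, tolerance = CANARIES[name]
    return math.isclose(value, reference, rel_tol=0.0, abs_tol=tolerance)


# Illustrative inputs, not computed results.
print(canary_ok("NaCl Madelung constant", -1.74756459463))
print(canary_ok("H2 STO-3G RHF energy / Ha", -1.1167))
```

The pinned tests remain the authority; a helper like this is only useful for checking a number read off an SCF log.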

9. Stop criteria for a session

End a session and report when you’ve either:

(a) made a full pass through the active-hunt targets in § 10, or (b) accumulated 5–10 fixes on release, whichever comes first.

In either case, surface in your reply:

  • Bugs found / fixed / escalated.

  • SHAs landed on release (one-line summary each).

  • File paths of any bug_repros/notes/*.md (engineering / docs hand-offs).

  • Recommendation: cut v0.4.1 now, or keep hunting?

Quality over breadth.


10. Active hunt — v0.4.x

These are the eight high-value hunt areas for v0.4.x. Don’t invent your own targets until you’ve explored these. The v0.4.0 CHANGELOG [0.4.0] § Limitations flags some of them explicitly as known unvalidated.

  1. EWALD_3D KS on real ionic crystals. Molecular-limit and ω-invariance witnesses pass at v0.4.0; never validated against CRYSTAL on LiH / NaCl / MgO / Si. Pick one, compare lattice constants / cohesive energies / band gaps.

  2. ECP × multi-k periodic. Phase 14 (ECP via libecpint) and Phase 15 (multi-k Ewald) shipped in the same release but were never cross-tested. Run a transition-metal crystal (ZnO, TiO₂, any simple TM oxide) with RHFOptions.ecp_centers + the EWALD_3D Coulomb method.

  3. Hybrid functionals on TM systems. B3LYP / PBE0 paths exist on periodic RKS / UKS Ewald (any α = func.hf_exchange_fraction > 0) but were tested only on H₂. Run B3LYP on a real TM crystal — check band gap and HOMO-LUMO sanity.

  4. Tight-core resolution on the FFT grid. Saunders–Dovesi multipolar splitting is not shipped → STO-3G O 1s is xfailed in the test suite. Where else does this lurk? Anything with α_basis > 50 on a 0.3-bohr default grid is a candidate.

  5. Tutorials 20–25 actually run. New in v0.4.0: 20 (natural orbitals), 21 (PDOS), 22 (periodic Bloch cubes), 23 (tight-cell DFT), 24 (periodic SCF convergence), 25 (symmetry storage). Execute every code block on a clean checkout.

  6. Wheel install path. A stock pip install vibe-qc (no setup_native_deps.sh) should work because the basis + ECP libraries are bundled inside the wheel. Test it from a fresh venv with no third_party/ directory; confirm a TM ECP calculation runs end-to-end.

  7. Banner reliability. Every persisted SCF log carries the banner. On release it must print Release v0.4.0. Edge cases: pip-installed wheel without .git (does build_info() degrade gracefully?), detached HEAD, uncommitted local changes.

  8. Numerical robustness. Very tight cells (small box, image atoms overlapping); near-linearly-dependent bases (the canonical orthogonalisation path); high-spin states; charged cells (the Madelung-correction helpers exist but aren’t auto-applied).
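For hunt area 4, a back-of-the-envelope calculation shows why a tight primitive is hard to resolve on a real-space grid. A Gaussian exp(−α r²) has a full width at half maximum of 2·sqrt(ln 2 / α). The sketch below compares that width with the grid spacing, taking α in bohr⁻². The "at least one grid point per FWHM" criterion is an illustrative heuristic chosen here, not the criterion vibe-qc uses, and the function names are hypothetical.

```python
import math


def gaussian_fwhm(alpha):
    """Full width at half maximum (bohr) of exp(-alpha * r**2), alpha in bohr^-2."""
    return 2.0 * math.sqrt(math.log(2.0) / alpha)


def is_under_resolved(alpha, spacing, points_per_fwhm=1.0):
    """Heuristic: flag primitives with fewer than points_per_fwhm grid points per FWHM."""
    return gaussian_fwhm(alpha) < points_per_fwhm * spacing


# Exponent 50 on a 0.3-bohr grid, the combination named in hunt area 4.
print(round(gaussian_fwhm(50.0), 4), is_under_resolved(50.0, 0.3))
```

At α = 50 the FWHM is about 0.24 bohr, narrower than a single 0.3-bohr grid step, so the peak can fall between grid points. Looping a check like this over the primitives of a basis set gives a quick list of candidates to test.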


11. Closed hunts

(empty — v0.4.x is the first patch line)