CASCI / CASSCF / NEVPT2 / CASPT2 — Implementation Handover

Last updated: 2026-06-06 (large-basis direct active-CI IC-CASPT2 path landed; method="caspt2" dispatches explicit-small / direct-large) Branch: main Status:

  • CASCI — fixed, validated vs PySCF, ungated. Supports nroots>1 for multi-root (excited-state) calculations.

  • CASSCF is orbital-optimized CASCI, validated vs PySCF mcscf.CASSCF, ungated.

  • NEVPT2 (strongly-contracted) — fixed, validated vs PySCF, ungated.

  • CASPT2method="caspt2" is now the internally-contracted variant (variant="ic", default), validated against OpenMolcas &CASPT2 to ≤2 µHa and ungated. The older strongly-contracted variant (variant="sc") and the IPEA shift (ipea≠0) stay gated (VIBEQC_EXPERIMENTAL_MULTIREF=1) — see “CASPT2” below.


What was wrong (audit 2026-05-30) and how it was fixed

These three methods landed in 0c8f0fcf as scaffolding on the v0.21 roadmap path. The audit found all three broken; they have since been rebuilt:

Method

Was

Now

CASCI

crashed for >1 same-spin e⁻; cored ≈ −104 Ha (below FCI)

rebuilt on the validated build_hamiltonian_matrix_unrestricted engine + correct frozen-core dressing; matches PySCF mcscf.CASCI

NEVPT2

3/8 classes; wrong-sign E_corr

strongly-contracted NEVPT2 via the uniform perturber-projection definition; matches PySCF mrpt.NEVPT per class

CASPT2

returned 0 for no-core

internally-contracted CASPT2 (Andersson 1992; nondiagonal generalized-Fock H₀), validated vs OpenMolcas to ≤2 µHa and un-gated; SC variant retained behind the flag

Key design decisions

  • CASCI is built on the proven FCI engine. The only CASCI-specific physics is the frozen-core dressing (_casci.py::_frozen_core_dressing): E_core = ⟨core|H|core⟩ and h̃_pq = h_pq + Σ_c (2 g_{pcqc} g_{pccq}) (physicist’s g). Validated to Δ=0 against an explicit CAS-determinant FCI over the full orbital space, and to ~1e-12 against PySCF.

  • RDMs (_rdm.py) are spin-summed 1/2/3-RDM builders in PySCF convention, validated elementwise vs fci.direct_spin1.make_rdm123 (~1e-15) and by the energy self-test energy_from_rdms == eigenvalue.

  • NEVPT2/CASPT2 share one engine (_mrpt.py). The strongly-contracted PT2 energy is computed from its definition — one perturber |V_g⟩ per external (core-hole / virtual-particle) pattern, E = Σ_g −⟨V_g|V_g⟩ / (⟨V_g|H₀|V_g⟩/⟨V_g|V_g⟩ E₀) — using a small spin-orbital second-quantization engine, so all spin/antisymmetry and combinatorial factors are handled by construction (no per-class formulas). NEVPT2 uses Dyall’s H₀; SC-CASPT2 uses the generalized-Fock H₀. Orbitals are semicanonicalized (core + virtual blocks of the generalized Fock diagonalized); the active CI vector is invariant under that rotation.

  • Internally-contracted CASPT2 is the default method="caspt2". For small contracted spaces _mrpt.py::_ic_caspt2_corr builds the first-order interacting space explicitly in the determinant basis as the contracted double excitations |Ψ_pqrs⟩ = Ê_pr Ê_qs|0⟩ (Andersson, Malmqvist, Roos, JCP 96, 1218 (1992); Celani-Werner, JCP 112, 5546 (2000)), projects them orthogonal to the CAS space, orthonormalizes through the overlap metric (dropping near-linear- dependent directions), and solves (H₀ E₀) C = −V with the full (nondiagonal) generalized Fock H₀ = P_SD F P_SD (the CASPT2N variant = OpenMolcas’s default &CASPT2). For larger unshifted IC jobs (ipea=0, imaginary=0) caspt2() dispatches to the direct active-CI signature-block path in _caspt2_rdm.py, which builds raw signature blocks without dense IC matrices and prunes one-body-unreachable cross-block H₀ couplings. Both paths are exact internal contraction and validated against OpenMolcas. Supports a frozen core (n_frozen), an imaginary level shift (imaginary σ; Forsberg-Malmqvist 1997, explicit path), and the IPEA shift (ipea ε; Ghigo 2004, gated explicit path).

  • CASSCF optimizes the orbitals on top of CASCI (_casscf.py). Two-step macro-iteration: solve CASCI, build the non-symmetric generalized Fock and the orbital gradient g_pq = 2(F_qp F_pq) over the non-redundant (inactive↔active / inactive↔virtual / active↔virtual) rotations, then take a backtracked augmented-Hessian step whose Hessian comes from central finite differences of that gradient. The gradient is exact (re-solving the CI in every basis removes the CI-response term) and is validated elementwise against finite differences of the trusted CASCI energy (~1e-9). The backtracking line search makes the energy monotone, so it cannot diverge. Like every CASSCF the surface is non-convex: the optimizer converges to the stationary point in the basin of the starting (HF) orbitals, which need not be the global minimum (PySCF’s default start can land in a different basin). Basis-independent correctness is pinned by the variational sandwich E_FCI E_CASSCF E_CASCI, a vanishing orbital gradient, and exact rotation-invariance of the full-space CASCI energy. NEVPT2/CASPT2 can run on the CASSCF reference via CASSCFResult.{cas, h1e_cas, h2e_cas}.

Validation (PySCF out-of-process, CLAUDE.md §10)

System

Quantity

vs reference

H₂O/STO-3G CAS(2,2)/(4,4), H₂/6-31g, LiH

CASCI

PySCF mcscf.CASCI, Δ ≤ 1e-9

H₂/6-31g, LiH, HF, Be (unique-minimum)

CASSCF (single-state)

PySCF mcscf.CASSCF, Δ ≤ 1e-7

H₂/6-31g

CASSCF orbital gradient

finite diff of CASCI energy, Δ ≤ 1e-6

H₂O/STO-3G CAS(2,2)

CASSCF sandwich

E_FCI E_CASSCF E_CASCI

H₂O/STO-3G CAS(2,2)

multi-root CASCI (nroots=3)

PySCF mcscf.CASCI, Δ ≤ 1e-9

H₂O/STO-3G CAS(2,2) w=[0.6,0.4]

SA-CASSCF (nroots=2)

PySCF mcscf.CASSCF(state_average_), Δ ≤ 1e-6

H₂/6-31g, H₂O/STO-3G CAS(2,2)/(4,4)

NEVPT2 (all 8 classes)

PySCF mrpt.NEVPT per class, Δ ≤ 5e-9

H₂/6-31g

CASPT2 empty-active (MP2 limit)

PySCF mp.MP2, Δ ≤ 1e-9

H₂/6-31g

CASPT2 full-space

0 (exact)

H₂/6-31g CAS(2,2) (virtual)

IC-CASPT2

OpenMolcas &CASPT2, 0.0 µHa

H₂O/STO-3G CAS(4,4) (core hole)

IC-CASPT2

OpenMolcas, 0.03 µHa

H₂O/STO-3G CAS(4,4) frozen O 1s

IC-CASPT2 + frozen core

OpenMolcas Frozen=1, 0.01 µHa

H₂O/6-31g CAS(4,4) (inactive+virtual)

IC-CASPT2

OpenMolcas, 0.02 µHa

N₂/STO-3G CAS(6,6) (strong correlation)

IC-CASPT2

OpenMolcas, 1.8 µHa

H₂O/cc-pVDZ CAS(4,4) (direct large path)

IC-CASPT2

OpenMolcas, <0.01 µHa

N₂/STO-3G CAS(6,6)

imaginary shift σ=0.1

OpenMolcas Imaginary=0.1, 0.01 µHa (shift effect)

Tests: tests/test_solvers_mrpt_parity.py (OpenMolcas reference values recorded as constants; regenerate with examples/regression/core/runner_openmolcas.py).

CASPT2 — internally-contracted (un-gated) + strongly-contracted (gated)

method="caspt2" (variant="ic", default, ipea=0) is the internally- contracted CASPT2 described above, validated against OpenMolcas &CASPT2 (fixed CASCI(HF) reference, IPEAshift=0.0, Group=C1) to ≤2 µHa across the virtual-, core-hole-, frozen-core-, combined- and strongly-correlated- active-space regimes in the table. It is un-gated. OpenMolcas is the only local CASPT2 oracle: PySCF ships no CASPT2 and ORCA’s MR-PT2 is NEVPT2, not CASPT2 (CASPT2 is the Roos/Andersson method native to OpenMolcas/Molpro), so ORCA cross-checks NEVPT2/CASSCF/CASCI only.

Two things stay gated behind VIBEQC_EXPERIMENTAL_MULTIREF=1:

  • The strongly-contracted variant (variant="sc") — one perturber per external pattern; MP2/full-space limits exact and it cross-checks NEVPT2, but its active-space terms have no external reference (the IC variant supersedes it). Retained for comparison.

  • The IPEA shift (ipea≠0, Ghigo 2004) — implemented per the published per-orbital formula (Eqs 10–13, correct sign: the shift always raises the denominator). It is not sub-µHa reproducible against OpenMolcas (~0.1 mHa at ε=0.25) because the IPEA level shift is diagonal in OpenMolcas’s class-wise internal contraction, whereas vibe-qc uses an explicit-determinant contraction; the two contractions give identical energies at ε=0 (the method is contraction-independent there) but a different diagonal-shift operator at ε≠0. Since the IPEA shift is itself a fitted semi-empirical correction, this ~0.1 mHa implementation difference is within its inherent uncertainty, but per CLAUDE.md §7 it is gated rather than presented as OpenMolcas-matching.

Files

File

Role

python/vibeqc/solvers/_rdm.py

spin-summed 1/2/3/4-RDM builders + SA-RDM (make_rdm12_sa)

python/vibeqc/solvers/_casci.py

CASCI on the FCI engine + frozen-core dressing + multi-root (nroots)

python/vibeqc/solvers/_casscf.py

CASSCF: (SA) orbital optimization on (multi-root) CASCI

python/vibeqc/solvers/_mrpt.py

spin-orbital engine + SC/IC-CASPT2 + NEVPT2

python/vibeqc/solvers/_common.py

require_experimental_multiref (SC + IPEA gate)

python/vibeqc/runner.py

method="casci"/"casscf"/"nevpt2"/"caspt2" dispatch

python/vibeqc/memory.py

casci/casscf/nevpt2/caspt2 memory estimators

python/vibeqc/output/citations/database.toml

Roos 1980 / Angeli 2001 / Andersson 1990+1992 / Celani-Werner 2000 / Ghigo 2004 / Forsberg 1997 + routes

examples/regression/core/runner_openmolcas.py

OpenMolcas IC-CASPT2 subprocess oracle (regenerates the recorded refs)

tests/test_solvers_mrpt_parity.py

parity (PySCF + OpenMolcas) + limit + gate tests

tests/test_solvers_sacasscf.py

multi-root CASCI + SA-CASSCF parity vs PySCF

Performance / scope

The spin-orbital engine scales steeply with the orbital count. CASPT2 now keeps both implementations: the dense explicit IC engine for small jobs and the direct active-CI signature-block engine for larger unshifted IC jobs. The direct path is validated through H2O/cc-pVDZ CAS(4,4) against OpenMolcas, but it still scales as FCI in the active space and is not a substitute for much larger active-space algorithms. CASSCF rebuilds the orbital Hessian by finite differences of the analytic gradient each macro-iteration (O(n_pairs) CASCI solves per iteration), which is fine for the small active spaces this engine targets.

Next steps

Internally-contracted CASPT2 is DONE and un-gated (updated 2026-06-06). The explicit-small and direct-large IC paths reproduce OpenMolcas &CASPT2 to ≤2 µHa across the tested regimes, including H2O/cc-pVDZ CAS(4,4). The frozen core and imaginary level shift remain on the explicit path; the direct path is used for large unshifted IC jobs. method="caspt2" is the default IC variant. Remaining open items:

  1. Exact IPEA matching — investigated 2026-06-03, blocked clean-room. vibe-qc’s ipea≠0 follows Ghigo 2004 but is not sub-µHa reproducible against OpenMolcas (~0.1 mHa at ε=0.25). A focused investigation (read Ghigo 2004 + Celani-Werner 2000 in full; references/ ipea_minimal_basis_scratch.py) established the root cause: the IPEA shift on the contracted doubly-external functions must contract the per-orbital σ shift (Ghigo Eqs 10–11) with the active density, and the exact contraction weights are a MOLCAS implementation detail not specified in the published equations — Ghigo himself (p. 3) notes the “into/out of an active orbital” classification “is not obvious” and uses an unstated “simplified approach”. Five reasonable interpretations were tested against the OpenMolcas oracle on H₂/6-31g CAS(2,2) (OM ε=0.25 = -1.14616891): over-complete diagonal (-123 µHa), minimal-basis positional Σσ (+247 µHa), Fock-diagonal shift on H0+E0 (wrong sign), Fock shift on H0 only (wrong sign), and hole-weighted (D_m ⟨Ê_mm⟩) (wrong sign — the contracted function’s occupation of a low-D active orbital can exceed D_m, so “holes” go negative). They bracket the OM value but none match. The clean-room constraint (CLAUDE.md §10: implement from papers only, never OpenMolcas src/ — LGPL→MPL incompatibility) prevents reading the actual prescription. Conclusion: bit-exact IPEA is not reliably achievable from the published equations; ipea≠0 stays gated as a documented approximation. The RDM-contraction route (item 2) would face the same shift-contraction ambiguity for ipea≠0, but resolves cleanly for ipea=0 (no MOLCAS-specific bookkeeping there).

  2. Larger active spaces. The dense explicit-determinant contraction is the small-CAS regime (H2O/6-31g CAS(4,4) ≈ 2 min). The RDM-contraction route (building H0/S from RDMs through 4th order, metric-orthonormalized linear solve per class — Celani-Werner 2000) scales to large CAS. This is a large (multi-week-scale) reimplementation of the production working equations, but for ipea=0 every intermediate (S, H0, V per class) is validatable against the explicit-determinant engine for small CAS (exact ground truth, no MOLCAS-specific ambiguity), then run on large CAS. It does not resolve the ipea≠0 shift-contraction ambiguity of item 1.

  3. Real level shift (Roos-Andersson 1995) as a cheap alternative to the imaginary shift; drops in as a diagonal denominator term like the imaginary one.

The strongly-contracted perturber engine in _mrpt.py is NOT reusable for the RDM-contraction IC path (SC materializes one perturber per external pattern; IC solves a metric linear system). The RDMs (through 4th order), the generalized Fock, and the semicanonicalization ARE reusable.

Also open: an analytic CASSCF orbital Hessian (currently central-FD; forward differences would be 2x faster). excited states / open-shell.