Density fitting (RIJ / RIJK / RIJCOSX)

vibe-qc’s molecular SCF Fock build is dispatched through a polymorphic JKBuilder interface (introduced at v0.8.0). Three concrete kernels cover the production paths:

Kernel

When to use

Memory

Wall-clock

FourIndexJKBuilder

small systems (≤ ~250 BFs); reference / debugging

O(N⁴)

O(N⁴) per iter

DFJKBuilder (RIJK)

medium systems (~250–1000 BFs); hybrid DFT

O(N²·M)

O(N³) per iter

COSXJKBuilder (RIJCOSX)

large hybrid-DFT systems (~500–3000 BFs)

O(N²·M)

O(N²·M) per iter

(N = orbital basis size; M = aux basis or grid points.)

The JKBuilder dispatch is automatic — set the right RHFOptions flags and vibe-qc picks the kernel. The interface is also the insertion point for the periodic-Γ JKBuilders (see multi-k SCF) so the same infrastructure drives molecular and periodic-Γ Fock builds.

Quick start

Direct (four-index, the default)

from vibeqc import Atom, Molecule, run_rhf, RHFOptions

mol = Molecule([Atom(8, [0, 0, 0]),
                Atom(1, [0,  1.43, -0.98]),
                Atom(1, [0, -1.43, -0.98])])

result = run_rhf(mol, basis="def2-svp", options=RHFOptions())
# JKBuilder = FourIndexJKBuilder (default)

Density fitting (RIJK)

opts = RHFOptions(
    density_fit=True,
    aux_basis="def2-svp-jk",     # any libint-recognised JK-fit aux
)
result = run_rhf(mol, basis="def2-svp", options=opts)
# JKBuilder = DFJKBuilder

RIJCOSX (RI-J for Coulomb + COSX for exchange)

opts = RHFOptions(
    density_fit=True,
    aux_basis="def2-svp-jk",
    cosx=True,                    # turns DF path into RIJCOSX
    # cosx_grid defaults to default_cosx_grid_options()
)
result = run_rhf(mol, basis="def2-svp", options=opts)
# JKBuilder = COSXJKBuilder

For hybrid DFT replace run_rhf with run_rks:

from vibeqc import run_rks
result = run_rks(mol, basis="def2-tzvp", functional="b3lyp",
                 options=RHFOptions(density_fit=True,
                                    aux_basis="def2-tzvp-jk",
                                    cosx=True))

The same flags work for run_uhf / run_uks (open-shell).

Auto-picking an aux basis

If you don’t want to remember the matching aux for each orbital basis, use the helper:

from vibeqc import default_aux_basis_for

orbital = "def2-tzvp"
aux = default_aux_basis_for(orbital, kind="jk")
# → "def2-tzvp-jk"

Recommended aux bases per orbital basis (the helper’s table):

Orbital basis

Recommended aux_basis (kind=”jk”)

Recommended aux_basis (kind=”ri”)

def2-svp

def2-svp-jk

def2-svp-rifit

def2-svpd

def2-svp-jk

def2-svpd-rifit

def2-tzvp

def2-tzvp-jk

def2-tzvp-rifit

def2-tzvpp

def2-tzvpp-jk

def2-tzvpp-rifit

def2-qzvp

def2-qzvp-jk

def2-qzvp-rifit

cc-pvdz

cc-pvdz-jkfit

cc-pvdz-rifit

cc-pvtz

cc-pvtz-jkfit

cc-pvtz-rifit

cc-pvqz

cc-pvqz-jkfit

cc-pvqz-rifit

aug-cc-pvdz

aug-cc-pvdz-jkfit

aug-cc-pvdz-rifit

pob-tzvp

(no matching aux yet)

(no matching aux yet)

kind="jk" aux basis = bigger, designed for the Coulomb + exchange J/K Fock build. kind="ri" aux basis = smaller, designed for the Coulomb-only (correlation-side) RI used in MP2. Use the right one — JK aux on MP2 is wasteful, RI aux on SCF is inaccurate.

pob-tzvp does not have a matching aux yet. Designing a pob-tzvp-jk fitting basis is a separate-paper item per docs/roadmap.md. Workaround until then: fall back to def2-tzvp-jk (aux basis size scales with orbital basis size, not orbital identity, so the def2 aux works as a generic), or use density_fit=False for periodic work.

RIJCOSX: chain-of-spheres exchange

RIJCOSX (Neese, Chem. Phys. 356, 98, 2009; Bykov, Petrenko, Izsák, Kossmann, Becker, Valeev, Neese, Mol. Phys. 113, 1961, 2015) replaces the O(N⁴) ERI assembly for the exchange K matrix with an O(N²·M) seminumerical chain-of- spheres algorithm:

K_{μν} ≈ Σ_g w_g χ_μ(r_g) Σ_ρ A_{νρ}(r_g) Σ_λ D_{λρ} χ_λ(r_g)

where A_{νρ}(r_g) = ∫dr' χ_ν(r') χ_ρ(r') / |r' r_g| is the analytic 1/|r-r’| integral with a unit charge at grid point r_g. RI-J is used for the Coulomb piece; the same B-tensor that the DFJKBuilder precomputes is reused.

Default COSX grid

n_radial =  35   (Treutler-Ahlrichs M4)
n_theta  =   9   (Gauss-Legendre)
n_phi    =  18   (uniform φ)

= ~5670 angular pts/atom before Becke pruning, which is ~1/8 the XC integration grid (75 × 17 × 36 = ~46 000 pts/atom). Substantially sparser is fine because COSX needs only the smooth 1/|r−r’| kernel through χ_μ(r) — no XC gradient response.

This default tier matches ORCA’s default GridX. Override via RHFOptions.cosx_grid:

from vibeqc import GridOptions
opts = RHFOptions(
    density_fit=True, aux_basis="def2-tzvp-jk",
    cosx=True,
    cosx_grid=GridOptions(n_radial=50, n_theta=11, n_phi=23),
                                          # finer than default
)

Validation: glycine / def2-TZVP

Method

Energy (Ha)

Δ vs ORCA 6.1.1 RIJCOSX

RHF, 4-index

−282.823485

0.00 mHa (reference)

RHF, RIJK (def2-tzvp-jk)

−282.823484

0.001 mHa

RHF, RIJCOSX

−282.823611

0.13 mHa

Analytic gradient max|Δ| (RIJCOSX)

0.13 mHa/bohr

Hits the standard “sub-mHa accuracy” target Neese 2009 cited. For larger systems (e.g. cellulose hexamer / def2-TZVP), the COSX path becomes ~10× faster than 4-index for hybrid DFT SCF + analytic gradient combined.

One-centre exact-exchange correction (v0.8.0)

The COSX grid under-samples the 1/r₁₂ Coulomb singularity for shell quartets where all four basis-function centres lie on the same atom. At v0.8.0, vibe-qc applies a one-centre correction that replaces the COSX-approximate intra-atom K-matrix blocks with exact four-index ERI contractions:

K_corrected[μ∈A, ν∈A] = K_exact_1c[μ,ν]
                       + (K_cosx_full − K_cosx_1c)[μ∈A, ν∈A]

The exact one-centre ERIs are precomputed at JKBuilder construction (cost: one libint quartet evaluation per intra-atom shell quartet — negligible). The COSX-approximate one-centre contribution K_cosx_1c is computed from the same GridBatches chi cache and the in-tree nuclear-attraction kernel used by the main COSX-K path, giving bit-identical AO values.

The correction is applied once post-SCF-convergence in all four molecular SCF drivers, so it does not perturb DIIS or SOSCF acceleration during iterations. It is always active when cosx=True; there is no user-facing toggle.

For light atoms (H, C, N, O), the correction is sub-μHa because the COSX grid already samples intra-atom shell pairs adequately. For heavier atoms (3d / 4d transition metals) with contracted core shells, the improvement can reach ~10⁻⁴ Ha per atom in the total energy.

What improved between v0.7.x and v0.8.0

RIJCOSX got ~3.65× faster on the n-hexadecane / B3LYP / def2-svp / 5-iter anchor between the v0.7.x baseline and the v0.8.0 tip (matched-load median, M5Max, 10 perf cores: ~603 s → ~165 s wall). Energy bit-identical end-to-end (-416.178230 Ha at iter 5). Nothing in the user-facing API changes — all the improvements are automatic when cosx=True is set. The contributing landings, in chronological order:

  • Drop dense per-grid-point A_g — build F_g block-sparse on the fly instead of materialising the full n_bf × n_bf analytical-integral matrix per point.

  • OMP schedule(guided) for the per-grid-point loop — absorbs the per-point cost variance (chi-magnitude / shell- pair-survival distribution swings ~100× between nuclear and outer-grid regions). ~15-17 % wall.

  • Per-shell radial cutoff (Schwarz × distance × dχ) on the COSX-K pair loop — drops pair-loop iteration count by 5-50× on extended systems. ~11 % wall on n-hex.

  • Reusable per-batch grid + chi-cache infrastructure (GridBatch / GridBatches) — basis-only AO cache evaluated once at SCF setup instead of per K-build.

  • Per-batch L1 (primary-shell) hierarchyCOSXJKBuilder consumes the chi cache on every build_K(D) call. L1-only first landing.

  • Per-batch L2 (density-coupled) shell screen — drops pairs where both shells fall outside the batch’s density- coupled set (Burow & Sierka 2011 § 2; Stratmann-Scuseria- Frisch 1996 § 11). The headline structural win at ~20 % wall over the pre-batch baseline.

The full per-commit measurement table + references lives in CHANGELOG.md under [Unreleased] for v0.8.0. Further perf work (custom Boys-table integral kernel replacing libint) is underway and tracked at HANDOVER_COSX_KERNEL.md.

⚠️ Known DF gradient bug (v0.8.x maintenance)

density_fit=True analytic gradients disagree with the direct gradient by ~115 mHa/bohr on glycine / def2-TZVP / RHF.

Energies still match to ~60 µHa — the bug is isolated to gradient assembly, not the DF SCF. Likely a d/f-shell derivative kernel issue. Triaged for the v0.8.x maintenance window.

Workaround: use density_fit=False for production gradient runs above ~50 BFs / def2-TZVP-class basis sets. RIJK gradient on def2-SVP and small molecules is fine. The RIJCOSX path (cosx=True) goes through the COSX gradient machinery and is not affected by this bug — RIJCOSX gradient is production-ready.

There’s also an upstream-related f-shell direct-gradient bug (see docs/index.md open-bugs admonition) — wrong on multi-heavy-atom systems with f-shell + ≥2 different second-row elements. The DF path may be a workaround for that bug if you stay below the 50-BF threshold above; otherwise fall back to FD on the total energy.

When to use which kernel

Decision tree:

  • System ≤ 250 BFs: FourIndexJKBuilder (the default). Simpler, no aux-basis dependency, exactly as accurate as the 4-index ERI. No DF or COSX overhead worth introducing.

  • System ~250–1000 BFs, hybrid DFT: DFJKBuilder (RIJK). RIJCOSX is overkill at this size.

  • System > 1000 BFs, hybrid DFT: COSXJKBuilder (RIJCOSX). The exchange-K piece is where the wall-clock savings come from.

  • System > 1000 BFs, pure GGA DFT (no exact exchange): DFJKBuilder with cosx=False. Skipping COSX is fine because there’s no exchange to assemble.

  • Need analytic gradient AND > 50 BFs def2-TZVP-class: RIJCOSX (works) OR direct (works) — NOT RIJK (broken per the bug above).

API surface

# Python-side flags on RHFOptions / UHFOptions / RKSOptions / UKSOptions:
RHFOptions(
    density_fit=True,            # → DFJKBuilder
    aux_basis="def2-tzvp-jk",
    cosx=True,                   # → COSXJKBuilder (paired with density_fit)
    cosx_grid=default_cosx_grid_options(),
)
// C++-side factory (cpp/include/vibeqc/jk_builder.hpp):
std::unique_ptr<JKBuilder> make_four_index_jk_builder(const BasisSet& basis);
std::unique_ptr<JKBuilder> make_df_jk_builder(const BasisSet& basis,
                                              const BasisSet& aux);
std::unique_ptr<JKBuilder> make_cosx_jk_builder(const BasisSet& basis,
                                                const BasisSet& aux,
                                                Grid cosx_grid);

The strategy interface:

class JKBuilder {
public:
    virtual Eigen::MatrixXd build_J(const Eigen::MatrixXd& D) const = 0;
    virtual Eigen::MatrixXd build_K(const Eigen::MatrixXd& D) const = 0;
    virtual Eigen::MatrixXd build_g_rhf(const Eigen::MatrixXd& D,
                                        double alpha_hf = 1.0) const;
};

build_g_rhf defaults to J ½·α·K; concrete kernels override when the J/K can fuse work (DF amortises the B-tensor contraction).

Citations

Cite for any published RIJ / RIJK / RIJCOSX work:

  • RI / DF in Hartree-Fock: Whitten, J. Chem. Phys. 58, 4496 (1973); Dunlap, J. Chem. Phys. 71, 3396 (1979); Eichkorn, Treutler, Öhm, Häser, Ahlrichs, Chem. Phys. Lett. 240, 283 (1995).

  • JK auxiliary bases (def2 family): Weigend, Phys. Chem. Chem. Phys. 8, 1057 (2006).

  • RI auxiliary bases (def2 family): Hellweg, Hättig, Höfener, Klopper, Theor. Chem. Acc. 117, 587 (2007).

  • COSX: Neese, Wennmohs, Hansen, Becker, Chem. Phys. 356, 98 (2009).

  • COSX-K analytic gradient: Bykov, Petrenko, Izsák, Kossmann, Becker, Valeev, Neese, Mol. Phys. 113, 1961 (2015).

Plus your basis-set + functional citations per the rest of docs/license.md.

See also

  • functionals.md — XC functional library (B3LYP-VWN5 sidebar, PW1PW, hybrid functional aliases).

  • multi_k_scf.md — periodic-Γ JKBuilder

    • multi-k SCF.

  • scf_convergence.md — DIIS / EDIIS+DIIS / level-shift / quadratic-fallback that the JKBuilder dispatch lands inside.