Basis-set convergence

Total energies and most properties improve as you move from small basis sets (minimal, split-valence) toward large ones (triple-zeta with diffuse and polarisation). This tutorial shows how to run a family of calculations, tabulate the trend, and decide when you’ve hit the point of diminishing returns.

A Dunning-family sweep on water

The correlation-consistent series cc-pVXZ (X = D, T, Q) and its augmented variants systematically approach the HF and correlation complete-basis-set (CBS) limit. Run HF on water at each:

from vibeqc import Atom, BasisSet, Molecule, run_rhf

mol = Molecule([
    Atom(8, [ 0.0,  0.00,  0.00]),
    Atom(1, [ 0.0,  1.43, -0.98]),
    Atom(1, [ 0.0, -1.43, -0.98]),
])

print(f"{'basis':12s}  {'nbasis':>6s}  {'E (Ha)':>16s}  {'ΔE (mHa)':>10s}  {'iters':>5s}")
print("-" * 60)

previous = None
for name in ("sto-3g", "cc-pvdz", "cc-pvtz", "cc-pvqz"):
    basis = BasisSet(mol, name)
    r = run_rhf(mol, basis)
    delta = ""
    if previous is not None:
        delta = f"{(r.energy - previous) * 1000:+10.3f}"
    print(f"{name:12s}  {basis.nbasis():6d}  {r.energy:16.8f}  {delta:>10s}  "
          f"{r.n_iter:5d}")
    previous = r.energy

Output:

basis         nbasis           E (Ha)   ΔE (mHa)  iters
------------------------------------------------------------
sto-3g             7       -74.94956615                  8
cc-pvdz           24       -76.02431388   -1074.748     10
cc-pvtz           58       -76.05605100     -31.737     12
cc-pvqz          115       -76.06394333      -7.892     15

Reading across:

  • ΔE shrinks by roughly 1/3 with each level up — classic asymptotic behaviour of the correlation-consistent series. Extrapo- lating to CBS by fitting an exponential in the cardinal number X gives around -76.0676 Ha at HF limit.

  • cc-pVTZ captures 99.96% of the cc-pVQZ improvement for 1/2 the basis functions — a reasonable default for production HF.

  • Timing scales as O(N^4) in the basis-function count for HF, so doubling the basis multiplies the cost by ~16.

A polarised Pople sweep

A quick “what does polarisation do?” experiment:

for name in ("6-31g", "6-31g*", "6-31g**", "6-311g**"):
    basis = BasisSet(mol, name)
    r = run_rhf(mol, basis)
    print(f"{name:14s}  nbasis={basis.nbasis():3d}  E = {r.energy:.6f}")
6-31g         nbasis= 13  E = -75.982973
6-31g*        nbasis= 18  E = -76.006678
6-31g**       nbasis= 24  E = -76.021164
6-311g**      nbasis= 30  E = -76.045115
  • Adding polarisation on heavy atoms (*) gains ~24 mHa.

  • Adding polarisation on hydrogens as well (**) another ~14 mHa.

  • Going from 6-31 → 6-311 split-valence ~24 mHa.

Diffuse-augmented bases (6-311++g**, aug-cc-pvtz) are essential for anions, weak-interaction energies, and excited-state calculations. vibe-qc’s basis library ships them, but their small exponents can trigger linear-dependency issues in certain tight geometries; tighten conv_tol_grad and bump max_iter when you need them.

Properties that converge differently

Total energies are the slowest-converging quantity. Relative energies (reaction energies, barriers) and properties (dipoles, bond orders, frequencies) converge faster because error cancellation helps — but not always monotonically.

from vibeqc import mulliken_charges, dipole_moment

for name in ("cc-pvdz", "cc-pvtz", "cc-pvqz"):
    basis = BasisSet(mol, name)
    r = run_rhf(mol, basis)
    q = mulliken_charges(r, basis, mol)
    mu = dipole_moment(r, basis, mol)
    print(f"{name:14s}  E = {r.energy:.5f}   "
          f"q_O(Mull)= {q[0]:+.3f}   μ(D)= {mu.total_debye:.4f}")

Output:

cc-pvdz         E = -76.02431   q_O(Mull)= -0.269   μ(D)= 1.9131
cc-pvtz         E = -76.05605   q_O(Mull)= -0.471   μ(D)= 1.8847
cc-pvqz         E = -76.06394   q_O(Mull)= -0.497   μ(D)= 1.8648

Observe:

  • Dipole converges cleanly — within 0.03 D by cc-pVTZ, within 0.02 D at cc-pVQZ. The experimental gas-phase value is 1.855 D; cc-pVQZ HF is already on top of it, but this is partially a cancellation of errors (HF over-polarises, no correlation pulls it back) rather than true convergence.

  • Mulliken charges are a mess — they wander even within a consistent family. Don’t chase a particular value; use them only for within-set comparisons.

Tabulating the trend in a figure

import numpy as np
import matplotlib.pyplot as plt

basis_names = ["sto-3g", "cc-pvdz", "cc-pvtz", "cc-pvqz"]
nbasis, energies = [], []
for name in basis_names:
    basis = BasisSet(mol, name)
    r = run_rhf(mol, basis)
    nbasis.append(basis.nbasis())
    energies.append(r.energy)

fig, ax = plt.subplots()
ax.plot(nbasis, energies, "o-")
ax.set_xlabel("Number of basis functions")
ax.set_ylabel("E(RHF) / Ha")
ax.set_xscale("log")
for n, e, name in zip(nbasis, energies, basis_names):
    ax.annotate(name, (n, e), xytext=(4, 4), textcoords="offset points")
fig.tight_layout()
fig.savefig("h2o_basis_convergence.png", dpi=150)

For a three-point cc-pVXZ extrapolation to the HF CBS limit, fit E(X) = E_CBS + A * exp(-α X) with α 1.63 (Halkier et al. 1998).

Notes

  • Basis-set superposition error (BSSE) matters for interaction-energy calculations on weakly-bound complexes. No counterpoise correction is built in yet, but you can compute it manually by running the monomer calculations in the dimer basis and subtracting.

  • Basis linear dependencies. At very large bases (aug-cc-pVQZ and bigger, or diffuse functions in close-packed geometries) the overlap matrix becomes nearly singular. vibe-qc will still converge, but the MO coefficients become noisy. For periodic calculations this is why the pob-* solid-state basis family exists — small-exponent primitives trimmed deliberately.

Next

  • A DFT-functional-comparison tutorial using the same sweep pattern is next on the list (scan the XC library instead of basis sets).

  • The periodic HF tutorial uses pob-tzvp directly because standard basis sets are usually too diffuse for crystals.