SCF convergence

All SCF drivers in vibe-qc share the same convergence apparatus:

  • Initial guess — Hcore diagonalisation or SAD (superposition of atomic densities). SAD is the default; it typically halves the iteration count.

  • Density damping — linear mix of old and new density before building the next Fock matrix. Default damping = 0.5.

  • DIIS — Pulay extrapolation on the Fock matrix with an 8-iteration history. Default use_diis=True.

  • Convergence criteria — both an energy tolerance and an orbital gradient tolerance must be met.
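The damping step and the dual convergence test above can be sketched in plain Python. This is an illustration only; mix_density and is_converged are hypothetical helpers, not part of the vibeqc API:

```python
# Illustrative sketch of the damping and convergence logic described
# above. These helpers are NOT part of the vibeqc API.

def mix_density(d_old, d_new, damping=0.5):
    """Linear mix: damping * old + (1 - damping) * new.

    damping=0.5 is the documented default; larger values keep more of
    the old density and stabilise oscillatory SCF runs.
    """
    return [[damping * o + (1.0 - damping) * n
             for o, n in zip(row_o, row_n)]
            for row_o, row_n in zip(d_old, d_new)]

def is_converged(delta_e, grad_norm,
                 conv_tol_energy=1e-8, conv_tol_grad=1e-6):
    """Both the energy change and the orbital gradient must be small."""
    return abs(delta_e) < conv_tol_energy and grad_norm < conv_tol_grad
```

Note that is_converged requires both criteria: a flat energy alone does not guarantee a converged density.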

import vibeqc as vq
from vibeqc import RHFOptions

opts = RHFOptions()
opts.max_iter = 100
opts.conv_tol_energy = 1e-8
opts.conv_tol_grad = 1e-6
opts.damping = 0.5
opts.use_diis = True
opts.diis_start_iter = 2
opts.diis_subspace_size = 8
opts.initial_guess = vq.InitialGuess.SAD   # or HCORE

All of these options also exist on UHFOptions, RKSOptions, UKSOptions, PeriodicRHFOptions, PeriodicSCFOptions, and PeriodicKSOptions.

Diagnosing non-convergence

Every result struct carries a full scf_trace:

for it in result.scf_trace:
    print(f"iter {it.iter:3d}  E = {it.energy:+.10f}  "
          f"dE = {it.delta_e:+.2e}  |grad| = {it.grad_norm:.2e}  "
          f"diis dim = {it.diis_subspace}")

Pretty-format utilities:

from vibeqc import format_scf_trace, log_scf_trace
print(format_scf_trace(result))
log_scf_trace(result)         # emits via Python logging

If the energy oscillates, increase damping (0.7 is a good starting point for hard cases) or delay DIIS by raising diis_start_iter. If the energy diverges, check your initial guess and geometry: a bad structure will never converge.
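A quick oscillation check can be run directly on the trace. The helper below is a hypothetical sketch (not part of vibeqc); it operates on a plain list of energy changes such as [it.delta_e for it in result.scf_trace]:

```python
# Illustrative helper (not part of the vibeqc API): flag an oscillating
# SCF run from the per-iteration energy changes.

def is_oscillating(delta_es, window=6):
    """Heuristic: the energy change flips sign on at least half of the
    last `window` iterations instead of decaying monotonically."""
    tail = delta_es[-window:]
    flips = sum(1 for a, b in zip(tail, tail[1:]) if a * b < 0)
    return flips >= len(tail) // 2
```

If it fires, retry with a larger opts.damping (e.g. 0.7) and a later opts.diis_start_iter.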

Parallelism

vibe-qc’s compute-heavy kernels — molecular and periodic Fock builds, analytic gradients, lattice-summed one-electron integrals, the Ewald lattice sums, and the AO evaluation used by DFT — are parallelised with OpenMP via a one-engine-per-thread pool.

Three ways to set the thread count, in order of increasing precedence:

  1. OMP_NUM_THREADS environment variable, before launching Python:

    OMP_NUM_THREADS=8 python my-calc.py
    
  2. vibeqc.set_num_threads(n) from Python, pinning the count for the rest of the process:

    import vibeqc
    vibeqc.set_num_threads(8)
    print(vibeqc.get_num_threads())   # 8
    

    n <= 0 restores the default (reads OMP_NUM_THREADS or falls back to the hardware logical-core count).

  3. num_threads= keyword argument on vibeqc.run_job:

    run_job(mol, basis="6-31g*", method="rhf", output="h2o",
            num_threads=4)
    

    The thread count actually used is recorded in the output file as Threads: 4 (OpenMP shared-memory parallelism).
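The precedence rules above amount to the following resolution order. This is an illustrative model, not the actual vibeqc implementation:

```python
import os

# Illustrative model of the documented precedence (not vibeqc source):
# run_job(num_threads=...) > set_num_threads(...) > OMP_NUM_THREADS
# > hardware logical-core count.

def resolve_num_threads(run_job_kwarg=None, pinned=None, environ=None):
    environ = os.environ if environ is None else environ
    if run_job_kwarg is not None:
        return run_job_kwarg          # per-job keyword wins
    if pinned is not None and pinned > 0:
        return pinned                 # set_num_threads(n) with n > 0
    if "OMP_NUM_THREADS" in environ:
        return int(environ["OMP_NUM_THREADS"])
    return os.cpu_count()             # hardware fallback
```

This also mirrors the documented n <= 0 behaviour of set_num_threads: a non-positive pin falls through to OMP_NUM_THREADS, then to the logical-core count.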

Every .out file includes a timing block at the end:

Timings (wall clock, seconds)
----------------------------------------------------
SCF total                              0.326
SCF avg. per iteration                 0.036  (9 iters)
Job total                              0.328
Used 4 OpenMP threads.

For systematic benchmarking, scripts/bench.py in the repository runs a small fixed suite across a sweep of thread counts and prints a speedup table.
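The speedup table boils down to speedup_n = t_1 / t_n and parallel efficiency speedup_n / n. A minimal sketch with hypothetical timings (not actual bench.py output):

```python
# Illustrative speedup-table computation over hypothetical wall times.
# speedup_n = t_1 / t_n; efficiency = speedup_n / n.

def speedup_table(wall_times):
    """wall_times: {num_threads: seconds}. Returns rows of
    (threads, seconds, speedup, parallel efficiency)."""
    t1 = wall_times[1]
    return [(n, t, t1 / t, t1 / (t * n))
            for n, t in sorted(wall_times.items())]

for n, t, s, e in speedup_table({1: 8.0, 2: 4.2, 4: 2.4}):
    print(f"{n:2d} threads  {t:6.2f} s  {s:4.2f}x  {e:4.0%}")
```

Efficiency well below 100% at higher thread counts is the start-up overhead discussed below.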

Good scaling requires enough work per thread: tiny test systems (diatomics in minimal bases) show little speedup because OpenMP start-up overhead dominates. Bigger molecules and periodic calculations with many lattice cells benefit much more (up to near-linear scaling in the Fock build).