The vq calculation queue¶
vq is vibe-qc’s calculation queue — a small SSH-backed
job-submission tool that lets you run vibe-qc (and CRYSTAL /
ORCA) calculations on a remote compute box without writing
shell glue. Configure it once, then vq submit my_calc.py
from your laptop and the job is queued, dispatched,
resource-capped, and watched on the remote host. Outputs
come back the same way.
vq is co-shipped with vibe-qc in the
vibe-queue/
subpackage but is independently versioned (at the time of
writing it’s vq, version 0.5.25). It’s engine-agnostic: vibe-qc is the
primary workload, but anything you can call from a shell —
CRYSTAL14, ORCA 6.1, PySCF scripts — submits the same way
through contrib/ wrappers.
When to use vq¶
Laptop runs out of cores or RAM. Your MacBook has 16 GB and 10 cores; the remote has 128 GB and 32 cores. Queue the big runs; keep the laptop for development.
You want a record of what you ran. Every submission is a JobSpec stored on the daemon, with a unique short-hash id, full command, environment, resource caps, terminal state, and outputs.
You’re running many jobs. vq dispatches one at a time (by default; see § Concurrency below) and records every one, so you don’t lose track when a sweep takes hours.
You want resource enforcement. cgroup-v2 caps mean a runaway job doesn’t bring down the box.
When NOT to use vq¶
Tiny molecules on the laptop. .venv/bin/python h2o.py runs in 3 s; the queue + ssh round-trip adds latency for zero gain.
Truly interactive sessions. vq is batch-shaped; use ssh and a remote venv directly, or set up the Jupyter Lab integration for notebooks.
HPC cluster job arrays. vq targets a single-node host. SLURM is the right tool for cluster scheduling; a SLURM backend for vq is on the v1.0 roadmap but doesn’t ship yet.
Architecture¶
┌────────────────────┐ SSH ┌──────────────────────────┐
│ Your laptop │ ─────────────────→ │ Remote compute host │
│ │ │ │
│ vq CLI │ │ vq-daemon.service │
│ ~/.config/vq/ │ vq submit │ (systemd --user) │
│ config.toml │ │ │
│ │ ←───── stdout ──── │ Queue (SQLite-backed) │
│ ssh-key auth │ │ ↓ │
│ │ │ systemd-run scope │
│ │ │ (cgroup-v2 caps) │
│ │ │ ↓ │
│ │ │ your Python / ORCA / │
│ │ │ CRYSTAL14 process │
│ │ │ │
│ │ vq-web.service │ Web UI (FastAPI+htmx) │
│ browser ──────────┼───── port 8765 ───→│ :8765/queue, │
│ │ bearer token │ /jobs/<id> │
└────────────────────┘ └──────────────────────────┘
Pieces that need to be running:
vq-daemon.service on the remote: accepts submissions (over SSH), maintains the queue, dispatches jobs into cgroup scopes, survives reboots via loginctl enable-linger.
vq-web.service on the remote: read-only-plus-write REST + HTML UI, port 8765 by default, bearer-token-protected.
vq CLI on the laptop: wraps ssh remote vq … so the laptop never deals with the queue state directly.
Installation¶
Two sides: local (laptop) and remote (compute box). Both install
with pip from the same vibe-queue checkout; the remote adds the [web] extra.
Local (laptop)¶
# Inside your vibe-qc checkout
cd vibe-queue
python3 -m venv .venv
.venv/bin/pip install -e .
# Put vq on PATH:
ln -s ~/path/to/vibe-queue/.venv/bin/vq ~/.local/bin/vq
# or in zshrc:
# alias vq="$HOME/path/to/vibe-queue/.venv/bin/vq"
The local install needs only the CLI dependencies (no FastAPI / systemd). Test:
vq --version # vq, version 0.5.25
Remote (compute box)¶
# 1. Install vq from a vibeqc-queue clone:
git clone https://gitlab.peintinger.com/mpei/vibeqc.git ~/vibeqc-queue
cd ~/vibeqc-queue/vibe-queue
python3 -m venv .venv
.venv/bin/pip install -e '.[web]' # [web] pulls FastAPI + uvicorn
# 2. Install the systemd-user units:
mkdir -p ~/.config/systemd/user
cp contrib/vq-daemon.service ~/.config/systemd/user/
cp contrib/vq-web.service ~/.config/systemd/user/
# Edit ExecStart in both unit files to use the venv's
# absolute vq path (e.g. /home/user/vibeqc-queue/vibe-queue/.venv/bin/vq).
# 3. Enable the daemon to start at boot (linger keeps the
# user instance alive without an active session):
sudo loginctl enable-linger $USER
systemctl --user daemon-reload
systemctl --user enable --now vq-daemon.service
systemctl --user enable --now vq-web.service
# 4. Verify:
systemctl --user status vq-daemon vq-web
journalctl --user -u vq-daemon -f # live log tail
The bearer token for the web UI is generated on first
daemon-start and written to ~/.config/vq/web-token with mode
0600 on the remote. Print it once and store it locally; you’ll
need it to access the web UI from a browser. To regenerate it,
delete the file and restart the daemon.
Configuration¶
vq reads ~/.config/vq/config.toml on the laptop. The
remote daemon doesn’t need a config file. Copy the template
from the repository and edit:
mkdir -p ~/.config/vq # the directory may not exist yet
cp vibe-queue/docs/config.toml.example ~/.config/vq/config.toml
A working minimal config:
# ~/.config/vq/config.toml on your laptop
# Default host when you omit it from `vq <subcommand> ...`.
# Match a [hosts.<name>] block below.
default_host = "compute"
[hosts.compute]
ssh = "compute"
# 'compute' must be an SSH alias defined in ~/.ssh/config,
# or a literal user@host.example.com. Test with:
# ssh compute hostname
# Absolute path to vq on the remote. The remote shell's
# default PATH usually doesn't include the venv vq lives in.
remote_vq = "/home/USER/vibeqc-queue/vibe-queue/.venv/bin/vq"
# Default Python interpreter for single-file submits. Point
# at a venv where vibe-qc is installed.
remote_python = "/home/USER/vibeqc-dev/.venv/bin/python"
# Optional: multi-venv routing for --branch (v0.5.6+).
# Lets `vq submit foo.py --branch release` pick the right
# vibe-qc clone without hard-coding the path.
[hosts.compute.branches]
main = "/home/USER/vibeqc-dev/.venv/bin/python"
release = "/home/USER/vibeqc-release/.venv/bin/python"
[hosts.compute.branch_aliases]
dev = "main"
development = "main"
latest = "release"
The full annotated example is at
vibe-queue/docs/config.toml.example.
Multi-host¶
Add another [hosts.<name>] block:
[hosts.compute2]
ssh = "compute2"
remote_vq = "/home/USER/vibeqc-queue/vibe-queue/.venv/bin/vq"
remote_python = "/home/USER/vibeqc-dev/.venv/bin/python"
Then vq submit foo.py --host compute2 routes to that
machine. Omit --host to use default_host.
Your first job¶
# A trivial vibe-qc water RHF script.
cat > water.py <<'EOF'
from vibeqc import Atom, Molecule, run_job
mol = Molecule([
Atom(8, [0.0, 0.00, 0.00]),
Atom(1, [0.0, 1.43, -0.98]),
Atom(1, [0.0, -1.43, -0.98]),
])
run_job(mol, basis="sto-3g", method="rhf", output="water")
EOF
# Submit it.
vq submit water.py
# → printed to stdout: jobid (e.g. "c0ff50a06462") + a watch hint.
# Poll the queue:
vq list
# Wait for it (Ctrl-C exits the watcher; the job keeps running):
vq watch c0ff50a06462
# Once it finishes, pull the outputs back:
vq fetch c0ff50a06462 ./outputs/
# → ./outputs/water.out / .molden / .traj / stdout.log / stderr.log
That’s the entire core workflow.
Submission forms¶
vq accepts three submission shapes:
Single file (most common)¶
vq submit my_script.py
# Equivalent to:
# ssh <host> 'cd <remote-workspace> && <remote_python> my_script.py'
The laptop copies my_script.py into a fresh per-job
workspace on the remote, runs it with the configured
remote_python (or the --branch-resolved one), captures
stdout / stderr, and tracks the result.
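The "equivalent to" comment above is shorthand. As a rough sketch only (this is not vq's actual implementation), the remote invocation could be assembled as below. The helper name build_ssh_argv and the workspace path are hypothetical; the point is that the whole remote command travels as one quoted string, so both cd and the interpreter run on the remote side.

```python
import shlex


def build_ssh_argv(host: str, workspace: str, remote_python: str, script: str) -> list[str]:
    """Return an argv list that runs `script` inside `workspace` on `host`.

    Hypothetical helper for illustration; vq's real dispatch path may differ.
    """
    # Quote every piece for the remote shell, and join with && so the
    # interpreter only starts if the cd succeeded.
    remote_cmd = (
        f"cd {shlex.quote(workspace)} && "
        f"{shlex.quote(remote_python)} {shlex.quote(script)}"
    )
    # An argv list (rather than one string) keeps the local shell out of it.
    return ["ssh", host, remote_cmd]


argv = build_ssh_argv(
    "compute",
    "/home/USER/vq-workspaces/c0ff50a06462",  # assumed workspace location
    "/home/USER/vibeqc-dev/.venv/bin/python",
    "my_script.py",
)
print(argv)
```

Passing the list to subprocess.run(argv) would execute it; paths containing spaces survive because shlex.quote wraps them in single quotes.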
Directory submit (sweeps + multi-file inputs)¶
vq submit -d ./my_sweep_dir -- python run.py --basis def2-svp
# -d <path> = the directory to copy across to the workspace
# -- = end of vq flags
# python run.py … = the literal command to run inside the workspace
Use this when:
Your script imports local modules (from helpers import ...).
You need multiple input files in the workspace (run.py reads geometry.xyz, basis_def.g94, etc.).
You want to encode the interpreter / engine in the command (e.g. running ORCA: -- orca input.inp).
Pre-packed tarball¶
vq submit -t my_inputs.tar.gz -- bash run.sh
# vq unpacks the tarball into the workspace before dispatching.
Use this form for reproducibility: the tarball, the command, and the JobSpec together form a complete reproducible-run unit.
Resource caps¶
Every job dispatched after v0.4.0 runs inside its own
systemd-run user scope so cgroup-v2 memory + CPU caps
apply. Wall-time enforcement is Python-watchdog-based
(vq.watchdog); cgroup RuntimeMaxSec was tried in
v0.4 → v0.5.7 and dropped in v0.5.8 as not runtime-mutable
via systemctl --user set-property (see
vibe-queue/docs/wall_time_design.md
for the postmortem). The watchdog subtracts
paused_seconds_total from elapsed, so wall-time is naturally
pause-aware.
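The pause-aware arithmetic can be sketched in a few lines. This is a minimal illustration, not the code in vq.watchdog: the function names are made up here, and only paused_seconds_total and the wall-time cap come from the description above.

```python
def effective_elapsed(now: float, started_at: float, paused_seconds_total: float) -> float:
    """Seconds the job has actually been running, excluding paused time."""
    return (now - started_at) - paused_seconds_total


def wall_time_exceeded(
    now: float,
    started_at: float,
    paused_seconds_total: float,
    wall_time_seconds: float,
) -> bool:
    """True once the unpaused run time is past the cap."""
    return effective_elapsed(now, started_at, paused_seconds_total) > wall_time_seconds


# A job with a 7200 s cap, started at t=0 and paused for 600 s in total.
still_ok = wall_time_exceeded(7500.0, 0.0, 600.0, 7200)   # 6900 s of run time
over_cap = wall_time_exceeded(7900.0, 0.0, 600.0, 7200)   # 7300 s of run time
print(still_ok, over_cap)
```

A job paused for ten minutes therefore gets ten more minutes of wall-clock time before the cap applies.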
vq submit my_calc.py \
--cpus 8 \
--mem-mb 16000 \
--wall-time-seconds 7200 # 2-hour cap (watchdog-enforced)
If the job exceeds any cap, the cgroup or the watchdog kills it cleanly and the queue records a labelled terminal state:
| Terminal state | Trigger | Owner | Recovery |
|---|---|---|---|
|  | exit code 0 | - | nothing; outputs ready to fetch |
|  | non-zero exit code | - | inspect stderr.log; re-submit |
|  | exceeded | cgroup | bump |
|  | exceeded | watchdog | bump the cap, or checkpoint if vibe-qc supports it for the workload |
|  | CPU-underutilisation watchdog (5 min < 10% CPU summed over the pgid descendants) | watchdog | check stderr.log; typically a hanging worker; v0.5.12+ samples the whole pgroup so the bash wrapper no longer false-positives |
|  | terminated by | daemon | intentional; resubmit if needed |
Always pass --wall-time-seconds N for non-trivial jobs —
that’s the only guard against a wedged SCF eating cores
indefinitely.
Orphan exit-code recovery (v0.5.9+)¶
If the daemon restarts mid-job (deliberately via
systemctl --user restart vq-daemon or via Restart=on-failure),
the dispatched bash wrapper writes the inner process’s exit
code to <workspace>/_vq/exit-code on graceful exit. When the
new daemon reconciles orphans, it reads the marker and
classifies as COMPLETED (rc=0) or FAILED (rc≠0). Pre-v0.5.9
behaviour was to mark every restart-orphan as
ABORTED_BY_QUEUE even on clean completion; v0.5.9 fixes that
and is what makes vq admin update (v0.5.20+) safe to use —
it deliberately pause-restart-resumes the daemon.
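The reconcile step can be pictured as follows. This is a hedged sketch, not the daemon's code: classify_orphan is a hypothetical name, and treating a missing or unreadable marker as ABORTED_BY_QUEUE is an assumption (a job that is still running at re-attach time is out of scope here).

```python
import tempfile
from pathlib import Path


def classify_orphan(workspace: Path) -> str:
    """Map the wrapper's exit-code marker to a terminal state."""
    marker = workspace / "_vq" / "exit-code"
    try:
        rc = int(marker.read_text().strip())
    except (FileNotFoundError, ValueError):
        # Assumption: no usable marker means the wrapper never exited gracefully.
        return "ABORTED_BY_QUEUE"
    return "COMPLETED" if rc == 0 else "FAILED"


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    (ws / "_vq").mkdir()
    no_marker = classify_orphan(ws)              # marker file absent
    (ws / "_vq" / "exit-code").write_text("0\n")
    clean = classify_orphan(ws)                  # rc == 0
    (ws / "_vq" / "exit-code").write_text("137\n")
    crashed = classify_orphan(ws)                # rc != 0

print(no_marker, clean, crashed)
```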
Multi-venv --branch routing (v0.5.6+)¶
The remote may host multiple vibe-qc clones — typically
vibeqc-dev (tracking main) and vibeqc-release
(tracking the latest tag). Pick one per submit:
vq submit my_calc.py # default_host's default
vq submit my_calc.py --branch main # dev venv
vq submit my_calc.py --branch release # release venv
vq submit my_calc.py --branch latest # = release (alias)
--branch is mutually exclusive with --python, and only
applies to single-file submits. For -d / -t submits,
encode the interpreter in the explicit command.
The mapping is per-host config — [hosts.<name>.branches] +
[hosts.<name>.branch_aliases]. Add new entries by editing
~/.config/vq/config.toml on the laptop; no remote restart
needed.
External-program workflows (CRYSTAL / ORCA / PySCF)¶
vibe-qc treats other QC programs as external — see
CLAUDE.md § 10
for the policy. vq dispatches them through contrib/
wrappers that handle each program’s I/O conventions:
CRYSTAL14 (Pcrystal + PROPERTIES14)¶
# Parallel CRYSTAL14 (default --np 14):
vq submit -d ./calc --cpus 14 \
-- bash /home/USER/vibeqc-queue/vibe-queue/contrib/run-crystal.sh INPUT.d12
# Serial:
vq submit -d ./calc --cpus 1 \
-- bash /home/USER/vibeqc-queue/vibe-queue/contrib/run-crystal.sh --serial INPUT.d12
# Custom MPI rank count:
vq submit -d ./calc --cpus 8 \
-- bash /home/USER/vibeqc-queue/vibe-queue/contrib/run-crystal.sh --np 8 INPUT.d12
# PROPERTIES14 (parallel):
vq submit -d ./prop --cpus 14 \
-- bash /home/USER/vibeqc-queue/vibe-queue/contrib/run-crystal.sh --properties prop.d3
The wrapper stages the input file as ./INPUT, runs
mpirun -np N Pcrystal > out.out, and restores any pre-existing
INPUT on exit.
ORCA 6.1¶
ORCA spawns its own MPI internally — don’t wrap with
mpirun:
vq submit -d ./orca_run --cpus 8 -- orca input.inp
ORCA takes its core count from the ! PAL N line in the
input file, not from vq; pass a matching --cpus N so the
cgroup accounting agrees.
PySCF (as a comparison / parity reference)¶
vq submit my_pyscf_script.py # PySCF is in both vibe-qc venvs
Both the dev and release vibe-qc venvs have PySCF installed
(it’s in [test]), so PySCF scripts submit the same way as
vibe-qc scripts.
Monitoring + management¶
# Snapshot the queue:
vq queue # all states
vq queue --active # running + pending + suspended
vq queue -s running # only running (v0.5.27)
vq queue -s running -s pending # explicit two-state filter
vq queue -s failed -s killed # terminal-failure forensics
# Per-job snapshot (metadata + tail of stdout/stderr):
vq status <jobid> # last 50 lines
vq status <jobid> -n 200 # last 200 lines
vq status <jobid> -n 0 # full output
# Live tail of a workspace file (v0.5.26):
vq tail <jobid> # follow stdout.log
vq tail <jobid> -f # live-stream (Ctrl-C to stop)
vq tail <jobid> --name vibeqc.log -f # custom logger file
vq tail <jobid> --name mgo.out -f # CRYSTAL output
vq tail <jobid> --name h2.out -f # ORCA / Psi4 output
# Fetch outputs back to the laptop (live job: workspace dir;
# completed: workspace dir; archived: un-tars from the .tar.bz2):
vq fetch <jobid> -o ./results
# Cancel:
vq kill <jobid> # SIGTERM the process group,
# then SIGKILL after grace
# → terminal state KILLED
# Pause / resume (v0.5.1+):
vq pause <jobid> # SIGSTOP the job
vq resume <jobid> # SIGCONT
vq pause --all # pause every running job
vq resume --all # resume every paused job
vq tail is the canonical “watch the SCF converge live” verb: it
execs tail -f directly (locally) or via ssh (remotely), so SIGINT
goes straight through and there’s no Python buffering layer between
the job’s logger and your terminal. Use --name to target whatever
file vibe-qc’s logger is writing to (e.g.
logging.basicConfig(filename='vibeqc.log') → vq tail JOBID --name vibeqc.log -f).
The pause / resume flow is the right tool when you need to free
the box temporarily (kids gaming, an interactive workload) without
losing in-flight jobs. For automated venv refresh use vq admin update
instead (it pauses-pulls-builds-resumes in one verb; see Refreshing
the remote vibe-qc venv below).
Web dashboard¶
If vq-web.service is running, open
http://<remote>:8765/queue in a browser. First-time access
prompts for the bearer token (stored at
~/.config/vq/web-token on the remote).
Endpoints:
| Endpoint | Purpose |
|---|---|
|  | Live queue table (htmx auto-refresh) |
|  | Per-job detail: spec, resource history, log tail, exit status |
|  | Kubernetes-style probes for external monitoring |
|  | Per-job write actions (v0.5.1+) |
|  | Queue-wide actions (v0.5.2+) |
All write endpoints require the bearer token in an
Authorization: Bearer <token> header. For browser use,
htmx + a small form prompts once and stores it in
sessionStorage.
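For scripted access, the header can be attached with the standard library alone. The sketch below only builds the request and does not send it; the host name and token value are placeholders, and /queue is the one path this page names explicitly.

```python
import urllib.request


def authorized_request(base_url: str, path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build a request carrying the bearer token in the Authorization header."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


req = authorized_request(
    "http://compute.example.com:8765",  # placeholder remote host
    "/queue",
    "TOKEN-FROM-WEB-TOKEN-FILE",        # placeholder; read the real one from web-token
)
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
# Sending it would be: urllib.request.urlopen(req, timeout=10)
```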
Architecture detail (auth, request shapes, error handling)
is in
vibe-queue/docs/web.md.
Fetching outputs¶
When a job completes, the workspace on the remote contains
the outputs your script wrote (water.out, water.molden,
…) plus the queue-side capture files (stdout.log,
stderr.log, _vq/events.jsonl, _vq/exit-code).
vq fetch <jobid> ./local_outputs/ # rsync the whole workspace
vq fetch <jobid> ./outputs/ --files stdout.log water.out
# specific files only
vq fetch is archive-aware (v0.5.11+): if the workspace
was archived via vq cleanup --archive (see next section),
fetch streams the .tar.bz2 over SSH and reconstructs the
original directory layout on the laptop. No special flag
needed; the same vq fetch <jobid> <local-dir> command works
for both live and archived workspaces.
Operator controls (pause / resume / throttle / drain)¶
When the box gets busy for non-queue reasons (kids gaming, an interactive session, an urgent job from another chat), three knobs let vq step aside without losing in-flight work:
# Hard freeze (SIGSTOP); RAM stays allocated, no CPU used.
vq pause <jobid> # one job
vq pause --all # every running job
vq resume <jobid> # SIGCONT
vq resume --all
# Soft throttle (cgroup CPUWeight, renice fallback v0.5.21+).
# weight=100 is default; weight=20 = "step aside" under contention.
vq throttle <jobid> --weight 20
vq throttle --all --weight 20 --persist # persist across new dispatches
vq throttle --all --weight 20 --persist --duration 2h # auto-release after 2h
vq throttle --release-persist # clear persistent state
vq throttle --status # what's the current state?
# Drain (don't dispatch NEW jobs; running ones continue).
vq drain # full drain (no new dispatches)
vq drain --max-jobs 0 # explicit full drain
vq drain --max-jobs 2 # partial drain (cap at 2 concurrent)
vq drain --release # back to daemon's configured max
vq drain --duration 1h # auto-release after 1h
vq drain --status
Composable: vq drain + vq pause --all + vq throttle --all cover
the operator-control story. All four state files
(drain.json, throttle.json, auto-cleanup.json, plus the per-job
suspended-state on the spec) live under <state_root> and survive
daemon restarts.
Workspace cleanup (v0.5.10+)¶
Long-running queues accumulate workspaces. vq cleanup is the
manual housekeeping verb; it operates only on terminal-state
jobs (active / pending / suspended jobs are never touched).
# List terminal-state jobs and their workspace ages:
vq cleanup
# → table: jobid, terminal_state, finished_at, workspace_size_mb
# Dry-run preview — show what would be archived:
vq cleanup --archive --older-than 30d
# Actually archive (add -x to "execute"):
vq cleanup --archive --older-than 30d -x
# → workspaces become tar.bz2 files under <state_root>/archive/
# Hard delete archived workspaces older than 90 days:
vq cleanup --delete --older-than 90d -x
# Restore an archived workspace (un-tar in place):
vq cleanup --restore <jobid> -x
The archive→restore round-trip is lossless: the directory
tree after --restore is byte-identical to what was archived.
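What "lossless" means here can be checked with a toy round-trip using Python's tarfile module. This is an illustration under assumptions, not vq's archive code: the helper names and the arcname layout are made up, and only the .tar.bz2 format is taken from the text above.

```python
import tarfile
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, bytes]:
    """Relative path -> file contents for every file under root."""
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def archive(workspace: Path, archive_path: Path) -> None:
    """Pack the workspace into a bzip2-compressed tarball."""
    with tarfile.open(archive_path, "w:bz2") as tar:
        tar.add(workspace, arcname=".")


def restore(archive_path: Path, target: Path) -> None:
    """Unpack the tarball into target, using the safe filter where available."""
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        if hasattr(tarfile, "data_filter"):
            tar.extractall(target, filter="data")
        else:
            tar.extractall(target)


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    workspace = tmp_path / "workspace"
    (workspace / "_vq").mkdir(parents=True)
    (workspace / "stdout.log").write_text("SCF converged\n")
    (workspace / "_vq" / "exit-code").write_text("0\n")

    before = snapshot(workspace)
    archive(workspace, tmp_path / "job.tar.bz2")
    restore(tmp_path / "job.tar.bz2", tmp_path / "restored")
    after = snapshot(tmp_path / "restored")
    round_trip_ok = before == after

print(round_trip_ok)
```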
Auto-policy (v0.5.17+): instead of running the verb manually, register a daemon-side policy:
# Daemon runs the sweep once per --interval (default 24h):
vq cleanup --auto-enable --archive-after 30d --delete-after 90d
# Per-state retention (v0.5.23+) — keep failed-job forensics longer:
vq cleanup --auto-enable --archive-after 30d \
--archive-after-state failed:90d --delete-after 180d
# Read-only status:
vq cleanup --auto-status
# Disable:
vq cleanup --auto-disable
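The per-state override amounts to a lookup with a default. The sketch below is an assumption-laden illustration, not the daemon's sweep: the function names are hypothetical, only the "d" suffix is parsed, and treating a job as due once its age reaches the threshold is a guess at the boundary behaviour.

```python
def parse_days(text: str) -> int:
    """Parse a '30d'-style duration into whole days (only 'd' is handled here)."""
    if not text.endswith("d") or not text[:-1].isdigit():
        raise ValueError(f"expected something like '30d', got {text!r}")
    return int(text[:-1])


def parse_state_override(spec: str) -> tuple[str, int]:
    """Split 'failed:90d' into ('failed', 90)."""
    state, _, duration = spec.partition(":")
    if not state or not duration:
        raise ValueError(f"expected STATE:DURATION, got {spec!r}")
    return state.lower(), parse_days(duration)


def archive_due(state: str, age_days: float, archive_after: str, overrides: dict[str, int]) -> bool:
    """True when a terminal-state job is old enough to archive."""
    threshold = overrides.get(state.lower(), parse_days(archive_after))
    return age_days >= threshold


# Mirrors: --archive-after 30d --archive-after-state failed:90d
overrides = dict([parse_state_override("failed:90d")])
print(archive_due("completed", 45, "30d", overrides))  # past the 30-day default
print(archive_due("failed", 45, "30d", overrides))     # still inside the 90-day override
print(archive_due("failed", 120, "30d", overrides))    # past the override
```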
Configurable archive location (v0.5.22+): default
<state_root>/archive/ may live on a small partition. Override via:
$VQ_ARCHIVE_DIR env var on the daemon host (applies to all archive paths globally)
--archive-dir DIR flag on the verb (per-policy with --auto-enable, per-invocation with one-shot --archive)
Why this matters: when the queue gets busy, workspaces add up
fast (tens of MB per typical SCF, hundreds of MB for big
periodic runs with Molden + cube + .traj output). Without cleanup, the
<state_root> filesystem fills up. With cleanup, you get a
straightforward archive → delete pipeline that preserves the
artefact history (every spec + final outputs) at small storage
cost (~5× compression for typical output mixes).
Daemon admin¶
What happens at host reboot¶
The daemon survives if loginctl enable-linger is set:
Daemon restart only: running jobs become orphans with their pgids preserved; the new daemon re-attaches at startup. The job completes normally; its exit code is read from the dispatched job’s _vq/exit-code file (so re-attach works even after a restart that wiped the Popen handle). This is v0.5.9’s _vq/exit-code marker; pre-v0.5.9 restart-orphans got marked ABORTED_BY_QUEUE even on clean completion.
Full host reboot: the kernel kills everything; all RUNNING jobs are marked ABORTED_BY_QUEUE on next daemon start. Resubmit using the JobSpecs in the queue history.
Note
Wall-time enforcement gap when the daemon is down. Because
v0.5.8 dropped cgroup RuntimeMaxSec (it wasn’t runtime-mutable
on pause; see vibe-queue/docs/wall_time_design.md), wall-time
enforcement is now the Python watchdog only. If the daemon
crashes and stays down beyond the watchdog’s poll interval, a
job that should have hit its --wall-time-seconds cap during
the outage isn’t killed by the kernel — it keeps running until
the daemon comes back and the watchdog catches up. In practice,
Restart=on-failure on the systemd unit keeps the gap to a
few seconds. The trade-off is documented in
vibe-queue/docs/wall_time_design.md.
Refreshing the remote vibe-qc venv after a release (v0.5.20+)¶
As of vq v0.5.20, this is one verb:
vq admin update vibeqc-release
This runs pause-all → git -C <git_dir> pull → bash <update_script> →
resume-all. The resume step always runs, even on Ctrl-C or a pull
failure: it sits in a finally block, so the queue always comes back up.
git_dir / branch / update_script are read from the host’s
[programs.X] registry (see vq programs below).
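The always-resume guarantee is the ordinary try/finally pattern. The sketch below shows it with stand-in callables: admin_update and its four arguments are hypothetical names for illustration, not vq's internal API.

```python
from typing import Callable


def admin_update(
    pause_all: Callable[[], None],
    pull: Callable[[], None],
    build: Callable[[], None],
    resume_all: Callable[[], None],
) -> None:
    """Pause, pull, build, and resume; the resume runs no matter what."""
    try:
        pause_all()
        pull()
        build()
    finally:
        # Runs on success, on an exception, and on KeyboardInterrupt (Ctrl-C).
        resume_all()


calls: list[str] = []


def failing_pull() -> None:
    calls.append("pull")
    raise RuntimeError("simulated pull failure")


try:
    admin_update(
        lambda: calls.append("pause-all"),
        failing_pull,
        lambda: calls.append("build"),
        lambda: calls.append("resume-all"),
    )
except RuntimeError as exc:
    calls.append(f"error: {exc}")

print(calls)
```

In the simulated failure the build step is skipped, the resume still runs, and the error reaches the caller afterwards.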
Verifying a tagged release (v0.5.24+):
git push --tags
vq admin update vibeqc-release --tag v0.8.0
vq submit smoke_test.py --branch release
--tag v0.X.Y runs git describe --exact-match --tags HEAD after the
pull and fails the update (skips the build, exits non-zero) if HEAD
isn’t exactly at the expected tag. This catches the libint-vanishing
class of “pull succeeded but landed on the wrong commit” failures.
Checking remote state (v0.5.25+):
vq admin status
# NAME BRANCH SHA DESCRIBE DIRTY LAST_UPDATED_AT LAST OK
# vibeqc-dev main abc12345def0 v0.7.3-12-gabc1234 no 2026-05-13T14:30:00+00:00 True
# vibeqc-release release fedcba987654 v0.8.0 no 2026-05-13T14:35:12+00:00 True
Compare SHA to your laptop’s git rev-parse --short=12 HEAD to
answer “is the remote at the commit I just pushed?” without opening an ssh session yourself.
Chat workflow for testing a just-pushed feature:
git push # laptop
vq admin update vibeqc-dev # refresh the remote
vq submit my_feature_test.py --branch main # exercise the new code
This is the canonical pattern: always run vq admin update between
push and submit if you need the remote at your latest commit.
vq programs (v0.5.18+) — list registered programs:
vq programs # human-readable table
vq programs --json # machine-readable; for scripts
The registry lives at ~/.config/vq/config.toml on the remote host
under [programs.X]. Three kinds:
binary: CRYSTAL, ORCA, Psi4 (an executable on disk)
venv: vibeqc-dev, vibeqc-release (a Python venv + git checkout that vq admin update knows how to refresh)
import: pyscf (a module that should be importable from a specific Python)
Watching the daemon¶
journalctl --user -u vq-daemon -f # live tail
systemctl --user status vq-daemon # service health
Concurrency¶
The default daemon configuration is single-job dispatch
(--max-jobs 1 in the systemd unit). This is the test-phase
default — change to --max-jobs N in the unit file’s
ExecStart and restart the daemon to parallel-dispatch.
Set --max-jobs honestly against the CPU budget: if jobs
declare --cpus 8 and the box has 32 cores, --max-jobs 4
is the safe ceiling. The daemon does not currently enforce
this; it accepts whatever you set.
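Since the daemon does not do this check for you, the ceiling is worth computing by hand. A one-function sketch (safe_max_jobs is a made-up name, and it assumes every job declares the same --cpus):

```python
def safe_max_jobs(total_cores: int, cpus_per_job: int) -> int:
    """Largest --max-jobs that does not oversubscribe the CPU budget."""
    if total_cores < 1 or cpus_per_job < 1:
        raise ValueError("core counts must be positive")
    # Floor division: partial jobs do not fit. Never go below one job,
    # even when a single job asks for more cores than the box has.
    return max(1, total_cores // cpus_per_job)


print(safe_max_jobs(32, 8))   # 4, the example from the text
print(safe_max_jobs(32, 14))  # 2, with 4 cores left idle
```

With mixed job sizes, budgeting against the largest --cpus you expect to submit is the conservative choice.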
Troubleshooting¶
| Symptom | Likely cause | Fix |
|---|---|---|
|  | venv not on PATH | symlink to |
|  | local SSH client not installed | install OpenSSH client |
|  | SSH key not authorised on remote | add laptop’s |
| Job hangs in | daemon not running or |  |
| Job terminates | pre-v0.5.12, the watchdog read CPU from the wrapper PID only (bash sleeping in | upgrade to vq v0.5.12+; the watchdog now sums CPU across the whole pgid descendant set. As a workaround on older versions: |
|  | wrong | check |
| Web UI says “401 unauthorised” | bearer token expired or wrong | re-read |
Comprehensive troubleshooting table in
vibe-queue/docs/handover.md § Troubleshooting.
Version history (recent)¶
| vq version | Headline |
|---|---|
| v0.5.27 |  |
| v0.5.26 |  |
| v0.5.25 |  |
| v0.5.24 |  |
| v0.5.23 | per-state retention overrides: |
| v0.5.22 | configurable archive_dir ( |
| v0.5.21 |  |
| v0.5.20 |  |
| v0.5.19 | smoke test consumes absolute paths from |
| v0.5.18 |  |
| v0.5.17 | auto-cleanup policy (daemon main-loop hook reads |
| v0.5.13–.16 |  |
| v0.5.12 | watchdog samples pgid descendants (fixes STARVED false-positive when bash-wrapped jobs sleep in |
| v0.5.11 | archive-aware remote |
| v0.5.10 |  |
| v0.5.9 | orphan exit-code recovery via |
| v0.5.8 | drop broken cgroup |
| v0.5.7 |  |
| v0.5.6 |  |
| v0.5.0–.5 | web dashboard, pause/resume, bearer-token auth, CRYSTAL14 parallel dispatch |
| v0.4 | cgroup-v2 enforcement, pgid recovery, event log |
| v0.3 | resource watchdog (mem cap, wall-time, terminal-state machine) |
Full per-version detail at
vibe-queue/docs/handover.md § “What’s NEW in …”
(the handover is the deeper reference; this page is the
user-facing entry).
Roadmap (vq’s own)¶
vq has its own roadmap independent of vibe-qc — see
vibe-queue/docs/roadmap.md.
Near-term:
v0.6.0 full scope (some items already shipped in v0.5.20–v0.5.25; remaining: vq admin update --all multi-env, vq admin update vq self-update with daemon restart, admin-update-in-progress marker file for crash recovery, multi-user / per-uid).
v0.7: job priority (--priority N), per-user quotas, retry on failure (--retry N with backoff), webhook notifications, opt-in --auto-resume after host reboot.
v1.0: SLURM / PBS backend so the same vq submit shape works against HPC clusters.
See also¶
vibe-queue/docs/handover.md: operational reference + per-engine recipe table + full troubleshooting (1093 lines, the deep dive).
vibe-queue/docs/SPEC.md: JobSpec wire format, on-disk schema, terminal-state semantics.
vibe-queue/docs/web.md: web dashboard auth + request shapes.
vibe-queue/docs/config.toml.example: annotated config template.
jupyter.md: running vibe-qc from Jupyter Lab (interactive workflows, vs vq’s batch shape).
docs/release_process.md: where vq fits in the per-tag + per-quarter docs cadence.