Handover — vibeqc-dev fleet-clone state (2026-05-20)

From: queue-system dev chat (vq). To: a vibe-qc dev chat. Scope: the vibeqc-dev venv clones on the fleet hosts (mars, planetx) need a rebuild + a branch-state cleanup. This is not a queue-system bug — vq itself is healthy on both hosts (v0.6.24, daemons OK). It surfaced during a routine fleet maintenance sweep.


TL;DR

  1. mars:~/gitlab/vibeqc-dev — git source is current (5957127, latest origin/main), working tree clean, but the venv’s compiled C++ extensions are stale. A git pull moved source only; the new COSX / MP2 / BIPOLE / CI-solver native code is not built. Needs scripts/update.sh --dev.

  2. mars:~/gitlab/vibeqc-dev is checked out on a local branch named release that actually contains main-line commits. The clone should be on main (the vq config declares branch = "main" for it). Pre-existing wart; needs a deliberate git checkout main (or branch rename).

  3. planetx:~/gitlab/vibeqc-dev — working tree is [DIRTY] (uncommitted changes) and well behind origin/main (v0.7.0-24-ge3a34a5). The dirty changes need triage (stash / commit / discard) before any update.

  4. The failed vq admin update vibeqc-dev mars (git pull rc=128, 2026-05-20 ~10:53 UTC) is most plausibly explained by #2 — scripts/update.sh --dev’s git logic likely does not expect a clone parked on a non-main local branch. Worth hardening the script.
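
A minimal sketch of the hardening suggested in item 4, assuming the failure really does come from the branch mismatch (not confirmed): check the checked-out branch before pulling, and stop with a clear message instead of letting git fail with rc=128. The function name require_branch and the refuse-rather-than-switch policy are illustrative assumptions, not the contents of scripts/update.sh.

```shell
#!/bin/sh
# Hypothetical pre-flight guard for an update script (NOT the real scripts/update.sh).
# Usage: require_branch <expected-branch> [repo-dir]
require_branch() {
    expected=$1
    repo=${2:-.}
    # symbolic-ref fails on a detached HEAD or outside a git repo;
    # both cases are treated as a mismatch.
    actual=$(git -C "$repo" symbolic-ref --quiet --short HEAD 2>/dev/null) || actual="(detached)"
    if [ "$actual" != "$expected" ]; then
        echo "refusing to update: HEAD is on '$actual', expected '$expected'" >&2
        return 1
    fi
    return 0
}
```

Called as require_branch main ~/gitlab/vibeqc-dev ahead of the pull, this would stop on mars with "HEAD is on 'release', expected 'main'" and leave the decision (git checkout main or a branch rename) to a human.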


Full state

mars:~/gitlab/vibeqc-dev

vq admin status mars row:

NAME        BRANCH  SHA           DESCRIBE             DIRTY  LAST OK
vibeqc-dev  main    595712771baa  v0.7.5-390-g5957127  no     False

(BRANCH main here is the vq-config-declared branch, not the git-checked-out branch — see below.)

Local branches in that clone:

  bench-checkout  ea7daaf  [origin/main: behind 249]
  main            83a5da8  [origin/main: behind 15]
* release         5957127  [origin/release: ahead 173, behind 50]

  • HEAD is on the local branch release, currently at 5957127 (a main-line commit, “docs/handover: meta-GGA τ-density landed”).

  • The local main branch is at 83a5da8, 15 behind origin/main.

  • LAST OK=False — the most recent vq admin update attempt for this env failed.

Transparency note: on 2026-05-20, while diagnosing the rc=128 failure, the queue chat advanced the local release branch by 15 commits (from 83a5da8 to 5957127) by running git pull --rebase origin main. This was intended as a read-only check and should not have mutated the clone. The pull was a clean fast-forward: nothing was pushed, origin/release is untouched, and no commits were lost. It does mean the local release branch now carries 15 more main commits than it did before. The branch was already parked on main-line commits before the queue chat touched it (release and main both sat at 83a5da8), so the misnaming is pre-existing, not queue-chat-introduced. It is flagged here in full so the dev chat has the complete picture before deciding how to restore the intended branch layout.
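
For a diagnosis that has to stay read-only, the commands below are a sketch of checks that do not move a branch or rewrite the working tree. The helper names ahead_behind and inspect_clone are hypothetical. Note that origin/main here means the local remote-tracking ref as of the last fetch; git ls-remote asks the remote directly without updating anything locally.

```shell
#!/bin/sh
# Hypothetical read-only helpers; ref names are supplied by the caller.

# Prints "<ahead> <behind>" for ref $1 relative to ref $2.
ahead_behind() {
    # --left-right --count emits "<only-in-$1><TAB><only-in-$2>"
    git rev-list --left-right --count "$1...$2" | tr '\t' ' '
}

inspect_clone() {
    # --no-optional-locks stops status from opportunistically rewriting the index
    git --no-optional-locks status --short --branch
    git branch -vv                           # every local branch with its tracking state
    git ls-remote origin refs/heads/main     # remote tip, no local ref is updated
    ahead_behind HEAD origin/main
}
```

None of these commands take the place of git pull; bringing a branch up to date remains a separate, deliberate step.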

planetx:~/gitlab/vibeqc-dev

vq admin status planetx row (abridged):

vibeqc-dev  main  …  v0.7.0-24-ge3a34a5  DIRTY=yes

  • Working tree has uncommitted changes ([DIRTY]).

  • Well behind origin/main.

  • The dirty changes have not been inspected — the queue chat deliberately did not touch planetx’s clone.
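
A possible first pass at the triage, sketched under the assumption that the inspection itself must not change anything: count tracked versus untracked changes, then look at the diffs and any existing stashes. The helper names summarize_porcelain and triage_dirty are hypothetical.

```shell
#!/bin/sh
# Summarize `git status --porcelain` output read from stdin.
# Prints "tracked=<n> untracked=<m>".
summarize_porcelain() {
    awk '
        /^\?\?/ { untracked++; next }   # "??" marks an untracked path
        NF      { tracked++ }           # any other non-empty line is a tracked change
        END     { printf "tracked=%d untracked=%d\n", tracked, untracked }
    '
}

# Read-only triage of a dirty clone (hypothetical wrapper; run inside the clone).
triage_dirty() {
    git --no-optional-locks status --porcelain | summarize_porcelain
    git diff --stat            # unstaged changes to tracked files
    git diff --cached --stat   # staged but uncommitted changes
    git stash list             # anything stashed on an earlier occasion
}
```

If the changes turn out to be worth keeping, git stash push --include-untracked -m "<label>" parks them, untracked files included, without creating a commit.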


Why this matters for job execution

The fleet vq config routes --branch main jobs to the vibeqc-dev venv python:

[hosts.mars.branches]
main = "~/gitlab/vibeqc-dev/.venv/bin/python"

A job submitted today with --branch main on mars runs against a venv whose compiled extensions predate the COSX OneCenterCorrection, the RI-MP2 use_float_intermediates path, the BIPOLE UHF scaffold, and the non-mean-field solvers module. The Python source is current; the .so files are not. Results would silently reflect the old C++ until the venv is rebuilt. No crash, no error — just stale numerics. That is the dangerous part and the reason this handover is worth acting on promptly.
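
Because a stale build produces no error, a quick timestamp comparison can at least flag it. The sketch below lists compiled extensions that are not newer than a reference file, for example a C++ source file touched by the pull. It is an mtime heuristic only, and the helper name stale_so is an assumption; the handover does not say where the build places its .so files, so the directory has to be supplied by whoever runs it.

```shell
#!/bin/sh
# List compiled extensions that are NOT newer than a reference file.
# Usage: stale_so <dir-containing-.so-files> <reference-file>
# Files with the same mtime as the reference are also listed.
stale_so() {
    find "$1" -type f -name '*.so' ! -newer "$2"
}
```

Any output suggests the venv needs scripts/update.sh --dev; empty output does not prove the build is current, since mtimes say nothing about which commit the extensions were compiled from.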

The git branch name (release vs main) does not affect job execution: jobs use the venv python regardless of which branch is checked out. The branch wart only matters for git status readability and for the git operations performed by vq admin update / scripts/update.sh.


What is NOT broken (queue side — no action needed)

  • vq is v0.6.24 on both mars and planetx.

  • Both vq-daemon units are active and healthy under systemd (vq daemon health verdict: OK).

  • vibeqc-queue env: clean, LAST OK=True.

  • vibeqc-release env: clean, LAST OK=True.

  • The failed admin-update marker on mars has already been cleared (v0.5.51 marker-PID-staleness logic retired it once the dead update-process’s marker was re-evaluated).

Fleet-host incident context (FYI, not dev-chat work)

During the same sweep, mars’s systemd --user manager was found dead despite Linger=yes and no host reboot (uptime 4 days). It was restored by the operator (sudo systemctl restart user@1000.service), and a leftover unsupervised vq daemon run process was cleaned up. If the user manager dies again under linger, that warrants a separate investigation (journal around the death timestamp — a manager dropping under linger usually points to an OOM kill or a manual systemctl --user exit). Recorded here only so the dev chat has the timeline; no dev-chat action required.
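
If that investigation is ever needed, the journal check could look roughly like the sketch below. The helper names oom_lines and manager_death_window are hypothetical, the time window is supplied by the caller, and reading the kernel log may require elevated privileges on the host.

```shell
#!/bin/sh
# Filter journal text on stdin down to lines that look like OOM-killer activity.
# "|| true" keeps the exit status at 0 when nothing matches.
oom_lines() {
    grep -i -E 'out of memory|oom-kill' || true
}

# Hypothetical wrapper: show the user manager's log and kernel OOM lines for a window.
# Usage: manager_death_window "<start-timestamp>" "<end-timestamp>"
manager_death_window() {
    journalctl -u user@1000.service --since "$1" --until "$2" --no-pager
    journalctl -k --since "$1" --until "$2" --no-pager | oom_lines
}
```

An empty OOM result would shift suspicion toward a manual exit or another cause, which the unit's own log lines in the same window may help narrow down.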