# Plan: Auto-build docker image on relay version tags + fix release-tag CI handling

## Context

When relay `1.11.0` was tagged on 2026-05-15, the docker build had to be manually triggered via `workflow_dispatch` because the dispatch workflow in the relay repo only fires on `push` to `main`. The manual build (run 25957678788) did produce the `1.11.0` image, but the downstream `cmlxc` test job failed because cmlxc's ref-handling assumes branches and runs `git reset --hard origin/`, which is meaningless for tags. On top of that, the GHCR cleanup policy does not protect full-semver tags, so a release image like `1.11.0` could eventually be pruned as more SHA builds accumulate.

Four changes are needed across three repos:

1. relay: add a tag-push trigger to `docker-dispatch.yaml` (strict semver).
2. docker: re-introduce minor (`X.Y`) and `latest` tags in `docker-ci.yaml` (commit 4e77c3a that removed them was never shipped).
3. docker: harden `cleanup.yaml` so release tags are never pruned and tag deletes don't fire the branch-cleanup path.
4. cmlxc (`j4n/docker-support` branch): teach the docker driver to handle tag refs in addition to branches and SHAs.

Result: pushing an annotated tag like `1.11.0` per `RELEASE.md` will automatically build, push, and integration-test the release image without human intervention.

## Branch setup

- relay (`/work`): already on `j4n/docker-tag-trigger`; commit here.
- docker (`/work/docker`): currently on `main`. Create `j4n/docker-tag-trigger` (matching the relay branch name) and commit the docker-ci + cleanup changes there. Open PR against `chatmail/docker:main`.
- cmlxc (`/work/cmlxc`): already on `j4n/docker-support`; commit the tag-handling fixes onto that existing feature branch (which `docker-ci.yaml:208` already points at via the `@j4n/docker-support` ref, so no workflow edit is needed to pick them up).

All three branches should be merged in this order: cmlxc first (so the test job has a working ref-handling implementation), then docker, then relay (so the dispatch fires into a docker `main` that already builds correctly).

## Repo 1: relay (/work)

### File: `/work/.github/workflows/docker-dispatch.yaml`

Add a tag filter to the `push` trigger. Tags follow `X.Y.Z` (no `v` prefix) per `RELEASE.md`. The existing `permissions: {}`, payload, and action pin stay as-is.

```yaml
on:
  push:
    branches: [main]
    tags: ['[0-9]+.[0-9]+.[0-9]+']
  workflow_dispatch:
```

No other changes are needed: `github.ref_name` for a tag push is the bare tag (e.g. `1.11.0`), which is exactly what the docker repo's semver regex at `docker-ci.yaml:129` already matches.

## Repo 2: docker (/work/docker)

### File: `/work/docker/.github/workflows/docker-ci.yaml`

Re-introduce both the minor-version tag (`X.Y`) and the `latest` tag for semver releases. Commit 4e77c3a (which removed them) was never shipped to a published release, so this is simply restoring the original behavior while keeping the bugfix-only-clobber guard for `latest`.

In the `Compute build metadata` step (around lines 117-133), inside the existing `if [[ "${RELAY_REF}" =~ ^v?([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]` branch:

- Always append `${IMAGE}:X.Y.Z` (existing) and `${IMAGE}:X.Y`.
- Append `${IMAGE}:latest` only when this release is the highest semver published so far. Use `gh api` to check existing tags; this prevents a back-ported patch on an older minor line from clobbering `latest`.
```bash
if [[ "${RELAY_REF}" =~ ^v?([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
  MAJOR="${BASH_REMATCH[1]}"
  MINOR="${BASH_REMATCH[2]}"
  PATCH="${BASH_REMATCH[3]}"
  VERSION="${MAJOR}.${MINOR}.${PATCH}"
  MINOR_TAG="${MAJOR}.${MINOR}"
  # Always publish the full version and the moving minor tag.
  TAGS="${TAGS}"$'\n'"${IMAGE}:${VERSION}"$'\n'"${IMAGE}:${MINOR_TAG}"
  # Highest X.Y.Z tag already published (first page of 100 versions only).
  HIGHEST=$(gh api "orgs/chatmail/packages/container/docker/versions?per_page=100" \
    --jq '[.[].metadata.container.tags[] | select(test("^[0-9]+\\.[0-9]+\\.[0-9]+$"))]
          | sort_by(split(".") | map(tonumber)) | last // empty' 2>/dev/null) || true
  # Tag latest only when nothing published is newer than this release.
  if [ -z "$HIGHEST" ] || \
     [ "$(printf '%s\n%s\n' "$HIGHEST" "$VERSION" | sort -V | tail -1)" = "$VERSION" ]; then
    TAGS="${TAGS}"$'\n'"${IMAGE}:latest"
  fi
fi
```

This requires `gh` in the runner (already used elsewhere in the file at line 141). `gh` does not pick up a token implicitly on GitHub Actions: the step needs `GH_TOKEN` set in its `env` (for example from `secrets.GITHUB_TOKEN`). Because the lookup suppresses errors with `2>/dev/null || true`, a missing or unauthorized token leaves `HIGHEST` empty and `latest` is then applied unconditionally, so confirm the env is present. No new permissions are needed beyond the existing `packages: read` from `secrets.GITHUB_TOKEN`.

Note: the minor tag (`X.Y`) intentionally moves forward to point at the newest patch in that minor line. This matches the pre-4e77c3a behavior and the existing cleanup regex, which already protects `\d+\.\d+`.

### File: `/work/docker/.github/workflows/cleanup.yaml`

Two edits, both in the existing file:

1. Line 50 (`prune-old-sha` ignore regex): extend to protect full semver.

   ```yaml
   ignore-versions: '^(main|latest|\d+\.\d+|\d+\.\d+\.\d+)$'
   ```

   Keeps the existing `main`, `latest`, and minor-version protections (needed again once docker-ci re-emits the `X.Y` and `latest` tags that commit 4e77c3a removed), and adds `X.Y.Z` so release tags are never the target of the keep-30 SHA pruning rule.

2. Line 55 / `cleanup-branch` job: scope the `delete` trigger to branches, not tags. Otherwise deleting a release tag would delete the image tagged with that release. Change the `if:` condition:

   ```yaml
   if: github.event_name == 'delete' && github.event.ref_type == 'branch'
   ```

No change to `prune-untagged`.
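The two docker-repo rules above (which tags the cleanup job must skip, and when a release may claim `latest`) are easy to sanity-check offline. The following is a minimal Python sketch, not part of either workflow. It assumes the cleanup action evaluates `ignore-versions` as an ordinary regular expression compatible with Python's `re` module, and the function names `is_protected` and `should_tag_latest` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import Iterable, Optional, Tuple

# Pattern copied from the proposed `ignore-versions` value.
# Assumption: the cleanup action applies it as a plain regex to each tag.
IGNORE_VERSIONS = re.compile(r"^(main|latest|\d+\.\d+|\d+\.\d+\.\d+)$")
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def is_protected(tag: str) -> bool:
    """True when the tag matches the ignore pattern, i.e. pruning skips it."""
    return IGNORE_VERSIONS.match(tag) is not None


def parse_semver(tag: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch) for a strict X.Y.Z tag, else None."""
    m = SEMVER.match(tag)
    return tuple(int(part) for part in m.groups()) if m else None


def should_tag_latest(version: str, published: Iterable[str]) -> bool:
    """Mirror of the shell guard: claim `latest` only if nothing newer exists."""
    candidate = parse_semver(version)
    if candidate is None:
        return False
    existing = [v for v in map(parse_semver, published) if v is not None]
    return not existing or candidate >= max(existing)


if __name__ == "__main__":
    for tag in ["main", "latest", "1.11", "1.11.0", "sha-0123abc"]:
        print(tag, "protected" if is_protected(tag) else "prunable")
    print(should_tag_latest("1.10.3", ["1.11.0", "1.10.2"]))
```

Comparing parsed integer tuples rather than strings is what keeps `1.10.0` ordered after `1.9.0`, the same job `sort -V` does in the shell version.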
## Repo 3: cmlxc (/work/cmlxc, branch `j4n/docker-support`)

The `RELAY_REF=1.11.0 cmlxc test-cmdeploy dock0` step failed in run 25957678788 because git operations assume a branch with a remote tracking ref. Fix three sites identified by the explore agent. Reuse a single small helper rather than duplicating logic.

### New helper

Add a helper in `/work/cmlxc/src/cmlxc/driver_base.py` near `parse_source()` (around line 44–64) that classifies a ref:

```python
def classify_ref(checkout_dir: str, ref: str) -> str:
    """Return "sha", "tag", or "branch". Run inside the worktree.

    - 40-hex => sha
    - git show-ref --tags --verify refs/tags/ => tag
    - otherwise branch
    """
```

The implementation must tolerate shallow clones and call git via the existing subprocess wrappers used in the file.

### Site 1: `prepare_source_in_builder()` in `driver_docker.py:101-134`

Replace the unconditional `git reset --hard -q origin/{ref}` at line 123 with a branch-only path. For a tag, run `git fetch origin --tags` and `git checkout -q refs/tags/{ref}`; do not reset against `origin/`. For a SHA, keep the current SHA logic.

### Site 2: `init_builder()` in `driver_base.py:189-210`

The `reset_cmd` block currently sets `reset_cmd` only when `not is_sha`. Extend the check so tags also skip the reset:

```python
kind = classify_ref(checkout, source.ref)
reset_cmd = ""
if kind == "branch":
    # Only branches have a remote-tracking ref to reset against.
    reset_cmd = f"git reset --hard -q origin/{source.ref}"
elif kind == "tag":
    fetch_cmd = "git fetch origin --tags"
    # checkout step below uses refs/tags/{ref}
```

Drop the `2>/dev/null || true` masking so that real failures surface.

### Site 3: `run_tests()` in `driver_docker.py:1060-1097`

The SHA-prefix check at line 1088 (`if current_sha != ref and not ref.startswith(current_sha):`) treats a tag as a branch and re-checks out via `origin/`. Use `classify_ref` before that branch:

- if `kind == "tag"`: resolve to its commit (`git rev-parse refs/tags/^{commit}`) and compare against `current_sha`; if they match, skip the re-checkout.
- if `kind == "branch"`: existing behavior.
- if `kind == "sha"`: existing behavior.

## Verification

End-to-end test of the dispatch + build + test pipeline:

1. Push a throwaway pre-release-style tag on a fork (`0.0.1-test-tagci`), but only if "Strict semver only" was loosened. Otherwise:
2. Cut a real point-release dry run in a fork by tagging a no-op commit on a `j4n/release-pipeline-test` branch with a tag like `1.99.0`. In a fork, the relay repo guard `if: github.repository == 'chatmail/relay'` will skip dispatch, so either verify on `chatmail/relay` by waiting until the next real release, or temporarily allow the fork by editing the guard locally.
3. Observe in the docker repo:
   - `gh run watch --repo chatmail/docker $(gh run list --repo chatmail/docker --limit 1 --json databaseId --jq '.[0].databaseId')`
   - the dispatched run's build job pushes tags `sha-` and `1.99.0`.
   - the test job (using updated cmlxc) succeeds end-to-end.
4. Confirm cleanup safety:
   - `gh workflow run --repo chatmail/docker cleanup.yaml`
   - `gh api orgs/chatmail/packages/container/docker/versions --jq '.[].metadata.container.tags' | grep -E '^\[.*"1\.11\.0".*\]'` still returns the 1.11.0 image after cleanup runs.
5. cmlxc unit-level: in `/work/cmlxc`, run the existing test suite (`uv run pytest`) and add a small test for `classify_ref` covering sha / tag / branch inputs.

## Critical files

- `/work/.github/workflows/docker-dispatch.yaml`
- `/work/docker/.github/workflows/docker-ci.yaml`
- `/work/docker/.github/workflows/cleanup.yaml`
- `/work/cmlxc/src/cmlxc/driver_base.py`
- `/work/cmlxc/src/cmlxc/driver_docker.py`

## Out of scope

- Removing the two `TODO: revert to @main once cmlxc docker support is merged` lines in `docker-ci.yaml:207-215`. That happens as a follow-up once the cmlxc fixes land on cmlxc `main`.
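For reference, the `classify_ref` helper from the Repo 3 section and the unit test called for in Verification step 5 could look roughly like the sketch below. It is an illustration under stated assumptions, not cmlxc's code: the `git_ok` parameter and the `_git_ok` function are hypothetical, added so the classification can be tested without a real repository, and the actual helper should go through the subprocess wrappers already used in `driver_base.py`.

```python
import re
import subprocess
from typing import Callable, Sequence

# A full (unabbreviated) SHA-1 object name: exactly 40 hex digits.
FULL_SHA = re.compile(r"[0-9a-fA-F]{40}")


def _git_ok(checkout_dir: str, args: Sequence[str]) -> bool:
    """Hypothetical default runner: True when `git <args>` exits 0."""
    result = subprocess.run(
        ["git", "-C", checkout_dir, *args],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def classify_ref(
    checkout_dir: str,
    ref: str,
    git_ok: Callable[[str, Sequence[str]], bool] = _git_ok,
) -> str:
    """Return "sha", "tag", or "branch" for the given ref."""
    if FULL_SHA.fullmatch(ref):
        return "sha"
    # show-ref --verify needs the full ref name and exits non-zero if absent.
    if git_ok(checkout_dir, ["show-ref", "--verify", "--quiet", f"refs/tags/{ref}"]):
        return "tag"
    return "branch"


def test_classify_ref() -> None:
    # Fake runner: pretend only the tag 1.11.0 exists locally.
    def fake(_dir: str, args: Sequence[str]) -> bool:
        return args[-1] == "refs/tags/1.11.0"

    assert classify_ref(".", "a" * 40, fake) == "sha"
    assert classify_ref(".", "1.11.0", fake) == "tag"
    assert classify_ref(".", "main", fake) == "branch"
```

One caveat of this sketch: it only sees tags that already exist locally, so in a shallow clone a tag that has not been fetched is classified as a branch. Callers would need to run `git fetch origin --tags` first, or the helper would have to fall back to querying the remote.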