relay/doc/plans/260518-01-docker-release-tag-trigger.md
j4n e040a902db ci: auto-trigger docker build on release tag push
docker-dispatch.yaml previously only fired on push to main and manual
workflow_dispatch, so tagging 1.11.0 did not build the release image.
This change adds matching of X.Y.Z tag.
2026-05-18 11:49:51 +02:00

Plan: Auto-build docker image on relay version tags + fix release-tag CI handling

Context

When relay 1.11.0 was tagged on 2026-05-15, the docker build had to be manually triggered via workflow_dispatch because the dispatch workflow in the relay repo only fires on push to main. The manual build (run 25957678788) did produce the 1.11.0 image, but the downstream cmlxc test job failed because cmlxc's ref-handling assumes branches and runs git reset --hard origin/<ref>, which is meaningless for tags. On top of that, the GHCR cleanup policy does not protect full-semver tags, so a release image like 1.11.0 could eventually be pruned as more SHA builds accumulate.

Four changes are needed across three repos:

  1. relay: add a tag-push trigger to docker-dispatch.yaml (strict semver).
  2. docker: re-introduce minor (X.Y) and latest tags in docker-ci.yaml (commit 4e77c3a that removed them was never shipped).
  3. docker: harden cleanup.yaml so release tags are never pruned and tag deletes don't fire the branch-cleanup path.
  4. cmlxc (j4n/docker-support branch): teach the docker driver to handle tag refs in addition to branches and SHAs.

Result: pushing an annotated tag like 1.11.0 per RELEASE.md will automatically build, push, and integration-test the release image without human intervention.

Branch setup

  • relay (/work): already on j4n/docker-tag-trigger — commit here.
  • docker (/work/docker): currently on main. Create j4n/docker-tag-trigger (matching the relay branch name) and commit the docker-ci + cleanup changes there. Open PR against chatmail/docker:main.
  • cmlxc (/work/cmlxc): already on j4n/docker-support — commit the tag-handling fixes onto that existing feature branch (which docker-ci.yaml:208 already points at via the @j4n/docker-support ref, so no workflow edit needed to pick them up).

All three branches should be merged in this order: cmlxc first (so the test job has a working ref-handling implementation), then docker, then relay (so the dispatch fires into a docker main that already builds correctly).

Repo 1: relay (/work)

File: /work/.github/workflows/docker-dispatch.yaml

Add a tag filter to the push trigger. Tags follow X.Y.Z (no v prefix) per RELEASE.md. The existing permissions: {}, payload, and action pin stay as-is.

on:
  push:
    branches: [main]
    tags: ['[0-9]+.[0-9]+.[0-9]+']
  workflow_dispatch:

No other changes — github.ref_name for a tag push is the bare tag (e.g. 1.11.0), which is exactly what the docker repo's semver regex at docker-ci.yaml:129 already matches.
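To make the intent of the filter concrete, here is a rough Python sketch of which ref names a strict X.Y.Z pattern is meant to accept. It uses an anchored regular expression as a stand-in for the GitHub Actions glob filter (the two syntaxes differ, so this is only an approximation), and `is_release_tag` is a hypothetical name that exists in none of the three repos.

```python
import re

# Anchored regex approximating the workflow tag filter '[0-9]+.[0-9]+.[0-9]+'.
# Assumption: the filter is matched against the complete tag name.
RELEASE_TAG = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def is_release_tag(ref_name: str) -> bool:
    """Return True for bare X.Y.Z tags and False for everything else."""
    return RELEASE_TAG.fullmatch(ref_name) is not None
```

Under this reading, 1.11.0 triggers the dispatch, while v1.11.0, 1.11, and 1.11.0-rc1 do not.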

Repo 2: docker (/work/docker)

File: /work/docker/.github/workflows/docker-ci.yaml

Re-introduce both the minor-version tag (X.Y) and the latest tag for semver releases. Commit 4e77c3a (which removed them) was never shipped to a published release, so this simply restores the original behavior, with a guard that keeps a bugfix release on an older minor line from clobbering latest.

In the Compute build metadata step (around lines 117-133), inside the existing if [[ "${RELAY_REF}" =~ ^v?([0-9]+)\.([0-9]+)\.([0-9]+)$ ]] branch:

  • Always append ${IMAGE}:X.Y.Z (existing) and ${IMAGE}:X.Y.
  • Append ${IMAGE}:latest only when this release is the highest semver published so far. Use gh api to check existing tags — this prevents a back-ported patch on an older minor line from clobbering latest.
if [[ "${RELAY_REF}" =~ ^v?([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
  MAJOR="${BASH_REMATCH[1]}"
  MINOR="${BASH_REMATCH[2]}"
  PATCH="${BASH_REMATCH[3]}"
  VERSION="${MAJOR}.${MINOR}.${PATCH}"
  MINOR_TAG="${MAJOR}.${MINOR}"
  TAGS="${TAGS}"$'\n'"${IMAGE}:${VERSION}"$'\n'"${IMAGE}:${MINOR_TAG}"

  HIGHEST=$(gh api "orgs/chatmail/packages/container/docker/versions?per_page=100" \
    --jq '[.[].metadata.container.tags[]
      | select(test("^[0-9]+\\.[0-9]+\\.[0-9]+$"))] | sort_by(
        split(".") | map(tonumber)) | last // empty' 2>/dev/null) || true
  if [ -z "$HIGHEST" ] || \
     [ "$(printf '%s\n%s\n' "$HIGHEST" "$VERSION" | sort -V | tail -1)" = "$VERSION" ]; then
    TAGS="${TAGS}"$'\n'"${IMAGE}:latest"
  fi
fi

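This requires gh on the runner (already used elsewhere in the file at line 141) and a token in the step's environment. gh does not pick up the workflow token on its own, so the step must set GH_TOKEN (for example from secrets.GITHUB_TOKEN) if it is not already set at job level. No new permissions are needed beyond the existing packages: read on that token.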

Note: the minor tag (X.Y) intentionally moves forward to point at the newest patch in that minor line. This matches the pre-4e77c3a behavior and the existing cleanup regex which already protects \d+\.\d+.
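The tagging rules above can be restated as a small Python sketch, which may be easier to reason about (and to unit-test) than the shell version. This is not the workflow code: `release_tags` and its parameters are hypothetical, and the list of published tags is passed in instead of being fetched with gh api.

```python
import re
from typing import Iterable, List

SEMVER = re.compile(r"v?([0-9]+)\.([0-9]+)\.([0-9]+)")
STRICT_SEMVER = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def release_tags(image: str, relay_ref: str, published: Iterable[str]) -> List[str]:
    """Return the extra image tags for a semver relay_ref (empty for non-releases)."""
    match = SEMVER.fullmatch(relay_ref)
    if match is None:
        return []
    version = tuple(int(part) for part in match.groups())
    major, minor, patch = version
    tags = [f"{image}:{major}.{minor}.{patch}", f"{image}:{major}.{minor}"]

    # Only strict X.Y.Z tags among the published ones take part in the comparison.
    existing = [
        tuple(int(part) for part in tag.split("."))
        for tag in published
        if STRICT_SEMVER.fullmatch(tag)
    ]
    # latest moves only when nothing already published is newer than this release.
    if not existing or version >= max(existing):
        tags.append(f"{image}:latest")
    return tags
```

Comparing integer tuples avoids the string-ordering trap where 1.9.0 would sort after 1.10.0. With 1.11.0 already published, a later 1.10.3 build gets the 1.10.3 and 1.10 tags but leaves latest alone.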

File: /work/docker/.github/workflows/cleanup.yaml

Two edits, both in the existing file:

  1. Line 50 (prune-old-sha ignore regex): extend to protect full semver.

    ignore-versions: '^(main|latest|\d+\.\d+|\d+\.\d+\.\d+)$'
    

    Keeps the existing main, latest, and minor-version protections (needed again now that docker-ci re-emits the latest and X.Y tags), and adds X.Y.Z so that release tags are never targeted by the keep-30 SHA pruning rule.

  2. Line 55 / cleanup-branch job: scope the delete trigger to branches, not tags. Otherwise deleting a release tag would delete the image tagged with that release.

    Change the if: condition:

    if: github.event_name == 'delete' && github.event.ref_type == 'branch'
    

No change to prune-untagged.
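As a quick sanity check of both edits, the following Python sketch evaluates the extended ignore pattern and the tightened delete condition against a few sample inputs. It assumes the cleanup action applies ignore-versions as an ordinary regex against each tag name; `is_protected` and `should_cleanup_branch` are hypothetical helpers written only for illustration.

```python
import re

# The extended ignore-versions pattern proposed for cleanup.yaml.
IGNORE_VERSIONS = re.compile(r"^(main|latest|\d+\.\d+|\d+\.\d+\.\d+)$")


def is_protected(tag: str) -> bool:
    """True when the tag is excluded from the keep-30 SHA pruning rule."""
    return IGNORE_VERSIONS.match(tag) is not None


def should_cleanup_branch(event_name: str, ref_type: str) -> bool:
    """Mirror of the proposed `if:` condition on the cleanup-branch job."""
    return event_name == "delete" and ref_type == "branch"
```

Here 1.11.0, 1.11, latest, and main are protected and a tag like sha-abc1234 is not. Deleting a tag no longer reaches the branch-cleanup path.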

Repo 3: cmlxc (/work/cmlxc, branch j4n/docker-support)

The RELAY_REF=1.11.0 cmlxc test-cmdeploy dock0 step failed in run 25957678788 because git operations assume a branch with a remote tracking ref. Fix three sites identified by the explore agent. Reuse a single small helper rather than duplicating logic.

New helper

Add a helper in /work/cmlxc/src/cmlxc/driver_base.py near parse_source() (around line 4464) that classifies a ref:

def classify_ref(checkout_dir: str, ref: str) -> str:
    """Return "sha", "tag", or "branch". Run inside the worktree.

    - 40 hex characters => sha
    - git show-ref --tags --verify refs/tags/<ref> succeeds => tag
    - otherwise => branch
    """
    raise NotImplementedError  # stub: implement the three rules above

Implementation must be tolerant of shallow clones and call git via the existing subprocess wrappers used in the file.
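One possible shape for the helper is sketched below. It is an illustration under stated assumptions, not the cmlxc implementation: it calls git through plain `subprocess` because the file's own wrappers are not shown here, and the injectable `git_ok` parameter is a hypothetical addition that lets the classification be tested without a real repository. The sketch only sees tags already present in the checkout, so a shallow clone that has not fetched tags would classify a tag as a branch.

```python
import re
import subprocess
from typing import Callable

FULL_SHA = re.compile(r"[0-9a-fA-F]{40}")


def _git_ok(checkout_dir: str, *args: str) -> bool:
    """Run git inside checkout_dir and report whether it exited with status 0."""
    result = subprocess.run(
        ["git", "-C", checkout_dir, *args],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def classify_ref(
    checkout_dir: str,
    ref: str,
    git_ok: Callable[..., bool] = _git_ok,
) -> str:
    """Return "sha", "tag", or "branch" for ref, following the plan's three rules."""
    if FULL_SHA.fullmatch(ref):
        return "sha"
    # --verify requires the exact ref path; --quiet suppresses output on a miss.
    if git_ok(checkout_dir, "show-ref", "--verify", "--quiet", f"refs/tags/{ref}"):
        return "tag"
    return "branch"
```

A test for the helper can pass a fake `git_ok` that reports which tags exist, which covers the sha / tag / branch cases without touching the filesystem.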

Site 1: prepare_source_in_builder() in driver_docker.py:101-134

Replace the unconditional git reset --hard -q origin/{ref} at line 123 with a branch-only path. For a tag, run git fetch origin --tags and git checkout -q refs/tags/{ref}; do not reset against origin/. For a SHA, keep current SHA logic.
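The per-kind behavior described for this site could be sketched as a small command builder. The function name `checkout_commands` is hypothetical, and the SHA branch is only a placeholder, since the existing SHA logic in prepare_source_in_builder() is not reproduced in this plan.

```python
import shlex
from typing import List


def checkout_commands(kind: str, ref: str) -> List[str]:
    """Return the git commands to bring the worktree to ref, by ref kind."""
    quoted = shlex.quote(ref)
    if kind == "tag":
        # Tags have no origin/<ref> counterpart: fetch tags, then check out the tag ref.
        return [
            "git fetch origin --tags",
            "git checkout -q " + shlex.quote(f"refs/tags/{ref}"),
        ]
    if kind == "branch":
        # Only branches are reset against their remote-tracking ref.
        return ["git reset --hard -q " + shlex.quote(f"origin/{ref}")]
    if kind == "sha":
        # Placeholder assumption: the real code keeps its current SHA handling.
        return [f"git checkout -q {quoted}"]
    raise ValueError(f"unknown ref kind: {kind!r}")
```

Keeping the decision in one function makes it easy to assert that the tag path never produces an origin/<ref> reset.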

Site 2: init_builder() in driver_base.py:189-210

The reset_cmd block currently sets reset_cmd only when not is_sha. Extend the check so tags also skip the reset:

kind = classify_ref(checkout, source.ref)
reset_cmd = ""
fetch_cmd = ""  # default; assumes fetch_cmd is not already set earlier in init_builder()
if kind == "branch":
    reset_cmd = f"git reset --hard -q origin/{source.ref}"
elif kind == "tag":
    fetch_cmd = "git fetch origin --tags"
    # checkout step below uses refs/tags/{ref}

Drop the 2>/dev/null || true masking — surface real failures.

Site 3: run_tests() in driver_docker.py:1060-1097

The SHA-prefix check at line 1088 (if current_sha != ref and not ref.startswith(current_sha):) treats a tag as a branch and re-checks out via origin/<tag>. Use classify_ref before that branch:

  • if kind == "tag": resolve to its commit (git rev-parse refs/tags/<ref>^{commit}) and compare against current_sha; if they match, skip the re-checkout.
  • if kind == "branch": existing behavior.
  • if kind == "sha": existing behavior.
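The three cases above can be sketched as a single predicate. `needs_recheckout` and the `resolve_tag` callback are hypothetical names. In the real code the callback would wrap git rev-parse refs/tags/<ref>^{commit}. The sketch also assumes current_sha may be an abbreviated SHA, which is what the existing startswith check suggests.

```python
from typing import Callable


def needs_recheckout(
    kind: str,
    ref: str,
    current_sha: str,
    resolve_tag: Callable[[str], str],
) -> bool:
    """Decide whether run_tests() has to re-checkout the source for ref."""
    if kind == "tag":
        # Resolve the tag to its commit and compare with what is checked out.
        tag_commit = resolve_tag(ref)
        matches = bool(current_sha) and tag_commit.startswith(current_sha)
        return not matches
    # Branch and SHA refs keep the existing check quoted from run_tests().
    return current_sha != ref and not ref.startswith(current_sha)
```

When the tag already resolves to the checked-out commit, the function returns False and the origin/<tag> path is never taken.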

Verification

End-to-end test of the dispatch + build + test pipeline:

  1. If the "Strict semver only" filter has been loosened, push a throwaway pre-release-style tag (0.0.1-test-tagci) on a fork. With the strict filter as planned, that tag does not match, so instead:
  2. Do a dry run of a real point release: in a fork, tag a no-op commit on a j4n/release-pipeline-test branch with a release-style name such as 1.99.0. In a fork, the relay repo guard if: github.repository == 'chatmail/relay' will skip the dispatch, so either temporarily allow the fork by editing the guard locally, or verify on chatmail/relay at the next real release.
  3. Observe in the docker repo:
    • gh run watch --repo chatmail/docker $(gh run list --repo chatmail/docker --limit 1 --json databaseId --jq '.[0].databaseId')
    • the dispatched run's build job pushes tags sha-<short>, 1.99.0, and 1.99, plus latest when 1.99.0 is the highest semver published so far.
    • the test job (using updated cmlxc) succeeds end-to-end.
  4. Confirm cleanup safety:
    • gh workflow run --repo chatmail/docker cleanup.yaml
    • gh api orgs/chatmail/packages/container/docker/versions --jq '.[].metadata.container.tags' | grep -E '^\[.*"1\.11\.0".*\]' still returns the 1.11.0 image after cleanup runs.
  5. cmlxc unit-level: in /work/cmlxc, run the existing test suite (uv run pytest) and add a small test for classify_ref covering sha / tag / branch inputs.

Critical files

  • /work/.github/workflows/docker-dispatch.yaml
  • /work/docker/.github/workflows/docker-ci.yaml
  • /work/docker/.github/workflows/cleanup.yaml
  • /work/cmlxc/src/cmlxc/driver_base.py
  • /work/cmlxc/src/cmlxc/driver_docker.py

Out of scope

  • Removing the two TODO: revert to @main once cmlxc docker support is merged lines in docker-ci.yaml:207-215 — that happens when the cmlxc fixes land on cmlxc main, as a follow-up.