6 Commits

11fd87411d fix: bake version into Docker image at build time
All checks were successful
Release / build-and-push (push) Successful in 1m18s
CI / lint-and-test (push) Successful in 20s
- Add VERSION build arg to Dockerfile
- Pass --build-arg VERSION in release workflow
- Remove VERSION env override from docker-compose files
- Version is now immutable inside the image, no runtime env var needed
2026-04-20 17:24:20 +02:00
6a80bf4eb9 fix: read version from env var so it works inside Docker
All checks were successful
Release / build-and-push (push) Successful in 28s
CI / lint-and-test (push) Successful in 21s
2026-04-20 17:15:55 +02:00
5e02f5a402 docs: add v1.2.5 release notes
All checks were successful
CI / lint-and-test (push) Successful in 25s
2026-04-20 17:12:43 +02:00
0c3e5ec57b feat: add version display to frontend and /api/version endpoint (v1.2.5)
All checks were successful
Release / build-and-push (push) Successful in 40s
CI / lint-and-test (push) Successful in 22s
- Add GET /api/version endpoint that reads VERSION file
- Frontend fetches version on init and displays it as a badge in the header
- Add version-badge CSS styling
- Update docker-compose.yml comment to v1.2.5
2026-04-20 17:09:02 +02:00
a255be93fe feat: aggregate large event sets before sending to LLM
All checks were successful
CI / lint-and-test (push) Successful in 18s
Release / build-and-push (push) Successful in 29s
When a query matches >50 events, the LLM now receives:
- Aggregated counts by service, operation, result, and actor
- A list of failures (up to 10)
- The 50 most recent raw events as samples

This scales to thousands of events without blowing the token budget
or losing signal. The LLM gets a bird's-eye view plus concrete examples.

Also updates the system prompt to handle both individual event lists
and aggregated overviews correctly.
2026-04-20 16:23:55 +02:00
cfe9397cc5 feat: raise LLM event limit to 200 and show total count awareness
All checks were successful
CI / lint-and-test (push) Successful in 23s
Release / build-and-push (push) Successful in 27s
- Bump LLM_MAX_EVENTS default from 50 to 200
- Add total_matched count to /api/ask response
- Include 'Showing X of Y total' header in LLM prompt so the model
  knows when its view is a subset and avoids false certainty
- Update system prompt to instruct acknowledging scale when truncated
- Update test mocks to accept new total parameter
2026-04-20 16:13:52 +02:00
12 changed files with 187 additions and 16 deletions


@@ -42,6 +42,6 @@ ALERTS_ENABLED=false
LLM_API_KEY=
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini
-LLM_MAX_EVENTS=50
+LLM_MAX_EVENTS=200
LLM_TIMEOUT_SECONDS=30
LLM_API_VERSION=


@@ -16,7 +16,7 @@ jobs:
run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login git.cqre.net -u ${{ github.actor }} --password-stdin 2>&1 | grep -v "WARNING! Your credentials are stored unencrypted"
- name: Build Docker image
run: docker build ./backend --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
run: docker build ./backend --build-arg VERSION=${{ gitea.ref_name }} --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
- name: Push Docker image
run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}

RELEASE_NOTES_v1.2.5.md (new file, 78 lines)

@@ -0,0 +1,78 @@
# AOC v1.2.5 Release Notes
**Release date:** 2026-04-20
---
## What's new
### Natural language query (`/api/ask`)
Ask questions in plain English and get AI-generated answers backed by your audit logs.
- **Regex-based parsing** extracts time ranges (`last 3 days`, `yesterday`, `today`) and entities (`device ABC123`, `user bob@example.com`) without calling an LLM.
- **AI narrative summarisation** via any OpenAI-compatible API (OpenAI, Azure OpenAI, MS Foundry, Ollama).
- **Graceful fallback** when no LLM is configured — returns a structured bullet list with a clear error banner.
- **Cited evidence** — every answer includes the raw events that back it up.
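
The deterministic parsing layer itself does not appear in the diffs below, so here is a minimal sketch of the idea. The helper name and the exact patterns are illustrative assumptions, not the shipped parser:

```python
import re
from datetime import datetime, timedelta, timezone

# Illustrative patterns; the shipped parser covers more phrasings.
LAST_N_DAYS = re.compile(r"last\s+(\d+)\s+days?", re.IGNORECASE)
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
DEVICE = re.compile(r"device\s+([A-Za-z0-9_-]+)", re.IGNORECASE)

def parse_question(question: str) -> dict:
    """Extract a time range and entities from a question, no LLM involved."""
    now = datetime.now(timezone.utc)
    start = None
    if m := LAST_N_DAYS.search(question):
        start = now - timedelta(days=int(m.group(1)))
    elif "yesterday" in question.lower():
        # End bound omitted for brevity; a real parser would also cap at midnight today.
        start = (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    elif "today" in question.lower():
        start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    email = EMAIL.search(question)
    device = DEVICE.search(question)
    return {
        "start": start,  # None means no explicit time range
        "user": email.group(0) if email else None,
        "device": device.group(1) if device else None,
    }
```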
### Filter-aware queries
The ask endpoint now respects the filter panel. When you set **Service = Exchange** and **Result = failure**, then ask *"What happened to device X?"*, the LLM only sees failed Exchange events for that device.
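
Conceptually, the endpoint folds the parsed question and the active panel filters into a single Mongo query before anything reaches the LLM. A hedged sketch (field names such as `service`, `result`, and `actor_display` match the backend diffs further down; the merge logic itself is illustrative):

```python
def merge_filters(parsed: dict, panel: dict) -> dict:
    """Combine parsed-question constraints with the active filter panel."""
    query: dict = {}
    if panel.get("selectedServices"):
        query["service"] = {"$in": panel["selectedServices"]}
    if panel.get("result"):
        query["result"] = panel["result"]
    if parsed.get("user"):
        query["actor_display"] = parsed["user"]
    if parsed.get("start"):
        query["timestamp"] = {"$gte": parsed["start"].isoformat()}
    return query
```

The result is one query that every retrieved event must satisfy, so the answer can never cite events outside the active filters.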
### Scales to thousands of events
For large result sets (>50 events), the LLM receives an **aggregated overview** instead of a raw dump:
- Counts by service, action, result, and actor
- Failure highlights
- The 50 most recent raw events as samples

This keeps token usage low while preserving accuracy.
### Azure OpenAI / MS Foundry support
- Automatic `api-key` header detection for Azure endpoints.
- `LLM_API_VERSION` config for Azure `api-version` query parameters.
- `max_completion_tokens` support for newer model deployments.
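
Only the signature of `_build_chat_url(base_url, api_version)` is visible in the diffs below, so the following is a plausible sketch of the endpoint and header selection; in particular, detecting Azure by host suffix is an assumption:

```python
def build_chat_url(base_url: str, api_version: str) -> str:
    # Azure deployments take the API version as a query parameter;
    # plain OpenAI-compatible endpoints use the bare path.
    url = base_url.rstrip("/") + "/chat/completions"
    if api_version:
        url += f"?api-version={api_version}"
    return url

def auth_headers(base_url: str, api_key: str) -> dict:
    # Assumption: Azure endpoints are recognised by their host suffix.
    if ".openai.azure.com" in base_url:
        return {"api-key": api_key}
    return {"Authorization": f"Bearer {api_key}"}
```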
### Version display
- `GET /api/version` endpoint reads the `VERSION` file.
- Frontend shows a version badge in the header (e.g., **1.2.5**).
### Production hardening (from v1.1.0)
- Dockerfile runs as non-root user with Gunicorn + Uvicorn workers.
- `docker-compose.prod.yml` with internal-only MongoDB, health checks, and nginx reverse proxy.
- Security headers (`X-Frame-Options`, `X-Content-Type-Options`, etc.).
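
The headers may well be set at the nginx layer; purely as an illustration, applying the two named above in a FastAPI middleware could look like this (hypothetical, not necessarily how this repo does it):

```python
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_security_headers(request: Request, call_next):
    # Only the two headers named above; the real set is likely larger
    # and may live in the nginx config instead.
    response = await call_next(request)
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-Content-Type-Options"] = "nosniff"
    return response
```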
---
## Configuration
Add to your `.env`:
```bash
# Required for AI narrative summarisation
LLM_API_KEY=your-key
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini
LLM_MAX_EVENTS=200
LLM_TIMEOUT_SECONDS=30
LLM_API_VERSION= # set for Azure OpenAI, e.g. 2024-12-01-preview
```
For Azure OpenAI / MS Foundry:
```bash
LLM_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
LLM_API_KEY=your-azure-key
LLM_API_VERSION=2024-12-01-preview
LLM_MODEL=your-deployment-name
```
---
## Upgrade notes
No breaking changes. Existing `/api/events`, filters, pagination, tags, and comments work unchanged.
---
## Docker image
```
git.cqre.net/cqrenet/aoc-backend:v1.2.5
```


@@ -1 +1 @@
-1.2.0
+1.2.5


@@ -1,5 +1,9 @@
FROM python:3.11-slim
+# Bake the version into the image at build time
+ARG VERSION=unknown
+ENV VERSION=${VERSION}
# Security: run as non-root
RUN groupadd -r aoc && useradd -r -g aoc aoc


@@ -46,7 +46,7 @@ class Settings(BaseSettings):
     LLM_API_KEY: str = ""
     LLM_BASE_URL: str = "https://api.openai.com/v1"
     LLM_MODEL: str = "gpt-4o-mini"
-    LLM_MAX_EVENTS: int = 50
+    LLM_MAX_EVENTS: int = 200
     LLM_TIMEOUT_SECONDS: int = 30
     LLM_API_VERSION: str = ""  # e.g. 2025-01-01-preview for Azure OpenAI


@@ -12,7 +12,7 @@
<div class="page" x-data="aocApp()" x-init="initApp()">
<header class="hero">
<div>
<p class="eyebrow">Admin Operations Center</p>
<p class="eyebrow">Admin Operations Center <span class="version-badge" x-text="appVersion"></span></p>
<h1>Directory Audit Explorer</h1>
<p class="lede">Filter Microsoft Entra audit events by user, app, time, action, and action type.</p>
</div>
@@ -243,6 +243,7 @@
actor: '', selectedServices: [], search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '',
},
options: { actors: [], services: [], operations: [], results: [] },
appVersion: '',
askQuestionText: '',
askLoading: false,
askAnswer: '',
@@ -252,6 +253,7 @@
askLlmError: '',
async initApp() {
await this.loadVersion();
await this.initAuth();
if (!this.authConfig?.auth_enabled || this.accessToken) {
await this.loadFilterOptions();
@@ -260,6 +262,16 @@
}
},
async loadVersion() {
try {
const res = await fetch('/api/version');
if (res.ok) {
const body = await res.json();
this.appVersion = body.version || '';
}
} catch {}
},
authHeader() {
return this.accessToken ? { Authorization: `Bearer ${this.accessToken}` } : {};
},


@@ -433,6 +433,20 @@ input {
   color: var(--muted);
 }
+.version-badge {
+  display: inline-block;
+  margin-left: 8px;
+  padding: 2px 8px;
+  border-radius: 999px;
+  background: rgba(125, 211, 252, 0.15);
+  border: 1px solid rgba(125, 211, 252, 0.3);
+  color: var(--accent-strong);
+  font-size: 11px;
+  font-weight: 600;
+  letter-spacing: 0.05em;
+  vertical-align: middle;
+}
 .ask-events {
   margin-bottom: 14px;
 }


@@ -134,6 +134,13 @@ async def metrics():
     return Response(content=prometheus_metrics(), media_type="text/plain")

+@app.get("/api/version")
+async def version():
+    import os
+    return {"version": os.environ.get("VERSION", "unknown")}
+
 frontend_dir = Path(__file__).parent / "frontend"
 app.mount("/", StaticFiles(directory=frontend_dir, html=True), name="frontend")


@@ -168,22 +168,76 @@ def _build_event_query(
 # ---------------------------------------------------------------------------

 _SYSTEM_PROMPT = """You are an IT operations assistant. An administrator has asked a question about audit logs.
-Your job is to read the list of audit events below and write a concise, plain-language answer.
+Your job is to read the data below and write a concise, plain-language answer.
+
+The input may be either:
+- A small list of individual audit events (numbered Event #1, #2, etc.), or
+- An aggregated overview with counts by service, action, result, and actor, plus sample events.

 Rules:
 - Assume the reader is a non-expert admin.
-- Group related events together and tell a coherent story.
+- For aggregated overviews: summarise the scale, top patterns, and highlight anomalies or failures.
+- For small event lists: group related events together and tell a coherent story.
 - Highlight anything unusual, failed actions, or privilege escalations.
 - Reference specific event numbers (e.g., "Event #3") when making claims so the user can verify.
+- If the data is an aggregated subset of a larger result set, acknowledge the scale (e.g., "847 events occurred — the top pattern was...").
 - If there are no events, say so clearly.
 - Keep the answer under 300 words.
-- Do not invent events that are not in the list.
+- Do not invent events or patterns that are not supported by the data.
 """
-def _format_events_for_llm(events: list[dict]) -> str:
+def _aggregate_counts(events: list[dict]) -> dict:
+    """Build lightweight aggregation tables for large result sets."""
+    from collections import Counter
+
+    svc_counts = Counter(e.get("service") or "Unknown" for e in events)
+    op_counts = Counter(e.get("operation") or "Unknown" for e in events)
+    result_counts = Counter(e.get("result") or "Unknown" for e in events)
+    actor_counts = Counter(e.get("actor_display") or "Unknown" for e in events)
+    return {
+        "services": svc_counts.most_common(10),
+        "operations": op_counts.most_common(10),
+        "results": result_counts.most_common(5),
+        "actors": actor_counts.most_common(10),
+    }
+
+
+def _format_events_for_llm(events: list[dict], total: int | None = None) -> str:
     lines = []
-    for i, e in enumerate(events, 1):
+
+    # If we have a large result set, send aggregation + samples instead of raw dump
+    if total is not None and total > len(events) and len(events) >= 50:
+        lines.append(f"Result set overview: {total} total events (showing the {len(events)} most recent).\n")
+        agg = _aggregate_counts(events)
+        lines.append("Breakdown by service:")
+        for svc, cnt in agg["services"]:
+            lines.append(f" {svc}: {cnt}")
+        lines.append("\nBreakdown by action:")
+        for op, cnt in agg["operations"]:
+            lines.append(f" {op}: {cnt}")
+        lines.append("\nBreakdown by result:")
+        for res, cnt in agg["results"]:
+            lines.append(f" {res}: {cnt}")
+        lines.append("\nTop actors:")
+        for actor, cnt in agg["actors"]:
+            lines.append(f" {actor}: {cnt}")
+
+        # Include failures and a few recent samples
+        failures = [e for e in events if str(e.get("result") or "").lower() in ("failure", "failed")]
+        if failures:
+            lines.append(f"\nFailures ({len(failures)}):")
+            for e in failures[:10]:
+                ts = e.get("timestamp", "?")[:16].replace("T", " ")
+                op = e.get("operation", "unknown")
+                actor = e.get("actor_display", "unknown")
+                lines.append(f" {ts}: {op} by {actor}")
+        lines.append("\nMost recent sample events:")
+    else:
+        if total is not None and total > len(events):
+            lines.append(f"Showing {len(events)} of {total} total matching events (most recent first):\n")
+
+    # Always include the first N raw events as detail (up to 50)
+    for i, e in enumerate(events[:50], 1):
         ts = e.get("timestamp") or "unknown time"
         op = e.get("operation") or "unknown action"
         actor = e.get("actor_display") or "unknown actor"
@@ -213,11 +267,11 @@ def _build_chat_url(base_url: str, api_version: str) -> str:
     return url

-async def _call_llm(question: str, events: list[dict]) -> str:
+async def _call_llm(question: str, events: list[dict], total: int | None = None) -> str:
     if not LLM_API_KEY:
         raise RuntimeError("LLM_API_KEY not configured")

-    context = _format_events_for_llm(events)
+    context = _format_events_for_llm(events, total=total)
     messages = [
         {"role": "system", "content": _SYSTEM_PROMPT},
         {
@@ -298,6 +352,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
     )
     try:
+        total = events_collection.count_documents(query)
         cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(LLM_MAX_EVENTS)
         events = list(cursor)
     except Exception as exc:
@@ -325,7 +380,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
else:
try:
answer = await _call_llm(question, events)
answer = await _call_llm(question, events, total=total)
llm_used = True
except Exception as exc:
llm_error = f"LLM call failed: {exc}"
@@ -359,6 +414,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
"start": start,
"end": end,
"event_count": len(events),
"total_matched": total,
"mongo_query": json.dumps(query, default=str),
},
llm_used=llm_used,


@@ -236,7 +236,7 @@ class TestAskEndpoint:
             }
         )

-        async def fake_llm(question, events):
+        async def fake_llm(question, events, total=None):
             return "The device had a failed wipe attempt."

         monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
@@ -265,7 +265,7 @@ class TestAskEndpoint:
             }
         )

-        async def failing_llm(question, events):
+        async def failing_llm(question, events, total=None):
             raise RuntimeError("LLM service down")

         monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")


@@ -14,7 +14,7 @@ services:
   backend:
     build: ./backend
     # For production, use the pre-built image instead:
-    # image: git.cqre.net/cqrenet/aoc-backend:v1.1.0
+    # image: git.cqre.net/cqrenet/aoc-backend:v1.2.5
     container_name: aoc-backend
     restart: always
     env_file: