Release v1.3.0: AI feature flag and MCP server

- Add AI_FEATURES_ENABLED config flag to gate AI/natural-language features
- Conditionally register /api/ask router based on AI_FEATURES_ENABLED
- Add GET /api/config/features endpoint for frontend feature detection
- Update frontend to hide Ask panel when AI features are disabled
- Implement standalone MCP server (backend/mcp_server.py) with tools:
  * search_events, get_event, get_summary, ask
- Add mcp dependency to requirements.txt
- Update .env.example, AGENTS.md, and ROADMAP.md
- Bump VERSION to 1.3.0
feat: intent-aware querying + smart sampling for large audit datasets
2026-04-20 18:11:26 +02:00 · 2026-04-20 17:41:21 +02:00 · 2026-04-20 17:31:27 +02:00 · 2026-04-20 17:29:10 +02:00 · 2026-04-20 17:24:20 +02:00 · 2026-04-20 17:15:55 +02:00
20 changed files with 1029 additions and 53 deletions
--- a/.env.example
+++ b/.env.example
@@ -34,6 +34,10 @@ SIEM_WEBHOOK_URL=
 # Optional: enable rule-based alerting during ingestion
 ALERTS_ENABLED=false

+# Optional: enable AI/natural-language features (/api/ask, MCP server)
+# Set to false to completely disable AI endpoints and UI elements
+AI_FEATURES_ENABLED=true
+
 # Optional: LLM configuration for natural language querying (/api/ask)
 # Supports any OpenAI-compatible API (OpenAI, Azure OpenAI, Ollama, etc.)
 # For Azure OpenAI / MS Foundry, set BASE_URL to your deployment endpoint
@@ -42,6 +46,6 @@ ALERTS_ENABLED=false
 LLM_API_KEY=
 LLM_BASE_URL=https://api.openai.com/v1
 LLM_MODEL=gpt-4o-mini
-LLM_MAX_EVENTS=50
+LLM_MAX_EVENTS=200
 LLM_TIMEOUT_SECONDS=30
 LLM_API_VERSION=
--- a/.gitea/workflows/release.yml
+++ b/.gitea/workflows/release.yml
@@ -16,7 +16,13 @@ jobs:
        run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login git.cqre.net -u ${{ github.actor }} --password-stdin 2>&1 | grep -v "WARNING! Your credentials are stored unencrypted"

      - name: Build Docker image
-        run: docker build ./backend --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
+        run: docker build ./backend --build-arg VERSION=${{ gitea.ref_name }} --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}

-      - name: Push Docker image
+      - name: Tag as latest
+        run: docker tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} git.cqre.net/cqrenet/aoc-backend:latest
+
+      - name: Push version tag
        run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
+
+      - name: Push latest tag
+        run: docker push git.cqre.net/cqrenet/aoc-backend:latest
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -6,28 +6,34 @@ AOC is a FastAPI microservice that ingests Microsoft Entra (Azure AD) audit logs

 ## Technology Stack

-- **Runtime**: Python 3.11
-- **Web Framework**: FastAPI + Uvicorn
+- **Runtime**: Python 3.11 (3.14 for tests)
+- **Web Framework**: FastAPI + Uvicorn (Gunicorn in production)
 - **Database**: MongoDB (PyMongo)
-- **Frontend**: Vanilla HTML/CSS/JS (served as static files from `backend/frontend/`)
+- **Frontend**: Alpine.js + HTML/CSS (served as static files from `backend/frontend/`)
 - **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend)
-- **External APIs**: Microsoft Graph API, Office 365 Management Activity API
-- **Deployment**: Docker Compose
+- **External APIs**: Microsoft Graph API, Office 365 Management Activity API, Azure OpenAI / MS Foundry
+- **Deployment**: Docker Compose (dev), Docker Compose + nginx (prod)
+- **CI/CD**: Gitea Actions (lint + test + Docker build + release)

 ## Project Structure

 ```
 backend/
  main.py              # FastAPI app, router registration, background periodic fetch
-  config.py            # Environment-based configuration (loads .env)
+  config.py            # Pydantic Settings configuration (loads .env)
  database.py          # MongoClient setup (db = micro_soc, collection = events)
  auth.py              # OIDC Bearer token validation, JWKS caching, role/group checks
  requirements.txt     # Python dependencies
-  Dockerfile           # python:3.11-slim image
+  Dockerfile           # python:3.11-slim image, non-root user, version baked at build
+  mcp_server.py        # Standalone MCP server for Claude Desktop / Cursor integration
  routes/
    fetch.py           # GET /api/fetch-audit-logs, run_fetch()
-    events.py          # GET /api/events, GET /api/filter-options
-    config.py          # GET /api/config/auth
+    events.py          # GET /api/events, GET /api/filter-options, PATCH tags, POST comments
+    config.py          # GET /api/config/auth, GET /api/config/features
+    ask.py             # POST /api/ask — natural language query with LLM
+    health.py          # GET /health, GET /metrics
+    rules.py           # Rule-based alerting endpoints
+    webhooks.py        # Microsoft Graph change notification webhooks
  graph/
    auth.py            # Client credentials token acquisition for Graph
    audit_logs.py      # Fetch and enrich directory audit logs from Graph
@@ -41,7 +47,7 @@ backend/
  mappings.yml         # User-editable category labels and summary templates
  maintenance.py       # CLI for re-normalization and deduplication of stored events
  frontend/
-    index.html         # Single-page UI with filters, pagination, raw-event modal
+    index.html         # Single-page UI with filters, pagination, ask panel, raw-event modal
    style.css          # Dark-themed stylesheet
 ```

@@ -60,6 +66,9 @@ Key variables:
 - `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists
 - `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler
 - `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB
+- `AI_FEATURES_ENABLED` — set `false` to completely disable AI endpoints and UI (default `true`)
+- `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`, `LLM_MAX_EVENTS`, `LLM_TIMEOUT_SECONDS` — LLM provider settings
+- `LLM_API_VERSION` — required for Azure OpenAI / MS Foundry endpoints

 ## Build and Run Commands

@@ -87,35 +96,81 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
 ## API Endpoints

 - `GET /api/fetch-audit-logs?hours=168` — pulls last N hours (capped at 720 / 30 days) from all sources, normalizes, dedupes, and upserts into MongoDB
-- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and pagination (`page`, `page_size`)
+- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and cursor-based pagination
 - `GET /api/filter-options` — best-effort distinct values for UI dropdowns
 - `GET /api/config/auth` — auth configuration exposed to the frontend
+- `GET /api/config/features` — feature flags (`ai_features_enabled`)
+- `POST /api/ask` — natural language query; returns LLM narrative + referenced events (only when `AI_FEATURES_ENABLED=true`)
+- `GET /health` — liveness probe with DB connectivity
+- `GET /metrics` — Prometheus metrics
+
+## MCP Server
+
+A standalone MCP server (`backend/mcp_server.py`) exposes audit log tools for Claude Desktop, Cursor, and other MCP clients.
+
+Available tools:
+- `search_events` — Search by entity, service, operation, result, time range
+- `get_event` — Retrieve a single event by ID (raw JSON)
+- `get_summary` — Aggregated counts by service, operation, result, actor
+- `ask` — Natural language question (returns recent events + guidance)
+
+**Claude Desktop config** (`~/.config/claude/claude_desktop_config.json`):
+```json
+{
+  "mcpServers": {
+    "aoc": {
+      "command": "python",
+      "args": ["/path/to/aoc/backend/mcp_server.py"],
+      "env": {"MONGO_URI": "mongodb://root:example@localhost:27017/"}
+    }
+  }
+}
+```
+
+The MCP server imports `database.py` directly and does not go through the FastAPI layer, so it reads the same MongoDB database as the API but bypasses auth.
+
+## AI Feature Flag
+
+Set `AI_FEATURES_ENABLED=false` in `.env` to:
+- Prevent the `ask` router from being registered in FastAPI
+- Hide the "Ask a question" panel in the frontend
+- Return `ai_features_enabled: false` from `/api/config/features`
+
+This is intended for the open-core monetization split: core features (ingestion, filtering, search, export) are always available; premium AI features (NLQ, MCP) can be disabled.

 ## Code Conventions

 - Python modules use absolute imports within the `backend/` package (e.g., `from graph.auth import get_access_token`). When running locally, ensure the working directory is `backend/` so these resolve correctly.
-- No formal formatter or linter is configured. Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
-- The frontend is a single HTML file with inline JavaScript. It relies on the MSAL.js CDN (`https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js`).
+- The project uses `ruff` for linting and formatting. Run `ruff check . && ruff format .` before committing.
+- Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
+- The frontend is a single HTML file with inline JavaScript and Alpine.js.

 ## Testing

-There are currently **no automated tests** in this repository. When adding new features or bug fixes, verify behavior manually:
+Tests run with pytest and mongomock (no real MongoDB required):

-1. Start the server (Docker Compose or local uvicorn).
-2. Run a smoke test:
-   ```bash
-   curl http://localhost:8000/api/events
-   curl http://localhost:8000/api/fetch-audit-logs
-   ```
-3. Open http://localhost:8000 in a browser, apply filters, paginate, and click "View raw event".
+```bash
+cd backend
+python -m venv .venv_test
+source .venv_test/bin/activate
+pip install -r requirements.txt
+pytest tests/ -q
+```
+
+When adding new features or bug fixes, add or update tests in `backend/tests/`. The test suite covers:
+- Event normalization and deduplication
+- Auth middleware and token validation
+- API endpoints (`/api/events`, `/api/fetch-audit-logs`, `/api/ask`)
+- NLQ time range extraction, entity extraction, query building

 ## Security Considerations

-- **Secrets**: `CLIENT_SECRET` and other credentials come from `.env`. Never commit `.env`.
+- **Secrets**: `CLIENT_SECRET`, `LLM_API_KEY`, and other credentials come from `.env`. Never commit `.env`.
 - **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer claims. Tokens are decoded without strict signature verification (`jwt.get_unverified_claims`), so the tenant and issuer checks are the primary gate.
 - **Role/Group gating**: Access is allowed if the token’s `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed.
 - **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries.
 - **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls.
+- **MCP server**: The MCP server bypasses auth entirely. Only run it in trusted environments or behind a VPN.

 ## Maintenance and Operations

--- a/RELEASE_NOTES_v1.1.0.md
+++ b/RELEASE_NOTES_v1.1.0.md
@@ -0,0 +1,56 @@
+# AOC v1.1.0 Release Notes
+
+**Release date:** 2026-04-20
+
+## What's new
+
+### Natural language query (`/api/ask`)
+Ask questions in plain English and get AI-generated answers backed by your audit logs.
+
+- **Regex-based parsing** extracts time ranges (`last 3 days`, `yesterday`, `today`) and entities (`device ABC123`, `user bob@example.com`) without calling an LLM — fast and deterministic.
+- **AI narrative summarisation** via any OpenAI-compatible API (OpenAI, Azure OpenAI, MS Foundry, Ollama). The LLM reads the matching events and writes a concise story for non-expert admins.
+- **Graceful fallback** when no LLM is configured — returns a structured bullet list instead of a narrative.
+- **Cited evidence** — every answer includes the raw events that back it up, so admins can verify claims.
+
+### Azure OpenAI / MS Foundry support
+- Automatic `api-key` header detection for Azure endpoints.
+- `LLM_API_VERSION` config for Azure `api-version` query parameters.
+- `max_completion_tokens` support for newer model deployments.
+
+### Production hardening
+- **Dockerfile:** runs as non-root user, uses Gunicorn + Uvicorn workers.
+- **docker-compose.prod.yml:** MongoDB is internal-only (no host port exposure), health checks on all services, nginx reverse proxy with security headers.
+- **nginx config:** gzip, security headers (`X-Frame-Options`, `X-Content-Type-Options`), ready for TLS.
+
+### Frontend
+- New **"Ask a question"** panel at the top of the page.
+- Markdown rendering for LLM answers (bold, italic, code).
+- Orange warning banner when LLM is not configured or fails.
+
+### Tests
+- 29 new tests covering ask parsing, query building, and endpoint behaviour.
+- 62 tests total, all passing.
+
+## Configuration
+
+Add to your `.env`:
+
+```bash
+# Required for AI narrative summarisation
+LLM_API_KEY=your-key
+LLM_BASE_URL=https://api.openai.com/v1
+LLM_MODEL=gpt-4o-mini
+LLM_MAX_EVENTS=50
+LLM_TIMEOUT_SECONDS=30
+LLM_API_VERSION=                 # set for Azure OpenAI, e.g. 2024-12-01-preview
+```
+
+## Upgrade notes
+
+No breaking changes. Existing `/api/events`, filters, pagination, tags, and comments work unchanged.
+
+## Docker image
+
+```
+git.cqre.net/cqrenet/aoc-backend:v1.1.0
+```
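Note on the regex-based parsing described in these release notes: the idea can be illustrated with a small standard-library sketch. This is not the project's parser. The function name `extract_time_range` and both patterns are assumptions, chosen only to cover the three phrasings quoted in the notes (`last 3 days`, `yesterday`, `today`).

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern for phrases such as "last 3 days" or "last 12 hours".
_LAST_N = re.compile(r"\blast\s+(\d+)\s+(day|hour)s?\b", re.IGNORECASE)


def extract_time_range(question, now=None):
    """Return (start, end) as aware UTC datetimes, or None if nothing matches."""
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)

    match = _LAST_N.search(question)
    if match:
        amount = int(match.group(1))
        unit = match.group(2).lower()
        delta = timedelta(days=amount) if unit == "day" else timedelta(hours=amount)
        return now - delta, now

    if re.search(r"\byesterday\b", question, re.IGNORECASE):
        # The whole previous calendar day, in UTC.
        return midnight - timedelta(days=1), midnight

    if re.search(r"\btoday\b", question, re.IGNORECASE):
        return midnight, now

    return None
```

Passing `now` explicitly keeps the function deterministic, which makes this kind of parser easy to cover with unit tests.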
--- a/RELEASE_NOTES_v1.2.5.md
+++ b/RELEASE_NOTES_v1.2.5.md
@@ -0,0 +1,78 @@
+# AOC v1.2.5 Release Notes
+
+**Release date:** 2026-04-20
+
+---
+
+## What's new
+
+### Natural language query (`/api/ask`)
+Ask questions in plain English and get AI-generated answers backed by your audit logs.
+
+- **Regex-based parsing** extracts time ranges (`last 3 days`, `yesterday`, `today`) and entities (`device ABC123`, `user bob@example.com`) without calling an LLM.
+- **AI narrative summarisation** via any OpenAI-compatible API (OpenAI, Azure OpenAI, MS Foundry, Ollama).
+- **Graceful fallback** when no LLM is configured — returns a structured bullet list with a clear error banner.
+- **Cited evidence** — every answer includes the raw events that back it up.
+
+### Filter-aware queries
+The ask endpoint now respects the filter panel. When you set **Service = Exchange**, **Result = failure** and ask *"What happened to device X?"*, the LLM only sees failed Exchange events for that device.
+
+### Scales to thousands of events
+For large result sets (>50 events), the LLM receives an **aggregated overview** instead of a raw dump:
+- Counts by service, action, result, and actor
+- Failure highlights
+- The 50 most recent raw events as samples
+
+This keeps token usage low while preserving accuracy.
+
+### Azure OpenAI / MS Foundry support
+- Automatic `api-key` header detection for Azure endpoints.
+- `LLM_API_VERSION` config for Azure `api-version` query parameters.
+- `max_completion_tokens` support for newer model deployments.
+
+### Version display
+- `GET /api/version` endpoint reads the `VERSION` file.
+- Frontend shows a version badge in the header (e.g., **1.2.5**).
+
+### Production hardening (from v1.1.0)
+- Dockerfile runs as non-root user with Gunicorn + Uvicorn workers.
+- `docker-compose.prod.yml` with internal-only MongoDB, health checks, and nginx reverse proxy.
+- Security headers (`X-Frame-Options`, `X-Content-Type-Options`, etc.).
+
+---
+
+## Configuration
+
+Add to your `.env`:
+
+```bash
+# Required for AI narrative summarisation
+LLM_API_KEY=your-key
+LLM_BASE_URL=https://api.openai.com/v1
+LLM_MODEL=gpt-4o-mini
+LLM_MAX_EVENTS=200
+LLM_TIMEOUT_SECONDS=30
+LLM_API_VERSION=                 # set for Azure OpenAI, e.g. 2024-12-01-preview
+```
+
+For Azure OpenAI / MS Foundry:
+```bash
+LLM_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
+LLM_API_KEY=your-azure-key
+LLM_API_VERSION=2024-12-01-preview
+LLM_MODEL=your-deployment-name
+```
+
+---
+
+## Upgrade notes
+
+No breaking changes. Existing `/api/events`, filters, pagination, tags, and comments work unchanged.
+
+---
+
+## Docker image
+
+```
+git.cqre.net/cqrenet/aoc-backend:v1.2.5
+```
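Note on the "Scales to thousands of events" section: the aggregated overview could be assembled roughly as follows. This is a sketch under assumptions, not the code behind `/api/ask`. The function name `build_llm_context`, the output keys, and the cap of 10 failure highlights are illustrative; the event field names (`timestamp`, `service`, `operation`, `result`, `actor_display`) are the ones used by the MCP server in this PR.

```python
from collections import Counter

SAMPLE_THRESHOLD = 50  # assumed to match the ">50 events" rule in the notes


def build_llm_context(events, sample_size=SAMPLE_THRESHOLD):
    """Return raw events for small sets, or counts plus recent samples for large ones."""
    if len(events) <= sample_size:
        return {"mode": "raw", "events": list(events)}

    # ISO 8601 timestamps in a single format sort chronologically as strings.
    newest_first = sorted(events, key=lambda e: e.get("timestamp", ""), reverse=True)
    failures = [e for e in newest_first if str(e.get("result", "")).lower() == "failure"]

    def top(field):
        # Ten most common values for one field, as (value, count) pairs.
        return Counter(e.get(field, "unknown") for e in events).most_common(10)

    return {
        "mode": "aggregated",
        "total": len(events),
        "by_service": top("service"),
        "by_operation": top("operation"),
        "by_result": top("result"),
        "by_actor": top("actor_display"),
        "failure_highlights": failures[:10],  # illustrative cap
        "samples": newest_first[:sample_size],
    }
```

The prompt size then depends on `sample_size` and the number of distinct values, not on the total number of matching events.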
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -59,5 +59,15 @@ Goal: evolve from a polling dashboard into a full security operations tool.

 ---

+## Phase 5: Intelligence
+Goal: add AI-powered analysis and external tool integration.
+
+- [x] AI feature flag (`AI_FEATURES_ENABLED`) to gate LLM-dependent features
+- [x] Natural language query endpoint (`/api/ask`) with intent extraction and smart sampling
+- [x] MCP (Model Context Protocol) server for Claude Desktop / Cursor integration
+- [ ] Advanced analytics dashboard (trending operations, anomaly detection)
+- [ ] Redis caching for LLM responses and frequent queries
+- [ ] Async queue for LLM requests to prevent timeout/cost explosions at scale
+
 ## Completed in this PR
-All Phase 1 items were implemented in the latest changes.
+All Phase 5 items marked done were implemented in v1.3.0.
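Note on the MCP server item: the `search_events` tool builds a MongoDB filter from optional arguments and escapes only `.`, `(` and `)` in the `entity` value by hand. The sketch below shows one possible alternative that uses `re.escape` for every user-supplied pattern. The helper name `build_search_filter` is hypothetical, and escaping `operation` and `result` is a behaviour change relative to the PR, which passes them through as regular expressions.

```python
import re
from datetime import datetime, timedelta, timezone


def build_search_filter(arguments, now=None):
    """Translate search_events-style arguments into a MongoDB filter dict."""
    now = now or datetime.now(timezone.utc)
    days = int(arguments.get("days", 7))
    since = (now - timedelta(days=days)).isoformat().replace("+00:00", "Z")
    clauses = [{"timestamp": {"$gte": since}}]

    if arguments.get("services"):
        clauses.append({"service": {"$in": list(arguments["services"])}})

    for field in ("operation", "result"):
        value = arguments.get(field)
        if value:
            # re.escape turns the input into a literal, case-insensitive match.
            clauses.append({field: {"$regex": re.escape(value), "$options": "i"}})

    entity = arguments.get("entity")
    if entity:
        pattern = {"$regex": re.escape(entity), "$options": "i"}
        clauses.append(
            {
                "$or": [
                    {"target_displays": {"$elemMatch": pattern}},
                    {"actor_display": pattern},
                    {"actor_upn": pattern},
                    {"raw_text": pattern},
                ]
            }
        )

    return {"$and": clauses}
```

Keeping the filter construction in a pure function like this would also let it be tested without a database.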
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.1.0
+1.3.0
--- a/backend/Dockerfile
+++ b/backend/Dockerfile
@@ -1,5 +1,9 @@
 FROM python:3.11-slim

+# Bake the version into the image at build time
+ARG VERSION=unknown
+ENV VERSION=${VERSION}
+
 # Security: run as non-root
 RUN groupadd -r aoc && useradd -r -g aoc aoc

--- a/backend/config.py
+++ b/backend/config.py
@@ -42,11 +42,12 @@ class Settings(BaseSettings):
    # Alerting
    ALERTS_ENABLED: bool = False

-    # LLM / Natural Language Query
+    # AI / Natural Language Query
+    AI_FEATURES_ENABLED: bool = True
    LLM_API_KEY: str = ""
    LLM_BASE_URL: str = "https://api.openai.com/v1"
    LLM_MODEL: str = "gpt-4o-mini"
-    LLM_MAX_EVENTS: int = 50
+    LLM_MAX_EVENTS: int = 200
    LLM_TIMEOUT_SECONDS: int = 30
    LLM_API_VERSION: str = ""  # e.g. 2025-01-01-preview for Azure OpenAI

@@ -77,6 +78,7 @@ SIEM_ENABLED = _settings.SIEM_ENABLED
 SIEM_WEBHOOK_URL = _settings.SIEM_WEBHOOK_URL
 ALERTS_ENABLED = _settings.ALERTS_ENABLED

+AI_FEATURES_ENABLED = _settings.AI_FEATURES_ENABLED
 LLM_API_KEY = _settings.LLM_API_KEY
 LLM_BASE_URL = _settings.LLM_BASE_URL
 LLM_MODEL = _settings.LLM_MODEL
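Note on `AI_FEATURES_ENABLED`: in this PR the value is parsed by the Pydantic `Settings` class. The standard-library sketch below only illustrates the intended semantics (unset means enabled, an explicit false value disables). The helper `env_flag` and its accepted spellings are assumptions and may differ from what Pydantic accepts.

```python
import os

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def env_flag(name, default, environ=None):
    """Read a boolean flag from the environment, falling back to a default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default

    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    # Fail loudly on typos rather than silently enabling or disabling a feature.
    raise ValueError(f"{name} must be a boolean value, got {raw!r}")


AI_FEATURES_ENABLED = env_flag("AI_FEATURES_ENABLED", default=True)
```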
--- a/backend/frontend/index.html
+++ b/backend/frontend/index.html
@@ -12,7 +12,7 @@
  <div class="page" x-data="aocApp()" x-init="initApp()">
    <header class="hero">
      <div>
-        <p class="eyebrow">Admin Operations Center</p>
+        <p class="eyebrow">Admin Operations Center <span class="version-badge" x-text="appVersion"></span></p>
        <h1>Directory Audit Explorer</h1>
        <p class="lede">Filter Microsoft Entra audit events by user, app, time, action, and action type.</p>
      </div>
@@ -38,7 +38,7 @@
      </div>
    </section>

-    <section class="panel">
+    <section class="panel" x-show="aiFeaturesEnabled">
      <h3>Ask a question</h3>
      <form class="ask-form" @submit.prevent="askQuestion()">
        <div class="ask-row">
@@ -50,6 +50,9 @@
          />
          <button type="submit" :disabled="askLoading" x-text="askLoading ? 'Thinking…' : 'Ask'">Ask</button>
        </div>
+        <div x-show="hasActiveFilters()" class="ask-filter-hint">
+          <small>Respecting active filters: <span x-text="activeFilterSummary()"></span></small>
+        </div>
      </form>
      <template x-if="askAnswer">
        <div class="ask-result">
@@ -240,6 +243,8 @@
          actor: '', selectedServices: [], search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '',
        },
        options: { actors: [], services: [], operations: [], results: [] },
+        appVersion: '',
+        aiFeaturesEnabled: true,
        askQuestionText: '',
        askLoading: false,
        askAnswer: '',
@@ -249,6 +254,7 @@
        askLlmError: '',

        async initApp() {
+          await this.loadVersion();
          await this.initAuth();
          if (!this.authConfig?.auth_enabled || this.accessToken) {
            await this.loadFilterOptions();
@@ -257,6 +263,16 @@
          }
        },

+        async loadVersion() {
+          try {
+            const res = await fetch('/api/version');
+            if (res.ok) {
+              const body = await res.json();
+              this.appVersion = body.version || '';
+            }
+          } catch {}
+        },
+
        authHeader() {
          return this.accessToken ? { Authorization: `Bearer ${this.accessToken}` } : {};
        },
@@ -287,6 +303,18 @@
            this.authConfig = { auth_enabled: false };
          }

+          try {
+            const featRes = await fetch('/api/config/features');
+            if (featRes.ok) {
+              const featBody = await featRes.json();
+              this.aiFeaturesEnabled = featBody.ai_features_enabled !== false;
+            } else {
+              this.aiFeaturesEnabled = true;
+            }
+          } catch {
+            this.aiFeaturesEnabled = true;
+          }
+
          if (!this.authConfig?.auth_enabled) {
            this.authBtnText = '';
            return;
@@ -491,11 +519,29 @@
          this.askAnswer = '';
          this.askAnswerHtml = '';
          this.askEvents = [];
+          this.askLlmError = '';
+
+          const payload = { question: q };
+          if (this.filters.selectedServices && this.filters.selectedServices.length) {
+            payload.services = this.filters.selectedServices;
+          }
+          if (this.filters.actor) payload.actor = this.filters.actor;
+          if (this.filters.operation) payload.operation = this.filters.operation;
+          if (this.filters.result) payload.result = this.filters.result;
+          if (this.filters.start) payload.start = new Date(this.filters.start).toISOString();
+          if (this.filters.end) payload.end = new Date(this.filters.end).toISOString();
+          if (this.filters.includeTags) {
+            payload.include_tags = this.filters.includeTags.split(/[,;]+/).map(t => t.trim()).filter(Boolean);
+          }
+          if (this.filters.excludeTags) {
+            payload.exclude_tags = this.filters.excludeTags.split(/[,;]+/).map(t => t.trim()).filter(Boolean);
+          }
+
          try {
            const res = await fetch('/api/ask', {
              method: 'POST',
              headers: { 'Content-Type': 'application/json', ...this.authHeader() },
-              body: JSON.stringify({ question: q }),
+              body: JSON.stringify(payload),
            });
            if (!res.ok) throw new Error(await res.text());
            const body = await res.json();
@@ -532,6 +578,27 @@
            .replace(/\n/g, '<br>');
        },

+        hasActiveFilters() {
+          return this.filters.actor || this.filters.operation || this.filters.result ||
+            this.filters.start || this.filters.end || this.filters.includeTags ||
+            this.filters.excludeTags ||
+            (this.filters.selectedServices && this.filters.selectedServices.length &&
+             this.filters.selectedServices.length < this.options.services.length);
+        },
+
+        activeFilterSummary() {
+          const parts = [];
+          if (this.filters.actor) parts.push('actor');
+          if (this.filters.operation) parts.push('action');
+          if (this.filters.result) parts.push('result');
+          if (this.filters.start || this.filters.end) parts.push('time');
+          if (this.filters.includeTags || this.filters.excludeTags) parts.push('tags');
+          const svcCount = this.filters.selectedServices?.length || 0;
+          const allCount = this.options.services?.length || 0;
+          if (svcCount && svcCount < allCount) parts.push(`${svcCount} service${svcCount === 1 ? '' : 's'}`);
+          return parts.join(', ') || 'none';
+        },
+
        async bulkTagMatching() {
          const tag = prompt('Enter tag to apply to all matching events:');
          if (!tag || !tag.trim()) return;
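Note for scripted clients: the frontend changes above encode two rules that a non-browser caller may want to reproduce. The Ask panel stays enabled unless `/api/config/features` explicitly returns `ai_features_enabled: false`, and only non-empty filters are added to the `/api/ask` payload. The Python sketch below mirrors that logic. The function names and the shape of the `filters` argument are assumptions; the payload keys are the ones sent by `askQuestion()`.

```python
import re


def ai_features_enabled(features_body):
    """Mirror the frontend rule: only an explicit false disables AI features."""
    if not isinstance(features_body, dict):
        return True
    return features_body.get("ai_features_enabled") is not False


def split_tags(raw):
    """Split a comma- or semicolon-separated tag string, dropping blanks."""
    return [tag.strip() for tag in re.split(r"[,;]+", raw or "") if tag.strip()]


def build_ask_payload(question, filters):
    """Build the JSON body for POST /api/ask from a dict of optional filters."""
    payload = {"question": question}
    if filters.get("services"):
        payload["services"] = list(filters["services"])
    # start and end are expected to be ISO 8601 strings already.
    for key in ("actor", "operation", "result", "start", "end"):
        if filters.get(key):
            payload[key] = filters[key]
    for key in ("include_tags", "exclude_tags"):
        tags = split_tags(filters.get(key))
        if tags:
            payload[key] = tags
    return payload
```

Defaulting to enabled when the features endpoint is missing or fails matches the frontend's behaviour, so older backends without the endpoint keep working.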
--- a/backend/frontend/style.css
+++ b/backend/frontend/style.css
@@ -428,6 +428,25 @@ input {
  margin-bottom: 10px;
 }

+.ask-filter-hint {
+  margin-top: 6px;
+  color: var(--muted);
+}
+
+.version-badge {
+  display: inline-block;
+  margin-left: 8px;
+  padding: 2px 8px;
+  border-radius: 999px;
+  background: rgba(125, 211, 252, 0.15);
+  border: 1px solid rgba(125, 211, 252, 0.3);
+  color: var(--accent-strong);
+  font-size: 11px;
+  font-weight: 600;
+  letter-spacing: 0.05em;
+  vertical-align: middle;
+}
+
 .ask-events {
  margin-bottom: 14px;
 }
--- a/backend/main.py
+++ b/backend/main.py
@@ -6,7 +6,7 @@ from pathlib import Path

 import structlog
 from audit_trail import log_action
-from config import CORS_ORIGINS, ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES
+from config import AI_FEATURES_ENABLED, CORS_ORIGINS, ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES
 from database import setup_indexes
 from fastapi import FastAPI, HTTPException, Request
 from fastapi.middleware.cors import CORSMiddleware
@@ -14,7 +14,6 @@ from fastapi.responses import Response
 from fastapi.staticfiles import StaticFiles
 from metrics import observe_request, prometheus_metrics
 from middleware import CorrelationIdMiddleware
-from routes.ask import router as ask_router
 from routes.config import router as config_router
 from routes.events import router as events_router
 from routes.fetch import router as fetch_router
@@ -113,7 +112,10 @@ app.include_router(events_router, prefix="/api")
 app.include_router(config_router, prefix="/api")
 app.include_router(webhooks_router, prefix="/api")
 app.include_router(health_router, prefix="/api")
-app.include_router(ask_router, prefix="/api")
+if AI_FEATURES_ENABLED:
+    from routes.ask import router as ask_router
+
+    app.include_router(ask_router, prefix="/api")
 app.include_router(rules_router, prefix="/api")


@@ -134,6 +136,13 @@ async def metrics():
    return Response(content=prometheus_metrics(), media_type="text/plain")


+@app.get("/api/version")
+async def version():
+    import os
+
+    return {"version": os.environ.get("VERSION", "unknown")}
+
+
 frontend_dir = Path(__file__).parent / "frontend"
 app.mount("/", StaticFiles(directory=frontend_dir, html=True), name="frontend")

--- a/backend/mcp_server.py
+++ b/backend/mcp_server.py
@@ -0,0 +1,276 @@
+#!/usr/bin/env python3
+"""
+AOC MCP Server
+
+Standalone MCP server that exposes audit log search tools for Claude Desktop,
+Cursor, and other MCP clients.
+
+Usage:
+    python mcp_server.py
+
+Claude Desktop config (~/.config/claude/claude_desktop_config.json):
+    {
+      "mcpServers": {
+        "aoc": {
+          "command": "python",
+          "args": ["/path/to/aoc/backend/mcp_server.py"],
+          "env": {"MONGO_URI": "mongodb://..."}
+        }
+      }
+    }
+"""
+
+import asyncio
+import json
+import os
+import sys
+from datetime import UTC, datetime, timedelta
+
+# Ensure backend modules are importable
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+
+from database import events_collection
+from mcp.server import Server
+from mcp.server.stdio import stdio_server
+from mcp.types import TextContent, Tool
+
+app = Server("aoc")
+
+# ---------------------------------------------------------------------------
+# Tool definitions
+# ---------------------------------------------------------------------------
+
+_SEARCH_EVENTS_SCHEMA = {
+    "type": "object",
+    "properties": {
+        "entity": {"type": "string", "description": "Device name, user UPN, or email to search for"},
+        "services": {
+            "type": "array",
+            "items": {"type": "string"},
+            "description": "Filter by service (e.g. Intune, Directory, Exchange)",
+        },
+        "operation": {"type": "string", "description": "Filter by operation name"},
+        "result": {"type": "string", "description": "Filter by result (success, failure)"},
+        "days": {"type": "integer", "description": "Number of days to look back (default 7)"},
+        "limit": {"type": "integer", "description": "Max events to return (default 20)"},
+    },
+}
+
+_GET_EVENT_SCHEMA = {
+    "type": "object",
+    "properties": {
+        "event_id": {"type": "string", "description": "The event ID to retrieve"},
+    },
+    "required": ["event_id"],
+}
+
+_GET_SUMMARY_SCHEMA = {
+    "type": "object",
+    "properties": {
+        "days": {"type": "integer", "description": "Number of days to summarise (default 7)"},
+    },
+}
+
+_ASK_SCHEMA = {
+    "type": "object",
+    "properties": {
+        "question": {"type": "string", "description": "Natural language question about audit logs"},
+        "days": {"type": "integer", "description": "Number of days to look back (default 7)"},
+    },
+    "required": ["question"],
+}
+
+
+@app.list_tools()
+async def list_tools() -> list[Tool]:
+    return [
+        Tool(
+            name="search_events",
+            description="Search audit events by entity, service, operation, or result.",
+            inputSchema=_SEARCH_EVENTS_SCHEMA,
+        ),
+        Tool(name="get_event", description="Retrieve a single audit event by its ID.", inputSchema=_GET_EVENT_SCHEMA),
+        Tool(
+            name="get_summary",
+            description="Get an aggregated summary of audit activity for the last N days.",
+            inputSchema=_GET_SUMMARY_SCHEMA,
+        ),
+        Tool(
+            name="ask",
+            description="Ask a natural language question about audit logs. Returns a narrative answer.",
+            inputSchema=_ASK_SCHEMA,
+        ),
+    ]
+
+
+# ---------------------------------------------------------------------------
+# Tool handlers
+# ---------------------------------------------------------------------------
+
+
+@app.call_tool()
+async def call_tool(name: str, arguments: dict) -> list[TextContent]:
+    if name == "search_events":
+        return await _handle_search_events(arguments)
+    if name == "get_event":
+        return await _handle_get_event(arguments)
+    if name == "get_summary":
+        return await _handle_get_summary(arguments)
+    if name == "ask":
+        return await _handle_ask(arguments)
+    raise ValueError(f"Unknown tool: {name}")
+
+
+async def _handle_search_events(arguments: dict) -> list[TextContent]:
+    days = arguments.get("days", 7)
+    limit = min(arguments.get("limit", 20), 100)
+    since = (datetime.now(UTC) - timedelta(days=days)).isoformat().replace("+00:00", "Z")
+
+    filters = [{"timestamp": {"$gte": since}}]
+
+    services = arguments.get("services")
+    if services:
+        filters.append({"service": {"$in": services}})
+
+    operation = arguments.get("operation")
+    if operation:
+        filters.append({"operation": {"$regex": operation, "$options": "i"}})
+
+    result = arguments.get("result")
+    if result:
+        filters.append({"result": {"$regex": result, "$options": "i"}})
+
+    entity = arguments.get("entity")
+    if entity:
+        entity_safe = entity.replace(".", "\\.").replace("(", "\\(").replace(")", "\\)")
+        filters.append(
+            {
+                "$or": [
+                    {"target_displays": {"$elemMatch": {"$regex": entity_safe, "$options": "i"}}},
+                    {"actor_display": {"$regex": entity_safe, "$options": "i"}},
+                    {"actor_upn": {"$regex": entity_safe, "$options": "i"}},
+                    {"raw_text": {"$regex": entity_safe, "$options": "i"}},
+                ]
+            }
+        )
+
+    query = {"$and": filters}
+    cursor = events_collection.find(query).sort("timestamp", -1).limit(limit)
+    events = list(cursor)
+
+    if not events:
+        return [TextContent(type="text", text="No matching events found.")]
+
+    lines = [f"Found {len(events)} event(s):\n"]
+    for e in events:
+        ts = e.get("timestamp", "?")[:16].replace("T", " ")
+        svc = e.get("service", "?")
+        op = e.get("operation", "?")
+        actor = e.get("actor_display", "?")
+        result_str = e.get("result", "?")
+        lines.append(f"{ts} | {svc} | {op} | {actor} | {result_str}")
+
+    return [TextContent(type="text", text="\n".join(lines))]
+
+
+async def _handle_get_event(arguments: dict) -> list[TextContent]:
+    event_id = arguments["event_id"]
+    event = events_collection.find_one({"id": event_id})
+    if not event:
+        return [TextContent(type="text", text=f"Event {event_id} not found.")]
+    event.pop("_id", None)
+    return [TextContent(type="text", text=json.dumps(event, indent=2, default=str))]
+
+
+async def _handle_get_summary(arguments: dict) -> list[TextContent]:
+    days = arguments.get("days", 7)
+    since = (datetime.now(UTC) - timedelta(days=days)).isoformat().replace("+00:00", "Z")
+    query = {"timestamp": {"$gte": since}}
+
+    total = events_collection.count_documents(query)
+    if total == 0:
+        return [TextContent(type="text", text="No events in the specified period.")]
+
+    # Aggregation pipelines
+    svc_pipeline = [
+        {"$match": query},
+        {"$group": {"_id": "$service", "count": {"$sum": 1}}},
+        {"$sort": {"count": -1}},
+        {"$limit": 10},
+    ]
+    op_pipeline = [
+        {"$match": query},
+        {"$group": {"_id": "$operation", "count": {"$sum": 1}}},
+        {"$sort": {"count": -1}},
+        {"$limit": 10},
+    ]
+    result_pipeline = [
+        {"$match": query},
+        {"$group": {"_id": "$result", "count": {"$sum": 1}}},
+        {"$sort": {"count": -1}},
+    ]
+    actor_pipeline = [
+        {"$match": query},
+        {"$group": {"_id": "$actor_display", "count": {"$sum": 1}}},
+        {"$sort": {"count": -1}},
+        {"$limit": 10},
+    ]
+
+    svc_counts = list(events_collection.aggregate(svc_pipeline))
+    op_counts = list(events_collection.aggregate(op_pipeline))
+    result_counts = list(events_collection.aggregate(result_pipeline))
+    actor_counts = list(events_collection.aggregate(actor_pipeline))
+
+    lines = [f"Summary for the last {days} days ({total} total events)\n"]
+
+    lines.append("By service:")
+    for row in svc_counts:
+        lines.append(f"  {row['_id'] or 'Unknown'}: {row['count']}")
+
+    lines.append("\nBy action:")
+    for row in op_counts:
+        lines.append(f"  {row['_id'] or 'Unknown'}: {row['count']}")
+
+    lines.append("\nBy result:")
+    for row in result_counts:
+        lines.append(f"  {row['_id'] or 'Unknown'}: {row['count']}")
+
+    lines.append("\nTop actors:")
+    for row in actor_counts:
+        lines.append(f"  {row['_id'] or 'Unknown'}: {row['count']}")
+
+    return [TextContent(type="text", text="\n".join(lines))]
+
+
+async def _handle_ask(arguments: dict) -> list[TextContent]:
+    """For now, the MCP 'ask' tool returns the most recent events plus a tip on narrowing the search,
+    since the full NLQ pipeline requires LLM configuration that may not be available in the MCP context."""
+    question = arguments["question"]
+    days = arguments.get("days", 7)
+
+    # Perform a search to give the user something useful immediately
+    result = await _handle_search_events({"entity": "", "days": days, "limit": 50})
+    base_text = result[0].text if result else ""
+
+    text = (
+        f"You asked: '{question}'\n\n"
+        # Clamp at 0: the "No matching events found." text has no newlines, which would yield -1
+        f"Here are the most recent {max(0, min(50, base_text.count(chr(10)) - 1))} events from the last {days} days:\n\n"
+        f"{base_text}\n\n"
+        f"Tip: Use the 'search_events' tool with specific filters (services, operation, result) "
+        f"to narrow down the dataset before asking follow-up questions."
+    )
+    return [TextContent(type="text", text=text)]
+
+
+# ---------------------------------------------------------------------------
+# Entry point
+# ---------------------------------------------------------------------------
+
+
+async def main():
+    async with stdio_server() as (read_stream, write_stream):
+        await app.run(read_stream, write_stream, app.create_initialization_options())
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
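
The `search_events` handler above feeds `operation`, `result`, and `entity` into `$regex` filters. The following is a minimal sketch, assuming user-supplied strings are meant to match literally, of building the same filter shape with `re.escape` (as `routes/ask.py` does). `build_search_filters` is a hypothetical helper for illustration and is not part of this commit; it only builds the dict and does not touch a database.

```python
import re
from datetime import datetime, timedelta, timezone


def build_search_filters(arguments, now=None):
    """Hypothetical helper: build a MongoDB-style filter dict from tool arguments."""
    now = now or datetime.now(timezone.utc)
    days = arguments.get("days", 7)
    since = (now - timedelta(days=days)).isoformat().replace("+00:00", "Z")
    filters = [{"timestamp": {"$gte": since}}]

    if arguments.get("services"):
        filters.append({"service": {"$in": arguments["services"]}})

    # Escape free-text inputs so characters such as "(" or "*" match literally
    for field in ("operation", "result"):
        value = arguments.get(field)
        if value:
            filters.append({field: {"$regex": re.escape(value), "$options": "i"}})

    return {"$and": filters}


query = build_search_filters(
    {"operation": "Add user (bulk)", "days": 1},
    now=datetime(2026, 4, 20, 12, 0, tzinfo=timezone.utc),
)
print(query)
```

Passing `now` explicitly keeps the sketch deterministic; the handler itself uses the current time.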
--- a/backend/models/api.py
+++ b/backend/models/api.py
@@ -74,6 +74,14 @@ class AlertRuleResponse(BaseModel):

 class AskRequest(BaseModel):
    question: str
+    services: list[str] | None = None
+    actor: str | None = None
+    operation: str | None = None
+    result: str | None = None
+    start: str | None = None
+    end: str | None = None
+    include_tags: list[str] | None = None
+    exclude_tags: list[str] | None = None


 class AskEventRef(BaseModel):
--- a/backend/requirements.txt
+++ b/backend/requirements.txt
@@ -13,3 +13,4 @@ tenacity
 prometheus-client
 httpx
 gunicorn
+mcp
--- a/backend/routes/ask.py
+++ b/backend/routes/ask.py
@@ -13,6 +13,129 @@ from models.api import AskRequest, AskResponse
 router = APIRouter(dependencies=[Depends(require_auth)])
 logger = structlog.get_logger("aoc.ask")

+# ---------------------------------------------------------------------------
+# Intent extraction — map question keywords to relevant audit services
+# ---------------------------------------------------------------------------
+
+_SERVICE_INTENTS = {
+    "intune": ["Intune"],
+    "device": ["Intune", "Device"],
+    "laptop": ["Intune", "Device"],
+    "mobile": ["Intune", "Device"],
+    "phone": ["Intune", "Device"],
+    "ipad": ["Intune", "Device"],
+    "app": ["Intune", "ApplicationManagement"],
+    "application": ["Intune", "ApplicationManagement"],
+    "policy": ["Intune", "Policy"],
+    "compliance": ["Intune", "Policy"],
+    "user": ["Directory", "UserManagement"],
+    "group": ["Directory", "GroupManagement"],
+    "role": ["Directory", "RoleManagement"],
+    "permission": ["Directory", "RoleManagement"],
+    "license": ["Directory", "License"],
+    "email": ["Exchange"],
+    "mailbox": ["Exchange"],
+    "mail": ["Exchange"],
+    "message": ["Exchange", "Teams"],
+    "file": ["SharePoint"],
+    "sharepoint": ["SharePoint"],
+    "site": ["SharePoint"],
+    "document": ["SharePoint"],
+    "team": ["Teams"],
+    "channel": ["Teams"],
+    "meeting": ["Teams"],
+    "call": ["Teams"],
+}
+
+# Services that are extremely noisy for typical admin questions.
+# We exclude them by default on broad questions unless the user explicitly mentions them.
+_NOISY_SERVICES = {"Exchange", "SharePoint"}
+
+# Services that are generally admin-relevant and kept by default.
+_DEFAULT_ADMIN_SERVICES = {
+    "Directory",
+    "UserManagement",
+    "GroupManagement",
+    "RoleManagement",
+    "ApplicationManagement",
+    "Intune",
+    "Device",
+    "Policy",
+    "Teams",
+    "License",
+}
+
+
+def _extract_intent_services(question: str) -> tuple[list[str] | None, bool]:
+    """
+    Extract relevant services from the question.
+
+    Returns:
+        (services, is_explicit):
+        - services: list of service names to query, or None for default admin set
+        - is_explicit: True if the user explicitly mentioned a noisy service
+    """
+    q_lower = question.lower()
+    tokens = set(re.findall(r"\b[a-z]+\b", q_lower))
+
+    matched_services = set()
+    for token, services in _SERVICE_INTENTS.items():
+        if token in tokens:
+            matched_services.update(services)
+
+    if matched_services:
+        # User asked something specific — return exactly what they asked for
+        is_explicit = not matched_services.isdisjoint(_NOISY_SERVICES)
+        return sorted(matched_services), is_explicit
+
+    # Broad question with no clear intent — default to admin-relevant services only
+    return None, False
+
+
+# ---------------------------------------------------------------------------
+# Smart sampling — stratified by importance so the LLM sees signal, not noise
+# ---------------------------------------------------------------------------
+
+
+def _smart_sample(events: list[dict], max_events: int = 200) -> list[dict]:
+    """
+    Return a curated subset that preserves diversity and prioritises signal.
+
+    Tiers:
+      1. Failures (always valuable)
+      2. High-admin-value services (Intune, Device, Directory, etc.)
+      3. Everything else
+    """
+    if len(events) <= max_events:
+        return events
+
+    high_value = {
+        "Directory",
+        "UserManagement",
+        "GroupManagement",
+        "RoleManagement",
+        "Intune",
+        "Device",
+        "Policy",
+        "ApplicationManagement",
+    }
+
+    def _is_failure(e: dict) -> bool:
+        return str(e.get("result") or "").lower() in ("failure", "failed")
+
+    # Partition in one pass; `e not in failures` would compare dicts in O(n^2)
+    failures, high_val, rest = [], [], []
+    for e in events:
+        if _is_failure(e):
+            failures.append(e)
+        elif e.get("service") in high_value:
+            high_val.append(e)
+        else:
+            rest.append(e)
+
+    # Allocate slots: up to a quarter each for failures and high-value events (with small
+    # floors), remainder to the rest. Clamp so the caps never exceed the budget.
+    slots = max_events
+    failure_cap = min(len(failures), max(10, slots // 4), slots)
+    high_cap = min(len(high_val), max(20, slots // 4), slots - failure_cap)
+    rest_cap = slots - failure_cap - high_cap
+
+    sampled = failures[:failure_cap] + high_val[:high_cap] + rest[:rest_cap]
+    # Restore newest-first order, matching the query's descending timestamp sort
+    sampled.sort(key=lambda e: e.get("timestamp") or "", reverse=True)
+    return sampled
+
+
 # ---------------------------------------------------------------------------
 # Time-range extraction
 # ---------------------------------------------------------------------------
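
The slot allocation in `_smart_sample` can be checked by hand. Below is a small sketch, assuming the quarter-share caps and the floors of 10 and 20 shown in the diff, plus a clamp (added here for illustration) so the three caps never exceed the budget. `allocate_slots` is a hypothetical helper, not a function from the codebase.

```python
def allocate_slots(n_failures, n_high, max_events=200):
    """Return (failure_cap, high_cap, rest_cap) for a given event budget."""
    failure_cap = min(n_failures, max(10, max_events // 4), max_events)
    high_cap = min(n_high, max(20, max_events // 4), max_events - failure_cap)
    rest_cap = max_events - failure_cap - high_cap
    return failure_cap, high_cap, rest_cap


# 1000 fetched events: 120 failures, 300 high-value, 580 other
print(allocate_slots(120, 300))      # (50, 50, 100)
# Few failures: the unused failure slots flow to the remaining tier
print(allocate_slots(3, 300))        # (3, 50, 147)
# Small budget: the clamp keeps rest_cap from going negative
print(allocate_slots(120, 300, 20))  # (10, 10, 0)
```

Without the clamp, a budget below 30 could produce a negative `rest_cap`, and a negative slice bound such as `rest[:-10]` keeps almost the whole list instead of nothing.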
@@ -104,7 +227,17 @@ def _extract_entity(question: str) -> str | None:
 # ---------------------------------------------------------------------------


-def _build_event_query(entity: str | None, start: str | None, end: str | None) -> dict:
+def _build_event_query(
+    entity: str | None,
+    start: str | None,
+    end: str | None,
+    services: list[str] | None = None,
+    actor: str | None = None,
+    operation: str | None = None,
+    result: str | None = None,
+    include_tags: list[str] | None = None,
+    exclude_tags: list[str] | None = None,
+) -> dict:
    filters = []

    if start or end:
@@ -128,6 +261,28 @@ def _build_event_query(entity: str | None, start: str | None, end: str | None) -
            }
        )

+    if services:
+        filters.append({"service": {"$in": services}})
+    if actor:
+        actor_safe = re.escape(actor)
+        filters.append(
+            {
+                "$or": [
+                    {"actor_display": {"$regex": actor_safe, "$options": "i"}},
+                    {"actor_upn": {"$regex": actor_safe, "$options": "i"}},
+                    {"actor.user.userPrincipalName": {"$regex": actor_safe, "$options": "i"}},
+                ]
+            }
+        )
+    if operation:
+        filters.append({"operation": {"$regex": re.escape(operation), "$options": "i"}})
+    if result:
+        filters.append({"result": {"$regex": re.escape(result), "$options": "i"}})
+    if include_tags:
+        filters.append({"tags": {"$all": include_tags}})
+    if exclude_tags:
+        filters.append({"tags": {"$not": {"$all": exclude_tags}}})
+
    return {"$and": filters} if filters else {}
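
The tag filters above rely on `$all`, which matches arrays that contain every listed value. As a consequence, `{"$not": {"$all": exclude_tags}}` rejects only events that carry all of the excluded tags at once. The following pure-Python sketch mimics that matching logic on made-up events; `matches_tags` is a hypothetical stand-in for the database behaviour, not code from this commit.

```python
def matches_tags(event, include_tags=None, exclude_tags=None):
    """Mimic the $all / $not-$all tag filters on a single event dict."""
    tags = set(event.get("tags") or [])
    # include_tags: every listed tag must be present ($all)
    if include_tags and not set(include_tags) <= tags:
        return False
    # exclude_tags: rejected only when every listed tag is present ($not + $all)
    if exclude_tags and set(exclude_tags) <= tags:
        return False
    return True


events = [
    {"id": "a", "tags": ["privileged", "reviewed"]},
    {"id": "b", "tags": ["privileged"]},
    {"id": "c", "tags": []},
]

kept = [e["id"] for e in events if matches_tags(e, exclude_tags=["privileged", "reviewed"])]
print(kept)  # ['b', 'c']
```

If the intent is to drop events carrying any of the excluded tags, MongoDB's `$nin` operator expresses that instead; whether that is the intended behaviour here is not stated in the diff.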


@@ -136,22 +291,80 @@ def _build_event_query(entity: str | None, start: str | None, end: str | None) -
 # ---------------------------------------------------------------------------

 _SYSTEM_PROMPT = """You are an IT operations assistant. An administrator has asked a question about audit logs.
-Your job is to read the list of audit events below and write a concise, plain-language answer.
+Your job is to read the data below and write a concise, plain-language answer.
+
+The input may be either:
+- A small list of individual audit events (numbered Event #1, #2, etc.), or
+- An aggregated overview with counts by service, action, result, and actor, plus sample events.

 Rules:
 - Assume the reader is a non-expert admin.
- Group related events together and tell a coherent story.
+- For aggregated overviews: summarise the scale, top patterns, and highlight anomalies or failures.
+- For small event lists: group related events together and tell a coherent story.
 - Highlight anything unusual, failed actions, or privilege escalations.
 - Reference specific event numbers (e.g., "Event #3") when making claims so the user can verify.
+- If the data is an aggregated subset of a larger result set, acknowledge the scale (e.g., "847 events occurred — the top pattern was...").
 - If there are no events, say so clearly.
 - Keep the answer under 300 words.
- Do not invent events that are not in the list.
+- Do not invent events or patterns that are not supported by the data.
 """


-def _format_events_for_llm(events: list[dict]) -> str:
+def _aggregate_counts(events: list[dict]) -> dict:
+    """Build lightweight aggregation tables for large result sets."""
+    from collections import Counter
+
+    svc_counts = Counter(e.get("service") or "Unknown" for e in events)
+    op_counts = Counter(e.get("operation") or "Unknown" for e in events)
+    result_counts = Counter(e.get("result") or "Unknown" for e in events)
+    actor_counts = Counter(e.get("actor_display") or "Unknown" for e in events)
+    return {
+        "services": svc_counts.most_common(10),
+        "operations": op_counts.most_common(10),
+        "results": result_counts.most_common(5),
+        "actors": actor_counts.most_common(10),
+    }
+
+
+def _format_events_for_llm(
+    events: list[dict], total: int | None = None, excluded_services: list[str] | None = None
+) -> str:
    lines = []
-    for i, e in enumerate(events, 1):
+
+    # If we have a large result set, send aggregation + samples instead of raw dump
+    if total is not None and total > len(events) and len(events) >= 50:
+        lines.append(f"Result set overview: {total} total events (showing a curated sample of {len(events)}).\n")
+        if excluded_services:
+            lines.append(f"Note: high-volume services excluded by default: {', '.join(excluded_services)}.\n")
+        agg = _aggregate_counts(events)
+        lines.append("Breakdown by service:")
+        for svc, cnt in agg["services"]:
+            lines.append(f"  {svc}: {cnt}")
+        lines.append("\nBreakdown by action:")
+        for op, cnt in agg["operations"]:
+            lines.append(f"  {op}: {cnt}")
+        lines.append("\nBreakdown by result:")
+        for res, cnt in agg["results"]:
+            lines.append(f"  {res}: {cnt}")
+        lines.append("\nTop actors:")
+        for actor, cnt in agg["actors"]:
+            lines.append(f"  {actor}: {cnt}")
+        # Include failures and a few recent samples
+        failures = [e for e in events if str(e.get("result") or "").lower() in ("failure", "failed")]
+        if failures:
+            lines.append(f"\nFailures ({len(failures)}):")
+            for e in failures[:10]:
+                ts = e.get("timestamp", "?")[:16].replace("T", " ")
+                op = e.get("operation", "unknown")
+                actor = e.get("actor_display", "unknown")
+                lines.append(f"  {ts} — {op} by {actor}")
+        lines.append("\nMost recent sample events:")
+    else:
+        if total is not None and total > len(events):
+            lines.append(f"Showing {len(events)} of {total} total matching events (most recent first):\n")
+
+    # Always include the first N raw events as detail (up to 50)
+    for i, e in enumerate(events[:50], 1):
        ts = e.get("timestamp") or "unknown time"
        op = e.get("operation") or "unknown action"
        actor = e.get("actor_display") or "unknown actor"
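
For large result sets the prompt carries counts instead of a raw dump. Here is a self-contained sketch of the same `Counter.most_common` approach that `_aggregate_counts` uses, run on three made-up events; `top_counts` is a hypothetical helper name.

```python
from collections import Counter

events = [
    {"service": "Intune", "operation": "Wipe", "result": "failure", "actor_display": "Alice"},
    {"service": "Intune", "operation": "Sync", "result": "success", "actor_display": "Alice"},
    {"service": "Directory", "operation": "Add user", "result": "success", "actor_display": None},
]


def top_counts(events, field, n=10):
    """Count values of one field, mapping missing or empty values to 'Unknown'."""
    return Counter(e.get(field) or "Unknown" for e in events).most_common(n)


print(top_counts(events, "service"))        # [('Intune', 2), ('Directory', 1)]
print(top_counts(events, "actor_display"))  # [('Alice', 2), ('Unknown', 1)]
```

Counts computed this way describe the list that is passed in. In the diff that list is the sampled subset, so the breakdowns reflect the sample and not the full `total`.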
@@ -181,11 +394,16 @@ def _build_chat_url(base_url: str, api_version: str) -> str:
    return url


-async def _call_llm(question: str, events: list[dict]) -> str:
+async def _call_llm(
+    question: str,
+    events: list[dict],
+    total: int | None = None,
+    excluded_services: list[str] | None = None,
+) -> str:
    if not LLM_API_KEY:
        raise RuntimeError("LLM_API_KEY not configured")

-    context = _format_events_for_llm(events)
+    context = _format_events_for_llm(events, total=total, excluded_services=excluded_services)
    messages = [
        {"role": "system", "content": _SYSTEM_PROMPT},
        {
@@ -246,6 +464,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):

    start, end = _extract_time_range(question)
    entity = _extract_entity(question)
+    intent_services, explicit_noisy = _extract_intent_services(question)

    # Default to last 7 days if no time range detected
    if not start:
@@ -253,24 +472,65 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
        start = (now - timedelta(days=7)).isoformat().replace("+00:00", "Z")
        end = now.isoformat().replace("+00:00", "Z")

-    query = _build_event_query(entity, start, end)
+    # -----------------------------------------------------------------------
+    # Decide which services to query
+    # -----------------------------------------------------------------------
+    excluded_services: list[str] = []
+    if body.services:
+        # User explicitly filtered via UI — respect that exactly
+        query_services = body.services
+    elif intent_services is not None:
+        # NL question implies specific services
+        query_services = intent_services
+    else:
+        # Broad question with no intent — exclude noisy services by default
+        query_services = sorted(_DEFAULT_ADMIN_SERVICES)
+        excluded_services = sorted(_NOISY_SERVICES)
+
+    # -----------------------------------------------------------------------
+    # Build and run query
+    # -----------------------------------------------------------------------
+    query = _build_event_query(
+        entity,
+        start,
+        end,
+        services=query_services,
+        actor=body.actor,
+        operation=body.operation,
+        result=body.result,
+        include_tags=body.include_tags,
+        exclude_tags=body.exclude_tags,
+    )

    try:
-        cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(LLM_MAX_EVENTS)
-        events = list(cursor)
+        total = events_collection.count_documents(query)
+        # Fetch a generous window so we can apply smart sampling in Python
+        cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(1000)
+        raw_events = list(cursor)
    except Exception as exc:
        logger.error("Failed to query events for ask", error=str(exc))
        raise HTTPException(status_code=500, detail=f"Database query failed: {exc}") from exc

-    for e in events:
+    for e in raw_events:
        e["_id"] = str(e.get("_id", ""))

+    # Apply smart sampling (preserves failures, prioritises admin-relevant services)
+    events = _smart_sample(raw_events, max_events=LLM_MAX_EVENTS)
+
    # If no events, return early
    if not events:
        return AskResponse(
            answer="I couldn't find any audit events matching your question. Try broadening the time range or checking the spelling of the device/user name.",
            events=[],
-            query_info={"entity": entity, "start": start, "end": end, "event_count": 0},
+            query_info={
+                "entity": entity,
+                "start": start,
+                "end": end,
+                "event_count": 0,
+                "total_matched": total,
+                "services_queried": query_services,
+                "excluded_services": excluded_services,
+            },
            llm_used=False,
            llm_error="LLM not used — no events found." if not LLM_API_KEY else None,
        )
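
The service selection above follows a fixed precedence: explicit UI filters, then keyword intent, then the default admin set with noisy services excluded. The sketch below condenses that precedence into one function, assuming trimmed-down keyword and service tables; `choose_services` and the upper-case constants are hypothetical names for illustration.

```python
import re

# Trimmed-down stand-ins for _SERVICE_INTENTS, _NOISY_SERVICES, _DEFAULT_ADMIN_SERVICES
SERVICE_INTENTS = {
    "device": ["Intune", "Device"],
    "mailbox": ["Exchange"],
    "user": ["Directory", "UserManagement"],
}
NOISY = {"Exchange", "SharePoint"}
DEFAULT_ADMIN = {"Directory", "Intune", "Teams"}


def choose_services(question, ui_services=None):
    """Return (services_to_query, excluded_services)."""
    if ui_services:
        # 1. Explicit UI filter wins
        return list(ui_services), []
    tokens = set(re.findall(r"\b[a-z]+\b", question.lower()))
    matched = set()
    for keyword, services in SERVICE_INTENTS.items():
        if keyword in tokens:
            matched.update(services)
    if matched:
        # 2. Keywords in the question imply specific services
        return sorted(matched), []
    # 3. Broad question: default admin services, noisy ones excluded
    return sorted(DEFAULT_ADMIN), sorted(NOISY)


print(choose_services("What happened to the device LAPTOP-001?"))
print(choose_services("What changed yesterday?"))
print(choose_services("Which devices were enrolled?"))
```

Because keywords are compared against whole tokens, a plural such as "devices" does not match the "device" entry in this sketch and falls through to the default set.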
@@ -283,7 +543,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
        llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
    else:
        try:
-            answer = await _call_llm(question, events)
+            answer = await _call_llm(question, events, total=total, excluded_services=excluded_services)
            llm_used = True
        except Exception as exc:
            llm_error = f"LLM call failed: {exc}"
@@ -291,9 +551,11 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):

    # Fallback: structured summary if LLM unavailable or failed
    if not answer:
-        parts = [f"Found {len(events)} event(s)"]
+        parts = [f"Found {total} event(s)"]
        if entity:
            parts.append(f"related to **{entity}**")
+        if excluded_services:
+            parts.append(f"(excluding {', '.join(excluded_services)})")
        parts.append(f"between {start[:10]} and {end[:10]}.\n")

        for i, e in enumerate(events[:10], 1):
@@ -317,6 +579,9 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
            "start": start,
            "end": end,
            "event_count": len(events),
+            "total_matched": total,
+            "services_queried": query_services,
+            "excluded_services": excluded_services,
            "mongo_query": json.dumps(query, default=str),
        },
        llm_used=llm_used,
--- a/backend/routes/config.py
+++ b/backend/routes/config.py
@@ -1,4 +1,5 @@
 from config import (
+    AI_FEATURES_ENABLED,
    AUTH_CLIENT_ID,
    AUTH_ENABLED,
    AUTH_SCOPE,
@@ -18,3 +19,10 @@ def auth_config():
        "scope": AUTH_SCOPE,
        "redirect_uri": None,  # frontend uses window.location.origin by default
    }
+
+
+@router.get("/config/features")
+def features_config():
+    return {
+        "ai_features_enabled": AI_FEATURES_ENABLED,
+    }
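
The endpoint above exposes `AI_FEATURES_ENABLED`, but the diff does not show how `config.py` reads it. The following is one possible sketch, assuming the value arrives as a string environment variable as in `.env.example`; `env_flag` and its accepted spellings are assumptions, not the project's actual parser.

```python
import os


def env_flag(name, default=True):
    """Hypothetical parser: read a boolean flag from the environment."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


os.environ["AI_FEATURES_ENABLED"] = "false"
features = {"ai_features_enabled": env_flag("AI_FEATURES_ENABLED")}
print(features)  # {'ai_features_enabled': False}
```

A frontend can then read the `ai_features_enabled` key from `GET /api/config/features` and hide the Ask panel when it is false.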
--- a/backend/tests/test_api.py
+++ b/backend/tests/test_api.py
@@ -1,6 +1,41 @@
 from datetime import UTC, datetime


+def test_config_features(client):
+    response = client.get("/api/config/features")
+    assert response.status_code == 200
+    data = response.json()
+    assert "ai_features_enabled" in data
+    assert isinstance(data["ai_features_enabled"], bool)
+
+
+def test_ask_disabled_when_ai_features_off():
+    import subprocess
+    import sys
+
+    code = """
+import sys
+sys.path.insert(0, '.')
+import os
+os.environ['AI_FEATURES_ENABLED'] = 'false'
+
+# Re-import config with the env override
+import importlib
+import config
+importlib.reload(config)
+
+# Now import main; it will pick up the new AI_FEATURES_ENABLED
+import main
+ask_paths = [r.path for r in main.app.routes if hasattr(r, 'path') and 'ask' in r.path]
+print('ASK_PATHS:', ask_paths)
+assert len(ask_paths) == 0, f"Expected no ask routes, found: {ask_paths}"
+print('OK')
+"""
+    result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True, cwd=".")
+    assert result.returncode == 0, f"Subprocess failed: {result.stdout}\n{result.stderr}"
+    assert "OK" in result.stdout
+
+
 def test_health(client):
    response = client.get("/health")
    assert response.status_code == 200
--- a/backend/tests/test_ask.py
+++ b/backend/tests/test_ask.py
@@ -236,7 +236,7 @@ class TestAskEndpoint:
            }
        )

-        async def fake_llm(question, events):
+        async def fake_llm(question, events, total=None, excluded_services=None):
            return "The device had a failed wipe attempt."

        monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
@@ -265,7 +265,7 @@ class TestAskEndpoint:
            }
        )

-        async def failing_llm(question, events):
+        async def failing_llm(question, events, total=None, excluded_services=None):
            raise RuntimeError("LLM service down")

        monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
@@ -277,3 +277,76 @@ class TestAskEndpoint:
        assert data["llm_used"] is False  # Falls back
        assert len(data["events"]) == 1
        assert "Found 1 event" in data["answer"]
+
+    def test_ask_with_explicit_filters(self, client, mock_events_collection):
+        now = datetime.now(UTC)
+        mock_events_collection.insert_one(
+            {
+                "id": "evt-exchange",
+                "timestamp": now.isoformat(),
+                "service": "Exchange",
+                "operation": "Update",
+                "result": "failure",
+                "actor_display": "Alice",
+                "target_displays": ["LAPTOP-001"],
+                "display_summary": "summary",
+                "raw_text": "raw",
+            }
+        )
+        mock_events_collection.insert_one(
+            {
+                "id": "evt-directory",
+                "timestamp": now.isoformat(),
+                "service": "Directory",
+                "operation": "Add user",
+                "result": "success",
+                "actor_display": "Alice",
+                "target_displays": ["LAPTOP-001"],
+                "display_summary": "summary",
+                "raw_text": "raw",
+            }
+        )
+        response = client.post(
+            "/api/ask",
+            json={"question": "What happened to LAPTOP-001?", "services": ["Exchange"], "result": "failure"},
+        )
+        assert response.status_code == 200
+        data = response.json()
+        assert data["query_info"]["event_count"] == 1
+        assert data["events"][0]["id"] == "evt-exchange"
+
+    def test_ask_with_explicit_actor_filter(self, client, mock_events_collection):
+        now = datetime.now(UTC)
+        mock_events_collection.insert_one(
+            {
+                "id": "evt-bob",
+                "timestamp": now.isoformat(),
+                "service": "Directory",
+                "operation": "Add user",
+                "result": "success",
+                "actor_display": "Bob",
+                "actor_upn": "bob@example.com",
+                "target_displays": ["USER-001"],
+                "display_summary": "summary",
+                "raw_text": "raw",
+            }
+        )
+        mock_events_collection.insert_one(
+            {
+                "id": "evt-alice",
+                "timestamp": now.isoformat(),
+                "service": "Directory",
+                "operation": "Remove user",
+                "result": "success",
+                "actor_display": "Alice",
+                "actor_upn": "alice@example.com",
+                "target_displays": ["USER-001"],
+                "display_summary": "summary",
+                "raw_text": "raw",
+            }
+        )
+        response = client.post("/api/ask", json={"question": "What happened to USER-001?", "actor": "bob"})
+        assert response.status_code == 200
+        data = response.json()
+        assert data["query_info"]["event_count"] == 1
+        assert data["events"][0]["id"] == "evt-bob"
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -14,7 +14,7 @@ services:
  backend:
    build: ./backend
    # For production, use the pre-built image instead:
-    # image: git.cqre.net/cqrenet/aoc-backend:v1.1.0
+    # image: git.cqre.net/cqrenet/aoc-backend:v1.2.5
    container_name: aoc-backend
    restart: always
    env_file:
Author	SHA1	Message	Date
Tomas Kracmar	60b6ad15c4	Release v1.3.0: AI feature flag and MCP server - Add AI_FEATURES_ENABLED config flag to gate AI/natural-language features - Conditionally register /api/ask router based on AI_FEATURES_ENABLED - Add GET /api/config/features endpoint for frontend feature detection - Update frontend to hide Ask panel when AI features are disabled - Implement standalone MCP server (backend/mcp_server.py) with tools: * search_events, get_event, get_summary, ask - Add mcp dependency to requirements.txt - Update .env.example, AGENTS.md, and ROADMAP.md - Bump VERSION to 1.3.0	2026-04-20 18:11:26 +02:00
Tomas Kracmar	b4e504a87b	feat: intent-aware querying + smart sampling for large audit datasets - Add keyword-based intent extraction: 'device' → Intune, 'user' → Directory, etc. - Broad questions without intent auto-exclude noisy services (Exchange, SharePoint) - Smart stratified sampling: failures always included, high-value services prioritised - Fetch up to 1000 events from MongoDB, then curate best 200 for the LLM - Excluded services noted in LLM prompt and query_info so the admin knows the scope	2026-04-20 17:41:21 +02:00
Tomas Kracmar	b728abb5ee	ci: also tag and push 'latest' on every release	2026-04-20 17:31:27 +02:00
Tomas Kracmar	d100388c7d	chore(release): bump version to 1.2.6	2026-04-20 17:29:10 +02:00
Tomas Kracmar	11fd87411d	fix: bake version into Docker image at build time - Add VERSION build arg to Dockerfile - Pass --build-arg VERSION in release workflow - Remove VERSION env override from docker-compose files - Version is now immutable inside the image, no runtime env var needed	2026-04-20 17:24:20 +02:00
Tomas Kracmar	6a80bf4eb9	fix: read version from env var so it works inside Docker	2026-04-20 17:15:55 +02:00
Tomas Kracmar	5e02f5a402	docs: add v1.2.5 release notes	2026-04-20 17:12:43 +02:00
Tomas Kracmar	0c3e5ec57b	feat: add version display to frontend and /api/version endpoint (v1.2.5) - Add GET /api/version endpoint that reads VERSION file - Frontend fetches version on init and displays it as a badge in the header - Add version-badge CSS styling - Update docker-compose.yml comment to v1.2.5	2026-04-20 17:09:02 +02:00
Tomas Kracmar	a255be93fe	feat: aggregate large event sets before sending to LLM When a query matches >50 events, the LLM now receives: - Aggregated counts by service, operation, result, and actor - A list of failures (up to 10) - The 50 most recent raw events as samples This scales to thousands of events without blowing the token budget or losing signal. The LLM gets a bird's-eye view plus concrete examples. Also updates the system prompt to handle both individual event lists and aggregated overviews correctly.	2026-04-20 16:23:55 +02:00
Tomas Kracmar	cfe9397cc5	feat: raise LLM event limit to 200 and show total count awareness - Bump LLM_MAX_EVENTS default from 50 to 200 - Add total_matched count to /api/ask response - Include 'Showing X of Y total' header in LLM prompt so the model knows when its view is a subset and avoids false certainty - Update system prompt to instruct acknowledging scale when truncated - Update test mocks to accept new total parameter	2026-04-20 16:13:52 +02:00
Tomas Kracmar	cf0283b20b	feat: natural language queries respect UI filters (v1.2.0) - AskRequest now accepts optional filter fields: services, actor, operation, result, start, end, include_tags, exclude_tags - ask_question merges NL-extracted constraints with explicit UI filters - Frontend sends active filter state with every ask request - Show filter hint below ask input when filters are active - Add tests for service+result filtering and actor filtering in /api/ask Bump version to 1.2.0	2026-04-20 16:07:35 +02:00
Tomas Kracmar	28542f7b80	docs: add v1.1.0 release notes	2026-04-20 16:04:24 +02:00
@@ -1 +1 @@
-1.1.0
+1.3.0