9 Commits

Author SHA1 Message Date
a5db0d363d chore: bump version to 1.3.1
All checks were successful
Release / build-and-push (push) Successful in 1m16s
CI / lint-and-test (push) Successful in 25s
2026-04-21 11:28:32 +02:00
43582692ba ui: fix page title and hero text to match product name
All checks were successful
CI / lint-and-test (push) Successful in 38s
2026-04-21 07:41:41 +02:00
5122739c01 feat: MCP server over SSE with OIDC auth
All checks were successful
CI / lint-and-test (push) Successful in 36s
- Extract shared MCP tool handlers to mcp_common.py
- mcp_server.py now uses shared handlers (stdio transport for local dev)
- New routes/mcp.py: SSE transport behind existing OIDC Bearer auth
- Mount MCP ASGI app at /mcp in main.py when AI_FEATURES_ENABLED
- /mcp/sse  -> establishes SSE stream (requires valid token when auth enabled)
- /mcp/messages/ -> receives MCP client messages
- Update README with SSE MCP docs
- Add tests for mount existence, auth, and message routing
2026-04-21 07:38:12 +02:00
6cf5c0a28b ui: move filters section before ask section
All checks were successful
CI / lint-and-test (push) Successful in 27s
2026-04-20 18:17:09 +02:00
6aa47e9b1e docs: update README and ROADMAP for v1.3.0
All checks were successful
CI / lint-and-test (push) Successful in 27s
2026-04-20 18:14:28 +02:00
60b6ad15c4 Release v1.3.0: AI feature flag and MCP server
All checks were successful
CI / lint-and-test (push) Successful in 45s
Release / build-and-push (push) Successful in 1m34s
- Add AI_FEATURES_ENABLED config flag to gate AI/natural-language features
- Conditionally register /api/ask router based on AI_FEATURES_ENABLED
- Add GET /api/config/features endpoint for frontend feature detection
- Update frontend to hide Ask panel when AI features are disabled
- Implement standalone MCP server (backend/mcp_server.py) with tools:
  * search_events, get_event, get_summary, ask
- Add mcp dependency to requirements.txt
- Update .env.example, AGENTS.md, and ROADMAP.md
- Bump VERSION to 1.3.0
2026-04-20 18:11:26 +02:00
b4e504a87b feat: intent-aware querying + smart sampling for large audit datasets
All checks were successful
Release / build-and-push (push) Successful in 1m31s
CI / lint-and-test (push) Successful in 34s
- Add keyword-based intent extraction: 'device' → Intune, 'user' → Directory, etc.
- Broad questions without intent auto-exclude noisy services (Exchange, SharePoint)
- Smart stratified sampling: failures always included, high-value services prioritised
- Fetch up to 1000 events from MongoDB, then curate best 200 for the LLM
- Excluded services noted in LLM prompt and query_info so the admin knows the scope
2026-04-20 17:41:21 +02:00
b728abb5ee ci: also tag and push 'latest' on every release
All checks were successful
CI / lint-and-test (push) Successful in 22s
2026-04-20 17:31:27 +02:00
d100388c7d chore(release): bump version to 1.2.6
All checks were successful
CI / lint-and-test (push) Successful in 31s
Release / build-and-push (push) Successful in 1m17s
2026-04-20 17:29:10 +02:00
18 changed files with 846 additions and 87 deletions

View File

@@ -34,6 +34,10 @@ SIEM_WEBHOOK_URL=
# Optional: enable rule-based alerting during ingestion # Optional: enable rule-based alerting during ingestion
ALERTS_ENABLED=false ALERTS_ENABLED=false
# Optional: enable AI/natural-language features (/api/ask, MCP server)
# Set to false to completely disable AI endpoints and UI elements
AI_FEATURES_ENABLED=true
# Optional: LLM configuration for natural language querying (/api/ask) # Optional: LLM configuration for natural language querying (/api/ask)
# Supports any OpenAI-compatible API (OpenAI, Azure OpenAI, Ollama, etc.) # Supports any OpenAI-compatible API (OpenAI, Azure OpenAI, Ollama, etc.)
# For Azure OpenAI / MS Foundry, set BASE_URL to your deployment endpoint # For Azure OpenAI / MS Foundry, set BASE_URL to your deployment endpoint

View File

@@ -18,5 +18,11 @@ jobs:
- name: Build Docker image - name: Build Docker image
run: docker build ./backend --build-arg VERSION=${{ gitea.ref_name }} --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} run: docker build ./backend --build-arg VERSION=${{ gitea.ref_name }} --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
- name: Push Docker image - name: Tag as latest
run: docker tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} git.cqre.net/cqrenet/aoc-backend:latest
- name: Push version tag
run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
- name: Push latest tag
run: docker push git.cqre.net/cqrenet/aoc-backend:latest

View File

@@ -6,28 +6,34 @@ AOC is a FastAPI microservice that ingests Microsoft Entra (Azure AD) audit logs
## Technology Stack ## Technology Stack
- **Runtime**: Python 3.11 - **Runtime**: Python 3.11 (3.14 for tests)
- **Web Framework**: FastAPI + Uvicorn - **Web Framework**: FastAPI + Uvicorn (Gunicorn in production)
- **Database**: MongoDB (PyMongo) - **Database**: MongoDB (PyMongo)
- **Frontend**: Vanilla HTML/CSS/JS (served as static files from `backend/frontend/`) - **Frontend**: Alpine.js + HTML/CSS (served as static files from `backend/frontend/`)
- **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend) - **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend)
- **External APIs**: Microsoft Graph API, Office 365 Management Activity API - **External APIs**: Microsoft Graph API, Office 365 Management Activity API, Azure OpenAI / MS Foundry
- **Deployment**: Docker Compose - **Deployment**: Docker Compose (dev), Docker Compose + nginx (prod)
- **CI/CD**: Gitea Actions (lint + test + Docker build + release)
## Project Structure ## Project Structure
``` ```
backend/ backend/
main.py # FastAPI app, router registration, background periodic fetch main.py # FastAPI app, router registration, background periodic fetch
config.py # Environment-based configuration (loads .env) config.py # Pydantic Settings configuration (loads .env)
database.py # MongoClient setup (db = micro_soc, collection = events) database.py # MongoClient setup (db = micro_soc, collection = events)
auth.py # OIDC Bearer token validation, JWKS caching, role/group checks auth.py # OIDC Bearer token validation, JWKS caching, role/group checks
requirements.txt # Python dependencies requirements.txt # Python dependencies
Dockerfile # python:3.11-slim image Dockerfile # python:3.11-slim image, non-root user, version baked at build
mcp_server.py # Standalone MCP server for Claude Desktop / Cursor integration
routes/ routes/
fetch.py # GET /api/fetch-audit-logs, run_fetch() fetch.py # GET /api/fetch-audit-logs, run_fetch()
events.py # GET /api/events, GET /api/filter-options events.py # GET /api/events, GET /api/filter-options, PATCH tags, POST comments
config.py # GET /api/config/auth config.py # GET /api/config/auth, GET /api/config/features
ask.py # POST /api/ask — natural language query with LLM
health.py # GET /health, GET /metrics
rules.py # Rule-based alerting endpoints
webhooks.py # Microsoft Graph change notification webhooks
graph/ graph/
auth.py # Client credentials token acquisition for Graph auth.py # Client credentials token acquisition for Graph
audit_logs.py # Fetch and enrich directory audit logs from Graph audit_logs.py # Fetch and enrich directory audit logs from Graph
@@ -41,7 +47,7 @@ backend/
mappings.yml # User-editable category labels and summary templates mappings.yml # User-editable category labels and summary templates
maintenance.py # CLI for re-normalization and deduplication of stored events maintenance.py # CLI for re-normalization and deduplication of stored events
frontend/ frontend/
index.html # Single-page UI with filters, pagination, raw-event modal index.html # Single-page UI with filters, pagination, ask panel, raw-event modal
style.css # Dark-themed stylesheet style.css # Dark-themed stylesheet
``` ```
@@ -60,6 +66,9 @@ Key variables:
- `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists - `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists
- `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler - `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler
- `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB - `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB
- `AI_FEATURES_ENABLED` — set `false` to completely disable AI endpoints and UI (default `true`)
- `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`, `LLM_MAX_EVENTS`, `LLM_TIMEOUT_SECONDS` — LLM provider settings
- `LLM_API_VERSION` — required for Azure OpenAI / MS Foundry endpoints
## Build and Run Commands ## Build and Run Commands
@@ -87,35 +96,81 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
## API Endpoints ## API Endpoints
- `GET /api/fetch-audit-logs?hours=168` — pulls last N hours (capped at 720 / 30 days) from all sources, normalizes, dedupes, and upserts into MongoDB - `GET /api/fetch-audit-logs?hours=168` — pulls last N hours (capped at 720 / 30 days) from all sources, normalizes, dedupes, and upserts into MongoDB
- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and pagination (`page`, `page_size`) - `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and cursor-based pagination
- `GET /api/filter-options` — best-effort distinct values for UI dropdowns - `GET /api/filter-options` — best-effort distinct values for UI dropdowns
- `GET /api/config/auth` — auth configuration exposed to the frontend - `GET /api/config/auth` — auth configuration exposed to the frontend
- `GET /api/config/features` — feature flags (`ai_features_enabled`)
- `POST /api/ask` — natural language query; returns LLM narrative + referenced events (only when `AI_FEATURES_ENABLED=true`)
- `GET /health` — liveness probe with DB connectivity
- `GET /metrics` — Prometheus metrics
## MCP Server
A standalone MCP server (`backend/mcp_server.py`) exposes audit log tools for Claude Desktop, Cursor, and other MCP clients.
Available tools:
- `search_events` — Search by entity, service, operation, result, time range
- `get_event` — Retrieve a single event by ID (raw JSON)
- `get_summary` — Aggregated counts by service, operation, result, actor
- `ask` — Natural language question (returns recent events + guidance)
**Claude Desktop config** (`~/.config/claude/claude_desktop_config.json`):
```json
{
"mcpServers": {
"aoc": {
"command": "python",
"args": ["/path/to/aoc/backend/mcp_server.py"],
"env": {"MONGO_URI": "mongodb://root:example@localhost:27017/"}
}
}
}
```
The MCP server imports `database.py` directly and does not go through the FastAPI layer, so it shares the same MongoDB connection but bypasses auth.
## AI Feature Flag
Set `AI_FEATURES_ENABLED=false` in `.env` to:
- Prevent the `ask` router from being registered in FastAPI
- Hide the "Ask a question" panel in the frontend
- Return `ai_features_enabled: false` from `/api/config/features`
This is intended for the open-core monetization split: core features (ingestion, filtering, search, export) are always available; premium AI features (NLQ, MCP) can be disabled.
## Code Conventions ## Code Conventions
- Python modules use absolute imports within the `backend/` package (e.g., `from graph.auth import get_access_token`). When running locally, ensure the working directory is `backend/` so these resolve correctly. - Python modules use absolute imports within the `backend/` package (e.g., `from graph.auth import get_access_token`). When running locally, ensure the working directory is `backend/` so these resolve correctly.
- No formal formatter or linter is configured. Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings. - The project uses `ruff` for linting and formatting. Run `ruff check . && ruff format .` before committing.
- The frontend is a single HTML file with inline JavaScript. It relies on the MSAL.js CDN (`https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js`). - Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
- The frontend is a single HTML file with inline JavaScript and Alpine.js.
## Testing ## Testing
There are currently **no automated tests** in this repository. When adding new features or bug fixes, verify behavior manually: Tests run with pytest and mongomock (no real MongoDB required):
1. Start the server (Docker Compose or local uvicorn). ```bash
2. Run a smoke test: cd backend
```bash python -m venv .venv_test
curl http://localhost:8000/api/events source .venv_test/bin/activate
curl http://localhost:8000/api/fetch-audit-logs pip install -r requirements.txt
``` pytest tests/ -q
3. Open http://localhost:8000 in a browser, apply filters, paginate, and click "View raw event". ```
When adding new features or bug fixes, add or update tests in `backend/tests/`. The test suite covers:
- Event normalization and deduplication
- Auth middleware and token validation
- API endpoints (`/api/events`, `/api/fetch-audit-logs`, `/api/ask`)
- NLQ time range extraction, entity extraction, query building
## Security Considerations ## Security Considerations
- **Secrets**: `CLIENT_SECRET` and other credentials come from `.env`. Never commit `.env`. - **Secrets**: `CLIENT_SECRET`, `LLM_API_KEY`, and other credentials come from `.env`. Never commit `.env`.
- **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer claims. Tokens are decoded without strict signature verification (`jwt.get_unverified_claims`), so the tenant and issuer checks are the primary gate. - **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer claims. Tokens are decoded without strict signature verification (`jwt.get_unverified_claims`), so the tenant and issuer checks are the primary gate.
- **Role/Group gating**: Access is allowed if the tokens `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed. - **Role/Group gating**: Access is allowed if the tokens `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed.
- **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries. - **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries.
- **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls. - **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls.
- **MCP server**: The MCP server bypasses auth entirely. Only run it in trusted environments or behind a VPN.
## Maintenance and Operations ## Maintenance and Operations

View File

@@ -9,6 +9,8 @@ FastAPI microservice that ingests Microsoft Entra (Azure AD) and other admin aud
- Office 365 Management Activity API client for Exchange/SharePoint/Teams admin audit logs. - Office 365 Management Activity API client for Exchange/SharePoint/Teams admin audit logs.
- Frontend served from the backend for filtering/searching events and viewing raw entries. - Frontend served from the backend for filtering/searching events and viewing raw entries.
- Optional OIDC bearer auth (Entra) to protect the API/UI and gate access by roles/groups. - Optional OIDC bearer auth (Entra) to protect the API/UI and gate access by roles/groups.
- Natural language query (`/api/ask`) powered by LLM (OpenAI, Azure OpenAI, or any compatible API).
- MCP server for Claude Desktop / Cursor integration.
## Prerequisites (macOS) ## Prerequisites (macOS)
- Python 3.11 - Python 3.11
@@ -38,6 +40,15 @@ cp .env.example .env
# Optional: CORS origins if the frontend is served separately # Optional: CORS origins if the frontend is served separately
# CORS_ORIGINS=http://localhost:3000,https://app.example.com # CORS_ORIGINS=http://localhost:3000,https://app.example.com
# Optional: enable AI/natural-language features (/api/ask, MCP server)
# AI_FEATURES_ENABLED=true
# Optional: LLM configuration for natural language querying
# LLM_API_KEY=...
# LLM_BASE_URL=https://api.openai.com/v1
# LLM_MODEL=gpt-4o-mini
# LLM_TIMEOUT_SECONDS=30
``` ```
## Run with Docker Compose (recommended) ## Run with Docker Compose (recommended)
@@ -66,6 +77,7 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
## API ## API
- `GET /health` — health check with MongoDB connectivity status. - `GET /health` — health check with MongoDB connectivity status.
- `GET /metrics` — Prometheus metrics for request latency, fetch volume, and errors. - `GET /metrics` — Prometheus metrics for request latency, fetch volume, and errors.
- `GET /api/version` — running version (baked into the Docker image at build time).
- `GET /api/fetch-audit-logs` — pulls the last 7 days by default (override with `?hours=N`, capped to 30 days) of: - `GET /api/fetch-audit-logs` — pulls the last 7 days by default (override with `?hours=N`, capped to 30 days) of:
- Entra directory audit logs (`/auditLogs/directoryAudits`) - Entra directory audit logs (`/auditLogs/directoryAudits`)
- Exchange/SharePoint/Teams admin audits (via Office 365 Management Activity API) - Exchange/SharePoint/Teams admin audits (via Office 365 Management Activity API)
@@ -82,11 +94,33 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
- `GET /api/source-health` — last fetch status for each ingestion source (`directory`, `unified`, `intune`). - `GET /api/source-health` — last fetch status for each ingestion source (`directory`, `unified`, `intune`).
- `PATCH /api/events/{id}/tags` — update tags on an event (e.g., `investigating`, `false_positive`). - `PATCH /api/events/{id}/tags` — update tags on an event (e.g., `investigating`, `false_positive`).
- `POST /api/events/{id}/comments` — add a comment to an event. - `POST /api/events/{id}/comments` — add a comment to an event.
- `POST /api/ask` — natural language query. Returns a narrative answer + referenced events. Supports time ranges, entity names, and respects active UI filters. Only available when `AI_FEATURES_ENABLED=true`.
- `GET /api/config/features` — feature flags (`ai_features_enabled`).
- `GET /api/rules` — list alert rules. - `GET /api/rules` — list alert rules.
- `POST /api/rules` — create an alert rule. - `POST /api/rules` — create an alert rule.
- `PUT /api/rules/{id}` — update an alert rule. - `PUT /api/rules/{id}` — update an alert rule.
- `DELETE /api/rules/{id}` — delete an alert rule. - `DELETE /api/rules/{id}` — delete an alert rule.
### MCP Server
AOC exposes an MCP interface in two forms:
**1. HTTP/SSE (production)** — mounted at `/mcp` inside the FastAPI app, behind OIDC auth:
- `GET /mcp/sse` — establish SSE stream (requires Bearer token if `AUTH_ENABLED=true`)
- `POST /mcp/messages/?session_id=...` — send tool calls
This is the recommended way to use MCP against a remote deployment like `aoc.cqre.net`. Any MCP client that supports SSE transport (e.g. Cursor, Claude Desktop with an SSE bridge, or custom scripts) can connect using the same Entra token as the web UI.
**2. stdio (local development)**`python backend/mcp_server.py`:
- Runs as a local subprocess for Claude Desktop
- Connects directly to MongoDB (bypasses FastAPI auth)
- Useful for local development when you have the repo cloned and MongoDB running locally
Available tools (both transports):
- `search_events` — filter by entity, service, operation, result, time range.
- `get_event` — retrieve raw event JSON by ID.
- `get_summary` — aggregated summary (service, operation, result, actor counts) for the last N days.
- `ask` — natural language query returning recent events.
Stored document shape (collection `micro_soc.events`): Stored document shape (collection `micro_soc.events`):
```json ```json
{ {

View File

@@ -59,5 +59,15 @@ Goal: evolve from a polling dashboard into a full security operations tool.
--- ---
## Phase 5: Intelligence
Goal: add AI-powered analysis and external tool integration.
- [x] AI feature flag (`AI_FEATURES_ENABLED`) to gate LLM-dependent features
- [x] Natural language query endpoint (`/api/ask`) with intent extraction and smart sampling
- [x] MCP (Model Context Protocol) server for Claude Desktop / Cursor integration
- [ ] Advanced analytics dashboard (trending operations, anomaly detection)
- [ ] Redis caching for LLM responses and frequent queries
- [ ] Async queue for LLM requests to prevent timeout/cost explosions at scale
## Completed in this PR ## Completed in this PR
All Phase 1 items were implemented in the latest changes. All Phase 5 items marked done were implemented in v1.3.0.

View File

@@ -1 +1 @@
1.2.5 1.3.1

View File

@@ -42,7 +42,8 @@ class Settings(BaseSettings):
# Alerting # Alerting
ALERTS_ENABLED: bool = False ALERTS_ENABLED: bool = False
# LLM / Natural Language Query # AI / Natural Language Query
AI_FEATURES_ENABLED: bool = True
LLM_API_KEY: str = "" LLM_API_KEY: str = ""
LLM_BASE_URL: str = "https://api.openai.com/v1" LLM_BASE_URL: str = "https://api.openai.com/v1"
LLM_MODEL: str = "gpt-4o-mini" LLM_MODEL: str = "gpt-4o-mini"
@@ -77,6 +78,7 @@ SIEM_ENABLED = _settings.SIEM_ENABLED
SIEM_WEBHOOK_URL = _settings.SIEM_WEBHOOK_URL SIEM_WEBHOOK_URL = _settings.SIEM_WEBHOOK_URL
ALERTS_ENABLED = _settings.ALERTS_ENABLED ALERTS_ENABLED = _settings.ALERTS_ENABLED
AI_FEATURES_ENABLED = _settings.AI_FEATURES_ENABLED
LLM_API_KEY = _settings.LLM_API_KEY LLM_API_KEY = _settings.LLM_API_KEY
LLM_BASE_URL = _settings.LLM_BASE_URL LLM_BASE_URL = _settings.LLM_BASE_URL
LLM_MODEL = _settings.LLM_MODEL LLM_MODEL = _settings.LLM_MODEL

View File

@@ -3,7 +3,7 @@
<head> <head>
<meta charset="UTF-8" /> <meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AOC Events</title> <title>Admin Operations Center</title>
<link rel="stylesheet" href="/style.css?v=8" /> <link rel="stylesheet" href="/style.css?v=8" />
<script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script> <script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
<script src="https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js" crossorigin="anonymous"></script> <script src="https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js" crossorigin="anonymous"></script>
@@ -13,8 +13,8 @@
<header class="hero"> <header class="hero">
<div> <div>
<p class="eyebrow">Admin Operations Center <span class="version-badge" x-text="appVersion"></span></p> <p class="eyebrow">Admin Operations Center <span class="version-badge" x-text="appVersion"></span></p>
<h1>Directory Audit Explorer</h1> <h1>Audit Log Explorer</h1>
<p class="lede">Filter Microsoft Entra audit events by user, app, time, action, and action type.</p> <p class="lede">Search and review Microsoft audit events from Entra, Intune, Exchange, SharePoint, and Teams.</p>
</div> </div>
<div class="cta"> <div class="cta">
<button id="authBtn" class="ghost" aria-label="Login" x-text="authBtnText" @click="toggleAuth()"></button> <button id="authBtn" class="ghost" aria-label="Login" x-text="authBtnText" @click="toggleAuth()"></button>
@@ -38,49 +38,6 @@
</div> </div>
</section> </section>
<section class="panel">
<h3>Ask a question</h3>
<form class="ask-form" @submit.prevent="askQuestion()">
<div class="ask-row">
<input
type="text"
placeholder="What happened to device ABC123 in the last 3 days?"
x-model="askQuestionText"
class="ask-input"
/>
<button type="submit" :disabled="askLoading" x-text="askLoading ? 'Thinking…' : 'Ask'">Ask</button>
</div>
<div x-show="hasActiveFilters()" class="ask-filter-hint">
<small>Respecting active filters: <span x-text="activeFilterSummary()"></span></small>
</div>
</form>
<template x-if="askAnswer">
<div class="ask-result">
<div x-show="askLlmError" class="ask-error" x-text="askLlmError"></div>
<div class="ask-answer" x-html="askAnswerHtml"></div>
<template x-if="askEvents.length">
<div class="ask-events">
<h4>Referenced events</h4>
<template x-for="(evt, idx) in askEvents" :key="evt.id || idx">
<article class="event event--compact">
<div class="event__meta">
<span class="pill" x-text="evt.display_category || evt.service || '—'"></span>
<span class="pill" :class="['success','succeeded','ok','passed'].includes((evt.result || '').toLowerCase()) ? 'pill--ok' : 'pill--warn'" x-text="evt.result || '—'"></span>
</div>
<h3 x-text="evt.operation || '—'"></h3>
<p class="event__detail" x-show="evt.display_summary"><strong>Summary:</strong> <span x-text="evt.display_summary"></span></p>
<p class="event__detail"><strong>Actor:</strong> <span x-text="evt.actor_display || '—'"></span></p>
<p class="event__detail"><strong>Target:</strong> <span x-text="Array.isArray(evt.target_displays) ? evt.target_displays.join(', ') : '—'"></span></p>
<p class="event__detail"><strong>When:</strong> <span x-text="evt.timestamp ? new Date(evt.timestamp).toLocaleString() : '—'"></span></p>
</article>
</template>
</div>
</template>
<button type="button" class="ghost" @click="clearAsk()">Clear</button>
</div>
</template>
</section>
<section class="panel"> <section class="panel">
<form id="filters" class="filters" @submit.prevent="resetPagination(); loadEvents()"> <form id="filters" class="filters" @submit.prevent="resetPagination(); loadEvents()">
<div class="filter-row"> <div class="filter-row">
@@ -163,6 +120,49 @@
</form> </form>
</section> </section>
<section class="panel" x-show="aiFeaturesEnabled">
<h3>Ask a question</h3>
<form class="ask-form" @submit.prevent="askQuestion()">
<div class="ask-row">
<input
type="text"
placeholder="What happened to device ABC123 in the last 3 days?"
x-model="askQuestionText"
class="ask-input"
/>
<button type="submit" :disabled="askLoading" x-text="askLoading ? 'Thinking…' : 'Ask'">Ask</button>
</div>
<div x-show="hasActiveFilters()" class="ask-filter-hint">
<small>Respecting active filters: <span x-text="activeFilterSummary()"></span></small>
</div>
</form>
<template x-if="askAnswer">
<div class="ask-result">
<div x-show="askLlmError" class="ask-error" x-text="askLlmError"></div>
<div class="ask-answer" x-html="askAnswerHtml"></div>
<template x-if="askEvents.length">
<div class="ask-events">
<h4>Referenced events</h4>
<template x-for="(evt, idx) in askEvents" :key="evt.id || idx">
<article class="event event--compact">
<div class="event__meta">
<span class="pill" x-text="evt.display_category || evt.service || '—'"></span>
<span class="pill" :class="['success','succeeded','ok','passed'].includes((evt.result || '').toLowerCase()) ? 'pill--ok' : 'pill--warn'" x-text="evt.result || '—'"></span>
</div>
<h3 x-text="evt.operation || '—'"></h3>
<p class="event__detail" x-show="evt.display_summary"><strong>Summary:</strong> <span x-text="evt.display_summary"></span></p>
<p class="event__detail"><strong>Actor:</strong> <span x-text="evt.actor_display || '—'"></span></p>
<p class="event__detail"><strong>Target:</strong> <span x-text="Array.isArray(evt.target_displays) ? evt.target_displays.join(', ') : '—'"></span></p>
<p class="event__detail"><strong>When:</strong> <span x-text="evt.timestamp ? new Date(evt.timestamp).toLocaleString() : '—'"></span></p>
</article>
</template>
</div>
</template>
<button type="button" class="ghost" @click="clearAsk()">Clear</button>
</div>
</template>
</section>
<section class="panel"> <section class="panel">
<div class="panel-header"> <div class="panel-header">
<h2>Events</h2> <h2>Events</h2>
@@ -244,6 +244,7 @@
}, },
options: { actors: [], services: [], operations: [], results: [] }, options: { actors: [], services: [], operations: [], results: [] },
appVersion: '', appVersion: '',
aiFeaturesEnabled: true,
askQuestionText: '', askQuestionText: '',
askLoading: false, askLoading: false,
askAnswer: '', askAnswer: '',
@@ -302,6 +303,18 @@
this.authConfig = { auth_enabled: false }; this.authConfig = { auth_enabled: false };
} }
try {
const featRes = await fetch('/api/config/features');
if (featRes.ok) {
const featBody = await featRes.json();
this.aiFeaturesEnabled = featBody.ai_features_enabled !== false;
} else {
this.aiFeaturesEnabled = true;
}
} catch {
this.aiFeaturesEnabled = true;
}
if (!this.authConfig?.auth_enabled) { if (!this.authConfig?.auth_enabled) {
this.authBtnText = ''; this.authBtnText = '';
return; return;

View File

@@ -6,7 +6,7 @@ from pathlib import Path
import structlog import structlog
from audit_trail import log_action from audit_trail import log_action
from config import CORS_ORIGINS, ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES from config import AI_FEATURES_ENABLED, CORS_ORIGINS, ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES
from database import setup_indexes from database import setup_indexes
from fastapi import FastAPI, HTTPException, Request from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware from fastapi.middleware.cors import CORSMiddleware
@@ -14,7 +14,6 @@ from fastapi.responses import Response
from fastapi.staticfiles import StaticFiles from fastapi.staticfiles import StaticFiles
from metrics import observe_request, prometheus_metrics from metrics import observe_request, prometheus_metrics
from middleware import CorrelationIdMiddleware from middleware import CorrelationIdMiddleware
from routes.ask import router as ask_router
from routes.config import router as config_router from routes.config import router as config_router
from routes.events import router as events_router from routes.events import router as events_router
from routes.fetch import router as fetch_router from routes.fetch import router as fetch_router
@@ -113,7 +112,13 @@ app.include_router(events_router, prefix="/api")
app.include_router(config_router, prefix="/api") app.include_router(config_router, prefix="/api")
app.include_router(webhooks_router, prefix="/api") app.include_router(webhooks_router, prefix="/api")
app.include_router(health_router, prefix="/api") app.include_router(health_router, prefix="/api")
app.include_router(ask_router, prefix="/api") if AI_FEATURES_ENABLED:
from routes.ask import router as ask_router
app.include_router(ask_router, prefix="/api")
from routes.mcp import mcp_asgi
app.mount("/mcp", mcp_asgi)
app.include_router(rules_router, prefix="/api") app.include_router(rules_router, prefix="/api")

187
backend/mcp_common.py Normal file
View File

@@ -0,0 +1,187 @@
"""Shared MCP tool handlers used by both stdio and SSE transports."""
import json
from datetime import UTC, datetime, timedelta
from database import events_collection
from mcp.types import TextContent
async def handle_search_events(arguments: dict) -> list[TextContent]:
days = arguments.get("days", 7)
limit = min(arguments.get("limit", 20), 100)
since = (datetime.now(UTC) - timedelta(days=days)).isoformat().replace("+00:00", "Z")
filters = [{"timestamp": {"$gte": since}}]
services = arguments.get("services")
if services:
filters.append({"service": {"$in": services}})
operation = arguments.get("operation")
if operation:
filters.append({"operation": {"$regex": operation, "$options": "i"}})
result = arguments.get("result")
if result:
filters.append({"result": {"$regex": result, "$options": "i"}})
entity = arguments.get("entity")
if entity:
entity_safe = entity.replace(".", "\\.").replace("(", "\\(").replace(")", "\\)")
filters.append(
{
"$or": [
{"target_displays": {"$elemMatch": {"$regex": entity_safe, "$options": "i"}}},
{"actor_display": {"$regex": entity_safe, "$options": "i"}},
{"actor_upn": {"$regex": entity_safe, "$options": "i"}},
{"raw_text": {"$regex": entity_safe, "$options": "i"}},
]
}
)
query = {"$and": filters}
cursor = events_collection.find(query).sort("timestamp", -1).limit(limit)
events = list(cursor)
if not events:
return [TextContent(type="text", text="No matching events found.")]
lines = [f"Found {len(events)} event(s):\n"]
for e in events:
ts = e.get("timestamp", "?")[:16].replace("T", " ")
svc = e.get("service", "?")
op = e.get("operation", "?")
actor = e.get("actor_display", "?")
result_str = e.get("result", "?")
lines.append(f"{ts} | {svc} | {op} | {actor} | {result_str}")
return [TextContent(type="text", text="\n".join(lines))]
async def handle_get_event(arguments: dict) -> list[TextContent]:
event_id = arguments["event_id"]
event = events_collection.find_one({"id": event_id})
if not event:
return [TextContent(type="text", text=f"Event {event_id} not found.")]
event.pop("_id", None)
return [TextContent(type="text", text=json.dumps(event, indent=2, default=str))]
async def handle_get_summary(arguments: dict) -> list[TextContent]:
days = arguments.get("days", 7)
since = (datetime.now(UTC) - timedelta(days=days)).isoformat().replace("+00:00", "Z")
query = {"timestamp": {"$gte": since}}
total = events_collection.count_documents(query)
if total == 0:
return [TextContent(type="text", text="No events in the specified period.")]
svc_pipeline = [
{"$match": query},
{"$group": {"_id": "$service", "count": {"$sum": 1}}},
{"$sort": {"count": -1}},
{"$limit": 10},
]
op_pipeline = [
{"$match": query},
{"$group": {"_id": "$operation", "count": {"$sum": 1}}},
{"$sort": {"count": -1}},
{"$limit": 10},
]
result_pipeline = [
{"$match": query},
{"$group": {"_id": "$result", "count": {"$sum": 1}}},
{"$sort": {"count": -1}},
]
actor_pipeline = [
{"$match": query},
{"$group": {"_id": "$actor_display", "count": {"$sum": 1}}},
{"$sort": {"count": -1}},
{"$limit": 10},
]
svc_counts = list(events_collection.aggregate(svc_pipeline))
op_counts = list(events_collection.aggregate(op_pipeline))
result_counts = list(events_collection.aggregate(result_pipeline))
actor_counts = list(events_collection.aggregate(actor_pipeline))
lines = [f"Summary for the last {days} days ({total} total events)\n"]
lines.append("By service:")
for row in svc_counts:
lines.append(f" {row['_id'] or 'Unknown'}: {row['count']}")
lines.append("\nBy action:")
for row in op_counts:
lines.append(f" {row['_id'] or 'Unknown'}: {row['count']}")
lines.append("\nBy result:")
for row in result_counts:
lines.append(f" {row['_id'] or 'Unknown'}: {row['count']}")
lines.append("\nTop actors:")
for row in actor_counts:
lines.append(f" {row['_id'] or 'Unknown'}: {row['count']}")
return [TextContent(type="text", text="\n".join(lines))]
async def handle_ask(arguments: dict) -> list[TextContent]:
"""For now, returns recent events + guidance. In the future this could call the LLM backend."""
question = arguments["question"]
days = arguments.get("days", 7)
result = await handle_search_events({"entity": "", "days": days, "limit": 50})
base_text = result[0].text if result else ""
text = (
f"You asked: '{question}'\n\n"
f"Here are the most recent events from the last {days} days:\n\n"
f"{base_text}\n\n"
f"Tip: Use the 'search_events' tool with specific filters "
f"to narrow down the dataset before asking follow-up questions."
)
return [TextContent(type="text", text=text)]
# JSON schemas for tool definitions
SEARCH_EVENTS_SCHEMA = {
"type": "object",
"properties": {
"entity": {"type": "string", "description": "Device name, user UPN, or email to search for"},
"services": {
"type": "array",
"items": {"type": "string"},
"description": "Filter by service (e.g. Intune, Directory, Exchange)",
},
"operation": {"type": "string", "description": "Filter by operation name"},
"result": {"type": "string", "description": "Filter by result (success, failure)"},
"days": {"type": "integer", "description": "Number of days to look back (default 7)"},
"limit": {"type": "integer", "description": "Max events to return (default 20)"},
},
}
GET_EVENT_SCHEMA = {
"type": "object",
"properties": {
"event_id": {"type": "string", "description": "The event ID to retrieve"},
},
"required": ["event_id"],
}
GET_SUMMARY_SCHEMA = {
"type": "object",
"properties": {
"days": {"type": "integer", "description": "Number of days to summarise (default 7)"},
},
}
ASK_SCHEMA = {
"type": "object",
"properties": {
"question": {"type": "string", "description": "Natural language question about audit logs"},
"days": {"type": "integer", "description": "Number of days to look back (default 7)"},
},
"required": ["question"],
}

88
backend/mcp_server.py Normal file
View File

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""
AOC MCP Server — stdio transport
Standalone MCP server for local use (Claude Desktop, Cursor, etc.).
For the HTTP/SSE version (production, behind auth), see routes/mcp.py.
Usage:
python mcp_server.py
Claude Desktop config (~/.config/claude/claude_desktop_config.json):
{
"mcpServers": {
"aoc": {
"command": "python",
"args": ["/path/to/aoc/backend/mcp_server.py"],
"env": {"MONGO_URI": "mongodb://..."}
}
}
}
"""
import asyncio
import os
import sys
# Ensure backend modules are importable when run standalone
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import TextContent, Tool
from mcp_common import (
ASK_SCHEMA,
GET_EVENT_SCHEMA,
GET_SUMMARY_SCHEMA,
SEARCH_EVENTS_SCHEMA,
handle_ask,
handle_get_event,
handle_get_summary,
handle_search_events,
)
app = Server("aoc")
@app.list_tools()
async def list_tools() -> list[Tool]:
return [
Tool(
name="search_events",
description="Search audit events by entity, service, operation, or result.",
inputSchema=SEARCH_EVENTS_SCHEMA,
),
Tool(name="get_event", description="Retrieve a single audit event by its ID.", inputSchema=GET_EVENT_SCHEMA),
Tool(
name="get_summary",
description="Get an aggregated summary of audit activity for the last N days.",
inputSchema=GET_SUMMARY_SCHEMA,
),
Tool(
name="ask",
description="Ask a natural language question about audit logs. Returns a narrative answer.",
inputSchema=ASK_SCHEMA,
),
]
@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
if name == "search_events":
return await handle_search_events(arguments)
if name == "get_event":
return await handle_get_event(arguments)
if name == "get_summary":
return await handle_get_summary(arguments)
if name == "ask":
return await handle_ask(arguments)
raise ValueError(f"Unknown tool: {name}")
async def main():
async with stdio_server() as (read_stream, write_stream):
await app.run(read_stream, write_stream, app.create_initialization_options())
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -13,3 +13,4 @@ tenacity
prometheus-client prometheus-client
httpx httpx
gunicorn gunicorn
mcp

View File

@@ -13,6 +13,129 @@ from models.api import AskRequest, AskResponse
router = APIRouter(dependencies=[Depends(require_auth)]) router = APIRouter(dependencies=[Depends(require_auth)])
logger = structlog.get_logger("aoc.ask") logger = structlog.get_logger("aoc.ask")
# ---------------------------------------------------------------------------
# Intent extraction — map question keywords to relevant audit services
# ---------------------------------------------------------------------------
_SERVICE_INTENTS = {
"intune": ["Intune"],
"device": ["Intune", "Device"],
"laptop": ["Intune", "Device"],
"mobile": ["Intune", "Device"],
"phone": ["Intune", "Device"],
"ipad": ["Intune", "Device"],
"app": ["Intune", "ApplicationManagement"],
"application": ["Intune", "ApplicationManagement"],
"policy": ["Intune", "Policy"],
"compliance": ["Intune", "Policy"],
"user": ["Directory", "UserManagement"],
"group": ["Directory", "GroupManagement"],
"role": ["Directory", "RoleManagement"],
"permission": ["Directory", "RoleManagement"],
"license": ["Directory", "License"],
"email": ["Exchange"],
"mailbox": ["Exchange"],
"mail": ["Exchange"],
"message": ["Exchange", "Teams"],
"file": ["SharePoint"],
"sharepoint": ["SharePoint"],
"site": ["SharePoint"],
"document": ["SharePoint"],
"team": ["Teams"],
"channel": ["Teams"],
"meeting": ["Teams"],
"call": ["Teams"],
}
# Services that are extremely noisy for typical admin questions.
# We exclude them by default on broad questions unless the user explicitly mentions them.
_NOISY_SERVICES = {"Exchange", "SharePoint"}
# Services that are generally admin-relevant and kept by default.
_DEFAULT_ADMIN_SERVICES = {
"Directory",
"UserManagement",
"GroupManagement",
"RoleManagement",
"ApplicationManagement",
"Intune",
"Device",
"Policy",
"Teams",
"License",
}
def _extract_intent_services(question: str) -> tuple[list[str] | None, bool]:
"""
Extract relevant services from the question.
Returns:
(services, is_explicit):
- services: list of service names to query, or None for default admin set
- is_explicit: True if the user explicitly mentioned a noisy service
"""
q_lower = question.lower()
tokens = set(re.findall(r"\b[a-z]+\b", q_lower))
matched_services = set()
for token, services in _SERVICE_INTENTS.items():
if token in tokens:
matched_services.update(services)
if matched_services:
# User asked something specific — return exactly what they asked for
is_explicit = not matched_services.isdisjoint(_NOISY_SERVICES)
return sorted(matched_services), is_explicit
# Broad question with no clear intent — default to admin-relevant services only
return None, False
# ---------------------------------------------------------------------------
# Smart sampling — stratified by importance so the LLM sees signal, not noise
# ---------------------------------------------------------------------------
def _smart_sample(events: list[dict], max_events: int = 200) -> list[dict]:
"""
Return a curated subset that preserves diversity and prioritises signal.
Tiers:
1. Failures (always valuable)
2. High-admin-value services (Intune, Device, Directory, etc.)
3. Everything else
"""
if len(events) <= max_events:
return events
high_value = {
"Directory",
"UserManagement",
"GroupManagement",
"RoleManagement",
"Intune",
"Device",
"Policy",
"ApplicationManagement",
}
failures = [e for e in events if str(e.get("result") or "").lower() in ("failure", "failed")]
high_val = [e for e in events if e.get("service") in high_value and e not in failures]
rest = [e for e in events if e not in failures and e not in high_val]
# Allocate slots: half to failures+high-value, half to rest (but never let rest dominate)
slots = max_events
failure_cap = min(len(failures), max(10, slots // 4))
high_cap = min(len(high_val), max(20, slots // 4))
rest_cap = slots - failure_cap - high_cap
sampled = failures[:failure_cap] + high_val[:high_cap] + rest[:rest_cap]
# Sort back to chronological order
sampled.sort(key=lambda e: e.get("timestamp") or "", reverse=True)
return sampled
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Time-range extraction # Time-range extraction
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -203,12 +326,16 @@ def _aggregate_counts(events: list[dict]) -> dict:
} }
def _format_events_for_llm(events: list[dict], total: int | None = None) -> str: def _format_events_for_llm(
events: list[dict], total: int | None = None, excluded_services: list[str] | None = None
) -> str:
lines = [] lines = []
# If we have a large result set, send aggregation + samples instead of raw dump # If we have a large result set, send aggregation + samples instead of raw dump
if total is not None and total > len(events) and len(events) >= 50: if total is not None and total > len(events) and len(events) >= 50:
lines.append(f"Result set overview: {total} total events (showing the {len(events)} most recent).\n") lines.append(f"Result set overview: {total} total events (showing a curated sample of {len(events)}).\n")
if excluded_services:
lines.append(f"Note: high-volume services excluded by default: {', '.join(excluded_services)}.\n")
agg = _aggregate_counts(events) agg = _aggregate_counts(events)
lines.append("Breakdown by service:") lines.append("Breakdown by service:")
for svc, cnt in agg["services"]: for svc, cnt in agg["services"]:
@@ -267,11 +394,16 @@ def _build_chat_url(base_url: str, api_version: str) -> str:
return url return url
async def _call_llm(question: str, events: list[dict], total: int | None = None) -> str: async def _call_llm(
question: str,
events: list[dict],
total: int | None = None,
excluded_services: list[str] | None = None,
) -> str:
if not LLM_API_KEY: if not LLM_API_KEY:
raise RuntimeError("LLM_API_KEY not configured") raise RuntimeError("LLM_API_KEY not configured")
context = _format_events_for_llm(events, total=total) context = _format_events_for_llm(events, total=total, excluded_services=excluded_services)
messages = [ messages = [
{"role": "system", "content": _SYSTEM_PROMPT}, {"role": "system", "content": _SYSTEM_PROMPT},
{ {
@@ -332,6 +464,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
start, end = _extract_time_range(question) start, end = _extract_time_range(question)
entity = _extract_entity(question) entity = _extract_entity(question)
intent_services, explicit_noisy = _extract_intent_services(question)
# Default to last 7 days if no time range detected # Default to last 7 days if no time range detected
if not start: if not start:
@@ -339,11 +472,29 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
start = (now - timedelta(days=7)).isoformat().replace("+00:00", "Z") start = (now - timedelta(days=7)).isoformat().replace("+00:00", "Z")
end = now.isoformat().replace("+00:00", "Z") end = now.isoformat().replace("+00:00", "Z")
# -----------------------------------------------------------------------
# Decide which services to query
# -----------------------------------------------------------------------
excluded_services: list[str] = []
if body.services:
# User explicitly filtered via UI — respect that exactly
query_services = body.services
elif intent_services is not None:
# NL question implies specific services
query_services = intent_services
else:
# Broad question with no intent — exclude noisy services by default
query_services = sorted(_DEFAULT_ADMIN_SERVICES)
excluded_services = sorted(_NOISY_SERVICES)
# -----------------------------------------------------------------------
# Build and run query
# -----------------------------------------------------------------------
query = _build_event_query( query = _build_event_query(
entity, entity,
start, start,
end, end,
services=body.services, services=query_services,
actor=body.actor, actor=body.actor,
operation=body.operation, operation=body.operation,
result=body.result, result=body.result,
@@ -353,21 +504,33 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
try: try:
total = events_collection.count_documents(query) total = events_collection.count_documents(query)
cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(LLM_MAX_EVENTS) # Fetch a generous window so we can apply smart sampling in Python
events = list(cursor) cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(1000)
raw_events = list(cursor)
except Exception as exc: except Exception as exc:
logger.error("Failed to query events for ask", error=str(exc)) logger.error("Failed to query events for ask", error=str(exc))
raise HTTPException(status_code=500, detail=f"Database query failed: {exc}") from exc raise HTTPException(status_code=500, detail=f"Database query failed: {exc}") from exc
for e in events: for e in raw_events:
e["_id"] = str(e.get("_id", "")) e["_id"] = str(e.get("_id", ""))
# Apply smart sampling (preserves failures, prioritises admin-relevant services)
events = _smart_sample(raw_events, max_events=LLM_MAX_EVENTS)
# If no events, return early # If no events, return early
if not events: if not events:
return AskResponse( return AskResponse(
answer="I couldn't find any audit events matching your question. Try broadening the time range or checking the spelling of the device/user name.", answer="I couldn't find any audit events matching your question. Try broadening the time range or checking the spelling of the device/user name.",
events=[], events=[],
query_info={"entity": entity, "start": start, "end": end, "event_count": 0}, query_info={
"entity": entity,
"start": start,
"end": end,
"event_count": 0,
"total_matched": total,
"services_queried": query_services,
"excluded_services": excluded_services,
},
llm_used=False, llm_used=False,
llm_error="LLM not used — no events found." if not LLM_API_KEY else None, llm_error="LLM not used — no events found." if not LLM_API_KEY else None,
) )
@@ -380,7 +543,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation." llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
else: else:
try: try:
answer = await _call_llm(question, events, total=total) answer = await _call_llm(question, events, total=total, excluded_services=excluded_services)
llm_used = True llm_used = True
except Exception as exc: except Exception as exc:
llm_error = f"LLM call failed: {exc}" llm_error = f"LLM call failed: {exc}"
@@ -388,9 +551,11 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
# Fallback: structured summary if LLM unavailable or failed # Fallback: structured summary if LLM unavailable or failed
if not answer: if not answer:
parts = [f"Found {len(events)} event(s)"] parts = [f"Found {total} event(s)"]
if entity: if entity:
parts.append(f"related to **{entity}**") parts.append(f"related to **{entity}**")
if excluded_services:
parts.append(f"(excluding {', '.join(excluded_services)})")
parts.append(f"between {start[:10]} and {end[:10]}.\n") parts.append(f"between {start[:10]} and {end[:10]}.\n")
for i, e in enumerate(events[:10], 1): for i, e in enumerate(events[:10], 1):
@@ -415,6 +580,8 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
"end": end, "end": end,
"event_count": len(events), "event_count": len(events),
"total_matched": total, "total_matched": total,
"services_queried": query_services,
"excluded_services": excluded_services,
"mongo_query": json.dumps(query, default=str), "mongo_query": json.dumps(query, default=str),
}, },
llm_used=llm_used, llm_used=llm_used,

View File

@@ -1,4 +1,5 @@
from config import ( from config import (
AI_FEATURES_ENABLED,
AUTH_CLIENT_ID, AUTH_CLIENT_ID,
AUTH_ENABLED, AUTH_ENABLED,
AUTH_SCOPE, AUTH_SCOPE,
@@ -18,3 +19,10 @@ def auth_config():
"scope": AUTH_SCOPE, "scope": AUTH_SCOPE,
"redirect_uri": None, # frontend uses window.location.origin by default "redirect_uri": None, # frontend uses window.location.origin by default
} }
@router.get("/config/features")
def features_config():
return {
"ai_features_enabled": AI_FEATURES_ENABLED,
}

124
backend/routes/mcp.py Normal file
View File

@@ -0,0 +1,124 @@
"""MCP server over SSE (HTTP) transport, mounted inside FastAPI with OIDC auth."""
import structlog
from auth import (
AUTH_ALLOWED_GROUPS,
AUTH_ALLOWED_ROLES,
AUTH_ENABLED,
_allowed,
_decode_token,
_get_jwks,
)
from mcp.server import Server
from mcp.server.sse import SseServerTransport
from mcp.types import TextContent, Tool
from mcp_common import (
ASK_SCHEMA,
GET_EVENT_SCHEMA,
GET_SUMMARY_SCHEMA,
SEARCH_EVENTS_SCHEMA,
handle_ask,
handle_get_event,
handle_get_summary,
handle_search_events,
)
from starlette.requests import Request
from starlette.responses import Response
logger = structlog.get_logger("aoc.mcp")
mcp_app = Server("aoc")
transport = SseServerTransport("/messages/")
@mcp_app.list_tools()
async def list_tools() -> list[Tool]:
return [
Tool(
name="search_events",
description="Search audit events by entity, service, operation, or result.",
inputSchema=SEARCH_EVENTS_SCHEMA,
),
Tool(name="get_event", description="Retrieve a single audit event by its ID.", inputSchema=GET_EVENT_SCHEMA),
Tool(
name="get_summary",
description="Get an aggregated summary of audit activity for the last N days.",
inputSchema=GET_SUMMARY_SCHEMA,
),
Tool(
name="ask",
description="Ask a natural language question about audit logs. Returns a narrative answer.",
inputSchema=ASK_SCHEMA,
),
]
@mcp_app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
if name == "search_events":
return await handle_search_events(arguments)
if name == "get_event":
return await handle_get_event(arguments)
if name == "get_summary":
return await handle_get_summary(arguments)
if name == "ask":
return await handle_ask(arguments)
raise ValueError(f"Unknown tool: {name}")
async def _validate_auth(request: Request) -> dict | None:
"""Validate Bearer token. Returns claims dict or None on failure."""
if not AUTH_ENABLED:
return {"sub": "anonymous"}
auth_header = request.headers.get("authorization", "")
if not auth_header or not auth_header.lower().startswith("bearer "):
return None
token = auth_header.split(" ", 1)[1]
try:
jwks = _get_jwks()
claims = _decode_token(token, jwks)
except Exception as exc:
logger.warning("MCP auth failed", error=str(exc))
return None
if not _allowed(claims, AUTH_ALLOWED_ROLES, AUTH_ALLOWED_GROUPS):
logger.warning("MCP auth forbidden", sub=claims.get("sub"))
return None
return claims
async def mcp_asgi(scope: dict, receive, send):
"""ASGI application for MCP over SSE, mounted under /mcp in FastAPI."""
if scope["type"] != "http":
return
request = Request(scope, receive)
# Auth check
claims = await _validate_auth(request)
if claims is None:
response = Response("Unauthorized", status_code=401)
await response(scope, receive, send)
return
path = scope.get("path", "")
root_path = scope.get("root_path", "")
relative_path = path[len(root_path) :] if path.startswith(root_path) else path
method = scope.get("method", "")
if relative_path == "/sse" and method == "GET":
logger.info("MCP SSE connection established", sub=claims.get("sub", "unknown"))
async with transport.connect_sse(scope, receive, send) as (read_stream, write_stream):
await mcp_app.run(
read_stream,
write_stream,
mcp_app.create_initialization_options(),
)
elif relative_path == "/messages/" and method == "POST":
await transport.handle_post_message(scope, receive, send)
else:
response = Response("Not found", status_code=404)
await response(scope, receive, send)

View File

@@ -30,6 +30,7 @@ def client(mock_events_collection, mock_watermarks_collection, monkeypatch):
monkeypatch.setattr("routes.fetch.get_watermark", lambda source: None) monkeypatch.setattr("routes.fetch.get_watermark", lambda source: None)
monkeypatch.setattr("routes.fetch.set_watermark", lambda source, ts: None) monkeypatch.setattr("routes.fetch.set_watermark", lambda source, ts: None)
monkeypatch.setattr("auth.AUTH_ENABLED", False) monkeypatch.setattr("auth.AUTH_ENABLED", False)
monkeypatch.setattr("routes.mcp.AUTH_ENABLED", False)
monkeypatch.setattr("database.db.command", lambda cmd: {"ok": 1} if cmd == "ping" else {}) monkeypatch.setattr("database.db.command", lambda cmd: {"ok": 1} if cmd == "ping" else {})
# Mock audit trail and rules collections so tests don't wait on real MongoDB # Mock audit trail and rules collections so tests don't wait on real MongoDB

View File

@@ -1,6 +1,60 @@
from datetime import UTC, datetime from datetime import UTC, datetime
def test_config_features(client):
response = client.get("/api/config/features")
assert response.status_code == 200
data = response.json()
assert "ai_features_enabled" in data
assert isinstance(data["ai_features_enabled"], bool)
def test_ask_disabled_when_ai_features_off():
import subprocess
import sys
code = """
import sys
sys.path.insert(0, '.')
import os
os.environ['AI_FEATURES_ENABLED'] = 'false'
# Re-import config with the env override
import importlib
import config
importlib.reload(config)
# Now import main; it will pick up the new AI_FEATURES_ENABLED
import main
ask_paths = [r.path for r in main.app.routes if hasattr(r, 'path') and 'ask' in r.path]
print('ASK_PATHS:', ask_paths)
assert len(ask_paths) == 0, f"Expected no ask routes, found: {ask_paths}"
print('OK')
"""
result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True, cwd=".")
assert result.returncode == 0, f"Subprocess failed: {result.stdout}\n{result.stderr}"
assert "OK" in result.stdout
def test_mcp_sse_mount_exists():
from main import app
mcp_mounts = [r for r in app.routes if getattr(r, "path", "") == "/mcp"]
assert len(mcp_mounts) == 1, "MCP mount not found in app routes"
def test_mcp_messages_no_session(client):
response = client.post("/mcp/messages/")
# MCP transport returns 400 when session_id is missing, 404 when session not found
assert response.status_code in (400, 404)
def test_mcp_sse_auth_required_when_enabled(client, monkeypatch):
monkeypatch.setattr("routes.mcp.AUTH_ENABLED", True)
response = client.get("/mcp/sse")
assert response.status_code == 401
def test_health(client): def test_health(client):
response = client.get("/health") response = client.get("/health")
assert response.status_code == 200 assert response.status_code == 200

View File

@@ -236,7 +236,7 @@ class TestAskEndpoint:
} }
) )
async def fake_llm(question, events, total=None): async def fake_llm(question, events, total=None, excluded_services=None):
return "The device had a failed wipe attempt." return "The device had a failed wipe attempt."
monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key") monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")