Release v1.3.0: AI feature flag and MCP server
All checks were successful
CI / lint-and-test (push) Successful in 45s
Release / build-and-push (push) Successful in 1m34s

- Add AI_FEATURES_ENABLED config flag to gate AI/natural-language features
- Conditionally register /api/ask router based on AI_FEATURES_ENABLED
- Add GET /api/config/features endpoint for frontend feature detection
- Update frontend to hide Ask panel when AI features are disabled
- Implement standalone MCP server (backend/mcp_server.py) with tools:
  * search_events, get_event, get_summary, ask
- Add mcp dependency to requirements.txt
- Update .env.example, AGENTS.md, and ROADMAP.md
- Bump VERSION to 1.3.0
This commit is contained in:
2026-04-20 18:11:26 +02:00
parent b4e504a87b
commit 60b6ad15c4
11 changed files with 435 additions and 29 deletions

View File

@@ -6,28 +6,34 @@ AOC is a FastAPI microservice that ingests Microsoft Entra (Azure AD) audit logs
## Technology Stack
- **Runtime**: Python 3.11
- **Web Framework**: FastAPI + Uvicorn
- **Runtime**: Python 3.11 (3.14 for tests)
- **Web Framework**: FastAPI + Uvicorn (Gunicorn in production)
- **Database**: MongoDB (PyMongo)
- **Frontend**: Vanilla HTML/CSS/JS (served as static files from `backend/frontend/`)
- **Frontend**: Alpine.js + HTML/CSS (served as static files from `backend/frontend/`)
- **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend)
- **External APIs**: Microsoft Graph API, Office 365 Management Activity API
- **Deployment**: Docker Compose
- **External APIs**: Microsoft Graph API, Office 365 Management Activity API, Azure OpenAI / MS Foundry
- **Deployment**: Docker Compose (dev), Docker Compose + nginx (prod)
- **CI/CD**: Gitea Actions (lint + test + Docker build + release)
## Project Structure
```
backend/
main.py # FastAPI app, router registration, background periodic fetch
config.py # Environment-based configuration (loads .env)
config.py # Pydantic Settings configuration (loads .env)
database.py # MongoClient setup (db = micro_soc, collection = events)
auth.py # OIDC Bearer token validation, JWKS caching, role/group checks
requirements.txt # Python dependencies
Dockerfile # python:3.11-slim image
Dockerfile # python:3.11-slim image, non-root user, version baked at build
mcp_server.py # Standalone MCP server for Claude Desktop / Cursor integration
routes/
fetch.py # GET /api/fetch-audit-logs, run_fetch()
events.py # GET /api/events, GET /api/filter-options
config.py # GET /api/config/auth
events.py # GET /api/events, GET /api/filter-options, PATCH tags, POST comments
config.py # GET /api/config/auth, GET /api/config/features
ask.py # POST /api/ask — natural language query with LLM
health.py # GET /health, GET /metrics
rules.py # Rule-based alerting endpoints
webhooks.py # Microsoft Graph change notification webhooks
graph/
auth.py # Client credentials token acquisition for Graph
audit_logs.py # Fetch and enrich directory audit logs from Graph
@@ -41,7 +47,7 @@ backend/
mappings.yml # User-editable category labels and summary templates
maintenance.py # CLI for re-normalization and deduplication of stored events
frontend/
index.html # Single-page UI with filters, pagination, raw-event modal
index.html # Single-page UI with filters, pagination, ask panel, raw-event modal
style.css # Dark-themed stylesheet
```
@@ -60,6 +66,9 @@ Key variables:
- `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists
- `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler
- `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB
- `AI_FEATURES_ENABLED` — set `false` to completely disable AI endpoints and UI (default `true`)
- `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`, `LLM_MAX_EVENTS`, `LLM_TIMEOUT_SECONDS` — LLM provider settings
- `LLM_API_VERSION` — required for Azure OpenAI / MS Foundry endpoints
## Build and Run Commands
@@ -87,35 +96,81 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
## API Endpoints
- `GET /api/fetch-audit-logs?hours=168` — pulls last N hours (capped at 720 / 30 days) from all sources, normalizes, dedupes, and upserts into MongoDB
- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and pagination (`page`, `page_size`)
- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and cursor-based pagination
- `GET /api/filter-options` — best-effort distinct values for UI dropdowns
- `GET /api/config/auth` — auth configuration exposed to the frontend
- `GET /api/config/features` — feature flags (`ai_features_enabled`)
- `POST /api/ask` — natural language query; returns LLM narrative + referenced events (only when `AI_FEATURES_ENABLED=true`)
- `GET /health` — liveness probe with DB connectivity
- `GET /metrics` — Prometheus metrics
## MCP Server
A standalone MCP server (`backend/mcp_server.py`) exposes audit log tools for Claude Desktop, Cursor, and other MCP clients.
Available tools:
- `search_events` — Search by entity, service, operation, result, time range
- `get_event` — Retrieve a single event by ID (raw JSON)
- `get_summary` — Aggregated counts by service, operation, result, actor
- `ask` — Natural language question (returns recent events + guidance)
**Claude Desktop config** (`~/.config/claude/claude_desktop_config.json`):
```json
{
"mcpServers": {
"aoc": {
"command": "python",
"args": ["/path/to/aoc/backend/mcp_server.py"],
"env": {"MONGO_URI": "mongodb://root:example@localhost:27017/"}
}
}
}
```
The MCP server imports `database.py` directly and does not go through the FastAPI layer, so it shares the same MongoDB connection but bypasses auth.
## AI Feature Flag
Set `AI_FEATURES_ENABLED=false` in `.env` to:
- Prevent the `ask` router from being registered in FastAPI
- Hide the "Ask a question" panel in the frontend
- Return `ai_features_enabled: false` from `/api/config/features`
This is intended for the open-core monetization split: core features (ingestion, filtering, search, export) are always available; premium AI features (NLQ, MCP) can be disabled.
## Code Conventions
- Python modules use absolute imports within the `backend/` package (e.g., `from graph.auth import get_access_token`). When running locally, ensure the working directory is `backend/` so these resolve correctly.
- No formal formatter or linter is configured. Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
- The frontend is a single HTML file with inline JavaScript. It relies on the MSAL.js CDN (`https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js`).
- The project uses `ruff` for linting and formatting. Run `ruff check . && ruff format .` before committing.
- Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
- The frontend is a single HTML file with inline JavaScript and Alpine.js.
## Testing
There are currently **no automated tests** in this repository. When adding new features or bug fixes, verify behavior manually:
Tests run with pytest and mongomock (no real MongoDB required):
1. Start the server (Docker Compose or local uvicorn).
2. Run a smoke test:
```bash
curl http://localhost:8000/api/events
curl http://localhost:8000/api/fetch-audit-logs
```
3. Open http://localhost:8000 in a browser, apply filters, paginate, and click "View raw event".
```bash
cd backend
python -m venv .venv_test
source .venv_test/bin/activate
pip install -r requirements.txt
pytest tests/ -q
```
When adding new features or bug fixes, add or update tests in `backend/tests/`. The test suite covers:
- Event normalization and deduplication
- Auth middleware and token validation
- API endpoints (`/api/events`, `/api/fetch-audit-logs`, `/api/ask`)
- NLQ time range extraction, entity extraction, query building
## Security Considerations
- **Secrets**: `CLIENT_SECRET` and other credentials come from `.env`. Never commit `.env`.
- **Secrets**: `CLIENT_SECRET`, `LLM_API_KEY`, and other credentials come from `.env`. Never commit `.env`.
- **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer claims. Tokens are decoded without strict signature verification (`jwt.get_unverified_claims`), so the tenant and issuer checks are the primary gate.
- **Role/Group gating**: Access is allowed if the tokens `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed.
- **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries.
- **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls.
- **MCP server**: The MCP server bypasses auth entirely. Only run it in trusted environments or behind a VPN.
## Maintenance and Operations