aoc/README.md
Tomas Kracmar fe95dfcfce
docs: update AGENTS.md, README.md, DEPLOY.md, ROADMAP.md for v1.7.14 security features
2026-04-27 16:52:35 +02:00

# Admin Operations Center (AOC)
FastAPI microservice that ingests Microsoft Entra (Azure AD) and other admin audit logs into MongoDB, dedupes them, and exposes a UI/API to fetch, search, and review events.
## Components
- FastAPI app under `backend/` with routes to fetch audit logs and list stored events.
- MongoDB for persistence (provisioned via Docker Compose).
- Microsoft Graph client (client credentials) for retrieving directory audit events and Intune audit events.
- Office 365 Management Activity API client for Exchange/SharePoint/Teams admin audit logs.
- Frontend served from the backend for filtering/searching events and viewing raw entries.
- Optional OIDC bearer auth (Entra) to protect the API/UI and gate access by roles/groups.
- Natural language query (`/api/ask`) powered by LLM (OpenAI, Azure OpenAI, or any compatible API).
- MCP server for Claude Desktop / Cursor integration.
- Optional Azure Key Vault integration for secrets storage.
## Prerequisites (macOS)
- Python 3.11
- Docker Desktop (for the quickest start) or a local MongoDB instance
- An Entra app registration with **Application** permission `AuditLog.Read.All` and admin consent granted
- Also required to fetch other sources:
  - Office 365 Management Activity API: scope `https://manage.office.com/.default` with `ActivityFeed.Read`/`ActivityFeed.ReadDlp` (found under the app registration's API permissions for Office 365 Management APIs)
  - Intune audit: `DeviceManagementConfiguration.Read.All` (or broader) for `/deviceManagement/auditEvents`
- Optional API protection: configure `AUTH_ENABLED=true` and set `AUTH_TENANT_ID`/`AUTH_CLIENT_ID` (the audience) plus allowed roles/groups.
## Configuration
Create a `.env` file at the repo root (copy `.env.example`) and fill in your Microsoft Graph app credentials. The provided `MONGO_URI` works with the bundled MongoDB container; change it if you use a different Mongo instance.
```bash
cp .env.example .env
# edit .env to add TENANT_ID, CLIENT_ID, CLIENT_SECRET (and MONGO_URI if needed)
# optional: enable auth & periodic fetch
# AUTH_ENABLED=true
# AUTH_TENANT_ID=...
# AUTH_CLIENT_ID=...
# AUTH_ALLOWED_ROLES=Admins,SecurityOps
# ENABLE_PERIODIC_FETCH=true
# FETCH_INTERVAL_MINUTES=60
# Optional: data retention (auto-expire old events via MongoDB TTL)
# RETENTION_DAYS=90
# Optional: CORS origins if the frontend is served separately
# CORS_ORIGINS=http://localhost:3000,https://app.example.com
# Optional: enable AI/natural-language features (/api/ask, MCP server)
# AI_FEATURES_ENABLED=true
# Optional: LLM configuration for natural language querying
# LLM_API_KEY=...
# LLM_BASE_URL=https://api.openai.com/v1
# LLM_MODEL=gpt-4o-mini
# LLM_TIMEOUT_SECONDS=30
# LLM_ALLOWED_DOMAINS=api.openai.com,*.openai.azure.com
# Optional: SIEM forwarding
# SIEM_ENABLED=true
# SIEM_WEBHOOK_URL=https://your-siem.com/webhook
# SIEM_ALLOWED_DOMAINS=your-siem.com
# Optional: Azure Key Vault for secrets storage
# AZURE_KEY_VAULT_NAME=your-keyvault-name
```
### Using Azure Key Vault for secrets
Instead of storing `CLIENT_SECRET`, `LLM_API_KEY`, `MONGO_URI`, and `WEBHOOK_CLIENT_SECRET` in `.env`, you can store them in Azure Key Vault:
1. Create a Key Vault and add secrets with these names:
   - `aoc-client-secret` → your Graph app `CLIENT_SECRET`
   - `aoc-llm-api-key` → your `LLM_API_KEY`
   - `aoc-mongo-uri` → your `MONGO_URI`
   - `aoc-webhook-client-secret` → your `WEBHOOK_CLIENT_SECRET`
2. Uncomment `azure-identity` and `azure-keyvault-secrets` in `backend/requirements.txt`
3. Set `AZURE_KEY_VAULT_NAME=your-keyvault-name` in `.env`
4. Ensure the container has Azure identity credentials (managed identity, service principal, or Azure CLI auth)
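
The lookup described in the steps above might be sketched as follows. This is a minimal illustration, not the project's actual code: `resolve_secret` and `fetch_from_vault` are hypothetical names, and the precedence (Key Vault first, then `.env`) is an assumption.

```python
from typing import Callable, Mapping, Optional

# Environment variable -> Key Vault secret name, as listed in step 1.
SECRET_NAMES = {
    "CLIENT_SECRET": "aoc-client-secret",
    "LLM_API_KEY": "aoc-llm-api-key",
    "MONGO_URI": "aoc-mongo-uri",
    "WEBHOOK_CLIENT_SECRET": "aoc-webhook-client-secret",
}


def resolve_secret(
    env_name: str,
    env: Mapping[str, str],
    fetch_from_vault: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Hypothetical helper: use Key Vault when configured, else the env value."""
    vault_configured = bool(env.get("AZURE_KEY_VAULT_NAME"))
    if vault_configured and fetch_from_vault is not None and env_name in SECRET_NAMES:
        value = fetch_from_vault(SECRET_NAMES[env_name])
        if value:
            return value
    # ASSUMPTION: fall back to the plain environment value when the vault
    # is not configured or does not hold the secret.
    return env.get(env_name)
```

In a real deployment, `fetch_from_vault` could wrap `SecretClient.get_secret(name).value` from `azure-keyvault-secrets`, authenticated with `DefaultAzureCredential` from `azure-identity`.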
## Security Hardening Checklist
Before deploying to production:
- [ ] Set `AUTH_ENABLED=true` and configure `AUTH_ALLOWED_ROLES` or `AUTH_ALLOWED_GROUPS` to restrict access
- [ ] Set explicit `CORS_ORIGINS` (do not use `*` in production with auth enabled)
- [ ] Set `DOCS_ENABLED=false` (default) to hide OpenAPI docs
- [ ] Configure `WEBHOOK_CLIENT_SECRET` to validate Graph webhook notifications
- [ ] Set `LLM_ALLOWED_DOMAINS` if using AI features to prevent data exfiltration
- [ ] Set `SIEM_ALLOWED_DOMAINS` if using SIEM forwarding
- [ ] Review `METRICS_ALLOWED_IPS` — defaults to private networks only
- [ ] Consider Azure Key Vault instead of `.env` for secrets
- [ ] Review the threat model: `THREAT_MODEL_v1.7.13.md`
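
To illustrate what a domain allow-list such as `LLM_ALLOWED_DOMAINS` or `SIEM_ALLOWED_DOMAINS` guards against, here is a rough sketch of how an outbound URL could be checked against a comma-separated list with `*.` wildcards. The function names are hypothetical, and both the glob-style matching and the HTTPS-only rule are assumptions rather than AOC's confirmed behavior.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def parse_allowed_domains(raw: str) -> list[str]:
    """Split a comma-separated setting like 'api.openai.com,*.openai.azure.com'."""
    return [d.strip().lower() for d in raw.split(",") if d.strip()]


def is_url_allowed(url: str, allowed: list[str]) -> bool:
    """Return True only if the URL's hostname matches an allowed pattern."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # ASSUMPTION: only HTTPS targets are acceptable for outbound calls.
    if parsed.scheme != "https" or not host:
        return False
    # Match on the parsed hostname, never on the raw URL string, so that
    # 'https://evil.example/api.openai.com' is rejected.
    return any(fnmatchcase(host, pattern) for pattern in allowed)
```

Note that under this glob matching `*.openai.azure.com` matches any depth of subdomain but not the bare `openai.azure.com`.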
## Run with Docker Compose (recommended)
```bash
docker compose up --build
```
- API: http://localhost:8000
- Frontend: http://localhost:8000
- Health: http://localhost:8000/health
- Mongo: localhost:27017 (root/example)
## Run locally without Docker
1) Start MongoDB (e.g. with Docker):
`docker run --rm -p 27017:27017 -e MONGO_INITDB_ROOT_USERNAME=root -e MONGO_INITDB_ROOT_PASSWORD=example mongo:7`
2) Prepare the backend environment:
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
set -a; source ../.env; set +a  # export every variable in .env (comment lines are skipped), or set env vars manually
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
## API
- `GET /health` — health check with MongoDB connectivity status.
- `GET /metrics` — Prometheus metrics for request latency, fetch volume, and errors (IP-restricted).
- `GET /api/version` — running version (baked into the Docker image at build time).
- `GET /api/fetch-audit-logs` — pulls the last 7 days by default (override with `?hours=N`, capped to 30 days) of:
  - Entra directory audit logs (`/auditLogs/directoryAudits`)
  - Exchange/SharePoint/Teams admin audits (via Office 365 Management Activity API)
  - Intune audit logs (`/deviceManagement/auditEvents`)
  - **Deduplication**: events are deduped on a stable key (source id, or timestamp/category/operation/target). The response returns the count and per-source warnings.
  - **Incremental fetch**: each source remembers its last successful fetch time in MongoDB (`watermarks` collection). Subsequent calls fetch only new events since the watermark.
  - **Alerting**: if `ALERTS_ENABLED=true`, events are evaluated against stored rules during ingestion.
  - **SIEM export**: if `SIEM_ENABLED=true`, each ingested event is forwarded to `SIEM_WEBHOOK_URL`.
- `GET /api/events` — list stored events with filters:
  - `service`, `actor`, `operation`, `result`, `start`, `end`, `search` (free text over raw/summary/actor/targets)
  - Pagination: `cursor`-based (`page_size` defaults to 50, max 500). Pass `cursor` from `next_cursor` to paginate forward.
- `GET /api/filter-options` — best-effort distinct values for services, operations, results, actors (used by UI dropdowns).
- `POST /api/webhooks/graph` — receive Microsoft Graph change notifications. Echoes `validationToken` when present.
- `GET /api/source-health` — last fetch status for each ingestion source (`directory`, `unified`, `intune`).
- `PATCH /api/events/{id}/tags` — update tags on an event (e.g., `investigating`, `false_positive`).
- `POST /api/events/{id}/comments` — add a comment to an event.
- `POST /api/events/{id}/explain` — AI explanation of a single audit event with security context (requires `LLM_API_KEY`).
- `POST /api/ask` — natural language query. Returns a narrative answer + referenced events. Supports time ranges, entity names, and respects active UI filters. Only available when `AI_FEATURES_ENABLED=true`.
- `GET /api/config/features` — feature flags (`ai_features_enabled`).
- `GET /api/rules` — list alert rules.
- `POST /api/rules` — create an alert rule.
- `PUT /api/rules/{id}` — update an alert rule.
- `DELETE /api/rules/{id}` — delete an alert rule.
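
The fetch window for `/api/fetch-audit-logs` (7-day default, `?hours=N` override, 30-day cap, per-source watermark) could be computed roughly as below. This is a sketch under stated assumptions: `fetch_window_start` is a hypothetical name, and the rule that a more recent watermark takes precedence over the lookback is a guess at how the two interact.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_HOURS = 7 * 24  # "last 7 days by default"
MAX_HOURS = 30 * 24     # "capped to 30 days"


def fetch_window_start(
    now: datetime,
    hours: Optional[int] = None,
    watermark: Optional[datetime] = None,
) -> datetime:
    """Return the earliest timestamp a fetch should request from a source."""
    requested = DEFAULT_HOURS if hours is None else hours
    clamped = max(1, min(requested, MAX_HOURS))
    lookback_start = now - timedelta(hours=clamped)
    # ASSUMPTION: if the source's watermark is newer than the lookback
    # start, only events since the watermark are fetched.
    if watermark is not None and watermark > lookback_start:
        return watermark
    return lookback_start


now = datetime.now(timezone.utc)
print(fetch_window_start(now, hours=48).isoformat())
```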
### MCP Server
AOC exposes an MCP interface in two forms:
**1. HTTP/SSE (production)** — mounted at `/mcp` inside the FastAPI app, behind OIDC auth:
- `GET /mcp/sse` — establish SSE stream (requires Bearer token if `AUTH_ENABLED=true`)
- `POST /mcp/messages/?session_id=...` — send tool calls
This is the recommended way to use MCP against a remote deployment like `aoc.cqre.net`. Any MCP client that supports SSE transport (e.g. Cursor, Claude Desktop with an SSE bridge, or custom scripts) can connect using the same Entra token as the web UI.
**2. stdio (local development)**, started with `python backend/mcp_server.py`:
- Runs as a local subprocess for Claude Desktop
- Connects directly to MongoDB (bypasses FastAPI auth)
- Useful for local development when you have the repo cloned and MongoDB running locally
Available tools (both transports):
- `search_events` — filter by entity, service, operation, result, time range.
- `get_event` — retrieve raw event JSON by ID.
- `get_summary` — aggregated summary (service, operation, result, actor counts) for the last N days.
- `ask` — natural language query returning recent events.
Stored document shape (collection `micro_soc.events`):
```json
{
  "id": "...",                 // original source id
  "timestamp": "...",          // activityDateTime
  "service": "...",            // category
  "operation": "...",          // activityDisplayName
  "result": "...",
  "actor_display": "...",      // resolved user/app name
  "target_displays": [ ... ],
  "display_summary": "...",
  "dedupe_key": "...",         // used for upserts
  "actor": { ... },            // initiatedBy
  "targets": [ ... ],          // targetResources
  "raw": { ... },              // full source event
  "raw_text": "..."            // raw as string for text search
}
```
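
As a rough illustration of how a `dedupe_key` like the one above might be derived (source id when present, otherwise timestamp/category/operation/target), consider the sketch below. The hashing scheme, the `id:`/`hash:` prefixes, and the use of `target_displays` for the target part are all assumptions, not the project's confirmed implementation.

```python
import hashlib
from typing import Any, Mapping


def dedupe_key(event: Mapping[str, Any]) -> str:
    """Hypothetical stable key: the source id, else a hash of identifying fields."""
    source_id = event.get("id")
    if source_id:
        return f"id:{source_id}"
    targets = event.get("target_displays") or []
    parts = [
        str(event.get("timestamp", "")),
        str(event.get("service", "")),    # category
        str(event.get("operation", "")),
        # Sort targets so their order in the source event does not matter.
        "|".join(sorted(str(t) for t in targets)),
    ]
    # Join with a separator unlikely to occur in the values themselves.
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
    return f"hash:{digest}"
```

With a key like this, ingestion can upsert on `dedupe_key` (for example with pymongo's `update_one(..., upsert=True)`), so re-fetching an overlapping time window does not create duplicates.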
## Development
### Linting and formatting
We use `ruff` for linting and formatting.
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
ruff check ..
ruff format ..
```
### Running tests
```bash
cd backend
pytest -q
```
## Quick smoke tests
With the server running:
```bash
curl http://localhost:8000/health
curl http://localhost:8000/api/events
curl http://localhost:8000/api/fetch-audit-logs
```
- Visit the UI at http://localhost:8000 to filter by user/service/action/result/time, search raw text, paginate, and view raw events.
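
For a scripted check of cursor pagination on `/api/events`, something like the following could work. It is a sketch only: the `next_cursor` field comes from the API description, but the `items` key for the event list is an assumption about the response shape, and `iter_events`/`http_get_json` are hypothetical names. If `AUTH_ENABLED=true`, the request would also need a Bearer token, which this sketch omits.

```python
import json
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url: str) -> Dict[str, Any]:
    """Fetch a URL and decode the JSON body (no auth header; see note above)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_events(
    base_url: str,
    filters: Optional[Dict[str, str]] = None,
    page_size: int = 50,
    get_json: Callable[[str], Dict[str, Any]] = http_get_json,
    max_pages: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield events page by page, following next_cursor until it is empty."""
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard stop so a bad cursor cannot loop forever
        params = dict(filters or {})
        params["page_size"] = str(min(page_size, 500))  # documented max is 500
        if cursor:
            params["cursor"] = cursor
        page = get_json(f"{base_url}/api/events?{urlencode(params)}")
        # ASSUMPTION: the event list lives under "items".
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break
```

Usage against a local server might look like `for e in iter_events("http://localhost:8000", {"service": "Directory"}): print(e["operation"])`, where the filter value is only an example.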
## Maintenance (Dockerized)
Use the backend image so you don't need a local venv:
```bash
# ensure Mongo + backend network are up
docker compose up -d mongo
# re-run enrichment/normalization on stored events (uses .env for Graph/Mongo)
docker compose run --rm backend python maintenance.py renormalize --limit 500
# deduplicate existing events (optional)
docker compose run --rm backend python maintenance.py dedupe
```
Omit `--limit` to process all events. You can also run commands inside a running backend container with `docker compose exec backend ...`.
## Security Documentation
- `PEN_TEST_REPORT_v1.7.11.md` — Penetration test findings and remediation
- `THREAT_MODEL_v1.7.13.md` — Comprehensive threat model covering Entra application abuse, token handling, data exfiltration vectors
## Notes / Troubleshooting
- Ensure `TENANT_ID`, `CLIENT_ID`, and `CLIENT_SECRET` match an app registration with `AuditLog.Read.All` (application) permission and admin consent.
- Additional permissions: Office 365 Management Activity (`ActivityFeed.Read`), and Intune audit (`DeviceManagementConfiguration.Read.All`).
- Auth: if `AUTH_ENABLED=true`, tokens must be issued by `AUTH_TENANT_ID` with audience `AUTH_CLIENT_ID`; access is granted if the token's roles or groups overlap `AUTH_ALLOWED_ROLES`/`AUTH_ALLOWED_GROUPS` (if set). A startup warning is logged if auth is enabled but no roles/groups are configured.
- Backfill limits: the Management Activity API typically exposes ~7 days of history (longer if your tenant has extended/Advanced Audit retention). Directory/Intune audit retention follows your tenant policy (commonly 30 to 90 days, longer with Advanced Audit).
- If you change Mongo credentials/ports, update `MONGO_URI` in `.env` (Docker Compose passes it through to the backend).
- The service uses the `micro_soc` database and `events` collection by default; adjust in `backend/config.py` if needed.
- If using Azure Key Vault, ensure the runtime identity (managed identity, service principal, or local Azure CLI) has `Get` permission on secrets.
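
The role/group overlap rule from the auth note above can be sketched as a small check on already-validated token claims. This is an illustration, not the project's code: `is_authorized` is a hypothetical name, signature/issuer/audience validation is assumed to have happened earlier, and allowing any valid token when no allow-lists are set is an assumption inferred from the startup warning.

```python
from typing import Any, Iterable, Mapping, Set


def _as_set(values: Iterable[str]) -> Set[str]:
    """Normalize a list of names, dropping empty entries."""
    return {v.strip() for v in values if v and v.strip()}


def is_authorized(
    claims: Mapping[str, Any],
    allowed_roles: Iterable[str],
    allowed_groups: Iterable[str],
) -> bool:
    """Grant access when the token's roles or groups overlap the allow-lists."""
    roles_ok = _as_set(allowed_roles)
    groups_ok = _as_set(allowed_groups)
    if not roles_ok and not groups_ok:
        # ASSUMPTION: with no allow-lists configured, any validated token
        # passes (hence the startup warning mentioned above).
        return True
    token_roles = _as_set(claims.get("roles") or [])
    token_groups = _as_set(claims.get("groups") or [])
    return bool(token_roles & roles_ok) or bool(token_groups & groups_ok)


print(is_authorized({"roles": ["SecurityOps"]}, ["Admins", "SecurityOps"], []))
```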