
# Admin Operations Center (AOC)

FastAPI microservice that ingests Microsoft Entra (Azure AD) and other admin audit logs into MongoDB, dedupes them, and exposes a UI/API to fetch, search, and review events.

## Components
- FastAPI app under `backend/` with routes to fetch audit logs and list stored events.
- MongoDB for persistence (provisioned via Docker Compose).
- Microsoft Graph client (client credentials) for retrieving directory audit events and Intune audit events.
- Office 365 Management Activity API client for Exchange/SharePoint/Teams admin audit logs.
- Frontend served from the backend for filtering/searching events and viewing raw entries.
- Optional OIDC bearer auth (Entra) to protect the API/UI and gate access by roles/groups.
- Natural language query (`/api/ask`) powered by an LLM (OpenAI, Azure OpenAI, or any compatible API).
- MCP server for Claude Desktop / Cursor integration.
- Optional Azure Key Vault integration for secrets storage.

## Prerequisites (macOS)
- Python 3.11
- Docker Desktop (for the quickest start) or a local MongoDB instance
- An Entra app registration with **Application** permission `AuditLog.Read.All` and admin consent granted
  - Also required to fetch other sources:
    - Office 365 Management Activity API (token scope `https://manage.office.com/.default`): `ActivityFeed.Read` (and `ActivityFeed.ReadDlp` for DLP events), added under the app registration's API permissions for Office 365 Management APIs
    - Intune audit: `DeviceManagementConfiguration.Read.All` (or broader) for `/deviceManagement/auditEvents`
  - Optional API protection: configure `AUTH_ENABLED=true` and set `AUTH_TENANT_ID`/`AUTH_CLIENT_ID` (the audience) plus allowed roles/groups.

## Configuration
Create a `.env` file at the repo root (copy `.env.example`) and fill in your Microsoft Graph app credentials. The provided `MONGO_URI` works with the bundled MongoDB container; change it if you use a different Mongo instance.

```bash
cp .env.example .env
# edit .env to add TENANT_ID, CLIENT_ID, CLIENT_SECRET (and MONGO_URI if needed)
# optional: enable auth & periodic fetch
# AUTH_ENABLED=true
# AUTH_TENANT_ID=...
# AUTH_CLIENT_ID=...
# AUTH_ALLOWED_ROLES=Admins,SecurityOps
# ENABLE_PERIODIC_FETCH=true
# FETCH_INTERVAL_MINUTES=60

# Optional: data retention (auto-expire old events via MongoDB TTL)
# RETENTION_DAYS=90

# Optional: CORS origins if the frontend is served separately
# CORS_ORIGINS=http://localhost:3000,https://app.example.com

# Optional: enable AI/natural-language features (/api/ask, MCP server)
# AI_FEATURES_ENABLED=true

# Optional: LLM configuration for natural language querying
# LLM_API_KEY=...
# LLM_BASE_URL=https://api.openai.com/v1
# LLM_MODEL=gpt-4o-mini
# LLM_TIMEOUT_SECONDS=30
# LLM_ALLOWED_DOMAINS=api.openai.com,*.openai.azure.com

# Optional: SIEM forwarding
# SIEM_ENABLED=true
# SIEM_WEBHOOK_URL=https://your-siem.com/webhook
# SIEM_ALLOWED_DOMAINS=your-siem.com

# Optional: Azure Key Vault for secrets storage
# AZURE_KEY_VAULT_NAME=your-keyvault-name
```
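
As an illustration of the `RETENTION_DAYS` option, a TTL index of this kind could be created roughly as follows. This is a minimal sketch, not the service's actual code: the helper `ensure_ttl_index` and the field name `ingested_at` are hypothetical, and MongoDB only expires documents whose indexed field holds a BSON date, so the sketch assumes such a field exists.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ensure_ttl_index(collection, retention_days, field="ingested_at"):
    """Create a TTL index so MongoDB deletes documents older than retention_days.

    `collection` is expected to be a pymongo Collection (or any object with
    the same `create_index` method). `field` is a hypothetical name and must
    hold BSON dates for TTL expiry to apply.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    expire_after = retention_days * SECONDS_PER_DAY
    # pymongo passes expireAfterSeconds through to MongoDB as an index option.
    return collection.create_index(field, expireAfterSeconds=expire_after)
```

With `RETENTION_DAYS=90` this would request expiry after 7,776,000 seconds.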

### Using Azure Key Vault for secrets
Instead of storing `CLIENT_SECRET`, `LLM_API_KEY`, `MONGO_URI`, and `WEBHOOK_CLIENT_SECRET` in `.env`, you can store them in Azure Key Vault:

1. Create a Key Vault and add secrets with these names:
   - `aoc-client-secret` → your Graph app `CLIENT_SECRET`
   - `aoc-llm-api-key` → your `LLM_API_KEY`
   - `aoc-mongo-uri` → your `MONGO_URI`
   - `aoc-webhook-client-secret` → your `WEBHOOK_CLIENT_SECRET`
2. Uncomment `azure-identity` and `azure-keyvault-secrets` in `backend/requirements.txt`
3. Set `AZURE_KEY_VAULT_NAME=your-keyvault-name` in `.env`
4. Ensure the container has Azure identity credentials (managed identity, service principal, or Azure CLI auth)
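
For orientation, resolving one of these secrets might look like the sketch below. It assumes the `azure-identity` and `azure-keyvault-secrets` packages from step 2 are installed; the helper name `load_secret` and the fallback to a plain environment variable when no vault is configured are assumptions for illustration, not confirmed behaviour of the service.

```python
import os


def load_secret(env_name, vault_secret_name):
    """Return a secret from Azure Key Vault if configured, else from the environment."""
    vault_name = os.environ.get("AZURE_KEY_VAULT_NAME")
    if not vault_name:
        # No vault configured: fall back to the plain environment variable.
        return os.environ.get(env_name)

    # Imported lazily so the optional Azure packages are only needed
    # when a vault is actually configured.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=f"https://{vault_name}.vault.azure.net",
        credential=DefaultAzureCredential(),
    )
    return client.get_secret(vault_secret_name).value
```

For example, `load_secret("CLIENT_SECRET", "aoc-client-secret")` would read the Graph client secret from the vault when `AZURE_KEY_VAULT_NAME` is set.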

## Security Hardening Checklist

Before deploying to production:

- [ ] Set `AUTH_ENABLED=true` and configure `AUTH_ALLOWED_ROLES` or `AUTH_ALLOWED_GROUPS` to restrict access
- [ ] Set explicit `CORS_ORIGINS` (do not use `*` in production with auth enabled)
- [ ] Keep `DOCS_ENABLED=false` (the default) to hide the OpenAPI docs
- [ ] Configure `WEBHOOK_CLIENT_SECRET` to validate Graph webhook notifications
- [ ] Set `LLM_ALLOWED_DOMAINS` if using AI features to prevent data exfiltration
- [ ] Set `SIEM_ALLOWED_DOMAINS` if using SIEM forwarding
- [ ] Review `METRICS_ALLOWED_IPS` — defaults to private networks only
- [ ] Consider Azure Key Vault instead of `.env` for secrets
- [ ] Review the threat model: `THREAT_MODEL_v1.7.13.md`
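
To make the purpose of `LLM_ALLOWED_DOMAINS` and `SIEM_ALLOWED_DOMAINS` concrete, the sketch below shows one way an outbound URL could be checked against such an allowlist. The function names are hypothetical and the shell-style wildcard matching is an assumption based on the `*.openai.azure.com` example in `.env`; the service's actual matching rules may differ.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def parse_allowed_domains(raw):
    """Split a comma-separated allowlist such as LLM_ALLOWED_DOMAINS."""
    return [item.strip().lower() for item in raw.split(",") if item.strip()]


def is_url_allowed(url, allowed_domains):
    """Return True if the URL's host matches an allowlist entry.

    Entries may be exact hosts ("api.openai.com") or shell-style
    wildcards ("*.openai.azure.com").
    """
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    return any(fnmatchcase(host, pattern) for pattern in allowed_domains)
```

Matching on the parsed hostname rather than the raw URL string means a lookalike such as `https://api.openai.com.evil.example/` is rejected.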

## Run with Docker Compose (recommended)
```bash
docker compose up --build
```
- API: http://localhost:8000
- Frontend: http://localhost:8000
- Health: http://localhost:8000/health
- Mongo: localhost:27017 (username `root`, password `example`)

## Run locally without Docker
1) Start MongoDB (e.g. with Docker):
   `docker run --rm -p 27017:27017 -e MONGO_INITDB_ROOT_USERNAME=root -e MONGO_INITDB_ROOT_PASSWORD=example mongo:7`

2) Prepare the backend environment:
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
set -a; source ../.env; set +a   # export every variable in .env (comment lines are skipped), or set env vars manually
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

## API
- `GET /health` — health check with MongoDB connectivity status.
- `GET /metrics` — Prometheus metrics for request latency, fetch volume, and errors (IP-restricted).
- `GET /api/version` — running version (baked into the Docker image at build time).
- `GET /api/fetch-audit-logs` — pulls the last 7 days by default (override with `?hours=N`, capped to 30 days) of:
  - Entra directory audit logs (`/auditLogs/directoryAudits`)
  - Exchange/SharePoint/Teams admin audits (via Office 365 Management Activity API)
  - Intune audit logs (`/deviceManagement/auditEvents`)
  - **Deduplication**: events are deduplicated on a stable key (the source id, or timestamp/category/operation/target when no id is available). The response returns the ingested count and per-source warnings.
  - **Incremental fetch**: each source remembers its last successful fetch time in MongoDB (`watermarks` collection). Subsequent calls fetch only new events since the watermark.
  - **Alerting**: if `ALERTS_ENABLED=true`, events are evaluated against stored rules during ingestion.
  - **SIEM export**: if `SIEM_ENABLED=true`, each ingested event is forwarded to `SIEM_WEBHOOK_URL`.
- `GET /api/events` — list stored events with filters:
  - `service`, `actor`, `operation`, `result`, `start`, `end`, `search` (free text over raw/summary/actor/targets)
  - Pagination is cursor-based: `page_size` defaults to 50 (max 500). Pass the previous response's `next_cursor` as `cursor` to fetch the next page.
- `GET /api/filter-options` — best-effort distinct values for services, operations, results, actors (used by UI dropdowns).
- `POST /api/webhooks/graph` — receive Microsoft Graph change notifications. Echoes `validationToken` when present.
- `GET /api/source-health` — last fetch status for each ingestion source (`directory`, `unified`, `intune`).
- `PATCH /api/events/{id}/tags` — update tags on an event (e.g., `investigating`, `false_positive`).
- `POST /api/events/{id}/comments` — add a comment to an event.
- `POST /api/events/{id}/explain` — AI explanation of a single audit event with security context (requires `LLM_API_KEY`).
- `POST /api/ask` — natural language query. Returns a narrative answer + referenced events. Supports time ranges, entity names, and respects active UI filters. Only available when `AI_FEATURES_ENABLED=true`.
- `GET /api/config/features` — feature flags (`ai_features_enabled`).
- `GET /api/rules` — list alert rules.
- `POST /api/rules` — create an alert rule.
- `PUT /api/rules/{id}` — update an alert rule.
- `DELETE /api/rules/{id}` — delete an alert rule.
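
The time window used by `/api/fetch-audit-logs` (7-day default, `?hours=N` override, 30-day cap, per-source watermark) can be pictured with the sketch below. How the watermark and `hours` interact is not spelled out here, so the policy in `fetch_window` (a hypothetical helper) is an assumption: resume from the watermark when one exists, otherwise look back `hours`, and never reach back more than 30 days.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_HOURS = 7 * 24   # documented default: last 7 days
MAX_HOURS = 30 * 24      # documented cap: 30 days


def fetch_window(now, hours=None, watermark=None):
    """Return (start, end) for one ingestion run.

    Assumed policy: resume from the source's watermark when one exists,
    otherwise look back `hours`; never reach back more than MAX_HOURS.
    """
    if hours is None or hours <= 0:
        hours = DEFAULT_HOURS
    lookback = min(hours, MAX_HOURS)
    earliest = now - timedelta(hours=MAX_HOURS)
    start = watermark if watermark is not None else now - timedelta(hours=lookback)
    return max(start, earliest), now


if __name__ == "__main__":
    # Example: a first run for a source that has no watermark yet.
    print(fetch_window(datetime.now(timezone.utc), hours=48))
```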

### MCP Server
AOC exposes an MCP interface in two forms:

**1. HTTP/SSE (production)** — mounted at `/mcp` inside the FastAPI app, behind OIDC auth:
- `GET /mcp/sse` — establish SSE stream (requires Bearer token if `AUTH_ENABLED=true`)
- `POST /mcp/messages/?session_id=...` — send tool calls

This is the recommended way to use MCP against a remote deployment like `aoc.cqre.net`. Any MCP client that supports SSE transport (e.g. Cursor, Claude Desktop with an SSE bridge, or custom scripts) can connect using the same Entra token as the web UI.

**2. stdio (local development)** — `python backend/mcp_server.py`:
- Runs as a local subprocess for Claude Desktop
- Connects directly to MongoDB (bypasses FastAPI auth)
- Useful for local development when you have the repo cloned and MongoDB running locally

Available tools (both transports):
- `search_events` — filter by entity, service, operation, result, time range.
- `get_event` — retrieve raw event JSON by ID.
- `get_summary` — aggregated summary (service, operation, result, actor counts) for the last N days.
- `ask` — natural language query returning recent events.

Stored document shape (collection `micro_soc.events`):
```json
{
  "id": "...",                // original source id
  "timestamp": "...",         // activityDateTime
  "service": "...",           // category
  "operation": "...",         // activityDisplayName
  "result": "...",
  "actor_display": "...",     // resolved user/app name
  "target_displays": [ ... ],
  "display_summary": "...",
  "dedupe_key": "...",        // used for upserts
  "actor": { ... },           // initiatedBy
  "targets": [ ... ],         // targetResources
  "raw": { ... },             // full source event
  "raw_text": "..."           // raw as string for text search
}
```
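
The `dedupe_key` field could be derived along the lines of the sketch below, following the "source id or timestamp/category/operation/target" rule described for `/api/fetch-audit-logs`. The function, the `id:`/`hash:` prefixes, and the choice of SHA-256 are illustrative assumptions, not the service's actual scheme.

```python
import hashlib


def dedupe_key(event):
    """Build a stable key: the source id if present, else a hash of core fields."""
    source_id = event.get("id")
    if source_id:
        return f"id:{source_id}"
    # Sort targets so the key does not depend on their order in the source event.
    targets = sorted(str(t) for t in event.get("target_displays") or [])
    parts = [
        str(event.get("timestamp", "")),
        str(event.get("service", "")),    # category
        str(event.get("operation", "")),
        "|".join(targets),
    ]
    # Join with a separator that is unlikely to appear in the fields themselves.
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
    return f"hash:{digest}"
```

A key like this lends itself to idempotent writes, for example a pymongo `update_one({"dedupe_key": key}, {"$set": doc}, upsert=True)`, so re-fetching the same window does not create duplicates.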

## Development

### Linting and formatting
We use `ruff` for linting and formatting.

```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
ruff check ..
ruff format ..
```

### Running tests
```bash
cd backend
pytest -q
```

## Quick smoke tests
With the server running:
```bash
curl http://localhost:8000/health
curl http://localhost:8000/api/events
curl http://localhost:8000/api/fetch-audit-logs
```
- Visit the UI at http://localhost:8000 to filter by user/service/action/result/time, search raw text, paginate, and view raw events.
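
Beyond the `curl` calls above, a script can walk all pages of `/api/events` by following `next_cursor`. The sketch below uses only the standard library; the `items` key is an assumption about the response shape (only `next_cursor` is documented here), and both function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_fetch_page(base_url, params):
    """Fetch one page of /api/events and decode the JSON body."""
    with urlopen(f"{base_url}/api/events?{urlencode(params)}") as resp:
        return json.load(resp)


def iter_events(fetch_page, page_size=50, **filters):
    """Yield events across pages by following next_cursor until it is empty.

    `fetch_page` takes a dict of query parameters and returns the decoded
    response. The "items" key is an assumption about the response shape.
    """
    params = {"page_size": page_size, **filters}
    while True:
        page = fetch_page(params)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break
        params = {**params, "cursor": cursor}
```

Usage might look like `for event in iter_events(lambda p: http_fetch_page("http://localhost:8000", p), service="Intune"): ...`. With `AUTH_ENABLED=true` the request would also need an `Authorization: Bearer` header, which this sketch omits.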

## Maintenance (Dockerized)
Use the backend image so you don't need a local venv:
```bash
# ensure Mongo + backend network are up
docker compose up -d mongo
# re-run enrichment/normalization on stored events (uses .env for Graph/Mongo)
docker compose run --rm backend python maintenance.py renormalize --limit 500
# deduplicate existing events (optional)
docker compose run --rm backend python maintenance.py dedupe
```
Omit `--limit` to process all events. You can also run commands inside a running backend container with `docker compose exec backend ...`.

## Security Documentation
- `PEN_TEST_REPORT_v1.7.11.md` — Penetration test findings and remediation
- `THREAT_MODEL_v1.7.13.md` — Comprehensive threat model covering Entra application abuse, token handling, data exfiltration vectors

## Notes / Troubleshooting
- Ensure `TENANT_ID`, `CLIENT_ID`, and `CLIENT_SECRET` match an app registration with `AuditLog.Read.All` (application) permission and admin consent.
- Additional permissions: Office 365 Management Activity (`ActivityFeed.Read`), and Intune audit (`DeviceManagementConfiguration.Read.All`).
- Auth: if `AUTH_ENABLED=true`, tokens must be issued by `AUTH_TENANT_ID` with audience `AUTH_CLIENT_ID`; access is granted when the token's roles or groups overlap `AUTH_ALLOWED_ROLES`/`AUTH_ALLOWED_GROUPS` (if set). A startup warning is logged if auth is enabled but no roles/groups are configured.
- Backfill limits: the Management Activity API typically exposes ~7 days of history (longer if your tenant has extended/Advanced Audit retention). Directory/Intune audit retention follows your tenant policy (commonly 30–90 days, longer with Advanced Audit).
- If you change Mongo credentials/ports, update `MONGO_URI` in `.env` (Docker Compose passes it through to the backend).
- The service uses the `micro_soc` database and `events` collection by default; adjust in `backend/config.py` if needed.
- If using Azure Key Vault, ensure the runtime identity (managed identity, service principal, or local Azure CLI) has `Get` permission on secrets.
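
The role/group overlap rule from the auth note above can be sketched as follows. This is an illustration under assumptions: it presumes the token's signature, issuer, and audience were already validated, it reads the `roles` and `groups` claims of an Entra token, and it treats "no allowlist configured" as "any valid token is accepted", which is inferred from the startup warning rather than confirmed. Both function names are hypothetical.

```python
def parse_csv(value):
    """Turn 'Admins,SecurityOps' into {'Admins', 'SecurityOps'}."""
    return {item.strip() for item in (value or "").split(",") if item.strip()}


def is_authorized(claims, allowed_roles, allowed_groups):
    """Check validated token claims against the configured allowlists.

    Assumes signature, issuer (AUTH_TENANT_ID) and audience (AUTH_CLIENT_ID)
    were already verified. If neither allowlist is set, any valid token
    passes (assumed behaviour, inferred from the startup warning).
    """
    if not allowed_roles and not allowed_groups:
        return True
    roles = set(claims.get("roles") or [])
    groups = set(claims.get("groups") or [])
    # Access is granted when either claim set intersects its allowlist.
    return bool(roles & allowed_roles or groups & allowed_groups)
```

For example, with `AUTH_ALLOWED_ROLES=Admins,SecurityOps`, a token carrying `"roles": ["SecurityOps"]` would pass, while one carrying only `"roles": ["Reader"]` would not.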