docs: update AGENTS.md, README.md, DEPLOY.md, ROADMAP.md for v1.7.14 security features
61
AGENTS.md
@@ -9,20 +9,24 @@ AOC is a FastAPI microservice that ingests Microsoft Entra (Azure AD) audit logs
- **Runtime**: Python 3.11 (3.14 for tests)
- **Web Framework**: FastAPI + Uvicorn (Gunicorn in production)
- **Database**: MongoDB (PyMongo)
- **Cache/Queue**: Valkey/Redis 8 (caching + arq async job queue)
- **Frontend**: Alpine.js + HTML/CSS (served as static files from `backend/frontend/`)
- **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend)
- **External APIs**: Microsoft Graph API, Office 365 Management Activity API, Azure OpenAI / MS Foundry
- **Deployment**: Docker Compose (dev), Docker Compose + nginx (prod)
- **CI/CD**: Gitea Actions (lint + test + Docker build + release)
- **Secrets Storage**: Environment variables (`.env`) or optional Azure Key Vault

## Project Structure

```
backend/
main.py # FastAPI app, router registration, background periodic fetch
config.py # Pydantic Settings configuration (loads .env)
config.py # Pydantic Settings configuration (loads .env + optional Key Vault)
database.py # MongoClient setup (db = micro_soc, collection = events)
auth.py # OIDC Bearer token validation, JWKS caching, role/group checks
secrets_manager.py # Optional Azure Key Vault integration for secrets
rate_limiter.py # Redis-backed fixed-window rate limiter (fail-closed)
requirements.txt # Python dependencies
Dockerfile # python:3.11-slim image, non-root user, version baked at build
mcp_server.py # Standalone MCP server for Claude Desktop / Cursor integration
@@ -34,6 +38,9 @@ backend/
health.py # GET /health, GET /metrics
rules.py # Rule-based alerting endpoints
webhooks.py # Microsoft Graph change notification webhooks
alerts.py # Alert management endpoints
saved_searches.py # Saved filter presets
jobs.py # Async job status polling
graph/
auth.py # Client credentials token acquisition for Graph
audit_logs.py # Fetch and enrich directory audit logs from Graph
@@ -59,16 +66,42 @@ Copy `.env.example` to `.env` at the repo root and fill in values:
cp .env.example .env
```

Key variables:
### Core variables
- `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET` — Microsoft app registration credentials (application permissions)
- `AUTH_ENABLED` — set `true` to protect API/UI with OIDC Bearer tokens
- `AUTH_TENANT_ID`, `AUTH_CLIENT_ID` — token validation audience/issuer
- `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists
- `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler
- `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB

### AI / LLM variables
- `AI_FEATURES_ENABLED` — set `false` to completely disable AI endpoints and UI (default `true`)
- `LLM_API_KEY`, `LLM_BASE_URL`, `LLM_MODEL`, `LLM_MAX_EVENTS`, `LLM_TIMEOUT_SECONDS` — LLM provider settings
- `LLM_API_VERSION` — required for Azure OpenAI / MS Foundry endpoints
- `LLM_ALLOWED_DOMAINS` — comma-separated domain allowlist for LLM endpoints (e.g. `api.openai.com,*.openai.azure.com`)
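As a rough illustration of how a domain allowlist such as `api.openai.com,*.openai.azure.com` might be applied together with the HTTPS and private-IP checks described under Security Considerations, here is a minimal sketch. The names `parse_csv` and `is_allowed_url` are hypothetical, and the use of shell-style wildcard matching via `fnmatch` is an assumption; the actual matching rules in AOC may differ.

```python
import ipaddress
from fnmatch import fnmatch
from urllib.parse import urlparse


def parse_csv(value: str) -> list[str]:
    """Split a comma-separated setting into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_allowed_url(url: str, allowed_domains: str = "") -> bool:
    """Hypothetical check: HTTPS only, no private IP literals, optional allowlist."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False

    # Reject IP literals that point at private, loopback, or link-local ranges.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a DNS name, not an IP literal
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        return False

    # An empty allowlist means "no domain restriction" in this sketch.
    patterns = parse_csv(allowed_domains)
    if not patterns:
        return True
    return any(fnmatch(host, pattern.lower()) for pattern in patterns)
```

Note that `fnmatch` lets `*` span dots, so `*.openai.azure.com` would also match `a.b.openai.azure.com`. This sketch only inspects the URL text; it does not resolve DNS names, so a hostname that resolves to a private address would need a separate check.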

### Security variables
- `CORS_ORIGINS` — comma-separated allowed origins (default `*`; set explicit origins in production)
- `DOCS_ENABLED` — set `true` to expose `/docs`, `/redoc`, `/openapi.json` (default `false`)
- `METRICS_ALLOWED_IPS` — comma-separated CIDRs allowed to access `/metrics` (default: private networks + loopback)
- `WEBHOOK_CLIENT_SECRET` — secret for validating Graph webhook `clientState`
- `SIEM_ENABLED`, `SIEM_WEBHOOK_URL` — optional SIEM forwarding
- `SIEM_ALLOWED_DOMAINS` — comma-separated domain allowlist for SIEM webhook URLs
- `RATE_LIMIT_ENABLED`, `RATE_LIMIT_REQUESTS`, `RATE_LIMIT_WINDOW_SECONDS` — Redis-backed rate limiting
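To make the `METRICS_ALLOWED_IPS` setting more concrete, the following sketch shows one way a comma-separated CIDR list could gate a client address using the standard `ipaddress` module. The function names and the exact default list are assumptions for illustration; the source only states that the default covers private networks and loopback.

```python
import ipaddress

# Assumed default for illustration: loopback plus the RFC 1918 private ranges.
DEFAULT_METRICS_CIDRS = "127.0.0.0/8,::1/128,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"


def parse_cidrs(value: str):
    """Turn a comma-separated CIDR string into network objects."""
    return [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in value.split(",")
        if item.strip()
    ]


def metrics_access_allowed(client_ip: str, allowed_cidrs: str = DEFAULT_METRICS_CIDRS) -> bool:
    """Hypothetical gate: True only if client_ip falls inside an allowed CIDR."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable addresses are rejected
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(addr in network for network in parse_cidrs(allowed_cidrs))
```

Behind a reverse proxy such as nginx, the address seen by the application may be the proxy's own, so whichever value is passed as `client_ip` needs to come from a trusted source.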

### Optional Azure Key Vault
- `AZURE_KEY_VAULT_NAME` — name of the Azure Key Vault to load secrets from
- When set, AOC fetches these secrets at startup:
  - `aoc-client-secret` → `CLIENT_SECRET`
  - `aoc-llm-api-key` → `LLM_API_KEY`
  - `aoc-mongo-uri` → `MONGO_URI`
  - `aoc-webhook-client-secret` → `WEBHOOK_CLIENT_SECRET`
- Requires `azure-identity` and `azure-keyvault-secrets` (uncomment in `requirements.txt`)
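The secret-name mapping listed above can be pictured as a small lookup table. The sketch below is not the contents of `secrets_manager.py`; `SECRET_MAP`, `load_vault_overrides`, and the `fetch_secret` callable are hypothetical names, and skipping secrets that are missing from the vault is an assumption.

```python
from typing import Callable, Optional

# Vault secret name -> setting name, as listed in the documentation above.
SECRET_MAP = {
    "aoc-client-secret": "CLIENT_SECRET",
    "aoc-llm-api-key": "LLM_API_KEY",
    "aoc-mongo-uri": "MONGO_URI",
    "aoc-webhook-client-secret": "WEBHOOK_CLIENT_SECRET",
}


def load_vault_overrides(fetch_secret: Callable[[str], Optional[str]]) -> dict[str, str]:
    """Collect setting overrides for every mapped secret the vault returns."""
    overrides: dict[str, str] = {}
    for vault_name, setting_name in SECRET_MAP.items():
        value = fetch_secret(vault_name)
        if value:  # assumption: absent or empty secrets leave the .env value in place
            overrides[setting_name] = value
    return overrides
```

With the Azure SDK installed, `fetch_secret` could plausibly wrap `SecretClient.get_secret(name).value` from `azure-keyvault-secrets`, authenticated through `DefaultAzureCredential` from `azure-identity`. Passing the fetcher in as a callable keeps the mapping testable without network access.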

### Privacy / access control
- `PRIVACY_SERVICES` — comma-separated services to hide from non-privileged users (e.g. `Exchange,Teams`)
- `PRIVACY_SENSITIVE_OPERATIONS` — comma-separated operations to gate
- `PRIVACY_SERVICE_ROLES` — comma-separated Entra roles that grant access to privacy data

## Build and Run Commands
@@ -102,7 +135,9 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
- `GET /api/config/features` — feature flags (`ai_features_enabled`)
- `POST /api/ask` — natural language query; returns LLM narrative + referenced events (only when `AI_FEATURES_ENABLED=true`)
- `GET /health` — liveness probe with DB connectivity
- `GET /metrics` — Prometheus metrics
- `GET /metrics` — Prometheus metrics (IP-restricted by default)
- `GET /api/source-health` — last fetch status per ingestion source
- `GET /api/version` — running version

## MCP Server
@@ -162,16 +197,30 @@ When adding new features or bug fixes, add or update tests in `backend/tests/`.
- Auth middleware and token validation
- API endpoints (`/api/events`, `/api/fetch-audit-logs`, `/api/ask`)
- NLQ time range extraction, entity extraction, query building
- Rate limiting behavior

## Security Considerations

- **Secrets**: `CLIENT_SECRET`, `LLM_API_KEY`, and other credentials come from `.env`. Never commit `.env`.
- **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer claims. Tokens are decoded without strict signature verification (`jwt.get_unverified_claims`), so the tenant and issuer checks are the primary gate.
- **Role/Group gating**: Access is allowed if the token’s `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed.
- **Secrets**: `CLIENT_SECRET`, `LLM_API_KEY`, and other credentials come from `.env` or Azure Key Vault. Never commit `.env`.
- **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS from `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches keys for 1 hour, and validates tenant/issuer/audience claims. Tokens are decoded with RS256 signature verification.
- **Role/Group gating**: Access is allowed if the token's `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed — a startup warning is logged in this case.
- **CORS**: When `AUTH_ENABLED=true` and `CORS_ORIGINS="*"`, `allow_credentials` is forced to `false` to prevent cross-origin token leakage.
- **Rate limiting**: Redis-backed fixed-window rate limiting with per-category limits (fetch=10/hr, ask=30/min, write=20/min, default=120/min). Fails closed (returns 429) when Redis is unavailable.
- **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries.
- **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls.
- **LLM SSRF guard**: `LLM_BASE_URL` must be HTTPS and cannot point to private IPs. Optional `LLM_ALLOWED_DOMAINS` restricts to specific domains.
- **SIEM SSRF guard**: `SIEM_WEBHOOK_URL` has the same validation as LLM URLs, plus optional `SIEM_ALLOWED_DOMAINS`.
- **Metrics IP gating**: `/metrics` is restricted to private/loopback IPs by default via `METRICS_ALLOWED_IPS`.
- **OpenAPI docs**: Disabled by default (`DOCS_ENABLED=false`). Enable only in development.
- **CSP**: Content-Security-Policy headers are set on all responses. `unsafe-eval` is required for Alpine.js v3 expression evaluation.
- **SRI**: CDN scripts (Alpine.js, MSAL.js) include Subresource Integrity hashes to prevent supply chain compromise.
- **MCP server**: The MCP server bypasses auth entirely. Only run it in trusted environments or behind a VPN.
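The fixed-window, fail-closed rate limiting described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code in `rate_limiter.py`: the key format, the function and class names, and the in-memory store standing in for Redis are all hypothetical. Only the per-category limits come from the documentation.

```python
import time

# (max requests, window in seconds) per category, taken from the limits listed above.
CATEGORY_LIMITS = {
    "fetch": (10, 3600),
    "ask": (30, 60),
    "write": (20, 60),
    "default": (120, 60),
}


class InMemoryCounterStore:
    """Stand-in for Redis: incr() creates a key at 1 and then counts up."""

    def __init__(self):
        self.counts = {}

    def incr(self, key: str) -> int:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


class UnavailableStore:
    """Simulates an unreachable backend."""

    def incr(self, key: str) -> int:
        raise ConnectionError("store unreachable")


def check_rate_limit(store, client_id: str, category: str = "default", now=None) -> int:
    """Return the HTTP status a request would receive: 200 or 429."""
    limit, window = CATEGORY_LIMITS.get(category, CATEGORY_LIMITS["default"])
    now = time.time() if now is None else now
    # Fixed window: every request in the same window shares one counter key.
    key = f"rl:{category}:{client_id}:{int(now // window)}"
    try:
        count = store.incr(key)
    except Exception:
        return 429  # fail closed: deny when the counter store cannot be reached
    return 200 if count <= limit else 429
```

With a real Redis client, `incr` would typically be paired with an expiry on the key so that old windows are cleaned up. Failing closed trades availability for safety: an outage of the counter store blocks requests rather than silently removing the limit.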

### Security Documentation

- `PEN_TEST_REPORT_v1.7.11.md` — Internal soft penetration test findings and remediation
- `THREAT_MODEL_v1.7.13.md` — Comprehensive threat model covering Entra/token abuse vectors

## Maintenance and Operations

The `backend/maintenance.py` script provides two CLI commands useful for backfilling or correcting stored data: