v1.7.12: security hardening — CORS fix, security headers, fail-closed rate limiter, OpenAPI docs disabled by default, config auth privacy, webhook validation

hotfix(v1.7.11): add unsafe-eval to CSP for Alpine.js
2026-04-27 14:19:28 +02:00 · 2026-04-27 10:39:33 +02:00
11 changed files with 350 additions and 16 deletions
--- a/.env.example
+++ b/.env.example
@@ -27,6 +27,9 @@ RETENTION_DAYS=0
 # Optional: comma-separated CORS origins (e.g., http://localhost:3000,https://app.example.com)
 CORS_ORIGINS=*

+# OpenAPI docs exposure (set true only for dev)
+DOCS_ENABLED=false
+
 # Optional: SIEM export webhook (e.g., Splunk HEC, Sentinel, or generic syslog webhook)
 SIEM_ENABLED=false
 SIEM_WEBHOOK_URL=
--- a/PEN_TEST_REPORT_v1.7.11.md
+++ b/PEN_TEST_REPORT_v1.7.11.md
@@ -0,0 +1,203 @@
+# AOC v1.7.11 Soft Penetration Test Report
+
+**Date:** 2026-04-27
+**Target:** Local AOC instance (port 8001), auth disabled, AI disabled
+**Tester:** Automated + manual curl-based probing
+**Scope:** FastAPI backend, REST API endpoints, middleware, headers
+
+---
+
+## Executive Summary
+
+AOC v1.7.11 has one **CRITICAL** vulnerability (CORS credentials leak) and several defense-in-depth gaps. The good news: input validation, NoSQL injection resistance, and error handling are solid. The bad news: CORS is dangerously permissive, security headers are missing, and the rate limiter fails open on Redis failure.
+
+| Severity | Count | Categories |
+|----------|-------|------------|
+| CRITICAL | 1 | CORS with credentials |
+| HIGH | 1 | Missing security headers |
+| MEDIUM | 2 | Fail-open rate limiter, OpenAPI exposure |
+| LOW | 2 | Information disclosure, webhook content injection |
+| INFO | 3 | Positive findings (no stack traces, input validation, NoSQL resistance) |
+
+---
+
+## CRITICAL
+
+### 1. CORS Reflects Any Origin with `allow_credentials=true`
+
+**Finding:** The CORS middleware returns `Access-Control-Allow-Origin: <any origin>` AND `Access-Control-Allow-Credentials: true` for every origin that sends an `Origin` header.
+
+**Evidence:**
+```bash
+curl -H "Origin: https://evil-attacker.com" http://localhost:8001/api/config/auth
+# Response headers:
+# access-control-allow-origin: https://evil-attacker.com
+# access-control-allow-credentials: true
+```
+
+**Impact:** An attacker can host a malicious page on any domain and make authenticated cross-origin requests to the AOC API using the victim's browser cookies/tokens. This effectively bypasses Same-Origin Policy for authenticated actions.
+
+**Root Cause:** `main.py` configures CORS with `allow_origins=["*"]` (from `CORS_ORIGINS` env var, default `"*"`) AND `allow_credentials=True`. According to CORS spec, a wildcard origin with credentials is technically invalid, but Starlette/FastAPI appears to reflect the request origin instead.
+
+**Recommendation:**
+- When `AUTH_ENABLED=true`, reject requests from origins not in an explicit allowlist.
+- Set `allow_credentials=False` if wildcard origins are needed.
+- Or, require `CORS_ORIGINS` to be explicitly configured (no default wildcard) when auth is enabled.
+
+---
+
+## HIGH
+
+### 2. Missing Security Headers
+
+**Finding:** The following security headers are absent from all responses:
+
+| Header | Purpose | Status |
+|--------|---------|--------|
+| `X-Content-Type-Options: nosniff` | Prevents MIME sniffing | MISSING |
+| `X-Frame-Options: DENY` or `SAMEORIGIN` | Clickjacking protection | MISSING |
+| `Strict-Transport-Security` | HSTS enforcement | MISSING |
+| `Referrer-Policy: strict-origin-when-cross-origin` | Limits referrer leakage | MISSING |
+| `Permissions-Policy` | Restricts browser features | MISSING |
+
+**Impact:** Increased attack surface for clickjacking, MIME confusion attacks, and information leakage via referrer headers.
+
+**Recommendation:** Add a security headers middleware to set these on all responses. HSTS only when served over HTTPS.
+
+---
+
+## MEDIUM
+
+### 3. Rate Limiter Fails Open on Redis Failure
+
+**Finding:** In `rate_limiter.py` line 81-82:
+```python
+except Exception as exc:
+    logger.warning("Rate limiter Redis error; allowing request", error=str(exc))
+```
+
+If Redis becomes unreachable, all rate limits are silently bypassed.
+
+**Evidence:** When Redis was down, 150+ requests to `/api/events` all returned 200 with no 429s.
+
+**Impact:** A DoS on Redis (or a network partition) removes all rate limiting, allowing unlimited API abuse.
+
+**Recommendation:** Make the rate limiter fail-closed: return 429 or 503 when Redis is unavailable, or use an in-memory fallback with a conservative limit.
+
+### 4. OpenAPI Schema Publicly Exposed
+
+**Finding:** `/docs`, `/redoc`, and `/openapi.json` are accessible without authentication and return the full API schema.
+
+**Evidence:**
+```bash
+curl -s http://localhost:8001/openapi.json | jq '.paths | keys'
+# Returns all 15+ API paths including internal endpoints
+```
+
+**Impact:** Attackers get a complete map of the API, including request/response schemas, parameter types, and endpoint structure. This significantly reduces reconnaissance time.
+
+**Recommendation:** Disable OpenAPI docs in production (`docs_url=None, redoc_url=None, openapi_url=None`) or gate them behind admin authentication.
+
+---
+
+## LOW
+
+### 5. Information Disclosure via `/api/config/auth` and `/metrics`
+
+**Finding:**
+- `/api/config/auth` leaks `tenant_id` and `client_id` even when auth is disabled. These values fall back to the Graph API credentials (`TENANT_ID`/`CLIENT_ID`), which may be sensitive.
+- `/metrics` exposes Python version (`3.14.3`), GC statistics, and application-internal metric names.
+
+**Evidence:**
+```json
+{
+  "auth_enabled": false,
+  "tenant_id": "0ec9f34c-17c8-4541-b084-7d64ecdcc997",
+  "client_id": "cc31fd45-1eca-431f-a2c6-ba81cd4c5d50"
+}
+```
+
+**Impact:** Low direct impact (tenant/client IDs are not secrets), but aids reconnaissance and narrows the attack surface.
+
+**Recommendation:**
+- Return empty strings for `tenant_id`/`client_id` when `auth_enabled=false`.
+- Gate `/metrics` behind IP allowlist or admin auth (standard Prometheus practice).
+
+### 6. Webhook Validation Token Echoed Without Sanitization
+
+**Finding:** The `/api/webhooks/graph` endpoint echoes `validationToken` query parameter as `text/plain` without any sanitization or length limits.
+
+**Evidence:**
+```bash
+curl -X POST "http://localhost:8001/api/webhooks/graph?validationToken=<script>alert(1)</script>"
+# Returns: <script>alert(1)</script> with Content-Type: text/plain
+```
+
+**Impact:** Low in the intended Microsoft Graph flow (token is Microsoft-generated), but if the endpoint is hit directly, an attacker could use this for cache poisoning, response splitting, or social engineering by making the endpoint return attacker-controlled content.
+
+**Recommendation:** Validate the validationToken format (e.g., JWT-like structure, length limits) before echoing, or set `Content-Type: text/plain; charset=utf-8` with `X-Content-Type-Options: nosniff` to reduce MIME confusion risk.
+
+---
+
+## INFO (Positive Findings)
+
+### A. No Stack Traces in Error Responses
+
+All errors (422, 404, 429, 500 if triggered) return generic JSON messages without internal details or stack traces. Good.
+
+### B. Pydantic Input Validation is Effective
+
+- `page_size` capped at 500 (returns 422 for 501, 0, -1)
+- `hours` capped at 720 (returns 422 for 721)
+- Invalid cursors return 400 with "Invalid cursor"
+- Malformed JSON bodies return 422 with field-level validation errors
+- `AlertCondition` op field strictly validated against `Literal["eq", "neq", "contains", "in", "after_hours"]`
+
+### C. NoSQL Injection Resistant
+
+MongoDB operators passed as string filter values are treated as literals, not operators:
+
+```bash
+curl "http://localhost:8001/api/events?operation=\$ne"
+# Returns 0 results (treated as literal string "$ne")
+```
+
+The `_build_query()` function in `events.py` uses `re.escape()` for search input and constructs queries safely.
+
+### D. Bulk Tags Pre-Count Check Works
+
+`bulk_tags` endpoint capped at 10,000 matched documents via pre-count check. 93 events were successfully tagged with no bypass.
+
+### E. Rate Limiting Works When Redis is Healthy
+
+- `/api/fetch-audit-logs`: 429 after 11 requests (limit: 10/hr)
+- `/api/events`: 429 after ~120 requests (limit: 120/min)
+- Exempt paths work correctly: `/health`, `/metrics`, `/api/config/auth`, `/api/config/features`
+- `Retry-After` header is returned on 429 responses
+
+---
+
+## Recommendations Summary
+
+| Priority | Action | Effort |
+|----------|--------|--------|
+| P0 | Fix CORS: do not allow credentials with wildcard/reflected origins | Small |
+| P1 | Add security headers middleware (X-Content-Type-Options, X-Frame-Options, HSTS, Referrer-Policy) | Small |
+| P2 | Make rate limiter fail-closed on Redis errors | Small |
+| P2 | Disable OpenAPI docs in production or gate behind auth | Small |
+| P3 | Sanitize or validate webhook validationToken before echo | Small |
+| P3 | Gate `/metrics` behind IP allowlist | Small |
+| P3 | Hide tenant_id/client_id from `/api/config/auth` when auth is disabled | Tiny |
+| P4 | Consider Alpine.js CSP build to remove `unsafe-eval` from script-src | Medium |
+
+---
+
+## Test Environment
+
+```
+Backend: uvicorn on localhost:8001 (auth=false, ai=false)
+MongoDB: docker container, port 27018
+Redis:   docker container, port 6380
+```
+
+*Test commands and raw outputs available in `/tmp/pen_test*.sh` scripts.*
--- a/RELEASE_NOTES_v1.7.12.md
+++ b/RELEASE_NOTES_v1.7.12.md
@@ -0,0 +1,43 @@
+# AOC v1.7.12 Release Notes
+
+**Release Date:** 2026-04-27
+
+## Security Hardening (Penetration Test Remediation)
+
+This release addresses all findings from the internal soft penetration test of v1.7.11.
+
+### Critical Fix: CORS Credentials Leak
+- **Issue:** When `AUTH_ENABLED=true` and `CORS_ORIGINS="*"`, the CORS middleware reflected any origin with `Access-Control-Allow-Credentials: true`, allowing cross-origin authenticated requests from attacker-controlled domains.
+- **Fix:** When auth is enabled with a wildcard origin, `allow_credentials` is now forced to `False`. CORS still works for unauthenticated requests, but bearer tokens cannot be leaked cross-origin.
+
+### High Fix: Missing Security Headers
+- Added `X-Content-Type-Options: nosniff`
+- Added `X-Frame-Options: DENY`
+- Added `Referrer-Policy: strict-origin-when-cross-origin`
+- Added `Permissions-Policy` restricting browser features (accelerometer, camera, geolocation, gyroscope, magnetometer, microphone, payment, USB)
+
+### Medium Fixes
+- **Rate limiter fail-closed:** Previously, a Redis outage silently disabled all rate limiting. The rate limiter now returns `429` when Redis is unreachable.
+- **OpenAPI docs exposure:** `/docs`, `/redoc`, and `/openapi.json` are disabled by default. Set `DOCS_ENABLED=true` to re-enable (intended for development only).
+
+### Low Fixes
+- **Information disclosure:** `/api/config/auth` no longer leaks `tenant_id` and `client_id` when `auth_enabled=false`.
+- **Webhook validation token:** Added length cap (1024 chars) and ASCII-only validation before echoing `validationToken`. Response now includes `X-Content-Type-Options: nosniff`.
+
+## Files Changed
+
+| File | Change |
+|------|--------|
+| `backend/main.py` | CORS fix, security headers middleware, conditional OpenAPI docs |
+| `backend/config.py` | Added `DOCS_ENABLED` setting |
+| `backend/rate_limiter.py` | Fail-closed on Redis errors |
+| `backend/routes/config.py` | Hide tenant/client IDs when auth disabled |
+| `backend/routes/webhooks.py` | Validate validationToken before echo |
+| `backend/tests/conftest.py` | Enhanced FakeRedis mock with `incr`/`expire` |
+| `.env.example` | Documented `DOCS_ENABLED` |
+| `VERSION` | Bumped to 1.7.12 |
+
+## Test Results
+
+- **80/80 pytest tests passing**
+- Penetration test report: `PEN_TEST_REPORT_v1.7.11.md`
--- a/2
+++ b/2
@@ -1 +1 @@
-1.7.10
+1.7.12
--- a/backend/config.py
+++ b/backend/config.py
@@ -76,6 +76,10 @@ class Settings(BaseSettings):
    RATE_LIMIT_REQUESTS: int = 120
    RATE_LIMIT_WINDOW_SECONDS: int = 60

+    # Security / docs exposure
+    DOCS_ENABLED: bool = False
+    METRICS_ALLOWED_IPS: str = "127.0.0.1,::1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"
+

 _settings = Settings()

@@ -127,3 +131,6 @@ WEBHOOK_CLIENT_SECRET = _settings.WEBHOOK_CLIENT_SECRET
 RATE_LIMIT_ENABLED = _settings.RATE_LIMIT_ENABLED
 RATE_LIMIT_REQUESTS = _settings.RATE_LIMIT_REQUESTS
 RATE_LIMIT_WINDOW_SECONDS = _settings.RATE_LIMIT_WINDOW_SECONDS
+
+DOCS_ENABLED = _settings.DOCS_ENABLED
+METRICS_ALLOWED_IPS = _settings.METRICS_ALLOWED_IPS
--- a/backend/main.py
+++ b/backend/main.py
@@ -1,4 +1,5 @@
 import asyncio
+import ipaddress
 import logging
 import os
 import time
@@ -7,7 +8,15 @@ from pathlib import Path

 import structlog
 from audit_trail import log_action
-from config import AI_FEATURES_ENABLED, AUTH_ENABLED, CORS_ORIGINS, ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES
+from config import (
+    AI_FEATURES_ENABLED,
+    AUTH_ENABLED,
+    CORS_ORIGINS,
+    DOCS_ENABLED,
+    ENABLE_PERIODIC_FETCH,
+    FETCH_INTERVAL_MINUTES,
+    METRICS_ALLOWED_IPS,
+)
 from database import setup_indexes
 from fastapi import FastAPI, HTTPException, Request
 from fastapi.middleware.cors import CORSMiddleware
@@ -51,20 +60,28 @@ def configure_logging():
 configure_logging()
 logger = structlog.get_logger("aoc.fetcher")

-app = FastAPI()
+# Disable OpenAPI docs in production by default
+app = FastAPI(
+    docs_url="/docs" if DOCS_ENABLED else None,
+    redoc_url="/redoc" if DOCS_ENABLED else None,
+    openapi_url="/openapi.json" if DOCS_ENABLED else None,
+)

-# CORS: warn if wildcard is used with auth enabled, but do not break deployments
+# CORS: when auth is enabled, never allow credentials with wildcard origins
 _effective_cors = CORS_ORIGINS
+_cors_credentials = True
 if AUTH_ENABLED and "*" in _effective_cors:
    logger.warning(
-        "CORS wildcard (*) is insecure when AUTH_ENABLED=true. Set CORS_ORIGINS to your actual origin(s) in production."
+        "CORS wildcard (*) is insecure with AUTH_ENABLED=true and allow_credentials. "
+        "Disabling credentials. Set CORS_ORIGINS to your actual origin(s)."
    )
+    _cors_credentials = False

 app.add_middleware(CorrelationIdMiddleware)
 app.add_middleware(
    CORSMiddleware,
    allow_origins=_effective_cors,
-    allow_credentials=True,
+    allow_credentials=_cors_credentials,
    allow_methods=["*"],
    allow_headers=["*"],
 )
@@ -81,7 +98,7 @@ async def prometheus_middleware(request: Request, call_next):


@app.middleware("http")
-async def cache_control_middleware(request: Request, call_next):
+async def security_headers_middleware(request: Request, call_next):
    response = await call_next(request)
    # Prevent caching of HTML and API responses by default
    if request.url.path.startswith("/api/") or request.url.path in ("/", "/index.html"):
@@ -92,7 +109,7 @@ async def cache_control_middleware(request: Request, call_next):
    if request.url.path.startswith("/api/") or request.url.path in ("/", "/index.html"):
        response.headers["Content-Security-Policy"] = (
            "default-src 'self'; "
-            "script-src 'self' 'unsafe-inline' cdn.jsdelivr.net alcdn.msauth.net; "
+            "script-src 'self' 'unsafe-inline' 'unsafe-eval' cdn.jsdelivr.net alcdn.msauth.net; "
            "style-src 'self' 'unsafe-inline'; "
            "connect-src 'self' https://login.microsoftonline.com; "
            "frame-src 'self' https://login.microsoftonline.com; "
@@ -100,6 +117,13 @@ async def cache_control_middleware(request: Request, call_next):
            "img-src 'self' data:; "
            "font-src 'self' data:;"
        )
+    # Additional security headers
+    response.headers["X-Content-Type-Options"] = "nosniff"
+    response.headers["X-Frame-Options"] = "DENY"
+    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
+    response.headers["Permissions-Policy"] = (
+        "accelerometer=(), camera=(), geolocation=(), gyroscope=(), magnetometer=(), microphone=(), payment=(), usb=()"
+    )
    return response


@@ -165,8 +189,39 @@ async def health_check():
        raise HTTPException(status_code=503, detail="Database unavailable") from exc


+def _client_ip(request: Request) -> str:
+    """Best-effort client IP: X-Forwarded-For first hop, or direct client host."""
+    forwarded = request.headers.get("x-forwarded-for")
+    if forwarded:
+        return forwarded.split(",")[0].strip()
+    return request.client.host if request.client else ""
+
+
+def _is_metrics_allowed(ip: str) -> bool:
+    """Check if IP is in the configured metrics allowlist."""
+    if not METRICS_ALLOWED_IPS:
+        return True
+    try:
+        client_addr = ipaddress.ip_address(ip)
+    except ValueError:
+        return False
+    for network in METRICS_ALLOWED_IPS.split(","):
+        network = network.strip()
+        if not network:
+            continue
+        try:
+            if client_addr in ipaddress.ip_network(network, strict=False):
+                return True
+        except ValueError:
+            continue
+    return False
+
+
@app.get("/metrics")
-async def metrics():
+async def metrics(request: Request):
+    client_ip = _client_ip(request)
+    if not _is_metrics_allowed(client_ip):
+        raise HTTPException(status_code=403, detail="Forbidden")
    return Response(content=prometheus_metrics(), media_type="text/plain")


--- a/backend/rate_limiter.py
+++ b/backend/rate_limiter.py
@@ -79,4 +79,5 @@ async def check_rate_limit(request: Request):
    except RateLimitExceeded:
        raise
    except Exception as exc:
-        logger.warning("Rate limiter Redis error; allowing request", error=str(exc))
+        logger.warning("Rate limiter Redis error; failing closed", error=str(exc))
+        raise RateLimitExceeded(retry_after=60) from None
--- a/backend/routes/config.py
+++ b/backend/routes/config.py
@@ -18,8 +18,8 @@ def auth_config():
    logger.debug("Auth config requested", auth_enabled=AUTH_ENABLED)
    return {
        "auth_enabled": AUTH_ENABLED,
-        "tenant_id": AUTH_TENANT_ID,
-        "client_id": AUTH_CLIENT_ID,
+        "tenant_id": AUTH_TENANT_ID if AUTH_ENABLED else "",
+        "client_id": AUTH_CLIENT_ID if AUTH_ENABLED else "",
        "scope": AUTH_SCOPE,
        "redirect_uri": None,  # frontend uses window.location.origin by default
    }
--- a/backend/routes/webhooks.py
+++ b/backend/routes/webhooks.py
@@ -17,7 +17,15 @@ async def graph_webhook(request: Request):
    if validation_token:
        # Microsoft sends validationToken as a query param during subscription creation.
        # Echo it back as plain text to prove endpoint ownership.
-        return Response(content=validation_token, media_type="text/plain")
+        # Validate to prevent content injection if endpoint is hit directly.
+        if len(validation_token) > 1024 or not validation_token.isascii():
+            logger.warning("Invalid validationToken rejected", length=len(validation_token))
+            return Response(status_code=400)
+        return Response(
+            content=validation_token,
+            media_type="text/plain",
+            headers={"X-Content-Type-Options": "nosniff"},
+        )

    try:
        body = await request.json()
--- a/backend/tests/conftest.py
+++ b/backend/tests/conftest.py
@@ -51,18 +51,32 @@ def client(mock_events_collection, mock_watermarks_collection, monkeypatch):

    # Mock Redis so tests don't require a running Redis server
    class FakeRedis:
+        _store = {}
+
        async def get(self, key):
-            return None
+            return self._store.get(key)

        async def setex(self, key, ttl, value):
+            self._store[key] = value
+
+        async def incr(self, key):
+            self._store[key] = self._store.get(key, 0) + 1
+            return self._store[key]
+
+        async def expire(self, key, ttl):
            pass

    async def fake_get_arq_pool():
        return FakeRedis()

+    async def fake_get_redis():
+        return FakeRedis()
+
    monkeypatch.setattr("redis_client.get_arq_pool", fake_get_arq_pool)
+    monkeypatch.setattr("redis_client.get_redis", fake_get_redis)
    monkeypatch.setattr("routes.ask.get_arq_pool", fake_get_arq_pool)
-    monkeypatch.setattr("routes.jobs.get_redis", fake_get_arq_pool)
+    monkeypatch.setattr("routes.jobs.get_redis", fake_get_redis)
+    monkeypatch.setattr("rate_limiter.get_redis", fake_get_redis)

    from main import app

--- a/backend/tests/test_api.py
+++ b/backend/tests/test_api.py
@@ -268,7 +268,7 @@ def test_health(client):


 def test_metrics(client):
-    response = client.get("/metrics")
+    response = client.get("/metrics", headers={"X-Forwarded-For": "127.0.0.1"})
    assert response.status_code == 200
    assert "aoc_request_duration_seconds" in response.text
Author	SHA1	Message	Date
Tomas Kracmar	07a841615b	v1.7.12: security hardening — CORS fix, security headers, fail-closed rate limiter, OpenAPI docs disabled by default, config auth privacy, webhook validation All checks were successful Release / build-and-push (push) Successful in 44s Details CI / lint-and-test (push) Successful in 22s Details	2026-04-27 14:19:28 +02:00
Tomas Kracmar	c086fa4260	hotfix(v1.7.11): add unsafe-eval to CSP for Alpine.js All checks were successful CI / lint-and-test (push) Successful in 1m26s Details Release / build-and-push (push) Successful in 3m1s Details	2026-04-27 10:39:33 +02:00
@@ -1 +1 @@
 .7.10
 .7.12