feat: implement Phase 1 hardening
- Verify JWT signatures via JWKS in auth.py
- Fix broken frontend auth button references
- Add Pydantic Settings for env validation (RETENTION_DAYS, CORS_ORIGINS)
- Create MongoDB indexes + TTL on startup
- Add /health endpoint and CORS middleware
- Escape regex input in event queries
- Fix dedupe() return calculation in maintenance.py
- Replace basic logging with structured structlog JSON logs
- Update README and add ROADMAP.md
.env.example
@@ -14,3 +14,12 @@ AUTH_ALLOWED_GROUPS=
 MONGO_ROOT_USERNAME=root
 MONGO_ROOT_PASSWORD=example
 MONGO_PORT=27017
+
+# MongoDB connection string (takes precedence over root credentials in Docker Compose)
+MONGO_URI=mongodb://root:example@localhost:27017
+
+# Optional: number of days to retain events in MongoDB (0 = disabled)
+RETENTION_DAYS=0
+
+# Optional: comma-separated CORS origins (e.g., http://localhost:3000,https://app.example.com)
+CORS_ORIGINS=*
AGENTS.md (new file, 132 lines)
@@ -0,0 +1,132 @@
# Admin Operations Center (AOC)

## Project Overview

AOC is a FastAPI microservice that ingests Microsoft Entra (Azure AD) audit logs, Intune audit logs, and Exchange/SharePoint/Teams admin audits (via the Office 365 Management Activity API) into MongoDB. It deduplicates events, enriches them with readable names from Microsoft Graph, and exposes a REST API plus a minimal web UI for searching, filtering, and reviewing events.

## Technology Stack

- **Runtime**: Python 3.11
- **Web Framework**: FastAPI + Uvicorn
- **Database**: MongoDB (PyMongo)
- **Frontend**: Vanilla HTML/CSS/JS (served as static files from `backend/frontend/`)
- **Authentication**: Optional OIDC Bearer token validation against Microsoft Entra (using `python-jose` and MSAL.js on the frontend)
- **External APIs**: Microsoft Graph API, Office 365 Management Activity API
- **Deployment**: Docker Compose

## Project Structure

```
backend/
  main.py            # FastAPI app, router registration, background periodic fetch
  config.py          # Environment-based configuration (loads .env)
  database.py        # MongoClient setup (db = micro_soc, collection = events)
  auth.py            # OIDC Bearer token validation, JWKS caching, role/group checks
  requirements.txt   # Python dependencies
  Dockerfile         # python:3.11-slim image
  routes/
    fetch.py         # GET /api/fetch-audit-logs, run_fetch()
    events.py        # GET /api/events, GET /api/filter-options
    config.py        # GET /api/config/auth
  graph/
    auth.py          # Client credentials token acquisition for Graph
    audit_logs.py    # Fetch and enrich directory audit logs from Graph
    resolve.py       # Resolve directory object IDs to human-readable names
  sources/
    unified_audit.py # Office 365 Management Activity API (Exchange/SharePoint/Teams)
    intune_audit.py  # Intune audit events from Graph
  models/
    event_model.py   # normalize_event() — transforms raw events to stored schema
  mapping_loader.py  # Loads mappings.yml (cached) with fallback defaults
  mappings.yml       # User-editable category labels and summary templates
  maintenance.py     # CLI for re-normalization and deduplication of stored events
  frontend/
    index.html       # Single-page UI with filters, pagination, raw-event modal
    style.css        # Dark-themed stylesheet
```

## Configuration

Copy `.env.example` to `.env` at the repo root and fill in values:

```bash
cp .env.example .env
```

Key variables:

- `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET` — Microsoft app registration credentials (application permissions)
- `AUTH_ENABLED` — set `true` to protect API/UI with OIDC Bearer tokens
- `AUTH_TENANT_ID`, `AUTH_CLIENT_ID` — token validation audience/issuer
- `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS` — comma-separated access control lists
- `ENABLE_PERIODIC_FETCH`, `FETCH_INTERVAL_MINUTES` — background ingestion scheduler
- `MONGO_ROOT_USERNAME`, `MONGO_ROOT_PASSWORD`, `MONGO_PORT` — used by Docker Compose for MongoDB

## Build and Run Commands

**Docker Compose (recommended):**

```bash
docker compose up --build
```

- API/UI: http://localhost:8000
- MongoDB: localhost:27017

**Local development (without Docker):**

```bash
# 1) Start MongoDB
docker run --rm -p 27017:27017 -e MONGO_INITDB_ROOT_USERNAME=root -e MONGO_INITDB_ROOT_PASSWORD=example mongo:7

# 2) Run backend
cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
set -a; source ../.env; set +a   # export .env vars (handles comment lines, unlike `export $(cat ...)`)
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

## API Endpoints

- `GET /api/fetch-audit-logs?hours=168` — pulls the last N hours (capped at 720 / 30 days) from all sources, normalizes, dedupes, and upserts into MongoDB
- `GET /api/events` — list stored events with filters (`service`, `actor`, `operation`, `result`, `start`, `end`, `search`) and pagination (`page`, `page_size`)
- `GET /api/filter-options` — best-effort distinct values for UI dropdowns
- `GET /api/config/auth` — auth configuration exposed to the frontend

## Code Conventions

- Python modules use absolute imports within the `backend/` package (e.g., `from graph.auth import get_access_token`). When running locally, ensure the working directory is `backend/` so these resolve correctly.
- No formal formatter or linter is configured. Keep changes consistent with the existing style: simple functions, explicit exception handling, and informative docstrings.
- The frontend is a single HTML file with inline JavaScript. It relies on the MSAL.js CDN (`https://alcdn.msauth.net/browser/2.37.0/js/msal-browser.min.js`).

## Testing

There are currently **no automated tests** in this repository. When adding new features or bug fixes, verify behavior manually:

1. Start the server (Docker Compose or local uvicorn).
2. Run a smoke test:

   ```bash
   curl http://localhost:8000/api/events
   curl http://localhost:8000/api/fetch-audit-logs
   ```

3. Open http://localhost:8000 in a browser, apply filters, paginate, and click "View raw event".

## Security Considerations

- **Secrets**: `CLIENT_SECRET` and other credentials come from `.env`. Never commit `.env`.
- **Auth validation**: When `AUTH_ENABLED=true`, the backend fetches JWKS via `https://login.microsoftonline.com/{AUTH_TENANT_ID}/v2.0/.well-known/openid-configuration`, caches the keys for 1 hour, verifies RS256 token signatures against the matching JWKS key, and validates the audience, tenant, and issuer claims.
- **Role/Group gating**: Access is allowed if the token's `roles` intersect `AUTH_ALLOWED_ROLES` or `groups` intersect `AUTH_ALLOWED_GROUPS`. If neither list is configured, all authenticated users are allowed.
- **Pagination limits**: `page_size` is clamped to a maximum of 500 to prevent large queries.
- **Fetch window cap**: `hours` is clamped to 720 (30 days) to avoid runaway API calls.

## Maintenance and Operations

The `backend/maintenance.py` script provides two CLI commands useful for backfilling or correcting stored data:

```bash
# Re-run Graph enrichment + normalization on stored events
docker compose run --rm backend python maintenance.py renormalize --limit 500

# Remove duplicate events based on dedupe_key
docker compose run --rm backend python maintenance.py dedupe
```

Both commands operate directly against the MongoDB collection configured in `config.py`.

README.md
@@ -32,6 +32,12 @@ cp .env.example .env
 # AUTH_ALLOWED_ROLES=Admins,SecurityOps
 # ENABLE_PERIODIC_FETCH=true
 # FETCH_INTERVAL_MINUTES=60
+
+# Optional: data retention (auto-expire old events via MongoDB TTL)
+# RETENTION_DAYS=90
+
+# Optional: CORS origins if the frontend is served separately
+# CORS_ORIGINS=http://localhost:3000,https://app.example.com
 ```

## Run with Docker Compose (recommended)

@@ -40,6 +46,7 @@ docker compose up --build
 ```
 - API: http://localhost:8000
 - Frontend: http://localhost:8000
+- Health: http://localhost:8000/health
 - Mongo: localhost:27017 (root/example)

## Run locally without Docker

@@ -57,6 +64,7 @@ uvicorn main:app --reload --host 0.0.0.0 --port 8000
 ```

 ## API
+- `GET /health` — health check with MongoDB connectivity status.
 - `GET /api/fetch-audit-logs` — pulls the last 7 days by default (override with `?hours=N`, capped to 30 days) of:
   - Entra directory audit logs (`/auditLogs/directoryAudits`)
   - Exchange/SharePoint/Teams admin audits (via Office 365 Management Activity API)

@@ -89,6 +97,7 @@ Stored document shape (collection `micro_soc.events`):
 ## Quick smoke tests
 With the server running:
 ```bash
+curl http://localhost:8000/health
 curl http://localhost:8000/api/events
 curl http://localhost:8000/api/fetch-audit-logs
 ```
ROADMAP.md (new file, 63 lines)
@@ -0,0 +1,63 @@
# AOC Roadmap

This roadmap tracks planned improvements for the Admin Operations Center (AOC) project, organized by phase.

---

## Phase 1: Harden ✅

Goal: fix critical security and reliability gaps before production use.

- [x] Fix JWT signature verification in `auth.py`
- [x] Fix broken frontend auth button references (`loginBtn` / `logoutBtn`)
- [x] Add MongoDB indexes (`dedupe_key`, `timestamp`, `service+timestamp`, `id`, text search)
- [x] Add MongoDB TTL index for data retention (`RETENTION_DAYS`)
- [x] Add `/health` endpoint with database connectivity check
- [x] Replace manual `os.getenv` parsing with Pydantic Settings (`pydantic-settings`)
- [x] Add structured JSON logging (`structlog`)
- [x] Configure CORS middleware via `CORS_ORIGINS` environment variable
- [x] Escape user input before MongoDB `$regex` queries (`routes/events.py`)
- [x] Fix incorrect return value in `maintenance.py dedupe()`

---

## Phase 2: Stabilize

Goal: improve resilience, code quality, and development experience.

- [ ] Cache Graph API tokens and reuse them until near expiry
- [ ] Add exponential backoff / retry logic for Graph API and Office 365 API calls
- [ ] Add unit tests for `normalize_event()`, `_make_dedupe_key()`, and `auth.py`
- [ ] Add integration tests for `/api/events` and `/api/fetch-audit-logs`
- [ ] Configure linter/formatter (`ruff` or `black` + `isort`) and pre-commit hooks
- [ ] Set up GitHub Actions CI pipeline (lint + test)
- [ ] Add Pydantic request/response models for API endpoints
- [ ] Validate `page_size` and `hours` with strict FastAPI constraints

---

## Phase 3: Scale

Goal: handle larger data volumes and support real-time ingestion.

- [ ] Replace skip-based pagination with cursor-based (search-after) pagination
- [ ] Add Prometheus `/metrics` endpoint and a Grafana dashboard
- [ ] Implement incremental fetch watermarking per source (store last fetch timestamp)
- [ ] Add webhook endpoints to receive Microsoft Graph change notifications
- [ ] Evaluate Elasticsearch or Azure Cognitive Search for advanced full-text search
- [ ] Add request ID / correlation ID middleware for distributed tracing

---

## Phase 4: Enhance

Goal: evolve from a polling dashboard into a full security operations tool.

- [ ] Migrate frontend to a maintainable framework (Vue 3, React, or HTMX + Alpine.js)
- [ ] Add rule-based alerting (e.g., alert on privileged operations, after-hours activity)
- [ ] Add SIEM export (Splunk, Sentinel, syslog webhook)
- [ ] Build an audit trail for AOC itself (who queried what, who triggered fetches)
- [ ] Add event tagging and commenting (e.g., `investigating`, `false_positive`)
- [ ] Add export functionality (CSV / JSON) from the UI
- [ ] Add source health dashboard showing last fetch time and status per source

---

## Completed in this PR

All Phase 1 items were implemented in the latest changes.
backend/auth.py
@@ -1,10 +1,11 @@
 import time
-import logging
+import structlog
 from typing import Optional, Set

 import requests
 from fastapi import Depends, HTTPException, Header
 from jose import jwt
+from jose.jwk import construct

 from config import (
     AUTH_ENABLED,
@@ -15,7 +16,7 @@ from config import (
 )

 JWKS_CACHE = {"exp": 0, "keys": []}
-logger = logging.getLogger("aoc.auth")
+logger = structlog.get_logger("aoc.auth")


 def _get_jwks():
@@ -48,9 +49,18 @@ def _allowed(claims: dict, allowed_roles: Set[str], allowed_groups: Set[str]) ->

 def _decode_token(token: str, jwks):
     try:
-        # Unverified decode to accept tokens from single-app setups without strict signing validation.
-        claims = jwt.get_unverified_claims(token)
+        header = jwt.get_unverified_header(token)
+        kid = header.get("kid")
+        key_dict = next((k for k in jwks if k.get("kid") == kid), None)
+        if not key_dict:
+            raise HTTPException(status_code=401, detail="Invalid token: signing key not found")
+
+        key = construct(key_dict)
+        decode_kwargs = {"algorithms": ["RS256"]}
+        if AUTH_CLIENT_ID:
+            decode_kwargs["audience"] = AUTH_CLIENT_ID
+        claims = jwt.decode(token, key, **decode_kwargs)

         tid = claims.get("tid")
         iss = claims.get("iss", "")
         if AUTH_TENANT_ID and tid and tid != AUTH_TENANT_ID:
@@ -61,7 +71,7 @@ def _decode_token(token: str, jwks):
     except HTTPException:
         raise
     except Exception as exc:
-        logger.warning("Token parse failed: %s", exc)
+        logger.warning("Token verification failed", error=str(exc))
         raise HTTPException(status_code=401, detail="Invalid token")
backend/config.py
@@ -1,22 +1,59 @@
 import os
 from dotenv import load_dotenv
+from pydantic_settings import BaseSettings, SettingsConfigDict

 load_dotenv()

-TENANT_ID = os.getenv("TENANT_ID")
-CLIENT_ID = os.getenv("CLIENT_ID")
-CLIENT_SECRET = os.getenv("CLIENT_SECRET")
-MONGO_URI = os.getenv("MONGO_URI")
-DB_NAME = "micro_soc"
-
-# Optional periodic fetch settings
-ENABLE_PERIODIC_FETCH = os.getenv("ENABLE_PERIODIC_FETCH", "false").lower() == "true"
-FETCH_INTERVAL_MINUTES = int(os.getenv("FETCH_INTERVAL_MINUTES", "60"))
-
-# Auth (OIDC/Bearer) settings
-AUTH_ENABLED = os.getenv("AUTH_ENABLED", "false").lower() == "true"
-AUTH_TENANT_ID = os.getenv("AUTH_TENANT_ID") or TENANT_ID or ""
-AUTH_CLIENT_ID = os.getenv("AUTH_CLIENT_ID") or CLIENT_ID or ""
-AUTH_SCOPE = os.getenv("AUTH_SCOPE", "")
-AUTH_ALLOWED_ROLES = set([r.strip() for r in os.getenv("AUTH_ALLOWED_ROLES", "").split(",") if r.strip()])
-AUTH_ALLOWED_GROUPS = set([g.strip() for g in os.getenv("AUTH_ALLOWED_GROUPS", "").split(",") if g.strip()])
+class Settings(BaseSettings):
+    model_config = SettingsConfigDict(
+        env_file=[".env", "../.env"],
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+    # Microsoft Graph / App credentials
+    TENANT_ID: str = ""
+    CLIENT_ID: str = ""
+    CLIENT_SECRET: str = ""
+
+    # MongoDB
+    MONGO_URI: str = ""
+    DB_NAME: str = "micro_soc"
+
+    # Periodic fetch
+    ENABLE_PERIODIC_FETCH: bool = False
+    FETCH_INTERVAL_MINUTES: int = 60
+
+    # Auth (OIDC/Bearer) settings
+    AUTH_ENABLED: bool = False
+    AUTH_TENANT_ID: str = ""
+    AUTH_CLIENT_ID: str = ""
+    AUTH_SCOPE: str = ""
+    AUTH_ALLOWED_ROLES: str = ""
+    AUTH_ALLOWED_GROUPS: str = ""
+
+    # Data retention (0 = disabled)
+    RETENTION_DAYS: int = 0
+
+    # CORS
+    CORS_ORIGINS: str = "*"
+
+
+_settings = Settings()
+
+# Backward-compatible module-level exports
+TENANT_ID = _settings.TENANT_ID
+CLIENT_ID = _settings.CLIENT_ID
+CLIENT_SECRET = _settings.CLIENT_SECRET
+MONGO_URI = _settings.MONGO_URI
+DB_NAME = _settings.DB_NAME
+
+ENABLE_PERIODIC_FETCH = _settings.ENABLE_PERIODIC_FETCH
+FETCH_INTERVAL_MINUTES = _settings.FETCH_INTERVAL_MINUTES
+
+AUTH_ENABLED = _settings.AUTH_ENABLED
+AUTH_TENANT_ID = _settings.AUTH_TENANT_ID or _settings.TENANT_ID or ""
+AUTH_CLIENT_ID = _settings.AUTH_CLIENT_ID or _settings.CLIENT_ID or ""
+AUTH_SCOPE = _settings.AUTH_SCOPE
+AUTH_ALLOWED_ROLES = {r.strip() for r in _settings.AUTH_ALLOWED_ROLES.split(",") if r.strip()}
+AUTH_ALLOWED_GROUPS = {g.strip() for g in _settings.AUTH_ALLOWED_GROUPS.split(",") if g.strip()}
+
+RETENTION_DAYS = _settings.RETENTION_DAYS
+CORS_ORIGINS = [o.strip() for o in _settings.CORS_ORIGINS.split(",") if o.strip()]
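
The comma-splitting used for `AUTH_ALLOWED_ROLES`, `AUTH_ALLOWED_GROUPS`, and `CORS_ORIGINS` follows one pattern; pulled out as a hypothetical helper (not part of the diff) it behaves like this:

```python
def split_csv(value: str) -> list[str]:
    # The config.py pattern: split on commas, strip whitespace, drop empty entries.
    return [item.strip() for item in value.split(",") if item.strip()]
```

Note that the default `CORS_ORIGINS="*"` therefore becomes `["*"]`, which CORSMiddleware treats as allow-all.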

backend/database.py
@@ -1,6 +1,43 @@
-from pymongo import MongoClient
-from config import MONGO_URI, DB_NAME
+from pymongo import MongoClient, ASCENDING, DESCENDING, TEXT
+from config import MONGO_URI, DB_NAME, RETENTION_DAYS
+import structlog

 client = MongoClient(MONGO_URI)
 db = client[DB_NAME]
 events_collection = db["events"]
+logger = structlog.get_logger("aoc.database")
+
+
+def setup_indexes(max_retries: int = 5, delay: float = 2.0):
+    """Ensure MongoDB indexes exist. Retries on connection errors."""
+    from time import sleep
+
+    for attempt in range(1, max_retries + 1):
+        try:
+            events_collection.create_index("dedupe_key", unique=True, sparse=True)
+            events_collection.create_index([("timestamp", DESCENDING)])
+            events_collection.create_index([("service", ASCENDING), ("timestamp", DESCENDING)])
+            events_collection.create_index("id")
+            events_collection.create_index(
+                [("actor_display", TEXT), ("raw_text", TEXT), ("operation", TEXT)],
+                name="text_search_index",
+            )
+            if RETENTION_DAYS > 0:
+                events_collection.create_index(
+                    [("timestamp", ASCENDING)],
+                    expireAfterSeconds=RETENTION_DAYS * 24 * 60 * 60,
+                    name="ttl_timestamp",
+                )
+            else:
+                try:
+                    events_collection.drop_index("ttl_timestamp")
+                except Exception:
+                    pass
+            logger.info("MongoDB indexes ensured")
+            return
+        except Exception as exc:
+            if attempt == max_retries:
+                logger.error("Failed to ensure MongoDB indexes", error=str(exc))
+                raise
+            logger.warning("MongoDB not ready, retrying...", attempt=attempt, error=str(exc))
+            sleep(delay)

frontend/index.html
@@ -299,8 +299,7 @@ async function initAuth() {
     }

     if (!authConfig?.auth_enabled) {
-        loginBtn.classList.add('hidden');
-        logoutBtn.classList.add('hidden');
+        authBtn.classList.add('hidden');
         return;
     }
backend/main.py
@@ -2,41 +2,85 @@ import asyncio
 import logging
 from pathlib import Path

-from fastapi import FastAPI
+import structlog
+from fastapi import FastAPI, HTTPException
 from fastapi.staticfiles import StaticFiles
+from fastapi.middleware.cors import CORSMiddleware

+from database import setup_indexes
 from routes.fetch import router as fetch_router, run_fetch
 from routes.events import router as events_router
 from routes.config import router as config_router
-from config import ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES
+from config import ENABLE_PERIODIC_FETCH, FETCH_INTERVAL_MINUTES, CORS_ORIGINS
+
+
+def configure_logging():
+    structlog.configure(
+        processors=[
+            structlog.stdlib.filter_by_level,
+            structlog.stdlib.add_logger_name,
+            structlog.stdlib.add_log_level,
+            structlog.stdlib.PositionalArgumentsFormatter(),
+            structlog.processors.TimeStamper(fmt="iso"),
+            structlog.processors.StackInfoRenderer(),
+            structlog.processors.format_exc_info,
+            structlog.processors.UnicodeDecoder(),
+            structlog.processors.JSONRenderer(),
+        ],
+        context_class=dict,
+        logger_factory=structlog.stdlib.LoggerFactory(),
+        wrapper_class=structlog.stdlib.BoundLogger,
+        cache_logger_on_first_use=True,
+    )
+    logging.basicConfig(format="%(message)s", level=logging.INFO)
+
+
+configure_logging()
+logger = structlog.get_logger("aoc.fetcher")

 app = FastAPI()

+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=CORS_ORIGINS,
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
 app.include_router(fetch_router, prefix="/api")
 app.include_router(events_router, prefix="/api")
 app.include_router(config_router, prefix="/api")

+
+@app.get("/health")
+async def health_check():
+    from database import db
+    try:
+        db.command("ping")
+        return {"status": "ok", "database": "connected"}
+    except Exception as exc:
+        logger.error("Health check failed", error=str(exc))
+        raise HTTPException(status_code=503, detail="Database unavailable") from exc
+
+
 # Serve a minimal frontend for browsing events. Use an absolute path so it
 # works regardless of the working directory used to start uvicorn.
 frontend_dir = Path(__file__).parent / "frontend"
 app.mount("/", StaticFiles(directory=frontend_dir, html=True), name="frontend")

-logger = logging.getLogger("aoc.fetcher")


 async def _periodic_fetch():
     while True:
         try:
             await asyncio.to_thread(run_fetch)
             logger.info("Periodic fetch completed.")
         except Exception as exc:
-            logger.error("Periodic fetch failed: %s", exc)
+            logger.error("Periodic fetch failed", error=str(exc))
         await asyncio.sleep(FETCH_INTERVAL_MINUTES * 60)


 @app.on_event("startup")
 async def start_periodic_fetch():
+    setup_indexes()
     if ENABLE_PERIODIC_FETCH:
         app.state.fetch_task = asyncio.create_task(_periodic_fetch())
backend/maintenance.py
@@ -79,7 +79,8 @@ def dedupe(limit: int = None, batch_size: int = 500) -> int:
     if to_delete:
         events_collection.delete_many({"_id": {"$in": to_delete}})

-    return len(seen) - processed if processed > len(seen) else 0
+    removed = processed - len(seen)
+    return removed if removed > 0 else 0


 def main():
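
The fix reports duplicates as `processed - len(seen)` rather than the old inverted expression. A self-contained sketch of the counting logic (not the actual `maintenance.py` code, which reads from MongoDB):

```python
def dedupe_count(events: list[dict]) -> int:
    """Count events that would be removed as duplicates of an earlier dedupe_key."""
    seen = set()
    processed = 0
    for ev in events:
        processed += 1
        seen.add(ev["dedupe_key"])
    # Duplicates removed = documents scanned minus unique keys; clamp at zero.
    removed = processed - len(seen)
    return removed if removed > 0 else 0
```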

backend/requirements.txt
@@ -5,3 +5,5 @@ python-dotenv
 requests
 PyYAML
 python-jose[cryptography]
+pydantic-settings
+structlog
backend/routes/events.py
@@ -1,3 +1,4 @@
+import re
 from fastapi import APIRouter, HTTPException, Depends
 from database import events_collection
 from auth import require_auth
@@ -22,20 +23,21 @@ def list_events(
     if service:
         filters.append({"service": service})
     if actor:
+        actor_safe = re.escape(actor)
         filters.append(
             {
                 "$or": [
-                    {"actor_display": {"$regex": actor, "$options": "i"}},
-                    {"actor_upn": {"$regex": actor, "$options": "i"}},
-                    {"actor.user.userPrincipalName": {"$regex": actor, "$options": "i"}},
+                    {"actor_display": {"$regex": actor_safe, "$options": "i"}},
+                    {"actor_upn": {"$regex": actor_safe, "$options": "i"}},
+                    {"actor.user.userPrincipalName": {"$regex": actor_safe, "$options": "i"}},
                     {"actor.user.id": actor},
                 ]
             }
         )
     if operation:
-        filters.append({"operation": {"$regex": operation, "$options": "i"}})
+        filters.append({"operation": {"$regex": re.escape(operation), "$options": "i"}})
     if result:
-        filters.append({"result": {"$regex": result, "$options": "i"}})
+        filters.append({"result": {"$regex": re.escape(result), "$options": "i"}})
     if start or end:
         time_filter = {}
         if start:
@@ -44,14 +46,15 @@ def list_events(
             time_filter["$lte"] = end
         filters.append({"timestamp": time_filter})
     if search:
+        search_safe = re.escape(search)
         filters.append(
             {
                 "$or": [
-                    {"raw_text": {"$regex": search, "$options": "i"}},
-                    {"display_summary": {"$regex": search, "$options": "i"}},
-                    {"actor_display": {"$regex": search, "$options": "i"}},
-                    {"target_displays": {"$elemMatch": {"$regex": search, "$options": "i"}}},
-                    {"operation": {"$regex": search, "$options": "i"}},
+                    {"raw_text": {"$regex": search_safe, "$options": "i"}},
+                    {"display_summary": {"$regex": search_safe, "$options": "i"}},
+                    {"actor_display": {"$regex": search_safe, "$options": "i"}},
+                    {"target_displays": {"$elemMatch": {"$regex": search_safe, "$options": "i"}}},
+                    {"operation": {"$regex": search_safe, "$options": "i"}},
                 ]
             }
         )