aoc/ROADMAP.md
Tomas Kracmar 4f6e16d64d feat: implement Phase 1 hardening
- Verify JWT signatures via JWKS in auth.py
- Fix broken frontend auth button references
- Add Pydantic Settings for env validation (RETENTION_DAYS, CORS_ORIGINS)
- Create MongoDB indexes + TTL on startup
- Add /health endpoint and CORS middleware
- Escape regex input in event queries
- Fix dedupe() return calculation in maintenance.py
- Replace basic logging with structured structlog JSON logs
- Update README and add ROADMAP.md
2026-04-14 11:48:29 +02:00

# AOC Roadmap
This roadmap tracks planned improvements for the Admin Operations Center (AOC) project, organized by phase.

---
## Phase 1: Harden ✅
Goal: fix critical security and reliability gaps before production use.
- [x] Fix JWT signature verification in `auth.py`
- [x] Fix broken frontend auth button references (`loginBtn` / `logoutBtn`)
- [x] Add MongoDB indexes (`dedupe_key`, `timestamp`, `service+timestamp`, `id`, text search)
- [x] Add MongoDB TTL index for data retention (`RETENTION_DAYS`)
- [x] Add `/health` endpoint with database connectivity check
- [x] Replace manual `os.getenv` parsing with Pydantic Settings (`pydantic-settings`)
- [x] Add structured JSON logging (`structlog`)
- [x] Configure CORS middleware via `CORS_ORIGINS` environment variable
- [x] Escape user input before MongoDB `$regex` queries (`routes/events.py`)
- [x] Fix incorrect return value of `dedupe()` in `maintenance.py`
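
As an illustration of the `$regex` escaping item above, here is a minimal sketch of building a literal "contains" filter from user input. The helper name `build_contains_filter` and the field name `operation` are hypothetical and not taken from `routes/events.py`; the sketch also assumes that Python's `re.escape` output is acceptable to MongoDB's regex engine for the metacharacters involved.

```python
import re


def build_contains_filter(field: str, user_text: str) -> dict:
    """Build a MongoDB filter that matches user_text literally (case-insensitive).

    Hypothetical helper: escaping keeps characters such as '.' or '*' in the
    user's input from being interpreted as regex operators.
    """
    return {field: {"$regex": re.escape(user_text), "$options": "i"}}


# Example: the input "a.*b" is matched as literal text, not as a wildcard pattern.
query = build_contains_filter("operation", "a.*b")
```

Without escaping, input such as `.*` would match every document, and crafted patterns could trigger expensive backtracking.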
---
## Phase 2: Stabilize
Goal: improve resilience, code quality, and development experience.
- [ ] Cache Graph API tokens and reuse them until near expiry
- [ ] Add exponential backoff / retry logic for Graph API and Office 365 API calls
- [ ] Add unit tests for `normalize_event()`, `_make_dedupe_key()`, and `auth.py`
- [ ] Add integration tests for `/api/events` and `/api/fetch-audit-logs`
- [ ] Configure linter/formatter (`ruff` or `black` + `isort`) and pre-commit hooks
- [ ] Set up GitHub Actions CI pipeline (lint + test)
- [ ] Add Pydantic request/response models for API endpoints
- [ ] Validate `page_size` and `hours` with strict FastAPI constraints
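
The retry item above could look roughly like the following sketch of exponential backoff with full jitter. The function `retry_with_backoff`, its parameters, and the choice of retryable exception types are assumptions for illustration; a real implementation would map them to whatever errors the Graph API and Office 365 clients raise.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    retry_on: tuple = (ConnectionError, TimeoutError),  # assumed retryable errors
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds or max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the ceiling.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so that unit tests can record delays instead of actually waiting.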
---
## Phase 3: Scale
Goal: handle larger data volumes and support real-time ingestion.
- [ ] Replace skip-based pagination with cursor-based (search-after) pagination
- [ ] Add Prometheus `/metrics` endpoint and a Grafana dashboard
- [ ] Implement incremental fetch watermarking per source (store last fetch timestamp)
- [ ] Add webhook endpoints to receive Microsoft Graph change notifications
- [ ] Evaluate Elasticsearch or Azure AI Search (formerly Azure Cognitive Search) for advanced full-text search
- [ ] Add request ID / correlation ID middleware for distributed tracing
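
To make the cursor-based pagination item concrete, here is a rough sketch of how an opaque cursor could replace `skip`. It assumes events are sorted by `timestamp` descending with `id` as a tie-breaker, and that `timestamp` is stored as an ISO-8601 string that sorts lexicographically; the function names are hypothetical.

```python
import base64
import json
from typing import Any, Optional


def encode_cursor(timestamp: str, doc_id: str) -> str:
    """Pack the sort key of the last returned event into an opaque token."""
    raw = json.dumps([timestamp, doc_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    """Recover (timestamp, id) from a token produced by encode_cursor."""
    timestamp, doc_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return timestamp, doc_id


def next_page_filter(cursor: Optional[str]) -> dict:
    """Build a MongoDB filter selecting events strictly after the cursor.

    Assumes the query sorts by timestamp descending, then id descending.
    """
    if cursor is None:
        return {}  # first page: no lower bound
    timestamp, doc_id = decode_cursor(cursor)
    filter_doc: dict[str, Any] = {
        "$or": [
            {"timestamp": {"$lt": timestamp}},
            {"timestamp": timestamp, "id": {"$lt": doc_id}},
        ]
    }
    return filter_doc
```

Unlike `skip`, this filter can use an index on the sort key, so the cost of fetching a page does not grow with how deep the page is.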
---
## Phase 4: Enhance
Goal: evolve from a polling dashboard into a full security operations tool.
- [ ] Migrate frontend to a maintainable framework (Vue 3, React, or HTMX + Alpine.js)
- [ ] Add rule-based alerting (e.g., alert on privileged operations, after-hours activity)
- [ ] Add SIEM export (Splunk, Sentinel, syslog webhook)
- [ ] Build an audit trail for AOC itself (who queried what, who triggered fetches)
- [ ] Add event tagging and commenting (e.g., `investigating`, `false_positive`)
- [ ] Add export functionality (CSV / JSON) from the UI
- [ ] Add source health dashboard showing last fetch time and status per source
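
As a sketch of what the rule-based alerting item might involve, the example below evaluates two simple rules against a normalized event. The event fields (`operation`, `timestamp`), the operation names, and the business-hours window are all assumptions for illustration, not AOC's actual schema or policy.

```python
from datetime import datetime, time, timezone

# Hypothetical list of operations considered privileged.
PRIVILEGED_OPERATIONS = {"Add member to role", "Reset user password"}


def is_after_hours(ts: datetime, start: time = time(8, 0), end: time = time(18, 0)) -> bool:
    """Return True on weekends or outside the start/end window, evaluated in UTC."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    ts = ts.astimezone(timezone.utc)
    if ts.weekday() >= 5:  # Saturday or Sunday
        return True
    return not (start <= ts.time() < end)


def evaluate_rules(event: dict) -> list:
    """Return the names of all rules the event triggers."""
    alerts = []
    if event.get("operation") in PRIVILEGED_OPERATIONS:
        alerts.append("privileged_operation")
    if is_after_hours(datetime.fromisoformat(event["timestamp"])):
        alerts.append("after_hours")
    return alerts
```

A production version would likely load rules from configuration and evaluate business hours in the tenant's local time zone rather than UTC.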
---
## Completed in this PR
All Phase 1 items were implemented in this PR.