6 Commits

Author SHA1 Message Date
f75f165911 feat: Redis caching + async queue for LLM scaling (v1.6.0)
Some checks failed
Release / build-and-push (push) Successful in 1m24s
CI / lint-and-test (push) Failing after 29s
- Add async Redis client singleton (redis_client.py) for caching and arq pool
- Add arq job functions (jobs.py) for background LLM processing
- Cache ask/explain LLM responses with TTL (1h ask, 24h explain)
- Add async mode to /api/ask: enqueue job, return job_id, poll /api/jobs/{id}
- Add GET /api/jobs/{job_id} endpoint for job status polling
- Add arq worker service to docker-compose (dev + prod)
- Switch from Redis to Valkey (BSD fork) in Docker Compose
- Add REDIS_URL config setting
- Add tests for cache hit, async mode, and job status
2026-04-22 09:55:05 +02:00
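The new async flow (enqueue via `POST /api/ask` with `async_mode`, then poll `GET /api/jobs/{id}`) can be sketched as a minimal client. Endpoint paths and field names come from the commit message; the base URL, status values, and polling cadence are assumptions, not repo code:

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local deployment


def is_terminal(status: str) -> bool:
    """Job states after which polling should stop (assumed status values)."""
    return status in ("completed", "failed")


def ask_async(question: str, poll_interval: float = 1.0, timeout: float = 120.0) -> dict:
    """POST /api/ask with async_mode=True, then poll GET /api/jobs/{job_id}."""
    body = json.dumps({"question": question, "async_mode": True}).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        job_id = json.load(resp)["job_id"]
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{BASE_URL}/api/jobs/{job_id}") as resp:
            job = json.load(resp)
        if is_terminal(job.get("status", "")):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Since ask responses are cached for an hour, a repeated identical question should return synchronously from cache rather than enqueuing a second job.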
47e0dfc2ca chore: bump version to 1.5.0
All checks were successful
CI / lint-and-test (push) Successful in 37s
Release / build-and-push (push) Successful in 1m51s
2026-04-22 08:30:20 +02:00
2fffe3aec2 feat: operation-level privacy gating instead of broad service-level
All checks were successful
CI / lint-and-test (push) Successful in 21s
- Replace broad service-level hiding with fine-grained operation-level gating
- PRIVACY_SENSITIVE_OPERATIONS config: hide specific operations across ALL services
- PRIVACY_SERVICES still works for broad service-level blocking (optional)
- Users without PRIVACY_SERVICE_ROLES:
  * Don't see sensitive operations in /api/filter-options
  * Can't query sensitive operations via /api/events or /api/ask
  * Get 403 on /api/events/{id}/explain for sensitive events
- Exchange/Teams services remain visible; only privacy ops are hidden
- Update .env.example with new operation-level config docs
2026-04-22 08:23:46 +02:00
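The combined service-level and operation-level gating described above can be sketched as a pure predicate. Names and signatures here are illustrative only, not the repo's actual helpers:

```python
def can_access(user_roles: set[str], allowed_roles: set[str]) -> bool:
    """User may see privacy-sensitive data if they hold any allowed role
    (or if no role gating is configured at all)."""
    if not allowed_roles:
        return True
    return bool(user_roles & allowed_roles)


def is_hidden(event: dict, user_roles: set[str],
              privacy_services: set[str],
              sensitive_operations: set[str],
              allowed_roles: set[str]) -> bool:
    """Hide an event when it matches a gated service OR a gated operation
    and the user lacks the required roles."""
    if can_access(user_roles, allowed_roles):
        return False
    return (event.get("service") in privacy_services
            or event.get("operation") in sensitive_operations)
```

The key design point: operation-level gating lets Exchange/Teams stay visible while only `MailItemsAccessed`-style operations disappear for unprivileged users.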
b2f4cabef4 feat: service-level role gating for privacy-sensitive services (Option A)
All checks were successful
CI / lint-and-test (push) Successful in 25s
- Add PRIVACY_SERVICES and PRIVACY_SERVICE_ROLES config variables
- Add user_can_access_privacy_services(claims) helper in auth.py
- /api/events filters out privacy services for users without required roles
- /api/filter-options excludes privacy services from dropdown options
- /api/ask excludes privacy services from NLQ queries
- /api/events/{id}/explain returns 403 for privacy events if unauthorized
- Teams added to default noisy service exclusion (frontend + backend)
- Update .env.example with privacy config documentation
- Add tests for event filtering, filter-options exclusion, and explain 403
2026-04-22 07:26:21 +02:00
e069869a94 feat: exclude Teams from defaults + GUID resolution in explain
All checks were successful
CI / lint-and-test (push) Successful in 26s
- Add Teams to noisy services excluded by default (frontend + backend ask)
- Exchange, SharePoint, and Teams now unchecked by default in filters
- Enhance explain endpoint with GUID resolution:
  * Extract UUIDs from raw event JSON recursively
  * Resolve directory objects via Graph API (user, group, SP, device)
  * Include resolved names in LLM prompt so explanations reference
    human-readable names instead of raw GUIDs
- Add asyncio import for to_thread wrapper around sync Graph calls
2026-04-22 07:12:10 +02:00
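The actual endpoint appends a "Resolved GUIDs" section to the LLM prompt; an alternative presentation would be to annotate GUIDs inline in the event text. A hedged sketch of that idea (illustrative, not repo code — note the regex is unanchored so it matches GUIDs embedded in larger strings):

```python
import re

# Same 8-4-4-4-12 hex shape the explain endpoint looks for.
GUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def annotate_guids(text: str, resolved: dict[str, str]) -> str:
    """Append the resolved display name after every GUID found in the text."""
    def repl(m: re.Match) -> str:
        gid = m.group(0)
        name = resolved.get(gid)
        return f"{gid} ({name})" if name else gid
    return GUID_RE.sub(repl, text)
```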
fb2386e190 feat: saved searches (bookmarks)
All checks were successful
CI / lint-and-test (push) Successful in 23s
- Add saved_searches_collection to database.py with index on created_by+created_at
- New routes/saved_searches.py: GET /api/saved-searches, POST, DELETE
- Saved searches are scoped per user (created_by = token sub)
- Mount router in main.py
- Frontend: Save filters button, saved search pills with load/delete
- loadSavedSearches called on initApp
- applySavedSearch restores filters and validates services against current options
- Add CSS for saved-searches row
- Add tests for CRUD, delete 404, and name validation
2026-04-22 07:04:07 +02:00
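The restore-and-validate behaviour of `applySavedSearch` (copy saved fields over the current state, then drop services that no longer exist in the filter options) can be mirrored in Python for clarity. Function name and field list are taken from the frontend code; the signature is an assumption:

```python
def apply_saved_filters(saved: dict, current: dict, known_services: list[str]) -> dict:
    """Restore saved filter fields onto the current filter state, keeping only
    services that still exist in the current filter options."""
    fields = ["actor", "selectedServices", "search", "operation", "result",
              "start", "end", "limit", "includeTags", "excludeTags"]
    merged = dict(current)
    for f in fields:
        if f in saved:
            merged[f] = saved[f]
    # A saved search may reference services that have since disappeared.
    merged["selectedServices"] = [
        s for s in merged.get("selectedServices", []) if s in known_services
    ]
    return merged
```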
22 changed files with 934 additions and 22 deletions

View File

@@ -49,3 +49,16 @@ LLM_MODEL=gpt-4o-mini
LLM_MAX_EVENTS=200
LLM_TIMEOUT_SECONDS=30
LLM_API_VERSION=
# Valkey (caching + async job queue for LLM calls)
# In Docker Compose, this is set automatically to redis://redis:6379/0
# For local dev, start Valkey with: docker run -d -p 6379:6379 valkey/valkey:8-alpine
REDIS_URL=redis://localhost:6379/0
# Optional: privacy / access control
# Hide entire services from users without PRIVACY_SERVICE_ROLES
# PRIVACY_SERVICES=Exchange,Teams
# Hide specific operations across all services from users without PRIVACY_SERVICE_ROLES
# PRIVACY_SENSITIVE_OPERATIONS=MailItemsAccessed,Search-Mailbox,Send,ChatMessageRead
# Comma-separated list of Entra roles that can access privacy-sensitive data
# PRIVACY_SERVICE_ROLES=SecurityAdministrator,ComplianceAdministrator

View File

@@ -65,9 +65,10 @@ Goal: add AI-powered analysis and external tool integration.
- [x] AI feature flag (`AI_FEATURES_ENABLED`) to gate LLM-dependent features
- [x] Natural language query endpoint (`/api/ask`) with intent extraction and smart sampling
- [x] MCP (Model Context Protocol) server for Claude Desktop / Cursor integration
- [x] Valkey caching for LLM responses and frequent queries
- [x] Async queue (arq) for LLM requests to prevent timeout/cost explosions at scale
- [ ] Advanced analytics dashboard (trending operations, anomaly detection)
-- [ ] Redis caching for LLM responses and frequent queries
-- [ ] Async queue for LLM requests to prevent timeout/cost explosions at scale

## Completed in this PR

-All Phase 5 items marked done were implemented in v1.3.0.
+All Phase 5 items marked done were implemented in v1.5.0.
Redis caching + async queue implemented in v1.6.0, switched to Valkey.

View File

@@ -1 +1 @@
-1.4.0
+1.6.0

View File

@@ -8,6 +8,8 @@ from config import (
    AUTH_CLIENT_ID,
    AUTH_ENABLED,
    AUTH_TENANT_ID,
    PRIVACY_SERVICE_ROLES,
    PRIVACY_SERVICES,
)
from fastapi import Header, HTTPException
from jwt import ExpiredSignatureError, InvalidTokenError, decode
@@ -82,6 +84,14 @@ def _decode_token(token: str, jwks):
        raise HTTPException(status_code=401, detail=f"Invalid token ({type(exc).__name__})") from None


def user_can_access_privacy_services(claims: dict) -> bool:
    """Check if the user has roles that grant access to privacy-sensitive services."""
    if not PRIVACY_SERVICES or not PRIVACY_SERVICE_ROLES:
        return True
    user_roles = set(claims.get("roles", []) or claims.get("role", []) or [])
    return bool(user_roles.intersection(PRIVACY_SERVICE_ROLES))


def require_auth(authorization: str | None = Header(None)):
    if not AUTH_ENABLED:
        return {"sub": "anonymous"}

View File

@@ -51,6 +51,15 @@ class Settings(BaseSettings):
    LLM_TIMEOUT_SECONDS: int = 30
    LLM_API_VERSION: str = ""  # e.g. 2025-01-01-preview for Azure OpenAI

    # Privacy / access control
    # Entire services can be hidden, or specific operations can be gated.
    PRIVACY_SERVICES: str = ""  # comma-separated, e.g. "Exchange,Teams"
    PRIVACY_SENSITIVE_OPERATIONS: str = ""  # comma-separated, e.g. "MailItemsAccessed,Search-Mailbox,Send"
    PRIVACY_SERVICE_ROLES: str = ""  # comma-separated, e.g. "SecurityAdministrator,ComplianceAdministrator"

    # Redis (caching + async job queue)
    REDIS_URL: str = "redis://localhost:6379/0"


_settings = Settings()
@@ -85,3 +94,9 @@ LLM_MODEL = _settings.LLM_MODEL
LLM_MAX_EVENTS = _settings.LLM_MAX_EVENTS
LLM_TIMEOUT_SECONDS = _settings.LLM_TIMEOUT_SECONDS
LLM_API_VERSION = _settings.LLM_API_VERSION
PRIVACY_SERVICES = {s.strip() for s in _settings.PRIVACY_SERVICES.split(",") if s.strip()}
PRIVACY_SENSITIVE_OPERATIONS = {o.strip() for o in _settings.PRIVACY_SENSITIVE_OPERATIONS.split(",") if o.strip()}
PRIVACY_SERVICE_ROLES = {r.strip() for r in _settings.PRIVACY_SERVICE_ROLES.split(",") if r.strip()}
REDIS_URL = _settings.REDIS_URL
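The three comma-separated settings above all parse through the same set-comprehension pattern: split on commas, strip whitespace, drop empties. A standalone sketch of that pattern (the helper name is illustrative; the config module inlines the comprehension):

```python
def parse_csv_set(value: str) -> set[str]:
    """Split a comma-separated env value into a set, trimming whitespace
    and dropping empty entries, mirroring the config module's pattern."""
    return {item.strip() for item in value.split(",") if item.strip()}
```

Because empties are dropped, an unset variable (`""`) yields an empty set, which the auth helper treats as "no gating configured".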

View File

@@ -7,6 +7,7 @@ from pymongo import ASCENDING, DESCENDING, TEXT, MongoClient
client = MongoClient(MONGO_URI or "mongodb://localhost:27017")
db = client[DB_NAME]
events_collection = db["events"]
saved_searches_collection = db["saved_searches"]

logger = structlog.get_logger("aoc.database")
@@ -20,6 +21,7 @@ def setup_indexes(max_retries: int = 5, delay: float = 2.0):
    events_collection.create_index([("timestamp", DESCENDING)])
    events_collection.create_index([("service", ASCENDING), ("timestamp", DESCENDING)])
    events_collection.create_index("id")
    saved_searches_collection.create_index([("created_by", ASCENDING), ("created_at", DESCENDING)])
    events_collection.create_index(
        [("actor_display", TEXT), ("raw_text", TEXT), ("operation", TEXT)],
        name="text_search_index",

View File

@@ -112,11 +112,23 @@
<div class="actions">
  <button type="submit">Apply filters</button>
  <button type="button" id="clearBtn" class="ghost" @click="clearFilters()">Clear</button>
  <button type="button" class="ghost" @click="saveCurrentFilters()">Save filters</button>
  <button type="button" class="ghost" @click="bulkTagMatching()">Bulk tag matching</button>
  <button type="button" class="ghost" @click="exportJSON()">Export JSON</button>
  <button type="button" class="ghost" @click="exportCSV()">Export CSV</button>
</div>
</div>
<div class="filter-row" x-show="savedSearches.length">
  <div class="saved-searches">
    <span>Saved:</span>
    <template x-for="ss in savedSearches" :key="ss.id">
      <span class="pill pill--tag" style="cursor:pointer;" @click="applySavedSearch(ss)">
        <span x-text="ss.name"></span>
        <button type="button" class="link" style="margin-left:4px;" @click.stop="deleteSavedSearch(ss.id)">×</button>
      </span>
    </template>
  </div>
</div>
</form>
</section>
@@ -255,6 +267,7 @@
actor: '', selectedServices: [], search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '',
},
options: { actors: [], services: [], operations: [], results: [] },
savedSearches: [],
appVersion: '',
aiFeaturesEnabled: true,
askQuestionText: '',
@@ -271,6 +284,7 @@
this.loadSavedFilters();
if (!this.authConfig?.auth_enabled || this.accessToken) {
  await this.loadFilterOptions();
  await this.loadSavedSearches();
  await this.loadSourceHealth();
  await this.loadEvents();
}
@@ -508,7 +522,7 @@
const saved = localStorage.getItem('aoc_filters');
if (!saved && this.options.services.length) {
  // Default: exclude noisy high-volume services
-  const noisy = ['Exchange', 'SharePoint'];
+  const noisy = ['Exchange', 'SharePoint', 'Teams'];
  this.filters.selectedServices = this.options.services.filter((s) => !noisy.includes(s));
} else if (saved) {
  try {
@@ -529,6 +543,59 @@
} catch {}
},
async loadSavedSearches() {
  try {
    const res = await fetch('/api/saved-searches', { headers: this.authHeader() });
    if (!res.ok) return;
    this.savedSearches = await res.json();
  } catch {}
},
async saveCurrentFilters() {
  const name = prompt('Name this saved filter:');
  if (!name || !name.trim()) return;
  try {
    const res = await fetch('/api/saved-searches', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', ...this.authHeader() },
      body: JSON.stringify({ name: name.trim(), filters: { ...this.filters } }),
    });
    if (!res.ok) throw new Error(await res.text());
    const created = await res.json();
    this.savedSearches.unshift(created);
    this.statusText = 'Filters saved.';
    setTimeout(() => { if (this.statusText === 'Filters saved.') this.statusText = ''; }, 2000);
  } catch (err) {
    this.statusText = err.message || 'Failed to save filters.';
  }
},
applySavedSearch(ss) {
  if (!ss || !ss.filters) return;
  const fields = ['actor', 'selectedServices', 'search', 'operation', 'result', 'start', 'end', 'limit', 'includeTags', 'excludeTags'];
  fields.forEach((f) => {
    if (ss.filters[f] !== undefined) this.filters[f] = ss.filters[f];
  });
  // Validate selectedServices against current options
  this.filters.selectedServices = this.filters.selectedServices.filter((s) => this.options.services.includes(s));
  this.resetPagination();
  this.loadEvents();
},
async deleteSavedSearch(id) {
  if (!confirm('Delete this saved search?')) return;
  try {
    const res = await fetch(`/api/saved-searches/${id}`, {
      method: 'DELETE',
      headers: this.authHeader(),
    });
    if (!res.ok) throw new Error(await res.text());
    this.savedSearches = this.savedSearches.filter((s) => s.id !== id);
  } catch (err) {
    this.statusText = err.message || 'Failed to delete saved search.';
  }
},
resetPagination() {
  this.cursorStack = [];
  this.nextCursor = null;
@@ -550,7 +617,7 @@
},
clearFilters() {
-  const noisy = ['Exchange', 'SharePoint'];
+  const noisy = ['Exchange', 'SharePoint', 'Teams'];
  this.filters = { actor: '', selectedServices: this.options.services.filter((s) => !noisy.includes(s)), search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '' };
  this.saveFilters();
  this.resetPagination();

View File

@@ -370,6 +370,14 @@ input {
  align-items: center;
}
.saved-searches {
  display: flex;
  flex-wrap: wrap;
  gap: 8px;
  align-items: center;
  font-size: 13px;
}
.modal__explanation {
  background: rgba(255, 255, 255, 0.03);
  border: 1px solid var(--border);

backend/jobs.py (new file, 113 lines)
View File

@@ -0,0 +1,113 @@
"""arq job functions for async LLM processing."""

import hashlib
import json

import structlog
from arq.connections import RedisSettings

from config import REDIS_URL

logger = structlog.get_logger("aoc.jobs")

# ---------------------------------------------------------------------------
# Cache helpers
# ---------------------------------------------------------------------------

CACHE_TTL_ASK = 3600  # 1 hour
CACHE_TTL_EXPLAIN = 86400  # 24 hours


def _ask_cache_key(question: str, filters: dict, events: list) -> str:
    payload = json.dumps({"q": question, "f": filters, "e": [e.get("id") for e in events]}, sort_keys=True)
    return f"aoc:cache:ask:{hashlib.md5(payload.encode()).hexdigest()}"


def _explain_cache_key(event_id: str) -> str:
    return f"aoc:cache:explain:{event_id}"


async def get_cached_ask(redis, question: str, filters: dict, events: list) -> dict | None:
    key = _ask_cache_key(question, filters, events)
    raw = await redis.get(key)
    if raw:
        return json.loads(raw)
    return None


async def set_cached_ask(redis, question: str, filters: dict, events: list, result: dict):
    key = _ask_cache_key(question, filters, events)
    await redis.setex(key, CACHE_TTL_ASK, json.dumps(result, default=str))


async def get_cached_explain(redis, event_id: str) -> dict | None:
    key = _explain_cache_key(event_id)
    raw = await redis.get(key)
    if raw:
        return json.loads(raw)
    return None


async def set_cached_explain(redis, event_id: str, result: dict):
    key = _explain_cache_key(event_id)
    await redis.setex(key, CACHE_TTL_EXPLAIN, json.dumps(result, default=str))


# ---------------------------------------------------------------------------
# arq job functions
# ---------------------------------------------------------------------------

async def process_ask_question(ctx, question: str, filters: dict, events: list, total: int, excluded_services: list | None):
    """Background job: call LLM for /api/ask and cache result."""
    from routes.ask import _call_llm

    redis = ctx["redis"]
    try:
        answer = await _call_llm(question, events, total=total, excluded_services=excluded_services)
        result = {"status": "completed", "answer": answer, "llm_used": True, "llm_error": None}
    except Exception as exc:
        logger.warning("Async ask LLM failed", error=str(exc))
        result = {"status": "failed", "answer": "", "llm_used": False, "llm_error": str(exc)}
    await set_cached_ask(redis, question, filters, events, result)
    return result


async def process_explain_event(ctx, event_id: str, event: dict, related: list):
    """Background job: call LLM for /api/events/{id}/explain and cache result."""
    from routes.ask import _explain_event

    redis = ctx["redis"]
    try:
        explanation = await _explain_event(event, related)
        result = {"status": "completed", "explanation": explanation, "llm_used": True, "llm_error": None}
    except Exception as exc:
        logger.warning("Async explain LLM failed", error=str(exc))
        result = {"status": "failed", "explanation": "", "llm_used": False, "llm_error": str(exc)}
    await set_cached_explain(redis, event_id, result)
    return result


# ---------------------------------------------------------------------------
# arq worker configuration
# ---------------------------------------------------------------------------

async def startup(ctx):
    from redis.asyncio import Redis

    ctx["redis"] = Redis.from_url(REDIS_URL, decode_responses=True)


async def shutdown(ctx):
    await ctx["redis"].close()


class WorkerSettings:
    functions = [process_ask_question, process_explain_event]
    redis_settings = RedisSettings.from_dsn(REDIS_URL)
    on_startup = startup
    on_shutdown = shutdown
    max_jobs = 10
    job_timeout = 120
    keep_result = 3600
    keep_result_forever = False

View File

@@ -19,7 +19,9 @@ from routes.events import router as events_router
from routes.fetch import router as fetch_router
from routes.fetch import run_fetch
from routes.health import router as health_router
from routes.jobs import router as jobs_router
from routes.rules import router as rules_router
from routes.saved_searches import router as saved_searches_router
from routes.webhooks import router as webhooks_router
@@ -119,7 +121,9 @@ if AI_FEATURES_ENABLED:
    from routes.mcp import mcp_asgi

    app.mount("/mcp", mcp_asgi)

app.include_router(saved_searches_router, prefix="/api")
app.include_router(rules_router, prefix="/api")
app.include_router(jobs_router, prefix="/api")


@app.get("/health")
@@ -174,3 +178,6 @@ async def stop_periodic_fetch():
    task.cancel()
    with suppress(Exception):
        await task

    from redis_client import close_redis_connections

    await close_redis_connections()

View File

@@ -82,6 +82,7 @@ class AskRequest(BaseModel):
    end: str | None = None
    include_tags: list[str] | None = None
    exclude_tags: list[str] | None = None
    async_mode: bool = False  # enqueue async job instead of waiting


class AskEventRef(BaseModel):
@@ -101,3 +102,4 @@ class AskResponse(BaseModel):
    query_info: dict
    llm_used: bool
    llm_error: str | None = None
    job_id: str | None = None

backend/redis_client.py (new file, 36 lines)
View File

@@ -0,0 +1,36 @@
"""Async Redis client singleton for caching and job queue."""

import redis.asyncio as aioredis
from arq import create_pool
from arq.connections import ArqRedis, RedisSettings

from config import REDIS_URL

_arq_pool: ArqRedis | None = None
_plain_redis: aioredis.Redis | None = None


async def get_arq_pool() -> ArqRedis:
    """Return a shared arq pool (ArqRedis extends redis.asyncio.Redis)."""
    global _arq_pool
    if _arq_pool is None:
        _arq_pool = await create_pool(RedisSettings.from_dsn(REDIS_URL))
    return _arq_pool


async def get_redis() -> aioredis.Redis:
    """Return a shared plain async Redis client."""
    global _plain_redis
    if _plain_redis is None:
        _plain_redis = aioredis.from_url(REDIS_URL, decode_responses=True)
    return _plain_redis


async def close_redis_connections():
    """Close all Redis connections (call on shutdown)."""
    global _arq_pool, _plain_redis
    if _arq_pool:
        await _arq_pool.close()
        _arq_pool = None
    if _plain_redis:
        await _plain_redis.close()
        _plain_redis = None

View File

@@ -14,3 +14,5 @@ prometheus-client
httpx
gunicorn
mcp
redis
arq

View File

@@ -1,14 +1,26 @@
import asyncio
import json
import re
from datetime import UTC, datetime, timedelta

import httpx
import structlog
-from auth import require_auth
+from auth import require_auth, user_can_access_privacy_services
-from config import LLM_API_KEY, LLM_API_VERSION, LLM_BASE_URL, LLM_MAX_EVENTS, LLM_MODEL, LLM_TIMEOUT_SECONDS
+from config import (
+    LLM_API_KEY,
+    LLM_API_VERSION,
+    LLM_BASE_URL,
+    LLM_MAX_EVENTS,
+    LLM_MODEL,
+    LLM_TIMEOUT_SECONDS,
+    PRIVACY_SENSITIVE_OPERATIONS,
+    PRIVACY_SERVICES,
+)
from database import events_collection
from fastapi import APIRouter, Depends, HTTPException
from jobs import get_cached_ask, get_cached_explain, set_cached_ask, set_cached_explain
from models.api import AskRequest, AskResponse
from redis_client import get_arq_pool

router = APIRouter(dependencies=[Depends(require_auth)])
logger = structlog.get_logger("aoc.ask")
@@ -49,7 +61,7 @@ _SERVICE_INTENTS = {
# Services that are extremely noisy for typical admin questions.
# We exclude them by default on broad questions unless the user explicitly mentions them.
-_NOISY_SERVICES = {"Exchange", "SharePoint"}
+_NOISY_SERVICES = {"Exchange", "SharePoint", "Teams"}

# Services that are generally admin-relevant and kept by default.
_DEFAULT_ADMIN_SERVICES = {
@@ -471,11 +483,73 @@ Do not invent facts that are not in the data.
"""
_GUID_RE = re.compile(r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$")


def _extract_guids(obj: dict | list | str) -> set[str]:
    """Recursively extract UUID-like strings from a JSON structure."""
    guids = set()
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k.lower() in ("id", "groupid", "userid", "targetid") and isinstance(v, str) and _GUID_RE.match(v):
                guids.add(v)
            guids.update(_extract_guids(v))
    elif isinstance(obj, list):
        for item in obj:
            guids.update(_extract_guids(item))
    elif isinstance(obj, str) and _GUID_RE.match(obj):
        guids.add(obj)
    return guids


async def _resolve_guids_for_event(event: dict) -> dict[str, str]:
    """Try to resolve GUIDs in an event to human-readable names via Graph API."""
    raw = event.get("raw") or {}
    guids = _extract_guids(raw)
    # Also include any GUIDs in targetResources that might not have displayName
    for tr in raw.get("targetResources") or []:
        tid = tr.get("id")
        if tid and _GUID_RE.match(tid):
            guids.add(tid)
    for tr in raw.get("modifiedProperties") or []:
        for key in ("oldValue", "newValue"):
            val = tr.get(key)
            if val and _GUID_RE.match(val):
                guids.add(val)
    if not guids:
        return {}
    try:
        from graph.auth import get_access_token
        from graph.resolve import resolve_directory_object

        token = await asyncio.to_thread(get_access_token)
        cache: dict[str, dict] = {}
        resolved = {}
        for gid in guids:
            result = await asyncio.to_thread(resolve_directory_object, gid, token, cache)
            if result:
                resolved[gid] = result["name"]
        return resolved
    except Exception as exc:
        logger.warning("GUID resolution failed", error=str(exc))
        return {}
async def _explain_event(event: dict, related: list[dict]) -> str:
    if not LLM_API_KEY:
        raise RuntimeError("LLM_API_KEY not configured")

    # Resolve GUIDs to names before sending to LLM
    resolved = await _resolve_guids_for_event(event)

    event_text = json.dumps(event, indent=2, default=str)
    resolution_text = ""
    if resolved:
        resolution_text = "\nResolved GUIDs:\n"
        for gid, name in resolved.items():
            resolution_text += f"  {gid} -> {name}\n"
    related_text = ""
    if related:
@@ -492,7 +566,7 @@ async def _explain_event(event: dict, related: list[dict]) -> str:
        {"role": "system", "content": _EXPLAIN_SYSTEM_PROMPT},
        {
            "role": "user",
-            "content": f"Audit event:\n{event_text}{related_text}\n\nPlease explain this event.",
+            "content": f"Audit event:\n{event_text}{resolution_text}{related_text}\n\nPlease explain this event.",
        },
    ]
@@ -525,6 +599,11 @@ async def explain_event(event_id: str, user: dict = Depends(require_auth)):
    if not event:
        raise HTTPException(status_code=404, detail="Event not found")

    if (
        event.get("service") in PRIVACY_SERVICES or event.get("operation") in PRIVACY_SENSITIVE_OPERATIONS
    ) and not user_can_access_privacy_services(user):
        raise HTTPException(status_code=403, detail="Access to this event is restricted")

    event.pop("_id", None)

    # Fetch related events for context (same actor or target in last 24h)
@@ -563,14 +642,23 @@ async def explain_event(event_id: str, user: dict = Depends(require_auth)):
            "llm_error": "LLM_API_KEY not configured",
        }
    # Check cache first
    redis = await get_arq_pool()
    cached = await get_cached_explain(redis, event_id)
    if cached:
        cached["related_count"] = len(related)
        return cached

    try:
        explanation = await _explain_event(event, related)
-        return {
+        result = {
            "explanation": explanation,
            "llm_used": True,
            "llm_error": None,
            "related_count": len(related),
        }
+        await set_cached_explain(redis, event_id, result)
+        return result
    except Exception as exc:
        logger.warning("Event explanation failed", error=str(exc))
        return {
@@ -615,6 +703,8 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
    # -----------------------------------------------------------------------
    # Build and run query
    # -----------------------------------------------------------------------
    privacy_excluded_services = [] if user_can_access_privacy_services(user) else list(PRIVACY_SERVICES)
    privacy_excluded_ops = [] if user_can_access_privacy_services(user) else list(PRIVACY_SENSITIVE_OPERATIONS)

    query = _build_event_query(
        entity,
        start,
@@ -626,6 +716,13 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
        include_tags=body.include_tags,
        exclude_tags=body.exclude_tags,
    )
    extra_filters = []
    if privacy_excluded_services:
        extra_filters.append({"service": {"$nin": privacy_excluded_services}})
    if privacy_excluded_ops:
        extra_filters.append({"operation": {"$nin": privacy_excluded_ops}})
    if extra_filters:
        query["$and"] = query.get("$and", []) + extra_filters
    try:
        total = events_collection.count_documents(query)
@@ -660,19 +757,70 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
            llm_error="LLM not used — no events found." if not LLM_API_KEY else None,
        )
-    # Try LLM summarisation
+    # Try LLM summarisation (with caching + optional async)
    answer = ""
    llm_used = False
    llm_error = None
    job_id = None

    filters_snapshot = {
        "services": body.services,
        "actor": body.actor,
        "operation": body.operation,
        "result": body.result,
        "start": body.start,
        "end": body.end,
        "include_tags": body.include_tags,
        "exclude_tags": body.exclude_tags,
    }

-    if not LLM_API_KEY:
-        llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
+    if LLM_API_KEY:
        redis = await get_arq_pool()
        cached = await get_cached_ask(redis, question, filters_snapshot, events)
        if cached:
            answer = cached.get("answer", "")
            llm_used = cached.get("llm_used", False)
            llm_error = cached.get("llm_error")
        elif body.async_mode:
            pool = await get_arq_pool()
            job = await pool.enqueue_job(
                "process_ask_question",
                question,
                filters_snapshot,
                events,
                total,
                excluded_services,
            )
            job_id = job.job_id if job else None
            return AskResponse(
                answer="Your question is being processed. Poll /api/jobs/{job_id} for the result.",
                events=[_to_event_ref(e) for e in events],
                query_info={
                    "entity": entity,
                    "start": start,
                    "end": end,
                    "event_count": len(events),
                    "total_matched": total,
                    "services_queried": query_services,
                    "excluded_services": excluded_services,
                    "mongo_query": json.dumps(query, default=str),
                },
                llm_used=False,
                llm_error=None,
                job_id=job_id,
            )
        else:
            try:
                answer = await _call_llm(question, events, total=total, excluded_services=excluded_services)
                llm_used = True
                await set_cached_ask(redis, question, filters_snapshot, events, {
                    "answer": answer, "llm_used": True, "llm_error": None,
                })
            except Exception as exc:
                llm_error = f"LLM call failed: {exc}"
                logger.warning("LLM call failed, falling back to structured summary", error=str(exc))
    else:
        llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
# Fallback: structured summary if LLM unavailable or failed # Fallback: structured summary if LLM unavailable or failed
if not answer: if not answer:
@@ -711,4 +859,5 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
}, },
llm_used=llm_used, llm_used=llm_used,
llm_error=llm_error, llm_error=llm_error,
job_id=job_id,
) )
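The `get_cached_ask`/`set_cached_ask` helpers this diff calls live in `jobs.py`, which is not shown here. A plausible sketch of the key scheme they imply — all names and the `ask:cache:` prefix are assumptions, not taken from the repository:

```python
import hashlib
import json


def ask_cache_key(question: str, filters: dict, events: list[dict]) -> str:
    """Derive a deterministic cache key from question, filters, and event ids.

    Sorting keys and hashing only the event ids (not full payloads) keeps the
    key stable across runs while still invalidating when the matched events
    change.
    """
    payload = json.dumps(
        {
            "q": question.strip().lower(),
            "f": filters,
            "e": sorted(e.get("id", "") for e in events),
        },
        sort_keys=True,
        default=str,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"ask:cache:{digest}"
```

With a key like this, `set_cached_ask` reduces to `redis.setex(key, ttl, json.dumps(result))` and a cache hit is a single `redis.get(key)`, which matches the `FakeRedis.get`/`setex` interface the tests below stub out.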


@@ -3,8 +3,9 @@ import re
from datetime import UTC, datetime
from audit_trail import log_action
from auth import require_auth, user_can_access_privacy_services
from bson import ObjectId
from config import PRIVACY_SENSITIVE_OPERATIONS, PRIVACY_SERVICES
from database import events_collection
from fastapi import APIRouter, Depends, HTTPException, Query
from models.api import (
@@ -44,6 +45,7 @@ def _build_query(
cursor: str | None = None,
include_tags: list[str] | None = None,
exclude_tags: list[str] | None = None,
exclude_operations: list[str] | None = None,
) -> dict:
filters = []
@@ -51,6 +53,8 @@ def _build_query(
filters.append({"service": service})
if services:
filters.append({"service": {"$in": services}})
if exclude_operations:
filters.append({"operation": {"$nin": exclude_operations}})
if actor:
actor_safe = re.escape(actor)
filters.append(
@@ -125,6 +129,8 @@ def list_events(
exclude_tags: list[str] | None = Query(default=None),
user: dict = Depends(require_auth),
):
privacy_excluded_services = [] if user_can_access_privacy_services(user) else list(PRIVACY_SERVICES)
privacy_excluded_ops = [] if user_can_access_privacy_services(user) else list(PRIVACY_SENSITIVE_OPERATIONS)
query = _build_query(
service=service,
services=services,
@@ -137,7 +143,13 @@ def list_events(
cursor=cursor,
include_tags=include_tags,
exclude_tags=exclude_tags,
exclude_operations=privacy_excluded_ops,
)
if privacy_excluded_services:
query = query if query else {}
if "$and" not in query:
query = {"$and": [query]} if query else {"$and": []}
query["$and"].append({"service": {"$nin": privacy_excluded_services}})
safe_page_size = max(1, min(page_size, 500))
@@ -202,6 +214,8 @@ def bulk_tags(
exclude_tags: list[str] | None = Query(default=None),
user: dict = Depends(require_auth),
):
privacy_excluded_services = [] if user_can_access_privacy_services(user) else list(PRIVACY_SERVICES)
privacy_excluded_ops = [] if user_can_access_privacy_services(user) else list(PRIVACY_SENSITIVE_OPERATIONS)
query = _build_query(
service=service,
services=services,
@@ -213,7 +227,13 @@ def bulk_tags(
search=search,
include_tags=include_tags,
exclude_tags=exclude_tags,
exclude_operations=privacy_excluded_ops,
)
if privacy_excluded_services:
query = query if query else {}
if "$and" not in query:
query = {"$and": [query]} if query else {"$and": []}
query["$and"].append({"service": {"$nin": privacy_excluded_services}})
tags = [t.strip() for t in body.tags if t.strip()]
if not tags:
raise HTTPException(status_code=400, detail="No tags provided")
@@ -235,7 +255,10 @@ def bulk_tags(
@router.get("/filter-options", response_model=FilterOptionsResponse)
def filter_options(
limit: int = Query(default=200, ge=1, le=1000),
user: dict = Depends(require_auth),
):
safe_limit = max(1, min(limit, 1000))
try:
services = sorted(events_collection.distinct("service"))[:safe_limit]
@@ -247,6 +270,10 @@ def filter_options(limit: int = Query(default=200, ge=1, le=1000)):
except Exception as exc:
raise HTTPException(status_code=500, detail=f"Failed to load filter options: {exc}") from exc
if not user_can_access_privacy_services(user):
services = [s for s in services if s not in PRIVACY_SERVICES]
operations = [o for o in operations if o not in PRIVACY_SENSITIVE_OPERATIONS]
return {
"services": services,
"operations": operations,
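The `$and`/`$nin` merge above is duplicated verbatim in `list_events` and `bulk_tags`. A minimal standalone sketch of that logic — the helper name is hypothetical, not part of the diff — makes the resulting Mongo query shape easy to check:

```python
def merge_service_exclusion(query: dict, excluded_services: list[str]) -> dict:
    """Mirror of the inline gating logic: wrap any existing query in $and and
    append a $nin clause so privacy-gated services never match."""
    if not excluded_services:
        return query
    if "$and" not in query:
        # An empty query becomes a bare {"$and": []}; a non-empty one is nested.
        query = {"$and": [query]} if query else {"$and": []}
    query["$and"].append({"service": {"$nin": excluded_services}})
    return query
```

Extracting a helper like this would also let both endpoints share one tested code path instead of two copies.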

backend/routes/jobs.py Normal file

@@ -0,0 +1,43 @@
"""Job status endpoints for async LLM operations."""
from arq.jobs import Job, JobStatus
from auth import require_auth
from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
from redis_client import get_redis
router = APIRouter(dependencies=[Depends(require_auth)])
class JobStatusResponse(BaseModel):
job_id: str
status: str # queued, in_progress, complete, not_found, deferred
result: dict | None = None
error: str | None = None
@router.get("/jobs/{job_id}", response_model=JobStatusResponse)
async def get_job_status(job_id: str, user: dict = Depends(require_auth)):
"""Poll for the result of an async LLM job."""
redis = await get_redis()
job = Job(job_id, redis)
status = await job.status()
if status == JobStatus.not_found:
raise HTTPException(status_code=404, detail="Job not found")
result = None
error = None
if status == JobStatus.complete:
try:
result_data = await job.result(timeout=0)
result = result_data if isinstance(result_data, dict) else {"data": str(result_data)}
except Exception as exc:
error = str(exc)
return JobStatusResponse(
job_id=job_id,
status=status.value,
result=result,
error=error,
)
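Callers of this endpoint are expected to poll until `status` reaches `complete`. A minimal client-side polling sketch, under the assumption that `fetch_status` is any callable returning the parsed `JobStatusResponse` body (e.g. wrapping an HTTP GET to `/api/jobs/{job_id}`):

```python
import time
from typing import Callable


def poll_job(fetch_status: Callable[[], dict], *, interval: float = 1.0,
             timeout: float = 30.0) -> dict:
    """Poll a job-status callable until the job completes or the timeout expires.

    fetch_status stands in for something like:
        lambda: httpx.get(f"{base}/api/jobs/{job_id}").json()
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = fetch_status()
        if body["status"] == "complete":
            return body
        time.sleep(interval)  # back off between polls
    raise TimeoutError("job did not complete in time")
```

A real client would also want to treat a 404 (job expired or unknown) as terminal rather than retrying it.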


@@ -0,0 +1,60 @@
"""CRUD for saved filter searches (bookmarks)."""
import uuid
from datetime import UTC, datetime
import structlog
from auth import require_auth
from database import saved_searches_collection
from fastapi import APIRouter, Depends, HTTPException
router = APIRouter(dependencies=[Depends(require_auth)])
logger = structlog.get_logger("aoc.saved_searches")
def _user_sub(user: dict) -> str:
return user.get("sub", "anonymous")
@router.get("/saved-searches")
async def list_saved_searches(user: dict = Depends(require_auth)):
"""Return saved searches for the current user."""
sub = _user_sub(user)
cursor = saved_searches_collection.find({"created_by": sub}).sort("created_at", -1)
items = []
for doc in cursor:
doc["id"] = doc.pop("_id")
items.append(doc)
return items
@router.post("/saved-searches")
async def create_saved_search(body: dict, user: dict = Depends(require_auth)):
"""Save the current filter set."""
name = (body.get("name") or "").strip()
if not name:
raise HTTPException(status_code=400, detail="Name is required")
filters = body.get("filters") or {}
doc = {
"_id": str(uuid.uuid4()),
"name": name,
"filters": filters,
"created_at": datetime.now(UTC).isoformat().replace("+00:00", "Z"),
"created_by": _user_sub(user),
}
saved_searches_collection.insert_one(doc)
logger.info("Saved search created", name=name, user=doc["created_by"])
doc["id"] = doc.pop("_id")
return doc
@router.delete("/saved-searches/{search_id}")
async def delete_saved_search(search_id: str, user: dict = Depends(require_auth)):
"""Delete a saved search (only if owned by current user)."""
sub = _user_sub(user)
result = saved_searches_collection.delete_one({"_id": search_id, "created_by": sub})
if result.deleted_count == 0:
raise HTTPException(status_code=404, detail="Saved search not found")
logger.info("Saved search deleted", search_id=search_id, user=sub)
return {"status": "deleted"}


@@ -22,15 +22,23 @@ def mock_watermarks_collection():
@pytest.fixture(scope="function")
def client(mock_events_collection, mock_watermarks_collection, monkeypatch):
monkeypatch.setattr("database.events_collection", mock_events_collection)
monkeypatch.setattr("database.saved_searches_collection", mock_events_collection)
monkeypatch.setattr("routes.fetch.events_collection", mock_events_collection)
monkeypatch.setattr("routes.events.events_collection", mock_events_collection)
monkeypatch.setattr("routes.ask.events_collection", mock_events_collection)
monkeypatch.setattr("routes.saved_searches.saved_searches_collection", mock_events_collection)
monkeypatch.setattr("watermark.watermarks_collection", mock_watermarks_collection)
monkeypatch.setattr("routes.health.watermarks_collection", mock_watermarks_collection)
monkeypatch.setattr("routes.fetch.get_watermark", lambda source: None)
monkeypatch.setattr("routes.fetch.set_watermark", lambda source, ts: None)
monkeypatch.setattr("auth.AUTH_ENABLED", False)
monkeypatch.setattr("routes.mcp.AUTH_ENABLED", False)
monkeypatch.setattr("config.PRIVACY_SERVICES", set())
monkeypatch.setattr("config.PRIVACY_SENSITIVE_OPERATIONS", set())
monkeypatch.setattr("routes.events.PRIVACY_SERVICES", set())
monkeypatch.setattr("routes.events.PRIVACY_SENSITIVE_OPERATIONS", set())
monkeypatch.setattr("routes.ask.PRIVACY_SERVICES", set())
monkeypatch.setattr("routes.ask.PRIVACY_SENSITIVE_OPERATIONS", set())
monkeypatch.setattr("database.db.command", lambda cmd: {"ok": 1} if cmd == "ping" else {})
# Mock audit trail and rules collections so tests don't wait on real MongoDB
@@ -41,6 +49,20 @@ def client(mock_events_collection, mock_watermarks_collection, monkeypatch):
monkeypatch.setattr("rules.rules_collection", audit_db["alert_rules"])
monkeypatch.setattr("routes.rules.rules_collection", audit_db["alert_rules"])
# Mock Redis so tests don't require a running Redis server
class FakeRedis:
async def get(self, key):
return None
async def setex(self, key, ttl, value):
pass
async def fake_get_arq_pool():
return FakeRedis()
monkeypatch.setattr("redis_client.get_arq_pool", fake_get_arq_pool)
monkeypatch.setattr("routes.ask.get_arq_pool", fake_get_arq_pool)
monkeypatch.setattr("routes.jobs.get_redis", fake_get_arq_pool)
from main import app
return TestClient(app)


@@ -89,6 +89,17 @@ def test_explain_event_with_llm_mock(client, mock_events_collection, monkeypatch
monkeypatch.setattr("routes.ask._explain_event", fake_explain)
class FakeRedis:
async def get(self, key):
return None
async def setex(self, key, ttl, value):
pass
async def fake_get_arq_pool():
return FakeRedis()
monkeypatch.setattr("routes.ask.get_arq_pool", fake_get_arq_pool)
mock_events_collection.insert_one(
{
"id": "evt-explain2",
@@ -107,6 +118,146 @@ def test_explain_event_with_llm_mock(client, mock_events_collection, monkeypatch
assert data["llm_used"] is True
def test_saved_searches_crud(client, monkeypatch):
monkeypatch.setattr("auth.AUTH_ENABLED", False)
# Create
response = client.post(
"/api/saved-searches", json={"name": "Test search", "filters": {"actor": "alice", "result": "success"}}
)
assert response.status_code == 200
created = response.json()
assert created["name"] == "Test search"
assert created["filters"]["actor"] == "alice"
search_id = created["id"]
# List
response2 = client.get("/api/saved-searches")
assert response2.status_code == 200
items = response2.json()
assert len(items) == 1
assert items[0]["name"] == "Test search"
# Delete
response3 = client.delete(f"/api/saved-searches/{search_id}")
assert response3.status_code == 200
# List empty
response4 = client.get("/api/saved-searches")
assert response4.status_code == 200
assert len(response4.json()) == 0
def test_saved_searches_delete_not_found(client, monkeypatch):
monkeypatch.setattr("auth.AUTH_ENABLED", False)
response = client.delete("/api/saved-searches/nonexistent")
assert response.status_code == 404
def test_saved_searches_create_validation(client, monkeypatch):
monkeypatch.setattr("auth.AUTH_ENABLED", False)
response = client.post("/api/saved-searches", json={"name": " ", "filters": {}})
assert response.status_code == 400
def test_privacy_filtering_events_by_operation(client, mock_events_collection, monkeypatch):
monkeypatch.setattr("config.PRIVACY_SENSITIVE_OPERATIONS", {"MailItemsAccessed", "Send"})
monkeypatch.setattr("routes.events.PRIVACY_SENSITIVE_OPERATIONS", {"MailItemsAccessed", "Send"})
monkeypatch.setattr("auth.PRIVACY_SERVICE_ROLES", {"SecurityAdmin"})
monkeypatch.setattr("auth.user_can_access_privacy_services", lambda claims: False)
monkeypatch.setattr("routes.events.user_can_access_privacy_services", lambda claims: False)
mock_events_collection.insert_one(
{
"id": "evt-safe",
"timestamp": datetime.now(UTC).isoformat(),
"service": "Exchange",
"operation": "Add-MailboxPermission",
"result": "success",
"actor_display": "Alice",
"raw_text": "",
}
)
mock_events_collection.insert_one(
{
"id": "evt-priv",
"timestamp": datetime.now(UTC).isoformat(),
"service": "Exchange",
"operation": "Send",
"result": "success",
"actor_display": "Bob",
"raw_text": "",
}
)
response = client.get("/api/events")
assert response.status_code == 200
data = response.json()
ids = [e["id"] for e in data["items"]]
assert "evt-safe" in ids
assert "evt-priv" not in ids
def test_privacy_filter_options_shows_service_hides_ops(client, mock_events_collection, monkeypatch):
monkeypatch.setattr("config.PRIVACY_SENSITIVE_OPERATIONS", {"MailItemsAccessed"})
monkeypatch.setattr("routes.events.PRIVACY_SENSITIVE_OPERATIONS", {"MailItemsAccessed"})
monkeypatch.setattr("auth.PRIVACY_SERVICE_ROLES", {"SecurityAdmin"})
monkeypatch.setattr("auth.user_can_access_privacy_services", lambda claims: False)
monkeypatch.setattr("routes.events.user_can_access_privacy_services", lambda claims: False)
mock_events_collection.insert_one(
{
"id": "evt-1",
"timestamp": datetime.now(UTC).isoformat(),
"service": "Exchange",
"operation": "MailItemsAccessed",
"result": "success",
"actor_display": "Alice",
"raw_text": "",
}
)
mock_events_collection.insert_one(
{
"id": "evt-2",
"timestamp": datetime.now(UTC).isoformat(),
"service": "Exchange",
"operation": "Add-MailboxPermission",
"result": "success",
"actor_display": "Bob",
"raw_text": "",
}
)
response = client.get("/api/filter-options")
assert response.status_code == 200
data = response.json()
assert "Exchange" in data["services"]
assert "MailItemsAccessed" not in data["operations"]
assert "Add-MailboxPermission" in data["operations"]
def test_privacy_explain_forbidden_by_operation(client, mock_events_collection, monkeypatch):
monkeypatch.setattr("config.PRIVACY_SENSITIVE_OPERATIONS", {"Send"})
monkeypatch.setattr("routes.ask.PRIVACY_SENSITIVE_OPERATIONS", {"Send"})
monkeypatch.setattr("auth.PRIVACY_SERVICE_ROLES", {"SecurityAdmin"})
monkeypatch.setattr("auth.user_can_access_privacy_services", lambda claims: False)
monkeypatch.setattr("routes.ask.user_can_access_privacy_services", lambda claims: False)
mock_events_collection.insert_one(
{
"id": "evt-send",
"timestamp": datetime.now(UTC).isoformat(),
"service": "Exchange",
"operation": "Send",
"result": "success",
"actor_display": "Bob",
"raw_text": "",
}
)
response = client.post("/api/events/evt-send/explain")
assert response.status_code == 403
def test_health(client):
response = client.get("/health")
assert response.status_code == 200


@@ -350,3 +350,124 @@ class TestAskEndpoint:
data = response.json()
assert data["query_info"]["event_count"] == 1
assert data["events"][0]["id"] == "evt-bob"
class TestAskCaching:
def test_ask_cache_hit_returns_cached_answer(self, client, mock_events_collection, monkeypatch):
"""If the answer is cached, the LLM should not be called."""
now = datetime.now(UTC)
mock_events_collection.insert_one(
{
"id": "evt-cache",
"timestamp": now.isoformat(),
"service": "Directory",
"operation": "Add user",
"result": "success",
"actor_display": "Alice",
"target_displays": ["USER-001"],
"display_summary": "summary",
"raw_text": "raw",
}
)
llm_called = False
async def fake_llm(question, events, total=None, excluded_services=None):
nonlocal llm_called
llm_called = True
return "This should NOT appear."
monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
monkeypatch.setattr("routes.ask._call_llm", fake_llm)
# Pre-populate cache with a specific answer
class CachingFakeRedis:
def __init__(self):
self.store = {}
async def get(self, key):
return self.store.get(key)
async def setex(self, key, ttl, value):
self.store[key] = value
redis = CachingFakeRedis()
# Seed cache with the exact filters the endpoint will generate
import asyncio
from jobs import set_cached_ask
filters_snapshot = {
"services": None,
"actor": None,
"operation": None,
"result": None,
"start": None,
"end": None,
"include_tags": None,
"exclude_tags": None,
}
asyncio.run(set_cached_ask(redis, "What happened to USER-001?", filters_snapshot, [{"id": "evt-cache"}], {"answer": "Cached answer!", "llm_used": True, "llm_error": None}))
async def fake_get_arq_pool():
return redis
monkeypatch.setattr("routes.ask.get_arq_pool", fake_get_arq_pool)
response = client.post("/api/ask", json={"question": "What happened to USER-001?"})
assert response.status_code == 200
data = response.json()
assert data["answer"] == "Cached answer!"
assert data["llm_used"] is True
assert llm_called is False
def test_ask_async_mode_returns_job_id(self, client, mock_events_collection, monkeypatch):
"""Async mode should return immediately with a job_id."""
now = datetime.now(UTC)
mock_events_collection.insert_one(
{
"id": "evt-async",
"timestamp": now.isoformat(),
"service": "Directory",
"operation": "Add user",
"result": "success",
"actor_display": "Alice",
"target_displays": ["USER-001"],
"display_summary": "summary",
"raw_text": "raw",
}
)
monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
# Mock arq pool to capture enqueue_job call
class FakeArqPool:
def __init__(self):
self.enqueued = []
async def get(self, key):
return None
async def setex(self, key, ttl, value):
pass
async def enqueue_job(self, func, *args, **kwargs):
from unittest.mock import MagicMock
job = MagicMock()
job.job_id = "job-12345"
self.enqueued.append((func, args, kwargs))
return job
pool = FakeArqPool()
async def fake_get_arq_pool():
return pool
monkeypatch.setattr("routes.ask.get_arq_pool", fake_get_arq_pool)
response = client.post("/api/ask", json={"question": "What happened to USER-001?", "async_mode": True})
assert response.status_code == 200
data = response.json()
assert data["job_id"] == "job-12345"
assert data["llm_used"] is False
assert "being processed" in data["answer"]
assert len(pool.enqueued) == 1
assert pool.enqueued[0][0] == "process_ask_question"


@@ -1,4 +1,19 @@
services:
redis:
image: valkey/valkey:8-alpine
container_name: aoc-redis
restart: always
volumes:
- redis_data:/data
networks:
- aoc-internal
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 3s
retries: 5
start_period: 5s
mongo:
image: mongo:7
container_name: aoc-mongo
@@ -27,9 +42,12 @@ services:
- .env
environment:
MONGO_URI: mongodb://${MONGO_ROOT_USERNAME}:${MONGO_ROOT_PASSWORD}@mongo:27017/
REDIS_URL: redis://redis:6379/0
depends_on:
mongo:
condition: service_healthy
redis:
condition: service_healthy
networks:
- aoc-internal
healthcheck:
@@ -39,6 +57,24 @@ services:
retries: 3
start_period: 10s
worker:
image: git.cqre.net/cqrenet/aoc-backend:${AOC_VERSION:-latest}
container_name: aoc-worker
restart: always
env_file:
- .env
environment:
MONGO_URI: mongodb://${MONGO_ROOT_USERNAME}:${MONGO_ROOT_PASSWORD}@mongo:27017/
REDIS_URL: redis://redis:6379/0
command: ["arq", "jobs.WorkerSettings"]
depends_on:
redis:
condition: service_healthy
mongo:
condition: service_healthy
networks:
- aoc-internal
nginx:
image: nginx:alpine
container_name: aoc-nginx
@@ -58,6 +94,7 @@ services:
volumes:
mongo_data:
redis_data:
networks:
aoc-internal:


@@ -1,4 +1,13 @@
services:
redis:
image: valkey/valkey:8-alpine
container_name: aoc-redis
restart: always
ports:
- "6379:6379"
volumes:
- redis_data:/data
mongo:
image: mongo:7
container_name: aoc-mongo
@@ -21,10 +30,27 @@ services:
- .env
environment:
MONGO_URI: mongodb://${MONGO_ROOT_USERNAME}:${MONGO_ROOT_PASSWORD}@mongo:${MONGO_PORT}/
REDIS_URL: redis://redis:6379/0
depends_on:
- mongo
- redis
ports:
- "8000:8000"
worker:
build: ./backend
container_name: aoc-worker
restart: always
env_file:
- .env
environment:
MONGO_URI: mongodb://${MONGO_ROOT_USERNAME}:${MONGO_ROOT_PASSWORD}@mongo:${MONGO_PORT}/
REDIS_URL: redis://redis:6379/0
command: ["arq", "jobs.WorkerSettings"]
depends_on:
- redis
- mongo
volumes:
mongo_data:
redis_data: