9 Commits

Author SHA1 Message Date
b4e504a87b feat: intent-aware querying + smart sampling for large audit datasets
All checks were successful
Release / build-and-push (push) Successful in 1m31s
CI / lint-and-test (push) Successful in 34s
- Add keyword-based intent extraction: 'device' → Intune, 'user' → Directory, etc.
- Broad questions without intent auto-exclude noisy services (Exchange, SharePoint)
- Smart stratified sampling: failures always included, high-value services prioritised
- Fetch up to 1000 events from MongoDB, then curate best 200 for the LLM
- Excluded services noted in LLM prompt and query_info so the admin knows the scope
2026-04-20 17:41:21 +02:00
b728abb5ee ci: also tag and push 'latest' on every release
All checks were successful
CI / lint-and-test (push) Successful in 22s
2026-04-20 17:31:27 +02:00
d100388c7d chore(release): bump version to 1.2.6
All checks were successful
CI / lint-and-test (push) Successful in 31s
Release / build-and-push (push) Successful in 1m17s
2026-04-20 17:29:10 +02:00
11fd87411d fix: bake version into Docker image at build time
All checks were successful
Release / build-and-push (push) Successful in 1m18s
CI / lint-and-test (push) Successful in 20s
- Add VERSION build arg to Dockerfile
- Pass --build-arg VERSION in release workflow
- Remove VERSION env override from docker-compose files
- Version is now immutable inside the image, no runtime env var needed
2026-04-20 17:24:20 +02:00
6a80bf4eb9 fix: read version from env var so it works inside Docker
All checks were successful
Release / build-and-push (push) Successful in 28s
CI / lint-and-test (push) Successful in 21s
2026-04-20 17:15:55 +02:00
5e02f5a402 docs: add v1.2.5 release notes
All checks were successful
CI / lint-and-test (push) Successful in 25s
2026-04-20 17:12:43 +02:00
0c3e5ec57b feat: add version display to frontend and /api/version endpoint (v1.2.5)
All checks were successful
Release / build-and-push (push) Successful in 40s
CI / lint-and-test (push) Successful in 22s
- Add GET /api/version endpoint that reads VERSION file
- Frontend fetches version on init and displays it as a badge in the header
- Add version-badge CSS styling
- Update docker-compose.yml comment to v1.2.5
2026-04-20 17:09:02 +02:00
a255be93fe feat: aggregate large event sets before sending to LLM
All checks were successful
CI / lint-and-test (push) Successful in 18s
Release / build-and-push (push) Successful in 29s
When a query matches >50 events, the LLM now receives:
- Aggregated counts by service, operation, result, and actor
- A list of failures (up to 10)
- The 50 most recent raw events as samples

This scales to thousands of events without blowing the token budget
or losing signal. The LLM gets a bird's-eye view plus concrete examples.

Also updates the system prompt to handle both individual event lists
and aggregated overviews correctly.
2026-04-20 16:23:55 +02:00
cfe9397cc5 feat: raise LLM event limit to 200 and show total count awareness
All checks were successful
CI / lint-and-test (push) Successful in 23s
Release / build-and-push (push) Successful in 27s
- Bump LLM_MAX_EVENTS default from 50 to 200
- Add total_matched count to /api/ask response
- Include 'Showing X of Y total' header in LLM prompt so the model
  knows when its view is a subset and avoids false certainty
- Update system prompt to instruct acknowledging scale when truncated
- Update test mocks to accept new total parameter
2026-04-20 16:13:52 +02:00
12 changed files with 367 additions and 23 deletions

View File

@@ -42,6 +42,6 @@ ALERTS_ENABLED=false
LLM_API_KEY= LLM_API_KEY=
LLM_BASE_URL=https://api.openai.com/v1 LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini LLM_MODEL=gpt-4o-mini
LLM_MAX_EVENTS=50 LLM_MAX_EVENTS=200
LLM_TIMEOUT_SECONDS=30 LLM_TIMEOUT_SECONDS=30
LLM_API_VERSION= LLM_API_VERSION=

View File

@@ -16,7 +16,13 @@ jobs:
run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login git.cqre.net -u ${{ github.actor }} --password-stdin 2>&1 | grep -v "WARNING! Your credentials are stored unencrypted" run: echo "${{ secrets.REGISTRY_TOKEN }}" | docker login git.cqre.net -u ${{ github.actor }} --password-stdin 2>&1 | grep -v "WARNING! Your credentials are stored unencrypted"
- name: Build Docker image - name: Build Docker image
run: docker build ./backend --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} run: docker build ./backend --build-arg VERSION=${{ gitea.ref_name }} --tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
- name: Push Docker image - name: Tag as latest
run: docker tag git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} git.cqre.net/cqrenet/aoc-backend:latest
- name: Push version tag
run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }} run: docker push git.cqre.net/cqrenet/aoc-backend:${{ gitea.ref_name }}
- name: Push latest tag
run: docker push git.cqre.net/cqrenet/aoc-backend:latest

78
RELEASE_NOTES_v1.2.5.md Normal file
View File

@@ -0,0 +1,78 @@
# AOC v1.2.5 Release Notes
**Release date:** 2026-04-20
---
## What's new
### Natural language query (`/api/ask`)
Ask questions in plain English and get AI-generated answers backed by your audit logs.
- **Regex-based parsing** extracts time ranges (`last 3 days`, `yesterday`, `today`) and entities (`device ABC123`, `user bob@example.com`) without calling an LLM.
- **AI narrative summarisation** via any OpenAI-compatible API (OpenAI, Azure OpenAI, MS Foundry, Ollama).
- **Graceful fallback** when no LLM is configured — returns a structured bullet list with a clear error banner.
- **Cited evidence** — every answer includes the raw events that back it up.
### Filter-aware queries
The ask endpoint now respects the filter panel. When you set **Service = Exchange**, **Result = failure** and ask *"What happened to device X?"*, the LLM only sees failed Exchange events for that device.
### Scales to thousands of events
For large result sets (>50 events), the LLM receives an **aggregated overview** instead of a raw dump:
- Counts by service, action, result, and actor
- Failure highlights
- The 50 most recent raw events as samples
This keeps token usage low while preserving accuracy.
### Azure OpenAI / MS Foundry support
- Automatic `api-key` header detection for Azure endpoints.
- `LLM_API_VERSION` config for Azure `api-version` query parameters.
- `max_completion_tokens` support for newer model deployments.
### Version display
- `GET /api/version` endpoint reads the `VERSION` file.
- Frontend shows a version badge in the header (e.g., **1.2.5**).
### Production hardening (from v1.1.0)
- Dockerfile runs as non-root user with Gunicorn + Uvicorn workers.
- `docker-compose.prod.yml` with internal-only MongoDB, health checks, and nginx reverse proxy.
- Security headers (`X-Frame-Options`, `X-Content-Type-Options`, etc.).
---
## Configuration
Add to your `.env`:
```bash
# Required for AI narrative summarisation
LLM_API_KEY=your-key
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini
LLM_MAX_EVENTS=200
LLM_TIMEOUT_SECONDS=30
LLM_API_VERSION= # set for Azure OpenAI, e.g. 2024-12-01-preview
```
For Azure OpenAI / MS Foundry:
```bash
LLM_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
LLM_API_KEY=your-azure-key
LLM_API_VERSION=2024-12-01-preview
LLM_MODEL=your-deployment-name
```
---
## Upgrade notes
No breaking changes. Existing `/api/events`, filters, pagination, tags, and comments work unchanged.
---
## Docker image
```
git.cqre.net/cqrenet/aoc-backend:v1.2.5
```

View File

@@ -1 +1 @@
1.2.0 1.2.7

View File

@@ -1,5 +1,9 @@
FROM python:3.11-slim FROM python:3.11-slim
# Bake the version into the image at build time
ARG VERSION=unknown
ENV VERSION=${VERSION}
# Security: run as non-root # Security: run as non-root
RUN groupadd -r aoc && useradd -r -g aoc aoc RUN groupadd -r aoc && useradd -r -g aoc aoc

View File

@@ -46,7 +46,7 @@ class Settings(BaseSettings):
LLM_API_KEY: str = "" LLM_API_KEY: str = ""
LLM_BASE_URL: str = "https://api.openai.com/v1" LLM_BASE_URL: str = "https://api.openai.com/v1"
LLM_MODEL: str = "gpt-4o-mini" LLM_MODEL: str = "gpt-4o-mini"
LLM_MAX_EVENTS: int = 50 LLM_MAX_EVENTS: int = 200
LLM_TIMEOUT_SECONDS: int = 30 LLM_TIMEOUT_SECONDS: int = 30
LLM_API_VERSION: str = "" # e.g. 2025-01-01-preview for Azure OpenAI LLM_API_VERSION: str = "" # e.g. 2025-01-01-preview for Azure OpenAI

View File

@@ -12,7 +12,7 @@
<div class="page" x-data="aocApp()" x-init="initApp()"> <div class="page" x-data="aocApp()" x-init="initApp()">
<header class="hero"> <header class="hero">
<div> <div>
<p class="eyebrow">Admin Operations Center</p> <p class="eyebrow">Admin Operations Center <span class="version-badge" x-text="appVersion"></span></p>
<h1>Directory Audit Explorer</h1> <h1>Directory Audit Explorer</h1>
<p class="lede">Filter Microsoft Entra audit events by user, app, time, action, and action type.</p> <p class="lede">Filter Microsoft Entra audit events by user, app, time, action, and action type.</p>
</div> </div>
@@ -243,6 +243,7 @@
actor: '', selectedServices: [], search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '', actor: '', selectedServices: [], search: '', operation: '', result: '', start: '', end: '', limit: 100, includeTags: '', excludeTags: '',
}, },
options: { actors: [], services: [], operations: [], results: [] }, options: { actors: [], services: [], operations: [], results: [] },
appVersion: '',
askQuestionText: '', askQuestionText: '',
askLoading: false, askLoading: false,
askAnswer: '', askAnswer: '',
@@ -252,6 +253,7 @@
askLlmError: '', askLlmError: '',
async initApp() { async initApp() {
await this.loadVersion();
await this.initAuth(); await this.initAuth();
if (!this.authConfig?.auth_enabled || this.accessToken) { if (!this.authConfig?.auth_enabled || this.accessToken) {
await this.loadFilterOptions(); await this.loadFilterOptions();
@@ -260,6 +262,16 @@
} }
}, },
async loadVersion() {
try {
const res = await fetch('/api/version');
if (res.ok) {
const body = await res.json();
this.appVersion = body.version || '';
}
} catch {}
},
authHeader() { authHeader() {
return this.accessToken ? { Authorization: `Bearer ${this.accessToken}` } : {}; return this.accessToken ? { Authorization: `Bearer ${this.accessToken}` } : {};
}, },

View File

@@ -433,6 +433,20 @@ input {
color: var(--muted); color: var(--muted);
} }
.version-badge {
display: inline-block;
margin-left: 8px;
padding: 2px 8px;
border-radius: 999px;
background: rgba(125, 211, 252, 0.15);
border: 1px solid rgba(125, 211, 252, 0.3);
color: var(--accent-strong);
font-size: 11px;
font-weight: 600;
letter-spacing: 0.05em;
vertical-align: middle;
}
.ask-events { .ask-events {
margin-bottom: 14px; margin-bottom: 14px;
} }

View File

@@ -134,6 +134,13 @@ async def metrics():
return Response(content=prometheus_metrics(), media_type="text/plain") return Response(content=prometheus_metrics(), media_type="text/plain")
@app.get("/api/version")
async def version():
import os
return {"version": os.environ.get("VERSION", "unknown")}
frontend_dir = Path(__file__).parent / "frontend" frontend_dir = Path(__file__).parent / "frontend"
app.mount("/", StaticFiles(directory=frontend_dir, html=True), name="frontend") app.mount("/", StaticFiles(directory=frontend_dir, html=True), name="frontend")

View File

@@ -13,6 +13,129 @@ from models.api import AskRequest, AskResponse
router = APIRouter(dependencies=[Depends(require_auth)]) router = APIRouter(dependencies=[Depends(require_auth)])
logger = structlog.get_logger("aoc.ask") logger = structlog.get_logger("aoc.ask")
# ---------------------------------------------------------------------------
# Intent extraction — map question keywords to relevant audit services
# ---------------------------------------------------------------------------
_SERVICE_INTENTS = {
"intune": ["Intune"],
"device": ["Intune", "Device"],
"laptop": ["Intune", "Device"],
"mobile": ["Intune", "Device"],
"phone": ["Intune", "Device"],
"ipad": ["Intune", "Device"],
"app": ["Intune", "ApplicationManagement"],
"application": ["Intune", "ApplicationManagement"],
"policy": ["Intune", "Policy"],
"compliance": ["Intune", "Policy"],
"user": ["Directory", "UserManagement"],
"group": ["Directory", "GroupManagement"],
"role": ["Directory", "RoleManagement"],
"permission": ["Directory", "RoleManagement"],
"license": ["Directory", "License"],
"email": ["Exchange"],
"mailbox": ["Exchange"],
"mail": ["Exchange"],
"message": ["Exchange", "Teams"],
"file": ["SharePoint"],
"sharepoint": ["SharePoint"],
"site": ["SharePoint"],
"document": ["SharePoint"],
"team": ["Teams"],
"channel": ["Teams"],
"meeting": ["Teams"],
"call": ["Teams"],
}
# Services that are extremely noisy for typical admin questions.
# We exclude them by default on broad questions unless the user explicitly mentions them.
_NOISY_SERVICES = {"Exchange", "SharePoint"}
# Services that are generally admin-relevant and kept by default.
_DEFAULT_ADMIN_SERVICES = {
"Directory",
"UserManagement",
"GroupManagement",
"RoleManagement",
"ApplicationManagement",
"Intune",
"Device",
"Policy",
"Teams",
"License",
}
def _extract_intent_services(question: str) -> tuple[list[str] | None, bool]:
"""
Extract relevant services from the question.
Returns:
(services, is_explicit):
- services: list of service names to query, or None for default admin set
- is_explicit: True if the user explicitly mentioned a noisy service
"""
q_lower = question.lower()
tokens = set(re.findall(r"\b[a-z]+\b", q_lower))
matched_services = set()
for token, services in _SERVICE_INTENTS.items():
if token in tokens:
matched_services.update(services)
if matched_services:
# User asked something specific — return exactly what they asked for
is_explicit = not matched_services.isdisjoint(_NOISY_SERVICES)
return sorted(matched_services), is_explicit
# Broad question with no clear intent — default to admin-relevant services only
return None, False
# ---------------------------------------------------------------------------
# Smart sampling — stratified by importance so the LLM sees signal, not noise
# ---------------------------------------------------------------------------
def _smart_sample(events: list[dict], max_events: int = 200) -> list[dict]:
"""
Return a curated subset that preserves diversity and prioritises signal.
Tiers:
1. Failures (always valuable)
2. High-admin-value services (Intune, Device, Directory, etc.)
3. Everything else
"""
if len(events) <= max_events:
return events
high_value = {
"Directory",
"UserManagement",
"GroupManagement",
"RoleManagement",
"Intune",
"Device",
"Policy",
"ApplicationManagement",
}
failures = [e for e in events if str(e.get("result") or "").lower() in ("failure", "failed")]
high_val = [e for e in events if e.get("service") in high_value and e not in failures]
rest = [e for e in events if e not in failures and e not in high_val]
# Allocate slots: half to failures+high-value, half to rest (but never let rest dominate)
slots = max_events
failure_cap = min(len(failures), max(10, slots // 4))
high_cap = min(len(high_val), max(20, slots // 4))
rest_cap = slots - failure_cap - high_cap
sampled = failures[:failure_cap] + high_val[:high_cap] + rest[:rest_cap]
# Sort back to chronological order
sampled.sort(key=lambda e: e.get("timestamp") or "", reverse=True)
return sampled
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Time-range extraction # Time-range extraction
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -168,22 +291,80 @@ def _build_event_query(
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
_SYSTEM_PROMPT = """You are an IT operations assistant. An administrator has asked a question about audit logs. _SYSTEM_PROMPT = """You are an IT operations assistant. An administrator has asked a question about audit logs.
Your job is to read the list of audit events below and write a concise, plain-language answer. Your job is to read the data below and write a concise, plain-language answer.
The input may be either:
- A small list of individual audit events (numbered Event #1, #2, etc.), or
- An aggregated overview with counts by service, action, result, and actor, plus sample events.
Rules: Rules:
- Assume the reader is a non-expert admin. - Assume the reader is a non-expert admin.
- Group related events together and tell a coherent story. - For aggregated overviews: summarise the scale, top patterns, and highlight anomalies or failures.
- For small event lists: group related events together and tell a coherent story.
- Highlight anything unusual, failed actions, or privilege escalations. - Highlight anything unusual, failed actions, or privilege escalations.
- Reference specific event numbers (e.g., "Event #3") when making claims so the user can verify. - Reference specific event numbers (e.g., "Event #3") when making claims so the user can verify.
- If the data is an aggregated subset of a larger result set, acknowledge the scale (e.g., "847 events occurred — the top pattern was...").
- If there are no events, say so clearly. - If there are no events, say so clearly.
- Keep the answer under 300 words. - Keep the answer under 300 words.
- Do not invent events that are not in the list. - Do not invent events or patterns that are not supported by the data.
""" """
def _format_events_for_llm(events: list[dict]) -> str: def _aggregate_counts(events: list[dict]) -> dict:
"""Build lightweight aggregation tables for large result sets."""
from collections import Counter
svc_counts = Counter(e.get("service") or "Unknown" for e in events)
op_counts = Counter(e.get("operation") or "Unknown" for e in events)
result_counts = Counter(e.get("result") or "Unknown" for e in events)
actor_counts = Counter(e.get("actor_display") or "Unknown" for e in events)
return {
"services": svc_counts.most_common(10),
"operations": op_counts.most_common(10),
"results": result_counts.most_common(5),
"actors": actor_counts.most_common(10),
}
def _format_events_for_llm(
events: list[dict], total: int | None = None, excluded_services: list[str] | None = None
) -> str:
lines = [] lines = []
for i, e in enumerate(events, 1):
# If we have a large result set, send aggregation + samples instead of raw dump
if total is not None and total > len(events) and len(events) >= 50:
lines.append(f"Result set overview: {total} total events (showing a curated sample of {len(events)}).\n")
if excluded_services:
lines.append(f"Note: high-volume services excluded by default: {', '.join(excluded_services)}.\n")
agg = _aggregate_counts(events)
lines.append("Breakdown by service:")
for svc, cnt in agg["services"]:
lines.append(f" {svc}: {cnt}")
lines.append("\nBreakdown by action:")
for op, cnt in agg["operations"]:
lines.append(f" {op}: {cnt}")
lines.append("\nBreakdown by result:")
for res, cnt in agg["results"]:
lines.append(f" {res}: {cnt}")
lines.append("\nTop actors:")
for actor, cnt in agg["actors"]:
lines.append(f" {actor}: {cnt}")
# Include failures and a few recent samples
failures = [e for e in events if str(e.get("result") or "").lower() in ("failure", "failed")]
if failures:
lines.append(f"\nFailures ({len(failures)}):")
for e in failures[:10]:
ts = e.get("timestamp", "?")[:16].replace("T", " ")
op = e.get("operation", "unknown")
actor = e.get("actor_display", "unknown")
lines.append(f" {ts}{op} by {actor}")
lines.append("\nMost recent sample events:")
else:
if total is not None and total > len(events):
lines.append(f"Showing {len(events)} of {total} total matching events (most recent first):\n")
# Always include the first N raw events as detail (up to 50)
for i, e in enumerate(events[:50], 1):
ts = e.get("timestamp") or "unknown time" ts = e.get("timestamp") or "unknown time"
op = e.get("operation") or "unknown action" op = e.get("operation") or "unknown action"
actor = e.get("actor_display") or "unknown actor" actor = e.get("actor_display") or "unknown actor"
@@ -213,11 +394,16 @@ def _build_chat_url(base_url: str, api_version: str) -> str:
return url return url
async def _call_llm(question: str, events: list[dict]) -> str: async def _call_llm(
question: str,
events: list[dict],
total: int | None = None,
excluded_services: list[str] | None = None,
) -> str:
if not LLM_API_KEY: if not LLM_API_KEY:
raise RuntimeError("LLM_API_KEY not configured") raise RuntimeError("LLM_API_KEY not configured")
context = _format_events_for_llm(events) context = _format_events_for_llm(events, total=total, excluded_services=excluded_services)
messages = [ messages = [
{"role": "system", "content": _SYSTEM_PROMPT}, {"role": "system", "content": _SYSTEM_PROMPT},
{ {
@@ -278,6 +464,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
start, end = _extract_time_range(question) start, end = _extract_time_range(question)
entity = _extract_entity(question) entity = _extract_entity(question)
intent_services, explicit_noisy = _extract_intent_services(question)
# Default to last 7 days if no time range detected # Default to last 7 days if no time range detected
if not start: if not start:
@@ -285,11 +472,29 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
start = (now - timedelta(days=7)).isoformat().replace("+00:00", "Z") start = (now - timedelta(days=7)).isoformat().replace("+00:00", "Z")
end = now.isoformat().replace("+00:00", "Z") end = now.isoformat().replace("+00:00", "Z")
# -----------------------------------------------------------------------
# Decide which services to query
# -----------------------------------------------------------------------
excluded_services: list[str] = []
if body.services:
# User explicitly filtered via UI — respect that exactly
query_services = body.services
elif intent_services is not None:
# NL question implies specific services
query_services = intent_services
else:
# Broad question with no intent — exclude noisy services by default
query_services = sorted(_DEFAULT_ADMIN_SERVICES)
excluded_services = sorted(_NOISY_SERVICES)
# -----------------------------------------------------------------------
# Build and run query
# -----------------------------------------------------------------------
query = _build_event_query( query = _build_event_query(
entity, entity,
start, start,
end, end,
services=body.services, services=query_services,
actor=body.actor, actor=body.actor,
operation=body.operation, operation=body.operation,
result=body.result, result=body.result,
@@ -298,21 +503,34 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
) )
try: try:
cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(LLM_MAX_EVENTS) total = events_collection.count_documents(query)
events = list(cursor) # Fetch a generous window so we can apply smart sampling in Python
cursor = events_collection.find(query).sort([("timestamp", -1)]).limit(1000)
raw_events = list(cursor)
except Exception as exc: except Exception as exc:
logger.error("Failed to query events for ask", error=str(exc)) logger.error("Failed to query events for ask", error=str(exc))
raise HTTPException(status_code=500, detail=f"Database query failed: {exc}") from exc raise HTTPException(status_code=500, detail=f"Database query failed: {exc}") from exc
for e in events: for e in raw_events:
e["_id"] = str(e.get("_id", "")) e["_id"] = str(e.get("_id", ""))
# Apply smart sampling (preserves failures, prioritises admin-relevant services)
events = _smart_sample(raw_events, max_events=LLM_MAX_EVENTS)
# If no events, return early # If no events, return early
if not events: if not events:
return AskResponse( return AskResponse(
answer="I couldn't find any audit events matching your question. Try broadening the time range or checking the spelling of the device/user name.", answer="I couldn't find any audit events matching your question. Try broadening the time range or checking the spelling of the device/user name.",
events=[], events=[],
query_info={"entity": entity, "start": start, "end": end, "event_count": 0}, query_info={
"entity": entity,
"start": start,
"end": end,
"event_count": 0,
"total_matched": total,
"services_queried": query_services,
"excluded_services": excluded_services,
},
llm_used=False, llm_used=False,
llm_error="LLM not used — no events found." if not LLM_API_KEY else None, llm_error="LLM not used — no events found." if not LLM_API_KEY else None,
) )
@@ -325,7 +543,7 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation." llm_error = "LLM_API_KEY is not configured. Set it in your .env to enable AI narrative summarisation."
else: else:
try: try:
answer = await _call_llm(question, events) answer = await _call_llm(question, events, total=total, excluded_services=excluded_services)
llm_used = True llm_used = True
except Exception as exc: except Exception as exc:
llm_error = f"LLM call failed: {exc}" llm_error = f"LLM call failed: {exc}"
@@ -333,9 +551,11 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
# Fallback: structured summary if LLM unavailable or failed # Fallback: structured summary if LLM unavailable or failed
if not answer: if not answer:
parts = [f"Found {len(events)} event(s)"] parts = [f"Found {total} event(s)"]
if entity: if entity:
parts.append(f"related to **{entity}**") parts.append(f"related to **{entity}**")
if excluded_services:
parts.append(f"(excluding {', '.join(excluded_services)})")
parts.append(f"between {start[:10]} and {end[:10]}.\n") parts.append(f"between {start[:10]} and {end[:10]}.\n")
for i, e in enumerate(events[:10], 1): for i, e in enumerate(events[:10], 1):
@@ -359,6 +579,9 @@ async def ask_question(body: AskRequest, user: dict = Depends(require_auth)):
"start": start, "start": start,
"end": end, "end": end,
"event_count": len(events), "event_count": len(events),
"total_matched": total,
"services_queried": query_services,
"excluded_services": excluded_services,
"mongo_query": json.dumps(query, default=str), "mongo_query": json.dumps(query, default=str),
}, },
llm_used=llm_used, llm_used=llm_used,

View File

@@ -236,7 +236,7 @@ class TestAskEndpoint:
} }
) )
async def fake_llm(question, events): async def fake_llm(question, events, total=None, excluded_services=None):
return "The device had a failed wipe attempt." return "The device had a failed wipe attempt."
monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key") monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")
@@ -265,7 +265,7 @@ class TestAskEndpoint:
} }
) )
async def failing_llm(question, events): async def failing_llm(question, events, total=None):
raise RuntimeError("LLM service down") raise RuntimeError("LLM service down")
monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key") monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")

View File

@@ -14,7 +14,7 @@ services:
backend: backend:
build: ./backend build: ./backend
# For production, use the pre-built image instead: # For production, use the pre-built image instead:
# image: git.cqre.net/cqrenet/aoc-backend:v1.1.0 # image: git.cqre.net/cqrenet/aoc-backend:v1.2.5
container_name: aoc-backend container_name: aoc-backend
restart: always restart: always
env_file: env_file: