feat: intent-aware querying + smart sampling for large audit datasets

- Add keyword-based intent extraction: 'device' → Intune, 'user' → Directory, etc.
- Broad questions without intent auto-exclude noisy services (Exchange, SharePoint)
- Smart stratified sampling: failures always included, high-value services prioritised
- Fetch up to 1000 events from MongoDB, then curate best 200 for the LLM
- Excluded services noted in LLM prompt and query_info so the admin knows the scope
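
A rough sketch of the flow described above, for readers who want the shape of it without opening the diff. This is not the code from this commit. The names `extract_intent`, `plan_query`, `curate`, the event fields `service` and `result`, and the contents of `HIGH_VALUE_SERVICES` are all assumptions made for illustration. Only the 'device' → Intune and 'user' → Directory mappings, the noisy services, and the 200-event limit come from the message.

```python
from collections import defaultdict
from itertools import chain, zip_longest

# Only these two mappings are named in the commit message; the real map is larger.
INTENT_KEYWORDS = {"device": "Intune", "user": "Directory"}
NOISY_SERVICES = {"Exchange", "SharePoint"}
HIGH_VALUE_SERVICES = {"Intune", "Directory"}  # assumption, not from the commit


def extract_intent(question):
    """Return the services whose keyword appears in the question (substring match)."""
    q = question.lower()
    return sorted({svc for kw, svc in INTENT_KEYWORDS.items() if kw in q})


def plan_query(question):
    """Return (services_to_query, excluded_services) for a question."""
    services = extract_intent(question)
    if services:
        return services, []
    # Broad question: no service filter, but drop the noisy services.
    return [], sorted(NOISY_SERVICES)


def _interleave_by_service(events):
    """Round-robin across services so no single service dominates the sample."""
    buckets = defaultdict(list)
    for event in events:
        buckets[event.get("service")].append(event)
    mixed = chain.from_iterable(zip_longest(*buckets.values()))
    return [event for event in mixed if event is not None]


def curate(events, limit=200):
    """Pick up to `limit` events: failures first, then high-value services, then the rest."""
    failures = [e for e in events if e.get("result") == "failure"]
    others = [e for e in events if e.get("result") != "failure"]
    high = [e for e in others if e.get("service") in HIGH_VALUE_SERVICES]
    low = [e for e in others if e.get("service") not in HIGH_VALUE_SERVICES]
    ordered = failures + _interleave_by_service(high) + _interleave_by_service(low)
    return ordered[:limit]
```

In this sketch the caller would fetch up to 1000 events, pass them through `curate`, and hand the `excluded_services` list from `plan_query` to the LLM prompt builder, which matches the new `excluded_services=None` parameter on the test's `fake_llm`.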
2026-04-20 17:41:21 +02:00
parent b728abb5ee
commit b4e504a87b
3 changed files with 180 additions and 13 deletions
--- a/backend/tests/test_ask.py
+++ b/backend/tests/test_ask.py
@@ -236,7 +236,7 @@ class TestAskEndpoint:
            }
        )

-        async def fake_llm(question, events, total=None):
+        async def fake_llm(question, events, total=None, excluded_services=None):
            return "The device had a failed wipe attempt."

        monkeypatch.setattr("routes.ask.LLM_API_KEY", "fake-key")