feat: Redis caching + async queue for LLM scaling (v1.6.0)

- Add async Redis client singleton (redis_client.py) for caching and arq pool
- Add arq job functions (jobs.py) for background LLM processing
- Cache ask/explain LLM responses with TTL (1h ask, 24h explain)
- Add async mode to /api/ask: enqueue job, return job_id, poll /api/jobs/{id}
- Add GET /api/jobs/{job_id} endpoint for job status polling
- Add arq worker service to docker-compose (dev + prod)
- Switch from Redis to Valkey (BSD fork) in Docker Compose
- Add REDIS_URL config setting
- Add tests for cache hit, async mode, and job status
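
A minimal sketch of how the redis_client.py singleton and the TTL caching described above could look, assuming redis-py's asyncio client. The helper names (get_redis, cache_key, cache_get, cache_set) are illustrative guesses; only the 1h/24h TTLs come from this commit, and in the real code the connection URL would come from the REDIS_URL setting.

# redis_client.py -- illustrative sketch, not the file added by this commit
import hashlib
import json

from redis.asyncio import Redis, from_url

ASK_TTL = 60 * 60           # 1 hour for cached /api/ask responses
EXPLAIN_TTL = 24 * 60 * 60  # 24 hours for cached explain responses

_redis: Redis | None = None


async def get_redis(url: str) -> Redis:
    # Process-wide singleton: create the connection lazily on first use.
    global _redis
    if _redis is None:
        _redis = from_url(url, decode_responses=True)
    return _redis


def cache_key(prefix: str, payload: dict) -> str:
    # Stable key derived from the request payload, e.g. "ask:<sha256 digest>".
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return f"{prefix}:{digest}"


async def cache_get(redis: Redis, key: str) -> dict | None:
    raw = await redis.get(key)
    return json.loads(raw) if raw else None


async def cache_set(redis: Redis, key: str, value: dict, ttl: int) -> None:
    await redis.set(key, json.dumps(value), ex=ttl)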
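
The arq side might fit together roughly as below; process_ask, run_llm_query, enqueue_ask and the Valkey DSN are assumed names, and the LLM call is stubbed out. The worker container added to docker-compose would then run something like `arq jobs.WorkerSettings`.

# jobs.py -- illustrative sketch of the arq background job; names are assumed
from arq import create_pool
from arq.connections import RedisSettings

REDIS_DSN = "redis://valkey:6379"  # stands in for the REDIS_URL setting


async def run_llm_query(payload: dict) -> dict:
    # Placeholder for the app's real LLM call.
    return {"answer": "...", "llm_used": True}


async def process_ask(ctx: dict, payload: dict) -> dict:
    # arq persists the return value, so it can be fetched later when the
    # client polls the job status endpoint.
    return await run_llm_query(payload)


async def enqueue_ask(payload: dict) -> str:
    # What the /api/ask handler would do when async_mode is true:
    # enqueue the job and hand the job_id back to the caller.
    pool = await create_pool(RedisSettings.from_dsn(REDIS_DSN))
    job = await pool.enqueue_job("process_ask", payload)
    return job.job_id


class WorkerSettings:
    # Entry point for the worker container, e.g. `arq jobs.WorkerSettings`.
    functions = [process_ask]
    redis_settings = RedisSettings.from_dsn(REDIS_DSN)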
2026-04-22 09:55:05 +02:00
parent 47e0dfc2ca
commit f75f165911
16 changed files with 498 additions and 14 deletions


@@ -82,6 +82,7 @@ class AskRequest(BaseModel):
     end: str | None = None
     include_tags: list[str] | None = None
     exclude_tags: list[str] | None = None
+    async_mode: bool = False  # enqueue async job instead of waiting


 class AskEventRef(BaseModel):
@@ -101,3 +102,4 @@ class AskResponse(BaseModel):
     query_info: dict
     llm_used: bool
     llm_error: str | None = None
+    job_id: str | None = None
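
The job_id added to AskResponse above is what clients poll against GET /api/jobs/{job_id}. A rough sketch of that endpoint, assuming FastAPI plus arq's Job/JobStatus API; the router wiring, pool handling, and response shape are guesses, not the actual implementation.

# Sketch of the job status endpoint; pool wiring and response shape are assumed.
from arq import create_pool
from arq.connections import RedisSettings
from arq.jobs import Job, JobStatus
from fastapi import APIRouter, HTTPException

router = APIRouter()


@router.get("/api/jobs/{job_id}")
async def get_job_status(job_id: str) -> dict:
    # A real app would reuse a shared pool instead of creating one per request.
    pool = await create_pool(RedisSettings.from_dsn("redis://valkey:6379"))
    job = Job(job_id, pool)
    status = await job.status()
    if status == JobStatus.not_found:
        raise HTTPException(status_code=404, detail="unknown job")
    if status == JobStatus.complete:
        # job.result() returns whatever dict the background job returned.
        return {"job_id": job_id, "status": status.value, "result": await job.result()}
    return {"job_id": job_id, "status": status.value}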