feat: Redis caching + async queue for LLM scaling (v1.6.0)
- Add async Redis client singleton (redis_client.py) for caching and arq pool
- Add arq job functions (jobs.py) for background LLM processing
- Cache ask/explain LLM responses with TTL (1h ask, 24h explain)
- Add async mode to /api/ask: enqueue job, return job_id, client polls /api/jobs/{job_id}
- Add GET /api/jobs/{job_id} endpoint for job status polling
- Add arq worker service to docker-compose (dev + prod)
- Switch from Redis to Valkey (BSD-licensed fork) in Docker Compose
- Add REDIS_URL config setting
- Add tests for cache hit, async mode, and job status
@@ -82,6 +82,7 @@ class AskRequest(BaseModel):
     end: str | None = None
     include_tags: list[str] | None = None
     exclude_tags: list[str] | None = None
+    async_mode: bool = False  # enqueue async job instead of waiting
 
 
 class AskEventRef(BaseModel):
@@ -101,3 +102,4 @@ class AskResponse(BaseModel):
     query_info: dict
     llm_used: bool
     llm_error: str | None = None
+    job_id: str | None = None