When a query matches more than 50 events, the LLM now receives:
- Aggregated counts by service, operation, result, and actor
- A list of failures (up to 10)
- The 50 most recent raw events as samples

This scales to thousands of events without exceeding the token budget
or losing signal: the LLM gets a bird's-eye view plus concrete examples.
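
A minimal sketch of the overview described above, not the actual implementation. The function name `build_llm_payload`, the event keys (`timestamp`, `service`, `operation`, `result`, `actor`), the `"failure"` result value, and the `mode` field are all assumptions made for illustration. Only the threshold of 50, the cap of 10 failures, and the 50 most recent samples come from this change.

```python
from collections import Counter
from typing import Any

# Limits taken from the description above.
AGGREGATE_THRESHOLD = 50   # aggregate only when more than 50 events match
MAX_FAILURES = 10          # list at most 10 failures
MAX_SAMPLES = 50           # include the 50 most recent raw events

# Fields to count by; the key names are assumed, not confirmed.
GROUP_FIELDS = ("service", "operation", "result", "actor")


def build_llm_payload(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Return raw events for small result sets, an overview for large ones."""
    if len(events) <= AGGREGATE_THRESHOLD:
        return {"mode": "events", "events": events}

    # Assumes each event carries a sortable "timestamp" value.
    newest_first = sorted(events, key=lambda e: e["timestamp"], reverse=True)

    # One counter per grouping field, e.g. counts["service"]["auth"] == 312.
    counts = {
        field: dict(Counter(e.get(field, "unknown") for e in events))
        for field in GROUP_FIELDS
    }

    # Assumes failures are marked with result == "failure"; the newest
    # ones are kept here, which is a choice made for this sketch.
    failures = [e for e in newest_first if e.get("result") == "failure"]

    return {
        "mode": "overview",
        "total": len(events),
        "counts": counts,
        "failures": failures[:MAX_FAILURES],
        "samples": newest_first[:MAX_SAMPLES],
    }
```

In this sketch the payload size is bounded by the number of distinct field values plus at most 60 raw events, so it grows slowly with the number of matches.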

Also updates the system prompt to handle both individual event lists
and aggregated overviews correctly.