We made a broad pass over how agents run, so day-to-day work feels quicker and more dependable.
Live messages now appear faster, long answers from slower models no longer get cut off, and when a model is busy the system moves smoothly to your next backup option.
What you can do now
- See live replies appear in about half a second.
- Get complete answers from models that pause to think, instead of having those answers cut off or dropped.
- Keep working when a model is busy, as agents switch to your backup options automatically.
- Get a clear message when a usage limit is reached instead of a silent stall.
- See cost and usage totals that match what was actually used.
Why it matters
Reliability is what makes AI feel like a teammate you can count on. Small stalls, dropped answers, and unclear errors erode that trust quickly.
These changes target the rough edges that show up in real, everyday use, so agents stay responsive and predictable under load.
Example workflows
- Live chat: A team watches replies stream in noticeably faster.
- Heavy tasks: A long analysis finishes cleanly instead of cutting off mid-thought.
- Peak load: Work continues on a backup model when the first choice is busy.
- Budgets: Usage and cost numbers stay accurate across multi-step work.
What’s next
We'll keep hardening the engine so agents stay fast and reliable as workloads grow.