## [0.3.0] - 2025-11-25

### Added
- **Universal Streaming Support for All LLM Providers**
  - Real-time text streaming across OpenAI, Anthropic, Grok, and Gemini providers
  - Tool call streaming with complete argument streaming
  - Unified `LLMChunk` format across all providers (types: `"text"`, `"tool_call_complete"`)
  - Automatic token usage tracking during streaming
  - Model metadata included in each chunk
  - Multi-turn conversation support with streaming for all providers
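A minimal sketch of what consuming the unified chunk format might look like. The field names (`type`, `content`, `model`, `usage`) are assumptions for illustration, not the library's documented schema:

```python
from dataclasses import dataclass, field

@dataclass
class LLMChunk:
    """Illustrative stand-in for the unified chunk format (field names assumed)."""
    type: str                                   # "text" or "tool_call_complete"
    content: str = ""
    model: str = ""                             # model metadata carried in each chunk
    usage: dict = field(default_factory=dict)   # token usage tracked during streaming

def collect_text(chunks):
    """Accumulate streamed text, skipping tool-call chunks."""
    parts = []
    for chunk in chunks:
        if chunk.type == "text":
            parts.append(chunk.content)
    return "".join(parts)
```

Because every provider emits the same chunk shape, the same consumer loop works unchanged across OpenAI, Anthropic, Grok, and Gemini.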
- **Agent Streaming Events (`on_event` callback)**
  - Real-time execution monitoring via `agent.run(prompt, on_event=callback)`
  - Six event types for comprehensive visibility:
    - `ITERATION`: track multi-step reasoning iterations
    - `THINKING`: stream LLM text generation in real time
    - `TOOL_CALL`: monitor tool invocations with arguments
    - `TOOL_RESULT`: receive tool execution results
    - `COMPLETE`: get the final answer with cost and token metadata
    - `ERROR`: handle execution errors gracefully
  - Support for both sync and async event handlers
  - Works with both `run()` and `run_detailed()` methods
  - Zero configuration: just add the `on_event` parameter
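A sketch of an `on_event` handler dispatching on the six event types. The event names follow the changelog; the callback signature `(event_type, data)` and the `make_logger` helper are assumptions for illustration:

```python
from enum import Enum

class EventType(Enum):
    """The six event types listed above (string values are assumed)."""
    ITERATION = "iteration"
    THINKING = "thinking"
    TOOL_CALL = "tool_call"
    TOOL_RESULT = "tool_result"
    COMPLETE = "complete"
    ERROR = "error"

def make_logger(log):
    """Return an on_event callback that appends readable entries to `log`."""
    def on_event(event_type, data):
        if event_type is EventType.THINKING:
            log.append(data)                 # stream text as it arrives
        elif event_type is EventType.TOOL_CALL:
            log.append(f"[tool] {data}")     # tool invocation with arguments
        elif event_type is EventType.COMPLETE:
            log.append(f"[done] {data}")     # final answer plus metadata
        elif event_type is EventType.ERROR:
            log.append(f"[error] {data}")    # handle failures gracefully
    return on_event

# Hypothetical call shape from the changelog:
# agent.run(prompt, on_event=make_logger(log))
```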
- **Streaming Examples and Patterns**
  - Progress tracking with custom UI handlers
  - Async event handling for database logging and notifications
  - Buffered text updates for optimal UI performance
  - Tool execution monitoring for debugging
  - Cost and performance tracking during execution
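As one illustration of the buffered-text pattern, the sketch below batches streamed tokens so the UI redraws per batch rather than per token. The class name, flush threshold, and `flushed` stand-in for a real UI update call are all assumptions:

```python
class BufferedTextHandler:
    """Batch streamed text fragments before pushing them to a UI (assumed API)."""

    def __init__(self, flush_every=50):
        self.buffer = []
        self.size = 0
        self.flush_every = flush_every   # flush once this many characters accumulate
        self.flushed = []                # stands in for a real UI update call

    def on_text(self, text):
        """Receive one streamed text fragment; flush when the buffer is full."""
        self.buffer.append(text)
        self.size += len(text)
        if self.size >= self.flush_every:
            self.flush()

    def flush(self):
        """Emit everything buffered so far as a single UI update."""
        if self.buffer:
            self.flushed.append("".join(self.buffer))
            self.buffer.clear()
            self.size = 0
```

A final `flush()` after the stream ends delivers any trailing partial batch.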
### Fixed

- **Gemini Streaming Tool Calling**
  - Fixed a critical issue where Gemini's streaming API would return empty tool calls in multi-turn conversations
  - Added defensive validation to filter empty tool calls before they poison conversation history
  - Ensures reliable multi-turn tool conversations with Gemini
  - Applied validation to both async and sync streaming paths
  - Gemini now fully supports text and tool streaming with multi-turn conversations
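The kind of defensive filter described above might look like this sketch. The dict shape (`name`/`args` keys) is an assumption, not Gemini's actual tool-call schema:

```python
def filter_empty_tool_calls(tool_calls):
    """Drop tool calls with no name before they enter conversation history.

    An empty tool call echoed back to the model in a later turn can
    derail the whole multi-turn conversation, so it is safer to filter
    at the streaming boundary (both async and sync paths).
    """
    valid = []
    for call in tool_calls:
        name = (call.get("name") or "").strip()
        if not name:
            continue  # skip: an empty call would poison the history
        valid.append(call)
    return valid
```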