v0.3.0
November 25, 2025

Universal Streaming Support


## [0.3.0] - 2025-11-25

### Added

  • Universal Streaming Support for All LLM Providers

    • Real-time text streaming across OpenAI, Anthropic, Grok, and Gemini providers
    • Tool call streaming, with arguments streamed to completion
    • Unified LLMChunk format across all providers (types: "text", "tool_call_complete")
    • Automatic token usage tracking during streaming
    • Model metadata included in each chunk
    • Multi-turn conversation support with streaming for all providers
  • Agent Streaming Events (on_event callback)

    • Real-time execution monitoring via agent.run(prompt, on_event=callback)
    • Six event types for comprehensive visibility:
      • ITERATION: Track multi-step reasoning iterations
      • THINKING: Stream LLM text generation in real-time
      • TOOL_CALL: Monitor tool invocations with arguments
      • TOOL_RESULT: Receive tool execution results
      • COMPLETE: Get final answer with cost and token metadata
      • ERROR: Handle execution errors gracefully
    • Support for both sync and async event handlers
    • Works with both run() and run_detailed() methods
    • Zero configuration required: just pass the on_event parameter
  • Streaming Examples and Patterns

    • Progress tracking with custom UI handlers
    • Async event handling for database logging and notifications
    • Buffered text updates for optimal UI performance
    • Tool execution monitoring for debugging
    • Cost and performance tracking during execution
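
The unified chunk format above can be sketched as follows. This is an illustrative stand-in, not the library's actual classes: only the chunk fields named in this changelog (the "text" / "tool_call_complete" types, per-chunk model metadata, and token usage) are taken from the release notes; the `LLMChunk` dataclass shape and the `consume` helper are assumptions.

```python
# Hypothetical sketch of the unified streaming chunk format described above.
# Field names mirror the changelog ("text", "tool_call_complete", model
# metadata, token usage); the exact class layout is an assumption.
from dataclasses import dataclass, field

@dataclass
class LLMChunk:
    type: str                     # "text" or "tool_call_complete"
    text: str = ""
    tool_call: dict | None = None
    model: str = ""               # model metadata included in each chunk
    usage: dict = field(default_factory=dict)  # token usage tracked per chunk

def consume(chunks):
    """Accumulate streamed text, completed tool calls, and token usage."""
    answer, tool_calls, total_tokens = [], [], 0
    for chunk in chunks:
        if chunk.type == "text":
            answer.append(chunk.text)
        elif chunk.type == "tool_call_complete":
            tool_calls.append(chunk.tool_call)
        total_tokens += chunk.usage.get("total_tokens", 0)
    return "".join(answer), tool_calls, total_tokens

# Simulated provider stream in the unified format:
stream = [
    LLMChunk(type="text", text="The weather is ", model="gpt-4o",
             usage={"total_tokens": 3}),
    LLMChunk(type="text", text="sunny.", model="gpt-4o",
             usage={"total_tokens": 2}),
    LLMChunk(type="tool_call_complete",
             tool_call={"name": "get_weather", "arguments": {"city": "Oslo"}},
             model="gpt-4o", usage={"total_tokens": 5}),
]
text, calls, tokens = consume(stream)
```

Because every provider emits the same shape, one consumer loop like this works unchanged across OpenAI, Anthropic, Grok, and Gemini.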
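
A minimal event-handler sketch for the on_event callback: the six event-type names and the `agent.run(prompt, on_event=callback)` call come from this changelog, while the `Event` class and the simulated event sequence below are stand-ins so the handler can be exercised without the library.

```python
# Hypothetical sketch of an on_event handler. Only the six event-type names
# (ITERATION, THINKING, TOOL_CALL, TOOL_RESULT, COMPLETE, ERROR) are from the
# changelog; the Event shape and the driving loop are illustrative assumptions.
from dataclasses import dataclass
from typing import Any

@dataclass
class Event:
    type: str       # one of the six documented event types
    data: Any = None

log = []

def on_event(event: Event) -> None:
    """A sync handler; per the changelog, async handlers are also supported."""
    if event.type == "THINKING":
        log.append(("thinking", event.data))        # streamed LLM text
    elif event.type == "TOOL_CALL":
        log.append(("tool", event.data["name"]))    # tool name and arguments
    elif event.type == "COMPLETE":
        log.append(("done", event.data["answer"]))  # final answer + metadata

# Simulated event sequence an agent run might emit
# (in real use: agent.run(prompt, on_event=on_event)):
for ev in [
    Event("ITERATION", 1),
    Event("THINKING", "Checking the weather..."),
    Event("TOOL_CALL", {"name": "get_weather", "arguments": {"city": "Oslo"}}),
    Event("TOOL_RESULT", {"temp_c": 4}),
    Event("COMPLETE", {"answer": "4°C in Oslo", "total_tokens": 10}),
]:
    on_event(ev)
```

The same handler can back a progress bar, database logger, or cost tracker by branching on the event type, which is the pattern the examples above describe.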

### Fixed

  • Gemini Streaming Tool Calling
    • Fixed a critical issue where Gemini's streaming API returned empty tool calls in multi-turn conversations
    • Added defensive validation to filter empty tool calls before they poison conversation history
    • Ensures reliable multi-turn tool conversations with Gemini
    • Applied validation to both async and sync streaming paths
    • Gemini now fully supports text and tool streaming with multi-turn conversations
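
The defensive validation described above can be sketched like this. The dict shapes, function name, and history structure are illustrative assumptions, not the library's internals; the point is simply that empty tool calls are dropped before a tool-call turn is appended to the conversation history.

```python
# Illustrative sketch of the defensive validation: drop tool calls that lack
# a name before they enter multi-turn history, since sending such empty calls
# back to Gemini poisons the conversation. Shapes here are assumptions.
def filter_empty_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Keep only tool calls that actually name a function."""
    return [tc for tc in tool_calls if tc.get("name")]

history = []
raw = [
    {"name": "get_weather", "arguments": {"city": "Oslo"}},
    {"name": "", "arguments": {}},  # empty call a streaming response may emit
    {},                             # fully empty tool call
]
valid = filter_empty_tool_calls(raw)
if valid:  # only record a tool-call turn when something survived validation
    history.append({"role": "model", "tool_calls": valid})
```

Applying the same filter on both the async and sync streaming paths keeps the two code paths consistent, which is what the fix above does.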