LLM Providers
The Daita framework provides a unified interface for working with multiple Large Language Model (LLM) providers. The system uses a factory pattern that allows easy switching between providers while maintaining a consistent interface for agents.
#Overview
Daita supports six LLM providers out of the box with full streaming support:
- OpenAI - GPT-5.4 mini, GPT-5.4, GPT-5.5, and other OpenAI models
- Anthropic - Claude family models (Haiku, Sonnet, Opus)
- xAI Grok - Grok 4 and vision models
- Google Gemini - Gemini 2.5 Flash/Lite and Gemini Pro models
- Ollama - Local models (Llama, Mistral, Gemma, CodeStral, Phi, etc.)
- Mock Provider - Testing and development without API calls
All providers support real-time streaming for both text generation and tool calling, enabling transparent agent execution with live progress updates.
#Environment Variables
Set API keys for the providers you'll use:
| Provider | Environment Variable | Key Format |
|---|---|---|
| OpenAI | OPENAI_API_KEY | sk-... |
| Anthropic | ANTHROPIC_API_KEY | sk-ant-... |
| Google Gemini | GOOGLE_API_KEY or GEMINI_API_KEY | AIza... |
| xAI | XAI_API_KEY or GROK_API_KEY | xai-... |
| Ollama | (none required) | — |
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

#Quick Start
#Direct Instantiation (Recommended)
Import and instantiate providers directly:
from daita.llm import OpenAIProvider, AnthropicProvider, GrokProvider
# OpenAI
llm = OpenAIProvider(model="gpt-5.4-mini")
response = await llm.generate("Hello, world!")
# Anthropic Claude
llm = AnthropicProvider(model="claude-haiku-4-5")
response = await llm.generate("Analyze this data...")
# xAI Grok
llm = GrokProvider(model="grok-4.20")
response = await llm.generate("What's the latest news?")

#Factory Pattern
Use the factory when provider is determined at runtime:
from daita.llm import create_llm_provider
provider_name = config.get("llm_provider") # From config
llm = create_llm_provider(provider_name, "gpt-5.4-mini")
response = await llm.generate("Hello, world!")

#Streaming
All providers support real-time streaming for text and tool calling:
from daita.llm import OpenAIProvider
llm = OpenAIProvider(model="gpt-5.4-mini")
# Stream text generation
async for chunk in llm.generate("Write a story", stream=True):
    if chunk.type == "text":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "tool_call_complete":
        print(f"\nTool: {chunk.tool_name}({chunk.tool_args})")

Chunk Types:
- "text" - Text content (field: content)
- "tool_call_complete" - Completed tool call (fields: tool_name, tool_args, tool_call_id)
All providers return the same chunk format for consistent handling across different LLMs.
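As an illustrative sketch only (the Chunk class and render function below are hypothetical stand-ins, not daita's actual classes), the chunk contract described above can be modeled like this:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the chunk objects yielded by generate(..., stream=True).
@dataclass
class Chunk:
    type: str                       # "text" or "tool_call_complete"
    content: Optional[str] = None   # set for "text" chunks
    tool_name: Optional[str] = None
    tool_args: Optional[dict] = None
    tool_call_id: Optional[str] = None

def render(chunk: Chunk) -> str:
    """Provider-agnostic handling: the same two branches work for every provider."""
    if chunk.type == "text":
        return chunk.content or ""
    if chunk.type == "tool_call_complete":
        return f"\nTool: {chunk.tool_name}({chunk.tool_args})"
    return ""
```

Because every provider emits the same two chunk types, downstream code like render() never needs provider-specific branches.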
#Factory Function
The create_llm_provider() factory function is useful when you need to dynamically select providers at runtime (e.g., from configuration files). For most cases, direct instantiation is simpler and more Pythonic:
from daita.llm import create_llm_provider
llm = create_llm_provider(
provider="openai", # Provider name
model="gpt-5.4-mini", # Model identifier
api_key="sk-...", # API key (optional if set in environment)
agent_id="my_agent", # For token tracking (optional)
temperature=0.7, # Model parameters (optional)
max_tokens=1000 # Additional provider-specific options
)

#Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| provider | str | Yes | Provider name: 'openai', 'anthropic', 'grok', 'gemini', 'ollama', or 'mock' |
| model | str | Yes | Model identifier specific to the provider |
| api_key | str | No | API key (uses environment variables if not provided) |
| agent_id | str | No | Agent ID for token usage tracking |
| **kwargs | dict | No | Additional provider-specific parameters |
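Conceptually, the factory resolves a provider name to a class through a registry. The simplified sketch below (with a made-up _FakeProvider and create_provider, not daita's internals) shows the general shape; the real create_llm_provider additionally handles API keys and token tracking:

```python
# Minimal sketch of a name -> class registry behind a factory function.
class _FakeProvider:
    def __init__(self, model: str, **kwargs):
        self.model = model
        self.options = kwargs

_REGISTRY: dict = {"fake": _FakeProvider}

def create_provider(provider: str, model: str, **kwargs):
    """Look up the provider class by name and instantiate it."""
    try:
        cls = _REGISTRY[provider]
    except KeyError:
        raise ValueError(f"Unknown provider {provider!r}; known: {sorted(_REGISTRY)}")
    return cls(model, **kwargs)

llm = create_provider("fake", "fake-model", temperature=0.7)
```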
#Registry
List available providers:
from daita.llm import list_available_providers
providers = list_available_providers()
print(providers) # ['openai', 'anthropic', 'grok', 'gemini', 'ollama', 'mock']

#OpenAI Provider
The OpenAI provider supports OpenAI chat models including GPT-5.4 mini, GPT-5.4, GPT-5.5, and legacy GPT-4 variants.
#Configuration
from daita.llm import OpenAIProvider
# Basic OpenAI configuration
llm = OpenAIProvider(
model="gpt-5.4-mini",
api_key="sk-your-openai-key"
)
# Advanced configuration with custom parameters
llm = OpenAIProvider(
model="gpt-5.4-mini",
api_key="sk-your-openai-key",
temperature=0.7,
max_completion_tokens=1000,
reasoning_effort="medium",
service_tier="auto",
parallel_tool_calls=True,
frequency_penalty=0.1,
presence_penalty=0.1,
timeout=60
)

max_tokens is still accepted as a convenience alias. For current OpenAI models, Daita sends it as max_completion_tokens by default. Set use_legacy_max_tokens=True when targeting older OpenAI-compatible endpoints that still expect max_tokens.
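The alias behavior described in the note can be sketched as a simple keyword rewrite (normalize_token_limit is a hypothetical helper for illustration, not daita's internal name):

```python
def normalize_token_limit(params: dict, use_legacy_max_tokens: bool = False) -> dict:
    """Rewrite the convenience alias max_tokens into the keyword the endpoint expects."""
    params = dict(params)  # don't mutate the caller's dict
    if "max_tokens" in params and not use_legacy_max_tokens:
        # Current OpenAI models expect max_completion_tokens.
        params["max_completion_tokens"] = params.pop("max_tokens")
    return params
```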
#Advanced Features
# Custom parameters with conversation
messages = [
    {"role": "system", "content": "You are an expert code reviewer."},
    {"role": "user", "content": "Analyze this code for bugs: def foo(): return x"}
]
response = await llm.generate(messages, temperature=0.3, max_tokens=2000)
# Tool calling (function calling)
from daita.core.tools import tool
@tool
async def get_weather(location: str) -> dict:
    """Get weather for a location."""
    return {"temp": 72, "condition": "sunny"}

# Use tools with generate()
response = await llm.generate("What's the weather like?", tools=[get_weather])

#Anthropic Provider
The Anthropic provider supports Claude family models with their unique capabilities and safety features.
#Configuration
from daita.llm import AnthropicProvider
# Basic Anthropic configuration
llm = AnthropicProvider(
model="claude-haiku-4-5",
api_key="sk-ant-your-anthropic-key"
)
# Advanced configuration
llm = AnthropicProvider(
model="claude-sonnet-4-5",
api_key="sk-ant-your-anthropic-key",
temperature=0.5,
max_tokens=2000,
timeout=90
)

#Claude-Specific Features
# Long-form content generation
response = await llm.generate(
prompt="Write a comprehensive analysis of...",
max_tokens=4000,
temperature=0.7
)
# Document analysis with large context
response = await llm.generate(
prompt=f"Analyze this document: {large_document_text}",
max_tokens=1000
)

#xAI Grok Provider
The Grok provider connects to xAI's Grok models, which are optimized for real-time information and conversational AI.
#Configuration
from daita.llm import GrokProvider
# Basic Grok configuration
llm = GrokProvider(
model="grok-4.20",
api_key="xai-your-api-key"
)
# Configuration with custom base URL
llm = GrokProvider(
model="grok-vision-beta",
api_key="xai-your-api-key",
base_url="https://api.x.ai/v1",
timeout=60
)

#Grok-Specific Features
# Real-time information queries
response = await llm.generate(
prompt="What's happening in tech news today?",
temperature=0.8
)
# Vision capabilities (grok-vision-beta)
llm_vision = GrokProvider(model="grok-vision-beta")
response = await llm_vision.generate_with_image(
prompt="Describe this image",
image_path="./screenshot.png"
)

#Google Gemini Provider
The Gemini provider supports Google's latest generative AI models with multimodal capabilities.
#Configuration
from daita.llm import GeminiProvider
# Basic Gemini configuration
llm = GeminiProvider(
model="gemini-2.5-flash-lite",
api_key="AIza-your-google-api-key"
)
# Advanced configuration with safety settings
llm = GeminiProvider(
model="gemini-2.5-flash",
api_key="AIza-your-google-api-key",
temperature=0.9,
top_k=40,
response_mime_type="application/json",
safety_settings={
"HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
"HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE"
},
generation_config={
"candidate_count": 1,
"max_output_tokens": 2048
}
)

Gemini provider calls also accept stop_sequences, response_schema, and thinking_config. These are forwarded into GenerateContentConfig alongside standard options such as temperature, top_p, and max_tokens.
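How these keywords might be folded into one generation config can be sketched as follows (build_generation_config and the exact key set are assumptions for illustration; the real provider constructs a GenerateContentConfig):

```python
# Keys forwarded into the generation config, per the note above (assumed set).
_FORWARDED = {"temperature", "top_p", "top_k", "max_tokens",
              "stop_sequences", "response_mime_type", "response_schema",
              "thinking_config"}

def build_generation_config(**kwargs) -> dict:
    """Collect forwarded keywords into a single config dict, ignoring the rest."""
    return {k: v for k, v in kwargs.items() if k in _FORWARDED}
```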
#Gemini-Specific Features
# Large context processing
response = await llm.generate(
prompt=f"Summarize this entire codebase: {massive_code_text}",
max_tokens=1000
)
# Multimodal capabilities
response = await llm.generate_with_media(
prompt="Explain what's happening in this video",
media_path="./demo_video.mp4"
)
# Safety-filtered generation
response = await llm.generate(
prompt="Generate content about...",
safety_settings={
"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_LOW_AND_ABOVE"
}
)

#Ollama Provider
The Ollama provider connects to a locally running Ollama server via its OpenAI-compatible API, letting you run agents against any model available through ollama pull — Llama 3.1, Mistral, Gemma 2, CodeStral, Phi 3, and more.
No API key is required. Ollama must be running locally (or at a reachable URL).
#Configuration
from daita.llm import OllamaProvider
# Basic — uses localhost:11434 by default
llm = OllamaProvider(model="llama3.1")
# Custom server URL
llm = OllamaProvider(
model="mistral",
base_url="http://gpu-server:11434/v1",
timeout=120
)

The server URL can also be set via the OLLAMA_BASE_URL environment variable.
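The resulting precedence (explicit base_url, then OLLAMA_BASE_URL, then the local default) can be sketched as (resolve_ollama_base_url is an illustrative helper, not daita's internal name):

```python
import os
from typing import Optional

def resolve_ollama_base_url(explicit: Optional[str] = None) -> str:
    """Precedence: explicit base_url > OLLAMA_BASE_URL env var > local default."""
    return explicit or os.environ.get("OLLAMA_BASE_URL") or "http://localhost:11434/v1"
```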
#Using with Agents
from daita import Agent
agent = Agent(
name="Local Agent",
llm_provider="ollama",
model="llama3.1",
)
await agent.start()
result = await agent.run("Analyze this data...")
await agent.stop()

#Supported Models
Any model available via ollama pull works. Common choices:
| Model | Use Case | Command |
|---|---|---|
| llama3.1 | General purpose | ollama pull llama3.1 |
| mistral | Fast, balanced | ollama pull mistral |
| codestral | Code generation | ollama pull codestral |
| gemma2 | Lightweight, efficient | ollama pull gemma2 |
| phi3 | Small but capable | ollama pull phi3 |
#Error Handling
The Ollama provider produces clear diagnostics for common issues:
- Connection refused — "Cannot connect to Ollama at ... Is Ollama running? Start it with: ollama serve"
- Cloud environment — "Ollama is a local-only LLM provider and cannot run in Daita Cloud. Use a cloud provider instead."
#Limitations
- Local only — Ollama cannot run in Daita Cloud deployments. Use a cloud provider (OpenAI, Anthropic, Gemini, Grok) for hosted agents.
- Tool calling — Supported, but quality varies by model. Llama 3.1 and Mistral have the best tool-calling support.
- Streaming — Fully supported via the OpenAI-compatible streaming API.
#Mock Provider
The Mock provider is designed for testing and development without making actual API calls or incurring costs.
#Configuration
from daita.llm import create_llm_provider

# Basic mock configuration
llm = create_llm_provider(
provider="mock",
model="test-model",
agent_id="test_agent"
)
# Mock with custom responses
llm = create_llm_provider(
provider="mock",
model="gpt-4-mock",
responses=["Hello! This is a mock response.", "Another mock response."],
delay=0.5 # Simulate API latency
)

#Features
- No API calls - Returns predefined responses
- Latency simulation - Configurable delays to simulate real API behavior
- Token tracking - Simulates token usage for testing
- Error simulation - Can simulate API failures for error handling tests
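A toy model of the behaviors listed above (response cycling plus error simulation) looks roughly like this; MockSketch is purely illustrative and not the actual mock provider implementation:

```python
import random

class MockSketch:
    """Toy model of the mock provider's response cycling and error simulation."""

    def __init__(self, responses, cycle_responses=True, error_rate=0.0, seed=None):
        self.responses = list(responses)
        self.cycle = cycle_responses
        self.error_rate = error_rate
        self._i = 0
        self._rng = random.Random(seed)

    def generate(self, prompt: str) -> str:
        # Simulate an API failure with probability error_rate.
        if self._rng.random() < self.error_rate:
            raise RuntimeError("simulated API failure")
        if self._i >= len(self.responses):
            if not self.cycle:
                raise IndexError("mock responses exhausted")
            self._i = 0  # wrap around when cycling is enabled
        response = self.responses[self._i]
        self._i += 1
        return response
```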
#Mock-Specific Configuration
# Detailed mock setup
llm = create_llm_provider(
provider="mock",
model="claude-mock",
responses=[
"This is the first mock response.",
"This is the second mock response.",
"This is the third mock response."
],
delay=1.0, # 1 second delay
cycle_responses=True, # Cycle through responses
simulate_tokens=True, # Track mock token usage
error_rate=0.1 # 10% chance of simulated errors
)
# Use in tests
response = await llm.generate("Any prompt")
print(response) # Returns one of the mock responses

#Multi-Provider Usage
You can use multiple providers in the same application for different use cases:
from daita.llm import create_llm_provider
# Different providers for different tasks
openai_llm = create_llm_provider("openai", "gpt-5.4-mini", agent_id="analyzer")
anthropic_llm = create_llm_provider("anthropic", "claude-haiku-4-5", agent_id="writer")
gemini_llm = create_llm_provider("gemini", "gemini-2.5-flash-lite", agent_id="summarizer")
# Use appropriate provider for each task
analysis = await openai_llm.generate("Analyze this data: ...")
content = await anthropic_llm.generate("Write an article about: ...")
summary = await gemini_llm.generate("Summarize this document: ...")

#Using with Agents
Agents use providers automatically when you specify llm_provider:
from daita import Agent
# Agent uses specified provider
agent = Agent(
name="Analyst",
llm_provider="anthropic",
model="claude-haiku-4-5"
)
await agent.start()
result = await agent.run("Analyze this data")

See Agent documentation for complete agent usage.
#Error Handling
All providers implement consistent error handling:
from daita.core.exceptions import LLMError
try:
    llm = create_llm_provider("openai", "gpt-5.4-mini")
    response = await llm.generate("Your prompt here")
except LLMError as e:
    print(f"LLM error: {e}")
    # Handle provider-specific errors
except Exception as e:
    print(f"Unexpected error: {e}")

#Common Error Types
- Authentication errors - Invalid API keys
- Rate limiting - Too many requests
- Model errors - Invalid model names
- Network errors - Connection issues
- Token limit errors - Prompt too long
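For transient failures such as rate limiting, a retry wrapper with exponential backoff is a common pattern. This is a generic sketch (generate_with_retry is not a daita API; it wraps any awaitable LLM call):

```python
import asyncio
import random

async def generate_with_retry(generate, prompt, retries=3, base_delay=1.0):
    """Retry a flaky async LLM call with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return await generate(prompt)
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries.
            await asyncio.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```

Usage: `await generate_with_retry(llm.generate, "Your prompt here")`. In production you would catch LLMError rather than bare Exception.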
#Token Tracking
Token usage is automatically tracked when using agents:
from daita import Agent
# Create agent with LLM provider
agent = Agent(
name="My Agent",
llm_provider="openai",
model="gpt-5.4-mini"
)
# Use the agent (token usage tracked automatically)
await agent.run("analyze this text", data={"text": "Hello, world!"})
# Check token usage
usage = agent.get_token_usage()
print(f"Total tokens: {usage['total_tokens']}")
print(f"Prompt tokens: {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Estimated cost: ${usage['estimated_cost']:.4f}")

#Custom Providers
You can register custom LLM providers for specialized use cases:
from daita.llm import register_llm_provider, BaseLLMProvider
class CustomProvider(BaseLLMProvider):
    """Custom LLM provider implementation."""

    async def generate(self, prompt: str, **kwargs) -> str:
        # Your custom implementation
        return "Custom response"

# Register the provider
register_llm_provider("custom", CustomProvider)

# Use the custom provider
llm = create_llm_provider("custom", "custom-model")
response = await llm.generate("Test prompt")

#Best Practices
API Keys:
- Store keys in environment variables, never hardcode
- Use different keys for development and production
- Rotate keys regularly
Model Selection:
- Fast tasks: Gemini 2.5 Flash Lite, GPT-5.4 mini
- Balanced: Claude Sonnet, GPT-5.4
- Complex reasoning: Claude Opus, GPT-5.5
Error Handling:
- Always wrap LLM calls in try-except blocks
- Handle rate limiting with exponential backoff
- Consider fallback providers for resilience
Performance:
- Use streaming for better user experience
- Set appropriate max_tokens to control costs
- Monitor token usage with agent tracing
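The fallback-provider advice under Error Handling above can be sketched provider-agnostically; generate_with_fallback is an illustrative helper, not part of daita, and works with any objects exposing an async generate method:

```python
async def generate_with_fallback(providers, prompt):
    """Try providers in order; return the first successful response."""
    last_error = None
    for llm in providers:
        try:
            return await llm.generate(prompt)
        except Exception as exc:
            last_error = exc  # remember the failure and try the next provider
    raise RuntimeError("all providers failed") from last_error
```

For example, pass `[openai_llm, anthropic_llm]` so Anthropic serves as a backup when OpenAI is unavailable.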
#Next Steps
- Getting Started - Quick start tutorial
- Agent - Using LLMs in agents
- Authentication - API key setup and management
- Error Handling - Robust error management
- Tracing - Monitor LLM usage and costs