LLM Providers
The Daita framework provides a unified interface for working with multiple Large Language Model (LLM) providers. The system uses a factory pattern that allows easy switching between providers while maintaining a consistent interface for agents.
Overview
Daita supports five LLM providers out of the box, all with full streaming support:
- OpenAI - GPT-4, GPT-3.5-turbo, and other OpenAI models
- Anthropic - Claude family models (Sonnet, Haiku, Opus)
- xAI Grok - Grok and vision models
- Google Gemini - Gemini and Gemini Pro models
- Mock Provider - Testing and development without API calls
All providers support real-time streaming for both text generation and tool calling, enabling transparent agent execution with live progress updates.
Quick Start
Direct Instantiation (Recommended)
The simplest way to use LLM providers is to import and instantiate them directly:
from daita.llm import OpenAIProvider, AnthropicProvider, GrokProvider
# OpenAI
llm = OpenAIProvider(model="gpt-4", api_key="sk-...")
response = await llm.generate("Hello, world!")
# Anthropic Claude
llm = AnthropicProvider(model="claude-3-sonnet-20240229", api_key="sk-ant-...")
response = await llm.generate("Analyze this data...")
# xAI Grok
llm = GrokProvider(model="grok-beta", api_key="xai-...")
response = await llm.generate("What's the latest news?")
# Streaming support
async for chunk in llm.generate("Explain quantum computing", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
elif chunk.type == "tool_call_complete":
print(f"\nTool: {chunk.tool_name}({chunk.tool_args})")
Factory Pattern (For Dynamic Selection)
Use the factory when you need to dynamically choose providers at runtime:
from daita.llm import create_llm_provider
# Useful when provider is determined at runtime
provider_name = config.get("llm_provider") # e.g., from config file
llm = create_llm_provider(provider_name, "gpt-4", api_key="sk-...")
response = await llm.generate("Hello, world!")
Streaming Support
All Daita LLM providers support real-time streaming for both text generation and tool calling. When streaming is enabled, you receive LLMChunk objects in real-time as the model generates content:
from daita.llm import OpenAIProvider
llm = OpenAIProvider(model="gpt-4")
# Stream text generation
async for chunk in llm.generate("Write a story", stream=True):
if chunk.type == "text":
# Text content streaming
print(chunk.content, end="", flush=True)
elif chunk.type == "tool_call_complete":
# Tool call detected
print(f"Tool: {chunk.tool_name}")
print(f"Args: {chunk.tool_args}")
Streaming Features:
- ✅ Real-time text token streaming
- ✅ Tool call streaming with complete arguments
- ✅ Unified chunk format across all providers
- ✅ Automatic token usage tracking
- ✅ Model metadata in each chunk
LLMChunk Types:
"text": Text content chunks (field:content)"tool_call_complete": Complete tool call (fields:tool_name,tool_args,tool_call_id)
Factory Function
The create_llm_provider() factory function is useful when you need to dynamically select providers at runtime (e.g., from configuration files). For most cases, direct instantiation is simpler and more Pythonic:
from daita.llm import create_llm_provider
llm = create_llm_provider(
provider="openai", # Provider name
model="gpt-4", # Model identifier
api_key="sk-...", # API key (optional if set in environment)
agent_id="my_agent", # For token tracking (optional)
temperature=0.7, # Model parameters (optional)
max_tokens=1000 # Additional provider-specific options
)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| provider | str | Yes | Provider name: 'openai', 'anthropic', 'grok', 'gemini', or 'mock' |
| model | str | Yes | Model identifier specific to the provider |
| api_key | str | No | API key (uses environment variables if not provided) |
| agent_id | str | No | Agent ID for token usage tracking |
| **kwargs | dict | No | Additional provider-specific parameters |
Registry
List available providers:
from daita.llm import list_available_providers
providers = list_available_providers()
print(providers) # ['openai', 'anthropic', 'grok', 'gemini', 'mock']
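One common use is validating a provider name taken from configuration before instantiating it. A small sketch (the provider_name value here stands in for a hypothetical config entry):
from daita.llm import create_llm_provider, list_available_providers

provider_name = "anthropic"  # hypothetical value read from a config file
if provider_name not in list_available_providers():
    raise ValueError(f"Unknown LLM provider: {provider_name}")
llm = create_llm_provider(provider_name, "claude-3-sonnet-20240229")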
OpenAI Provider
The OpenAI provider supports all OpenAI models including GPT-4, GPT-3.5-turbo, and GPT-4-turbo variants.
Configuration
from daita.llm import OpenAIProvider
# Basic OpenAI configuration
llm = OpenAIProvider(
model="gpt-4",
api_key="sk-your-openai-key"
)
# Advanced configuration with custom parameters
llm = OpenAIProvider(
model="gpt-4o-mini",
api_key="sk-your-openai-key",
temperature=0.7,
max_tokens=1000,
frequency_penalty=0.1,
presence_penalty=0.1,
timeout=60
)
Environment Variables
# Required
export OPENAI_API_KEY="sk-your-openai-key"
# Optional
export OPENAI_ORG_ID="org-your-org-id"
Advanced Features
# Custom parameters with conversation
messages = [
{"role": "system", "content": "You are an expert code reviewer."},
{"role": "user", "content": "Analyze this code for bugs: def foo(): return x"}
]
response = await llm.generate(messages, temperature=0.3, max_tokens=2000)
# Tool calling (function calling)
from daita.core.tools import tool
@tool
async def get_weather(location: str) -> dict:
    """Get weather for a location."""
    return {"temp": 72, "condition": "sunny"}
# Use tools with generate()
response = await llm.generate("What's the weather like?", tools=[get_weather])
Streaming Support
OpenAI provides full streaming support for both text and tool calling:
# Stream text responses
async for chunk in llm.generate("Explain AI", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
# Stream with tool calling
async for chunk in llm.generate("Search for X", tools=[search_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Calling: {chunk.tool_name}({chunk.tool_args})")
Anthropic Provider
The Anthropic provider supports Claude 3 family models with their unique capabilities and safety features.
Configuration
from daita.llm import AnthropicProvider
# Basic Anthropic configuration
llm = AnthropicProvider(
model="claude-3-sonnet-20240229",
api_key="sk-ant-your-anthropic-key"
)
# Advanced configuration
llm = AnthropicProvider(
model="claude-3-opus-20240229",
api_key="sk-ant-your-anthropic-key",
temperature=0.5,
max_tokens=2000,
timeout=90
)
Environment Variables
# Required
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
Claude-Specific Features
# Long-form content generation
response = await llm.generate(
prompt="Write a comprehensive analysis of...",
max_tokens=4000,
temperature=0.7
)
# Document analysis with large context
response = await llm.generate(
prompt=f"Analyze this document: {large_document_text}",
max_tokens=1000
)
Streaming Support
Anthropic provides full streaming support with efficient tool calling:
# Stream Claude's responses
async for chunk in llm.generate("Write an essay", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
# Stream with tool calling (Claude's tool use)
async for chunk in llm.generate("Analyze data", tools=[analysis_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")
xAI Grok Provider
The Grok provider connects to xAI's Grok models, which are optimized for real-time information and conversational AI.
Configuration
from daita.llm import GrokProvider
# Basic Grok configuration
llm = GrokProvider(
model="grok-beta",
api_key="xai-your-api-key"
)
# Configuration with custom base URL
llm = GrokProvider(
model="grok-vision-beta",
api_key="xai-your-api-key",
base_url="https://api.x.ai/v1",
timeout=60
)
Environment Variables
# Either of these works
export XAI_API_KEY="xai-your-api-key"
export GROK_API_KEY="xai-your-api-key"
Grok-Specific Features
# Real-time information queries
response = await llm.generate(
prompt="What's happening in tech news today?",
temperature=0.8
)
# Vision capabilities (grok-vision-beta)
llm_vision = create_llm_provider("grok", "grok-vision-beta")
response = await llm_vision.generate_with_image(
prompt="Describe this image",
image_path="./screenshot.png"
)
Streaming Support
Grok provides OpenAI-compatible streaming for real-time responses:
# Stream Grok responses
async for chunk in llm.generate("Latest tech trends", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
# Stream with tool calling
async for chunk in llm.generate("Search news", tools=[search_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")
Google Gemini Provider
The Gemini provider supports Google's latest generative AI models with multimodal capabilities.
Configuration
from daita.llm import GeminiProvider
# Basic Gemini configuration
llm = GeminiProvider(
model="gemini-1.5-flash",
api_key="AIza-your-google-api-key"
)
# Advanced configuration with safety settings
llm = GeminiProvider(
model="gemini-1.5-pro",
api_key="AIza-your-google-api-key",
temperature=0.9,
safety_settings={
"HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
"HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE"
},
generation_config={
"candidate_count": 1,
"max_output_tokens": 2048
}
)
Environment Variables
# Either of these works
export GOOGLE_API_KEY="AIza-your-google-api-key"
export GEMINI_API_KEY="AIza-your-google-api-key"
Gemini-Specific Features
# Large context processing
response = await llm.generate(
prompt=f"Summarize this entire codebase: {massive_code_text}",
max_tokens=1000
)
# Multimodal capabilities
response = await llm.generate_with_media(
prompt="Explain what's happening in this video",
media_path="./demo_video.mp4"
)
# Safety-filtered generation
response = await llm.generate(
prompt="Generate content about...",
safety_settings={
"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_LOW_AND_ABOVE"
}
)
Streaming Support
Gemini provides full streaming support for both text and tool calling with large context windows:
# Stream Gemini responses
async for chunk in llm.generate("Explain quantum physics", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
# Stream with tool calling (multi-turn conversations supported)
async for chunk in llm.generate("Analyze data", tools=[analysis_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")
Note: Gemini streaming includes automatic validation to filter empty tool calls, ensuring reliable multi-turn tool conversations.
Mock Provider
The Mock provider is designed for testing and development without making actual API calls or incurring costs.
Configuration
from daita.llm import create_llm_provider

# Basic mock configuration
llm = create_llm_provider(
provider="mock",
model="test-model",
agent_id="test_agent"
)
# Mock with custom responses
llm = create_llm_provider(
provider="mock",
model="gpt-4-mock",
responses=["Hello! This is a mock response.", "Another mock response."],
delay=0.5 # Simulate API latency
)
Features
- No API calls - Returns predefined responses
- Latency simulation - Configurable delays to simulate real API behavior
- Token tracking - Simulates token usage for testing
- Error simulation - Can simulate API failures for error handling tests
Mock-Specific Configuration
# Detailed mock setup
llm = create_llm_provider(
provider="mock",
model="claude-mock",
responses=[
"This is the first mock response.",
"This is the second mock response.",
"This is the third mock response."
],
delay=1.0, # 1 second delay
cycle_responses=True, # Cycle through responses
simulate_tokens=True, # Track mock token usage
error_rate=0.1 # 10% chance of simulated errors
)
# Use in tests
response = await llm.generate("Any prompt")
print(response) # Returns one of the mock responses
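A minimal unit-test sketch using the mock provider; it assumes pytest with an async test plugin (e.g. pytest-asyncio) and that generate() returns the configured response string, as in the example above:
import pytest
from daita.llm import create_llm_provider

@pytest.mark.asyncio
async def test_generate_returns_mock_response():
    llm = create_llm_provider(
        provider="mock",
        model="test-model",
        responses=["mocked analysis result"],
        delay=0.0  # no artificial latency in unit tests
    )
    response = await llm.generate("Analyze this data")
    assert response == "mocked analysis result"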
Multi-Provider Usage
You can use multiple providers in the same application for different use cases:
from daita.llm import create_llm_provider
# Different providers for different tasks
openai_llm = create_llm_provider("openai", "gpt-4", agent_id="analyzer")
anthropic_llm = create_llm_provider("anthropic", "claude-3-sonnet-20240229", agent_id="writer")
gemini_llm = create_llm_provider("gemini", "gemini-1.5-flash", agent_id="summarizer")
# Use appropriate provider for each task
analysis = await openai_llm.generate("Analyze this data: ...")
content = await anthropic_llm.generate("Write an article about: ...")
summary = await gemini_llm.generate("Summarize this document: ...")
Integration with Agents
LLM providers integrate seamlessly with Daita agents:
from daita import SubstrateAgent
# Create agent with specific provider
agent = SubstrateAgent(
name="Analysis Agent",
llm_provider="anthropic",
model="claude-3-sonnet-20240229"
)
# Agent automatically uses the specified provider
result = await agent.process("analyze", data={"text": "Complex data to analyze"})
Multiple Agents with Different Providers
from daita import SubstrateAgent
# Each agent uses different provider optimized for its task
data_agent = SubstrateAgent(
name="Data Processor",
llm_provider="openai",
model="gpt-4"
)
content_agent = SubstrateAgent(
name="Content Generator",
llm_provider="anthropic",
model="claude-3-sonnet-20240229"
)
speed_agent = SubstrateAgent(
name="Quick Responder",
llm_provider="gemini",
model="gemini-1.5-flash"
)
# Use each agent with its optimized provider
data_result = await data_agent.process("analyze", data)
content_result = await content_agent.process("generate", data)
speed_result = await speed_agent.process("summarize", data)
Error Handling
All providers implement consistent error handling:
from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

try:
    llm = create_llm_provider("openai", "gpt-4")
    response = await llm.generate("Your prompt here")
except LLMError as e:
    print(f"LLM error: {e}")
    # Handle provider-specific errors
except Exception as e:
    print(f"Unexpected error: {e}")
Common Error Types
- Authentication errors - Invalid API keys
- Rate limiting - Too many requests
- Model errors - Invalid model names
- Network errors - Connection issues
- Token limit errors - Prompt too long
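Rate limits and transient network failures are usually worth retrying. A minimal backoff sketch, assuming such failures surface as LLMError (distinguishing retryable from fatal errors is left to the application):
import asyncio
from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

async def generate_with_retry(prompt: str, retries: int = 3, base_delay: float = 1.0) -> str:
    llm = create_llm_provider("openai", "gpt-4")
    for attempt in range(retries):
        try:
            return await llm.generate(prompt)
        except LLMError:
            if attempt == retries - 1:
                raise
            # Exponential backoff: 1s, 2s, 4s, ...
            await asyncio.sleep(base_delay * (2 ** attempt))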
Token Tracking
Token usage is automatically tracked when using agents:
from daita import SubstrateAgent
# Create agent with LLM provider
agent = SubstrateAgent(
name="My Agent",
llm_provider="openai",
model="gpt-4"
)
# Use the agent (token usage tracked automatically)
await agent.process("analyze", {"text": "Hello, world!"})
# Check token usage
usage = agent.get_token_usage()
print(f"Total tokens: {usage['total_tokens']}")
print(f"Prompt tokens: {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Estimated cost: ${usage['estimated_cost']:.4f}")
Custom Providers
You can register custom LLM providers for specialized use cases:
from daita.llm import create_llm_provider, register_llm_provider, BaseLLMProvider

class CustomProvider(BaseLLMProvider):
    """Custom LLM provider implementation."""

    async def generate(self, prompt: str, **kwargs) -> str:
        # Your custom implementation
        return "Custom response"
# Register the provider
register_llm_provider("custom", CustomProvider)
# Use the custom provider
llm = create_llm_provider("custom", "custom-model")
response = await llm.generate("Test prompt")
Best Practices
API Key Management
import os
# Use environment variables for API keys
os.environ["OPENAI_API_KEY"] = "sk-your-key"
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-your-key"
# Let the provider auto-detect
llm = create_llm_provider("openai", "gpt-4") # No API key needed
Model Selection
# Choose models based on task complexity
simple_llm = create_llm_provider("gemini", "gemini-1.5-flash") # Quick tasks
balanced_llm = create_llm_provider("anthropic", "claude-3-sonnet") # Balanced
powerful_llm = create_llm_provider("openai", "gpt-4") # Complex reasoning
Error Resilience
from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

async def resilient_generate(prompt: str, providers: list):
    """Try multiple providers for resilience."""
    for provider_config in providers:
        try:
            llm = create_llm_provider(**provider_config)
            return await llm.generate(prompt)
        except LLMError:
            continue
    raise LLMError("All providers failed")
# Usage
providers = [
{"provider": "openai", "model": "gpt-4"},
{"provider": "anthropic", "model": "claude-3-sonnet-20240229"},
{"provider": "gemini", "model": "gemini-1.5-pro"}
]
response = await resilient_generate("Your prompt", providers)
Performance Optimization
# Use appropriate models for the task
quick_llm = create_llm_provider("gemini", "gemini-1.5-flash") # Fast responses
quality_llm = create_llm_provider("anthropic", "claude-3-opus") # High quality
# Optimize parameters
speed_optimized = create_llm_provider(
"openai", "gpt-3.5-turbo",
temperature=0.3, # Lower for consistency
max_tokens=500 # Limit for speed
)
Next Steps
- Getting Started - Quick start tutorial
- Agents - Using LLMs in agents
- Configuration - Advanced LLM configuration
- Error Handling - Robust error management
- Tracing - Monitor LLM usage and costs
Support
- Provider Issues: Check provider-specific documentation and status pages
- Integration Help: See agent documentation and examples
- Custom Providers: Review the BaseLLMProvider class for implementation guidance
- Token Optimization: Monitor usage with automatic tracing