LLM Providers

The Daita framework provides a unified interface for working with multiple Large Language Model (LLM) providers. The system uses a factory pattern that allows easy switching between providers while maintaining a consistent interface for agents.

Overview

Daita supports five LLM providers out of the box, all with full streaming support:

  • OpenAI - GPT-4, GPT-3.5-turbo, and other OpenAI models
  • Anthropic - Claude family models (Sonnet, Haiku, Opus)
  • xAI Grok - Grok and vision models
  • Google Gemini - Gemini and Gemini Pro models
  • Mock Provider - Testing and development without API calls

All providers support real-time streaming for both text generation and tool calling, enabling transparent agent execution with live progress updates.

Quick Start

The simplest way to use LLM providers is to import and instantiate them directly:

from daita.llm import OpenAIProvider, AnthropicProvider, GrokProvider

# OpenAI
llm = OpenAIProvider(model="gpt-4", api_key="sk-...")
response = await llm.generate("Hello, world!")

# Anthropic Claude
llm = AnthropicProvider(model="claude-3-sonnet-20240229", api_key="sk-ant-...")
response = await llm.generate("Analyze this data...")

# xAI Grok
llm = GrokProvider(model="grok-beta", api_key="xai-...")
response = await llm.generate("What's the latest news?")

# Streaming support
async for chunk in llm.generate("Explain quantum computing", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)
elif chunk.type == "tool_call_complete":
print(f"\nTool: {chunk.tool_name}({chunk.tool_args})")

Factory Pattern (For Dynamic Selection)

Use the factory when you need to dynamically choose providers at runtime:

from daita.llm import create_llm_provider

# Useful when provider is determined at runtime
provider_name = config.get("llm_provider") # e.g., from config file
llm = create_llm_provider(provider_name, "gpt-4", api_key="sk-...")
response = await llm.generate("Hello, world!")

Streaming Support

All Daita LLM providers support real-time streaming for both text generation and tool calling. When streaming is enabled, you receive LLMChunk objects in real time as the model generates content:

from daita.llm import OpenAIProvider

llm = OpenAIProvider(model="gpt-4")

# Stream text generation
async for chunk in llm.generate("Write a story", stream=True):
if chunk.type == "text":
# Text content streaming
print(chunk.content, end="", flush=True)
elif chunk.type == "tool_call_complete":
# Tool call detected
print(f"Tool: {chunk.tool_name}")
print(f"Args: {chunk.tool_args}")

Streaming Features:

  • ✅ Real-time text token streaming
  • ✅ Tool call streaming with complete arguments
  • ✅ Unified chunk format across all providers
  • ✅ Automatic token usage tracking
  • ✅ Model metadata in each chunk

LLMChunk Types (see the shape sketch after this list):

  • "text": Text content chunks (field: content)
  • "tool_call_complete": Complete tool call (fields: tool_name, tool_args, tool_call_id)

Factory Function

The create_llm_provider() factory function is useful when you need to dynamically select providers at runtime (e.g., from configuration files). For most cases, direct instantiation is simpler and more Pythonic:

from daita.llm import create_llm_provider

llm = create_llm_provider(
    provider="openai",    # Provider name
    model="gpt-4",        # Model identifier
    api_key="sk-...",     # API key (optional if set in environment)
    agent_id="my_agent",  # For token tracking (optional)
    temperature=0.7,      # Model parameters (optional)
    max_tokens=1000       # Additional provider-specific options
)

Parameters

Parameter | Type | Required | Description
provider  | str  | Yes      | Provider name: 'openai', 'anthropic', 'grok', 'gemini', or 'mock'
model     | str  | Yes      | Model identifier specific to the provider
api_key   | str  | No       | API key (uses environment variables if not provided)
agent_id  | str  | No       | Agent ID for token usage tracking
**kwargs  | dict | No       | Additional provider-specific parameters

Registry

List available providers:

from daita.llm import list_available_providers

providers = list_available_providers()
print(providers) # ['openai', 'anthropic', 'grok', 'gemini', 'mock']
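
This pairs naturally with the factory when the provider name comes from configuration: validate it before constructing anything. A small sketch (the config object and key names are hypothetical stand-ins for however you load settings):

from daita.llm import list_available_providers, create_llm_provider

provider_name = config.get("llm_provider", "openai")  # hypothetical config lookup
if provider_name not in list_available_providers():
    raise ValueError(f"Unknown LLM provider: {provider_name}")
llm = create_llm_provider(provider_name, config.get("model", "gpt-4"))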

OpenAI Provider

The OpenAI provider supports all OpenAI models including GPT-4, GPT-3.5-turbo, and GPT-4-turbo variants.

Configuration

from daita.llm import OpenAIProvider

# Basic OpenAI configuration
llm = OpenAIProvider(
    model="gpt-4",
    api_key="sk-your-openai-key"
)

# Advanced configuration with custom parameters
llm = OpenAIProvider(
    model="gpt-4o-mini",
    api_key="sk-your-openai-key",
    temperature=0.7,
    max_tokens=1000,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    timeout=60
)

Environment Variables

# Required
export OPENAI_API_KEY="sk-your-openai-key"

# Optional
export OPENAI_ORG_ID="org-your-org-id"

Advanced Features

# Custom parameters with conversation
messages = [
    {"role": "system", "content": "You are an expert code reviewer."},
    {"role": "user", "content": "Analyze this code for bugs: def foo(): return x"}
]
response = await llm.generate(messages, temperature=0.3, max_tokens=2000)

# Tool calling (function calling)
from daita.core.tools import tool

@tool
async def get_weather(location: str) -> dict:
    """Get weather for a location."""
    return {"temp": 72, "condition": "sunny"}

# Use tools with generate()
response = await llm.generate("What's the weather like?", tools=[get_weather])

Streaming Support

OpenAI provides full streaming support for both text and tool calling:

# Stream text responses
async for chunk in llm.generate("Explain AI", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)

# Stream with tool calling
async for chunk in llm.generate("Search for X", tools=[search_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Calling: {chunk.tool_name}({chunk.tool_args})")

Anthropic Provider

The Anthropic provider supports Claude 3 family models with their unique capabilities and safety features.

Configuration

from daita.llm import AnthropicProvider

# Basic Anthropic configuration
llm = AnthropicProvider(
    model="claude-3-sonnet-20240229",
    api_key="sk-ant-your-anthropic-key"
)

# Advanced configuration
llm = AnthropicProvider(
    model="claude-3-opus-20240229",
    api_key="sk-ant-your-anthropic-key",
    temperature=0.5,
    max_tokens=2000,
    timeout=90
)

Environment Variables

# Required
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"

Claude-Specific Features

# Long-form content generation
response = await llm.generate(
    prompt="Write a comprehensive analysis of...",
    max_tokens=4000,
    temperature=0.7
)

# Document analysis with large context
response = await llm.generate(
    prompt=f"Analyze this document: {large_document_text}",
    max_tokens=1000
)

Streaming Support

Anthropic provides full streaming support with efficient tool calling:

# Stream Claude's responses
async for chunk in llm.generate("Write an essay", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)

# Stream with tool calling (Claude's tool use)
async for chunk in llm.generate("Analyze data", tools=[analysis_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")

xAI Grok Provider

The Grok provider connects to xAI's Grok models, which are optimized for real-time information and conversational AI.

Configuration

from daita.llm import GrokProvider

# Basic Grok configuration
llm = GrokProvider(
    model="grok-beta",
    api_key="xai-your-api-key"
)

# Configuration with custom base URL
llm = GrokProvider(
    model="grok-vision-beta",
    api_key="xai-your-api-key",
    base_url="https://api.x.ai/v1",
    timeout=60
)

Environment Variables

# Either of these work
export XAI_API_KEY="xai-your-api-key"
export GROK_API_KEY="xai-your-api-key"

Grok-Specific Features

# Real-time information queries
response = await llm.generate(
    prompt="What's happening in tech news today?",
    temperature=0.8
)

# Vision capabilities (grok-vision-beta)
llm_vision = create_llm_provider("grok", "grok-vision-beta")
response = await llm_vision.generate_with_image(
    prompt="Describe this image",
    image_path="./screenshot.png"
)

Streaming Support

Grok provides OpenAI-compatible streaming for real-time responses:

# Stream Grok responses
async for chunk in llm.generate("Latest tech trends", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)

# Stream with tool calling
async for chunk in llm.generate("Search news", tools=[search_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")

Google Gemini Provider

The Gemini provider supports Google's latest generative AI models with multimodal capabilities.

Configuration

from daita.llm import GeminiProvider

# Basic Gemini configuration
llm = GeminiProvider(
    model="gemini-1.5-flash",
    api_key="AIza-your-google-api-key"
)

# Advanced configuration with safety settings
llm = GeminiProvider(
    model="gemini-1.5-pro",
    api_key="AIza-your-google-api-key",
    temperature=0.9,
    safety_settings={
        "HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
        "HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE"
    },
    generation_config={
        "candidate_count": 1,
        "max_output_tokens": 2048
    }
)

Environment Variables

# Either of these work
export GOOGLE_API_KEY="AIza-your-google-api-key"
export GEMINI_API_KEY="AIza-your-google-api-key"

Gemini-Specific Features

# Large context processing
response = await llm.generate(
    prompt=f"Summarize this entire codebase: {massive_code_text}",
    max_tokens=1000
)

# Multimodal capabilities
response = await llm.generate_with_media(
    prompt="Explain what's happening in this video",
    media_path="./demo_video.mp4"
)

# Safety-filtered generation
response = await llm.generate(
    prompt="Generate content about...",
    safety_settings={
        "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_LOW_AND_ABOVE"
    }
)

Streaming Support

Gemini provides full streaming support for both text and tool calling with large context windows:

# Stream Gemini responses
async for chunk in llm.generate("Explain quantum physics", stream=True):
if chunk.type == "text":
print(chunk.content, end="", flush=True)

# Stream with tool calling (multi-turn conversations supported)
async for chunk in llm.generate("Analyze data", tools=[analysis_tool], stream=True):
if chunk.type == "tool_call_complete":
print(f"Tool: {chunk.tool_name}({chunk.tool_args})")

Note: Gemini streaming includes automatic validation to filter empty tool calls, ensuring reliable multi-turn tool conversations.
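
The framework applies this validation for you, so no extra handling is needed. If you consume raw chunks yourself, the sketch below illustrates the kind of guard involved; it uses only the documented chunk fields and is not the provider's internal implementation:

# Defensive guard when consuming chunks directly (illustrative only)
async for chunk in llm.generate("Analyze data", tools=[analysis_tool], stream=True):
    if chunk.type == "tool_call_complete":
        if not chunk.tool_name:  # skip malformed or empty tool calls
            continue
        print(f"Tool: {chunk.tool_name}({chunk.tool_args})")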

Mock Provider

The Mock provider is designed for testing and development without making actual API calls or incurring costs.

Configuration

from daita.llm import create_llm_provider

# Basic mock configuration
llm = create_llm_provider(
    provider="mock",
    model="test-model",
    agent_id="test_agent"
)

# Mock with custom responses
llm = create_llm_provider(
    provider="mock",
    model="gpt-4-mock",
    responses=["Hello! This is a mock response.", "Another mock response."],
    delay=0.5  # Simulate API latency
)

Features

  • No API calls - Returns predefined responses
  • Latency simulation - Configurable delays to simulate real API behavior
  • Token tracking - Simulates token usage for testing
  • Error simulation - Can simulate API failures for error handling tests

Mock-Specific Configuration

# Detailed mock setup
llm = create_llm_provider(
    provider="mock",
    model="claude-mock",
    responses=[
        "This is the first mock response.",
        "This is the second mock response.",
        "This is the third mock response."
    ],
    delay=1.0,             # 1 second delay
    cycle_responses=True,  # Cycle through responses
    simulate_tokens=True,  # Track mock token usage
    error_rate=0.1         # 10% chance of simulated errors
)

# Use in tests
response = await llm.generate("Any prompt")
print(response) # Returns one of the mock responses
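
Because the mock provider is deterministic and free, it drops neatly into unit tests. A hypothetical pytest sketch (assumes pytest-asyncio is installed, that generate() returns the canned string directly, and that error_rate=1.0 makes every call raise LLMError):

import pytest
from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

@pytest.mark.asyncio
async def test_mock_returns_canned_response():
    llm = create_llm_provider("mock", "test-model", responses=["canned reply"])
    assert await llm.generate("Any prompt") == "canned reply"

@pytest.mark.asyncio
async def test_caller_handles_simulated_failure():
    # error_rate=1.0 so every call fails -- assumes failures surface as LLMError
    llm = create_llm_provider("mock", "test-model", responses=["ok"], error_rate=1.0)
    with pytest.raises(LLMError):
        await llm.generate("Any prompt")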

Multi-Provider Usage

You can use multiple providers in the same application for different use cases:

from daita.llm import create_llm_provider

# Different providers for different tasks
openai_llm = create_llm_provider("openai", "gpt-4", agent_id="analyzer")
anthropic_llm = create_llm_provider("anthropic", "claude-3-sonnet-20240229", agent_id="writer")
gemini_llm = create_llm_provider("gemini", "gemini-1.5-flash", agent_id="summarizer")

# Use appropriate provider for each task
analysis = await openai_llm.generate("Analyze this data: ...")
content = await anthropic_llm.generate("Write an article about: ...")
summary = await gemini_llm.generate("Summarize this document: ...")

Integration with Agents

LLM providers integrate seamlessly with Daita agents:

from daita import SubstrateAgent

# Create agent with specific provider
agent = SubstrateAgent(
    name="Analysis Agent",
    llm_provider="anthropic",
    model="claude-3-sonnet-20240229"
)

# Agent automatically uses the specified provider
result = await agent.process("analyze", data={"text": "Complex data to analyze"})

Multiple Agents with Different Providers

from daita import SubstrateAgent

# Each agent uses different provider optimized for its task
data_agent = SubstrateAgent(
    name="Data Processor",
    llm_provider="openai",
    model="gpt-4"
)

content_agent = SubstrateAgent(
    name="Content Generator",
    llm_provider="anthropic",
    model="claude-3-sonnet-20240229"
)

speed_agent = SubstrateAgent(
    name="Quick Responder",
    llm_provider="gemini",
    model="gemini-1.5-flash"
)

# Use each agent with its optimized provider
data_result = await data_agent.process("analyze", data)
content_result = await content_agent.process("generate", data)
speed_result = await speed_agent.process("summarize", data)

Error Handling

All providers implement consistent error handling:

from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

try:
    llm = create_llm_provider("openai", "gpt-4")
    response = await llm.generate("Your prompt here")
except LLMError as e:
    print(f"LLM error: {e}")
    # Handle provider-specific errors
except Exception as e:
    print(f"Unexpected error: {e}")

Common Error Types

  • Authentication errors - Invalid API keys
  • Rate limiting - Too many requests (see the retry sketch after this list)
  • Model errors - Invalid model names
  • Network errors - Connection issues
  • Token limit errors - Prompt too long
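
Rate limits and network errors are usually transient, so a simple retry with exponential backoff often clears them before you need to fall back to another provider. A minimal sketch, assuming transient failures surface as LLMError (the framework may expose more specific exception types):

import asyncio

from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

async def generate_with_retry(prompt: str, attempts: int = 3) -> str:
    """Retry transient LLM failures with exponential backoff (sketch)."""
    llm = create_llm_provider("openai", "gpt-4")
    delay = 1.0
    for attempt in range(attempts):
        try:
            return await llm.generate(prompt)
        except LLMError:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(delay)  # back off before the next attempt
            delay *= 2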

Token Tracking

Token usage is automatically tracked when using agents:

from daita import SubstrateAgent

# Create agent with LLM provider
agent = SubstrateAgent(
    name="My Agent",
    llm_provider="openai",
    model="gpt-4"
)

# Use the agent (token usage tracked automatically)
await agent.process("analyze", {"text": "Hello, world!"})

# Check token usage
usage = agent.get_token_usage()
print(f"Total tokens: {usage['total_tokens']}")
print(f"Prompt tokens: {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Estimated cost: ${usage['estimated_cost']:.4f}")

Custom Providers

You can register custom LLM providers for specialized use cases:

from daita.llm import register_llm_provider, create_llm_provider, BaseLLMProvider

class CustomProvider(BaseLLMProvider):
    """Custom LLM provider implementation."""

    async def generate(self, prompt: str, **kwargs) -> str:
        # Your custom implementation
        return "Custom response"

# Register the provider
register_llm_provider("custom", CustomProvider)

# Use the custom provider
llm = create_llm_provider("custom", "custom-model")
response = await llm.generate("Test prompt")

Best Practices

API Key Management

import os

# Use environment variables for API keys
os.environ["OPENAI_API_KEY"] = "sk-your-key"
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-your-key"

# Let the provider auto-detect
llm = create_llm_provider("openai", "gpt-4") # No API key needed

Model Selection

# Choose models based on task complexity
simple_llm = create_llm_provider("gemini", "gemini-1.5-flash") # Quick tasks
balanced_llm = create_llm_provider("anthropic", "claude-3-sonnet-20240229") # Balanced
powerful_llm = create_llm_provider("openai", "gpt-4") # Complex reasoning

Error Resilience

from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

async def resilient_generate(prompt: str, providers: list):
    """Try multiple providers for resilience."""
    for provider_config in providers:
        try:
            llm = create_llm_provider(**provider_config)
            return await llm.generate(prompt)
        except LLMError:
            continue
    raise LLMError("All providers failed")

# Usage
providers = [
    {"provider": "openai", "model": "gpt-4"},
    {"provider": "anthropic", "model": "claude-3-sonnet-20240229"},
    {"provider": "gemini", "model": "gemini-1.5-pro"}
]

response = await resilient_generate("Your prompt", providers)

Performance Optimization

# Use appropriate models for the task
quick_llm = create_llm_provider("gemini", "gemini-1.5-flash") # Fast responses
quality_llm = create_llm_provider("anthropic", "claude-3-opus-20240229") # High quality

# Optimize parameters
speed_optimized = create_llm_provider(
    "openai", "gpt-3.5-turbo",
    temperature=0.3,  # Lower for consistency
    max_tokens=500    # Limit for speed
)

Support

  • Provider Issues: Check provider-specific documentation and status pages
  • Integration Help: See agent documentation and examples
  • Custom Providers: Review the BaseLLMProvider class for implementation guidance
  • Token Optimization: Monitor usage with automatic tracing