
LLM Providers

The Daita framework provides a unified interface for working with multiple Large Language Model (LLM) providers. The system uses a factory pattern that allows easy switching between providers while maintaining a consistent interface for agents.

#Overview

Daita supports six LLM providers out of the box with full streaming support:

  • OpenAI - GPT-5.4 mini, GPT-5.4, GPT-5.5, and other OpenAI models
  • Anthropic - Claude family models (Haiku, Sonnet, Opus)
  • xAI Grok - Grok 4 and vision models
  • Google Gemini - Gemini 2.5 Flash/Lite and Gemini Pro models
  • Ollama - Local models (Llama, Mistral, Gemma, Codestral, Phi, etc.)
  • Mock Provider - Testing and development without API calls

All providers support real-time streaming for both text generation and tool calling, enabling transparent agent execution with live progress updates.

#Environment Variables

Set API keys for the providers you'll use:

| Provider | Environment Variable | Key Format |
| --- | --- | --- |
| OpenAI | OPENAI_API_KEY | sk-... |
| Anthropic | ANTHROPIC_API_KEY | sk-ant-... |
| Google | GOOGLE_API_KEY or GEMINI_API_KEY | AIza... |
| xAI | XAI_API_KEY or GROK_API_KEY | xai-... |
| Ollama | (none required) | |

bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

#Quick Start

Import and instantiate providers directly:

python
from daita.llm import OpenAIProvider, AnthropicProvider, GrokProvider
 
# OpenAI
llm = OpenAIProvider(model="gpt-5.4-mini")
response = await llm.generate("Hello, world!")
 
# Anthropic Claude
llm = AnthropicProvider(model="claude-haiku-4-5")
response = await llm.generate("Analyze this data...")
 
# xAI Grok
llm = GrokProvider(model="grok-4.20")
response = await llm.generate("What's the latest news?")

#Factory Pattern

Use the factory when the provider is determined at runtime:

python
from daita.llm import create_llm_provider
 
provider_name = config.get("llm_provider")  # From config
llm = create_llm_provider(provider_name, "gpt-5.4-mini")
response = await llm.generate("Hello, world!")

#Streaming

All providers support real-time streaming for text and tool calling:

python
from daita.llm import OpenAIProvider
 
llm = OpenAIProvider(model="gpt-5.4-mini")
 
# Stream text generation
async for chunk in llm.generate("Write a story", stream=True):
    if chunk.type == "text":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "tool_call_complete":
        print(f"\nTool: {chunk.tool_name}({chunk.tool_args})")

Chunk Types:

  • "text" - Text content (field: content)
  • "tool_call_complete" - Tool call (fields: tool_name, tool_args, tool_call_id)

All providers return the same chunk format for consistent handling across different LLMs.
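
Because every provider emits the same chunk types, a single streaming handler can be reused across providers. A minimal sketch (print_stream is a hypothetical helper, built only on the generate(..., stream=True) API and chunk fields shown above):

python
from daita.llm import OpenAIProvider, AnthropicProvider

async def print_stream(llm, prompt: str) -> None:
    """Render any provider's stream using the shared chunk format."""
    async for chunk in llm.generate(prompt, stream=True):
        if chunk.type == "text":
            print(chunk.content, end="", flush=True)
        elif chunk.type == "tool_call_complete":
            print(f"\nTool: {chunk.tool_name}({chunk.tool_args})")

# The same handler works regardless of the underlying provider
await print_stream(OpenAIProvider(model="gpt-5.4-mini"), "Write a story")
await print_stream(AnthropicProvider(model="claude-haiku-4-5"), "Write a story")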

#Factory Function

The create_llm_provider() factory function is useful when you need to select providers dynamically at runtime (e.g., from configuration files). In most cases, direct instantiation is simpler and more Pythonic:

python
from daita.llm import create_llm_provider
 
llm = create_llm_provider(
    provider="openai",           # Provider name
    model="gpt-5.4-mini",       # Model identifier
    api_key="sk-...",           # API key (optional if set in environment)
    agent_id="my_agent",        # For token tracking (optional)
    temperature=0.7,            # Model parameters (optional)
    max_tokens=1000             # Additional provider-specific options
)

#Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| provider | str | Yes | Provider name: 'openai', 'anthropic', 'grok', 'gemini', 'ollama', or 'mock' |
| model | str | Yes | Model identifier specific to the provider |
| api_key | str | No | API key (uses environment variables if not provided) |
| agent_id | str | No | Agent ID for token usage tracking |
| **kwargs | dict | No | Additional provider-specific parameters |

#Registry

List available providers:

python
from daita.llm import list_available_providers
 
providers = list_available_providers()
print(providers)  # ['openai', 'anthropic', 'grok', 'gemini', 'ollama', 'mock']

#OpenAI Provider

The OpenAI provider supports OpenAI chat models including GPT-5.4 mini, GPT-5.4, GPT-5.5, and legacy GPT-4 variants.

#Configuration

python
from daita.llm import OpenAIProvider
 
# Basic OpenAI configuration
llm = OpenAIProvider(
    model="gpt-5.4-mini",
    api_key="sk-your-openai-key"
)
 
# Advanced configuration with custom parameters
llm = OpenAIProvider(
    model="gpt-5.4-mini",
    api_key="sk-your-openai-key",
    temperature=0.7,
    max_completion_tokens=1000,
    reasoning_effort="medium",
    service_tier="auto",
    parallel_tool_calls=True,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    timeout=60
)

max_tokens is still accepted as a convenience alias. For current OpenAI models, Daita sends it as max_completion_tokens by default. Set use_legacy_max_tokens=True when targeting older OpenAI-compatible endpoints that still expect max_tokens.
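
For example, a minimal sketch of both paths, assuming use_legacy_max_tokens is passed at construction like the other options above:

python
from daita.llm import OpenAIProvider

# Current models: max_tokens is forwarded as max_completion_tokens
llm = OpenAIProvider(model="gpt-5.4-mini", max_tokens=1000)

# Older OpenAI-compatible endpoint that still expects max_tokens
legacy_llm = OpenAIProvider(
    model="gpt-4",
    max_tokens=1000,
    use_legacy_max_tokens=True
)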

#Advanced Features

python
# Custom parameters with conversation
messages = [
    {"role": "system", "content": "You are an expert code reviewer."},
    {"role": "user", "content": "Analyze this code for bugs: def foo(): return x"}
]
response = await llm.generate(messages, temperature=0.3, max_tokens=2000)
 
# Tool calling (function calling)
from daita.core.tools import tool
 
@tool
async def get_weather(location: str) -> dict:
    """Get weather for a location."""
    return {"temp": 72, "condition": "sunny"}
 
# Use tools with generate()
response = await llm.generate("What's the weather like?", tools=[get_weather])

#Anthropic Provider

The Anthropic provider supports Claude family models with their unique capabilities and safety features.

#Configuration

python
from daita.llm import AnthropicProvider
 
# Basic Anthropic configuration
llm = AnthropicProvider(
    model="claude-haiku-4-5",
    api_key="sk-ant-your-anthropic-key"
)
 
# Advanced configuration
llm = AnthropicProvider(
    model="claude-sonnet-4-5",
    api_key="sk-ant-your-anthropic-key",
    temperature=0.5,
    max_tokens=2000,
    timeout=90
)

#Claude-Specific Features

python
# Long-form content generation
response = await llm.generate(
    prompt="Write a comprehensive analysis of...",
    max_tokens=4000,
    temperature=0.7
)
 
# Document analysis with large context
response = await llm.generate(
    prompt=f"Analyze this document: {large_document_text}",
    max_tokens=1000
)

#xAI Grok Provider

The Grok provider connects to xAI's Grok models, which are optimized for real-time information and conversational AI.

#Configuration

python
from daita.llm import GrokProvider
 
# Basic Grok configuration
llm = GrokProvider(
    model="grok-4.20",
    api_key="xai-your-api-key"
)
 
# Configuration with custom base URL
llm = GrokProvider(
    model="grok-vision-beta",
    api_key="xai-your-api-key",
    base_url="https://api.x.ai/v1",
    timeout=60
)

#Grok-Specific Features

python
# Real-time information queries
response = await llm.generate(
    prompt="What's happening in tech news today?",
    temperature=0.8
)
 
# Vision capabilities (grok-vision-beta)
llm_vision = GrokProvider(model="grok-vision-beta")
response = await llm_vision.generate_with_image(
    prompt="Describe this image",
    image_path="./screenshot.png"
)

#Google Gemini Provider

The Gemini provider supports Google's latest generative AI models with multimodal capabilities.

#Configuration

python
from daita.llm import GeminiProvider
 
# Basic Gemini configuration
llm = GeminiProvider(
    model="gemini-2.5-flash-lite",
    api_key="AIza-your-google-api-key"
)
 
# Advanced configuration with safety settings
llm = GeminiProvider(
    model="gemini-2.5-flash",
    api_key="AIza-your-google-api-key",
    temperature=0.9,
    top_k=40,
    response_mime_type="application/json",
    safety_settings={
        "HARM_CATEGORY_HARASSMENT": "BLOCK_MEDIUM_AND_ABOVE",
        "HARM_CATEGORY_HATE_SPEECH": "BLOCK_MEDIUM_AND_ABOVE"
    },
    generation_config={
        "candidate_count": 1,
        "max_output_tokens": 2048
    }
)

Gemini provider calls also accept stop_sequences, response_schema, and thinking_config. These are forwarded into GenerateContentConfig alongside standard options such as temperature, top_p, and max_tokens.
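
A minimal sketch passing these options through generate(), reusing the llm configured above (the schema and values are illustrative):

python
# Structured JSON output with custom stop sequences (illustrative values)
response = await llm.generate(
    prompt="Extract the title from this article...",
    response_mime_type="application/json",
    response_schema={"type": "object", "properties": {"title": {"type": "string"}}},
    stop_sequences=["END"],
    thinking_config={"thinking_budget": 0},  # assumption: forwarded into GenerateContentConfig
    max_tokens=512
)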

#Gemini-Specific Features

python
# Large context processing
response = await llm.generate(
    prompt=f"Summarize this entire codebase: {massive_code_text}",
    max_tokens=1000
)
 
# Multimodal capabilities
response = await llm.generate_with_media(
    prompt="Explain what's happening in this video",
    media_path="./demo_video.mp4"
)
 
# Safety-filtered generation
response = await llm.generate(
    prompt="Generate content about...",
    safety_settings={
        "HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_LOW_AND_ABOVE"
    }
)

#Ollama Provider

The Ollama provider connects to a locally running Ollama server via its OpenAI-compatible API, letting you run agents against any model available through ollama pull: Llama 3.1, Mistral, Gemma 2, Codestral, Phi 3, and more.

No API key is required. Ollama must be running locally (or at a reachable URL).

#Configuration

python
from daita.llm import OllamaProvider
 
# Basic — uses localhost:11434 by default
llm = OllamaProvider(model="llama3.1")
 
# Custom server URL
llm = OllamaProvider(
    model="mistral",
    base_url="http://gpu-server:11434/v1",
    timeout=120
)

The server URL can also be set via the OLLAMA_BASE_URL environment variable.
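
For example:

bash
export OLLAMA_BASE_URL="http://gpu-server:11434/v1"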

#Using with Agents

python
from daita import Agent
 
agent = Agent(
    name="Local Agent",
    llm_provider="ollama",
    model="llama3.1",
)
 
await agent.start()
result = await agent.run("Analyze this data...")
await agent.stop()

#Supported Models

Any model available via ollama pull works. Common choices:

| Model | Use Case | Command |
| --- | --- | --- |
| llama3.1 | General purpose | ollama pull llama3.1 |
| mistral | Fast, balanced | ollama pull mistral |
| codestral | Code generation | ollama pull codestral |
| gemma2 | Lightweight, efficient | ollama pull gemma2 |
| phi3 | Small but capable | ollama pull ph3 |

#Error Handling

The Ollama provider produces clear diagnostics for common issues:

  • Connection refused — "Cannot connect to Ollama at ... Is Ollama running? Start it with: ollama serve"
  • Cloud environment — "Ollama is a local-only LLM provider and cannot run in Daita Cloud. Use a cloud provider instead."
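
A minimal sketch of handling these at the call site, assuming connection failures surface as LLMError like other provider errors (see Error Handling below):

python
from daita.llm import OllamaProvider
from daita.core.exceptions import LLMError

llm = OllamaProvider(model="llama3.1")

try:
    response = await llm.generate("Hello!")
except LLMError as e:
    # e.g. "Cannot connect to Ollama at ... Is Ollama running? ..."
    print(f"Ollama error: {e}")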

#Limitations

  • Local only — Ollama cannot run in Daita Cloud deployments. Use a cloud provider (OpenAI, Anthropic, Gemini, Grok) for hosted agents.
  • Tool calling — Supported, but quality varies by model. Llama 3.1 and Mistral have the best tool-calling support.
  • Streaming — Fully supported via the OpenAI-compatible streaming API.

#Mock Provider

The Mock provider is designed for testing and development without making actual API calls or incurring costs.

#Configuration

python
from daita.llm import create_llm_provider

# Basic mock configuration
llm = create_llm_provider(
    provider="mock",
    model="test-model",
    agent_id="test_agent"
)
 
# Mock with custom responses
llm = create_llm_provider(
    provider="mock",
    model="gpt-4-mock",
    responses=["Hello! This is a mock response.", "Another mock response."],
    delay=0.5  # Simulate API latency
)

#Features

  • No API calls - Returns predefined responses
  • Latency simulation - Configurable delays to simulate real API behavior
  • Token tracking - Simulates token usage for testing
  • Error simulation - Can simulate API failures for error handling tests

#Mock-Specific Configuration

python
# Detailed mock setup
llm = create_llm_provider(
    provider="mock",
    model="claude-mock",
    responses=[
        "This is the first mock response.",
        "This is the second mock response.",
        "This is the third mock response."
    ],
    delay=1.0,           # 1 second delay
    cycle_responses=True, # Cycle through responses
    simulate_tokens=True, # Track mock token usage
    error_rate=0.1       # 10% chance of simulated errors
)
 
# Use in tests
response = await llm.generate("Any prompt")
print(response)  # Returns one of the mock responses

#Multi-Provider Usage

You can use multiple providers in the same application for different use cases:

python
from daita.llm import create_llm_provider
 
# Different providers for different tasks
openai_llm = create_llm_provider("openai", "gpt-5.4-mini", agent_id="analyzer")
anthropic_llm = create_llm_provider("anthropic", "claude-haiku-4-5", agent_id="writer")
gemini_llm = create_llm_provider("gemini", "gemini-2.5-flash-lite", agent_id="summarizer")
 
# Use appropriate provider for each task
analysis = await openai_llm.generate("Analyze this data: ...")
content = await anthropic_llm.generate("Write an article about: ...")
summary = await gemini_llm.generate("Summarize this document: ...")

#Using with Agents

Agents use providers automatically when you specify llm_provider:

python
from daita import Agent
 
# Agent uses specified provider
agent = Agent(
    name="Analyst",
    llm_provider="anthropic",
    model="claude-haiku-4-5"
)
 
await agent.start()
result = await agent.run("Analyze this data")

See Agent documentation for complete agent usage.

#Error Handling

All providers implement consistent error handling:

python
from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError
 
try:
    llm = create_llm_provider("openai", "gpt-5.4-mini")
    response = await llm.generate("Your prompt here")
except LLMError as e:
    print(f"LLM error: {e}")
    # Handle provider-specific errors
except Exception as e:
    print(f"Unexpected error: {e}")

#Common Error Types

  • Authentication errors - Invalid API keys
  • Rate limiting - Too many requests (see the retry sketch below)
  • Model errors - Invalid model names
  • Network errors - Connection issues
  • Token limit errors - Prompt too long
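
Rate limits are usually transient, so a common mitigation is to retry with exponential backoff. A minimal sketch (generate_with_backoff is a hypothetical helper, not part of Daita; whether a given LLMError is worth retrying depends on the provider):

python
import asyncio

from daita.llm import create_llm_provider
from daita.core.exceptions import LLMError

async def generate_with_backoff(llm, prompt: str, retries: int = 3) -> str:
    """Retry transient LLM errors with exponential backoff."""
    for attempt in range(retries):
        try:
            return await llm.generate(prompt)
        except LLMError:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # 1s, 2s, 4s, ...

llm = create_llm_provider("openai", "gpt-5.4-mini")
response = await generate_with_backoff(llm, "Your prompt here")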

#Token Tracking

Token usage is automatically tracked when using agents:

python
from daita import Agent
 
# Create agent with LLM provider
agent = Agent(
    name="My Agent",
    llm_provider="openai",
    model="gpt-5.4-mini"
)
 
# Use the agent (token usage tracked automatically)
await agent.run("analyze this text", data={"text": "Hello, world!"})
 
# Check token usage
usage = agent.get_token_usage()
print(f"Total tokens: {usage['total_tokens']}")
print(f"Prompt tokens: {usage['prompt_tokens']}")
print(f"Completion tokens: {usage['completion_tokens']}")
print(f"Estimated cost: ${usage['estimated_cost']:.4f}")

#Custom Providers

You can register custom LLM providers for specialized use cases:

python
from daita.llm import register_llm_provider, BaseLLMProvider
 
class CustomProvider(BaseLLMProvider):
    """Custom LLM provider implementation."""
 
    async def generate(self, prompt: str, **kwargs) -> str:
        # Your custom implementation
        return "Custom response"
 
# Register the provider
register_llm_provider("custom", CustomProvider)
 
# Use the custom provider
llm = create_llm_provider("custom", "custom-model")
response = await llm.generate("Test prompt")

#Best Practices

API Keys:

  • Store keys in environment variables, never hardcode
  • Use different keys for development and production
  • Rotate keys regularly

Model Selection:

  • Fast tasks: Gemini 2.5 Flash-Lite, GPT-5.4 mini
  • Balanced: Claude Sonnet, GPT-5.4
  • Complex reasoning: GPT-5.5, Claude Opus

Error Handling:

  • Always wrap LLM calls in try-except blocks
  • Handle rate limiting with exponential backoff
  • Consider fallback providers for resilience

Performance:

  • Use streaming for better user experience
  • Set appropriate max_tokens to control costs
  • Monitor token usage with agent tracing

#Next Steps