Error Handling
The Daita error handling system classifies every exception by how it should be retried and uses that classification to drive automatic retry logic. The aim is to keep agents reliable while providing clear debugging information and graceful failure modes.
#Error Handling Philosophy
Every Daita exception carries a "retry hint" that tells the retry logic whether to retry immediately, retry with backoff, or not retry at all. Transient and retryable failures can therefore recover without manual intervention, while permanent failures surface right away.

    from daita.core.exceptions import TransientError, PermanentError, RetryableError

    # Transient errors - retry immediately
    raise TransientError("Rate limit exceeded")

    # Retryable errors - retry with exponential backoff
    raise RetryableError("Database temporarily unavailable")

    # Permanent errors - don't retry, fix the issue
    raise PermanentError("Invalid API key format")

#Exception Hierarchy
All Daita exceptions inherit from DaitaError and include retry hints and contextual information.
#Base Exception - DaitaError

    from daita.core.exceptions import DaitaError

    try:
        result = await some_operation()
    except DaitaError as e:
        if e.is_transient():
            print("Will retry immediately")
        elif e.is_retryable():
            print("Will retry with backoff")
        elif e.is_permanent():
            print("Manual intervention required")
        print(f"Context: {e.context}")

#Component-Specific Exceptions
| Exception | When to Use | Retry Hint |
|---|---|---|
| AgentError | Agent operation failures | Varies by cause |
| LLMError | LLM provider errors | Usually retryable |
| PluginError | Plugin/database errors | Usually retryable |
| ConfigError | Configuration issues | Always permanent |
| WorkflowError | Workflow failures | Varies by cause |
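
To make the retry-hint idea concrete, here is a minimal, self-contained sketch of how such a hierarchy could be modelled. It is not Daita's implementation: the classes are local stand-ins that mirror only the names shown in this section (`retry_hint`, `context`, `is_transient()`, `is_retryable()`, `is_permanent()`), and the `RetryHint` enum, the `default_hint` attribute, and the constructor signatures are assumptions made for illustration.

```python
from enum import Enum
from typing import Any, Dict, Optional


class RetryHint(Enum):
    """Hypothetical hint values; the real library may represent these differently."""
    TRANSIENT = "transient"
    RETRYABLE = "retryable"
    PERMANENT = "permanent"


class DaitaError(Exception):
    """Local stand-in for the base exception (assumed shape, not the real class)."""

    default_hint = RetryHint.RETRYABLE

    def __init__(self, message: str, retry_hint: Optional[RetryHint] = None,
                 context: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.retry_hint = retry_hint or self.default_hint
        self.context = context or {}

    def is_transient(self) -> bool:
        return self.retry_hint is RetryHint.TRANSIENT

    def is_retryable(self) -> bool:
        return self.retry_hint is RetryHint.RETRYABLE

    def is_permanent(self) -> bool:
        return self.retry_hint is RetryHint.PERMANENT


class ConfigError(DaitaError):
    """Configuration issues: always permanent, so callers cannot override the hint."""

    def __init__(self, message: str, context: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message, RetryHint.PERMANENT, context)


class AgentError(DaitaError):
    """Agent failures: the hint varies by cause, so callers pass it explicitly."""


config_err = ConfigError("Missing model name", context={"file": "agent.yaml"})
agent_err = AgentError("Tool call failed", retry_hint=RetryHint.TRANSIENT)
print(config_err.is_permanent(), agent_err.is_transient())
```

In this sketch, "Always permanent" is enforced by the subclass constructor, while "Varies by cause" is left to whoever raises the error.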
#Retry-Specific Exception Classes
#TransientError - Immediate Retry
Temporary issues that resolve quickly. Retried with minimal delay.
| Exception | Use Case | Additional Fields |
|---|---|---|
| RateLimitError | API rate limiting | retry_after |
| TimeoutError | Network timeouts | timeout_duration |
| ConnectionError | Connection failures | host, port |
| ServiceUnavailableError | Service downtime | service_name |

    # Note: importing TimeoutError here shadows Python's built-in TimeoutError in this module
    from daita.core.exceptions import RateLimitError, TimeoutError

    # Rate limiting
    try:
        response = await api_client.get("/data")
    except RateLimitError as e:
        print(f"Rate limited, retry after {e.retry_after}s")

    # Timeouts
    try:
        result = await slow_operation()
    except TimeoutError as e:
        print(f"Timed out after {e.timeout_duration}s")

#RetryableError - Exponential Backoff
Issues that may resolve with time. Retried with exponential backoff.
| Exception | Use Case | Additional Fields |
|---|---|---|
| ResourceBusyError | Resource contention | resource_name |
| DataInconsistencyError | Temporary inconsistency | data_source |
| ProcessingQueueFullError | Queue overload | queue_name |
#PermanentError - No Retry
Fundamental issues requiring manual intervention. Not retried.
| Exception | Use Case | Additional Fields |
|---|---|---|
| AuthenticationError | Invalid credentials | provider |
| PermissionError | Access denied | resource, action |
| ValidationError | Invalid data | field, value |
| NotFoundError | Missing resource | resource_type, resource_id |
#Automatic Retry Logic
Agents automatically retry failed operations based on error classification. See Overview for detailed retry configuration.
#Basic Configuration

    from daita import Agent
    from daita.config import RetryPolicy

    # Simple: Enable retry with defaults
    agent = Agent(
        name="Resilient Agent",
        enable_retry=True  # 3 retries, exponential backoff
    )

    # Advanced: Custom retry policy
    agent = Agent(
        name="Custom Agent",
        enable_retry=True,
        retry_policy=RetryPolicy(
            max_retries=5,
            initial_delay=2.0
        )
    )

    # Automatic retry handling
    result = await agent.run("Process this data")

How it works:
- Transient errors → Retry immediately with minimal delay
- Retryable errors → Retry with exponential backoff (1s, 2s, 4s, 8s...)
- Permanent errors → No retry, fail immediately
- Random jitter prevents thundering herd
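
The rules above can be sketched as a small retry loop. This is an illustrative stand-in, not Daita's retry engine: the exception classes are redefined locally so the snippet runs on its own, and `retry_call`, `backoff_delay`, the 0.1s "minimal delay", and the 10% jitter range are hypothetical names and values chosen for the example.

```python
import asyncio
import random


class TransientError(Exception):
    """Local stand-in: retried almost immediately."""


class RetryableError(Exception):
    """Local stand-in: retried with exponential backoff."""


class PermanentError(Exception):
    """Local stand-in: never retried."""


def backoff_delay(attempt: int, initial_delay: float = 1.0) -> float:
    """Base delay before retry number `attempt` (0-based): 1s, 2s, 4s, 8s, ..."""
    return initial_delay * (2 ** attempt)


async def retry_call(operation, max_retries: int = 3, initial_delay: float = 1.0,
                     sleep=asyncio.sleep):
    """Await `operation()`, retrying according to the class of error it raises."""
    for attempt in range(max_retries + 1):
        try:
            return await operation()
        except PermanentError:
            raise  # no retry: fail immediately
        except (TransientError, RetryableError) as exc:
            if attempt == max_retries:
                raise  # retries exhausted
            if isinstance(exc, TransientError):
                delay = 0.1  # assumed value for "minimal delay"
            else:
                delay = backoff_delay(attempt, initial_delay)
            # Random jitter keeps many clients from retrying at the same instant.
            delay += random.uniform(0, 0.1 * delay)
            await sleep(delay)
```

The `sleep` parameter exists only so that tests can substitute a fake and avoid real waiting. With `max_retries=5` and `initial_delay=2.0`, this sketch would wait roughly 2s, 4s, 8s, 16s, and 32s (plus jitter) between attempts on retryable errors.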
#Error Handling Patterns
#Basic Error Handling

    from daita import Agent
    from daita.core.exceptions import AgentError, LLMError, ValidationError

    agent = Agent(name="MyAgent", enable_retry=True)

    try:
        result = await agent.run("Process this data")
        print(f"Success: {result}")
    except ValidationError as e:
        # Permanent error - automatic retry will not help
        print(f"Invalid input: {e}")
        # Fix data and try again
    except LLMError as e:
        # LLM provider errors - usually retryable
        if e.is_permanent():
            print(f"API key issue: {e}")
        else:
            print(f"Temporary LLM error: {e}")
            # Automatic retry handles this
    except AgentError as e:
        # Agent errors with context
        print(f"Agent failed: {e}")
        print(f"Context: {e.context}")
    except Exception as e:
        # Unexpected errors
        print(f"Unexpected error: {e}")
        raise

#Graceful Degradation

    from daita.core.exceptions import LLMError

    # ai_agent is an Agent created elsewhere; the two fallback helpers are application code
    async def get_recommendations(user_id):
        """Get recommendations with fallback strategies."""
        try:
            # Primary: AI recommendations
            return await ai_agent.run(f"Recommend for user {user_id}")
        except LLMError:
            # Fallback: Rule-based recommendations
            return get_rule_based_recommendations(user_id)
        except Exception:
            # Final fallback: Popular items
            return get_popular_items()

#Error Monitoring & Debugging
All errors are automatically traced through Daita's built-in tracing system. See Automatic Tracing for details.
#Error Information

    from daita.core.exceptions import DaitaError

    try:
        result = await agent.run("Process data")
    except DaitaError as e:
        # All Daita exceptions include:
        print(f"Error: {e}")
        print(f"Retry hint: {e.retry_hint}")
        print(f"Context: {e.context}")

        # Component-specific fields
        if hasattr(e, 'agent_id'):
            print(f"Agent: {e.agent_id}")
        if hasattr(e, 'provider'):
            print(f"Provider: {e.provider}")

#Logging Best Practices
- Use structured logging with error context
- Log permanent errors as errors, transient as warnings
- Include retry attempt information
- Track error rates and trends
- Set up alerts for high error rates
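
As one way to apply the first three points, the sketch below uses Python's standard `logging` and `json` modules to emit one JSON record per failed attempt. The exception classes are local stand-ins for the library's classes, and the `log_failure` helper, the logger name, and the record fields are hypothetical choices, not part of Daita.

```python
import json
import logging

logger = logging.getLogger("daita.errors")  # logger name is an assumption


class PermanentError(Exception):
    """Local stand-in for the library's permanent error class."""


class TransientError(Exception):
    """Local stand-in for the library's transient error class."""


def log_failure(exc: Exception, attempt: int, max_retries: int, **context) -> int:
    """Log one failed attempt as a JSON record and return the level used."""
    # Permanent errors need attention; anything else may still recover on retry.
    level = logging.ERROR if isinstance(exc, PermanentError) else logging.WARNING
    record = {
        "error_type": type(exc).__name__,
        "message": str(exc),
        "attempt": attempt,
        "max_retries": max_retries,
        "context": context,
    }
    logger.log(level, json.dumps(record, default=str))
    return level


logging.basicConfig(level=logging.INFO)
log_failure(TransientError("Rate limit exceeded"), attempt=1, max_retries=3,
            operation="fetch_data")
log_failure(PermanentError("Invalid API key format"), attempt=1, max_retries=3,
            operation="fetch_data")
```

Emitting each record as a single JSON line makes it straightforward for a log aggregator to count errors by `error_type` and alert on rates.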
#Best Practices
Exception Types:
- Use specific exception types (ValidationError, RateLimitError, etc.), not generic Exception
- Raise TransientError for temporary issues (network, rate limits)
- Raise RetryableError for resource contention or queue issues
- Raise PermanentError for auth, validation, or configuration errors
- Include descriptive error messages
Error Context:
- Always include relevant context in exceptions
- Add operation name, user ID, resource IDs to context
- Use create_contextual_error() to wrap standard exceptions
- Include timestamps for debugging
Graceful Degradation:
- Implement fallback strategies for critical features
- Try AI → Rule-based → Static fallbacks
- Return cached data when services are unavailable
- Don't fail completely when non-critical features break
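
The "return cached data" point might look like the following minimal sketch. `ServiceUnavailableError` is redefined locally as a stand-in for the library's class, and `fetch_with_cache` and the in-memory `_cache` dictionary are hypothetical; a real application would likely use a cache with expiry.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class ServiceUnavailableError(Exception):
    """Local stand-in for the library's exception of the same name."""


_cache: Dict[str, Any] = {}


async def fetch_with_cache(key: str, fetch: Callable[[], Awaitable[Any]],
                           default: Any = None) -> Any:
    """Return fresh data when possible, else the last cached value, else `default`."""
    try:
        value = await fetch()
    except ServiceUnavailableError:
        # Serve stale data rather than failing the whole request.
        return _cache.get(key, default)
    _cache[key] = value  # remember the latest good result
    return value
```

Only `ServiceUnavailableError` triggers the fallback here; other exceptions still propagate so that permanent problems are not hidden behind stale data.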
Monitoring & Alerting:
- Track error rates and patterns
- Alert on high authentication failure rates
- Monitor rate limiting frequency
- Track timeout rates for performance issues
- Use structured logging for analysis
Testing:
- Test that transient errors are retried
- Verify permanent errors don't retry
- Mock failures to test error paths
- Test fallback strategies work correctly
- Verify error context is preserved
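
A sketch of the first three points using `unittest.mock.AsyncMock` from the standard library follows. Because the real retry logic lives inside the agent, the example tests a tiny hypothetical helper, `call_with_retry`, and local stand-in exception classes; the same mocking pattern should carry over to code that wraps an actual agent.

```python
import asyncio
from unittest.mock import AsyncMock


class TransientError(Exception):
    """Local stand-in for the library's transient error class."""


class PermanentError(Exception):
    """Local stand-in for the library's permanent error class."""


async def call_with_retry(operation, max_retries: int = 3):
    """Tiny retry helper used only as the system under test."""
    for attempt in range(max_retries + 1):
        try:
            return await operation()
        except TransientError:
            if attempt == max_retries:
                raise


def test_transient_errors_are_retried():
    # Fail twice, then succeed: the operation should be awaited three times.
    operation = AsyncMock(side_effect=[TransientError("t1"), TransientError("t2"), "ok"])
    assert asyncio.run(call_with_retry(operation)) == "ok"
    assert operation.await_count == 3


def test_permanent_errors_are_not_retried():
    operation = AsyncMock(side_effect=PermanentError("bad credentials"))
    try:
        asyncio.run(call_with_retry(operation))
    except PermanentError as exc:
        assert str(exc) == "bad credentials"  # error details are preserved
    else:
        raise AssertionError("PermanentError should have propagated")
    assert operation.await_count == 1
```

Both functions follow the `test_*` naming convention, so a runner such as pytest can collect them, and they can also be called directly.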
#Next Steps
- Agent - Agent creation and configuration
- Automatic Tracing - Error monitoring and debugging
- Plugins - Database and API error handling
- Overview - Retry configuration details