Scheduling & Cron Jobs

Daita provides powerful scheduling capabilities to run your agents and workflows automatically at specified intervals. Schedules are configured using cron expressions and deployed to AWS EventBridge for reliable, managed execution.

Overview

Schedule configuration allows you to:

  • Run agents and workflows on a recurring schedule (hourly, daily, weekly, etc.)
  • Set custom input data for scheduled executions
  • Control timezone for accurate scheduling
  • Enable/disable schedules without redeployment
  • Monitor scheduled task execution through the Daita dashboard

Schedules are defined in your daita-project.yaml file and automatically deployed to AWS EventBridge when you run daita push.

Configuration

Define schedules in the schedules section of your daita-project.yaml:

schedules:
  agents:
    data_processor:
      cron: "0 */6 * * ? *"  # Every 6 hours
      enabled: true
      timezone: "UTC"
      description: "Process accumulated data every 6 hours"
      data:
        batch_size: 1000

  workflows:
    backup_workflow:
      cron: "0 0 * * ? *"  # Daily at midnight
      enabled: true
      timezone: "UTC"
      description: "Daily backup workflow"

Configuration Fields

Field        Type     Required  Description
cron         string   Yes       Cron expression defining when to run
enabled      boolean  No        Whether the schedule is active (default: true)
timezone     string   No        Timezone for the schedule (default: "UTC")
description  string   No        Human-readable description of the schedule
data         object   No        Input data to pass to the agent/workflow

Cron Expression Format

Daita uses AWS EventBridge cron expressions, which require 6 fields (not the standard 5-field Unix format):

┌───────────── minute (0 - 59)
│ ┌─────────── hour (0 - 23)
│ │ ┌───────── day of month (1 - 31)
│ │ │ ┌─────── month (1 - 12 or JAN-DEC)
│ │ │ │ ┌───── day of week (1 - 7 or SUN-SAT)
│ │ │ │ │ ┌─── year (1970 - 2199)
│ │ │ │ │ │
* * * * ? *

Important differences from Unix cron:

  • 6 fields required - EventBridge adds a year field at the end
  • ? wildcard - Use ? for "any" in day-of-month or day-of-week fields. You cannot use * in both day fields
  • If you specify a day-of-week, use ? for day-of-month, and vice versa
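The day-field rule trips people up when porting standard crontab entries. A minimal sketch of the conversion, assuming a well-formed 5-field Unix expression (this helper is illustrative and not part of the Daita CLI):

```python
def to_eventbridge_cron(unix_cron: str) -> str:
    """Convert a 5-field Unix cron expression to the 6-field EventBridge form.

    Replaces '*' with '?' in one day field (EventBridge forbids '*' in both)
    and appends the required year field. Expressions that pin BOTH day fields
    (e.g. "0 0 1 * MON") cannot be expressed in EventBridge and are left as-is.
    """
    fields = unix_cron.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    minute, hour, dom, month, dow = fields
    if dow == "*":
        dow = "?"   # day-of-month drives the schedule
    elif dom == "*":
        dom = "?"   # day-of-week drives the schedule
    return " ".join([minute, hour, dom, month, dow, "*"])
```

For example, the Unix entry `0 9 * * MON-FRI` becomes `0 9 ? * MON-FRI *`.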

Common Patterns

# Every 15 minutes
cron: "*/15 * * * ? *"

# Every hour at minute 30
cron: "30 * * * ? *"

# Every 2 hours
cron: "0 */2 * * ? *"

# Every 6 hours (at 12am, 6am, 12pm, 6pm)
cron: "0 */6 * * ? *"

# Daily at 2:30 AM
cron: "30 2 * * ? *"

# Weekdays at 9 AM (use ? for day-of-month when day-of-week is specified)
cron: "0 9 ? * MON-FRI *"

# Every Monday at 9 AM
cron: "0 9 ? * MON *"

# First day of every month at midnight (use ? for day-of-week when day-of-month is specified)
cron: "0 0 1 * ? *"

# Every Sunday at midnight
cron: "0 0 ? * SUN *"

# Last day of the month is not directly supported
# Use daily job with date logic instead
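The "date logic" for a last-day-of-month job can be as simple as checking whether tomorrow rolls over to the 1st. A sketch of the check you might pair with a daily schedule such as "0 0 * * ? *" (standard-library only; the early-return shape inside an agent is an assumption):

```python
from datetime import date, timedelta

def is_last_day_of_month(today: date) -> bool:
    """True when `today` is the final day of its month.

    EventBridge has no 'L' (last-day) token, so a daily-scheduled job
    can run this check and exit early on every other day.
    """
    return (today + timedelta(days=1)).day == 1

# Inside a daily-scheduled agent, you might return early otherwise:
# if not is_last_day_of_month(date.today()):
#     return {"status": "skipped"}
```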

Named Days and Months

You can use names instead of numbers:

# Days: SUN, MON, TUE, WED, THU, FRI, SAT
cron: "0 9 ? * MON-FRI *" # Weekdays

# Months: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC
cron: "0 0 1 JAN,JUL ? *" # First day of January and July

Examples

Scheduled Data Processing Agent

Process data every 6 hours with custom batch size:

agents:
  - name: data_processor
    description: "Processes accumulated data"
    file: agents/data_processor.py
    enabled: true

schedules:
  agents:
    data_processor:
      cron: "0 */6 * * ? *"
      enabled: true
      timezone: "UTC"
      description: "Process accumulated data every 6 hours"
      data:
        batch_size: 1000
        mode: "incremental"

In your agent code, access the scheduled data:

from daita import agent

@agent()
def data_processor(inputs):
    batch_size = inputs.get('batch_size', 100)
    mode = inputs.get('mode', 'full')

    # Process data with configured parameters
    results = process_data(batch_size=batch_size, mode=mode)

    return {
        "status": "success",
        "records_processed": len(results)
    }

Weekly Report Generation

Generate reports every Monday morning:

agents:
  - name: report_generator
    description: "Generates weekly reports"
    file: agents/report_generator.py
    enabled: true

schedules:
  agents:
    report_generator:
      cron: "0 9 ? * MON *"
      enabled: true
      timezone: "America/New_York"
      description: "Generate weekly reports every Monday at 9 AM ET"
      data:
        report_type: "weekly"
        recipients: ["team@company.com"]

Scheduled Workflow

Run a multi-step workflow daily:

workflows:
  - name: daily_pipeline
    description: "Daily data pipeline"
    file: workflows/daily_pipeline.py
    enabled: true

schedules:
  workflows:
    daily_pipeline:
      cron: "0 2 * * ? *"
      enabled: true
      timezone: "UTC"
      description: "Run daily data pipeline at 2 AM UTC"
      data:
        include_external: true

Multiple Schedules for Same Agent

You can schedule the same agent multiple times with different configurations:

agents:
  - name: data_sync
    description: "Syncs data from external sources"
    file: agents/data_sync.py
    enabled: true

schedules:
  agents:
    # Frequent sync during business hours
    data_sync_frequent:
      cron: "0 9-17 ? * MON-FRI *"
      enabled: true
      timezone: "America/New_York"
      description: "Hourly sync during business hours"
      data:
        mode: "incremental"

    # Full sync overnight
    data_sync_full:
      cron: "0 0 * * ? *"
      enabled: true
      timezone: "America/New_York"
      description: "Full sync daily at midnight"
      data:
        mode: "full"

Note: To schedule the same agent multiple times, use different schedule keys (e.g., data_sync_frequent, data_sync_full).
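Since both schedules invoke the same agent with different `mode` values, the agent code typically branches on that field. A sketch of the branching logic (the `scope` values are placeholders for real sync behavior; in an actual agent this body would sit under the `@agent()` decorator):

```python
def data_sync(inputs):
    """Branch on the `mode` field supplied by whichever schedule fired.

    "incremental" is used as the default so ad-hoc (unscheduled)
    invocations with no input data still do something sensible.
    """
    mode = inputs.get("mode", "incremental")
    if mode == "full":
        scope = "all records"       # placeholder: full-table resync
    else:
        scope = "changed records"   # placeholder: delta sync only
    return {"status": "success", "mode": mode, "scope": scope}
```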

Timezone Support

Schedules respect timezone configuration to ensure accurate execution timing:

schedules:
  agents:
    morning_report:
      cron: "0 9 ? * MON-FRI *"
      timezone: "America/New_York"  # 9 AM Eastern Time

    evening_cleanup:
      cron: "0 18 * * ? *"
      timezone: "America/Los_Angeles"  # 6 PM Pacific Time

    midnight_job:
      cron: "0 0 * * ? *"
      timezone: "Europe/London"  # Midnight UK time

Common Timezones

  • UTC - Coordinated Universal Time
  • America/New_York - US Eastern Time
  • America/Chicago - US Central Time
  • America/Denver - US Mountain Time
  • America/Los_Angeles - US Pacific Time
  • Europe/London - UK Time
  • Europe/Paris - Central European Time
  • Asia/Tokyo - Japan Time
  • Australia/Sydney - Australian Eastern Time

Timezone values follow the IANA Time Zone Database; see the full list of tz database time zones for all supported values.
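The same local time maps to different UTC instants across daylight saving transitions, which is why EventBridge's timezone handling matters. A standard-library sketch showing what "9 AM America/New_York" means in UTC in winter versus summer:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def nine_am_utc(day: datetime) -> str:
    """Return the UTC wall time of 9:00 AM in New York on the given day.

    In January (EST, UTC-5) this is 14:00 UTC; in July (EDT, UTC-4)
    it is 13:00 UTC - the schedule "moves" in UTC while staying fixed
    in local time.
    """
    local = datetime(day.year, day.month, day.day, 9, 0,
                     tzinfo=ZoneInfo("America/New_York"))
    return local.astimezone(timezone.utc).strftime("%H:%M")

print(nine_am_utc(datetime(2024, 1, 15)))  # 14:00
print(nine_am_utc(datetime(2024, 7, 15)))  # 13:00
```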

Deployment Process

When you deploy your project with daita push, the CLI:

  1. Parses Schedule Configuration - Validates cron expressions and configuration
  2. Creates EventBridge Rules - Sets up AWS EventBridge rules with your cron expressions
  3. Links to Lambda Functions - Connects schedules to your deployed agents/workflows
  4. Stores Schedule Metadata - Tracks schedules in the Daita database

Deployment Output

During deployment, you'll see schedule information:

$ daita push

✓ Creating deployment package...
✓ Uploading package to secure API endpoint...
✓ Deploying to secure Lambda functions...
Schedules: 2 agents, 1 workflows
Agent data_processor: 0 */6 * * ? *
Agent report_generator: 0 9 ? * MON *
Workflow backup_workflow: 0 0 * * ? *
✓ Deployed to Daita-managed production
Deployment ID: 3a5b7c9d-1e2f-3a4b-5c6d-7e8f9a0b1c2d

Validation

The CLI validates schedules before deployment:

# Invalid cron expression
Error: Invalid cron expression: "0 25 * * ? *" (hour must be 0-23)

# Referencing non-existent agent
Error: Schedule references unknown agent 'nonexistent_agent'

# Missing required agent
Error: Cannot schedule 'data_processor' - agent not found in configuration
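The kinds of checks behind these errors can be approximated with a small validator. This is a simplified sketch, not the actual CLI implementation: it checks the field count, numeric ranges for minute and hour, and the "? required in one day field" rule:

```python
import re

def validate_eventbridge_cron(expr: str) -> list:
    """Return a list of problems found in a 6-field EventBridge cron
    expression (empty list means the expression passed these checks)."""
    fields = expr.split()
    if len(fields) != 6:
        return [f"expected 6 fields, got {len(fields)}"]
    minute, hour, dom, month, dow, year = fields

    errors = []
    # Check numeric values embedded in the minute and hour fields.
    for name, value, hi in (("minute", minute, 59), ("hour", hour, 23)):
        for n in re.findall(r"\d+", value):
            if int(n) > hi:
                errors.append(f"{name} must be 0-{hi}")
                break
    # EventBridge requires '?' in day-of-month or day-of-week.
    if dom != "?" and dow != "?":
        errors.append("use '?' in day-of-month or day-of-week")
    return errors
```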

Managing Schedules

View Active Schedules

Check which schedules are configured:

# Show deployment plan (includes schedules)
daita push --dry-run

Enable/Disable Schedules

Control schedule activation without redeployment:

schedules:
  agents:
    data_processor:
      cron: "0 */6 * * ? *"
      enabled: true  # Active - will execute

    backup_agent:
      cron: "0 0 * * ? *"
      enabled: false  # Disabled - won't execute

After updating enabled status, deploy to apply changes:

daita push

Update Schedule Timing

To change when a schedule runs, update the cron expression:

schedules:
  agents:
    report_generator:
      # Old: Every Monday at 9 AM
      # cron: "0 9 ? * MON *"

      # New: Every weekday at 9 AM
      cron: "0 9 ? * MON-FRI *"

Deploy to apply the updated schedule:

daita push

Monitor Schedule Execution

View execution logs for scheduled runs:

# View logs for a specific agent
daita logs data_processor

# Stream logs in real time
daita logs data_processor --follow

# Filter for scheduled executions
daita logs data_processor | grep "scheduled"

Environment-Specific Schedules

Use environment overrides to run schedules at different intervals per environment:

# Base configuration
# Base configuration
schedules:
  agents:
    data_processor:
      cron: "0 */6 * * ? *"  # Default: every 6 hours
      enabled: true

# Environment-specific overrides
environments:
  staging:
    schedules:
      agents:
        data_processor:
          cron: "0 */12 * * ? *"  # Staging: every 12 hours

  production:
    schedules:
      agents:
        data_processor:
          cron: "0 */3 * * ? *"  # Production: every 3 hours
          enabled: true

Deploy to specific environments:

# Deploy to staging (uses 12-hour schedule)
daita push --env staging

# Deploy to production (uses 3-hour schedule)
daita push --env production

Architecture

EventBridge Integration

Daita schedules use AWS EventBridge for reliable, managed execution:

  • EventBridge Rules - Each schedule creates an EventBridge rule with your cron expression
  • Lambda Targets - Rules trigger your deployed Lambda functions
  • Timezone Support - EventBridge handles timezone conversions automatically
  • State Management - Enable/disable schedules by updating rule state

Schedule Lifecycle

  1. Configuration - Define schedules in daita-project.yaml
  2. Validation - CLI validates cron expressions and references
  3. Deployment - EventBridge rules created with cron expressions
  4. Execution - EventBridge triggers Lambda functions on schedule
  5. Logging - Execution logs captured and available via daita logs

Best Practices

1. Use Descriptive Schedules

Always include clear descriptions:

schedules:
  agents:
    data_processor:
      cron: "0 2 * * ? *"
      description: "Process daily data at 2 AM UTC"  # Clear description

2. Choose Appropriate Timezones

Use timezones that match your business operations:

schedules:
  agents:
    morning_report:
      cron: "0 9 ? * MON-FRI *"
      timezone: "America/New_York"  # Matches business hours

3. Avoid Over-Scheduling

Don't schedule too frequently:

# ❌ Bad: Every minute (may cause throttling)
cron: "* * * * ? *"

# ✅ Good: Every 15 minutes (reasonable frequency)
cron: "*/15 * * * ? *"

4. Stagger Schedules

Spread out multiple schedules to avoid resource contention:

schedules:
  agents:
    processor_a:
      cron: "0 2 * * ? *"   # 2:00 AM

    processor_b:
      cron: "15 2 * * ? *"  # 2:15 AM (staggered by 15 minutes)

    processor_c:
      cron: "30 2 * * ? *"  # 2:30 AM (staggered by 30 minutes)
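When you have many jobs to stagger, generating the offsets is less error-prone than hand-editing each cron line. A hypothetical helper that assigns each job a daily EventBridge cron expression offset by a fixed step (rolling into the next hour past 59 minutes):

```python
def staggered_crons(names, start_hour=2, step_minutes=15):
    """Map each job name to a daily 6-field EventBridge cron expression,
    offsetting successive jobs by `step_minutes` from `start_hour`:00."""
    out = {}
    for i, name in enumerate(names):
        total = i * step_minutes
        hour = start_hour + total // 60   # roll into the next hour
        minute = total % 60
        out[name] = f"{minute} {hour} * * ? *"
    return out

# staggered_crons(["processor_a", "processor_b", "processor_c"])
# → {'processor_a': '0 2 * * ? *',
#    'processor_b': '15 2 * * ? *',
#    'processor_c': '30 2 * * ? *'}
```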

5. Test Schedule Configuration

Use dry-run to verify schedules before deployment:

# Verify schedule configuration
daita push --dry-run

# Deploy when ready
daita push

6. Handle Execution Data

Make scheduled executions resilient:

from daita import agent

@agent()
def scheduled_processor(inputs):
    # Provide defaults for all scheduled inputs
    batch_size = inputs.get('batch_size', 100)
    mode = inputs.get('mode', 'incremental')

    try:
        results = process_data(batch_size, mode)
        return {"status": "success", "count": len(results)}
    except Exception as e:
        # Log errors for debugging
        print(f"Schedule execution failed: {e}")
        return {"status": "error", "message": str(e)}

7. Monitor Scheduled Tasks

Regularly check execution logs:

# Check recent executions
daita logs scheduled_agent --tail 50

# Monitor for failures
daita logs scheduled_agent | grep -i error

Troubleshooting

Schedule Not Executing

Symptoms: Schedule configured but agent/workflow doesn't run

Solutions:

  1. Check schedule is enabled: enabled: true
  2. Verify cron expression is valid
  3. Check timezone matches expected execution time
  4. Review deployment logs for errors: daita logs <agent-name>

Wrong Execution Time

Symptoms: Schedule runs at unexpected times

Solutions:

  1. Verify timezone setting matches your expectation
  2. Remember cron uses 24-hour format (not AM/PM)
  3. Check if daylight saving time affects your timezone
  4. Test cron expression with crontab.guru

Execution Failures

Symptoms: Schedule triggers but agent fails

Solutions:

  1. Check agent code handles scheduled inputs correctly
  2. Verify scheduled data fields match agent expectations
  3. Review error logs: daita logs <agent-name>
  4. Test agent locally: daita test <agent-name>

Cannot Update Schedule

Symptoms: Changes to schedule not taking effect

Solutions:

  1. Ensure you run daita push after configuration changes
  2. Verify YAML syntax is correct (no indentation errors)
  3. Check for validation errors in deployment output

Limitations

Frequency Constraints

  • Minimum interval: 1 minute (e.g. "* * * * ? *"); sub-minute schedules are not supported
  • Rate limits: AWS EventBridge limits apply (check current quotas)

Cron Expression Support

EventBridge 6-field cron format required:

  • ✅ Supported: 0 */6 * * ? * (every 6 hours)
  • ✅ Supported: 0 9 ? * MON-FRI * (weekdays at 9 AM)
  • ✅ Supported: Named days/months (MON-FRI, JAN-DEC)
  • ✅ Supported: Ranges (9-17 for hours 9 AM to 5 PM)
  • ❌ Not supported: L for last day of month (use daily job with date logic)
  • ❌ Not supported: @daily, @hourly macros (use standard cron format)
  • ❌ Not supported: * in both day-of-month and day-of-week (use ? for one)

EventBridge Limits

AWS EventBridge quotas apply:

  • Rules per account: 300 by default (can be increased)
  • Invocations per second: Varies by region
  • See AWS EventBridge Quotas