Agent.from_db()
Build a fully configured analyst agent from a database connection string or plugin. Discovers schema automatically, generates a system prompt, and registers query and analysis tools — ready to use in one call.
#Overview
Agent.from_db() creates a production-ready data analyst agent from a database connection. It connects, discovers the schema, generates a rich system prompt, and returns a configured Agent with query tools pre-registered.
from daita import Agent
agent = await Agent.from_db("postgresql://user:pass@localhost/analytics")
await agent.start()
result = await agent.run("What were the top 5 products by revenue last quarter?")

No manual schema setup, no tool registration, no system prompt writing — from_db() handles all of it.
#Supported Databases
Pass a connection string or a plugin instance:
# PostgreSQL
agent = await Agent.from_db("postgresql://user:pass@host:5432/mydb")
# MySQL
agent = await Agent.from_db("mysql://user:pass@host:3306/mydb")
# SQLite (file-based)
agent = await Agent.from_db("sqlite:///./data.db")
# SQLite (in-memory)
agent = await Agent.from_db("sqlite:///:memory:")
# Existing plugin instance
from daita.plugins import postgresql
db = postgresql(host="localhost", database="analytics")
agent = await Agent.from_db(db)

#Parameters
await Agent.from_db(
    source,                       # connection string or plugin instance (required)
    *,
    name=None,                    # agent name (default: "{domain} database agent")
    model=None,                   # LLM model override
    api_key=None,                 # LLM API key override
    llm_provider=None,            # LLM provider override
    prompt=None,                  # user context prepended to auto-generated prompt
    db_schema=None,               # DB schema name (e.g. "public" for PostgreSQL)
    include_sample_values=True,   # include sample numeric values in prompt
    redact_pii_columns=True,      # skip sampling PII-named columns
    lineage=None,                 # True or LineagePlugin instance
    memory=None,                  # True or MemoryPlugin instance
    history=None,                 # True or ConversationHistory instance
    cache_ttl=None,               # schema cache TTL in seconds
    read_only=True,               # omit write tools when True
    toolkit="analyst",            # analyst toolkit to register
    **agent_kwargs,               # forwarded to Agent.__init__
)

#Core Parameters
- `source` (str | BaseDatabasePlugin): Connection string or plugin instance. Required.
- `name` (str): Agent name. Defaults to `"{domain} database agent"`, where the domain is inferred from the schema.
- `prompt` (str): Additional context prepended to the auto-generated schema prompt. Use this to describe business rules, preferred query patterns, or domain terminology.
- `read_only` (bool): When `True` (default), write tools (`sqlite_execute`, `execute_sql`, etc.) are omitted from the agent. Recommended for production.
- `toolkit` (str | None): Analyst toolkit to register. `"analyst"` (default) adds 6 in-process analysis tools. `None` disables the toolkit.
#LLM Configuration
- `model`, `api_key`, `llm_provider`: Override the LLM settings. When omitted, they are auto-detected from environment variables.
- `**agent_kwargs`: Forwarded to `Agent.__init__` — use this to pass `display_reasoning=True`, `relay=...`, or any other Agent parameter.
#Schema Discovery
- `db_schema` (str): Target DB schema (e.g. `"public"` for PostgreSQL, `"main"` for SQLite). Discovered automatically when `None`.
- `include_sample_values` (bool): When `True`, numeric columns are sampled and sample values appear in the system prompt. This helps the LLM reason about scale and units.
- `redact_pii_columns` (bool): When `True`, columns whose names match common PII patterns (`email`, `password`, `ssn`, etc.) are excluded from sampling.
- `cache_ttl` (int | None): Cache schema discovery results for this many seconds. `None` disables caching. Useful for large schemas or slow connections.
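The PII redaction described above amounts to name-based filtering before any values are sampled. A minimal, stdlib-only sketch of that filtering follows; the pattern list and function name here are illustrative, not daita's actual internals:

```python
import re

# Illustrative PII name patterns; the real list used by from_db() may differ.
PII_PATTERNS = re.compile(r"(email|password|ssn|phone|address|dob)", re.IGNORECASE)

def sampleable_columns(columns: list[str], redact_pii: bool = True) -> list[str]:
    """Return the columns eligible for value sampling, skipping PII-named ones."""
    if not redact_pii:
        return columns
    return [c for c in columns if not PII_PATTERNS.search(c)]

print(sampleable_columns(["id", "amount", "user_email", "ssn_hash"]))
# ['id', 'amount']
```

Note that the match is a substring search, so `user_email` and `ssn_hash` are both excluded, mirroring the conservative behavior the default `redact_pii_columns=True` implies.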
#How Schema Discovery Works
from_db() connects to the database and uses dialect-specific discovery:
- PostgreSQL/MySQL: queries `information_schema.columns` and `information_schema.table_constraints`
- SQLite: uses `PRAGMA table_info()` and `sqlite_master`
- Other dialects: falls back to the plugin's `tables()` and `describe()` methods
The discovered schema is embedded in the agent's system prompt, so the LLM understands the full table structure without needing to call discovery tools. Schema discovery tools (`sqlite_list_tables`, `postgres_inspect`, etc.) are removed from the tool registry to save ~250–300 tokens per LLM call.
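For intuition, the SQLite path can be approximated with the standard library alone. This is a sketch of the general technique (query `sqlite_master` for tables, then `PRAGMA table_info()` per table), not daita's actual implementation:

```python
import sqlite3

def discover_sqlite_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Sketch of SQLite discovery via sqlite_master and PRAGMA table_info()."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [f"{name} {ctype}" for _, name, ctype, *_ in cols]
    return schema

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
print(discover_sqlite_schema(conn))
# {'orders': ['id INTEGER', 'amount REAL']}
```

A result like this, rendered as text, is what gets embedded into the system prompt in place of runtime discovery tool calls.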
#Caching and Drift Detection
# Cache schema for 1 hour
agent = await Agent.from_db(
"postgresql://user:pass@host/db",
cache_ttl=3600
)

When `cache_ttl` is set:
- On a cache hit, discovery is skipped entirely
- On expiry, the schema is re-discovered and the cache is updated
- If re-discovery fails, the expired cache is used with a warning
- If the new schema differs from the cached one, drift is logged and the system prompt is updated
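The four rules above can be sketched as a small TTL cache with drift detection. This is a hypothetical helper for illustration, not daita's internal cache:

```python
import time

class SchemaCache:
    """Minimal sketch of TTL-based schema caching with drift detection."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._schema = None
        self._fetched_at = 0.0

    def get(self, discover):
        now = time.monotonic()
        if self._schema is not None and now - self._fetched_at < self.ttl:
            return self._schema  # cache hit: skip discovery entirely
        try:
            fresh = discover()
        except Exception:
            if self._schema is not None:
                # re-discovery failed: fall back to the expired cache
                print("warning: re-discovery failed, using expired schema cache")
                return self._schema
            raise
        if self._schema is not None and fresh != self._schema:
            # drift: the live schema no longer matches the cached one
            print("schema drift detected; system prompt should be regenerated")
        self._schema, self._fetched_at = fresh, now
        return fresh
```

On a hit the `discover` callable is never invoked, which is why `cache_ttl` helps when starting multiple agents against the same wide schema.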
#Analyst Toolkit
With `toolkit="analyst"` (the default), six in-process analysis tools are registered:
| Tool | Description |
|---|---|
| `pivot_table` | Pivot rows into a cross-tabulation |
| `correlate` | Compute a column correlation matrix |
| `detect_anomalies` | Statistical outlier detection |
| `compare_entities` | Compare two entities row by row |
| `find_similar` | Find rows most similar to a reference row |
| `forecast_trend` | Simple trend forecasting on time-series data |
These tools run in-process (no extra LLM call) and operate on data already fetched by the query tools.
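To make "in-process analysis over already-fetched rows" concrete, here is a sketch of what a pivot over query results might look like. The signature and behavior are illustrative assumptions, not the toolkit's actual API:

```python
from collections import defaultdict

def pivot_table(rows, index, column, value):
    """Cross-tabulate rows of dicts: sum `value` grouped by (index, column)."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[column]] += row[value]
    return {k: dict(v) for k, v in table.items()}

rows = [
    {"region": "EU", "quarter": "Q1", "revenue": 100},
    {"region": "EU", "quarter": "Q2", "revenue": 150},
    {"region": "US", "quarter": "Q1", "revenue": 200},
]
print(pivot_table(rows, "region", "quarter", "revenue"))
# {'EU': {'Q1': 100.0, 'Q2': 150.0}, 'US': {'Q1': 200.0}}
```

Because this runs in the agent process on data a query tool already returned, no additional LLM call or database round trip is needed.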
#Optional Integrations
#Lineage
Automatically creates a LineagePlugin and registers foreign-key relationships discovered during schema discovery:
agent = await Agent.from_db(
"postgresql://user:pass@host/db",
lineage=True
)
# agent._db_lineage is the LineagePlugin instance
# Or pass your own:
from daita.plugins import lineage
lin = lineage()
agent = await Agent.from_db("postgresql://...", lineage=lin)

#Memory
Auto-creates a MemoryPlugin and runs numeric column calibration on first use:
agent = await Agent.from_db(
"postgresql://user:pass@host/db",
memory=True
)

Calibration infers column units (e.g. "amount is in USD cents") and stores them in memory. Subsequent agent runs use this context automatically.
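The calibration heuristic itself is not specified here, but the flavor of such unit inference can be sketched from sampled values. Everything below (function name, pattern checks, thresholds) is an illustrative assumption, not MemoryPlugin's actual logic:

```python
from statistics import median

def calibrate_numeric_column(name: str, samples: list[float]) -> str:
    """Illustrative unit inference from a column name and sampled values.

    Heuristic sketch only; the real calibration logic is not documented here.
    """
    m = median(samples)
    if "amount" in name or "price" in name:
        # Large all-integer monetary values often indicate a minor unit
        if all(float(s).is_integer() for s in samples) and m >= 1000:
            return f"{name} looks like an integer minor-currency unit (e.g. cents)"
        return f"{name} looks like a major-currency unit (e.g. dollars)"
    return f"{name}: median sampled value {m}"
```

A note like the returned string, once stored in memory, is the kind of context that later runs pick up automatically.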
#Conversation History
Enables conversational drilldown across multiple run() calls:
agent = await Agent.from_db(
"postgresql://user:pass@host/db",
history=True
)
await agent.start()
await agent.run("How many orders last week?")
await agent.run("Break that down by region")  # references prior context

#Error Handling
from daita import Agent
from daita.core.exceptions import AgentError
try:
    agent = await Agent.from_db("postgresql://user:wrong@host/db")
except AgentError as e:
    if "Failed to connect" in str(e):
        print("Check database credentials and host")
    elif "Schema discovery failed" in str(e):
        print("Connected but schema discovery failed — check the db_schema parameter")
except ValueError as e:
    print(f"Invalid connection string: {e}")

#Best Practices
- Use `read_only=True` (the default) in production to prevent accidental writes.
- Provide a `prompt` with business context (e.g. column naming conventions, currency units, time zone) to improve answer accuracy.
- Enable `cache_ttl` for large schemas or when starting multiple agents from the same database — schema discovery can be slow on wide schemas.
- Use `history=True` for interactive Q&A workflows where users follow up on prior questions.
- Set `name` explicitly to a meaningful identifier — it appears in logs, traces, and the system prompt.
#Next Steps
- Watch — Monitor database state and react to changes
- PostgreSQL Plugin — Direct database access
- SQLite Plugin — Lightweight file-based databases
- Agent — Full agent configuration reference