from_db Schema Discovery
How Agent.from_db() discovers database schema, samples numeric values, caches schema, detects drift, and summarizes database structure.
#Discovery
Agent.from_db() connects to the database and discovers tables, columns, primary keys, foreign keys, row counts when available, and database metadata.
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    db_schema="public",
)
The normalized schema is available through the public DB context:
schema = agent.db.schema
print(schema["database_type"])
print(schema["table_count"])
#Target Schema
Use db_schema when the database has multiple schemas or namespaces:
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    db_schema="analytics",
)
#Sample Values
include_sample_values controls whether numeric columns are sampled during setup:
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    include_sample_values=True,
)
Sample values help the model infer scale and units, but they can add startup cost. The selected mode determines the default.
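To make the trade-off concrete, the sketch below samples numeric columns from an in-memory SQLite table using only the standard library. It is an illustration of the idea, not how Agent.from_db() samples: the function sample_numeric_columns, the orders table, and the limit parameter are all made up for this example.

```python
import sqlite3

# Substrings that mark a declared column type as numeric (an assumption for this sketch).
NUMERIC_TYPES = ("INT", "REAL", "NUM", "DEC", "FLOAT", "DOUBLE")


def sample_numeric_columns(conn, table, limit=3):
    """Return {column: {"min", "max", "samples"}} for numeric columns of a trusted table name."""
    result = {}
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    for _cid, name, col_type, *_rest in columns:
        if not any(t in (col_type or "").upper() for t in NUMERIC_TYPES):
            continue  # skip text, blob, and untyped columns
        lo, hi = conn.execute(
            f'SELECT MIN("{name}"), MAX("{name}") FROM "{table}"'
        ).fetchone()
        rows = conn.execute(
            f'SELECT DISTINCT "{name}" FROM "{table}" '
            f'WHERE "{name}" IS NOT NULL ORDER BY "{name}" LIMIT ?',
            (limit,),
        ).fetchall()
        result[name] = {"min": lo, "max": hi, "samples": [r[0] for r in rows]}
    return result


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER, discount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount_cents, discount) VALUES (?, ?, ?)",
    [("a", 1999, 0.0), ("b", 250000, 0.15), ("c", 4999, 0.05)],
)
samples = sample_numeric_columns(conn, "orders")
print(samples["amount_cents"])
```

A range of 1999 to 250000 on amount_cents hints that the column holds cents rather than whole currency units, which is the kind of scale information a model cannot get from the column type alone. The cost is also visible: this sketch issues two extra queries per numeric column, so wide schemas pay for sampling at startup.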
#PII Redaction During Sampling
redact_pii_columns=True skips sampling columns whose names look sensitive:
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    include_sample_values=True,
    redact_pii_columns=True,
)
This protects against accidental sampling from columns such as email, password, token, SSN, and similar fields.
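Name-based redaction can be pictured as a filter applied before sampling. The sketch below is one possible shape for such a filter. The pattern list and the helpers looks_sensitive and columns_to_sample are assumptions for illustration; the actual names the library treats as sensitive are not listed here.

```python
import re

# Hypothetical name fragments; the library's real list may differ.
SENSITIVE_NAME = re.compile(
    r"(^|_)(email|e_mail|password|passwd|pwd|token|secret|ssn|api_key|phone|dob)(_|$)",
    re.IGNORECASE,
)


def looks_sensitive(column_name: str) -> bool:
    """Return True when a column name contains a sensitive-looking word."""
    # Insert underscores at camelCase boundaries so passwordHash becomes password_Hash.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", column_name)
    return SENSITIVE_NAME.search(snake) is not None


def columns_to_sample(columns, redact_pii_columns=True):
    """Drop sensitive-looking columns from the sampling list when redaction is on."""
    if not redact_pii_columns:
        return list(columns)
    return [c for c in columns if not looks_sensitive(c)]


columns = ["id", "user_email", "passwordHash", "amount", "api_token", "ssn", "tokens_used"]
safe = columns_to_sample(columns)
print(safe)
```

Matching whole words between underscores keeps tokens_used samplable while api_token is skipped. A name-based check is only a heuristic: a column called notes can still contain personal data, so it does not replace access controls on the database itself.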
#Caching
Use cache_ttl to write and reuse a TTL-based schema cache:
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    cache_ttl=3600,
)
When a valid cache entry exists, from_db() can reuse it instead of running discovery again. When the cache expires, the schema is rediscovered and refreshed.
When cache_ttl=None, from_db() does not write a new TTL cache entry, but it can still reuse an existing schema snapshot or catalog snapshot if one is available. This lets warm local or hosted runtimes avoid discovery without opting into TTL cache writes.
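The two behaviors described above can be sketched with a small in-memory cache. This is a model of the semantics only, not the library's cache: the SchemaCache class, its load method, and the injected clock are hypothetical, and the sketch assumes that with cache_ttl=None an existing snapshot is reused regardless of its age.

```python
import time


class SchemaCache:
    """In-memory stand-in for a schema cache keyed by connection URL."""

    def __init__(self, clock=time.time):
        self._entries = {}  # url -> (stored_at, schema)
        self._clock = clock

    def load(self, url, discover, cache_ttl=None):
        entry = self._entries.get(url)
        if cache_ttl is None:
            # No TTL: reuse an existing snapshot if there is one, never write a new entry.
            return entry[1] if entry else discover()
        if entry and self._clock() - entry[0] < cache_ttl:
            return entry[1]  # still fresh, skip discovery
        schema = discover()  # missing or expired: rediscover and refresh
        self._entries[url] = (self._clock(), schema)
        return schema


now = [0.0]   # fake clock so the example does not need to sleep
calls = []    # records each time discovery actually runs


def discover():
    calls.append(now[0])
    return {"table_count": len(calls)}


cache = SchemaCache(clock=lambda: now[0])
url = "postgresql://user:pass@host/db"

first = cache.load(url, discover, cache_ttl=3600)   # discovers and writes an entry
now[0] = 1800
second = cache.load(url, discover, cache_ttl=3600)  # within TTL, entry reused
now[0] = 4000
third = cache.load(url, discover, cache_ttl=3600)   # expired, rediscovered
fourth = cache.load(url, discover)                  # cache_ttl=None reuses the snapshot
print(len(calls))
```

Discovery runs twice across the four calls: once for the cold start and once after expiry. A fresh SchemaCache called with cache_ttl=None would run discovery every time, because nothing is ever written.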
#Drift
When rediscovery finds differences from the cached schema, drift metadata is attached:
if agent.db.drift:
    print(agent.db.drift)
The compact agent description also reports drift status:
metadata = agent.describe()
print(metadata["db"]["drift_status"])
#Summaries and Suggested Questions
from_db() builds a compact database summary for runtime context and developer inspection:
print(agent.db.summary)
print(agent.db.suggested_questions)
The summary can include signals such as likely fact tables, entity tables, timestamp columns, candidate metrics, and starter questions.
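Signals like these can be derived from structure alone. The sketch below shows one plausible set of heuristics; it is not the library's algorithm. The input shape, the summarize function, and the rules (two or more foreign keys marks a fact table, a referenced table is an entity table) are assumptions chosen for illustration.

```python
# Column types treated as numeric in this sketch (an assumption).
NUMERIC = {"integer", "bigint", "numeric", "decimal", "real", "double precision"}


def summarize(tables):
    """tables: {name: {"columns": {col: type}, "primary_key": [cols],
    "foreign_keys": {col: referenced_table}}} (a hypothetical shape)."""
    referenced = {ref for t in tables.values() for ref in t["foreign_keys"].values()}
    summary = {
        "fact_tables": [],
        "entity_tables": [],
        "timestamp_columns": [],
        "candidate_metrics": [],
    }
    for name, t in tables.items():
        fks = t["foreign_keys"]
        if len(fks) >= 2:
            summary["fact_tables"].append(name)      # links several entities together
        elif name in referenced:
            summary["entity_tables"].append(name)    # other tables point at it
        for col, col_type in t["columns"].items():
            kind = col_type.lower()
            if "timestamp" in kind or "date" in kind:
                summary["timestamp_columns"].append(f"{name}.{col}")
            elif kind in NUMERIC and col not in fks and col not in t["primary_key"]:
                summary["candidate_metrics"].append(f"{name}.{col}")
    return summary


tables = {
    "customers": {
        "columns": {"id": "integer", "name": "text", "created_at": "timestamp"},
        "primary_key": ["id"],
        "foreign_keys": {},
    },
    "products": {
        "columns": {"id": "integer", "title": "text", "price": "numeric"},
        "primary_key": ["id"],
        "foreign_keys": {},
    },
    "orders": {
        "columns": {
            "id": "integer",
            "customer_id": "integer",
            "product_id": "integer",
            "quantity": "integer",
            "ordered_at": "timestamp",
        },
        "primary_key": ["id"],
        "foreign_keys": {"customer_id": "customers", "product_id": "products"},
    },
}
summary = summarize(tables)
print(summary)
```

Excluding keys from the metric list matters: summing customer_id is meaningless, while quantity and price are the columns a starter question would aggregate.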
#Large Schemas
For large schemas, from_db() does not need to inline every table and column into the prompt. Use a retrieval-oriented budget and let the agent rely on schema navigation tools:
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    budget="retrieval",
)
See Schema Navigation for the public tools used to inspect large schemas at run time.
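The idea behind a retrieval-oriented budget is to look up only the tables relevant to a question instead of inlining the whole schema. The sketch below illustrates that idea with plain word overlap. It is not the Schema Navigation tooling: relevant_tables, tokenize, top_k, and the example tables are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase words, treating underscores as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevant_tables(question, tables, top_k=2):
    """tables: {table_name: [column names]}; rank tables by word overlap with the question."""
    words = tokenize(question)
    scored = []
    for name, columns in tables.items():
        vocab = tokenize(name) | set().union(*(tokenize(c) for c in columns))
        score = len(words & vocab)
        if score:
            scored.append((score, name))
    # Highest score first, table name as a deterministic tie-breaker.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _score, name in scored[:top_k]]


tables = {
    "orders": ["id", "customer_id", "total_amount", "ordered_at"],
    "customers": ["id", "name", "country"],
    "audit_log": ["id", "actor", "action", "created_at"],
    "products": ["id", "title", "price"],
}
picked = relevant_tables("What is the total amount per customer country?", tables)
print(picked)
```

Only the two matching tables would need to enter the prompt, which is what keeps context size bounded as the schema grows. A real retrieval step would likely use stemming or embeddings rather than exact word matches.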