from_db Schema Discovery

# Discovery

Agent.from_db() connects to the database and discovers tables, columns, primary keys, foreign keys, row counts when available, and database metadata.

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    db_schema="public",
)
```

The normalized schema is available through the public DB context:

```python
schema = agent.db.schema
print(schema["database_type"])
print(schema["table_count"])
```
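
The snippets on this page use await at the top level, which works in a notebook or async REPL but not in a plain script. A minimal sketch of wrapping the calls in a coroutine follows. StubAgent is a hypothetical stand-in so the example runs without a database, and its schema values are made up; in real code the library's Agent would take its place.

```python
import asyncio
from types import SimpleNamespace


class StubAgent:
    """Hypothetical stand-in for Agent so the sketch runs without a database."""

    @classmethod
    async def from_db(cls, url, db_schema=None):
        agent = cls()
        # Only the two schema keys shown on this page are modeled; values are made up.
        agent.db = SimpleNamespace(
            schema={"database_type": "postgresql", "table_count": 3}
        )
        return agent


async def main():
    # In real code, replace StubAgent with Agent.
    agent = await StubAgent.from_db(
        "postgresql://user:pass@host/db",
        db_schema="public",
    )
    return agent.db.schema


# asyncio.run() starts an event loop, runs the coroutine, and returns its result.
schema = asyncio.run(main())
print(schema["database_type"], schema["table_count"])
```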

# Target Schema

Use db_schema when the database has multiple schemas or namespaces:

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    db_schema="analytics",
)
```

# Sample Values

include_sample_values controls whether numeric columns are sampled during setup:

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    include_sample_values=True,
)
```

Sample values help the model infer scale and units, but they can add startup cost. When include_sample_values is not passed, the default depends on the selected mode.

# PII Redaction During Sampling

redact_pii_columns=True skips sampling columns whose names look sensitive:

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    include_sample_values=True,
    redact_pii_columns=True,
)
```

This protects against accidental sampling from columns such as email, password, token, SSN, and similar fields.
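
To make the idea of a name-based check concrete, here is a minimal sketch of one way such a filter could work. The pattern list and the helper names (looks_sensitive, columns_to_sample) are assumptions for illustration; the library's actual matching rules are not documented here.

```python
import re

# Hypothetical name fragments; the library's actual list is not documented here.
SENSITIVE_PATTERN = re.compile(
    r"(email|e_mail|password|passwd|secret|token|ssn|api_key)",
    re.IGNORECASE,
)


def looks_sensitive(column_name: str) -> bool:
    """Return True when a column name contains a sensitive-looking fragment."""
    return SENSITIVE_PATTERN.search(column_name) is not None


def columns_to_sample(columns: list[str]) -> list[str]:
    """Keep only the columns whose names do not look sensitive."""
    return [name for name in columns if not looks_sensitive(name)]


columns = ["id", "user_email", "order_total", "reset_token", "created_at"]
print(columns_to_sample(columns))
```

A check like this only sees column names, so sensitive data stored in a neutrally named column (for example notes) would still be sampled.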

# Caching

Use cache_ttl to write and reuse a TTL-based schema cache:

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    cache_ttl=3600,
)
```

When a valid cache entry exists, from_db() can reuse it instead of running discovery again. When the cache expires, the schema is rediscovered and refreshed.

When cache_ttl=None, from_db() does not write a new TTL cache entry, but it can still reuse an existing schema snapshot or catalog snapshot if one is available. This lets warm local or hosted runtimes avoid discovery without opting into TTL cache writes.
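
The reuse rules above can be sketched with a small in-memory cache. This is a simplified model, not the library's implementation: SchemaCache and its load method are hypothetical, the snapshot and the TTL entry are treated as the same object, and an existing snapshot is assumed to be reused regardless of age when cache_ttl is None.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CacheEntry:
    schema: dict
    stored_at: float  # seconds, read from the injected clock


class SchemaCache:
    """Hypothetical in-memory TTL cache; not the library's implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entry: Optional[CacheEntry] = None

    def load(self, discover: Callable[[], dict], cache_ttl: Optional[float]) -> dict:
        entry = self._entry
        if entry is not None:
            age = self._clock() - entry.stored_at
            # Assumption: with no TTL, any existing snapshot is reused as-is.
            if cache_ttl is None or age < cache_ttl:
                return entry.schema
        schema = discover()
        # Only write a cache entry when the caller opted in with a TTL.
        if cache_ttl is not None:
            self._entry = CacheEntry(schema, self._clock())
        return schema


# Demo with a fake clock so expiry can be simulated without waiting.
now = [0.0]
calls = []


def discover():
    calls.append(now[0])
    return {"table_count": len(calls)}


cache = SchemaCache(clock=lambda: now[0])
first = cache.load(discover, cache_ttl=3600)   # miss: discovers and writes
now[0] = 1800.0
second = cache.load(discover, cache_ttl=3600)  # hit: entry is still valid
now[0] = 4000.0
third = cache.load(discover, cache_ttl=3600)   # expired: rediscovers and refreshes
fourth = cache.load(discover, cache_ttl=None)  # reuses the snapshot, writes nothing
print(len(calls))
```

Injecting the clock keeps the expiry logic testable; discovery runs twice in the demo, once for the initial miss and once after the entry expires.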

# Drift

When rediscovery finds differences from the cached schema, drift metadata is attached:

```python
if agent.db.drift:
    print(agent.db.drift)
```

The compact agent description also reports drift status:

```python
metadata = agent.describe()
print(metadata["db"]["drift_status"])
```
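
As an illustration of what comparing a cached schema with a rediscovered one can involve, the sketch below diffs two table-to-columns mappings. The function diff_schemas and the layout of its result are hypothetical; the actual shape of agent.db.drift is not documented here.

```python
def diff_schemas(cached: dict, fresh: dict) -> dict:
    """Compare two {table: [columns]} mappings and report what changed."""
    added_tables = sorted(set(fresh) - set(cached))
    removed_tables = sorted(set(cached) - set(fresh))
    changed = {}
    # Only tables present in both snapshots can have column-level changes.
    for table in sorted(set(cached) & set(fresh)):
        old_cols, new_cols = set(cached[table]), set(fresh[table])
        if old_cols != new_cols:
            changed[table] = {
                "added_columns": sorted(new_cols - old_cols),
                "removed_columns": sorted(old_cols - new_cols),
            }
    return {
        "added_tables": added_tables,
        "removed_tables": removed_tables,
        "changed_tables": changed,
    }


cached = {"users": ["id", "email"], "orders": ["id", "total"]}
fresh = {"users": ["id", "email", "created_at"], "invoices": ["id"]}
drift = diff_schemas(cached, fresh)
print(drift)
```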

# Summaries and Suggested Questions

from_db() builds a compact database summary for runtime context and developer inspection:

```python
print(agent.db.summary)
print(agent.db.suggested_questions)
```

The summary can include signals such as likely fact tables, entity tables, timestamp columns, candidate metrics, and starter questions.

# Large Schemas

For large schemas, from_db() does not need to inline every table and column into the prompt. Use a retrieval-oriented budget and let the agent rely on schema navigation tools:

```python
agent = await Agent.from_db(
    "postgresql://user:pass@host/db",
    budget="retrieval",
)
```
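
To show why retrieval can replace inlining, here is a rough sketch of picking only the tables relevant to a question by keyword overlap. It is a generic illustration, not how the library selects tables: select_tables, tokenize, and the example schema are all hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_tables(question: str, schema: dict[str, list[str]], limit: int = 5) -> list[str]:
    """Rank tables by how many question words appear in their table or column names."""
    question_words = tokenize(question)
    scored = []
    for table, columns in schema.items():
        names = tokenize(table)
        for column in columns:
            names |= tokenize(column)
        score = len(question_words & names)
        if score > 0:
            scored.append((score, table))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [table for _, table in scored[:limit]]


schema = {
    "customers": ["id", "name"],
    "orders": ["id", "customer_id", "total", "created_at"],
    "audit_log": ["id", "payload"],
}
print(select_tables("total revenue per customer last month", schema))
```

Only the matching tables would then need to be described to the model, which keeps the prompt size independent of the total number of tables.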

See Schema Navigation for the public tools used to inspect large schemas at run time.