BigQuery Plugin
Google BigQuery data warehouse connection and querying for agents, built on `google-cloud-bigquery` with asyncio executors.
# Installation

```bash
pip install 'daita-agents[bigquery]'
```

# Quick Start
```python
from daita import Agent
from daita.plugins import bigquery

# Create plugin
bq = bigquery(
    project="my-gcp-project",
    dataset="analytics"
)

# Create agent with BigQuery access
agent = Agent(
    name="Warehouse Analyst",
    model="gpt-4o-mini",
    prompt="You are a data analyst. Query BigQuery to answer questions.",
    tools=[bq]
)

await agent.start()
result = await agent.run("What were our top 10 products by revenue last quarter?")
```

# Direct Usage
The plugin can be used directly without agents for programmatic access:
```python
from daita.plugins import bigquery

async with bigquery(project="my-project", dataset="analytics") as bq:
    results = await bq.query("SELECT * FROM users LIMIT 10")
    tables = await bq.tables()
    schema = await bq.describe("users")
```

# Connection Parameters
```python
bigquery(
    project: Optional[str] = None,
    dataset: Optional[str] = None,
    credentials_path: Optional[str] = None,
    location: Optional[str] = None,
    timeout: int = 300,
    read_only: bool = False
)
```

# Parameters
- `project` (str): GCP project ID. Required. Falls back to the `BIGQUERY_PROJECT` or `GOOGLE_CLOUD_PROJECT` env vars
- `dataset` (str): Default dataset for unqualified table references. Falls back to `BIGQUERY_DATASET`
- `credentials_path` (str): Path to a service account JSON key file. Falls back to `GOOGLE_APPLICATION_CREDENTIALS`. If omitted, Application Default Credentials are used
- `location` (str): BigQuery location (default: `"US"`). Falls back to `BIGQUERY_LOCATION`
- `timeout` (int): Query timeout in seconds (default: 300)
- `read_only` (bool): When `True`, restricts the plugin to SELECT queries (default: `False`)
# Environment Variables

| Variable | Maps to |
|---|---|
| `BIGQUERY_PROJECT` / `GOOGLE_CLOUD_PROJECT` | `project` |
| `BIGQUERY_DATASET` | `dataset` |
| `GOOGLE_APPLICATION_CREDENTIALS` | `credentials_path` |
| `BIGQUERY_LOCATION` | `location` |
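Each setting falls back from the explicit argument to its env vars, then to a default. A minimal sketch of that resolution order, using a hypothetical `resolve_setting` helper rather than the plugin's internal code:

```python
import os

def resolve_setting(explicit, *env_vars, default=None):
    # Fallback order: explicit argument -> first non-empty env var -> default
    if explicit is not None:
        return explicit
    for var in env_vars:
        value = os.environ.get(var)
        if value:
            return value
    return default

os.environ.pop("BIGQUERY_PROJECT", None)
os.environ["GOOGLE_CLOUD_PROJECT"] = "demo-project"
project = resolve_setting(None, "BIGQUERY_PROJECT", "GOOGLE_CLOUD_PROJECT")
# project == "demo-project"
```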
# Authentication

# Application Default Credentials (recommended)

```bash
gcloud auth application-default login
```

```python
bq = bigquery(project="my-project", dataset="analytics")
```

# Service Account Key
```python
bq = bigquery(
    project="my-project",
    dataset="analytics",
    credentials_path="/path/to/service-account.json"
)
```

# Core Methods

# query()
Execute a SELECT query and return rows as a list of dicts:
```python
results = await bq.query("SELECT name, email FROM users WHERE active = true LIMIT 100")
# [{"name": "Alice", "email": "alice@example.com"}, ...]
```

Parameterized queries use `%s` placeholders (automatically converted to BigQuery's `@p0`, `@p1` syntax):
```python
results = await bq.query(
    "SELECT * FROM events WHERE event_type = %s AND created_at > %s",
    params=["purchase", "2026-01-01"]
)
```

# execute()
Execute DML/DDL statements and return the number of affected rows:
```python
affected = await bq.execute(
    "DELETE FROM logs WHERE created_at < %s",
    params=["2025-01-01"]
)
print(f"Deleted {affected} rows")
```

# tables()
List all tables in a dataset:
```python
table_list = await bq.tables()                    # uses default dataset
table_list = await bq.tables(dataset="other_ds")  # specific dataset
```

# datasets()
List all datasets in the project:
```python
ds_list = await bq.datasets()
```

# describe()
Get column schema for a table:
```python
columns = await bq.describe("users")
# [{"column_name": "id", "data_type": "INT64", "is_nullable": "NO"}, ...]
```

# count_rows() / sample_rows()
```python
count = await bq.count_rows("events")
count = await bq.count_rows("events", filter="event_type = 'purchase'")
sample = await bq.sample_rows("events", n=10)
```

# Available Tools
When used with an agent, BigQuery exposes these LLM-callable tools:
| Tool | Description | Key Parameters |
|---|---|---|
| `bigquery_query` | Run a SELECT query | `sql`, `params`, `focus` |
| `bigquery_inspect` | List tables and column schemas | `dataset`, `tables` |
| `bigquery_count` | Count rows (optionally filtered) | `table`, `filter` |
| `bigquery_sample` | Random sample of rows | `table`, `n` |
| `bigquery_list_datasets` | List all datasets | — |
| `bigquery_execute` | Run DML/DDL (write mode only) | `sql`, `params` |

`bigquery_execute` is only available when the plugin is not in read-only mode.
# Tool Usage Example

```python
from daita import Agent
from daita.plugins import bigquery

bq = bigquery(project="analytics-prod", dataset="warehouse")

agent = Agent(
    name="Data Analyst",
    prompt="You are a BigQuery analyst. Help users explore and query the data warehouse.",
    tools=[bq]
)

await agent.start()
result = await agent.run("""
    List all datasets, then inspect the tables in the 'sales' dataset.
    How many orders were placed last month?
""")
await agent.stop()
```

# Read-Only Mode
Restrict the agent to SELECT queries only:
```python
bq = bigquery(project="prod-project", dataset="analytics", read_only=True)
```

In read-only mode, `bigquery_execute` is not exposed as a tool.
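One common way to enforce this kind of restriction, shown here as a sketch only and not as the plugin's actual implementation, is a guard that rejects write statements before they are sent to BigQuery:

```python
import re

# Hypothetical guard: match statements that modify data or schema
WRITE_STATEMENT = re.compile(
    r"^\s*(INSERT|UPDATE|DELETE|MERGE|CREATE|DROP|ALTER|TRUNCATE)\b",
    re.IGNORECASE,
)

def assert_read_only(sql: str) -> str:
    # Raise before the statement ever reaches the warehouse
    if WRITE_STATEMENT.match(sql):
        raise ValueError("write statements are blocked in read-only mode")
    return sql

assert_read_only("SELECT * FROM users LIMIT 5")  # passes
# assert_read_only("DELETE FROM logs")           # raises ValueError
```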
# Table Name Resolution
Tables are automatically qualified with project and dataset:
```python
# These are equivalent when project="my-project" and dataset="analytics":
await bq.query("SELECT * FROM users LIMIT 5")
await bq.query("SELECT * FROM analytics.users LIMIT 5")
await bq.query("SELECT * FROM `my-project`.analytics.users LIMIT 5")
```

Note that project IDs containing hyphens must be backtick-quoted in BigQuery SQL.

# Context Manager Usage
```python
from daita.plugins import bigquery

async with bigquery(project="my-project", dataset="analytics") as bq:
    results = await bq.query("SELECT COUNT(*) as cnt FROM events")
    print(f"Total events: {results[0]['cnt']}")
# Automatically disconnected
```

# Error Handling
```python
from daita.plugins import bigquery

try:
    async with bigquery(project="my-project", dataset="analytics") as bq:
        results = await bq.query("SELECT * FROM users LIMIT 10")
except ImportError:
    print("Install BigQuery support: pip install 'daita-agents[bigquery]'")
except ValueError as e:
    print(f"Configuration error: {e}")
```

# Best Practices
Authentication:
- Use Application Default Credentials in development
- Use service accounts with minimal permissions in production
- Never commit credential files to source control

Performance:
- Always include `LIMIT` in exploratory queries; the tool auto-appends `LIMIT 50` if omitted
- Use partitioned and clustered tables for large datasets
- Set an appropriate `timeout` for long-running analytical queries

Cost Management:
- Use `read_only=True` in production to prevent accidental DML
- Prefer `count_rows()` over `SELECT COUNT(*)` for simple counts
- Use `bigquery_inspect` to understand the schema before writing complex queries
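For estimating cost before running a query, `google-cloud-bigquery` supports dry runs (`QueryJobConfig(dry_run=True)`), which report `total_bytes_processed` without executing anything. Converting that figure to an on-demand cost is simple arithmetic; the per-TiB rate below is an assumption, so check current GCP pricing for your edition and region:

```python
def estimated_cost_usd(bytes_processed: int, price_per_tib: float = 6.25) -> float:
    # On-demand analysis is billed per TiB scanned (rate assumed above)
    return bytes_processed / (1024 ** 4) * price_per_tib

print(estimated_cost_usd(5 * 1024 ** 4))  # 5 TiB scanned -> 31.25
```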
# Troubleshooting

| Issue | Solution |
|---|---|
| `google-cloud-bigquery` not installed | `pip install 'daita-agents[bigquery]'` |
| "BigQuery project is required" | Set the `project` parameter or the `BIGQUERY_PROJECT` env var |
| "Dataset is required" | Set the `dataset` parameter or the `BIGQUERY_DATASET` env var |
| Permission denied | Check IAM roles — agent needs BigQuery Data Viewer at minimum |
| Query timeout | Increase the `timeout` parameter or optimize the query |
# Next Steps
- Snowflake Plugin — Another data warehouse plugin
- PostgreSQL Plugin — Relational database operations
- Plugin Overview — All available plugins