v0.16.0
April 17, 2026

Skills System, GCP Catalog Support & Graph Traversal Tools


# [0.16.0] - 2026-04-17

Major release introducing the Skills system — a new primitive that bundles domain instructions with tools — alongside full Google Cloud Platform catalog coverage, a modular per-source normalizer architecture, agent-facing graph traversal tools, and a comprehensive exception hierarchy refactor.

# Added

  • Skills System (daita/skills/)

    A new first-class primitive sitting between raw tools and full plugins. Skills bundle domain-specific instructions with related tools, carrying behavioral intelligence that plugins (infrastructure connectors) do not.

    ```python
    from daita import Agent, Skill, tool

    @tool
    def format_report(data: list, title: str) -> str:
        """Render a markdown report."""
        ...

    report_skill = Skill(
        name="report_gen",
        description="Produces polished analytical reports",
        instructions="Always render results as markdown with a title and bulleted rows.",
        tools=[format_report],
    )

    agent = Agent(name="Analyst", llm_provider="openai", model="gpt-4o")
    agent.add_skill(report_skill)
    ```

    Subclass BaseSkill for dynamic instruction generation, plugin dependencies (via requires()), or instructions loaded from a file. Skills are LifecyclePlugin subclasses, so on_before_run hooks automatically inject instructions into the system prompt. A new SkillError exception covers skill-specific failures (e.g., missing required plugins).

    BaseSkill, Skill, and SkillError are exported from the top-level daita package.
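
The subclassing path can be sketched roughly as follows. Only `requires()` is confirmed above; the `get_instructions` hook, the `severity_floor` field, and the stand-in base class are illustrative assumptions so the snippet is self-contained:

```python
# Stand-in base class for illustration only; the real BaseSkill lives in
# daita.skills and is a LifecyclePlugin subclass.
class BaseSkill:
    name = "base"

    def requires(self):
        """Plugin names this skill depends on."""
        return []

    def get_instructions(self):
        """Hypothetical hook for dynamic instruction generation."""
        raise NotImplementedError


class SecurityReviewSkill(BaseSkill):
    """Instructions computed per-instance instead of hard-coded."""
    name = "security_review"

    def __init__(self, severity_floor="medium"):
        self.severity_floor = severity_floor  # illustrative parameter

    def requires(self):
        # In the real system, a missing plugin would surface as a SkillError.
        return ["catalog"]

    def get_instructions(self):
        return f"Flag findings at severity >= {self.severity_floor}."
```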

  • GCP Catalog Support (daita/plugins/catalog/gcp.py)

    Complete Google Cloud Platform infrastructure discovery, matching the AWS coverage introduced in 0.14.0. The new GCPDiscoverer enumerates resources across a project with per-service discoverers for:

    • BigQuery — datasets, tables, and column schemas
    • Firestore — collections and document schemas
    • Bigtable — instances, tables, and column families
    • Cloud Storage (GCS) — buckets and inferred file schemas
    • Pub/Sub — topics and subscriptions
    • Memorystore — Redis instances
    • API Gateway — APIs, configs, and gateways

    ```python
    from daita.plugins import catalog

    cat = catalog()
    schema = await cat.discover_gcp(
        project_id="my-project",
        credentials_path="service-account.json",
    )
    ```

    Install with pip install 'daita-agents[gcp]'. The [cloud] and [all] bundles now pull in the full GCP stack.

  • Modular Normalizer Architecture (daita/plugins/catalog/normalizer/)

    The monolithic normalizer.py (746 lines) has been split into per-source modules, mirroring the discoverer/profiler pattern. Each normalizer is now a focused file under catalog/normalizer/:

    _postgresql, _mysql, _mongodb, _dynamodb, _documentdb, _s3, _gcs, _sns, _sqs, _kinesis, _opensearch, _bigquery, _bigtable, _firestore, _memorystore, _pubsub, _apigateway, _gcp_apigateway

    A shared _common.py holds cross-source helpers. The package __init__.py dispatches to the right normalizer based on source type, producing the same unified NormalizedSchema output.
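
The dispatch pattern can be sketched like this. Only the `normalize()` entry point is confirmed above; the stub normalizer bodies and registry dict here are illustrative stand-ins, not the actual daita internals:

```python
from types import SimpleNamespace

# Illustrative per-source normalizers; the real modules live under
# catalog/normalizer/ and emit a NormalizedSchema, stubbed here.
def _normalize_postgresql(raw):
    return SimpleNamespace(source="postgresql", tables=raw.get("tables", []))

def _normalize_bigquery(raw):
    return SimpleNamespace(source="bigquery", tables=raw.get("datasets", []))

_NORMALIZERS = {
    "postgresql": _normalize_postgresql,
    "bigquery": _normalize_bigquery,
}

def normalize(source_type, raw):
    """Route raw discovery output to the matching per-source normalizer."""
    try:
        return _NORMALIZERS[source_type](raw)
    except KeyError:
        raise ValueError(f"no normalizer registered for {source_type!r}") from None
```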

  • Graph Traversal Tools for Agents (daita/core/graph/tools.py)

    Generic, edge-type-agnostic graph primitives that agents can call directly. Registration is opt-in to keep unrelated agents' tool lists focused:

    ```python
    from daita.core.graph import register_graph_tools

    agent = Agent(name="Impact Analyst", llm_provider="openai", model="gpt-4o")
    agent.add_plugin(lineage())
    register_graph_tools(agent)
    ```

    Exposes two tools:

    • graph_subgraph(root, depth, edge_types?, direction?) — return nodes and edges reachable within depth hops; covers neighbors and bounded expansions in a single primitive.
    • graph_shortest_path(from_id, to_id, edge_types?) — return the shortest path between two nodes, or null when unreachable.
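
Conceptually, the subgraph tool is a bounded, edge-type-filtered breadth-first expansion. A minimal self-contained sketch of that idea, using a plain adjacency dict rather than the plugin's real graph:

```python
from collections import deque

def graph_subgraph(adj, root, depth, edge_types=None):
    """Conceptual sketch: BFS up to `depth` hops, optionally restricted
    to a set of edge types. `adj` maps node -> [(neighbor, edge_type)];
    the real tool walks the lineage plugin's NetworkX graph instead."""
    seen, edges = {root}, []
    frontier = deque([(root, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # hop budget exhausted at this node
        for nbr, etype in adj.get(node, []):
            if edge_types is not None and etype not in edge_types:
                continue  # skip edges outside the requested types
            edges.append((node, nbr, etype))
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, d + 1))
    return {"nodes": sorted(seen), "edges": edges}
```

Because depth 1 is just the neighbor set, a single primitive covers both neighbors and larger bounded expansions, as the notes above describe.
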
  • Graph Algorithms Expansion (daita/core/graph/algorithms.py)

    New stateless traversal algorithms operating on nx.MultiDiGraph:

    ancestors, descendants, find_paths, shortest_path, connected_component, impact_analysis, default_subgraph, and traverse.

    Every traversal accepts an optional edge_types filter, implemented once as a NetworkX edge-subgraph view (O(|E|) with no copy). A new LINEAGE_EDGE_TYPES constant captures the semantic set of data-flow edges (READS, WRITES, TRANSFORMS, SYNCS_TO, DERIVED_FROM, TRIGGERS, CALLS, PRODUCES) so lineage tooling doesn't accidentally walk into structural HAS_COLUMN or INDEXED_BY edges.
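
The edge-subgraph-view technique can be shown with plain NetworkX. `LINEAGE_EDGE_TYPES` and the view approach come from the text above; the `type` attribute name and the helper function are assumptions:

```python
import networkx as nx

# Semantic data-flow edge set described above.
LINEAGE_EDGE_TYPES = {"READS", "WRITES", "TRANSFORMS", "SYNCS_TO",
                      "DERIVED_FROM", "TRIGGERS", "CALLS", "PRODUCES"}

def edge_type_view(g, edge_types):
    """Zero-copy edge-subgraph view keeping only edges whose 'type'
    attribute (an assumed attribute name) is in edge_types."""
    def keep(u, v, k):
        return g.edges[u, v, k].get("type") in edge_types
    return nx.subgraph_view(g, filter_edge=keep)

g = nx.MultiDiGraph()
g.add_edge("etl_job", "orders", type="WRITES")        # data flow
g.add_edge("orders", "order_id", type="HAS_COLUMN")   # structural

lineage_only = edge_type_view(g, LINEAGE_EDGE_TYPES)
```

The view filters edges lazily on access, so no copy of the graph is made, which is the O(|E|), zero-copy behavior the notes describe.
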

  • Graph Resolution Layer (daita/core/graph/resolution.py)

    Single source of truth for mapping bare table references (e.g., "orders") to fully-qualified ResolvedTable values keyed by table:<store>.<name>. Used by LineagePlugin.track(...), capture_sql_lineage(...), and DataQualityPlugin.report(...).

    A new AmbiguousReferencePolicy controls behavior when a bare name matches multiple stores:

    • STRICT (default) — raises AmbiguousReferenceError so callers don't silently clobber data across stores.
    • LENIENT — returns the most-recently-updated candidate and logs a warning.
    • UNRESOLVED_SENTINEL — returns a placeholder under a synthetic __unresolved__ store; the catalog persister promotes it into a canonical node once discovery emits a matching table.
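
A simplified sketch of the three policies. The `table:<store>.<name>` key format is as described above; the candidate shape and the not-found handling (collapsed to the sentinel for all policies) are assumptions:

```python
from enum import Enum

class AmbiguousReferencePolicy(Enum):
    STRICT = "strict"
    LENIENT = "lenient"
    UNRESOLVED_SENTINEL = "unresolved_sentinel"

class AmbiguousReferenceError(Exception):
    pass

def resolve(bare_name, candidates, policy=AmbiguousReferencePolicy.STRICT):
    """Sketch only: `candidates` maps bare name -> [(store, last_updated)]."""
    matches = candidates.get(bare_name, [])
    if len(matches) == 1:
        return f"table:{matches[0][0]}.{bare_name}"
    if not matches or policy is AmbiguousReferencePolicy.UNRESOLVED_SENTINEL:
        return f"table:__unresolved__.{bare_name}"
    if policy is AmbiguousReferencePolicy.STRICT:
        raise AmbiguousReferenceError(
            f"{bare_name!r} found in stores {[s for s, _ in matches]}")
    # LENIENT: most-recently-updated candidate wins (warning elided here).
    store, _ = max(matches, key=lambda m: m[1])
    return f"table:{store}.{bare_name}"
```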
  • Structural Graph Types

    NodeType gains INDEX; EdgeType gains INDEXED_BY, COVERS, and REFERENCES. These support richer catalog modeling (indexes, foreign-key references, covering indexes) while remaining excluded from default lineage traversals.

  • Catalog Base Profiler (daita/plugins/catalog/base_profiler.py)

    New BaseProfiler abstract class formalizing the profiler contract across AWS and GCP services, matching the existing BaseDiscoverer.

  • Code-Review Agent Example (examples/deployments/code-review-agent/)

    New end-to-end deployment example showing the Skills system in practice: a code-review agent composed of a SecurityReviewSkill (loaded from a markdown prompt file) and a CodeQualitySkill, plus a full test suite and deploy manifest.

  • Integration Test Harness (tests/integration/_harness.py)

    New harness and live integration suites exercising real backends for catalog (AWS, GCP, MongoDB, MySQL, PostgreSQL, GitHub), lineage, and deep graph-accuracy scenarios.

# Changed

  • Exception Hierarchy Refactor (daita/core/exceptions.py)

    DaitaError and its subclasses have been restructured around a shared _enrich() helper that merges named fields (e.g., agent_id, task, provider, model) into the context dict while skipping None values. Subclass constructors are now 5–10 lines each instead of 20+, and domain-specific context is consistently propagated without boilerplate. The public API is unchanged — imports, attributes, and is_transient() / is_retryable() / is_permanent() all behave as before.
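
The pattern reads roughly like this (a self-contained sketch; the real constructor signatures and field sets differ):

```python
class DaitaError(Exception):
    """Base error: named fields merge into a shared context dict."""
    def __init__(self, message, context=None, **fields):
        super().__init__(message)
        self.context = dict(context or {})
        self._enrich(**fields)

    def _enrich(self, **fields):
        # Merge named fields into context, skipping None values.
        self.context.update({k: v for k, v in fields.items() if v is not None})


class AgentError(DaitaError):
    # Subclass constructors shrink to a few lines: name the fields,
    # then hand them to the shared enrichment helper.
    def __init__(self, message, agent_id=None, task=None, context=None):
        super().__init__(message, context, agent_id=agent_id, task=task)
```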

  • Lineage Plugin Rewrite

    LineagePlugin now delegates to the graph resolution layer for all bare-name lookups, uses the new edge-type-aware traversal algorithms, and exposes cleaner track, analyze_impact, and find_path methods. Configurable via AmbiguousReferencePolicy.

  • Catalog Tools Overhaul (daita/plugins/catalog/tools.py)

    The catalog plugin's agent-facing tools were reworked for consistency with the new discoverer/profiler/normalizer split. Tool schemas, naming, and return shapes were aligned across AWS, GCP, and GitHub sources.

  • Catalog Persistence Expansion (daita/plugins/catalog/persistence.py)

    Persistence layer extended for the new GCP sources and updated to handle INDEX / REFERENCES nodes and edges, including unresolved-sentinel promotion during re-discovery.

  • table_id on Graph Models

    Graph nodes and edges now carry explicit table_id metadata where applicable, enabling fast lookups across mixed-store catalogs without parsing node keys.

# Removed

  • daita/core/plugin_tracing.py (557 lines)

    The legacy plugin-tracing module has been removed. Tool-execution and LLM-call spans introduced in 0.15.1 (via Agent._execute_and_emit and BaseLLMProvider.generate) fully supersede it, with less code and cleaner span semantics.

  • daita/core/decision_tracing.py

    Decision-tracing scaffolding removed; the remaining observability needs are covered by the standard tracing spans.

  • Legacy daita/plugins/catalog/normalizer.py

    Replaced by the modular normalizer/ package. External imports from daita.plugins.catalog are unaffected — the dispatcher continues to expose the same normalize() entry point.