v0.16.0
April 17, 2026

Skills System, GCP Catalog Support & Graph Traversal Tools


# [0.16.0] - 2026-04-17

Major release introducing the Skills system — a new primitive that bundles domain instructions with tools — alongside full Google Cloud Platform catalog coverage, a modular per-source normalizer architecture, agent-facing graph traversal tools, and a comprehensive exception hierarchy refactor.

# Added

  • Skills System (daita/skills/)

    A new first-class primitive sitting between raw tools and full plugins. Skills bundle domain-specific instructions with related tools, carrying behavioral intelligence that plugins (infrastructure connectors) do not.

    ```python
    from daita import Agent, Skill, tool

    @tool
    def format_report(data: list, title: str) -> str:
        """Render a markdown report."""
        ...

    report_skill = Skill(
        name="report_gen",
        description="Produces polished analytical reports",
        instructions="Always render results as markdown with a title and bulleted rows.",
        tools=[format_report],
    )

    agent = Agent(name="Analyst", llm_provider="openai", model="gpt-4o")
    agent.add_skill(report_skill)
    ```

    Subclass BaseSkill for dynamic instruction generation, plugin dependencies (via requires()), or instructions loaded from a file. Skills are LifecyclePlugin subclasses, so on_before_run hooks automatically inject instructions into the system prompt. A new SkillError exception covers skill-specific failures (e.g., missing required plugins).

    BaseSkill, Skill, and SkillError are exported from the top-level daita package.
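
The subclassing path can be sketched roughly as follows. Only `requires()` is confirmed above; the `get_instructions` hook, the `severity_floor` field, and the stand-in base class are illustrative assumptions so the snippet is self-contained:

```python
# Stand-in base class for illustration only; the real BaseSkill lives in
# daita.skills and is a LifecyclePlugin subclass.
class BaseSkill:
    name = "base"

    def requires(self):
        """Plugin names this skill depends on."""
        return []

    def get_instructions(self):
        """Hypothetical hook for dynamic instruction generation."""
        raise NotImplementedError


class SecurityReviewSkill(BaseSkill):
    """Instructions computed per-instance instead of hard-coded."""
    name = "security_review"

    def __init__(self, severity_floor="medium"):
        self.severity_floor = severity_floor  # illustrative parameter

    def requires(self):
        # In the real system, a missing plugin would surface as a SkillError.
        return ["catalog"]

    def get_instructions(self):
        return f"Flag findings at severity >= {self.severity_floor}."
```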

  • GCP Catalog Support (daita/plugins/catalog/gcp.py)

    Complete Google Cloud Platform infrastructure discovery, matching the AWS coverage introduced in 0.14.0. The new GCPDiscoverer enumerates resources across a project with per-service discoverers for:

    • BigQuery — datasets, tables, and column schemas
    • Firestore — collections and document schemas
    • Bigtable — instances, tables, and column families
    • Cloud Storage (GCS) — buckets and inferred file schemas
    • Pub/Sub — topics and subscriptions
    • Memorystore — Redis instances
    • API Gateway — APIs, configs, and gateways

    ```python
    from daita.plugins import catalog

    cat = catalog()
    schema = await cat.discover_gcp(
        project_id="my-project",
        credentials_path="service-account.json",
    )
    ```

    Install with pip install 'daita-agents[gcp]'. The [cloud] and [all] bundles now pull in the full GCP stack.

  • Modular Normalizer Architecture (daita/plugins/catalog/normalizer/)

    The monolithic normalizer.py (746 lines) has been split into per-source modules, mirroring the discoverer/profiler pattern. Each normalizer is now a focused file under catalog/normalizer/:

    _postgresql, _mysql, _mongodb, _dynamodb, _documentdb, _s3, _gcs, _sns, _sqs, _kinesis, _opensearch, _bigquery, _bigtable, _firestore, _memorystore, _pubsub, _apigateway, _gcp_apigateway

    A shared _common.py holds cross-source helpers. The package __init__.py dispatches to the right normalizer based on source type, producing the same unified NormalizedSchema output.
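
The dispatch pattern can be sketched like this. Only the `normalize()` entry point is confirmed above; the stub normalizer bodies and registry dict here are illustrative stand-ins, not the actual daita internals:

```python
from types import SimpleNamespace

# Illustrative per-source normalizers; the real modules live under
# catalog/normalizer/ and emit a NormalizedSchema, stubbed here.
def _normalize_postgresql(raw):
    return SimpleNamespace(source="postgresql", tables=raw.get("tables", []))

def _normalize_bigquery(raw):
    return SimpleNamespace(source="bigquery", tables=raw.get("datasets", []))

_NORMALIZERS = {
    "postgresql": _normalize_postgresql,
    "bigquery": _normalize_bigquery,
}

def normalize(source_type, raw):
    """Route raw discovery output to the matching per-source normalizer."""
    try:
        return _NORMALIZERS[source_type](raw)
    except KeyError:
        raise ValueError(f"no normalizer registered for {source_type!r}") from None
```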

  • Graph Traversal Tools for Agents (daita/core/graph/tools.py)

    Generic, edge-type-agnostic graph primitives that agents can call directly. Registration is opt-in to keep unrelated agents' tool lists focused:

    ```python
    from daita.core.graph import register_graph_tools

    agent = Agent(name="Impact Analyst", llm_provider="openai", model="gpt-4o")
    agent.add_plugin(lineage())
    register_graph_tools(agent)
    ```

    Exposes two tools:

    • graph_subgraph(root, depth, edge_types?, direction?) — return nodes and edges reachable within depth hops; covers neighbors and bounded expansions in a single primitive.
    • graph_shortest_path(from_id, to_id, edge_types?) — return the shortest path between two nodes, or null when unreachable.
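
Conceptually, the subgraph tool is a bounded, edge-type-filtered breadth-first expansion. A minimal self-contained sketch of that idea, using a plain adjacency dict rather than the plugin's real graph:

```python
from collections import deque

def graph_subgraph(adj, root, depth, edge_types=None):
    """Conceptual sketch: BFS up to `depth` hops, optionally restricted
    to a set of edge types. `adj` maps node -> [(neighbor, edge_type)];
    the real tool walks the lineage plugin's NetworkX graph instead."""
    seen, edges = {root}, []
    frontier = deque([(root, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue  # hop budget exhausted at this node
        for nbr, etype in adj.get(node, []):
            if edge_types is not None and etype not in edge_types:
                continue  # skip edges outside the requested types
            edges.append((node, nbr, etype))
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, d + 1))
    return {"nodes": sorted(seen), "edges": edges}
```

Because depth 1 is just the neighbor set, a single primitive covers both neighbors and larger bounded expansions, as the notes above describe.
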
  • Graph Algorithms Expansion (daita/core/graph/algorithms.py)

    New stateless traversal algorithms operating on nx.MultiDiGraph:

    ancestors, descendants, find_paths, shortest_path, connected_component, impact_analysis, default_subgraph, and traverse.

    Every traversal accepts an optional edge_types filter, implemented once as a NetworkX edge-subgraph view (O(|E|) with no copy). A new LINEAGE_EDGE_TYPES constant captures the semantic set of data-flow edges (READS, WRITES, TRANSFORMS, SYNCS_TO, DERIVED_FROM, TRIGGERS, CALLS, PRODUCES) so lineage tooling doesn't accidentally walk into structural HAS_COLUMN or INDEXED_BY edges.
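
The edge-subgraph-view technique can be shown with plain NetworkX. `LINEAGE_EDGE_TYPES` and the view approach come from the text above; the `type` attribute name and the helper function are assumptions:

```python
import networkx as nx

# Semantic data-flow edge set described above.
LINEAGE_EDGE_TYPES = {"READS", "WRITES", "TRANSFORMS", "SYNCS_TO",
                      "DERIVED_FROM", "TRIGGERS", "CALLS", "PRODUCES"}

def edge_type_view(g, edge_types):
    """Zero-copy edge-subgraph view keeping only edges whose 'type'
    attribute (an assumed attribute name) is in edge_types."""
    def keep(u, v, k):
        return g.edges[u, v, k].get("type") in edge_types
    return nx.subgraph_view(g, filter_edge=keep)

g = nx.MultiDiGraph()
g.add_edge("etl_job", "orders", type="WRITES")        # data flow
g.add_edge("orders", "order_id", type="HAS_COLUMN")   # structural

lineage_only = edge_type_view(g, LINEAGE_EDGE_TYPES)
```

The view filters edges lazily on access, so no copy of the graph is made, which is the O(|E|), zero-copy behavior the notes describe.
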

  • Graph Resolution Layer (daita/core/graph/resolution.py)

    Single source of truth for mapping bare table references (e.g., "orders") to fully-qualified ResolvedTable values keyed by table:<store>.<name>. Used by LineagePlugin.track(...), capture_sql_lineage(...), and DataQualityPlugin.report(...).

    A new AmbiguousReferencePolicy controls behavior when a bare name matches multiple stores:

    • STRICT (default) — raises AmbiguousReferenceError so callers don't silently clobber data across stores.
    • LENIENT — returns the most-recently-updated candidate and logs a warning.
    • UNRESOLVED_SENTINEL — returns a placeholder under a synthetic __unresolved__ store; the catalog persister promotes it into a canonical node once discovery emits a matching table.
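
A simplified sketch of the three policies. The `table:<store>.<name>` key format is as described above; the candidate shape and the not-found handling (collapsed to the sentinel for all policies) are assumptions:

```python
from enum import Enum

class AmbiguousReferencePolicy(Enum):
    STRICT = "strict"
    LENIENT = "lenient"
    UNRESOLVED_SENTINEL = "unresolved_sentinel"

class AmbiguousReferenceError(Exception):
    pass

def resolve(bare_name, candidates, policy=AmbiguousReferencePolicy.STRICT):
    """Sketch only: `candidates` maps bare name -> [(store, last_updated)]."""
    matches = candidates.get(bare_name, [])
    if len(matches) == 1:
        return f"table:{matches[0][0]}.{bare_name}"
    if not matches or policy is AmbiguousReferencePolicy.UNRESOLVED_SENTINEL:
        return f"table:__unresolved__.{bare_name}"
    if policy is AmbiguousReferencePolicy.STRICT:
        raise AmbiguousReferenceError(
            f"{bare_name!r} found in stores {[s for s, _ in matches]}")
    # LENIENT: most-recently-updated candidate wins (warning elided here).
    store, _ = max(matches, key=lambda m: m[1])
    return f"table:{store}.{bare_name}"
```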
  • Structural Graph Types

    NodeType gains INDEX; EdgeType gains INDEXED_BY, COVERS, and REFERENCES. These support richer catalog modeling (indexes, foreign-key references, covering indexes) while remaining excluded from default lineage traversals.

  • Catalog Base Profiler (daita/plugins/catalog/base_profiler.py)

    New BaseProfiler abstract class formalizing the profiler contract across AWS and GCP services, matching the existing BaseDiscoverer.

  • Code-Review Agent Example (examples/deployments/code-review-agent/)

    New end-to-end deployment example showing the Skills system in practice: a code-review agent composed of a SecurityReviewSkill (loaded from a markdown prompt file) and a CodeQualitySkill, plus a full test suite and deploy manifest.

  • Integration Test Harness (tests/integration/_harness.py)

    New harness and live integration suites exercising real backends for catalog (AWS, GCP, MongoDB, MySQL, PostgreSQL, GitHub), lineage, and deep graph-accuracy scenarios.

# Changed

  • Exception Hierarchy Refactor (daita/core/exceptions.py)

    DaitaError and its subclasses have been restructured around a shared _enrich() helper that merges named fields (e.g., agent_id, task, provider, model) into the context dict while skipping None values. Subclass constructors are now 5–10 lines each instead of 20+, and domain-specific context is consistently propagated without boilerplate. The public API is unchanged — imports, attributes, and is_transient() / is_retryable() / is_permanent() all behave as before.
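
The pattern reads roughly like this (a self-contained sketch; the real constructor signatures and field sets differ):

```python
class DaitaError(Exception):
    """Base error: named fields merge into a shared context dict."""
    def __init__(self, message, context=None, **fields):
        super().__init__(message)
        self.context = dict(context or {})
        self._enrich(**fields)

    def _enrich(self, **fields):
        # Merge named fields into context, skipping None values.
        self.context.update({k: v for k, v in fields.items() if v is not None})


class AgentError(DaitaError):
    # Subclass constructors shrink to a few lines: name the fields,
    # then hand them to the shared enrichment helper.
    def __init__(self, message, agent_id=None, task=None, context=None):
        super().__init__(message, context, agent_id=agent_id, task=task)
```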

  • Lineage Plugin Rewrite

    LineagePlugin now delegates to the graph resolution layer for all bare-name lookups, uses the new edge-type-aware traversal algorithms, and exposes cleaner track, analyze_impact, and find_path methods. Configurable via AmbiguousReferencePolicy.

  • Catalog Tools Overhaul (daita/plugins/catalog/tools.py)

    The catalog plugin's agent-facing tools were reworked for consistency with the new discoverer/profiler/normalizer split. Tool schemas, naming, and return shapes were aligned across AWS, GCP, and GitHub sources.

  • Catalog Persistence Expansion (daita/plugins/catalog/persistence.py)

    Persistence layer extended for the new GCP sources and updated to handle INDEX / REFERENCES nodes and edges, including unresolved-sentinel promotion during re-discovery.

  • table_id on Graph Models

    Graph nodes and edges now carry explicit table_id metadata where applicable, enabling fast lookups across mixed-store catalogs without parsing node keys.

# Removed

  • daita/core/plugin_tracing.py (557 lines)

    The legacy plugin-tracing module has been removed. Tool-execution and LLM-call spans introduced in 0.15.1 (via Agent._execute_and_emit and BaseLLMProvider.generate) fully supersede it, with less code and cleaner span semantics.

  • daita/core/decision_tracing.py

    Decision-tracing scaffolding removed; the remaining observability needs are covered by the standard tracing spans.

  • Legacy daita/plugins/catalog/normalizer.py

    Replaced by the modular normalizer/ package. External imports from daita.plugins.catalog are unaffected — the dispatcher continues to expose the same normalize() entry point.