Memory Plugin

Production-ready semantic memory for Daita agents with automatic local/cloud detection and intelligent curation.

#Quick Start

python
from daita import Agent
from daita.plugins import MemoryPlugin
 
# Create plugin (automatically project-scoped and isolated per agent)
memory = MemoryPlugin()
 
# Add to agent - memory is now persistent across runs
agent = Agent(
    name="Research Assistant",
    prompt="You are a research assistant. Use memory to track important findings.",
    tools=[memory]
)
 
await agent.start()
 
# Agent can now remember and recall information autonomously
result = await agent.run("Remember that the user prefers Python over JavaScript")
# Later...
result = await agent.run("What programming language does the user prefer?")
 
await agent.stop()

#Direct Usage

The plugin can be used directly without agents for programmatic memory management. However, the main value is agent integration - enabling LLMs to autonomously store and retrieve context across conversations using semantic search.

#Quick Construction

Use the memory() factory for concise plugin creation:

python
from daita.plugins import memory
 
mem = memory(workspace="my_project", auto_curate="on_stop")

This is equivalent to MemoryPlugin(...) and accepts the same parameters.

#Configuration Parameters

python
MemoryPlugin(
    workspace: Optional[str] = None,
    scope: str = "project",
    auto_curate: str = "on_stop",
    curation_provider: Optional[str] = None,
    curation_model: Optional[str] = None,
    curation_api_key: Optional[str] = None,
    embedding_provider: str = "openai",
    embedding_model: str = "text-embedding-3-small",
    enable_reranking: bool = False,
    enable_fact_extraction: bool = False,
    max_chunks: int = 2000,
    default_ttl_days: Optional[int] = None,
)

#Parameters

  • workspace (str): Workspace name for memory isolation. Default: auto-generated from agent name for stable persistence across runs
  • scope (str): Memory scope - "project" (default, stored in .daita/memory/) or "global" (stored in ~/.daita/memory/)
  • auto_curate (str): Curation trigger mode - "on_stop" (default) or "manual"
  • curation_provider (str): LLM provider for curation ("openai", "anthropic", etc.). Default: "openai"
  • curation_model (str): LLM model for curation. Default: "gpt-4o-mini"
  • curation_api_key (str): API key for curation LLM. Default: uses global settings
  • embedding_provider (str): Provider for semantic embeddings. Default: "openai"
  • embedding_model (str): Model for embeddings. Default: "text-embedding-3-small"
  • enable_reranking (bool): Re-score recall results with an LLM for higher precision. Requires a curator agent. Default: False
  • enable_fact_extraction (bool): Extract structured facts from memories at ingestion time for richer recall. Requires a curator agent. Default: False
  • max_chunks (int): Maximum stored chunks before LRU-style eviction (default: 2000). Keeps memory bounded for long-running agents
  • default_ttl_days (int): Default time-to-live in days for new memories. None means no expiry (default). Can be overridden per-memory via the ttl_days parameter on remember()
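The max_chunks bound behaves like an LRU cache: once the store grows past the limit, the least recently used chunks are evicted first. A minimal sketch of that policy (illustrative only; the plugin's real eviction also respects pinned memories and importance scores):

```python
from collections import OrderedDict

def evict_to_limit(chunks, max_chunks):
    """Drop least-recently-used chunks until at most max_chunks remain.

    `chunks` is an OrderedDict of chunk_id -> content, ordered from
    least to most recently used. Sketch only; the plugin's actual
    eviction also skips pinned memories.
    """
    evicted = []
    while len(chunks) > max_chunks:
        chunk_id, _ = chunks.popitem(last=False)  # oldest entry first
        evicted.append(chunk_id)
    return evicted
```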

#Memory Scopes & Workspaces

Scope controls where memory is stored:

python
# Project-scoped (default) - memory stays with this project
memory = MemoryPlugin(scope="project")
# Location: .daita/memory/workspaces/{workspace}/
 
# Global - memory accessible across all projects
memory = MemoryPlugin(scope="global")
# Location: ~/.daita/memory/workspaces/{workspace}/

Workspace controls memory isolation:

python
# Isolated (default) - each agent has its own memory
agent = Agent("Researcher", tools=[MemoryPlugin()])
# Workspace: "researcher" (auto-generated from agent name)
 
# Shared - multiple agents share the same memory
shared_memory = MemoryPlugin(workspace="research_team")
agent1 = Agent("Researcher", tools=[shared_memory])
agent2 = Agent("Analyst", tools=[shared_memory])
# Both agents access workspace: "research_team"

Cloud Deployment:

  • Automatically detects cloud environment
  • Uses AWS storage for persistence across serverless invocations
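Conceptually, the detection step is an environment check. A hypothetical sketch, assuming the same DAITA_ORG_ID and DAITA_PROJECT_NAME variables referenced in the Error Handling section (the plugin's actual detection logic is internal and may differ):

```python
import os

def is_cloud_environment(env=None):
    # Hypothetical check: treat the deployment as cloud when both
    # identifiers from the Error Handling section are present.
    env = os.environ if env is None else env
    return bool(env.get("DAITA_ORG_ID")) and bool(env.get("DAITA_PROJECT_NAME"))
```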

#Using with Agents

Memory plugin exposes semantic memory operations as tools that agents can use autonomously:

python
from daita import Agent
from daita.plugins import MemoryPlugin
 
# Create memory plugin with custom configuration
memory = MemoryPlugin(
    auto_curate="on_stop"   # Curate when agent stops
)
 
# Pass plugin to agent - agent can now use memory tools autonomously
agent = Agent(
    name="Personal Assistant",
    prompt="""You are a personal assistant. Use your memory to:
    - Remember user preferences and important facts
    - Recall relevant context from past conversations
    - Build a knowledge base over time""",
    llm_provider="openai",
    model="gpt-4",
    tools=[memory]
)
 
await agent.start()
 
# Agent autonomously uses memory tools
result = await agent.run("Remember: I'm allergic to peanuts and prefer dark mode")
# Later conversation...
result = await agent.run("What are my dietary restrictions?")
# Agent uses recall() to find relevant memories
 
await agent.stop()

#Available Tools

The Memory plugin exposes these tools to LLM agents:

| Tool | Description | Parameters |
| --- | --- | --- |
| remember | Store information in long-term memory | content (str or list of dicts), importance (float: 0.5), category (optional), ttl_days (optional) |
| recall | Search memories semantically | query, limit (5), score_threshold (0.6), min_importance, max_importance, category, since, before |
| list_by_category | Enumerate all memories in a category | category (required), min_importance (float: 0.0), limit (int: 100) |
| update_memory | Replace an existing memory | query (required), new_content (required), importance (float: 0.5) |
| read_memory | Read complete memory file | file (default: "MEMORY.md", or "today") |
| list_memories | List all memory files | None |

Tool Categories: memory. Tool Source: plugin.

#Tool Usage Example

python
from daita import Agent
from daita.plugins import MemoryPlugin
 
# Setup memory with custom curation
memory = MemoryPlugin(
    workspace="project_alpha",      # Shared workspace
    auto_curate="on_stop"           # Curate when agent stops
)
 
agent = Agent(
    name="Project Manager",
    prompt="You are a project manager. Track decisions, tasks, and key information.",
    llm_provider="openai",
    model="gpt-4",
    tools=[memory]
)
 
await agent.start()
 
# Agent uses memory tools autonomously
result = await agent.run("""
Store project information:
- Client prefers weekly status updates on Mondays
- Budget approved: $50,000
- Deadline: March 15, 2024
- Tech stack: Python, FastAPI, PostgreSQL
""")
 
# Later, retrieve context
result = await agent.run("What's our project deadline and budget?")
# Agent uses recall() to find relevant information
 
# Check specific memory file
result = await agent.run("Show me the long-term memory file")
# Agent uses read_memory() to display full content
 
await agent.stop()

#Batch Storage

New in 0.14.0. remember() accepts a list of dicts for batch ingestion. This uses a single embedding API call, making it significantly more efficient than storing items one at a time:

python
await agent.run("""
Remember the following facts:
- Q1 revenue was $4.2M (importance: 0.8, category: financial)
- New VP of Eng starts March 15 (category: people)
- Sprint velocity averaged 42 points (importance: 0.6)
""")

The agent translates this to a batch call internally:

python
await remember([
    {"content": "Q1 revenue was $4.2M", "importance": 0.8, "category": "financial"},
    {"content": "New VP of Eng starts March 15", "category": "people"},
    {"content": "Sprint velocity averaged 42 points", "importance": 0.6},
])

Each item in the batch can specify its own importance and category. Items without these fields use auto-classification.

#Memory TTL

New in 0.14.0. Memories can be set to expire automatically:

python
# Plugin-wide default: all memories expire after 90 days
memory = MemoryPlugin(default_ttl_days=90)
 
# Per-memory override
await remember("Temporary build server IP: 10.0.0.42", ttl_days=7, importance=0.9)

  • Expired memories are pruned at session start/stop
  • Memories with no TTL (None) never expire
  • Per-memory ttl_days overrides default_ttl_days

#Time-Aware Recall

New in 0.14.0. recall() supports since and before parameters for time-bounded searches. These accept ISO datetimes or relative shorthand:

python
# Memories from the last 24 hours
result = await recall("deployment issues", since="24h")
 
# Memories from the last 7 days
result = await recall("financial data", since="7d")
 
# Memories before a specific date
result = await recall("old decisions", before="2026-01-01T00:00:00")

Supported shorthand: "24h", "7d", "30d" (hours or days).
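The shorthand resolves to an absolute cutoff relative to the current time. A sketch of that translation (illustrative; not the plugin's internal parser):

```python
import re
from datetime import datetime, timedelta, timezone

def resolve_time(value, now=None):
    """Resolve a since/before value to an absolute datetime.

    Accepts ISO datetimes or the "24h"/"7d" style shorthand described
    above. Illustrative sketch only.
    """
    now = now or datetime.now(timezone.utc)
    match = re.fullmatch(r"(\d+)([hd])", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        delta = timedelta(hours=amount) if unit == "h" else timedelta(days=amount)
        return now - delta
    return datetime.fromisoformat(value)  # fall back to ISO parsing
```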

#Auto-Classification

New in 0.14.0. When category or importance are not explicitly set, the Memory plugin automatically classifies memories using lightweight heuristics (no LLM call required). This ensures consistent categorization without additional API cost.
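As a rough illustration, a heuristic of this kind can be as simple as keyword matching. The keyword lists below are assumptions for demonstration, not the plugin's actual rules:

```python
# Hypothetical keyword lists -- the plugin's real heuristics are internal.
CATEGORY_KEYWORDS = {
    "financial": ("revenue", "budget", "cost", "$"),
    "people": ("hired", "starts", "manager", "team"),
    "technical": ("api", "deploy", "stack", "bug"),
}

def classify(content):
    """Assign a category from the first keyword list that matches."""
    text = content.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "general"
```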


#Direct Memory Operations (Scripts)

For scripts that need memory operations, use a lightweight agent:

python
import asyncio
from daita import Agent
from daita.plugins import MemoryPlugin
 
async def main():
    memory = MemoryPlugin(workspace="analytics", auto_curate="manual")
 
    agent = Agent(
        name="Memory Manager",
        model="gpt-4o-mini",
        prompt="You are a memory manager. Store and retrieve information as instructed.",
        tools=[memory]
    )
 
    await agent.start()
 
    # Store information
    await agent.run(
        "Remember with importance 0.8 and category 'financial': "
        "Q4 revenue exceeded projections by 15%"
    )
 
    # Search memories
    result = await agent.run("What do you know about revenue projections?")
    print(result)
 
    await agent.stop()
 
asyncio.run(main())

#Advanced Memory Management

#Programmatic Curation

Daita Cloud only. curate() requires a managed cloud deployment. On local installs it raises RuntimeError. Automatic summarization via regenerate_memory_md() is available locally when auto_curate="on_stop".

python
from daita import Agent
from daita.plugins import MemoryPlugin
 
memory = MemoryPlugin(auto_curate="manual")
agent = Agent("Analyst", tools=[memory])
await agent.start()
 
# Run agent interactions...
result = await agent.run("Analyze today's data...")
 
# Manually trigger curation (Daita Cloud only)
curation_result = await memory.curate()
print(f"Added {curation_result.facts_added} facts")
print(f"Cost: ${curation_result.cost_usd:.4f}")
 
await agent.stop()

#Importance Scoring

python
# Mark specific memories as important
result = await memory.mark_important(
    query="project deadline",
    importance=0.9,
    source="user_explicit"
)
 
# Pin critical memories (never pruned)
result = await memory.pin(query="refund policy")
print(f"Pinned {result['updated']} memories")
 
# Remove outdated memories
result = await memory.forget(query="old API credentials")
print(f"Deleted {result['deleted']} memories")

#Runtime Configuration

python
# Update configuration dynamically
memory.configure(auto_curate="manual")   # Switch to manual curation
memory.configure(auto_curate="on_stop")  # Switch back to automatic

#Intelligence Features

#LLM Reranking

When enable_reranking=True, the plugin runs a lightweight LLM scoring pass over the top recall candidates before returning them, reordering results by semantic relevance. This trades a small amount of latency for significantly higher recall precision.

python
from daita import Agent
from daita.plugins import memory
 
# Curator agent provides the LLM for reranking
curator = Agent(name="Curator", model="gpt-4o-mini", prompt="You curate memories.")
 
mem = memory(
    workspace="research",
    enable_reranking=True,
    curator=curator
)
 
agent = Agent(
    name="Research Assistant",
    prompt="Use memory to track findings.",
    tools=[mem]
)

Reranking is most useful when your memory store is large and baseline vector similarity returns many near-equal candidates.
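The scoring pass itself reduces to re-ordering candidates by a relevance score. A sketch with the curator LLM call abstracted behind a score_fn callback (hypothetical; not the plugin's implementation):

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-score recall candidates and return the best top_k.

    `score_fn(query, text) -> float` stands in for the curator LLM's
    relevance scoring. Sketch only.
    """
    scored = [(score_fn(query, candidate), candidate) for candidate in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most relevant first
    return [candidate for _, candidate in scored[:top_k]]
```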

#Fact Extraction

When enable_fact_extraction=True, the plugin passes each stored memory through an LLM at ingestion time and stores structured ExtractedFact records alongside the raw content. Facts capture entities, relationships, dates, and numeric values, enabling more precise temporal and relational recall.

python
mem = memory(
    workspace="project_alpha",
    enable_fact_extraction=True,
    curator=curator
)

Extracted facts are stored in the memory's metadata under extracted_facts and are used automatically during recall(). Enabling this option adds one LLM call per remember() invocation.
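One possible shape for such a record (a hypothetical schema for illustration; the actual ExtractedFact fields are not documented here):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractedFact:
    # Hypothetical fields: subject/predicate/object triple plus
    # optional temporal and numeric annotations.
    subject: str
    predicate: str
    object: str
    date: Optional[str] = None
    numeric_value: Optional[float] = None

fact = ExtractedFact(
    subject="Q1 revenue", predicate="was", object="$4.2M",
    numeric_value=4_200_000.0,
)
```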

#Query Routing

The Memory plugin automatically routes each recall query to a retrieval strategy (vector search, keyword search, or hybrid) using an internal QueryRouter. No configuration is required - routing happens transparently based on query characteristics.
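As a rough illustration of how such routing might work (the QueryRouter's actual rules are internal; the heuristics below are assumptions):

```python
import re

def route_query(query):
    """Pick a retrieval strategy from surface features of the query.

    Hypothetical heuristics: quoted phrases or identifier-like tokens
    suggest keyword search, very short queries get hybrid treatment,
    and natural-language questions go to vector search.
    """
    quoted = '"' in query or "'" in query
    has_identifier = bool(re.search(r"[A-Za-z_]+\([^)]*\)|[A-Z]{2,}", query))
    if quoted or has_identifier:
        return "keyword"   # exact terms benefit from keyword matching
    if len(query.split()) <= 2:
        return "hybrid"    # short queries: combine both signals
    return "vector"        # natural-language queries: semantic search
```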

#Curation System

The Memory Plugin includes intelligent curation that extracts important facts from daily logs and stores them in long-term memory.

Curation Process:

  1. Analyzes daily conversation logs
  2. Extracts key facts, preferences, and decisions using LLM
  3. Assigns importance scores (0.0-1.0) to each fact
  4. Merges similar facts to prevent redundancy
  5. Stores in long-term memory with semantic embeddings
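Step 4 above (merging similar facts) can be sketched as a similarity-threshold dedup that keeps the higher-importance duplicate. Here `similarity` stands in for cosine similarity over embeddings; this is an illustration, not the curation engine's code:

```python
def merge_similar_facts(facts, similarity, threshold=0.9):
    """Collapse near-duplicate facts, keeping the more important one.

    `facts` is a list of {"content": str, "importance": float} dicts;
    `similarity(a, b) -> float` stands in for embedding cosine
    similarity. Sketch only.
    """
    merged = []
    # Visit high-importance facts first so they win ties.
    for fact in sorted(facts, key=lambda f: f["importance"], reverse=True):
        if all(similarity(fact["content"], kept["content"]) < threshold
               for kept in merged):
            merged.append(fact)
    return merged
```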

Curation Modes:

python
# Automatic on agent stop (default)
MemoryPlugin(auto_curate="on_stop")
 
# Manual trigger only
MemoryPlugin(auto_curate="manual")

Curation Result:

python
curation_result = await memory.curate()
 
# Access results
print(f"Success: {curation_result.success}")
print(f"Facts extracted: {curation_result.facts_extracted}")
print(f"Facts added: {curation_result.facts_added}")
print(f"Memories updated: {curation_result.memories_updated}")
print(f"Memories pruned: {curation_result.memories_pruned}")
print(f"Tokens used: {curation_result.tokens_used}")
print(f"Cost: ${curation_result.cost_usd:.4f}")

#Best Practices

Memory Organization:

  • Use project scope for project-specific context (default)
  • Use global scope for cross-project knowledge (user preferences, general facts)
  • Create shared workspaces for team collaboration across agents
  • Keep isolated workspaces (default) for independent agent tasks

Performance:

  • Let auto-curation run on agent stop (default) - balances freshness and cost
  • Use auto_curate="manual" for long-running agents where you control timing
  • Set score_threshold in recall() to filter low-relevance results (default: 0.6)
  • Use importance filters to focus on high-value memories

Cost Management:

  • Use gpt-4o-mini for curation (default) - balances quality and cost
  • Manual curation mode gives full control over when LLM calls occur
  • Monitor curation costs via CurationResult.cost_usd

Security:

  • Never store credentials or API keys in memory
  • Use memory for context, decisions, and preferences only
  • Pin critical business rules to prevent accidental pruning

#Common Patterns

Long-Running Agent with Shared Memory:

python
# Multiple agents share the same memory workspace
shared_memory = MemoryPlugin(workspace="support_team")
 
agent1 = Agent("Support Agent A", tools=[shared_memory])
agent2 = Agent("Support Agent B", tools=[shared_memory])
 
# Agent A stores customer context
await agent1.start()
await agent1.run("Customer prefers email communication over phone")
await agent1.stop()
 
# Agent B can recall that context later
await agent2.start()
result = await agent2.run("How does this customer prefer to be contacted?")
# Agent B finds the information stored by Agent A
await agent2.stop()

Research Assistant with Global Knowledge:

python
# Global scope for persistent knowledge across all projects
memory = MemoryPlugin(
    scope="global",
    workspace="research_knowledge",
    auto_curate="on_stop"
)
 
agent = Agent(
    name="Research Assistant",
    prompt="You are a research assistant. Build a knowledge base over time.",
    tools=[memory]
)
 
await agent.start()
 
# Store research findings
await agent.run("Remember: The Pythagorean theorem applies to right triangles")
await agent.run("Remember: Python uses 0-based indexing for lists")
 
# Knowledge persists across projects and sessions
await agent.stop()

Workflow Integration:

python
from daita import Agent
from daita.core import Workflow
from daita.plugins import MemoryPlugin
 
# Shared memory across workflow agents
memory = MemoryPlugin(workspace="data_pipeline")
 
# Each agent in workflow uses shared memory
data_agent = Agent("Data Collector", tools=[memory])
analyst_agent = Agent("Data Analyst", tools=[memory])
reporter_agent = Agent("Report Generator", tools=[memory])
 
workflow = Workflow("Analytics Pipeline")
workflow.add_agent(data_agent)
workflow.add_agent(analyst_agent)
workflow.add_agent(reporter_agent)
 
# Agents share context through memory as workflow executes
await workflow.run()

#Error Handling

python
from daita import Agent
from daita.plugins import MemoryPlugin
 
agent = None
try:
    memory = MemoryPlugin(
        workspace="my_workspace",
        curation_provider="openai"
    )
    agent = Agent("Assistant", tools=[memory])
    await agent.start()
 
    result = await agent.run("Remember important information")
 
except RuntimeError as e:
    if "Missing required environment variables" in str(e):
        print("Set DAITA_ORG_ID and DAITA_PROJECT_NAME for cloud memory")
    elif "not installed" in str(e):
        print("Install embedding provider: pip install openai")
    else:
        print(f"Memory error: {e}")
finally:
    if agent is not None:   # agent may not exist if construction failed
        await agent.stop()

#Troubleshooting

| Issue | Solution |
| --- | --- |
| openai not installed | pip install openai (or anthropic, for embeddings) |
| Cloud memory initialization fails | Set DAITA_ORG_ID and DAITA_PROJECT_NAME env vars |
| Empty recall results | Lower score_threshold or check if memories exist |
| High curation costs | Use auto_curate="manual" to control when curation runs |
| Memories not persisting | Check workspace and scope configuration |
| Shared memory not working | Ensure same workspace parameter across agents |
| Curation not running | Check auto_curate setting, verify LLM provider configured |

#Next Steps