Memory Plugin
Production-ready semantic memory for Daita agents with automatic local/cloud detection and intelligent curation.
#Quick Start
from daita import Agent
from daita.plugins import MemoryPlugin
# Create plugin (automatically project-scoped and isolated per agent)
memory = MemoryPlugin()
# Add to agent - memory is now persistent across runs
agent = Agent(
name="Research Assistant",
prompt="You are a research assistant. Use memory to track important findings.",
tools=[memory]
)
await agent.start()
# Agent can now remember and recall information autonomously
result = await agent.run("Remember that the user prefers Python over JavaScript")
# Later...
result = await agent.run("What programming language does the user prefer?")
await agent.stop()
#Direct Usage
The plugin can be used directly without agents for programmatic memory management. However, the main value is agent integration - enabling LLMs to autonomously store and retrieve context across conversations using semantic search.
#Quick Construction
Use the memory() factory for concise plugin creation:
from daita.plugins import memory
mem = memory(workspace="my_project", auto_curate="on_stop")
This is equivalent to MemoryPlugin(...) and accepts the same parameters.
#Configuration Parameters
MemoryPlugin(
workspace: Optional[str] = None,
scope: str = "project",
auto_curate: str = "on_stop",
curation_provider: Optional[str] = None,
curation_model: Optional[str] = None,
curation_api_key: Optional[str] = None,
embedding_provider: str = "openai",
embedding_model: str = "text-embedding-3-small",
embedder: Optional[BaseEmbeddingProvider] = None,
enable_reranking: bool = False,
enable_fact_extraction: bool = False,
enable_working_memory: bool = False,
enable_reinforcement: bool = False,
enable_memory_graph: bool = False,
max_chunks: int = 2000,
default_ttl_days: Optional[int] = None,
tier: str = "basic",
memory_tools: Optional[List[str]] = None,
dedup_threshold: float = 0.95,
)
#Parameters
- workspace (str): Workspace name for memory isolation. Default: auto-generated from the agent name for stable persistence across runs
- scope (str): Memory scope - "project" (default, stored in .daita/memory/) or "global" (stored in ~/.daita/memory/)
- auto_curate (str): Curation trigger mode - "on_stop" (default) or "manual"
- curation_provider (str): LLM provider for curation ("openai", "anthropic", etc.). Default: "openai"
- curation_model (str): LLM model for curation. Default: "gpt-4o-mini"
- curation_api_key (str): API key for the curation LLM. Default: uses global settings
- embedding_provider (str): Provider name for semantic embeddings. Default: "openai". Ignored if embedder is provided
- embedding_model (str): Model for embeddings. Default: "text-embedding-3-small". Ignored if embedder is provided
- embedder (BaseEmbeddingProvider): Pre-constructed embedding provider instance. Takes precedence over embedding_provider/embedding_model. See Embedding Providers for available providers
- enable_reranking (bool): Re-score recall results with an LLM for higher precision. Default: False
- enable_fact_extraction (bool): Extract structured facts from memories at ingestion time for richer recall. Default: False
- enable_working_memory (bool): Enable session-scoped scratchpad. Adds scratch and think tools. Default: False
- enable_reinforcement (bool): Enable outcome-based learning. Adds reinforce tool for recording whether recalled memories led to good or bad outcomes. Default: False
- enable_memory_graph (bool): Enable knowledge graph over memories. Adds traverse_memory and query_facts tools. Default: False
- max_chunks (int): Maximum stored chunks before LRU-style eviction (default: 2000). Keeps memory bounded for long-running agents
- default_ttl_days (int): Default time-to-live in days for new memories. None means no expiry (default). Can be overridden per-memory via the ttl_days parameter on remember()
- tier (str): Tool tier controlling which tools are exposed. "basic" (default), "analysis", or "full". See Tool Tiers
- memory_tools (list): Explicit list of tool names to expose. Overrides tier when set
- dedup_threshold (float): Cosine similarity threshold for deduplication (default: 0.95). Lower values are more aggressive at deduplication
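To make the effect of dedup_threshold concrete, here is a minimal sketch of threshold-based deduplication over toy vectors. The helper names (cosine_similarity, is_duplicate) are illustrative and not part of the plugin's API, and the plugin's actual comparison logic is not shown on this page.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(candidate: Sequence[float],
                 stored: Sequence[Sequence[float]],
                 threshold: float = 0.95) -> bool:
    """True if the candidate is at least `threshold` similar to any stored vector."""
    return any(cosine_similarity(candidate, s) >= threshold for s in stored)


# Toy embeddings (hypothetical values, two dimensions for readability)
stored = [[1.0, 0.0], [0.0, 1.0]]
near_copy = [0.9, 0.3]  # similarity to [1.0, 0.0] is roughly 0.949

strict = is_duplicate(near_copy, stored, threshold=0.95)
loose = is_duplicate(near_copy, stored, threshold=0.90)
```

With the default of 0.95 the near copy is kept as a separate memory; lowering the threshold to 0.90 would treat it as a duplicate, which is what "more aggressive" means in the parameter description.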
#Memory Scopes & Workspaces
Scope controls where memory is stored:
# Project-scoped (default) - memory stays with this project
memory = MemoryPlugin(scope="project")
# Location: .daita/memory/workspaces/{workspace}/
# Global - memory accessible across all projects
memory = MemoryPlugin(scope="global")
# Location: ~/.daita/memory/workspaces/{workspace}/
Workspace controls memory isolation:
# Isolated (default) - each agent has its own memory
agent = Agent("Researcher", tools=[MemoryPlugin()])
# Workspace: "researcher" (auto-generated from agent name)
# Shared - multiple agents share the same memory
shared_memory = MemoryPlugin(workspace="research_team")
agent1 = Agent("Researcher", tools=[shared_memory])
agent2 = Agent("Analyst", tools=[shared_memory])
# Both agents access workspace: "research_team"
Cloud Deployment:
- Automatically detects cloud environment
- Uses AWS storage for persistence across serverless invocations
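For local runs, the scope and workspace rules above map to directories roughly as follows. This is a sketch only: memory_dir and workspace_from_agent_name are hypothetical helpers, and the slug rule for multi-word agent names is an assumption (the page only confirms that "Researcher" becomes "researcher").

```python
import re
from pathlib import Path
from typing import Optional


def workspace_from_agent_name(name: str) -> str:
    """Assumed slug rule: lowercase, runs of other characters become underscores."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def memory_dir(scope: str, workspace: Optional[str], agent_name: str,
               project_root: Path) -> Path:
    """Resolve the local storage directory for a scope/workspace pair."""
    if scope not in ("project", "global"):
        raise ValueError(f"unknown scope: {scope!r}")
    # Project scope lives under the project; global scope under the home directory
    base = project_root if scope == "project" else Path.home()
    # An explicit workspace wins; otherwise derive one from the agent name
    ws = workspace or workspace_from_agent_name(agent_name)
    return base / ".daita" / "memory" / "workspaces" / ws


isolated = memory_dir("project", None, "Researcher", Path("/proj"))
shared = memory_dir("global", "research_team", "Analyst", Path("/proj"))
```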
#Using with Agents
#Tool-Based Integration (Recommended)
The Memory plugin exposes semantic memory operations as tools that agents can call autonomously:
from daita import Agent
from daita.plugins import MemoryPlugin
# Create memory plugin with custom configuration
memory = MemoryPlugin(
auto_curate="on_stop" # Curate when agent stops
)
# Pass plugin to agent - agent can now use memory tools autonomously
agent = Agent(
name="Personal Assistant",
prompt="""You are a personal assistant. Use your memory to:
- Remember user preferences and important facts
- Recall relevant context from past conversations
- Build a knowledge base over time""",
llm_provider="openai",
model="gpt-4",
tools=[memory]
)
await agent.start()
# Agent autonomously uses memory tools
result = await agent.run("Remember: I'm allergic to peanuts and prefer dark mode")
# Later conversation...
result = await agent.run("What are my dietary restrictions?")
# Agent uses recall() to find relevant memories
await agent.stop()
#Available Tools
The Memory plugin exposes these tools to LLM agents. Which tools are available depends on the tier parameter and feature flags.
#Core Tools (all tiers)
| Tool | Description | Parameters |
|---|---|---|
| remember | Store information in long-term memory | content (str or list of dicts), importance (float: 0.5), category (optional), ttl_days (optional), promote_key (optional) |
| recall | Search memories semantically | query, limit (5), score_threshold (0.6), min_importance, max_importance, category, since, before |
| read_memory | Read complete memory file | file (default: "MEMORY.md", or "today") |
| list_memories | List all memory files | None |
#Analysis & Full Tier Tools
| Tool | Description | Parameters | Requires |
|---|---|---|---|
| query_facts | Query structured entity-relation-value triples | entity (optional) | enable_fact_extraction=True |
| traverse_memory | Walk entity connections across memories | entity (required), depth (int: 2) | enable_memory_graph=True |
| reinforce | Record positive/negative outcome for a memory | chunk_ids (list), outcome ("positive"/"negative") | enable_reinforcement=True |
| list_by_category | Enumerate all memories in a category | category (required), min_importance, limit | — |
#Working Memory Tools
| Tool | Description | Parameters | Requires |
|---|---|---|---|
| scratch | Write to session-scoped scratchpad | content (str), key (optional) | enable_working_memory=True |
| think | Search working memory by keyword | query (str), limit (int: 5) | enable_working_memory=True |
Tool Categories: memory
Tool Source: plugin
#Tool Usage Example
from daita import Agent
from daita.plugins import MemoryPlugin
# Setup memory with custom curation
memory = MemoryPlugin(
workspace="project_alpha", # Shared workspace
auto_curate="on_stop" # Curate when agent stops
)
agent = Agent(
name="Project Manager",
prompt="You are a project manager. Track decisions, tasks, and key information.",
llm_provider="openai",
model="gpt-4",
tools=[memory]
)
await agent.start()
# Agent uses memory tools autonomously
result = await agent.run("""
Store project information:
- Client prefers weekly status updates on Mondays
- Budget approved: $50,000
- Deadline: March 15, 2024
- Tech stack: Python, FastAPI, PostgreSQL
""")
# Later, retrieve context
result = await agent.run("What's our project deadline and budget?")
# Agent uses recall() to find relevant information
# Check specific memory file
result = await agent.run("Show me the long-term memory file")
# Agent uses read_memory() to display full content
await agent.stop()
#Batch Storage
New in 0.14.0. remember() accepts a list of dicts for batch ingestion. This uses a single embedding API call, making it significantly more efficient than storing items one at a time:
await agent.run("""
Remember the following facts:
- Q1 revenue was $4.2M (importance: 0.8, category: financial)
- New VP of Eng starts March 15 (category: people)
- Sprint velocity averaged 42 points (importance: 0.6)
""")
The agent translates this to a batch call internally:
await remember([
{"content": "Q1 revenue was $4.2M", "importance": 0.8, "category": "financial"},
{"content": "New VP of Eng starts March 15", "category": "people"},
{"content": "Sprint velocity averaged 42 points", "importance": 0.6},
])
Each item in the batch can specify its own importance and category. Items without these fields use auto-classification.
#Memory TTL
New in 0.14.0. Memories can be set to expire automatically:
# Plugin-wide default: all memories expire after 90 days
memory = MemoryPlugin(default_ttl_days=90)
# Per-memory override
await remember("Temporary API key: sk-abc123", ttl_days=7, importance=0.9)
- Expired memories are pruned at session start/stop
- Memories with no TTL (None) never expire
- Per-memory ttl_days overrides default_ttl_days
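The three TTL rules above can be sketched in a few lines. The helper names effective_ttl and is_expired are hypothetical, and the sketch assumes expiry is measured from the memory's creation time, which this page does not state explicitly.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_ttl(ttl_days: Optional[int],
                  default_ttl_days: Optional[int]) -> Optional[int]:
    """Per-memory ttl_days wins; otherwise fall back to the plugin default."""
    return ttl_days if ttl_days is not None else default_ttl_days


def is_expired(created_at: datetime, ttl_days: Optional[int],
               now: datetime) -> bool:
    """A memory with no TTL never expires (assumes TTL counts from creation)."""
    if ttl_days is None:
        return False
    return now >= created_at + timedelta(days=ttl_days)


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
eight_days_later = created + timedelta(days=8)

# ttl_days=7 overrides a 90-day default, so this memory has expired
short_lived = is_expired(created, effective_ttl(7, 90), eight_days_later)
# No per-memory TTL: the 90-day default applies, so it is still live
default_lived = is_expired(created, effective_ttl(None, 90), eight_days_later)
```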
#Time-Aware Recall
New in 0.14.0. recall() supports since and before parameters for time-bounded searches. These accept ISO datetimes or relative shorthand:
# Memories from the last 24 hours
result = await recall("deployment issues", since="24h")
# Memories from the last 7 days
result = await recall("financial data", since="7d")
# Memories before a specific date
result = await recall("old decisions", before="2026-01-01T00:00:00")
Supported shorthand: "24h", "7d", "30d" (hours or days).
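As an illustration of how a since/before value could be resolved to a concrete cutoff, the sketch below accepts either relative shorthand or an ISO datetime. parse_time_bound is a hypothetical helper, not the plugin's parser, and it assumes shorthand is always "an integer followed by h or d".

```python
import re
from datetime import datetime, timedelta

# An integer followed by "h" (hours) or "d" (days), e.g. "24h" or "7d"
_SHORTHAND = re.compile(r"(\d+)([hd])")


def parse_time_bound(value: str, now: datetime) -> datetime:
    """Turn relative shorthand or an ISO datetime string into a datetime."""
    match = _SHORTHAND.fullmatch(value.strip())
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        delta = timedelta(hours=amount) if unit == "h" else timedelta(days=amount)
        return now - delta
    # Anything else is treated as an ISO 8601 datetime
    return datetime.fromisoformat(value)


now = datetime(2026, 1, 8, 12, 0, 0)
last_day = parse_time_bound("24h", now)
last_week = parse_time_bound("7d", now)
absolute = parse_time_bound("2026-01-01T00:00:00", now)
```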
#Auto-Classification
New in 0.14.0. When category or importance are not explicitly set, the Memory plugin automatically classifies memories using lightweight heuristics (no LLM call required). This ensures consistent categorization without additional API cost.
#Tool Tiers
New in 0.15.0. The tier parameter controls which tools are exposed to the agent, letting you match complexity to the use case:
| Tier | Tools Included |
|---|---|
| "basic" | remember, recall, read_memory, list_memories |
| "analysis" | Basic + query_facts, traverse_memory, reinforce, list_by_category |
| "full" | All tools |
# Simple agent — just needs remember/recall
memory = MemoryPlugin(tier="basic")
# Research agent — needs graph traversal and fact queries
memory = MemoryPlugin(
tier="analysis",
enable_fact_extraction=True,
enable_memory_graph=True,
enable_reinforcement=True,
)
# Full control — pick exactly which tools to expose
memory = MemoryPlugin(memory_tools=["remember", "recall", "scratch", "think"])
The memory_tools parameter overrides tier when set, giving fine-grained control.
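The interaction between tier, feature flags, and memory_tools can be sketched as a small resolution function. This is an assumption-laden illustration: exposed_tools is a hypothetical name, the flag required by each tool follows the "Requires" column of the tool tables, and because the page does not say whether memory_tools is still filtered by feature flags, the sketch returns the explicit list unchanged.

```python
from typing import List, Optional

BASIC = ["remember", "recall", "read_memory", "list_memories"]
ANALYSIS = BASIC + ["query_facts", "traverse_memory", "reinforce", "list_by_category"]
FULL = ANALYSIS + ["scratch", "think"]
TIERS = {"basic": BASIC, "analysis": ANALYSIS, "full": FULL}

# Tool name -> feature flag it requires (taken from the tool tables)
REQUIRES = {
    "query_facts": "enable_fact_extraction",
    "traverse_memory": "enable_memory_graph",
    "reinforce": "enable_reinforcement",
    "scratch": "enable_working_memory",
    "think": "enable_working_memory",
}


def exposed_tools(tier: str = "basic",
                  memory_tools: Optional[List[str]] = None,
                  **flags: bool) -> List[str]:
    """Resolve which tools an agent would see under the stated assumptions."""
    if memory_tools is not None:
        # An explicit list overrides the tier entirely
        return list(memory_tools)
    return [
        tool for tool in TIERS[tier]
        if tool not in REQUIRES or flags.get(REQUIRES[tool], False)
    ]


research = exposed_tools("analysis", enable_memory_graph=True)
custom = exposed_tools("full", memory_tools=["remember", "recall"])
```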
#Working Memory
New in 0.15.0. Working memory is a session-scoped scratchpad — in-memory only, no disk, no embeddings, no API calls. It is designed for agents that need to take notes during a task and optionally promote important findings to long-term memory.
memory = MemoryPlugin(enable_working_memory=True, tier="full")
agent = Agent(
name="Researcher",
prompt="Take scratch notes as you work, then promote key findings to memory.",
tools=[memory]
)
The agent can use two working memory tools:
- scratch(content, key=None): Write to the scratchpad. Returns the assigned key. If key is omitted, an auto-incrementing key (scratch_1, scratch_2, ...) is used.
- think(query, limit=5): Search the scratchpad by keyword. Returns matching items sorted by relevance.
Working memory auto-evicts when the agent stops. To promote a scratch item to long-term memory, use remember(promote_key="scratch_1").
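A session-scoped scratchpad with these semantics can be approximated with a plain dictionary. The Scratchpad class below is a hypothetical stand-in, not the plugin's implementation; in particular, scoring matches by term frequency is an assumption about what "sorted by relevance" means.

```python
from typing import Dict, List, Optional


class Scratchpad:
    """In-memory notes: no disk, no embeddings, cleared when the session ends."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}
        self._counter = 0

    def scratch(self, content: str, key: Optional[str] = None) -> str:
        """Store a note and return its key, auto-generating one if omitted."""
        if key is None:
            self._counter += 1
            key = f"scratch_{self._counter}"
        self._items[key] = content
        return key

    def think(self, query: str, limit: int = 5) -> List[str]:
        """Return keys of matching notes, most keyword hits first."""
        terms = query.lower().split()
        scored = []
        for key, content in self._items.items():
            text = content.lower()
            score = sum(text.count(term) for term in terms)
            if score > 0:
                scored.append((score, key))
        scored.sort(key=lambda pair: -pair[0])  # stable: ties keep insertion order
        return [key for _, key in scored[:limit]]

    def clear(self) -> None:
        """Mimic auto-eviction when the agent stops."""
        self._items.clear()


pad = Scratchpad()
first = pad.scratch("Postgres index on orders table is missing")
named = pad.scratch("orders orders backlog", key="backlog")
second = pad.scratch("unrelated note")
hits = pad.think("orders")
```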
#Memory Graph
New in 0.15.0; updated in 0.17.0. The memory graph builds a lightweight knowledge graph over stored memories, connecting entities via relationships. This enables traversal queries that pure cosine similarity would miss — for example, "what are all the infrastructure constraints for Project Orion?"
memory = MemoryPlugin(
enable_memory_graph=True,
enable_fact_extraction=True, # recommended for richer graphs
tier="analysis",
)
In 0.17.0, memory graph domain types moved out of the core graph vocabulary and into the memory plugin. Most applications do not need to change anything, but code that imports memory graph node or edge enums directly should use the memory-owned models:
from daita.plugins.memory.graph_models import (
MemoryEdgeType,
MemoryGraphEdge,
MemoryGraphNode,
MemoryNodeType,
)
from daita.plugins.memory.graph_store import GraphBackendMemoryGraphStore
GraphBackendMemoryGraphStore adapts the shared core graph backend for memory-domain records, so memory graphs still use the same persistence, locking, and traversal mechanics as other graph-backed features.
#How It Works
When a memory is stored, the graph layer:
- Creates a memory node for the stored chunk
- Extracts entity nodes — from LLM-produced facts (if fact extraction is enabled) or zero-LLM keyword heuristics (backtick code references, capitalized phrases, table.column patterns, quoted strings)
- Creates MENTIONS edges from memory nodes to entity nodes
- Creates RELATED_TO edges between entities that co-occur in the same fact
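The zero-LLM path of the steps above can be sketched with regular expressions. The patterns and function names here are assumptions chosen to mirror the heuristics listed (backtick references, capitalized phrases, table.column patterns, quoted strings); the plugin's real patterns are not documented on this page, and RELATED_TO edges are omitted because they depend on extracted facts.

```python
import re
from typing import Dict, List, Set, Tuple

BACKTICK = re.compile(r"`([^`]+)`")
QUOTED = re.compile(r'"([^"]+)"')
TABLE_COLUMN = re.compile(r"\b[a-z_][a-z0-9_]*\.[a-z_][a-z0-9_]*\b")
# Two or more consecutive capitalized words, e.g. "Project Orion"
CAPITALIZED = re.compile(r"\b[A-Z][a-zA-Z0-9]*(?: [A-Z][a-zA-Z0-9]*)+\b")


def extract_entities(text: str) -> Set[str]:
    """Collect candidate entity names using keyword heuristics only."""
    entities: Set[str] = set(BACKTICK.findall(text))
    entities.update(QUOTED.findall(text))
    entities.update(TABLE_COLUMN.findall(text))
    entities.update(CAPITALIZED.findall(text))
    return entities


def build_mentions(memories: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Emit (memory_id, "MENTIONS", entity) edges for every stored memory."""
    edges = []
    for memory_id, text in memories.items():
        for entity in sorted(extract_entities(text)):
            edges.append((memory_id, "MENTIONS", entity))
    return edges


note = "Project Orion stores totals in orders.total_cents and calls `compute_tax`."
edges = build_mentions({"m1": note})
```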
#Traversal
The traverse_memory tool performs a BFS walk from a named entity, returning connected entities and the memories that mention them:
# Agent calls:
# traverse_memory(entity="Project Orion", depth=2)
#
# Returns entities and memories connected within 2 hops
The query_facts tool returns structured entity-relation-value triples extracted from memories:
# Agent calls:
# query_facts(entity="PostgreSQL")
#
# Returns facts like: PostgreSQL -> runs_on -> RDS, PostgreSQL -> version -> 15.2
#Entity Quality
The graph includes quality filters to avoid noise:
- Temporal phrases ("as of 2025", "before Q3") are excluded
- Currency amounts and bare numbers are excluded
- Generic nouns ("challenges", "data", "things") are excluded
- Technical identifiers (snake_case, dot.notation) are preserved as high-value entities
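A filter along these lines might look like the following. Everything here is illustrative: keep_entity is a hypothetical function, the temporal prefixes and generic-noun list are limited to the examples named above plus a few obvious neighbours, and the real filter rules are not shown on this page.

```python
import re

# Phrases that start with a temporal marker, e.g. "as of 2025", "before Q3"
TEMPORAL = re.compile(r"^(as of|before|after|since|until)\b", re.IGNORECASE)
# Currency amounts and bare numbers, e.g. "$50,000", "42", "$4.2M"
NUMERIC = re.compile(r"^[$€£]?\d[\d,.]*[kKmMbB%]?$")
GENERIC_NOUNS = {"challenges", "data", "things"}
# snake_case or dot.notation identifiers, e.g. "total_cents", "orders.total"
TECHNICAL = re.compile(r"^[a-z][a-z0-9]*([_.][a-z0-9]+)+$")


def keep_entity(candidate: str) -> bool:
    """Return True if the candidate looks like a useful graph entity."""
    text = candidate.strip()
    if TECHNICAL.match(text):
        return True  # technical identifiers are always preserved
    if TEMPORAL.match(text) or NUMERIC.match(text):
        return False
    if text.lower() in GENERIC_NOUNS:
        return False
    return bool(text)


candidates = ["as of 2025", "$50,000", "data", "total_cents", "Project Orion"]
kept = [c for c in candidates if keep_entity(c)]
```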
#Reinforcement Learning
New in 0.15.0. Reinforcement learning lets agents record whether recalled memories were helpful, adjusting their effective scores in future recall.
memory = MemoryPlugin(enable_reinforcement=True, tier="analysis")
The agent uses the reinforce tool to provide feedback:
# After using recall results successfully:
# reinforce(chunk_ids=["abc123", "def456"], outcome="positive")
# After recall results were unhelpful:
# reinforce(chunk_ids=["ghi789"], outcome="negative")
Positive reinforcement boosts a memory's score in future recall; negative reinforcement lowers it. This creates a feedback loop where the most useful memories surface more readily over time.
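One simple way to realise this feedback loop is a per-chunk weight that multiplies the similarity score. The sketch below is not the plugin's scoring formula: ReinforcementTable, the step size, and the clamp range are all assumed values chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable

STEP = 0.1                          # assumed adjustment per recorded outcome
MIN_WEIGHT, MAX_WEIGHT = 0.5, 1.5   # assumed bounds so no memory dominates or vanishes


class ReinforcementTable:
    """Tracks a multiplicative weight per chunk, starting at 1.0."""

    def __init__(self) -> None:
        self._weights: Dict[str, float] = defaultdict(lambda: 1.0)

    def reinforce(self, chunk_ids: Iterable[str], outcome: str) -> None:
        """Nudge weights up for "positive" outcomes and down for "negative"."""
        if outcome not in ("positive", "negative"):
            raise ValueError(f"unknown outcome: {outcome!r}")
        delta = STEP if outcome == "positive" else -STEP
        for chunk_id in chunk_ids:
            updated = self._weights[chunk_id] + delta
            self._weights[chunk_id] = min(MAX_WEIGHT, max(MIN_WEIGHT, updated))

    def effective_score(self, chunk_id: str, similarity: float) -> float:
        """Similarity scaled by the chunk's learned weight."""
        return similarity * self._weights[chunk_id]


table = ReinforcementTable()
table.reinforce(["abc123", "def456"], "positive")
table.reinforce(["ghi789"], "negative")
boosted = table.effective_score("abc123", 0.8)
lowered = table.effective_score("ghi789", 0.8)
neutral = table.effective_score("untouched", 0.8)
```

Clamping the weight is a design choice in this sketch: it keeps a long streak of feedback from overriding semantic similarity entirely.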
#Content Preprocessing
New in 0.15.0. At ingestion time, content is automatically split into two representations:
- Storage content — the original text, stored in the database and daily log
- Index content — a cleaned version used for embedding, deduplication, and fact extraction
The index representation strips structural noise — code blocks, inline code, markdown formatting, bullet prefixes — so the embedding captures the factual signal rather than formatting. This prevents structurally identical but factually different memories (e.g. two table schemas) from appearing as near-duplicates.
No configuration required — preprocessing runs automatically on all remember() calls.
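To show what deriving an index representation can involve, here is a regex-based sketch that strips the kinds of structural noise listed above. to_index_content is a hypothetical function and the patterns are assumptions; the plugin's own cleaning rules may differ.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built here to avoid a literal fence
FENCED_CODE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
INLINE_CODE = re.compile(r"`[^`]*`")
BULLET_PREFIX = re.compile(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", re.MULTILINE)
HEADING_PREFIX = re.compile(r"^#+[ \t]*", re.MULTILINE)
EMPHASIS = re.compile(r"(\*\*|\*)(.+?)\1")


def to_index_content(storage_content: str) -> str:
    """Derive cleaned text for embedding; the original text is stored untouched."""
    text = FENCED_CODE.sub(" ", storage_content)   # drop code blocks
    text = INLINE_CODE.sub(" ", text)              # drop inline code
    text = BULLET_PREFIX.sub("", text)             # drop bullet and number prefixes
    text = HEADING_PREFIX.sub("", text)            # drop heading markers
    text = EMPHASIS.sub(r"\2", text)               # unwrap bold and italic
    return " ".join(text.split())                  # collapse whitespace


storage = "## Schema\n- **Orders** table uses `orders.total_cents`\n- Budget approved"
index = to_index_content(storage)

with_code = "Deploy steps:\n" + FENCE + "\nmake deploy\n" + FENCE + "\nRuns nightly"
index_without_code = to_index_content(with_code)
```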
#Custom Embeddings
New in 0.15.0. Inject a custom embedding provider instead of using the default OpenAI embeddings:
from daita.plugins import MemoryPlugin
from daita.embeddings import create_embedding_provider
# Use local embeddings (no API key needed)
embedder = create_embedding_provider(
"sentence-transformers",
model="all-MiniLM-L6-v2"
)
memory = MemoryPlugin(embedder=embedder)
The embedder parameter accepts any BaseEmbeddingProvider instance and takes precedence over the embedding_provider/embedding_model string parameters. See Embedding Providers for all available providers.
#Direct Memory Operations (Scripts)
For scripts that need memory operations, use a lightweight agent:
import asyncio
from daita import Agent
from daita.plugins import MemoryPlugin
async def main():
    memory = MemoryPlugin(workspace="analytics", auto_curate="manual")
    agent = Agent(
        name="Memory Manager",
        model="gpt-4o-mini",
        prompt="You are a memory manager. Store and retrieve information as instructed.",
        tools=[memory]
    )
    await agent.start()
    # Store information
    await agent.run(
        "Remember with importance 0.8 and category 'financial': "
        "Q4 revenue exceeded projections by 15%"
    )
    # Search memories
    result = await agent.run("What do you know about revenue projections?")
    print(result)
    await agent.stop()
asyncio.run(main())
#Advanced Memory Management
#Programmatic Curation
Daita Cloud only. curate() requires a managed cloud deployment. On local installs it raises RuntimeError. Automatic summarisation via regenerate_memory_md() is available locally when auto_curate="on_stop".
from daita import Agent
from daita.plugins import MemoryPlugin
memory = MemoryPlugin(auto_curate="manual")
agent = Agent("Analyst", tools=[memory])
await agent.start()
# Run agent interactions...
result = await agent.run("Analyze today's data...")
# Manually trigger curation (Daita Cloud only)
curation_result = await memory.curate()
print(f"Added {curation_result.facts_added} facts")
print(f"Cost: ${curation_result.cost_usd:.4f}")
await agent.stop()
#Importance Scoring
# Mark specific memories as important
result = await memory.mark_important(
query="project deadline",
importance=0.9,
source="user_explicit"
)
# Pin critical memories (never pruned)
result = await memory.pin(query="client password")
print(f"Pinned {result['updated']} memories")
# Remove outdated memories
result = await memory.forget(query="old API credentials")
print(f"Deleted {result['deleted']} memories")
#Runtime Configuration
# Update configuration dynamically
memory.configure(auto_curate="manual") # Switch to manual curation
memory.configure(auto_curate="on_stop") # Switch back to automatic
#Intelligence Features
#LLM Reranking
When enable_reranking=True, the plugin runs a lightweight LLM scoring pass over the top recall candidates before returning them, reordering results by semantic relevance. This trades a small amount of latency (one extra LLM call per recall) for more precise results.
from daita import Agent
from daita.plugins import memory
# Curator agent provides the LLM for reranking
curator = Agent(name="Curator", model="gpt-4o-mini", prompt="You curate memories.")
mem = memory(
workspace="research",
enable_reranking=True,
curator=curator
)
agent = Agent(
name="Research Assistant",
prompt="Use memory to track findings.",
tools=[mem]
)
Reranking is most useful when your memory store is large and baseline vector similarity returns many near-equal candidates.
#Fact Extraction
When enable_fact_extraction=True, the plugin parses each stored memory through an LLM at ingestion time and stores structured ExtractedFact records alongside the raw content. Facts capture entities, relationships, dates, and numeric values, enabling more precise temporal and relational recall.
mem = memory(
workspace="project_alpha",
enable_fact_extraction=True,
curator=curator
)
Extracted facts are stored in the memory's metadata under extracted_facts and are used automatically during recall(). Enabling this option adds one LLM call per remember() invocation.
#Query Routing
The Memory plugin automatically routes each recall query to a retrieval strategy (vector search, keyword search, or hybrid) using an internal QueryRouter. No configuration is required: routing happens transparently based on query characteristics.
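The kind of decision such a router makes can be illustrated with a toy heuristic. This is not the internal QueryRouter: route_query and its rules (exact identifiers or quoted phrases favour keyword search, plain questions favour vector search) are assumptions made purely for the example.

```python
import re

# Identifier-like tokens: snake_case, dot.notation, or ticket keys such as "OPS-42"
IDENTIFIER = re.compile(r"\b[a-z]+(?:[_.][a-z0-9]+)+\b|\b[A-Z]{2,}-\d+\b")


def route_query(query: str) -> str:
    """Pick "keyword", "vector", or "hybrid" from surface features of the query."""
    has_exact_term = bool(IDENTIFIER.search(query)) or '"' in query
    word_count = len(query.split())
    if has_exact_term and word_count <= 3:
        return "keyword"   # short lookups of an exact name
    if has_exact_term:
        return "hybrid"    # an exact name embedded in a natural-language question
    return "vector"        # free-form question: rely on semantic similarity


lookup = route_query("orders.total_cents")
mixed = route_query("why did the orders.total_cents migration fail last week")
question = route_query("what does the user prefer for communication")
```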
#Curation System
The Memory Plugin includes intelligent curation that extracts important facts from daily logs and stores them in long-term memory.
Curation Process:
- Analyzes daily conversation logs
- Extracts key facts, preferences, and decisions using LLM
- Assigns importance scores (0.0-1.0) to each fact
- Merges similar facts to prevent redundancy
- Stores in long-term memory with semantic embeddings
Curation Modes:
# Automatic on agent stop (default)
MemoryPlugin(auto_curate="on_stop")
# Manual trigger only
MemoryPlugin(auto_curate="manual")
Curation Result:
curation_result = await memory.curate()
# Access results
print(f"Success: {curation_result.success}")
print(f"Facts extracted: {curation_result.facts_extracted}")
print(f"Facts added: {curation_result.facts_added}")
print(f"Memories updated: {curation_result.memories_updated}")
print(f"Memories pruned: {curation_result.memories_pruned}")
print(f"Tokens used: {curation_result.tokens_used}")
print(f"Cost: ${curation_result.cost_usd:.4f}")
#Best Practices
Memory Organization:
- Use project scope for project-specific context (default)
- Use global scope for cross-project knowledge (user preferences, general facts)
- Create shared workspaces for team collaboration across agents
- Keep isolated workspaces (default) for independent agent tasks
Performance:
- Let auto-curation run on agent stop (default) - balances freshness and cost
- Use auto_curate="manual" for long-running agents where you control timing
- Set score_threshold in recall() to filter low-relevance results (default: 0.6)
- Use importance filters to focus on high-value memories
Cost Management:
- Use gpt-4o-mini for curation (default) - balances quality and cost
- Manual curation mode gives full control over when LLM calls occur
- Monitor curation costs via CurationResult.cost_usd
Security:
- Never store credentials or API keys in memory
- Use memory for context, decisions, and preferences only
- Pin critical business rules to prevent accidental pruning
#Common Patterns
Long-Running Agent with Shared Memory:
# Multiple agents share the same memory workspace
shared_memory = MemoryPlugin(workspace="support_team")
agent1 = Agent("Support Agent A", tools=[shared_memory])
agent2 = Agent("Support Agent B", tools=[shared_memory])
# Agent A stores customer context
await agent1.start()
await agent1.run("Customer prefers email communication over phone")
await agent1.stop()
# Agent B can recall that context later
await agent2.start()
result = await agent2.run("How does this customer prefer to be contacted?")
# Agent B finds the information stored by Agent A
await agent2.stop()
Research Assistant with Global Knowledge:
# Global scope for persistent knowledge across all projects
memory = MemoryPlugin(
scope="global",
workspace="research_knowledge",
auto_curate="on_stop"
)
agent = Agent(
name="Research Assistant",
prompt="You are a research assistant. Build a knowledge base over time.",
tools=[memory]
)
await agent.start()
# Store research findings
await agent.run("Remember: The Pythagorean theorem applies to right triangles")
await agent.run("Remember: Python uses 0-based indexing for lists")
# Knowledge persists across projects and sessions
await agent.stop()
Workflow Integration:
from daita import Agent
from daita.core import Workflow
from daita.plugins import MemoryPlugin
# Shared memory across workflow agents
memory = MemoryPlugin(workspace="data_pipeline")
# Each agent in workflow uses shared memory
data_agent = Agent("Data Collector", tools=[memory])
analyst_agent = Agent("Data Analyst", tools=[memory])
reporter_agent = Agent("Report Generator", tools=[memory])
workflow = Workflow("Analytics Pipeline")
workflow.add_agent(data_agent)
workflow.add_agent(analyst_agent)
workflow.add_agent(reporter_agent)
# Agents share context through memory as workflow executes
await workflow.run()
#Error Handling
from daita import Agent
from daita.plugins import MemoryPlugin
agent = None
try:
    memory = MemoryPlugin(
        workspace="my_workspace",
        curation_provider="openai"
    )
    agent = Agent("Assistant", tools=[memory])
    await agent.start()
    result = await agent.run("Remember important information")
except RuntimeError as e:
    if "Missing required environment variables" in str(e):
        print("Set DAITA_ORG_ID and DAITA_PROJECT_NAME for cloud memory")
    elif "not installed" in str(e):
        print("Install embedding provider: pip install openai")
    else:
        print(f"Memory error: {e}")
finally:
    # Guard against setup failing before the agent was created
    if agent is not None:
        await agent.stop()
#Troubleshooting
| Issue | Solution |
|---|---|
| openai not installed | pip install openai (or anthropic, for embeddings) |
| Cloud memory initialization fails | Set DAITA_ORG_ID and DAITA_PROJECT_NAME env vars |
| Empty recall results | Lower score_threshold or check if memories exist |
| High curation costs | Use auto_curate="manual" to control when curation runs |
| Memories not persisting | Check workspace and scope configuration |
| Shared memory not working | Ensure same workspace parameter across agents |
| Curation not running | Check auto_curate setting, verify LLM provider configured |
#Next Steps
- Agent Basics - Learn how to create agents with memory
- Workflows - Use shared memory in multi-agent workflows
- PostgreSQL Plugin - Combine memory with database access
- Plugin Overview - Explore other plugins