Memory
Alquimia has two memory layers: short-term (what the LLM sees in the current prompt) and long-term (what gets summarized or erased when context grows too large).
Short-term memory
Short-term memory controls which messages from the conversation history are included in the LLM prompt. Without a strategy, all messages are included, which can exceed the model’s context window.
max_tokens strategy
Walks the conversation backwards, grouping messages by human turn (Interaction), and keeps messages until the token budget is exhausted.

```json
{
  "short_term_memory_strategy": [
    {
      "memory_strategy_id": "max_tokens",
      "memory_max_tokens": 8000
    }
  ]
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| memory_strategy_id | "max_tokens" | required | Strategy identifier |
| memory_max_tokens | int | 10000 | Maximum tokens to include. -1 = include nothing |
| conditions | string[] | [] | Empathy-style conditions to filter which interactions to include |

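
The backwards walk described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Alquimia's implementation: the `Message` type, the whitespace-based `count_tokens`, and the function names are hypothetical, and the sketch assumes that interactions are kept or dropped as whole units.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # hypothetical roles: "human", "assistant", "tool", ...
    content: str


def count_tokens(message: Message) -> int:
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    return len(message.content.split())


def group_interactions(history: list[Message]) -> list[list[Message]]:
    """Group messages so that each human message starts a new interaction."""
    interactions: list[list[Message]] = []
    for message in history:
        if message.role == "human" or not interactions:
            interactions.append([])
        interactions[-1].append(message)
    return interactions


def select_short_term(history: list[Message], memory_max_tokens: int = 10000) -> list[Message]:
    """Keep the most recent whole interactions that fit in the token budget."""
    if memory_max_tokens < 0:  # -1 = include nothing, per the parameter table
        return []
    kept: list[list[Message]] = []
    remaining = memory_max_tokens
    # Walk backwards, starting from the newest interaction.
    for interaction in reversed(group_interactions(history)):
        cost = sum(count_tokens(m) for m in interaction)
        if cost > remaining:
            break  # budget exhausted; older interactions are left out
        kept.append(interaction)
        remaining -= cost
    # Restore chronological order and flatten back into a message list.
    return [m for interaction in reversed(kept) for m in interaction]
```

Walking from the newest interaction means that when the budget runs out, the oldest context is what gets dropped, so the prompt always ends with the latest human turn.
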
Long-term memory
Long-term memory strategies trigger when the conversation grows beyond a threshold. They either summarize the history or erase it, keeping only the most recent interactions.
Trigger conditions
All long-term strategies share these trigger fields:

| Field | Description |
|---|---|
| input_tokens_threshold | Trigger when the LLM’s input token count exceeds this value |
| interaction_threshold_qty | Trigger when the number of human turns exceeds this value |
| interaction_threshold_tokens | Trigger when total interaction tokens exceed this value |
| on_tools_success_threshold | Trigger when any of these tool names succeed |
| on_tools_error_threshold | Trigger when any of these tool names error |
| interaction_keep | Number of most-recent interactions to keep after the strategy runs |

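
One way to read the trigger fields above is as a set of independent checks. The Python sketch below assumes that any single satisfied condition fires the strategy and that -1 disables a numeric threshold. How the real engine combines these fields is not stated on this page, and `TriggerConfig`, `ConversationStats`, and `should_trigger` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class TriggerConfig:
    # Field names come from the table above; -1 disables a numeric trigger
    # (an assumption borrowed from the neuralyzer defaults).
    input_tokens_threshold: int = -1
    interaction_threshold_qty: int = -1
    interaction_threshold_tokens: int = -1
    on_tools_success_threshold: list[str] = field(default_factory=list)
    on_tools_error_threshold: list[str] = field(default_factory=list)


@dataclass
class ConversationStats:
    # Hypothetical snapshot of the conversation when the check runs.
    input_tokens: int = 0
    interaction_qty: int = 0
    interaction_tokens: int = 0
    succeeded_tools: set[str] = field(default_factory=set)
    failed_tools: set[str] = field(default_factory=set)


def exceeds(value: int, threshold: int) -> bool:
    """A negative threshold never fires; otherwise the value must exceed it."""
    return threshold >= 0 and value > threshold


def should_trigger(cfg: TriggerConfig, stats: ConversationStats) -> bool:
    # Assumption: conditions are OR-ed together.
    return (
        exceeds(stats.input_tokens, cfg.input_tokens_threshold)
        or exceeds(stats.interaction_qty, cfg.interaction_threshold_qty)
        or exceeds(stats.interaction_tokens, cfg.interaction_threshold_tokens)
        or bool(stats.succeeded_tools & set(cfg.on_tools_success_threshold))
        or bool(stats.failed_tools & set(cfg.on_tools_error_threshold))
    )
```
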
summarizer strategy
Uses Chain of Density (CoD) summarization: iteratively compresses the conversation into a progressively denser summary over cod_max_loops iterations.

```json
{
  "long_term_memory_strategy": [
    {
      "long_term_memory_id": "summarizer",
      "interaction_threshold_qty": 20,
      "interaction_threshold_tokens": 20000,
      "cod_max_loops": 5,
      "interaction_keep": 3,
      "instructions": "Focus on action items and decisions made.",
      "knowledge_base": {
        "collection_id": "session-summaries",
        "search_mode": "always"
      }
    }
  ]
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| long_term_memory_id | "summarizer" | required | Strategy identifier |
| interaction_threshold_qty | int | 20 | Trigger after this many interactions |
| interaction_threshold_tokens | int | 20000 | Trigger after this many tokens |
| cod_max_loops | int | 5 | Number of CoD densification passes |
| interaction_keep | int | 0 | Interactions to keep after summarization |
| instructions | string | null | Custom summarization instructions |
| knowledge_base | KnowledgeBase | null | Store summaries in a vector store for RAG |

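
The iterative densification can be pictured as a loop that feeds the previous summary back into the summarizer. The sketch below is illustrative only: `chain_of_density` and the `summarize` callable are hypothetical (a real setup would call an LLM there), and the prompt wording is an assumption rather than the prompt Alquimia uses.

```python
from typing import Callable, Optional


def chain_of_density(
    transcript: str,
    summarize: Callable[[str], str],
    cod_max_loops: int = 5,
    instructions: Optional[str] = None,
) -> str:
    """Run cod_max_loops densification passes and return the final summary."""
    summary = ""
    for _ in range(cod_max_loops):
        prompt_parts = [
            "Rewrite the summary so it is denser: keep it about the same length, "
            "but add important details from the conversation that are missing.",
            f"Conversation:\n{transcript}",
            f"Current summary:\n{summary or '(none yet)'}",
        ]
        if instructions:
            prompt_parts.append(f"Additional instructions:\n{instructions}")
        # Each pass sees the previous summary and returns a denser one.
        summary = summarize("\n\n".join(prompt_parts))
    return summary
```

Because every pass is a separate summarizer call, a higher cod_max_loops trades latency and cost for a denser result.
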
neuralyzer strategy
Erases memory beyond a threshold. It is simpler and faster than summarization; use it when you don’t need to retain historical context.

```json
{
  "long_term_memory_strategy": [
    {
      "long_term_memory_id": "neuralyzer",
      "interaction_threshold_qty": 30,
      "interaction_keep": 5
    }
  ]
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| long_term_memory_id | "neuralyzer" | required | Strategy identifier |
| interaction_threshold_qty | int | -1 | Trigger after this many interactions (-1 = never) |
| interaction_threshold_tokens | int | -1 | Trigger after this many tokens (-1 = never) |
| interaction_keep | int | 0 | Interactions to keep after erasure |

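
As a rough sketch of the erasure step, the Python below drops everything except the newest interaction_keep interactions once the interaction count exceeds the threshold. The `neuralyze` function and the list-of-lists representation are hypothetical, only the interaction-count trigger is modelled, and "exceeds" is read as strictly greater than.

```python
def neuralyze(
    interactions: list[list[str]],
    interaction_threshold_qty: int = -1,
    interaction_keep: int = 0,
) -> list[list[str]]:
    """Erase old interactions once the count exceeds the threshold."""
    # -1 = never trigger, per the parameter table.
    if interaction_threshold_qty < 0 or len(interactions) <= interaction_threshold_qty:
        return interactions
    # Handle 0 explicitly: interactions[-0:] would return the whole list.
    if interaction_keep <= 0:
        return []
    return interactions[-interaction_keep:]
```

With the configuration above (threshold 30, keep 5), the 31st interaction would leave only the five most recent ones in the session.
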
Combining strategies
You can combine short-term and long-term strategies:

```json
{
  "profile": {
    "short_term_memory_strategy": [
      {
        "memory_strategy_id": "max_tokens",
        "memory_max_tokens": 8000
      }
    ],
    "long_term_memory_strategy": [
      {
        "long_term_memory_id": "summarizer",
        "interaction_threshold_qty": 20,
        "cod_max_loops": 3,
        "interaction_keep": 2
      }
    ]
  }
}
```

Flow: When the long-term strategy triggers, it summarizes the conversation and keeps the last 2 interactions. The short-term strategy then limits what the LLM sees to 8,000 tokens from those kept interactions.

Persistence strategies
After each inference, the conversation is persisted according to persistence_strategy:

| Value | Behavior |
|---|---|
| INCREMENTAL | Append new messages to the existing session (default) |
| FLUSH | Replace the session with the current conversation |
| EPHEMERAL | Do not persist; the session is lost after inference |

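
The three values can be compared side by side with a small Python sketch. It models a stored session as a plain list of messages. The `persist` function and its arguments are hypothetical, and for EPHEMERAL the sketch simply writes nothing, which is one reading of "do not persist".

```python
from enum import Enum


class PersistenceStrategy(Enum):
    INCREMENTAL = "INCREMENTAL"
    FLUSH = "FLUSH"
    EPHEMERAL = "EPHEMERAL"


def persist(
    stored: list[str],
    conversation: list[str],
    new_messages: list[str],
    strategy: PersistenceStrategy = PersistenceStrategy.INCREMENTAL,
) -> list[str]:
    """Return what the stored session would contain after one inference."""
    if strategy is PersistenceStrategy.INCREMENTAL:
        # Append only the messages produced by this inference.
        return stored + new_messages
    if strategy is PersistenceStrategy.FLUSH:
        # Overwrite the session with the conversation as it stands now.
        return list(conversation)
    # EPHEMERAL: nothing is written (assumption: the store is left untouched).
    return stored
```
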
Related pages
- Memory Strategies reference: full API reference
- Agent Configuration: where to configure memory