Memory

Alquimia has two memory layers: short-term (what the LLM sees in the current prompt) and long-term (what gets summarized or erased when context grows too large).

Short-term memory

Short-term memory controls which messages from the conversation history are included in the LLM prompt. Without a strategy, all messages are included — which can exceed the model’s context window.

`max_tokens` strategy

Walks the conversation backwards from the most recent message, grouping messages by human turn (an Interaction), and keeps interactions until the token budget is exhausted.

{
  "short_term_memory_strategy": [
    {
      "memory_strategy_id": "max_tokens",
      "memory_max_tokens": 8000
    }
  ]
}

Parameter	Type	Default	Description
`memory_strategy_id`	`"max_tokens"`	required	Strategy identifier
`memory_max_tokens`	`int`	`10000`	Maximum tokens to include. `-1` = include nothing
`conditions`	`string[]`	`[]`	Empathy-style conditions to filter which interactions to include
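The backwards walk described above can be sketched in a few lines of Python. This is an illustration only, not Alquimia's implementation: the message shape, the `count_tokens` word counter, and the helper names are hypothetical, and it assumes an interaction is either kept whole or dropped.

```python
from typing import Dict, List

# Hypothetical message shape: {"role": "human" | "ai", "content": "..."}
Message = Dict[str, str]


def count_tokens(message: Message) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(message["content"].split())


def group_by_human_turn(history: List[Message]) -> List[List[Message]]:
    """Split the history into interactions, each starting at a human message."""
    interactions: List[List[Message]] = []
    for message in history:
        if message["role"] == "human" or not interactions:
            interactions.append([])
        interactions[-1].append(message)
    return interactions


def select_by_max_tokens(history: List[Message], max_tokens: int) -> List[Message]:
    """Keep the most recent whole interactions that fit in the token budget."""
    kept: List[List[Message]] = []
    remaining = max_tokens
    # Walk from the newest interaction to the oldest.
    for interaction in reversed(group_by_human_turn(history)):
        cost = sum(count_tokens(m) for m in interaction)
        if cost > remaining:
            break  # budget exhausted: drop this and all older interactions
        kept.append(interaction)
        remaining -= cost
    # Restore chronological order before flattening.
    return [m for interaction in reversed(kept) for m in interaction]
```

With a budget of `-1`, no interaction ever fits, so the sketch returns an empty list, which matches the "include nothing" behavior in the table.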

Long-term memory

Long-term memory strategies trigger when the conversation grows beyond a threshold. They either summarize the history or erase it, keeping only the most recent interactions.

Trigger conditions

All long-term strategies share these trigger fields:

Field	Description
`input_tokens_threshold`	Trigger when the LLM’s input token count exceeds this value
`interaction_threshold_qty`	Trigger when the number of human turns exceeds this value
`interaction_threshold_tokens`	Trigger when total interaction tokens exceed this value
`on_tools_success_threshold`	Trigger when any of these tool names succeed
`on_tools_error_threshold`	Trigger when any of these tool names error
`interaction_keep`	Number of most-recent interactions to keep after the strategy runs
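As a rough mental model, the trigger fields can be read as independent checks. The Python sketch below assumes that any single satisfied condition fires the strategy and that `-1` disables a numeric threshold; neither assumption is confirmed by this page, and the `Triggers` and `ConversationStats` classes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Triggers:
    # Field names mirror the table above; -1 means "never" (assumed convention).
    input_tokens_threshold: int = -1
    interaction_threshold_qty: int = -1
    interaction_threshold_tokens: int = -1
    on_tools_success_threshold: List[str] = field(default_factory=list)
    on_tools_error_threshold: List[str] = field(default_factory=list)


@dataclass
class ConversationStats:
    # Hypothetical snapshot of the conversation at evaluation time.
    input_tokens: int = 0
    interaction_qty: int = 0
    interaction_tokens: int = 0
    tools_succeeded: List[str] = field(default_factory=list)
    tools_errored: List[str] = field(default_factory=list)


def exceeds(value: int, threshold: int) -> bool:
    """True when the threshold is enabled and the value is strictly above it."""
    return threshold >= 0 and value > threshold


def should_trigger(t: Triggers, s: ConversationStats) -> bool:
    """Fire when any one condition holds (OR semantics are an assumption)."""
    return (
        exceeds(s.input_tokens, t.input_tokens_threshold)
        or exceeds(s.interaction_qty, t.interaction_threshold_qty)
        or exceeds(s.interaction_tokens, t.interaction_threshold_tokens)
        or any(name in t.on_tools_success_threshold for name in s.tools_succeeded)
        or any(name in t.on_tools_error_threshold for name in s.tools_errored)
    )
```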

`summarizer` strategy

Uses Chain of Density (CoD) summarization: iteratively compresses the conversation into a progressively denser summary over `cod_max_loops` iterations.

{
  "long_term_memory_strategy": [
    {
      "long_term_memory_id": "summarizer",
      "interaction_threshold_qty": 20,
      "interaction_threshold_tokens": 20000,
      "cod_max_loops": 5,
      "interaction_keep": 3,
      "instructions": "Focus on action items and decisions made.",
      "knowledge_base": {
        "collection_id": "session-summaries",
        "search_mode": "always"
      }
    }
  ]
}

Parameter	Type	Default	Description
`long_term_memory_id`	`"summarizer"`	required	Strategy identifier
`interaction_threshold_qty`	`int`	`20`	Trigger after this many interactions
`interaction_threshold_tokens`	`int`	`20000`	Trigger after this many tokens
`cod_max_loops`	`int`	`5`	Number of CoD densification passes
`interaction_keep`	`int`	`0`	Interactions to keep after summarization
`instructions`	`string`	`null`	Custom summarization instructions
`knowledge_base`	`KnowledgeBase`	`null`	Store summaries in a vector store for RAG
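The shape of the CoD loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the summarizer's actual code: `densify` is a hypothetical callable standing in for the LLM call that rewrites the previous summary into a denser one, and interactions are represented as plain strings.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (source_text, previous_summary, instructions) -> denser summary
Densify = Callable[[str, str, Optional[str]], str]


def summarize_with_cod(
    interactions: List[str],
    densify: Densify,
    cod_max_loops: int = 5,
    interaction_keep: int = 0,
    instructions: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Summarize older interactions and return (summary, kept_interactions)."""
    keep = max(interaction_keep, 0)
    split = max(len(interactions) - keep, 0)
    to_summarize, kept = interactions[:split], interactions[split:]

    source = "\n".join(to_summarize)
    summary = ""
    # Each pass sees the source and the previous summary, and returns a denser rewrite.
    for _ in range(cod_max_loops):
        summary = densify(source, summary, instructions)
    return summary, kept
```

In a real deployment `densify` would prompt a model; passing it in as a parameter keeps the loop itself easy to test with a stub.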

`neuralyzer` strategy

Erases memory beyond a threshold. Simpler and faster than summarization — use when you don’t need to retain historical context.

{
  "long_term_memory_strategy": [
    {
      "long_term_memory_id": "neuralyzer",
      "interaction_threshold_qty": 30,
      "interaction_keep": 5
    }
  ]
}

Parameter	Type	Default	Description
`long_term_memory_id`	`"neuralyzer"`	required	Strategy identifier
`interaction_threshold_qty`	`int`	`-1`	Trigger after this many interactions (`-1` = never)
`interaction_threshold_tokens`	`int`	`-1`	Trigger after this many tokens (`-1` = never)
`interaction_keep`	`int`	`0`	Interactions to keep after erasure
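The erase-and-keep behavior can be illustrated with a short Python sketch. It covers only the `interaction_threshold_qty` trigger, treats interactions as opaque list items, and the `neuralyze` function name is hypothetical.

```python
from typing import List, TypeVar

T = TypeVar("T")


def neuralyze(
    interactions: List[T],
    interaction_threshold_qty: int = -1,
    interaction_keep: int = 0,
) -> List[T]:
    """Return the interactions that survive once the quantity threshold is exceeded."""
    # -1 means the trigger never fires, so the history is returned unchanged.
    if interaction_threshold_qty < 0 or len(interactions) <= interaction_threshold_qty:
        return list(interactions)
    # Triggered: erase everything except the most recent `interaction_keep` items.
    if interaction_keep <= 0:
        return []
    return list(interactions[-interaction_keep:])
```

With the configuration shown earlier (`interaction_threshold_qty` of 30 and `interaction_keep` of 5), the 31st interaction would cause the sketch to drop all but the last five.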

Combining strategies

You can combine short-term and long-term strategies:

{
  "profile": {
    "short_term_memory_strategy": [
      { "memory_strategy_id": "max_tokens", "memory_max_tokens": 8000 }
    ],
    "long_term_memory_strategy": [
      {
        "long_term_memory_id": "summarizer",
        "interaction_threshold_qty": 20,
        "cod_max_loops": 3,
        "interaction_keep": 2
      }
    ]
  }
}

Flow: When the long-term strategy triggers, it summarizes the conversation and keeps the last 2 interactions. The short-term strategy then limits what the LLM sees to 8,000 tokens from those kept interactions.

Persistence strategies

After each inference, the conversation is persisted according to `persistence_strategy`:

Value	Behavior
`INCREMENTAL`	Append new messages to the existing session (default)
`FLUSH`	Replace the session with the current conversation
`EPHEMERAL`	Do not persist — session is lost after inference
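The three values can be pictured with a small Python sketch that uses an in-memory dictionary as the session store. The `persist` function and the store layout are hypothetical, and the sketch assumes `EPHEMERAL` simply writes nothing rather than deleting an existing session.

```python
from enum import Enum
from typing import Dict, List


class PersistenceStrategy(Enum):
    INCREMENTAL = "INCREMENTAL"
    FLUSH = "FLUSH"
    EPHEMERAL = "EPHEMERAL"


def persist(
    store: Dict[str, List[str]],
    session_id: str,
    conversation: List[str],
    new_messages: List[str],
    strategy: PersistenceStrategy = PersistenceStrategy.INCREMENTAL,
) -> None:
    """Update the in-memory store according to the chosen strategy."""
    if strategy is PersistenceStrategy.INCREMENTAL:
        # Append only the messages produced by this inference.
        store.setdefault(session_id, []).extend(new_messages)
    elif strategy is PersistenceStrategy.FLUSH:
        # Replace whatever was stored with the current conversation.
        store[session_id] = list(conversation)
    # EPHEMERAL: write nothing (assumption: an existing session is left untouched).
```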

Memory Strategies reference — full API reference
Agent Configuration — where to configure memory
