Evaluation Strategies

An evaluation strategy determines how the LLM’s response is interpreted after each inference call. It controls tool binding, response parsing, and the iteration loop.

Strategy overview

| Strategy ID | Class | Use case |
| --- | --- | --- |
| `one-shoot` | `OneShootEvaluationStrategy` | Single LLM call, no tools |
| `native` | `NativeToolsEvaluationStrategy` | Native function calling (OpenAI, Anthropic) |
| `raw` | `RawToolsEvaluationStrategy` | JSON extraction from text (models without native tool calling) |

`one-shoot`

The simplest strategy. Makes a single LLM call and returns the response. No tools are bound.

{
  "evaluation_strategy": {
    "evaluation_strategy_id": "one-shoot"
  }
}

Use this for:

- Simple Q&A agents
- Summarization
- Classification (when using shields)
- Any agent that doesn’t need tools

`native`

Uses the LLM’s native function-calling API (e.g., OpenAI `tool_calls`). Supports multi-step tool execution loops.

{
  "evaluation_strategy": {
    "evaluation_strategy_id": "native",
    "max_steps": 10,
    "max_concurrent_tools": 5,
    "tool_choice": "auto"
  }
}

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `evaluation_strategy_id` | `"native"` | required | Strategy identifier |
| `max_steps` | `int` | `10` | Maximum tool execution iterations |
| `max_concurrent_tools` | `int` | `5` | Maximum parallel tool calls per step |
| `tool_choice` | `string \| object` | `"auto"` | Tool selection mode |
| `decorators` | `Decorator[]` | `null` | Pluggable behavior extensions |
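
To make the interaction between `max_steps` and `max_concurrent_tools` concrete, here is a minimal Python sketch of a bounded tool loop. It is an illustration under assumptions, not the actual `NativeToolsEvaluationStrategy` code: `run_tool_loop`, the `call_llm` callable, the reply shape, and the behavior on reaching the step limit (returning `None`) are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tool_loop(call_llm, tools, max_steps=10, max_concurrent_tools=5):
    """Hypothetical loop: call the LLM, run requested tools, repeat.

    Assumed reply shape (not from the docs):
      {"content": str, "tool_calls": [{"name": str, "args": dict}, ...]}
    Returns (final_content, steps_used); final_content is None if the
    step limit is reached (an assumption about limit handling).
    """
    history = []
    for step in range(max_steps):
        reply = call_llm(history)
        history.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            # No tool requested: treat the reply as the final answer.
            return reply["content"], step
        # Run this step's tool calls in parallel, capped by the limit.
        with ThreadPoolExecutor(max_workers=max_concurrent_tools) as pool:
            results = list(
                pool.map(lambda c: tools[c["name"]](**c["args"]), calls)
            )
        history.append({"tool_results": results})
    return None, max_steps
```

In this sketch each iteration counts as one step, so a run that calls one tool and then answers uses a single step.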

`tool_choice` values

| Value | Behavior |
| --- | --- |
| `"auto"` | LLM decides whether to call a tool |
| `"required"` | LLM must call at least one tool |
| `"none"` | LLM must not call any tools |
| `ToolChoiceAllowedTools` | Restrict to a specific set of tools |
| `ToolChoiceForcedTool[]` | Force specific tool calls |

System prompt injection

The native strategy automatically injects the following block into the system prompt:

# System limitations
- Max steps allowed: 10
- Max concurrent tool executions: 5
- Current step: 0
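
As a rough illustration of how such a block could be produced from the strategy settings, the Python sketch below renders the same text from three values. The function name `render_system_limitations` is hypothetical, and whether the real strategy refreshes `Current step` on every iteration is an assumption.

```python
def render_system_limitations(
    max_steps: int = 10,
    max_concurrent_tools: int = 5,
    current_step: int = 0,
) -> str:
    """Render the limitations block shown above (hypothetical helper)."""
    return "\n".join(
        [
            "# System limitations",
            f"- Max steps allowed: {max_steps}",
            f"- Max concurrent tool executions: {max_concurrent_tools}",
            f"- Current step: {current_step}",
        ]
    )
```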

Decorators

Decorators extend the native strategy with additional capabilities:

`plan_mode` decorator

Adds `create_plan` and `patch_plan` tools. The agent creates and incrementally updates a structured plan as it works.

{
  "evaluation_strategy": {
    "evaluation_strategy_id": "native",
    "decorators": [
      {
        "decorator_id": "plan_mode",
        "plan_format": "markdown"
      }
    ]
  }
}

`raw`

Extracts tool calls from the LLM’s text output using JSON pattern matching. Use this with models that don’t support native function calling.

{
  "evaluation_strategy": {
    "evaluation_strategy_id": "raw",
    "max_steps": 10,
    "tool_schemas": [
      {
        "name": "search_web",
        "description": "Search the web for information.",
        "parameters": {
          "type": "object",
          "properties": {
            "query": { "type": "string", "description": "Search query" }
          },
          "required": ["query"]
        }
      }
    ]
  }
}

Fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `evaluation_strategy_id` | `"raw"` | required | Strategy identifier |
| `max_steps` | `int` | `10` | Maximum tool execution iterations |
| `max_concurrent_tools` | `int` | `5` | Maximum parallel tool calls per step |
| `tool_schemas` | `ToolSchema[]` | required | Tool definitions injected into the system prompt |
| `parse_regex_pattern` | `string` | JSON pattern | Regex for extracting JSON from text |

System prompt injection

The raw strategy injects tool instructions and the available tool list into the system prompt automatically. The LLM is instructed to respond with:

{
  "name": "search_web",
  "parameters": {
    "query": "capital of France"
  }
}
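
Once a response of this shape has been parsed, it still has to be matched to a tool. The Python sketch below shows one way to do that, assuming a hypothetical `handlers` mapping from tool name to a Python callable; `dispatch_tool_call` is not part of the documented API, and the error handling is an assumption.

```python
from typing import Any, Callable


def dispatch_tool_call(
    call: dict,
    tool_schemas: list,
    handlers: dict,
) -> Any:
    """Look up the named tool, check required parameters, and invoke it.

    `handlers` (tool name -> callable) is a hypothetical registry used
    only for this illustration.
    """
    schemas = {schema["name"]: schema for schema in tool_schemas}
    name = call.get("name")
    if name not in schemas or name not in handlers:
        raise ValueError(f"unknown tool: {name!r}")
    params = call.get("parameters") or {}
    required = schemas[name].get("parameters", {}).get("required", [])
    missing = [key for key in required if key not in params]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    handler: Callable[..., Any] = handlers[name]
    return handler(**params)
```

Raising on an unknown tool name or a missing required parameter keeps malformed model output from reaching a tool, though a real implementation might instead feed the error back to the LLM.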

Parsing fallback chain

The strategy tries multiple parsing approaches in order:

1. Direct `json.loads()`
2. Single-quote replacement
3. Brace block extraction
4. Regex pattern matching
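
The four stages above can be sketched in Python as follows. This is an approximation of the idea, not the `RawToolsEvaluationStrategy` source: the function names are hypothetical, and `ASSUMED_PATTERN` is a stand-in for the default of `parse_regex_pattern`, which this page does not spell out.

```python
import json
import re

# Assumption: matches a JSON-like object with at most one level of nesting.
ASSUMED_PATTERN = r"\{(?:[^{}]|\{[^{}]*\})*\}"


def _try_load(candidate):
    """Return the parsed object if `candidate` is a JSON object, else None."""
    try:
        value = json.loads(candidate)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


def parse_tool_call(text, pattern=ASSUMED_PATTERN):
    # 1. Direct json.loads() on the whole response.
    parsed = _try_load(text)
    if parsed is not None:
        return parsed
    # 2. Single-quote replacement (handles Python-style dict output).
    parsed = _try_load(text.replace("'", '"'))
    if parsed is not None:
        return parsed
    # 3. Brace block extraction: first "{" through last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        parsed = _try_load(text[start : end + 1])
        if parsed is not None:
            return parsed
    # 4. Regex pattern matching: try each candidate match in order.
    for match in re.finditer(pattern, text, re.DOTALL):
        parsed = _try_load(match.group(0))
        if parsed is not None:
            return parsed
    return None
```

Ordering the stages from strictest to loosest means well-formed output is never reinterpreted, and the regex stage only runs on text the cheaper stages could not handle.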

Structured output

All strategies support structured output via `with_structured_output()`:

{
  "evaluation_strategy": {
    "evaluation_strategy_id": "one-shoot",
    "structured_output": {
      "method": "json_schema",
      "include_raw": false,
      "json_schema": {
        "type": "object",
        "properties": {
          "answer": { "type": "string" },
          "confidence": { "type": "number" }
        },
        "required": ["answer", "confidence"]
      }
    }
  }
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `method` | `string` | `"json_schema"` | Structured output method |
| `include_raw` | `bool` | `false` | Include the raw LLM response alongside the parsed output |
| `json_schema` | `dict` | `{}` | JSON Schema for the expected output |
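
To show what a `json_schema` like the one above constrains, here is a small Python sketch that checks a parsed output against the schema's `required` list and top-level property types. It is a simplified stand-in written for illustration (the helper `check_output` is hypothetical and covers only a fraction of JSON Schema); it does not describe how the strategies validate output internally.

```python
# Mapping from JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_output(output: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the output passes."""
    errors = []
    for key in schema.get("required", []):
        if key not in output:
            errors.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in output or "type" not in spec:
            continue
        value = output[key]
        expected = JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and spec["type"] in ("number", "integer")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"wrong type for field: {key}")
    return errors
```

With the schema from the example, an output such as `{"answer": "Paris", "confidence": 0.9}` passes, while one without `confidence` is reported as missing a required field.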

Related pages

- Tools & Integrations: how tools are connected
- Agent Configuration: where to configure strategies
