- **OpenAPI** - OpenAPI spec validation
- **ThinkTag** - Chain-of-thought `<think>` tag filtering

### Core Classes

- **Langertha::Response** - LLM response with metadata, stringifies to
  content. `tool_calls` is an `ArrayRef[Langertha::ToolCall]` (single
  source of truth for emitted tool calls, both native and synthetic).
- **Langertha::Stream** / **Stream::Chunk** - Streaming iteration.
  `Stream::Chunk` carries an optional `tool_calls` field; helper
  `aggregate_tool_calls(\@chunks)` on `Role::Chat` collects them.
- **Langertha::ToolCall** - canonical tool invocation produced by an
  LLM (with `synthetic` flag for forced-tool fallbacks).
- **Langertha::ToolChoice** - canonical tool-selection policy with
  per-provider serializers (`to_openai`, `to_anthropic`, `to_gemini`,
  `to_perplexity`).
- **Langertha::Tool** - canonical tool definition with cross-provider
  serializers (`to_openai`, `to_anthropic`, `to_gemini`, `to_mcp`,
  `to_json_schema`) and accepting constructors (`from_openai`,
  `from_anthropic`, `from_mcp`, `from_gemini`, `from_hash`).
- **Langertha::Content::Image** - provider-agnostic vision input.
- **Langertha::Request::HTTP** - Internal HTTP request wrapper
- **Langertha::Raider** - Autonomous agent (see below)
- **Langertha::Raider::Result** - Raid result with type handling

### Tool & Structured-Output Flow

Three inputs combine: caller arguments (`tools`/`tool_choice`/
`response_format`/`mcp_servers`), method (`chat_f` single-turn vs
`chat_with_tools_f` multi-turn loop), and engine caps. `chat_f`
auto-rewrites between forms when the wire reality demands it; every
case lands as a `Langertha::ToolCall` on `Response.tool_calls`.

| Caller passes | Engine has | What `chat_f` does |
|---|---|---|
| `tools` only (no choice) | `tools_native` | forwarded to wire (per-provider via `Tool->to_X`) |
| `tools` only | only `tools_hermes` | only via `chat_with_tools_f` (XML in prompt) |
| `tools` + `tool_choice={type=>'tool',name=>X}` | `tool_choice_named` | native forced-name |
| `tools` + `tool_choice={type=>'tool',name=>X}` | only `response_format_json_schema` (Perplexity) | **auto-rewrite**: clears tools/choice, sets `response_format=json_schema` from tool's schema; loose-parses content; attaches synthetic `ToolCall` |
| `response_format=json_*` | `response_format_json_*` | native (Gemini→`responseSchema`, Ollama→`format`) |
| `response_format=json_*` | only `tool_choice_named` (Anthropic) | engine-internal: synth tool + forced choice; `tool_use` input lifted into `Response.content` as JSON |
| `mcp_servers` set | `tools_native` or `tools_hermes` | use `chat_with_tools_f` for multi-turn loop |

Per-provider wire payload: OpenAI `tools=[{type=>'function',...}]` /
`tool_calls` in `choices[0].message`; Anthropic `tools=[{name,input_schema}]`
/ `tool_use` blocks in `content[]`; Gemini `functionDeclarations` +
`toolConfig.functionCallingConfig` / `functionCall` parts; Ollama
OpenAI-shape natively. Hermes engines (NousResearch, AKI, AKIOpenAI)
inject tools as XML into the system prompt and parse `<tool_call>`
tags from the model's text output.
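
To illustrate the Hermes path, here is a minimal sketch of pulling `<tool_call>` tags out of model text. It is not Langertha's parser: `extract_tool_calls` is a hypothetical helper, and the JSON body shape (`name` plus `arguments`) is an assumption based on the usual Hermes convention.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: find every <tool_call>...</tool_call> block in the
# text and decode its JSON body. Malformed or nameless bodies are skipped.
sub extract_tool_calls {
    my ($text) = @_;
    my @calls;
    while ($text =~ m{<tool_call>\s*(.*?)\s*</tool_call>}sg) {
        my $call = eval { decode_json($1) } or next;
        push @calls, $call
            if ref $call eq 'HASH' && defined $call->{name};
    }
    return \@calls;
}

my $text = <<'END';
I will check the weather first.
<tool_call>
{"name": "get_weather", "arguments": {"city": "Oslo"}}
</tool_call>
END

my $calls = extract_tool_calls($text);
printf "%s(%s)\n", $_->{name}, $_->{arguments}{city} for @$calls;
```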

## Raider (Autonomous Agent)

`Langertha::Raider` wraps an engine with conversation history, MCP tools, and a multi-turn tool-calling loop.

### Key Features

- **Conversation history** persisted across raids (only user + final assistant messages)
- **Session history** - full archive including tool calls (never compressed)
- **Auto-compression** - summarizes history when token threshold exceeded
- **Metrics** - tracks raids, iterations, tool calls, timing
- **Langfuse integration** - traces, spans, generations per raid
- **Hermes tool calling** - for models without native tool support
- **Mid-raid injection** - `inject()` and `on_iteration` callback
- **Self-tools** (virtual) - `raider_mcp => 1` enables agent-controlled tools:
  - `raider_ask_user` - ask user questions (sync callback or async pause)
  - `raider_pause` - pause execution for later resumption
  - `raider_abort` - abort the raid
  - `raider_wait` - wait N seconds
  - `raider_wait_for` - wait for external condition
  - `raider_session_history` - query/search session history
  - `raider_manage_mcps` - list/activate/deactivate catalog MCPs
  - `raider_switch_engine` - switch between catalog engines (requires `engine_catalog`)
- **Inline tools** - `tools => [...]` for quick tool definitions without MCP server setup
- **MCP catalog** - `mcp_catalog => {...}` for dynamic MCP server management
- **Engine catalog** - `engine_catalog => {...}` for runtime engine switching via `switch_engine`/`reset_engine`
- **Embedding search** - semantic session history search via cosine similarity
- **Result objects** - `raid_f` returns `Langertha::Raider::Result` (stringifies for backward compat)
- **Continuation** - `respond_f` resumes after question/pause results
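
The embedding search item relies on cosine similarity. As a rough sketch of that idea only (the helper names and the `{ id, embedding }` entry shape are assumptions, not Raider's internal data model), ranking history entries against a query vector could look like this:

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# Cosine similarity of two equal-length numeric vectors (array refs).
# Returns 0 when either vector has zero length in the geometric sense.
sub cosine_similarity {
    my ($u, $v) = @_;
    die "vectors differ in length" unless @$u == @$v;
    my $dot = sum0(map { $u->[$_] * $v->[$_] } 0 .. $#$u);
    my $nu  = sqrt(sum0(map { $_ ** 2 } @$u));
    my $nv  = sqrt(sum0(map { $_ ** 2 } @$v));
    return 0 if $nu == 0 || $nv == 0;
    return $dot / ($nu * $nv);
}

# Hypothetical helper: score each entry against the query vector and
# return the entries sorted from most to least similar.
sub rank_by_similarity {
    my ($query_vec, $entries) = @_;
    return [
        sort { $b->{score} <=> $a->{score} }
        map  { +{ %$_, score => cosine_similarity($query_vec, $_->{embedding}) } }
        @$entries
    ];
}

my $ranked = rank_by_similarity(
    [ 1, 0 ],
    [
        { id => 'msg-1', embedding => [ 0, 1 ] },
        { id => 'msg-2', embedding => [ 1, 1 ] },
        { id => 'msg-3', embedding => [ 2, 0 ] },
    ],
);
print join(', ', map { $_->{id} } @$ranked), "\n";
```

Because cosine similarity ignores vector length, `msg-3` ranks first here even though its embedding is twice as long as the query.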

### Raider API

```perl
my $result = await $raider->raid_f(@messages);  # Returns Result
my $result = $raider->raid(@messages);           # Sync wrapper

# Interactive self-tools
if ($result->is_question) {
    my $next = await $raider->respond_f($answer);
}

# Engine switching (programmatic API, NOT LLM-controlled)
$raider->switch_engine('smart');     # Switch to catalog engine
$raider->reset_engine;               # Back to default engine
my $engine = $raider->active_engine; # Current engine
my $info = $raider->engine_info;     # { name, class, model }
my $list = $raider->list_engines;    # All engines with status
```

## OOP Framework

Moose exclusively. All classes use `__PACKAGE__->meta->make_immutable`.
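
As an illustration of that convention, a minimal Moose class might look like the sketch below. `My::ToolCall` and its attributes are hypothetical and only loosely modeled on the description of `Langertha::ToolCall`; they are not the library's actual definition.

```perl
package My::ToolCall;    # hypothetical class, not part of Langertha
use Moose;               # also enables strict and warnings

has name      => (is => 'ro', isa => 'Str',     required => 1);
has arguments => (is => 'ro', isa => 'HashRef', default  => sub { {} });
has synthetic => (is => 'ro', isa => 'Bool',    default  => 0);

# Freezes the metaclass and inlines the constructor for speed.
__PACKAGE__->meta->make_immutable;

package main;

my $call = My::ToolCall->new(
    name      => 'get_weather',
    arguments => { city => 'Oslo' },
);
print $call->name, "\n";
```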

## Async

`Future::AsyncAwait` (>= 0.66) for all async methods. IO::Async for event loop.
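
For readers new to this style, here is a minimal sketch of an `async sub` in the same shape as the `_f` methods. `fetch_answer_f` is a hypothetical stand-in that resolves immediately; a real engine call would return a pending `Future` driven by the event loop.

```perl
use strict;
use warnings;
use Future;
use Future::AsyncAwait;

# Hypothetical async method: awaits an already-completed Future and
# post-processes its value. Calling it returns a Future.
async sub fetch_answer_f {
    my ($prompt) = @_;
    my $reply = await Future->done("echo: $prompt");
    return uc $reply;
}

# ->get blocks until the Future is ready, like a sync wrapper would.
my $result = fetch_answer_f('hello')->get;
print "$result\n";
```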

## MCP (Model Context Protocol)

- `Net::Async::MCP` - MCP client
- `MCP::Server` - MCP server (tool definitions)
- Tool definitions use `inputSchema` (camelCase) in MCP format
- Each engine's `format_tools()` converts to provider format
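
The conversion from MCP format to provider format is mostly a matter of renaming keys. The sketch below shows the three wire shapes described earlier, using hypothetical `mcp_to_*` helpers rather than the real `format_tools()` implementations:

```perl
use strict;
use warnings;

# An MCP-style tool definition; note the camelCase inputSchema key.
my $mcp_tool = {
    name        => 'get_weather',
    description => 'Look up current weather for a city',
    inputSchema => {
        type       => 'object',
        properties => { city => { type => 'string' } },
        required   => ['city'],
    },
};

# OpenAI shape: schema lives under function.parameters.
sub mcp_to_openai {
    my ($t) = @_;
    return {
        type     => 'function',
        function => {
            name        => $t->{name},
            description => $t->{description},
            parameters  => $t->{inputSchema},
        },
    };
}

# Anthropic shape: flat hash with snake_case input_schema.
sub mcp_to_anthropic {
    my ($t) = @_;
    return {
        name         => $t->{name},
        description  => $t->{description},
        input_schema => $t->{inputSchema},
    };
}

# Gemini shape: all tools grouped under one functionDeclarations list.
sub mcp_to_gemini {
    my ($tools) = @_;
    return {
        functionDeclarations => [
            map { +{
                name        => $_->{name},
                description => $_->{description},
                parameters  => $_->{inputSchema},
            } } @$tools
        ],
    };
}

print join(', ', sort keys %{ mcp_to_anthropic($mcp_tool) }), "\n";
```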

## Testing

- `TEST_LANGERTHA_<ENGINE>_API_KEY` env vars for live tests
- Live tests cost real money, so be selective
- Unit tests in `t/00-75*.t`, live tests in `t/80-86*.t`
- Test framework: `Test2::Bundle::More`
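
One way to keep live tests opt-in is to check the engine's key before running anything. This is a sketch under assumptions: `live_api_key` is a hypothetical helper, and uppercasing the engine name to fill the `<ENGINE>` slot is a guess at the naming scheme.

```perl
use strict;
use warnings;

# Hypothetical helper: build the env var name for an engine and return
# its value, or undef when no key is configured.
sub live_api_key {
    my ($engine) = @_;
    my $var = sprintf 'TEST_LANGERTHA_%s_API_KEY', uc $engine;
    return $ENV{$var};
}

# Only engines with a non-empty key are eligible for live testing.
my @engines   = qw(openai anthropic);
my @available = grep { length(live_api_key($_) // '') } @engines;

print @available
    ? "live engines: @available\n"
    : "no live keys set, skipping live tests\n";
```

In a real test file the empty case would typically skip the whole file instead of printing a message.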