# Langertha - CLAUDE.md
## Overview
Langertha is a Perl LLM framework supporting 15+ engines via composable Moose roles. It provides chat, tool calling (MCP), streaming, embeddings, transcription, and an autonomous agent (Raider).
## Build System
Uses `[@Author::GETTY]` Dist::Zilla plugin bundle.
```bash
dzil test # Build and test
prove -l t/ # Run tests directly
prove -lv t/60_tool_calling.t # Single test, verbose
```
## Architecture
### Engine Hierarchy (lib/Langertha/Engine/)
```
Engine::Remote                  url required, JSON + HTTP
│
├── Engine::AnthropicBase       /v1/messages format, x-api-key auth, SSE streaming
│   │
│   ├── Anthropic               Claude models, thinking blocks, tool_use
│   ├── MiniMaxAnthropic        MiniMax via legacy /anthropic/v1 shim endpoint
│   └── LMStudioAnthropic       LM Studio Anthropic-compatible endpoint
│
├── Engine::OpenAIBase          /chat/completions format, Bearer auth, SSE streaming
│   │
│   │   Cloud providers (url has default, api_key from env)
│   ├── OpenAI                  gpt-4o, embeddings, whisper transcription, structured output
│   ├── DeepSeek                deepseek-chat/reasoner, structured output
│   ├── Groq                    ultra-fast inference, whisper transcription, structured output
│   ├── Mistral                 EU-hosted, embeddings, structured output
│   ├── MiniMax                 Shanghai (default), 1M context window, M2.7
│   ├── NousResearch            Hermes models, <tool_call> XML tool format
│   ├── Cerebras                wafer-scale chips, fastest inference
│   ├── OpenRouter              meta-provider, 300+ models, provider/model format
│   ├── Replicate               thousands of open-source models, owner/model format
│   ├── HuggingFace             Inference Providers, org/model format
│   ├── Perplexity              search-augmented, citations; NO tool calling
│   ├── AKIOpenAI               EU/Germany, GDPR-compliant
│   ├── TSystems                T-Systems AIFS / LLM Hub, T-Cloud Germany + EU hyperscaler models
│   ├── Scaleway                EU-hosted Generative APIs, drop-in OpenAI replacement
│   │
│   │   Self-hosted (url required, no api_key)
│   ├── OllamaOpenAI            Ollama /v1 endpoint, embeddings
│   ├── vLLM                    high-throughput inference, single-model server
│   ├── SGLang                  SGLang OpenAI-compatible server, fast structured output
│   ├── LlamaCpp                llama.cpp server, embeddings
│   └── LMStudioOpenAI          LM Studio's OpenAI-compatible endpoint
│
├── Engine::TranscriptionBase   Transcription-only OpenAI-shape base (no chat/tools)
│   │
│   └── Whisper                 self-hosted faster-whisper-server etc.
│
│   Non-OpenAI formats (own request/response handling)
├── Gemini                      ?key= auth, functionDeclarations, thought parts
├── Ollama                      native /api/chat, NDJSON streaming, OpenAPI spec
├── AKI                         key-in-body auth, EU/Germany, /api/call/{model}
└── LMStudio                    LM Studio native API (non-OpenAI/non-Anthropic)
```
**LMStudio family**: LM Studio servers can expose three different
endpoints: `LMStudio` is the native API, `LMStudioOpenAI` is the
OpenAI-compatible endpoint, and `LMStudioAnthropic` is the
Anthropic-compatible endpoint. Pick whichever your LM Studio server is
configured to serve.
**AKI family**: `AKI` is the official AKI.IO native API (changes
often, breaks). `AKIOpenAI` is the more stable OpenAI-compatible
endpoint, but it sometimes lacks features. Both are provided so users
can pick their tradeoff; we don't endorse one over the other.
**Whisper / `->whisper` accessor**: `Whisper` no longer extends
`OpenAI` (since the post-0.404 refactor). It extends the new
`TranscriptionBase`, so it has only transcription functionality: no
chat, tools, embeddings, or image generation. To get a transcription
handle from an existing `OpenAI` instance, use the `whisper` attribute;
it returns a `TranscriptionBase` configured with the parent's
`api_key` and `url`, so credentials don't have to be restated.
### Roles (lib/Langertha/Role/)
- **Capabilities**: `engine_capabilities` registry + `supports($cap)`
  helper. Composed by `Chat` (and indirectly via every other capability
  role). The role→cap-flag mapping lives in one map in `Role::Capabilities`;
  engines override via `around engine_capabilities` for wire-reality
  corrections (e.g. clearing `tool_choice_named` on string-only providers).
- **Chat**: sync/async chat (`simple_chat`, `simple_chat_f`); also
  `chat_f(messages => [...], tools => [...], tool_choice => ...,
  response_format => ...)` for single-turn structured calls.
- **Tools**: MCP tool calling loop (`chat_with_tools_f`, `mcp_servers`)
- **HermesTools**: XML-tag tool calling for models without native support
- **Streaming**: SSE / NDJSON streaming responses
- **Embedding**: Vector embeddings (`simple_embedding`)
- **Transcription**: Audio transcription
- **HTTP**: HTTP transport (sync + async via IO::Async)
- **JSON**: JSON encoding/decoding (`$self->json->encode/decode`)
- **SystemPrompt**: System prompt management
- **Temperature**, **ResponseSize**, **ContextSize**, **Seed**: Generation parameters
- **ResponseFormat**: JSON mode / structured output, plus
  `$self->decode_loose_json($text)` for tolerant parsing of
  prose-wrapped or fenced JSON output (overridable per engine)
- **Models**: Model selection and defaults
- **Langfuse**: Observability (traces, spans, generations)
- **OpenAICompatible**: OpenAI-format request/response handling
- **OpenAPI**: OpenAPI spec validation
- **ThinkTag**: Chain-of-thought `<think>` tag filtering
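The tolerant parsing that `decode_loose_json` is described as doing can be pictured with a minimal sketch. This is not Langertha's implementation: `loose_json_decode` is a hypothetical standalone helper, and scanning for a start bracket and calling `decode_prefix` is only an assumption about one reasonable way to recover JSON from prose-wrapped or fenced output. It uses nothing beyond core `JSON::PP`.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper (NOT Langertha's decode_loose_json): return the first
# JSON object or array embedded in $text, or undef if none parses.
sub loose_json_decode {
    my ($text) = @_;
    my $json = JSON::PP->new;

    # Try every '{' or '[' as a possible start of a JSON value.
    while ($text =~ /[\{\[]/g) {
        my $start = pos($text) - 1;

        # decode_prefix parses one value and ignores trailing garbage
        # (closing fences, follow-up prose); it dies on invalid input.
        my ($data) = eval { $json->decode_prefix(substr($text, $start)) };
        return $data if ref $data;
    }
    return;
}

# A typical chatty reply: prose, a fenced JSON block, more prose.
my $fence = '`' x 3;
my $reply = "Sure, here it is:\n${fence}json\n"
          . "{\"city\": \"Oslo\", \"temp_c\": 4}\n"
          . "$fence\nAnything else?";

my $data = loose_json_decode($reply);
print "$data->{city}: $data->{temp_c}\n";   # prints "Oslo: 4"
```

Because the scan retries at each later bracket, stray brackets in the surrounding prose (for example `[note]`) are skipped rather than treated as fatal.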
### Core Classes
- **Langertha::Response**: LLM response with metadata, stringifies to
  content. `tool_calls` is an `ArrayRef[Langertha::ToolCall]` (single
  source of truth for emitted tool calls, native and synthetic).
- **Langertha::Stream** / **Stream::Chunk**: Streaming iteration.
  `Stream::Chunk` carries an optional `tool_calls` field; helper
  `aggregate_tool_calls(\@chunks)` on `Role::Chat` collects them.
- **Langertha::ToolCall**: canonical tool invocation produced by an
  LLM (with `synthetic` flag for forced-tool fallbacks).
- **Langertha::ToolChoice**: canonical tool-selection policy with
  per-provider serializers (`to_openai`, `to_anthropic`, `to_gemini`,
  `to_perplexity`).
- **Langertha::Tool**: canonical tool definition with cross-provider
  serializers (`to_openai`, `to_anthropic`, `to_gemini`, `to_mcp`,
  `to_json_schema`) and accepting constructors (`from_openai`,
  `from_anthropic`, `from_mcp`, `from_gemini`, `from_hash`).
- **Langertha::Content::Image**: provider-agnostic vision input.
- **Langertha::Request::HTTP**: Internal HTTP request wrapper
- **Langertha::Raider**: Autonomous agent (see below)
- **Langertha::Raider::Result**: Raid result with type handling
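To make the idea of cross-provider serializers concrete, here is a small sketch of one tool definition rendered into three wire shapes. It deliberately does not use `Langertha::Tool`: the plain-hash tool and the `tool_as_*` functions are hypothetical stand-ins, and only the output shapes (OpenAI `type => 'function'`, Anthropic `name`/`input_schema`, Gemini `functionDeclarations`) follow the public provider formats.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical neutral tool description as a plain hash; the real
# Langertha::Tool is an object with its own constructors.
my $tool = {
    name         => 'get_weather',
    description  => 'Look up the current weather for a city',
    input_schema => {
        type       => 'object',
        properties => { city => { type => 'string' } },
        required   => ['city'],
    },
};

# OpenAI shape: tools => [ { type => 'function', function => {...} } ]
sub tool_as_openai {
    my ($t) = @_;
    return {
        type     => 'function',
        function => {
            name        => $t->{name},
            description => $t->{description},
            parameters  => $t->{input_schema},
        },
    };
}

# Anthropic shape: tools => [ { name, description, input_schema } ]
sub tool_as_anthropic {
    my ($t) = @_;
    return {
        name         => $t->{name},
        description  => $t->{description},
        input_schema => $t->{input_schema},
    };
}

# Gemini shape: tools => [ { functionDeclarations => [ {...} ] } ]
sub tool_as_gemini {
    my ($t) = @_;
    return {
        functionDeclarations => [ {
            name        => $t->{name},
            description => $t->{description},
            parameters  => $t->{input_schema},
        } ],
    };
}

# Print the Anthropic request fragment as an example.
my $json = JSON::PP->new->canonical->pretty;
print $json->encode({ tools => [ tool_as_anthropic($tool) ] });
```

The point of a canonical definition is that the JSON Schema is written once and only the envelope around it changes per provider.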
### Tool & Structured-Output Flow
Three inputs combine: caller arguments (`tools`/`tool_choice`/
`response_format`/`mcp_servers`), method (`chat_f` single-turn vs
`chat_with_tools_f` multi-turn loop), and engine caps. `chat_f`
auto-rewrites between forms when the wire reality demands it; every
case lands as a `Langertha::ToolCall` on `Response.tool_calls`.
| Caller passes | Engine has | What `chat_f` does |
|---|---|---|
| `tools` only (no choice) | `tools_native` | forwarded to wire (per-provider via `Tool->to_X`) |
| `tools` only | only `tools_hermes` | only via `chat_with_tools_f` (XML in prompt) |
| `tools` + `tool_choice={type=>'tool',name=>X}` | `tool_choice_named` | native forced-name |
| `tools` + `tool_choice={type=>'tool',name=>X}` | only `response_format_json_schema` (Perplexity) | **auto-rewrite**: clears tools/choice, sets `response_format=json_schema` from tool's schema; loose-parses content; attaches synthetic `ToolCall` |
| `response_format=json_*` | `response_format_json_*` | native (Gemini→`responseSchema`, Ollama→`format`) |
| `response_format=json_*` | only `tool_choice_named` (Anthropic) | engine-internal: synth tool + forced choice; `tool_use` input lifted into `Response.content` as JSON |
| `mcp_servers` set | `tools_native` or `tools_hermes` | use `chat_with_tools_f` for multi-turn loop |
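The table above can be read as a small decision function. The sketch below models it in plain Perl under stated assumptions: `plan_chat_f` and its return strings are hypothetical and are not part of Langertha, `response_format_json_schema` stands in for the `response_format_json_*` family, and the `unsupported` fallbacks are guesses for combinations the table does not list.

```perl
use strict;
use warnings;

# Hypothetical model of the routing table; returns a label for the path taken.
# $args: caller arguments, $caps: hashref of engine capability flags.
sub plan_chat_f {
    my ($args, $caps) = @_;
    my $choice = $args->{tool_choice};
    my $named  = ref($choice) eq 'HASH' && ($choice->{type} // '') eq 'tool';

    # MCP servers imply the multi-turn loop, native or Hermes-style.
    if ($args->{mcp_servers}) {
        return 'use chat_with_tools_f loop'
            if $caps->{tools_native} || $caps->{tools_hermes};
        return 'unsupported';
    }

    # Forced tool by name: native if possible, else rewrite to a schema.
    if ($args->{tools} && $named) {
        return 'native forced-name' if $caps->{tool_choice_named};
        return 'rewrite to response_format=json_schema'
            if $caps->{response_format_json_schema};
        return 'unsupported';
    }

    # Tools without a forced choice.
    if ($args->{tools}) {
        return 'forward tools natively'     if $caps->{tools_native};
        return 'use chat_with_tools_f loop' if $caps->{tools_hermes};
        return 'unsupported';
    }

    # Structured output: native, else a synthetic tool with forced choice.
    if ($args->{response_format}) {
        return 'native response_format'
            if $caps->{response_format_json_schema};
        return 'synthetic tool + forced choice' if $caps->{tool_choice_named};
        return 'unsupported';
    }

    return 'plain chat';
}

# An engine that only supports JSON-schema output (the Perplexity row).
my %schema_only = (response_format_json_schema => 1);
my $plan = plan_chat_f(
    {
        tools       => [ { name => 'extract' } ],
        tool_choice => { type => 'tool', name => 'extract' },
    },
    \%schema_only,
);
print "$plan\n";   # prints "rewrite to response_format=json_schema"
```

Whatever path is chosen, the document's invariant is that the result surfaces as a `Langertha::ToolCall` on `Response.tool_calls`; the sketch only covers the routing decision, not that final step.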
Per-provider wire payload: OpenAI `tools=[{type=>'function',...}]` /
`tool_calls` in `choices[0].message`; Anthropic `tools=[{name,input_schema}]`
/ `tool_use` blocks in `content[]`; Gemini `functionDeclarations` +
`toolConfig.functionCallingConfig` / `functionCall` parts; Ollama
OpenAI-shape natively. Hermes engines (NousResearch, AKI, AKIOpenAI)