.claude/skills/perl-ai-langertha/SKILL.md view on Meta::CPAN
<capabilities>
## Capability Queries
Every engine reports its capabilities via `Langertha::Role::Capabilities`
(composed by `Role::Chat`, so present on every engine):
```perl
$engine->supports('tool_choice_named') or die "engine cannot force named tool";
$engine->supports('response_format_json_schema') # safe to pass json_schema response_format
$engine->supports('streaming') # chat_stream_request wired up
$engine->supports('tools_native') # accepts a tools array on the wire
$engine->supports('tools_hermes') # Hermes XML-tag tool path
my $caps = $engine->engine_capabilities;
# { chat=>1, streaming=>1, tools_native=>1, tool_choice_named=>1,
# response_format_json_schema=>1, embedding=>1, transcription=>1, ... }
```
The flag set is derived from which capability roles the engine composes
(central role→flag map in `Role::Capabilities`); engines override via
`around engine_capabilities` for wire-reality corrections.
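
The derivation described above can be sketched in plain Perl. This is a minimal illustration, not Langertha's implementation: the `%ROLE_FLAGS` table and the helpers `capabilities_for`, `string_only_capabilities` and the free-standing `supports` are hypothetical stand-ins. Only the role and flag names are taken from the documentation.

```perl
use strict;
use warnings;

# Hypothetical role-to-flag table. The names mirror the documented flags,
# but the data structure itself is an assumption made for this sketch.
my %ROLE_FLAGS = (
    Chat           => ['chat'],
    Streaming      => ['streaming'],
    Tools          => [qw(tools_native tool_choice_auto tool_choice_any
                          tool_choice_none tool_choice_named)],
    HermesTools    => ['tools_hermes'],
    ResponseFormat => [qw(response_format_json_object
                          response_format_json_schema)],
    Embedding      => ['embedding'],
);

# Derive the flag set from the list of composed roles.
sub capabilities_for {
    my (@roles) = @_;
    my %caps;
    for my $role (@roles) {
        $caps{$_} = 1 for @{ $ROLE_FLAGS{$role} // [] };
    }
    return \%caps;
}

# Stand-in for an engine-level override: a provider that only accepts
# string tool_choice values clears the named-tool flag.
sub string_only_capabilities {
    my $caps = capabilities_for(@_);
    delete $caps->{tool_choice_named};
    return $caps;
}

# Free-function equivalent of a supports($cap) query.
sub supports {
    my ($caps, $flag) = @_;
    return $caps->{$flag} ? 1 : 0;
}

my $full    = capabilities_for(qw(Chat Streaming Tools));
my $limited = string_only_capabilities(qw(Chat Streaming Tools));

printf "full: %d, limited: %d\n",
    supports($full,    'tool_choice_named'),
    supports($limited, 'tool_choice_named');
# prints "full: 1, limited: 0"
```

In the real library the override would presumably be a Moose `around` modifier on `engine_capabilities`; the wrapper function here only mimics that effect.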
</capabilities>
<chat-f>
## chat_f: Single-Turn with Named Args
.claude/skills/perl-ai-langertha/SKILL.md view on Meta::CPAN
Engines compose feature roles:
| Role | Feature |
|------|---------|
| `Langertha::Role::Capabilities` | `engine_capabilities` registry + `supports($cap)` |
| `Langertha::Role::Chat` | `simple_chat`, `simple_chat_f`, `chat_f` (named args), `aggregate_tool_calls` |
| `Langertha::Role::Tools` | `chat_with_tools_f` (MCP loop) |
| `Langertha::Role::HermesTools` | XML-tag tool calling for models without native support |
| `Langertha::Role::ParallelToolUse` | `parallel_tool_use` boolean (canonical name) |
| `Langertha::Role::Streaming` | SSE/NDJSON streaming |
| `Langertha::Role::Embedding` | Vector embeddings |
| `Langertha::Role::Transcription` | Audio-to-text |
| `Langertha::Role::ImageGeneration` | Image generation |
| `Langertha::Role::SystemPrompt` | System prompt management |
| `Langertha::Role::Temperature` | Sampling temperature |
| `Langertha::Role::Seed` | Deterministic seed (`seed`, `randomize_seed`) |
| `Langertha::Role::ContextSize` | `context_size` parameter |
| `Langertha::Role::ResponseSize` | `response_size` / max_tokens parameter |
| `Langertha::Role::ResponseFormat` | JSON mode / structured output, plus `$self->decode_loose_json($text)` (overridable) |
| `Langertha::Role::Models` | Model listing |
.claude/skills/perl-ai-langertha/SKILL.md view on Meta::CPAN
<value-objects>
## Value Objects
| Class | Purpose |
|-------|---------|
| `Langertha::Tool` | Canonical tool definition. `from_openai/from_anthropic/from_mcp/from_gemini/from_hash` accept any shape; `to_openai/to_anthropic/to_gemini/to_mcp/to_json_schema` emit per-provider wire payloads. |
| `Langertha::ToolChoice` | Canonical tool-selection policy (`auto`/`any`/`none`/`tool`). `to_openai/to_anthropic/to_gemini/to_perplexity` per-provider serializers. |
| `Langertha::ToolCall` | Tool invocation emitted by an LLM. `name`, `arguments`, `id`, `synthetic`. `from_openai/from_anthropic/from_ollama/from_gemini`; `extract($raw)` pulls every call out of any known response shape. |
| `Langertha::Content::Image` | Provider-agnostic vision input. `from_url/from_file/from_data`; `to_openai/to_anthropic/to_gemini`. |
| `Langertha::Response` | LLM response with metadata. Stringifies to `content`. `tool_calls` is `ArrayRef[Langertha::ToolCall]`, the single source of truth. |
| `Langertha::Stream::Chunk` | Single streaming chunk. Optional `tool_calls` for engines that emit them mid-stream; `Role::Chat::aggregate_tool_calls(\@chunks)` flattens. |
Use these instead of hand-rolled hashes when normalizing across
providers. `Tool->from_hash` auto-detects MCP camelCase, Anthropic
snake_case, OpenAI envelope, and Gemini-flat shapes.
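
To make the shape auto-detection concrete, here is a rough sketch of how the four shapes can be told apart by their keys. The functions `detect_tool_shape` and `normalize_tool` are hypothetical and are not how `Langertha::Tool` is written; the key names (`function`, `inputSchema`, `input_schema`, `parameters`) are the fields each provider uses on the wire.

```perl
use strict;
use warnings;

# Guess which provider shape a tool-definition hash uses.
sub detect_tool_shape {
    my ($h) = @_;
    return 'openai'    if ref($h->{function}) eq 'HASH';   # { type => 'function', function => {...} }
    return 'mcp'       if exists $h->{inputSchema};        # camelCase schema key
    return 'anthropic' if exists $h->{input_schema};       # snake_case schema key
    return 'gemini'    if exists $h->{name} && exists $h->{parameters};  # flat declaration
    return 'unknown';
}

# Reduce any recognized shape to one canonical hash.
sub normalize_tool {
    my ($h) = @_;
    my $shape = detect_tool_shape($h);
    die "unrecognized tool shape\n" if $shape eq 'unknown';

    my %schema_key = (
        openai    => 'parameters',
        gemini    => 'parameters',
        mcp       => 'inputSchema',
        anthropic => 'input_schema',
    );
    # The OpenAI envelope nests the definition one level down.
    my $src = $shape eq 'openai' ? $h->{function} : $h;

    return {
        shape       => $shape,
        name        => $src->{name},
        description => $src->{description},
        schema      => $src->{ $schema_key{$shape} },
    };
}

my @samples = (
    { type => 'function', function => { name => 'get_time', parameters => { type => 'object' } } },
    { name => 'get_time', inputSchema  => { type => 'object' } },
    { name => 'get_time', input_schema => { type => 'object' } },
    { name => 'get_time', parameters   => { type => 'object' } },
);

print join(' ', map { normalize_tool($_)->{shape} } @samples), "\n";
# prints "openai mcp anthropic gemini"
```

The order of the checks matters: the OpenAI envelope is tested first because its inner hash also carries `name` and `parameters`, which would otherwise look like the Gemini-flat shape.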
</value-objects>
# Langertha - CLAUDE.md
## Overview
Langertha is a Perl LLM framework supporting 15+ engines via composable Moose roles. It provides chat, tool calling (MCP), streaming, embeddings, transcription, and an autonomous agent (Raider).
## Build System
Uses `[@Author::GETTY]` Dist::Zilla plugin bundle.
```bash
dzil test # Build and test
prove -l t/ # Run tests directly
prove -lv t/60_tool_calling.t # Single test, verbose
```
## Architecture
### Engine Hierarchy (lib/Langertha/Engine/)
```
Engine::Remote                     url required, JSON + HTTP
│
├── Engine::AnthropicBase          /v1/messages format, x-api-key auth, SSE streaming
│   │
│   ├── Anthropic                  Claude models, thinking blocks, tool_use
│   ├── MiniMaxAnthropic           MiniMax via legacy /anthropic/v1 shim endpoint
│   └── LMStudioAnthropic          LM Studio Anthropic-compatible endpoint
│
├── Engine::OpenAIBase             /chat/completions format, Bearer auth, SSE streaming
│   │
│   │   Cloud providers (url has default, api_key from env)
│   ├── OpenAI                     gpt-4o, embeddings, whisper transcription, structured output
│   ├── DeepSeek                   deepseek-chat/reasoner, structured output
│   ├── Groq                       ultra-fast inference, whisper transcription, structured output
│   ├── Mistral                    EU-hosted, embeddings, structured output
│   ├── MiniMax                    Shanghai (default), 1M context window, M2.7
│   ├── NousResearch               Hermes models, <tool_call> XML tool format
│   ├── Cerebras                   wafer-scale chips, fastest inference
│   ├── OpenRouter                 meta-provider, 300+ models, provider/model format
│   ├── SGLang                     SGLang OpenAI-compatible server, fast structured output
│   ├── LlamaCpp                   llama.cpp server, embeddings
│   └── LMStudioOpenAI             LM Studio's OpenAI-compatible endpoint
│
├── Engine::TranscriptionBase      Transcription-only OpenAI-shape base (no chat/tools)
│   │
│   └── Whisper                    self-hosted faster-whisper-server etc.
│
│   Non-OpenAI formats (own request/response handling)
├── Gemini                         ?key= auth, functionDeclarations, thought parts
├── Ollama                         native /api/chat, NDJSON streaming, OpenAPI spec
├── AKI                            key-in-body auth, EU/Germany, /api/call/{model}
└── LMStudio                       LM Studio native API (non-OpenAI/non-Anthropic)
```
**LMStudio family**: LM Studio servers can expose three different
endpoints: `LMStudio` is the native API, `LMStudioOpenAI` is the
OpenAI-compatible endpoint, and `LMStudioAnthropic` is the
Anthropic-compatible endpoint. Pick whichever your LM Studio server is
configured to serve.
- **Capabilities**: `engine_capabilities` registry + `supports($cap)`
  helper. Composed by `Chat` (and indirectly via every other capability
  role). Mapping role→cap-flag lives in one map in `Role::Capabilities`;
  engines override via `around engine_capabilities` for wire-reality
  corrections (e.g. clearing `tool_choice_named` on string-only providers).
- **Chat**: sync/async chat (`simple_chat`, `simple_chat_f`); also
  `chat_f(messages => [...], tools => [...], tool_choice => ...,
  response_format => ...)` for single-turn structured calls.
- **Tools**: MCP tool calling loop (`chat_with_tools_f`, `mcp_servers`)
- **HermesTools**: XML-tag tool calling for models without native support
- **Streaming**: SSE / NDJSON streaming responses
- **Embedding**: Vector embeddings (`simple_embedding`)
- **Transcription**: Audio transcription
- **HTTP**: HTTP transport (sync + async via IO::Async)
- **JSON**: JSON encoding/decoding (`$self->json->encode/decode`)
- **SystemPrompt**: System prompt management
- **Temperature**, **ResponseSize**, **ContextSize**, **Seed**: Generation parameters
- **ResponseFormat**: JSON mode / structured output, plus
  `$self->decode_loose_json($text)` for tolerant parsing of
  prose-wrapped or fenced JSON output (overridable per engine)
- **Models**: Model selection and defaults
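
The "tolerant parsing of prose-wrapped or fenced JSON" mentioned under **ResponseFormat** can be illustrated with core modules only. The function `loose_decode` below is a hypothetical sketch of the general technique and makes no claim about how `decode_loose_json` is actually implemented: it tries strict JSON first, then the body of a code fence, then the outermost brace or bracket span.

```perl
use strict;
use warnings;
use JSON::PP ();

# Built at runtime so this listing never contains a literal fence marker.
my $FENCE = '`' x 3;

sub loose_decode {
    my ($text) = @_;
    my $json = JSON::PP->new;

    # Candidates, from strictest to loosest.
    my @candidates = ($text);
    push @candidates, $1
        if $text =~ /\Q$FENCE\E(?:json)?\s*(.*?)\Q$FENCE\E/s;
    push @candidates, $1
        if $text =~ /(\{.*\}|\[.*\])/s;

    for my $candidate (@candidates) {
        my $data = eval { $json->decode($candidate) };
        return $data if ref $data;    # accept only objects and arrays
    }
    die "no JSON object or array found in text\n";
}

my $fenced = "Here you go:\n${FENCE}json\n{\"n\": 1}\n${FENCE}\nDone.";
my $prose  = 'The answer is {"n": 2}, hope that helps.';

print join(' ', map { loose_decode($_)->{n} } $fenced, $prose), "\n";
# prints "1 2"
```

A decoder like this is deliberately permissive, so it is best reserved for structured-output responses where the schema is validated afterwards.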
parent's api_key/url and `whisper-1` as transcription_model.
`$openai->whisper->simple_transcription($file)` is the canonical
way to use OpenAI's hosted Whisper from a chat-side engine.
- New Langertha::Role::Capabilities, composed by Langertha::Role::
Chat (and therefore present on every engine via composition). One
central role-to-flag map drives engine_capabilities; engines
override via `around engine_capabilities` for wire-reality
corrections. Capabilities reported by each role:
Chat -> chat
Streaming -> streaming
Tools -> tools_native + tool_choice_{auto,any,none,named}
HermesTools -> tools_hermes
ResponseFormat -> response_format_json_object/json_schema
Embedding -> embedding
Transcription -> transcription
ImageGeneration -> image_generation
Temperature -> temperature
Seed -> seed
ContextSize -> context_size
ResponseSize -> response_size
- Langertha::Response.tool_calls is now populated by every native
tool-calling engine (OpenAICompatible, AnthropicBase, Gemini,
Ollama) as well as the chat_f synthetic-tool fallback path. Single
source of truth: same shape regardless of provider.
Langertha::Response gained tool_call($name) returning the matching
Langertha::ToolCall object (vs. tool_call_args returning args).
- Langertha::Stream::Chunk gained an optional tool_calls attribute
(ArrayRef[Langertha::ToolCall]). Langertha::Role::Chat got
aggregate_tool_calls($chunks) for collecting them after a stream
ends. Per-engine streaming tool-call delta accumulation will land
incrementally; the structures are in place.
- Langertha::Engine::AnthropicBase, Langertha::Engine::Gemini, and
Langertha::Engine::Ollama now compose Langertha::Role::
ResponseFormat. Anthropic emulates response_format via a
synthesized tool plus forced tool_choice (the chat_response parser
lifts the resulting tool_use input back into Response.content as
JSON). Gemini translates response_format into generationConfig
(responseMimeType + responseSchema). Ollama translates into the
`format` parameter (string 'json' for json_object, schema HashRef
- Langertha::Role::Chat exposes engine_capabilities (default derived from
role composition) and a supports($cap) helper so software can query
what the engine can honour before sending parameters.
- Langertha::Role::ResponseFormat gained decode_loose_json($text), a
tolerant decoder for structured-output responses that may be wrapped
in code fences or prose.
- New Langertha::Engine::TSystems for the T-Systems AI Foundation
Services / LLM Hub OpenAI-compatible endpoint
(https://llm-server.llmhub.t-systems.net/v2). Bearer auth via
LANGERTHA_TSYSTEMS_API_KEY, default model gpt-oss-120b (T-Cloud,
Germany; reliable tool calling), supports chat, streaming, tool
calling, embeddings (default text-embedding-bge-m3) and structured
output. GDPR-compliant; T-Cloud models are processed in Germany,
hyperscaler models in the EU.
- New Langertha::Engine::Scaleway for Scaleway Generative APIs
(https://api.scaleway.ai/v1) -- EU-hosted, drop-in OpenAI-compatible
replacement. Bearer auth via LANGERTHA_SCALEWAY_API_KEY, default
model llama-3.1-8b-instruct, supports chat, streaming, tool
calling, embeddings and structured output.
0.404 2026-04-21 14:06:44Z
- New Langertha::Content role and Langertha::Content::Image value object
for provider-agnostic vision input. Mirrors the Langertha::ToolChoice
pattern: one canonical block (from_url / from_file / from_data /
from_base64) serializes to OpenAI image_url, Anthropic image source
(URL or base64), and Gemini inline_data via to_openai / to_anthropic
/ to_gemini. Gemini auto-downloads URL-only images on first call
passed through untouched, so existing callers are unaffected.
- Fixes the "messages.0.content.1: Input tag 'image_url' ... does not
match 'image'" 400 from Anthropic when the same [text + image] prompt
was reused across engines: the canonical block is what callers
author, each engine produces its own format.
0.403 2026-04-21 12:04:54Z
- Fixed "Wide character in subroutine entry" crash on non-ASCII JSON
responses. Role::JSON's shared instance is configured with utf8=>1
(bytes in/out), but parse_response and execute_streaming_request
were feeding it Perl-Unicode via $response->decoded_content, which
blew up the first time a response body contained a non-ASCII byte
(Umlaut, em-dash, CJK, emoji). Both entry points now use
$response->content (raw bytes), keeping the pipeline consistent
with the outgoing side. The two spots that re-decode JSON
substrings out of an already-decoded tree (OpenAICompatible's
extract_tool_call for tool_call.function.arguments, and
HermesTools' response_tool_calls for <tool_call> XML bodies) now
go through a new Role::JSON::decode_json_text helper that
centralizes the encode_utf8 bridge.
- Add new shared core modules for cross-format normalization:
Langertha::Input(+::Tools), Langertha::Output(+::Tools),
and Langertha::Metrics.
- Core modules centralize tool schema conversion (OpenAI/Anthropic/Ollama),
Hermes XML extraction/normalization, and usage/cost metric normalization.
- Add core tests t/97_input_output.t and t/98_metrics.t and extend t/00_load.t.
0.305 2026-03-08 21:51:01Z
- New engine base class: Langertha::Engine::AnthropicBase for
Anthropic-compatible APIs (shared /v1/messages chat/streaming/tool/model
handling and Anthropic rate-limit parsing). Anthropic now extends this
base, and MiniMax + LMStudioAnthropic were migrated to extend it too.
- New engine: Langertha::Engine::LMStudio -- native LM Studio local REST
API adapter (POST /api/v1/chat, SSE streaming with message.delta/chat.end,
GET /api/v1/models). Supports optional bearer auth via
LANGERTHA_LMSTUDIO_API_KEY, plus basic auth via URL userinfo.
Includes openai() helper returning a Langertha::Engine::LMStudioOpenAI
instance for LM Studio's /v1 endpoint.
- New engine: Langertha::Engine::LMStudioOpenAI for LM Studio's
OpenAI-compatible /v1 endpoint (defaults api_key to C<lmstudio>).
- New engine: Langertha::Engine::LMStudioAnthropic for LM Studio's
Anthropic-compatible /v1/messages endpoint. Includes LMStudio->anthropic
helper for easy conversion from native engine instances; defaults api_key
to C<lmstudio>.
0.301 2026-02-27 01:57:13Z
- Rate limit extraction from HTTP response headers: new
Langertha::RateLimit data class with normalized requests_limit,
requests_remaining, tokens_limit, tokens_remaining, and reset
fields plus raw provider-specific headers. Supported providers:
OpenAI/Groq/Cerebras/OpenRouter/Replicate/HuggingFace
(x-ratelimit-*) and Anthropic (anthropic-ratelimit-*). Engine
stores latest rate_limit, Response carries per-response rate_limit
with requests_remaining/tokens_remaining convenience methods.
- New engine: HuggingFace -- HuggingFace Inference Providers
(OpenAI-compatible, org/model format, chat + streaming + tool calling)
0.300 2026-02-26 21:03:33Z
- Plugin system: Langertha::Plugin base class with lifecycle hooks
(plugin_before_raid, plugin_build_conversation, plugin_before_llm_call,
plugin_after_llm_response, plugin_before_tool_call,
plugin_after_tool_call, plugin_after_raid) and self_tools support.
Plugins can be specified by short name (resolved to
Langertha::Plugin::* or LangerthaX::Plugin::*).
- Langertha::Plugin::Langfuse: Langfuse observability as a plugin
(alternative to engine-level Role::Langfuse), with cascading traces,
0.201 2026-02-23 03:50:17Z
- Add Response.thinking attribute for chain-of-thought reasoning:
- Native extraction: DeepSeek/OpenAI-compatible reasoning_content,
Anthropic thinking blocks, Gemini thought parts -- automatically
populated on Response.thinking, no configuration needed
- Think tag filter: <think> tag stripping enabled by default on
all engines. Handles both closed (<think>...</think>) and
unclosed (<think>...) tags. Configurable tag name via
think_tag (default: 'think'). Disable with
think_tag_filter => 0. Filtering applied across all text
paths: simple_chat, streaming, tool calling, and Raider.
- Add NousResearch reasoning attribute -- enables chain-of-thought
reasoning for Hermes 4 and DeepHermes 3 models by prepending
the standard Nous reasoning system prompt
- Langfuse cascading traces -- Raider now creates proper hierarchical
Trace → Span (iteration) → Generation (llm-call) / Span (tool)
structure instead of flat trace → generation. Iteration spans group
the LLM call and its tool calls. Tool spans capture per-tool timing,
input, and output. Trace is updated with final output at raid end.
- Langfuse: add langfuse_span() for creating span events
- Langfuse: add langfuse_update_trace(), langfuse_update_span(),
and live raider test (t/82_live_raider.t)
- Add t/83_live_minimax.t: dedicated MiniMax live test covering
simple_chat, list_models, and Raider with Coding Plan web search
- Add Raider inject() method for mid-raid context injection --
queue messages from async callbacks, timers, or other tasks
that get picked up at the next iteration naturally
- Add Raider on_iteration callback -- called before each LLM call
(iterations 2+) with ($raider, $iteration), returns messages
to inject. Injected messages are persisted in history.
- Add Langertha::Engine::MiniMax for MiniMax AI API
(chat, streaming, tool calling via OpenAI-compatible API)
- Rewrite all POD to inline style across all modules --
=attr directly after has, =method directly after sub.
Add POD to all previously undocumented modules.
- Improve =seealso cross-links: remove redundant main module
links, add meaningful related module references
0.200 2026-02-22 21:53:36Z
- Add Langertha::Response: metadata container wrapping LLM text content
with id, model, finish_reason, usage (token counts), timing, and created
fields. Uses overload stringification for backward compatibility --
methods into a reusable role. Engines that use the OpenAI-compatible
API format now compose this role instead of duplicating methods.
Engine::OpenAI and all subclasses continue to work unchanged.
- Add Langertha::Engine::OllamaOpenAI: first-class engine for Ollama's
OpenAI-compatible /v1 endpoint. Ollama's openai() method now returns
this engine instead of a raw Engine::OpenAI instance.
- Add Langertha::Engine::AKI for AKI.IO native API
(chat completions with key-in-body auth, synchronous mode,
dynamic endpoint listing via list_models and endpoint_details)
- Add Langertha::Engine::AKIOpenAI for AKI.IO via OpenAI-compatible API
(chat, streaming, tool calling via Role::OpenAICompatible)
- Add Langertha::Engine::NousResearch for Nous Research Inference API
with Hermes-native tool calling via <tool_call> XML tags
- Add Langertha::Engine::Perplexity for Perplexity Sonar API
(chat and streaming only, no tool calling)
- Add hermes_tools feature flag to Langertha::Role::Tools for
Hermes-native tool calling via <tool_call>/<tool_response> XML tags;
enables MCP tool calling on any model that supports the Hermes
prompt format, even without API-level tool support
- Add hermes_call_tag, hermes_response_tag attributes for custom
XML tag names (default: tool_call, tool_response)
- Add hermes_tool_instructions attribute for customizing the
instruction text without changing the structural XML template
- Add hermes_tool_prompt attribute for full system prompt override
- Add hermes_extract_content() method for engines to override
- Add MCP (Model Context Protocol) tool calling support
- New Langertha::Role::Tools for engine-agnostic tool calling
- Anthropic engine: full tool calling support (format_tools,
response_tool_calls, format_tool_results, response_text_content)
- Async chat_with_tools_f() method for automatic multi-round
tool-calling loop with configurable max iterations
- Requires Net::Async::MCP for MCP server communication
- Add Future::AsyncAwait support for async/await syntax
- All _f methods (simple_chat_f, simple_chat_stream_f, etc.)
- Streaming with real-time async callbacks
- Add streaming support
- Synchronous callback, iterator, and Future-based APIs
- SSE parsing for OpenAI/Anthropic/Groq/Mistral/DeepSeek
- NDJSON parsing for Ollama
- Add Gemini engine (Google AI Studio)
- Add dynamic model listing via provider APIs with caching
- Add Anthropic extended parameters (effort, inference_geo)
- Improve POD documentation across all modules
0.008 2025-03-30 04:55:38Z
- Add Mistral engine integration
ex/mcp_inprocess.pl
ex/mcp_stdio.pl
ex/ollama.pl
ex/ollama_image.pl
ex/raider.pl
ex/raider_plugin_sugar.pl
ex/raider_rag.pl
ex/raider_run.pl
ex/response.pl
ex/sample.ogg
ex/streaming_anthropic.pl
ex/streaming_callback.pl
ex/streaming_future.pl
ex/streaming_gemini.pl
ex/streaming_iterator.pl
ex/streaming_mojo.pl
ex/structured_code.pl
ex/structured_output.pl
ex/structured_sentences.pl
ex/synopsis.pl
ex/transcription.pl
lib/Langertha.pm
lib/Langertha/Chat.pm
lib/Langertha/Content.pm
lib/Langertha/Content/Image.pm
lib/Langertha/Cost.pm
t/10_engine_hierarchy.t
t/11_basic_auth.t
t/12_rate_limit.t
t/20_chat_requests.t
t/21_embedding_requests.t
t/22_transcription_requests.t
t/25_aki_requests.t
t/30_ollama_requests.t
t/40_stream_chunk.t
t/41_stream_iterator.t
t/42_streaming_requests.t
t/43_streaming_parser.t
t/44_streaming_future.t
t/45_async_await.t
t/46_gemini_requests.t
t/50_list_models.t
t/60_responses_requests.t
t/60_tool_calling.t
t/61_tool_calling_openai.t
t/62_tool_calling_gemini.t
t/63_tool_calling_ollama.t
t/64_tool_calling_ollama_mock.t
t/65_tool_calling_vllm.t
ex/async_await.pl view on Meta::CPAN
say "Asking Claude a question...";
my $response = await $engine->simple_chat_f(
'What is the capital of France? Answer in one word.'
);
say "Response: $response\n";
return $response;
}
# Example 2: Streaming with real-time callback
async sub streaming_example {
my ($api_key) = @_;
say "=== Example 2: Streaming Chat with Real-time Callback ===\n";
my $engine = Langertha::Engine::Anthropic->new(
api_key => $api_key,
model => 'claude-sonnet-4-6',
);
say "Streaming response (watch it appear in real-time):\n";
ex/async_await.pl view on Meta::CPAN
say " export ANTHROPIC_API_KEY=your-key-here";
say " perl ex/async_await.pl";
exit 1;
}
say "Langertha Future::AsyncAwait Examples\n";
say "=" x 50 . "\n";
# Run examples (they return Futures, so we need to ->get them)
simple_example($api_key)->get;
streaming_example($api_key)->get;
concurrent_example($api_key)->get;
# Error handling example doesn't need real API key
error_handling_example($api_key)->get;
say "=" x 50;
say "\n✅ All examples completed!\n";
}
main() unless caller;
ex/streaming_anthropic.pl view on Meta::CPAN
warn "Will be using your ANTHROPIC_API_KEY environment variable, which may produce cost.\n";
sleep 5;
}
my $claude = Langertha::Engine::Anthropic->new(
api_key => $ENV{ANTHROPIC_API_KEY} || die("Set ANTHROPIC_API_KEY"),
model => 'claude-sonnet-4-6',
response_size => 1024,
);
# Example 1: Synchronous streaming with callback
printf "Streaming response (synchronous with callback):\n";
printf "%s\n", "-" x 50;
my $chunk_count = 0;
my $full_content = $claude->simple_chat_stream(
sub {
my ($chunk) = @_;
$chunk_count++;
print $chunk->content;
},
'Tell me a very short story about a viking in exactly 3 sentences.'
);
printf "\n%s\n", "-" x 50;
printf "Total chunks: %d\n", $chunk_count;
printf "Total length: %d characters\n", length($full_content);
# Example 2: Real-time streaming with Future
printf "\nReal-time streaming with Future:\n";
printf "%s\n", "-" x 50;
my $future = $claude->simple_chat_stream_realtime_f(
sub {
my ($chunk) = @_;
print $chunk->content;
},
'Write a haiku about Perl programming.'
);
ex/streaming_future.pl view on Meta::CPAN
if ($ENV{OPENAI_API_KEY}) {
warn "Will be using your OPENAI_API_KEY environment variable, which may produce cost.\n";
sleep 5;
}
my $openai = Langertha::Engine::OpenAI->new(
api_key => $ENV{OPENAI_API_KEY} || die("Set OPENAI_API_KEY"),
model => 'gpt-4o-mini',
);
printf "Real-time streaming with Future:\n";
printf "%s\n", "-" x 50;
# Real-time streaming with callback
my $future = $openai->simple_chat_stream_realtime_f(
sub {
my ($chunk) = @_;
printf "[%s]", $chunk->content;
},
'Tell me a very short story about a viking in exactly 3 sentences.'
);
my ($content, $chunks) = $future->get;
ex/streaming_gemini.pl view on Meta::CPAN
warn "Will be using your GEMINI_API_KEY environment variable, which may produce cost.\n";
sleep 5;
}
my $gemini = Langertha::Engine::Gemini->new(
api_key => $ENV{GEMINI_API_KEY} || die("Set GEMINI_API_KEY"),
model => 'gemini-2.5-flash',
response_size => 1024,
);
# Example 1: Synchronous streaming with callback
printf "Streaming response from Gemini (synchronous with callback):\n";
printf "%s\n", "-" x 50;
my $chunk_count = 0;
my $full_content = $gemini->simple_chat_stream(
sub {
my ($chunk) = @_;
$chunk_count++;
print $chunk->content;
},
'Tell me a very short story about a viking in exactly 3 sentences.'
);
printf "\n%s\n", "-" x 50;
printf "Total chunks: %d\n", $chunk_count;
printf "Total length: %d characters\n", length($full_content);
# Example 2: Real-time streaming with Future
printf "\nReal-time streaming with Future:\n";
printf "%s\n", "-" x 50;
my $future = $gemini->simple_chat_stream_realtime_f(
sub {
my ($chunk) = @_;
print $chunk->content;
},
'Write a haiku about Perl programming.'
);
ex/streaming_mojo.pl view on Meta::CPAN
#!/usr/bin/env perl
# Example: Using Langertha's Future-based streaming with Mojolicious
#
# This example shows how to integrate Langertha's Future-based async
# streaming with Mojolicious using Future::Mojo as a bridge.
#
# Required modules:
# cpanm Mojolicious Future::Mojo IO::Async Net::Async::HTTP
use strict;
use warnings;
use FindBin;
use lib "$FindBin::Bin/../lib";
$|=1;
ex/streaming_mojo.pl view on Meta::CPAN
if ($ENV{OPENAI_API_KEY}) {
warn "Will be using your OPENAI_API_KEY environment variable, which may produce cost.\n";
sleep 5;
}
my $openai = Langertha::Engine::OpenAI->new(
api_key => $ENV{OPENAI_API_KEY} || die("Set OPENAI_API_KEY"),
model => 'gpt-4o-mini',
);
printf "Real-time streaming with Future (Mojo-compatible):\n";
printf "%s\n", "-" x 50;
# Get the Future from Langertha
my $future = $openai->simple_chat_stream_realtime_f(
sub {
my ($chunk) = @_;
printf "[%s]", $chunk->content;
},
'Tell me a very short story about a viking in exactly 3 sentences.'
);
lib/Langertha.pm view on Meta::CPAN
Ollama, Groq, Mistral, or other providers.
B<THIS API IS WORK IN PROGRESS.>
=head2 Key Features
=over 4
=item * B<24 engines> -- unified API across cloud and local LLM providers
=item * B<Chat, streaming, embeddings, transcription, image generation>
=item * B<MCP tool calling> -- automatic multi-round tool loops via L<Net::Async::MCP>
=item * B<Raider> -- autonomous agent with history, compression, and plugins
=item * B<Response metadata> -- token usage, model, timing, rate limits
=item * B<Async/await> via L<Future::AsyncAwait>, sync via L<LWP::UserAgent>
=item * B<Langfuse observability> -- traces, generations, and tool spans
lib/Langertha.pm view on Meta::CPAN
Roles provide composable functionality to engines:
=over 4
=item * L<Langertha::Role::Capabilities> - C<engine_capabilities> registry
plus C<supports($cap)> helper, composed by L<Langertha::Role::Chat>
=item * L<Langertha::Role::Chat> - Synchronous and async chat methods,
including C<chat_f(messages =E<gt> [...], tools =E<gt> [...], tool_choice
=E<gt> ..., response_format =E<gt> ...)> for single-turn structured
calls and C<aggregate_tool_calls(\@chunks)> for streaming
=item * L<Langertha::Role::HTTP> - HTTP request/response handling
=item * L<Langertha::Role::Streaming> - Streaming response processing
=item * L<Langertha::Role::JSON> - JSON encode/decode
=item * L<Langertha::Role::OpenAICompatible> - OpenAI-compatible API behaviour
=item * L<Langertha::Role::SystemPrompt> - System prompt attribute
lib/Langertha.pm view on Meta::CPAN
=item * L<Langertha::Tool> - Canonical tool definition with cross-provider
serializers (C<to_openai>, C<to_anthropic>, C<to_gemini>, C<to_mcp>,
C<to_json_schema>) and accepting constructors (C<from_openai>,
C<from_anthropic>, C<from_mcp>, C<from_gemini>, C<from_hash>)
=item * L<Langertha::Content> / L<Langertha::Content::Image> -
Provider-agnostic vision input
=item * L<Langertha::RateLimit> - Normalized rate limit data from HTTP response headers
=item * L<Langertha::Stream> - Iterator over streaming chunks
=item * L<Langertha::Stream::Chunk> - A single chunk from a streaming
response (with optional C<tool_calls> for engines that emit them mid-stream)
=item * L<Langertha::Raider> - Autonomous agent with history and tool calling
=item * L<Langertha::Raider::Result> - Typed raid result (final, question, pause, abort)
=item * L<Langertha::Request::HTTP> - Internal HTTP request object
=back
=head2 Streaming
All engines that implement L<Langertha::Role::Chat> support streaming. There
are several ways to consume a stream:
B<Synchronous with callback:>
$engine->simple_chat_stream(sub {
my ($chunk) = @_;
print $chunk->content;
}, 'Tell me a story');
B<Synchronous with iterator (L<Langertha::Stream>):>
lib/Langertha/Chat.pm view on Meta::CPAN
$data = await $self->_run_plugin_after_llm_response($data, 1);
return $data;
}
sub simple_chat_stream {
my ( $self, $callback, @messages ) = @_;
my $engine = $self->_assert_chat_engine;
croak ref($engine) . " does not support streaming"
unless $engine->can('chat_stream_request');
croak "simple_chat_stream requires a callback as first argument"
unless ref $callback eq 'CODE';
my $conversation = $self->_build_messages(@messages);
$conversation = $self->_run_plugin_before_llm_call($conversation, 1)->get;
my $request = $engine->chat_stream_request($conversation, $self->_extra);
my $chunks = $engine->execute_streaming_request($request, $callback);
return join('', map { $_->content } @$chunks);
}
# --- Chat with tools ---
sub _gather_tools {
my ( $self ) = @_;
my @mcp_servers = @{$self->mcp_servers};
croak "No MCP servers configured" unless @mcp_servers;
lib/Langertha/Chat.pm view on Meta::CPAN
=head2 simple_chat_f
my $response = await $chat->simple_chat_f('Hello!');
Async version of L</simple_chat>.
=head2 simple_chat_stream
my $content = $chat->simple_chat_stream(sub { print shift->content }, 'Hi');
Synchronous streaming chat. Calls C<$callback> with each chunk.
=head2 simple_chat_with_tools
my $text = $chat->simple_chat_with_tools(@messages);
Synchronous tool-calling chat loop. Gathers tools from L</mcp_servers>,
sends chat requests, executes tool calls, and iterates until the LLM
returns a final text response. Fires plugin hooks at each step:
C<plugin_before_llm_call>, C<plugin_after_llm_response>,
C<plugin_before_tool_call>, and C<plugin_after_tool_call>.
lib/Langertha/Engine/AKI.pm view on Meta::CPAN
Provides access to AKI.IO's native API for running LLM inference. AKI.IO is
a European AI model hub based in Germany; all inference runs on EU infrastructure,
fully GDPR-compliant with no data leaving the EU.
The native API sends the API key as a C<key> field in the JSON request body
(not as an HTTP header). Supports synchronous chat, temperature and sampling
controls, dynamic endpoint listing, MCP tool calling via
L<Langertha::Role::HermesTools>, and OpenAI-compatible access via L</openai>.
Streaming is not yet supported in the native API. For streaming, use the
OpenAI-compatible endpoint via C<< $aki->openai >>.
Get your API key at L<https://aki.io/> and set C<LANGERTHA_AKI_API_KEY>.
B<THIS API IS WORK IN PROGRESS>
=head2 api_key
The AKI.IO API key. If not provided, reads from C<LANGERTHA_AKI_API_KEY>
environment variable. Sent as a C<key> field in the JSON request body
lib/Langertha/Engine/AKI.pm view on Meta::CPAN
Parses a native AKI.IO chat response. Dies with an API error message if
C<success> is false. Returns a L<Langertha::Response> with C<content>,
C<model>, C<timing>, and C<raw>.
=head2 openai
my $oai = $aki->openai;
my $oai = $aki->openai(model => 'llama3-chat-8b');
Returns a L<Langertha::Engine::AKIOpenAI> instance configured with the same
API key, system prompt, and temperature. Supports streaming and MCP tool
calling.
B<Note:> The native AKI model name is B<not> carried over automatically
because the C</v1> endpoint uses different model identifiers. If no C<model>
is passed, the AKIOpenAI default model is used and a warning is emitted.
Pass C<< model => '...' >> explicitly with a valid C</v1> model name to
suppress the warning.
=head1 SEE ALSO
lib/Langertha/Engine/AKIOpenAI.pm view on Meta::CPAN
);
my $oai = $aki_native->openai; # warns: model not mapped, uses default
print $oai->simple_chat('Hello via OpenAI format!');
=head1 DESCRIPTION
Provides access to AKI.IO's OpenAI-compatible API at C<https://aki.io/v1>.
Composes L<Langertha::Role::OpenAICompatible> for the standard OpenAI format.
AKI.IO is a European AI model hub (Germany), fully GDPR-compliant with all
inference on EU infrastructure. Supports chat completions (with SSE streaming)
and dynamic model listing. Composes L<Langertha::Role::HermesTools> for MCP
tool calling via XML tags (AKI's C</v1> endpoint does not support native tool
parameters).
Embeddings and transcription are not supported. For native AKI.IO API features
(C<top_k>, C<top_p>, C<max_gen_tokens>), use L<Langertha::Engine::AKI>.
Get your API key at L<https://aki.io/> and set C<LANGERTHA_AKI_API_KEY>.
B<THIS API IS WORK IN PROGRESS>
lib/Langertha/Engine/AnthropicBase.pm view on Meta::CPAN
sub _build_api_key { $ENV{MY_API_KEY} || die "MY_API_KEY required" }
sub default_model { 'my-model-v1' }
__PACKAGE__->meta->make_immutable;
=head1 DESCRIPTION
Intermediate base class for engines speaking the Anthropic-compatible
C</v1/messages> format. Extends L<Langertha::Engine::Remote> and composes
models/chat/streaming plus Anthropic-style tool calling and response parsing.
Concrete engines extending this class include
L<Langertha::Engine::Anthropic>, L<Langertha::Engine::MiniMax>, and
L<Langertha::Engine::LMStudioAnthropic>.
B<THIS API IS WORK IN PROGRESS>
=head2 api_key
Anthropic-compatible API key sent as C<x-api-key>. Subclasses typically
lib/Langertha/Engine/Cerebras.pm view on Meta::CPAN
=head1 DESCRIPTION
Provides access to Cerebras Inference, a high-speed AI inference platform.
Composes L<Langertha::Role::OpenAICompatible> with Cerebras's endpoint
(C<https://api.cerebras.ai/v1>) and API key handling.
Cerebras uses custom wafer-scale chips to deliver extremely fast inference
speeds. Available models include C<llama3.1-8b> (default), C<qwen-3-235b-a22b-instruct-2507>,
and C<gpt-oss-120b>.
Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.
Get your API key at L<https://cloud.cerebras.ai/> and set
C<LANGERTHA_CEREBRAS_API_KEY> in your environment.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
=over
lib/Langertha/Engine/Gemini.pm view on Meta::CPAN
# Same tool_choice translation as chat_request.
if ( exists $extra{tool_choice} && defined $extra{tool_choice} ) {
my $tc = Langertha::ToolChoice->from_hash( delete $extra{tool_choice} );
if ($tc) {
my $cfg = $tc->to_gemini;
$extra{toolConfig} = $cfg if $cfg;
}
}
# Convert messages to Gemini format (same as non-streaming)
my @gemini_contents;
my $system_instruction;
for my $message (@{$messages}) {
if ($message->{role} eq 'system') {
$system_instruction .= "\n\n" if $system_instruction;
$system_instruction .= $message->{content};
} else {
my $role = $message->{role} eq 'assistant' ? 'model' : $message->{role};
push @gemini_contents, {
role => $role,
parts => [{ text => $message->{content} }],
};
}
}
# Build the URL for streaming endpoint
my $model_name = $self->chat_model;
my $url = $self->url . "/v1beta/models/${model_name}:streamGenerateContent?key=" . $self->api_key . "&alt=sse";
my %request_body = (
contents => \@gemini_contents,
);
if ($system_instruction) {
$request_body{systemInstruction} = {
parts => [{ text => $system_instruction }],
lib/Langertha/Engine/Gemini.pm view on Meta::CPAN
%request_body,
%extra,
);
}
sub parse_stream_chunk {
my ( $self, $data, $event ) = @_;
require Langertha::Stream::Chunk;
# Gemini streaming format is similar to non-streaming
my $candidates = $data->{candidates} || [];
return undef unless @$candidates;
my $candidate = $candidates->[0];
my $content = $candidate->{content} || {};
my $parts = $content->{parts} || [];
my $text = '';
# Use defined() so a chunk whose text is the string "0" is not dropped.
$text = $parts->[0]->{text} if @$parts && defined $parts->[0]->{text};
lib/Langertha/Engine/HuggingFace.pm view on Meta::CPAN
=head1 DESCRIPTION
Provides access to HuggingFace Inference Providers, a unified API gateway
for open-source models hosted on the HuggingFace Hub. The endpoint at
C<https://router.huggingface.co/v1> is OpenAI-compatible.
Model names use C<org/model> format (e.g., C<Qwen/Qwen2.5-7B-Instruct>,
C<meta-llama/Llama-3.3-70B-Instruct>). No default model is set;
C<model> must be specified explicitly.
Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.
Get your API token at L<https://huggingface.co/settings/tokens> and set
C<LANGERTHA_HUGGINGFACE_API_KEY> in your environment.
B<THIS API IS WORK IN PROGRESS>
=head2 hub_url
Base URL for the HuggingFace Hub API. Default: C<https://huggingface.co>.
lib/Langertha/Engine/LlamaCpp.pm view on Meta::CPAN
=head1 DESCRIPTION
Provides access to llama.cpp's built-in HTTP server, which exposes an
OpenAI-compatible API. Composes L<Langertha::Role::OpenAICompatible>.
Only C<url> is required. The URL must include the C</v1> path prefix
(e.g., C<http://localhost:8080/v1>). Since llama.cpp serves exactly one
model (loaded at server startup), no model name or API key is needed.
Supports chat, streaming, embeddings, and MCP tool calling.
See L<https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md>
for server setup.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
=over
lib/Langertha/Engine/MiniMax.pm view on Meta::CPAN
latency.
=item * C<MiniMax-M2> - 200K context, 128K max output. Function calling
and agentic capabilities.
=back
See L<https://platform.minimax.io/docs/guides/models-intro> for the full
model catalog including audio, video, and music models.
Supports chat, streaming, tool calling, and structured output. Embeddings,
transcription, images, and documents are not supported via this endpoint.
Get your API key at L<https://platform.minimax.io/> and set
C<LANGERTHA_MINIMAX_API_KEY> in your environment.
=head1 SEE ALSO
=over
=item * L<Langertha::Engine::MiniMaxAnthropic> - MiniMax via legacy Anthropic-compatible endpoint
lib/Langertha/Engine/Ollama.pm view on Meta::CPAN
# Show running models
my $running = $ollama->simple_ps;
=head1 DESCRIPTION
Provides access to Ollama, which runs large language models locally. Ollama
supports many popular open-source models including C<llama3.3> (default),
C<qwen2.5>, C<deepseek-coder-v2>, C<mixtral>, and C<mxbai-embed-large>
(default embedding model).
Supports chat, embeddings, streaming, MCP tool calling (OpenAI-compatible
format), and an OpenAI-compatible API via L</openai>. Not all models support
tool calling; known working models include C<qwen3:8b> and C<llama3.2:3b>.
For Hermes-format tool calling in models without API-level tool support,
compose L<Langertha::Role::HermesTools>. See L<Langertha::Role::HermesTools>
for details.
B<THIS API IS WORK IN PROGRESS>
=head2 openai
my $oai = $ollama->openai;
my $oai = $ollama->openai(model => 'different_model');
Returns a L<Langertha::Engine::OllamaOpenAI> instance configured for Ollama's
C</v1> OpenAI-compatible endpoint, inheriting the current model, embedding
model, system prompt, and temperature settings. Supports streaming, embeddings,
and MCP tool calling.
=head2 new_openai
my $oai = Langertha::Engine::Ollama->new_openai(
url => 'http://localhost:11434',
model => 'llama3.3',
tools => \@mcp_tools,
);
lib/Langertha/Engine/OllamaOpenAI.pm view on Meta::CPAN
=head1 DESCRIPTION
Provides access to Ollama's OpenAI-compatible C</v1> API endpoint. Composes
L<Langertha::Role::OpenAICompatible> for the standard OpenAI format.
C<url> is required and must include the C</v1> path prefix (e.g.,
C<http://localhost:11434/v1>). When using L<Langertha::Engine::Ollama/openai>,
the C</v1> suffix is appended automatically. The API key defaults to
C<'ollama'> since Ollama does not require authentication.
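The C</v1> requirement above can be sketched as a small normalization step. The helper name C<with_v1_prefix> is hypothetical, and this is only an assumption about how such a suffix could be appended; the actual logic in L<Langertha::Engine::Ollama> may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the base URL with exactly one trailing /v1.
sub with_v1_prefix {
    my ($url) = @_;
    $url =~ s{/+\z}{};                      # drop trailing slashes
    $url .= '/v1' unless $url =~ m{/v1\z};  # append only when missing
    return $url;
}

print with_v1_prefix($_), "\n"
    for 'http://localhost:11434', 'http://localhost:11434/v1/';
```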
Supports chat completions (SSE streaming), embeddings (default:
C<mxbai-embed-large>), MCP tool calling, and dynamic model listing.
Transcription is not supported.
For the native Ollama API with C<keep_alive>, C<seed>, C<context_size>,
NDJSON streaming, and Hermes tool calling, use L<Langertha::Engine::Ollama>.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
=over
=item * L<Langertha::Engine::Ollama> - Native Ollama API (with keep_alive, seed, context_size)
=item * L<Langertha::Role::OpenAICompatible> - OpenAI API format role composed by this engine
lib/Langertha/Engine/OpenAIBase.pm view on Meta::CPAN
string. The base implementation croaks with a descriptive error message.
sub default_model { 'gpt-4o-mini' }
=head1 SEE ALSO
=over
=item * L<Langertha::Engine::Remote> - Parent base class
=item * L<Langertha::Role::OpenAICompatible> - OpenAI API format (chat, embeddings, tools, streaming)
=item * L<Langertha::Role::Chat> - C<simple_chat>, C<simple_chat_f>, streaming methods
=item * L<Langertha::Role::Models> - C<model>, C<models>, C<list_models>
=item * L<Langertha::Role::Temperature> - C<temperature> attribute
=item * L<Langertha::Role::ResponseSize> - C<response_size> / C<max_tokens>
=item * L<Langertha::Role::SystemPrompt> - C<system_prompt> attribute
=item * L<Langertha::Role::Streaming> - SSE stream parsing
lib/Langertha/Engine/OpenAIResponses.pm view on Meta::CPAN
=over 4
=item * B<Top-level> C<output[]> item:
C<< { type =E<gt> 'function_call', call_id =E<gt> 'call_abc', name =E<gt> 'foo',
arguments =E<gt> '{...}' } >>. This is what real reasoning models (e.g.
C<gpt-5.5-pro>) return for forced C<tool_choice>.
=item * B<Nested> inside a message item:
C<< output[type='message'].content[type='function_call'] >>. Historically
seen in streaming / older fixtures.
=back
C<chat_response>, C<response_tool_calls> and L<Langertha::ToolCall/extract>
walk both shapes. Streaming is not supported.
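A minimal sketch of walking both shapes, using plain hashes in place of a decoded API response. The function name C<function_calls> and the sample data are assumptions for illustration; this is not the code used by C<chat_response> or L<Langertha::ToolCall/extract>.

```perl
use strict;
use warnings;

# Collects function_call items from both documented shapes: top-level
# output[] entries and entries nested in a message item's content[].
sub function_calls {
    my ($response) = @_;
    my @calls;
    for my $item (@{ $response->{output} || [] }) {
        my $type = $item->{type} // '';
        if ($type eq 'function_call') {
            push @calls, $item;
        }
        elsif ($type eq 'message') {
            push @calls, grep { ($_->{type} // '') eq 'function_call' }
                @{ $item->{content} || [] };
        }
    }
    return \@calls;
}

# Illustrative payloads only; the field values are made up.
my $top = { output => [
    { type => 'function_call', call_id => 'call_abc', name => 'foo', arguments => '{}' },
] };
my $nested = { output => [
    { type => 'message', content => [
        { type => 'output_text', text => 'Calling foo.' },
        { type => 'function_call', call_id => 'call_def', name => 'foo', arguments => '{}' },
    ] },
] };
print scalar @{ function_calls($_) }, "\n" for $top, $nested;
```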
=head1 SEE ALSO
=over
lib/Langertha/Engine/OpenRouter.pm view on Meta::CPAN
Provides access to OpenRouter, a unified API gateway for 300+ models from
many providers (OpenAI, Anthropic, Google, Meta, Mistral, and more).
Composes L<Langertha::Role::OpenAICompatible> with OpenRouter's endpoint
(C<https://openrouter.ai/api/v1>).
Model names use C<provider/model> format (e.g., C<anthropic/claude-sonnet-4-6>,
C<openai/gpt-4o>, C<google/gemini-2.5-flash>). No default model is set;
C<model> must be specified explicitly.
Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.
Get your API key at L<https://openrouter.ai/settings/keys> and set
C<LANGERTHA_OPENROUTER_API_KEY> in your environment.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
=over
lib/Langertha/Engine/Perplexity.pm view on Meta::CPAN
Provides access to Perplexity's Sonar API. Composes
L<Langertha::Role::OpenAICompatible> with Perplexity's endpoint
(C<https://api.perplexity.ai>). Perplexity models are search-augmented
LLMs with real-time web access; responses include citations alongside
generated text.
Available models: C<sonar> (default, fast), C<sonar-pro> (deeper analysis),
C<sonar-reasoning> (chain-of-thought), C<sonar-reasoning-pro> (most capable).
Limitations: tool calling, embeddings, and transcription are not supported.
Only chat and streaming are available.
Get your API key at L<https://www.perplexity.ai/settings/api> and set
C<LANGERTHA_PERPLEXITY_API_KEY>.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
=over
lib/Langertha/Engine/Replicate.pm view on Meta::CPAN
=head1 DESCRIPTION
Provides access to Replicate's OpenAI-compatible chat endpoint. Replicate
hosts thousands of open-source models with pay-per-use pricing.
Model names use C<owner/model> format (e.g., C<meta/llama-4-maverick>,
C<meta/llama-4-scout>). No default model is set; C<model> must be specified
explicitly.
Supports chat, streaming, and MCP tool calling via the OpenAI-compatible
endpoint at C<https://api.replicate.com/v1>. Embeddings and transcription
are not supported through this interface.
Get your API token at L<https://replicate.com/account/api-tokens> and set
C<LANGERTHA_REPLICATE_API_KEY> in your environment.
B<THIS API IS WORK IN PROGRESS>
=head1 SEE ALSO
lib/Langertha/Response.pm view on Meta::CPAN
=head2 tokens_remaining
Returns the number of tokens remaining from rate limit headers, or C<undef>.
=head1 SEE ALSO
=over
=item * L<Langertha::RateLimit> - Rate limit data from response headers
=item * L<Langertha::Stream::Chunk> - Single chunk from a streaming response
=item * L<Langertha::Role::Chat> - Chat role that produces response objects
=item * L<Langertha::Role::OpenAICompatible> - Parses responses into this class
=back
=head1 SUPPORT
=head2 Issues
lib/Langertha/Role/Capabilities.pm view on Meta::CPAN
package Langertha::Role::Capabilities;
# ABSTRACT: Engine-capability registry derived from composed roles
our $VERSION = '0.502';
use Moose::Role;
# Role-name => list of capability flag names that role contributes.
# Plus implicit:
# chat -> simple_chat works (Role::Chat is composed)
# streaming -> chat_stream_request is wired up (Role::Streaming)
# tools_native -> Role::Tools (the named flags below come too)
# tools_hermes -> Role::HermesTools
# ... see %ROLE_TO_CAPS below.
my %ROLE_TO_CAPS = (
'Langertha::Role::Chat' => [qw( chat )],
'Langertha::Role::Streaming' => [qw( streaming )],
'Langertha::Role::Tools' => [qw(
tools_native tool_choice_auto tool_choice_any tool_choice_none tool_choice_named
)],
'Langertha::Role::HermesTools' => [qw( tools_hermes )],
'Langertha::Role::ResponseFormat' => [qw(
response_format_json_object response_format_json_schema
)],
'Langertha::Role::Embedding' => [qw( embedding )],
'Langertha::Role::Transcription' => [qw( transcription )],
'Langertha::Role::ImageGeneration' => [qw( image_generation )],
lib/Langertha/Role/Chat.pm view on Meta::CPAN
my $result = $request->response_call->($response);
if ($self->can('has_rate_limit') && $self->has_rate_limit && ref $result && $result->isa('Langertha::Response')) {
$result = $result->clone_with(rate_limit => $self->rate_limit);
}
return $result;
}
sub chat_stream {
my ( $self, @messages ) = @_;
croak "".(ref $self)." does not support streaming"
unless $self->can('chat_stream_request');
return $self->chat_stream_request($self->chat_messages(@messages));
}
sub simple_chat_stream {
my ( $self, $callback, @messages ) = @_;
croak "simple_chat_stream requires a callback as first argument"
unless ref $callback eq 'CODE';
$log->debugf("[%s] simple_chat_stream (%s format)", ref $self, $self->stream_format);
my $request = $self->chat_stream(@messages);
my $chunks = $self->execute_streaming_request($request, $callback);
$log->debugf("[%s] Stream completed: %d chunks", ref $self, scalar @$chunks);
return join('', map { $_->content } @$chunks);
}
sub simple_chat_stream_iterator {
my ( $self, @messages ) = @_;
require Langertha::Stream;
my $request = $self->chat_stream(@messages);
my $chunks = $self->execute_streaming_request($request);
return Langertha::Stream->new(chunks => $chunks);
}
# Future-based async methods
has _async_loop => (
is => 'ro',
lazy_build => 1,
);
lib/Langertha/Role/Chat.pm view on Meta::CPAN
sub simple_chat_stream_f {
my ($self, @messages) = @_;
return $self->simple_chat_stream_realtime_f(undef, @messages);
}
async sub simple_chat_stream_realtime_f {
my ($self, $chunk_callback, @messages) = @_;
croak "".(ref $self)." does not support streaming"
unless $self->can('chat_stream_request');
my $request = $self->chat_stream_request($self->chat_messages(@messages));
my @all_chunks;
my $buffer = '';
my $format = $self->stream_format;
my $response_status;
await $self->_async_http->do_request(
request => $request,
lib/Langertha/Role/Chat.pm view on Meta::CPAN
my $chunks = $self->_process_stream_buffer(\$buffer, $format);
for my $chunk (@$chunks) {
push @all_chunks, $chunk;
$chunk_callback->($chunk) if $chunk_callback;
}
};
},
);
unless ($response_status->is_success) {
die "".(ref $self)." streaming request failed: ".$response_status->status_line;
}
# Process remaining buffer
if ($buffer ne '') {
my $chunks = $self->_process_stream_buffer(\$buffer, $format, 1);
for my $chunk (@$chunks) {
push @all_chunks, $chunk;
$chunk_callback->($chunk) if $chunk_callback;
}
}
lib/Langertha/Role/Chat.pm view on Meta::CPAN
# Async with Future::AsyncAwait (recommended)
use Future::AsyncAwait;
async sub chat_example {
my ($engine) = @_;
my $response = await $engine->simple_chat_f('Hello');
say $response;
}
# Async streaming with real-time callback
async sub stream_example {
my ($engine) = @_;
my ($content, $chunks) = await $engine->simple_chat_stream_realtime_f(
sub { print shift->content },
'Tell me a story'
);
say "\nTotal chunks: ", scalar @$chunks;
}
=head1 DESCRIPTION
This role provides chat functionality for LLM engines. It includes both
synchronous and asynchronous (L<Future>-based) methods for chat and streaming.
The Future-based C<_f> methods are implemented using L<Future::AsyncAwait> and
L<Net::Async::HTTP>. These modules are loaded lazily only when you call a C<_f>
method, so synchronous-only usage does not require them.
=head2 chat_model
The model name used for chat requests. Lazily defaults to C<default_chat_model>
if the engine provides it, otherwise falls back to the general C<model>
attribute from L<Langertha::Role::Models>.
lib/Langertha/Role/Chat.pm view on Meta::CPAN
my $response = $engine->simple_chat(@messages);
my $response = $engine->simple_chat('Hello, how are you?');
Sends a synchronous chat request and returns the response text. Blocks until
the request completes.
=head2 chat_stream
my $request = $engine->chat_stream(@messages);
Builds and returns a streaming chat HTTP request object. Croaks if the engine
does not implement C<chat_stream_request>. Use L</simple_chat_stream> or
L</simple_chat_stream_iterator> to execute the request.
=head2 simple_chat_stream
my $content = $engine->simple_chat_stream($callback, @messages);
$engine->simple_chat_stream(sub {
my ($chunk) = @_;
print $chunk->content;
}, 'Tell me a story');
Sends a synchronous streaming chat request. Calls C<$callback> with each
L<Langertha::Stream::Chunk> as it arrives. Returns the complete concatenated
content string when done. Blocks until the stream completes.
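The callback-plus-return-value contract can be sketched without a network connection. C<Sketch::Chunk> and C<stream_with_callback> are hypothetical stand-ins written for this example; real chunks are L<Langertha::Stream::Chunk> objects produced by the engine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed stream chunk with a content accessor.
package Sketch::Chunk {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub content { $_[0]{content} }
}

# Calls $callback once per chunk, then returns the concatenated text,
# mirroring the contract documented for simple_chat_stream.
sub stream_with_callback {
    my ($callback, @chunks) = @_;
    die "callback must be a code reference\n" unless ref $callback eq 'CODE';
    $callback->($_) for @chunks;
    return join '', map { $_->content } @chunks;
}

my @chunks = map { Sketch::Chunk->new(content => $_) } 'Once ', 'upon ', 'a time';
my @seen;
my $text = stream_with_callback(sub { push @seen, $_[0]->content }, @chunks);
print "$text\n";
```

The callback is useful for incremental display, while the return value saves the caller from re-assembling the text.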
=head2 simple_chat_stream_iterator
my $stream = $engine->simple_chat_stream_iterator(@messages);
while (my $chunk = $stream->next) {
print $chunk->content;
}
lib/Langertha/Role/Chat.pm view on Meta::CPAN
response_format (currently L<Langertha::Engine::Perplexity>), the
request is automatically rewritten to use the JSON Schema path and the
response is loose-parsed; the resulting L<Langertha::Response> exposes
the parsed arguments via L<Langertha::Response/tool_call_args> with
C<synthetic =E<gt> 1> on the synthesized tool_call entry.
=head2 simple_chat_stream_f
my ($content, $chunks) = $engine->simple_chat_stream_f(@messages)->get;
Async streaming without a real-time callback. Convenience wrapper around
L</simple_chat_stream_realtime_f> with C<undef> as the callback. Returns a
L<Future> that resolves to C<($content, \@chunks)>.
=head2 aggregate_tool_calls
my $tool_calls = $engine->aggregate_tool_calls( $chunks );
Walks an ArrayRef of L<Langertha::Stream::Chunk> objects and returns
the flat list of L<Langertha::ToolCall> objects collected from any
chunks that carry C<tool_calls>. Returns an empty ArrayRef if none of
the chunks emitted tool calls.
This is the streaming counterpart to L<Langertha::Response/tool_calls>.
Engines that need to assemble fragmented tool-call deltas (OpenAI's
C<delta.tool_calls> stream, Anthropic's C<input_json_delta>) are
expected to do that assembly inside C<parse_stream_chunk> and attach
the finished L<Langertha::ToolCall> to the relevant chunk; this
helper just collects them.
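The collection step can be sketched as follows. Chunks are modelled as plain hashes here, which is an assumption for the example; the function name C<collect_tool_calls> is likewise hypothetical and is not the module's implementation.

```perl
use strict;
use warnings;

# Walks an array reference of chunks and flattens any tool_calls found.
# Chunks are plain hashes in this sketch; the real helper receives
# Langertha::Stream::Chunk objects.
sub collect_tool_calls {
    my ($chunks) = @_;
    my @calls;
    for my $chunk (@$chunks) {
        my $tool_calls = $chunk->{tool_calls} or next;
        push @calls, @$tool_calls;
    }
    return \@calls;
}

my $calls = collect_tool_calls([
    { content => 'Let me check.' },
    { content => '', tool_calls => [
        { name => 'get_weather', arguments => { city => 'Oslo' } },
    ] },
    { content => '' },
]);
printf "%d tool call(s)\n", scalar @$calls;
```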
=head2 simple_chat_stream_realtime_f
# With async/await (recommended)
use Future::AsyncAwait;
lib/Langertha/Role/Chat.pm view on Meta::CPAN
sub { print shift->content },
@messages
);
return $content;
}
# Traditional Future style
my $future = $engine->simple_chat_stream_realtime_f($callback, @messages);
my ($content, $chunks) = $future->get;
Async streaming with real-time callback. C<$callback> is called with each
L<Langertha::Stream::Chunk> as it arrives from the server. Returns a L<Future>
that resolves to C<($content, \@chunks)> where C<$content> is the full
concatenated text.
This is the recommended method for real-time streaming in async applications.
Pass C<undef> as the callback (or use L</simple_chat_stream_f>) if you only
need the final result.
=head2 content_format
my $fmt = $engine->content_format; # 'openai' | 'anthropic' | 'gemini'
Wire format for multimodal content blocks. Controls how
L<Langertha::Content> objects embedded in a message's C<content> arrayref
are serialized during L</chat_messages>. Defaults to C<'openai'>; overridden
lib/Langertha/Role/Chat.pm view on Meta::CPAN
via C<around>) when the wire reality differs from the role inventory,
for example to clear C<tool_choice_named> on providers that only
accept string forms.
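A minimal sketch of deriving the flag set from composed roles and then applying a wire-reality correction. The map is abbreviated from C<%ROLE_TO_CAPS>; the functions C<derive_capabilities> and C<corrected_capabilities> are hypothetical plain subs standing in for the role method and an C<around> modifier.

```perl
use strict;
use warnings;

# Role => flags map, abbreviated from the %ROLE_TO_CAPS table.
my %ROLE_TO_CAPS = (
    'Langertha::Role::Chat'      => [qw( chat )],
    'Langertha::Role::Streaming' => [qw( streaming )],
    'Langertha::Role::Tools'     => [qw( tools_native tool_choice_auto tool_choice_named )],
);

# Derive the flag set from the list of composed role names.
sub derive_capabilities {
    my (@roles) = @_;
    my %caps;
    for my $role (@roles) {
        $caps{$_} = 1 for @{ $ROLE_TO_CAPS{$role} || [] };
    }
    return \%caps;
}

# Wire-reality correction for a hypothetical provider that composes
# Role::Tools but only accepts string-form tool_choice values.
sub corrected_capabilities {
    my $caps = derive_capabilities(@_);
    delete $caps->{tool_choice_named};
    return $caps;
}

my $caps = corrected_capabilities('Langertha::Role::Chat', 'Langertha::Role::Tools');
print join(', ', sort keys %$caps), "\n";
```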
Common keys produced by the bundled roles:
=over
=item * C<chat> - C<simple_chat>/C<simple_chat_f> work
=item * C<streaming> - C<chat_stream_request> is wired up
=item * C<tools_native> - engine accepts a C<tools> array on the wire
=item * C<tools_hermes> - tools are injected via Hermes-style XML
prompt rather than (or in addition to) the native API
=item * C<tool_choice_auto> / C<tool_choice_any> / C<tool_choice_none> -
which string-form C<tool_choice> values are accepted
=item * C<tool_choice_named> - C<{type =E<gt> 'tool', name =E<gt> '...'}>
lib/Langertha/Role/HTTP.pm view on Meta::CPAN
);
sub _build_user_agent {
my ( $self ) = @_;
return LWP::UserAgent->new(
agent => $self->user_agent_agent,
$self->has_user_agent_timeout ? ( timeout => $self->user_agent_timeout ) : (),
);
}
sub execute_streaming_request {
my ($self, $request, $chunk_callback) = @_;
croak "execute_streaming_request requires Langertha::Role::Streaming"
unless $self->does('Langertha::Role::Streaming');
my $response = $self->user_agent->request($request);
croak "".(ref $self)." streaming request failed: ".($response->status_line)
unless $response->is_success;
return $self->process_stream_data($response->content, $chunk_callback);
}
1;
__END__
lib/Langertha/Role/HTTP.pm view on Meta::CPAN
=head2 user_agent_agent
The C<User-Agent> string sent with HTTP requests. Defaults to the engine's
class name.
=head2 user_agent
The L<LWP::UserAgent> instance used for synchronous HTTP requests. Built lazily
with C<user_agent_agent> and C<user_agent_timeout>.
=head2 execute_streaming_request
my $chunks = $engine->execute_streaming_request($request, $chunk_callback);
my $chunks = $engine->execute_streaming_request($request);
Executes a streaming HTTP request synchronously using L<LWP::UserAgent> and
delegates stream parsing to L<Langertha::Role::Streaming/process_stream_data>.
Requires the engine to also compose L<Langertha::Role::Streaming>. Returns an
ArrayRef of L<Langertha::Stream::Chunk> objects. If C<$chunk_callback> is
provided it is called with each chunk as it is parsed.
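The parse-then-callback step can be sketched for the SSE case. This is a simplified illustration under stated assumptions, not the code in L<Langertha::Role::Streaming/process_stream_data>: the function name C<parse_sse_body> is hypothetical, each C<data:> line is assumed to hold one complete JSON document, and the C<[DONE]> sentinel follows the OpenAI streaming convention.

```perl
use strict;
use warnings;
use JSON::PP;

# Splits a complete SSE body into events and decodes each "data:" payload.
# Simplification: multi-line data fields are not joined.
sub parse_sse_body {
    my ($body, $callback) = @_;
    my @chunks;
    for my $event (split /\r?\n\r?\n/, $body) {
        for my $line (split /\r?\n/, $event) {
            next unless $line =~ /^data:\s?(.*)$/;
            my $payload = $1;
            next if $payload eq '[DONE]';
            my $chunk = decode_json($payload);
            push @chunks, $chunk;
            $callback->($chunk) if $callback;
        }
    }
    return \@chunks;
}

my $body = join "\n\n",
    'data: {"content":"Hel"}',
    'data: {"content":"lo"}',
    'data: [DONE]',
    '';
my $chunks = parse_sse_body($body);
print join('', map { $_->{content} } @$chunks), "\n";
```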
=head1 SEE ALSO
=over
=item * L<Langertha::Role::JSON> - JSON encoding/decoding (required by this role)