Langertha


.claude/skills/perl-ai-langertha/SKILL.md  view on Meta::CPAN


<capabilities>
## Capability Queries

Every engine reports its capabilities via `Langertha::Role::Capabilities`
(composed by `Role::Chat`, so present on every engine):

```perl
$engine->supports('tool_choice_named')             or die "engine cannot force a named tool";
$engine->supports('response_format_json_schema');  # safe to pass json_schema response_format
$engine->supports('streaming');                    # chat_stream_request wired up
$engine->supports('tools_native');                 # accepts a tools array on the wire
$engine->supports('tools_hermes');                 # Hermes XML-tag tool path

my $caps = $engine->engine_capabilities;
# { chat=>1, streaming=>1, tools_native=>1, tool_choice_named=>1,
#   response_format_json_schema=>1, embedding=>1, transcription=>1, ... }
```

The flag set is derived from which capability roles the engine composes
(central role→flag map in `Role::Capabilities`); engines override via
`around engine_capabilities` for wire-reality corrections.
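
The derive-then-override idea can be sketched in plain Perl. This is only an
illustration, not Langertha's implementation: it uses ordinary packages instead
of Moose roles, and `My::Engine`, `My::StringOnlyEngine`, the `roles` method and
the contents of `%ROLE_FLAGS` are hypothetical. Only the flag names are taken
from the text above.

```perl
use strict;
use warnings;

# Hypothetical role -> capability-flag map (flag names as used above).
my %ROLE_FLAGS = (
  Chat           => ['chat'],
  Streaming      => ['streaming'],
  Tools          => [qw(tools_native tool_choice_named)],
  ResponseFormat => ['response_format_json_schema'],
);

package My::Engine {
  sub new   { my ($class, %args) = @_; return bless { %args }, $class }

  # Stand-in for "which capability roles does this engine compose?"
  sub roles { return qw(Chat Streaming Tools) }

  # Derive the flag set from the composed roles.
  sub engine_capabilities {
    my ($self) = @_;
    my %caps = map { $_ => 1 } map { @{ $ROLE_FLAGS{$_} || [] } } $self->roles;
    return \%caps;
  }

  sub supports {
    my ($self, $cap) = @_;
    return $self->engine_capabilities->{$cap} ? 1 : 0;
  }
}

package My::StringOnlyEngine {
  our @ISA = ('My::Engine');

  # Plain-Perl counterpart of an `around engine_capabilities` modifier:
  # start from the derived set, then correct it for wire reality.
  sub engine_capabilities {
    my ($self) = @_;
    my $caps = $self->SUPER::engine_capabilities;
    delete $caps->{tool_choice_named};
    return $caps;
  }
}

my $generic = My::Engine->new;
my $limited = My::StringOnlyEngine->new;
printf "%s tool_choice_named=%d\n", ref $_, $_->supports('tool_choice_named')
  for $generic, $limited;
```

Keeping the map in one place means a new capability role only needs one new
entry, while provider quirks stay local to the engine that has them.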
</capabilities>

<chat-f>
## chat_f — Single-Turn with Named Args

.claude/skills/perl-ai-langertha/SKILL.md  view on Meta::CPAN


Engines compose feature roles:

| Role | Feature |
|------|---------|
| `Langertha::Role::Capabilities` | `engine_capabilities` registry + `supports($cap)` |
| `Langertha::Role::Chat` | `simple_chat`, `simple_chat_f`, `chat_f` (named args), `aggregate_tool_calls` |
| `Langertha::Role::Tools` | `chat_with_tools_f` (MCP loop) |
| `Langertha::Role::HermesTools` | XML-tag tool calling for models without native support |
| `Langertha::Role::ParallelToolUse` | `parallel_tool_use` boolean (canonical name) |
| `Langertha::Role::Streaming` | SSE/NDJSON streaming |
| `Langertha::Role::Embedding` | Vector embeddings |
| `Langertha::Role::Transcription` | Audio-to-text |
| `Langertha::Role::ImageGeneration` | Image generation |
| `Langertha::Role::SystemPrompt` | System prompt management |
| `Langertha::Role::Temperature` | Sampling temperature |
| `Langertha::Role::Seed` | Deterministic seed (`seed`, `randomize_seed`) |
| `Langertha::Role::ContextSize` | `context_size` parameter |
| `Langertha::Role::ResponseSize` | `response_size` / max_tokens parameter |
| `Langertha::Role::ResponseFormat` | JSON mode / structured output, plus `$self->decode_loose_json($text)` (overridable) |
| `Langertha::Role::Models` | Model listing |

.claude/skills/perl-ai-langertha/SKILL.md  view on Meta::CPAN

<value-objects>
## Value Objects

| Class | Purpose |
|-------|---------|
| `Langertha::Tool` | Canonical tool definition. `from_openai/from_anthropic/from_mcp/from_gemini/from_hash` accept any shape; `to_openai/to_anthropic/to_gemini/to_mcp/to_json_schema` emit per-provider wire payloads. |
| `Langertha::ToolChoice` | Canonical tool-selection policy (`auto`/`any`/`none`/`tool`). `to_openai/to_anthropic/to_gemini/to_perplexity` per-provider serializers. |
| `Langertha::ToolCall` | Tool invocation emitted by an LLM. `name`, `arguments`, `id`, `synthetic`. `from_openai/from_anthropic/from_ollama/from_gemini`; `extract($raw)` pulls every call out of any known response shape. |
| `Langertha::Content::Image` | Provider-agnostic vision input. `from_url/from_file/from_data`; `to_openai/to_anthropic/to_gemini`. |
| `Langertha::Response` | LLM response with metadata. Stringifies to `content`. `tool_calls` is `ArrayRef[Langertha::ToolCall]` — single source of truth. |
| `Langertha::Stream::Chunk` | Single streaming chunk. Optional `tool_calls` for engines that emit them mid-stream; `Role::Chat::aggregate_tool_calls(\@chunks)` flattens. |

Use these instead of hand-rolled hashes when normalizing across
providers. `Tool->from_hash` auto-detects MCP camelCase, Anthropic
snake_case, OpenAI envelope, and Gemini-flat shapes.
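
How such shape detection might work is sketched below in plain Perl. It does
not use `Langertha::Tool` and makes no claim about how `from_hash` is written;
`normalize_tool`, `_canonical` and the keys of the returned hash are
hypothetical. The distinguishing keys (`function` envelope, `inputSchema`,
`input_schema`, flat `parameters`) are the ones the providers use on the wire.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce four known tool shapes to one canonical hash.
sub normalize_tool {
  my ($tool) = @_;
  die "tool must be a hash reference\n" unless ref $tool eq 'HASH';

  # OpenAI envelope: { type => 'function', function => { name, description, parameters } }
  if (ref $tool->{function} eq 'HASH') {
    my $fn = $tool->{function};
    return _canonical(openai => $fn->{name}, $fn->{description}, $fn->{parameters});
  }

  # Flat shapes differ only in the name of the schema key.
  return _canonical(mcp       => @{$tool}{qw(name description inputSchema)})
    if exists $tool->{inputSchema};
  return _canonical(anthropic => @{$tool}{qw(name description input_schema)})
    if exists $tool->{input_schema};
  return _canonical(gemini    => @{$tool}{qw(name description parameters)})
    if exists $tool->{parameters};

  die "unrecognized tool shape\n";
}

sub _canonical {
  my ($shape, $name, $description, $schema) = @_;
  return { shape => $shape, name => $name, description => $description, schema => $schema };
}

my $schema = { type => 'object', properties => { city => { type => 'string' } } };
my @inputs = (
  { type => 'function',
    function => { name => 'weather', description => 'Get weather', parameters => $schema } },
  { name => 'weather', description => 'Get weather', inputSchema  => $schema },
  { name => 'weather', description => 'Get weather', input_schema => $schema },
  { name => 'weather', description => 'Get weather', parameters   => $schema },
);

my @shapes = map { normalize_tool($_)->{shape} } @inputs;
print join(', ', @shapes), "\n";
```

The envelope check comes first because it is the only shape that nests the
definition; the flat shapes are then told apart by their schema key alone.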
</value-objects>

CLAUDE.md  view on Meta::CPAN

# Langertha — CLAUDE.md

## Overview

Langertha is a Perl LLM framework supporting 15+ engines via composable Moose roles. It provides chat, tool calling (MCP), streaming, embeddings, transcription, and an autonomous agent (Raider).

## Build System

Uses `[@Author::GETTY]` Dist::Zilla plugin bundle.

```bash
dzil test           # Build and test
prove -l t/         # Run tests directly
prove -lv t/60_tool_calling.t  # Single test, verbose
```

## Architecture

### Engine Hierarchy (lib/Langertha/Engine/)

```
Engine::Remote              url required, JSON + HTTP
  │
  ├── Engine::AnthropicBase /v1/messages format, x-api-key auth, SSE streaming
  │     │
  │     ├── Anthropic       Claude models, thinking blocks, tool_use
  │     ├── MiniMaxAnthropic MiniMax via legacy /anthropic/v1 shim endpoint
  │     └── LMStudioAnthropic LM Studio Anthropic-compatible endpoint
  │
  ├── Engine::OpenAIBase    /chat/completions format, Bearer auth, SSE streaming
  │     │
  │     │  Cloud providers (url has default, api_key from env)
  │     ├── OpenAI          gpt-4o, embeddings, whisper transcription, structured output
  │     ├── DeepSeek        deepseek-chat/reasoner, structured output
  │     ├── Groq            ultra-fast inference, whisper transcription, structured output
  │     ├── Mistral         EU-hosted, embeddings, structured output
  │     ├── MiniMax         Shanghai (default), 1M context window, M2.7
  │     ├── NousResearch    Hermes models, <tool_call> XML tool format
  │     ├── Cerebras        wafer-scale chips, fastest inference
  │     ├── OpenRouter      meta-provider, 300+ models, provider/model format

CLAUDE.md  view on Meta::CPAN

  │     ├── SGLang          SGLang OpenAI-compatible server, fast structured output
  │     ├── LlamaCpp        llama.cpp server, embeddings
  │     └── LMStudioOpenAI  LM Studio's OpenAI-compatible endpoint
  │
  ├── Engine::TranscriptionBase  Transcription-only OpenAI-shape base (no chat/tools)
  │     │
  │     └── Whisper         self-hosted faster-whisper-server etc.
  │
  │  Non-OpenAI formats (own request/response handling)
  ├── Gemini                ?key= auth, functionDeclarations, thought parts
  ├── Ollama                native /api/chat, NDJSON streaming, OpenAPI spec
  ├── AKI                   key-in-body auth, EU/Germany, /api/call/{model}
  └── LMStudio              LM Studio native API (non-OpenAI/non-Anthropic)
```

**LMStudio family** — LM Studio servers can expose three different
endpoints: `LMStudio` is the native API, `LMStudioOpenAI` is the
OpenAI-compatible endpoint, and `LMStudioAnthropic` is the
Anthropic-compatible endpoint. Pick whichever your LM Studio server is
configured to serve.

CLAUDE.md  view on Meta::CPAN

- **Capabilities** — `engine_capabilities` registry + `supports($cap)`
  helper. Composed by `Chat` (and indirectly via every other capability
  role). Mapping role→cap-flag lives in one map in `Role::Capabilities`;
  engines override via `around engine_capabilities` for wire-reality
  corrections (e.g. clearing `tool_choice_named` on string-only providers).
- **Chat** — sync/async chat (`simple_chat`, `simple_chat_f`); also
  `chat_f(messages => [...], tools => [...], tool_choice => ...,
  response_format => ...)` for single-turn structured calls.
- **Tools** — MCP tool calling loop (`chat_with_tools_f`, `mcp_servers`)
- **HermesTools** — XML-tag tool calling for models without native support
- **Streaming** — SSE / NDJSON streaming responses
- **Embedding** — Vector embeddings (`simple_embedding`)
- **Transcription** — Audio transcription
- **HTTP** — HTTP transport (sync + async via IO::Async)
- **JSON** — JSON encoding/decoding (`$self->json->encode/decode`)
- **SystemPrompt** — System prompt management
- **Temperature**, **ResponseSize**, **ContextSize**, **Seed** — Generation parameters
- **ResponseFormat** — JSON mode / structured output, plus
  `$self->decode_loose_json($text)` for tolerant parsing of
  prose-wrapped or fenced JSON output (overridable per engine)
- **Models** — Model selection and defaults
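
To make "tolerant parsing of prose-wrapped or fenced JSON" concrete, here is a
minimal sketch using only core `JSON::PP`. It is an assumption about the general
technique, not the body of `decode_loose_json`; the function name
`loose_json_decode` and its candidate order are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP ();

my $json  = JSON::PP->new;   # character strings in, Perl data out
my $fence = '`' x 3;         # a Markdown code fence, built here to keep this listing readable

# Hypothetical helper: try progressively looser readings of the model output.
sub loose_json_decode {
  my ($text) = @_;
  my @candidates = ($text);

  # Body of the first fenced block, with or without a language tag.
  push @candidates, $1
    if $text =~ /\Q$fence\E[A-Za-z]*[ \t]*\n(.*?)\Q$fence\E/s;

  # Widest span from the first opening bracket to the last closing one.
  push @candidates, $1
    if $text =~ /(\{.*\}|\[.*\])/s;

  for my $candidate (@candidates) {
    my $data = eval { $json->decode($candidate) };
    return $data if defined $data;
  }
  die "no decodable JSON found\n";
}

my $reply = "Here you go:\n${fence}json\n{\"answer\": 42}\n${fence}\nAnything else?";
print loose_json_decode($reply)->{answer}, "\n";
```

Trying the strict reading first means well-behaved responses never pay for the
looser heuristics, and a response with no JSON at all still fails loudly.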

Changes  view on Meta::CPAN

      parent's api_key/url and `whisper-1` as transcription_model.
      `$openai->whisper->simple_transcription($file)` is the canonical
      way to use OpenAI's hosted Whisper from a chat-side engine.

    - New Langertha::Role::Capabilities, composed by Langertha::Role::
      Chat (and therefore present on every engine via composition). One
      central role-to-flag map drives engine_capabilities; engines
      override via `around engine_capabilities` for wire-reality
      corrections. Capabilities reported by each role:
        Chat            -> chat
        Streaming       -> streaming
        Tools           -> tools_native + tool_choice_{auto,any,none,named}
        HermesTools     -> tools_hermes
        ResponseFormat  -> response_format_json_object/json_schema
        Embedding       -> embedding
        Transcription   -> transcription
        ImageGeneration -> image_generation
        Temperature     -> temperature
        Seed            -> seed
        ContextSize     -> context_size
        ResponseSize    -> response_size

Changes  view on Meta::CPAN

    - Langertha::Response.tool_calls is now populated by every native
      tool-calling engine (OpenAICompatible, AnthropicBase, Gemini,
      Ollama) as well as the chat_f synthetic-tool fallback path. Single
      source of truth — same shape regardless of provider.
      Langertha::Response gained tool_call($name) returning the matching
      Langertha::ToolCall object (vs. tool_call_args returning args).

    - Langertha::Stream::Chunk gained an optional tool_calls attribute
      (ArrayRef[Langertha::ToolCall]). Langertha::Role::Chat got
      aggregate_tool_calls($chunks) for collecting them after a stream
      ends. Per-engine streaming tool-call delta accumulation will land
      incrementally; the structures are in place.

    - Langertha::Engine::AnthropicBase, Langertha::Engine::Gemini, and
      Langertha::Engine::Ollama now compose Langertha::Role::
      ResponseFormat. Anthropic emulates response_format via a
      synthesized tool plus forced tool_choice (the chat_response parser
      lifts the resulting tool_use input back into Response.content as
      JSON). Gemini translates response_format into generationConfig
      (responseMimeType + responseSchema). Ollama translates into the
      `format` parameter (string 'json' for json_object, schema HashRef

Changes  view on Meta::CPAN

    - Langertha::Role::Chat exposes engine_capabilities (default derived from
      role composition) and a supports($cap) helper so software can query
      what the engine can honour before sending parameters.
    - Langertha::Role::ResponseFormat gained decode_loose_json($text), a
      tolerant decoder for structured-output responses that may be wrapped
      in code fences or prose.
    - New Langertha::Engine::TSystems for the T-Systems AI Foundation
      Services / LLM Hub OpenAI-compatible endpoint
      (https://llm-server.llmhub.t-systems.net/v2). Bearer auth via
      LANGERTHA_TSYSTEMS_API_KEY, default model gpt-oss-120b (T-Cloud,
      Germany; reliable tool calling), supports chat, streaming, tool
      calling, embeddings (default text-embedding-bge-m3) and structured
      output. GDPR-compliant; T-Cloud models are processed in Germany,
      hyperscaler models in the EU.
    - New Langertha::Engine::Scaleway for Scaleway Generative APIs
      (https://api.scaleway.ai/v1) — EU-hosted, drop-in OpenAI-compatible
      replacement. Bearer auth via LANGERTHA_SCALEWAY_API_KEY, default
      model llama-3.1-8b-instruct, supports chat, streaming, tool
      calling, embeddings and structured output.

0.404     2026-04-21 14:06:44Z

    - New Langertha::Content role and Langertha::Content::Image value object
      for provider-agnostic vision input. Mirrors the Langertha::ToolChoice
      pattern: one canonical block (from_url / from_file / from_data /
      from_base64) serializes to OpenAI image_url, Anthropic image source
      (URL or base64), and Gemini inline_data via to_openai / to_anthropic
      / to_gemini. Gemini auto-downloads URL-only images on first call

Changes  view on Meta::CPAN

      passed through untouched, so existing callers are unaffected.
    - Fixes the "messages.0.content.1: Input tag 'image_url' ... does not
      match 'image'" 400 from Anthropic when the same [text + image] prompt
      was reused across engines: the canonical block is what callers
      author, each engine produces its own format.

0.403     2026-04-21 12:04:54Z

    - Fixed "Wide character in subroutine entry" crash on non-ASCII JSON
      responses. Role::JSON's shared instance is configured with utf8=>1
      (bytes in/out), but parse_response and execute_streaming_request
      were feeding it Perl-Unicode via $response->decoded_content, which
      blew up the first time a response body contained a non-ASCII byte
      (Umlaut, em-dash, CJK, emoji). Both entry points now use
      $response->content (raw bytes), keeping the pipeline consistent
      with the outgoing side. The two spots that re-decode JSON
      substrings out of an already-decoded tree (OpenAICompatible's
      extract_tool_call for tool_call.function.arguments, and
      HermesTools' response_tool_calls for <tool_call> XML bodies) now
      go through a new Role::JSON::decode_json_text helper that
      centralizes the encode_utf8 bridge.
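
The bytes-versus-characters mismatch behind this fix can be reproduced with
core modules alone. The sketch below is an illustration of the bridge idea, not
the code of `decode_json_text`; the helper name `decode_json_chars` and the
sample payload are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use JSON::PP ();

# Configured like the shared instance described above: utf8 => 1 means the
# decoder expects UTF-8 encoded bytes, not Perl character strings.
my $json = JSON::PP->new->utf8(1);

# A character string containing an umlaut, an em-dash and CJK, standing in for
# a JSON substring pulled out of an already-decoded response tree.
my $chars = qq({"city":"Z\x{fc}rich \x{2014} \x{6771}\x{4eac}"});

# Hypothetical bridge: encode characters to UTF-8 bytes before decoding.
sub decode_json_chars {
  my ($text) = @_;
  return $json->decode(encode_utf8($text));
}

my $data = decode_json_chars($chars);

binmode STDOUT, ':encoding(UTF-8)';
print "city: $data->{city}\n";
```

Handing `$chars` to `$json->decode` directly is the failure mode from the
entry above; routing every such call through one helper keeps the rule
"bytes in, characters out" in a single place.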

Changes  view on Meta::CPAN


    - Add new shared core modules for cross-format normalization:
      Langertha::Input(+::Tools), Langertha::Output(+::Tools),
      and Langertha::Metrics.
    - Core modules centralize tool schema conversion (OpenAI/Anthropic/Ollama),
      Hermes XML extraction/normalization, and usage/cost metric normalization.
    - Add core tests t/97_input_output.t and t/98_metrics.t and extend t/00_load.t.

0.305     2026-03-08 21:51:01Z
    - New engine base class: Langertha::Engine::AnthropicBase for
      Anthropic-compatible APIs (shared /v1/messages chat/streaming/tool/model
      handling and Anthropic rate-limit parsing). Anthropic now extends this
      base, and MiniMax + LMStudioAnthropic were migrated to extend it too.
    - New engine: Langertha::Engine::LMStudio — native LM Studio local REST
      API adapter (POST /api/v1/chat, SSE streaming with message.delta/chat.end,
      GET /api/v1/models). Supports optional bearer auth via
      LANGERTHA_LMSTUDIO_API_KEY, plus basic auth via URL userinfo.
      Includes openai() helper returning a Langertha::Engine::LMStudioOpenAI
      instance for LM Studio's /v1 endpoint.
    - New engine: Langertha::Engine::LMStudioOpenAI for LM Studio's
      OpenAI-compatible /v1 endpoint (defaults api_key to `lmstudio`).
    - New engine: Langertha::Engine::LMStudioAnthropic for LM Studio's
      Anthropic-compatible /v1/messages endpoint. Includes LMStudio->anthropic
      helper for easy conversion from native engine instances; defaults api_key
      to `lmstudio`.

Changes  view on Meta::CPAN

0.301     2026-02-27 01:57:13Z
    - Rate limit extraction from HTTP response headers: new
      Langertha::RateLimit data class with normalized requests_limit,
      requests_remaining, tokens_limit, tokens_remaining, and reset
      fields plus raw provider-specific headers. Supported providers:
      OpenAI/Groq/Cerebras/OpenRouter/Replicate/HuggingFace
      (x-ratelimit-*) and Anthropic (anthropic-ratelimit-*). Engine
      stores latest rate_limit, Response carries per-response rate_limit
      with requests_remaining/tokens_remaining convenience methods.
    - New engine: HuggingFace — HuggingFace Inference Providers
      (OpenAI-compatible, org/model format, chat + streaming + tool calling)
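
A rough sketch of the header normalization described for 0.301, in plain Perl.
It does not use `Langertha::RateLimit`; `parse_rate_limit` is a hypothetical
helper, and only the four `*_limit` / `*_remaining` fields plus a `raw` copy
are modelled. The header names are the ones OpenAI-style and Anthropic APIs
send.

```perl
use strict;
use warnings;

# Hypothetical normalizer: map provider-specific headers to common field names.
sub parse_rate_limit {
  my (%headers) = @_;

  # HTTP header names are case-insensitive, so compare in lower case.
  my %h;
  $h{lc $_} = $headers{$_} for keys %headers;

  my $is_anthropic = grep { /^anthropic-ratelimit-/ } keys %h;

  my %names = $is_anthropic
    ? ( requests_limit     => 'anthropic-ratelimit-requests-limit',
        requests_remaining => 'anthropic-ratelimit-requests-remaining',
        tokens_limit       => 'anthropic-ratelimit-tokens-limit',
        tokens_remaining   => 'anthropic-ratelimit-tokens-remaining' )
    : ( requests_limit     => 'x-ratelimit-limit-requests',
        requests_remaining => 'x-ratelimit-remaining-requests',
        tokens_limit       => 'x-ratelimit-limit-tokens',
        tokens_remaining   => 'x-ratelimit-remaining-tokens' );

  my %limit = map { ($_ => $h{ $names{$_} }) } keys %names;
  $limit{raw} = \%h;   # keep the provider-specific headers for callers that need them
  return \%limit;
}

my $openai_style = parse_rate_limit(
  'X-RateLimit-Limit-Requests'     => 60,
  'X-RateLimit-Remaining-Requests' => 59,
  'X-RateLimit-Limit-Tokens'       => 150000,
  'X-RateLimit-Remaining-Tokens'   => 149000,
);
my $anthropic_style = parse_rate_limit(
  'anthropic-ratelimit-requests-limit'     => 50,
  'anthropic-ratelimit-requests-remaining' => 49,
  'anthropic-ratelimit-tokens-limit'       => 100000,
  'anthropic-ratelimit-tokens-remaining'   => 99000,
);

printf "requests remaining: %d / %d\n",
  $openai_style->{requests_remaining}, $anthropic_style->{requests_remaining};
```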

0.300     2026-02-26 21:03:33Z
    - Plugin system: Langertha::Plugin base class with lifecycle hooks
      (plugin_before_raid, plugin_build_conversation, plugin_before_llm_call,
      plugin_after_llm_response, plugin_before_tool_call,
      plugin_after_tool_call, plugin_after_raid) and self_tools support.
      Plugins can be specified by short name (resolved to
      Langertha::Plugin::* or LangerthaX::Plugin::*).
    - Langertha::Plugin::Langfuse: Langfuse observability as a plugin
      (alternative to engine-level Role::Langfuse), with cascading traces,

Changes  view on Meta::CPAN

0.201     2026-02-23 03:50:17Z
    - Add Response.thinking attribute for chain-of-thought reasoning:
      - Native extraction: DeepSeek/OpenAI-compatible reasoning_content,
        Anthropic thinking blocks, Gemini thought parts — automatically
        populated on Response.thinking, no configuration needed
      - Think tag filter: <think> tag stripping enabled by default on
        all engines. Handles both closed (<think>...</think>) and
        unclosed (<think>...) tags. Configurable tag name via
        think_tag (default: 'think'). Disable with
        think_tag_filter => 0. Filtering applied across all text
        paths: simple_chat, streaming, tool calling, and Raider.
    - Add NousResearch reasoning attribute — enables chain-of-thought
      reasoning for Hermes 4 and DeepHermes 3 models by prepending
      the standard Nous reasoning system prompt
    - Langfuse cascading traces — Raider now creates proper hierarchical
      Trace → Span (iteration) → Generation (llm-call) / Span (tool)
      structure instead of flat trace → generation. Iteration spans group
      the LLM call and its tool calls. Tool spans capture per-tool timing,
      input, and output. Trace is updated with final output at raid end.
    - Langfuse: add langfuse_span() for creating span events
    - Langfuse: add langfuse_update_trace(), langfuse_update_span(),

Changes  view on Meta::CPAN

      and live raider test (t/82_live_raider.t)
    - Add t/83_live_minimax.t: dedicated MiniMax live test covering
      simple_chat, list_models, and Raider with Coding Plan web search
    - Add Raider inject() method for mid-raid context injection —
      queue messages from async callbacks, timers, or other tasks
      that get picked up at the next iteration naturally
    - Add Raider on_iteration callback — called before each LLM call
      (iterations 2+) with ($raider, $iteration), returns messages
      to inject. Injected messages are persisted in history.
    - Add Langertha::Engine::MiniMax for MiniMax AI API
      (chat, streaming, tool calling via OpenAI-compatible API)
    - Rewrite all POD to inline style across all modules —
      =attr directly after has, =method directly after sub.
      Add POD to all previously undocumented modules.
    - Improve =seealso cross-links: remove redundant main module
      links, add meaningful related module references

0.200     2026-02-22 21:53:36Z
    - Add Langertha::Response: metadata container wrapping LLM text content
      with id, model, finish_reason, usage (token counts), timing, and created
      fields. Uses overload stringification for backward compatibility —

Changes  view on Meta::CPAN

      methods into a reusable role. Engines that use the OpenAI-compatible
      API format now compose this role instead of duplicating methods.
      Engine::OpenAI and all subclasses continue to work unchanged.
    - Add Langertha::Engine::OllamaOpenAI: first-class engine for Ollama's
      OpenAI-compatible /v1 endpoint. Ollama's openai() method now returns
      this engine instead of a raw Engine::OpenAI instance.
    - Add Langertha::Engine::AKI for AKI.IO native API
      (chat completions with key-in-body auth, synchronous mode,
      dynamic endpoint listing via list_models and endpoint_details)
    - Add Langertha::Engine::AKIOpenAI for AKI.IO via OpenAI-compatible API
      (chat, streaming, tool calling via Role::OpenAICompatible)
    - Add Langertha::Engine::NousResearch for Nous Research Inference API
      with Hermes-native tool calling via <tool_call> XML tags
    - Add Langertha::Engine::Perplexity for Perplexity Sonar API
      (chat and streaming only, no tool calling)
    - Add hermes_tools feature flag to Langertha::Role::Tools for
      Hermes-native tool calling via <tool_call>/<tool_response> XML tags;
      enables MCP tool calling on any model that supports the Hermes
      prompt format, even without API-level tool support
    - Add hermes_call_tag, hermes_response_tag attributes for custom
      XML tag names (default: tool_call, tool_response)
    - Add hermes_tool_instructions attribute for customizing the
      instruction text without changing the structural XML template
    - Add hermes_tool_prompt attribute for full system prompt override
    - Add hermes_extract_content() method for engines to override

Changes  view on Meta::CPAN

    - Add MCP (Model Context Protocol) tool calling support
      - New Langertha::Role::Tools for engine-agnostic tool calling
      - Anthropic engine: full tool calling support (format_tools,
        response_tool_calls, format_tool_results, response_text_content)
      - Async chat_with_tools_f() method for automatic multi-round
        tool-calling loop with configurable max iterations
      - Requires Net::Async::MCP for MCP server communication
    - Add Future::AsyncAwait support for async/await syntax
      - All _f methods (simple_chat_f, simple_chat_stream_f, etc.)
      - Streaming with real-time async callbacks
    - Add streaming support
      - Synchronous callback, iterator, and Future-based APIs
      - SSE parsing for OpenAI/Anthropic/Groq/Mistral/DeepSeek
      - NDJSON parsing for Ollama
    - Add Gemini engine (Google AI Studio)
    - Add dynamic model listing via provider APIs with caching
    - Add Anthropic extended parameters (effort, inference_geo)
    - Improve POD documentation across all modules

0.008     2025-03-30 04:55:38Z
    - Add Mistral engine integration

MANIFEST  view on Meta::CPAN

ex/mcp_inprocess.pl
ex/mcp_stdio.pl
ex/ollama.pl
ex/ollama_image.pl
ex/raider.pl
ex/raider_plugin_sugar.pl
ex/raider_rag.pl
ex/raider_run.pl
ex/response.pl
ex/sample.ogg
ex/streaming_anthropic.pl
ex/streaming_callback.pl
ex/streaming_future.pl
ex/streaming_gemini.pl
ex/streaming_iterator.pl
ex/streaming_mojo.pl
ex/structured_code.pl
ex/structured_output.pl
ex/structured_sentences.pl
ex/synopsis.pl
ex/transcription.pl
lib/Langertha.pm
lib/Langertha/Chat.pm
lib/Langertha/Content.pm
lib/Langertha/Content/Image.pm
lib/Langertha/Cost.pm

MANIFEST  view on Meta::CPAN

t/10_engine_hierarchy.t
t/11_basic_auth.t
t/12_rate_limit.t
t/20_chat_requests.t
t/21_embedding_requests.t
t/22_transcription_requests.t
t/25_aki_requests.t
t/30_ollama_requests.t
t/40_stream_chunk.t
t/41_stream_iterator.t
t/42_streaming_requests.t
t/43_streaming_parser.t
t/44_streaming_future.t
t/45_async_await.t
t/46_gemini_requests.t
t/50_list_models.t
t/60_responses_requests.t
t/60_tool_calling.t
t/61_tool_calling_openai.t
t/62_tool_calling_gemini.t
t/63_tool_calling_ollama.t
t/64_tool_calling_ollama_mock.t
t/65_tool_calling_vllm.t

ex/async_await.pl  view on Meta::CPAN

  say "Asking Claude a question...";
  my $response = await $engine->simple_chat_f(
    'What is the capital of France? Answer in one word.'
  );

  say "Response: $response\n";
  return $response;
}

# Example 2: Streaming with real-time callback
async sub streaming_example {
  my ($api_key) = @_;

  say "=== Example 2: Streaming Chat with Real-time Callback ===\n";

  my $engine = Langertha::Engine::Anthropic->new(
    api_key => $api_key,
    model => 'claude-sonnet-4-6',
  );

  say "Streaming response (watch it appear in real-time):\n";

ex/async_await.pl  view on Meta::CPAN

    say "  export ANTHROPIC_API_KEY=your-key-here";
    say "  perl ex/async_await.pl";
    exit 1;
  }

  say "Langertha Future::AsyncAwait Examples\n";
  say "=" x 50 . "\n";

  # Run examples (they return Futures, so we need to ->get them)
  simple_example($api_key)->get;
  streaming_example($api_key)->get;
  concurrent_example($api_key)->get;

  # Error handling example doesn't need a real API key
  error_handling_example($api_key)->get;

  say "=" x 50;
  say "\n✅ All examples completed!\n";
}

main() unless caller;

ex/streaming_anthropic.pl  view on Meta::CPAN

  warn "Will be using your ANTHROPIC_API_KEY environment variable, which may incur costs.\n";
  sleep 5;
}

my $claude = Langertha::Engine::Anthropic->new(
  api_key => $ENV{ANTHROPIC_API_KEY} || die("Set ANTHROPIC_API_KEY"),
  model => 'claude-sonnet-4-6',
  response_size => 1024,
);

# Example 1: Synchronous streaming with callback
printf "Streaming response (synchronous with callback):\n";
printf "%s\n", "-" x 50;

my $chunk_count = 0;

my $full_content = $claude->simple_chat_stream(
  sub {
    my ($chunk) = @_;
    $chunk_count++;
    print $chunk->content;
  },
  'Tell me a very short story about a viking in exactly 3 sentences.'
);

printf "\n%s\n", "-" x 50;
printf "Total chunks: %d\n", $chunk_count;
printf "Total length: %d characters\n", length($full_content);

# Example 2: Real-time streaming with Future
printf "\nReal-time streaming with Future:\n";
printf "%s\n", "-" x 50;

my $future = $claude->simple_chat_stream_realtime_f(
  sub {
    my ($chunk) = @_;
    print $chunk->content;
  },
  'Write a haiku about Perl programming.'
);

ex/streaming_future.pl  view on Meta::CPAN

if ($ENV{OPENAI_API_KEY}) {
  warn "Will be using your OPENAI_API_KEY environment variable, which may incur costs.\n";
  sleep 5;
}

my $openai = Langertha::Engine::OpenAI->new(
  api_key => $ENV{OPENAI_API_KEY} || die("Set OPENAI_API_KEY"),
  model => 'gpt-4o-mini',
);

printf "Real-time streaming with Future:\n";
printf "%s\n", "-" x 50;

# Real-time streaming with callback
my $future = $openai->simple_chat_stream_realtime_f(
  sub {
    my ($chunk) = @_;
    printf "[%s]", $chunk->content;
  },
  'Tell me a very short story about a viking in exactly 3 sentences.'
);

my ($content, $chunks) = $future->get;

ex/streaming_gemini.pl  view on Meta::CPAN

  warn "Will be using your GEMINI_API_KEY environment variable, which may incur costs.\n";
  sleep 5;
}

my $gemini = Langertha::Engine::Gemini->new(
  api_key => $ENV{GEMINI_API_KEY} || die("Set GEMINI_API_KEY"),
  model => 'gemini-2.5-flash',
  response_size => 1024,
);

# Example 1: Synchronous streaming with callback
printf "Streaming response from Gemini (synchronous with callback):\n";
printf "%s\n", "-" x 50;

my $chunk_count = 0;

my $full_content = $gemini->simple_chat_stream(
  sub {
    my ($chunk) = @_;
    $chunk_count++;
    print $chunk->content;
  },
  'Tell me a very short story about a viking in exactly 3 sentences.'
);

printf "\n%s\n", "-" x 50;
printf "Total chunks: %d\n", $chunk_count;
printf "Total length: %d characters\n", length($full_content);

# Example 2: Real-time streaming with Future
printf "\nReal-time streaming with Future:\n";
printf "%s\n", "-" x 50;

my $future = $gemini->simple_chat_stream_realtime_f(
  sub {
    my ($chunk) = @_;
    print $chunk->content;
  },
  'Write a haiku about Perl programming.'
);

ex/streaming_mojo.pl  view on Meta::CPAN

#!/usr/bin/env perl
# Example: Using Langertha's Future-based streaming with Mojolicious
#
# This example shows how to integrate Langertha's Future-based async
# streaming with Mojolicious using Future::Mojo as a bridge.
#
# Required modules:
#   cpanm Mojolicious Future::Mojo IO::Async Net::Async::HTTP

use strict;
use warnings;
use FindBin;
use lib "$FindBin::Bin/../lib";

$|=1;

ex/streaming_mojo.pl  view on Meta::CPAN

if ($ENV{OPENAI_API_KEY}) {
  warn "Will be using your OPENAI_API_KEY environment variable, which may incur costs.\n";
  sleep 5;
}

my $openai = Langertha::Engine::OpenAI->new(
  api_key => $ENV{OPENAI_API_KEY} || die("Set OPENAI_API_KEY"),
  model => 'gpt-4o-mini',
);

printf "Real-time streaming with Future (Mojo-compatible):\n";
printf "%s\n", "-" x 50;

# Get the Future from Langertha
my $future = $openai->simple_chat_stream_realtime_f(
  sub {
    my ($chunk) = @_;
    printf "[%s]", $chunk->content;
  },
  'Tell me a very short story about a viking in exactly 3 sentences.'
);

lib/Langertha.pm  view on Meta::CPAN

Ollama, Groq, Mistral, or other providers.

B<THIS API IS WORK IN PROGRESS.>

=head2 Key Features

=over 4

=item * B<24 engines> -- unified API across cloud and local LLM providers

=item * B<Chat, streaming, embeddings, transcription, image generation>

=item * B<MCP tool calling> -- automatic multi-round tool loops via L<Net::Async::MCP>

=item * B<Raider> -- autonomous agent with history, compression, and plugins

=item * B<Response metadata> -- token usage, model, timing, rate limits

=item * B<Async/await> via L<Future::AsyncAwait>, sync via L<LWP::UserAgent>

=item * B<Langfuse observability> -- traces, generations, and tool spans

lib/Langertha.pm  view on Meta::CPAN

Roles provide composable functionality to engines:

=over 4

=item * L<Langertha::Role::Capabilities> - C<engine_capabilities> registry
plus C<supports($cap)> helper, composed by L<Langertha::Role::Chat>

=item * L<Langertha::Role::Chat> - Synchronous and async chat methods,
including C<chat_f(messages =E<gt> [...], tools =E<gt> [...], tool_choice
=E<gt> ..., response_format =E<gt> ...)> for single-turn structured
calls and C<aggregate_tool_calls(\@chunks)> for streaming

=item * L<Langertha::Role::HTTP> - HTTP request/response handling

=item * L<Langertha::Role::Streaming> - Streaming response processing

=item * L<Langertha::Role::JSON> - JSON encode/decode

=item * L<Langertha::Role::OpenAICompatible> - OpenAI-compatible API behaviour

=item * L<Langertha::Role::SystemPrompt> - System prompt attribute

lib/Langertha.pm  view on Meta::CPAN

=item * L<Langertha::Tool> - Canonical tool definition with cross-provider
serializers (C<to_openai>, C<to_anthropic>, C<to_gemini>, C<to_mcp>,
C<to_json_schema>) and accepting constructors (C<from_openai>,
C<from_anthropic>, C<from_mcp>, C<from_gemini>, C<from_hash>)

=item * L<Langertha::Content> / L<Langertha::Content::Image> -
Provider-agnostic vision input

=item * L<Langertha::RateLimit> - Normalized rate limit data from HTTP response headers

=item * L<Langertha::Stream> - Iterator over streaming chunks

=item * L<Langertha::Stream::Chunk> - A single chunk from a streaming
response (with optional C<tool_calls> for engines that emit them mid-stream)

=item * L<Langertha::Raider> - Autonomous agent with history and tool calling

=item * L<Langertha::Raider::Result> - Typed raid result (final, question, pause, abort)

=item * L<Langertha::Request::HTTP> - Internal HTTP request object

=back

=head2 Streaming

Most engines support streaming; a few, such as the native
L<Langertha::Engine::AKI> API, do not. There are several ways to consume a
stream:

B<Synchronous with callback:>

    $engine->simple_chat_stream(sub {
        my ($chunk) = @_;
        print $chunk->content;
    }, 'Tell me a story');

B<Synchronous with iterator (L<Langertha::Stream>):>

lib/Langertha/Chat.pm  view on Meta::CPAN


  $data = await $self->_run_plugin_after_llm_response($data, 1);

  return $data;
}


sub simple_chat_stream {
  my ( $self, $callback, @messages ) = @_;
  my $engine = $self->_assert_chat_engine;
  croak ref($engine) . " does not support streaming"
    unless $engine->can('chat_stream_request');
  croak "simple_chat_stream requires a callback as first argument"
    unless ref $callback eq 'CODE';
  my $conversation = $self->_build_messages(@messages);

  $conversation = $self->_run_plugin_before_llm_call($conversation, 1)->get;

  my $request = $engine->chat_stream_request($conversation, $self->_extra);
  my $chunks = $engine->execute_streaming_request($request, $callback);
  return join('', map { $_->content } @$chunks);
}


# --- Chat with tools ---

sub _gather_tools {
  my ( $self ) = @_;
  my @mcp_servers = @{$self->mcp_servers};
  croak "No MCP servers configured" unless @mcp_servers;

lib/Langertha/Chat.pm  view on Meta::CPAN

=head2 simple_chat_f

    my $response = await $chat->simple_chat_f('Hello!');

Async version of L</simple_chat>.

=head2 simple_chat_stream

    my $content = $chat->simple_chat_stream(sub { print shift->content }, 'Hi');

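Synchronous streaming chat. Calls C<$callback> with each chunk and returns
the concatenated content of all chunks. Croaks if the engine does not support
streaming or if the first argument is not a code reference.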

=head2 simple_chat_with_tools

    my $text = $chat->simple_chat_with_tools(@messages);

Synchronous tool-calling chat loop. Gathers tools from L</mcp_servers>,
sends chat requests, executes tool calls, and iterates until the LLM
returns a final text response. Fires plugin hooks at each step:
C<plugin_before_llm_call>, C<plugin_after_llm_response>,
C<plugin_before_tool_call>, and C<plugin_after_tool_call>.
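
The loop described above can be sketched as follows. This is a minimal,
self-contained illustration and not the module's implementation: C<fake_llm>,
C<%tools>, C<chat_with_tools> and the C<$max_iterations> guard are
hypothetical names, the MCP servers are replaced by a plain hash of code
references, and the plugin hooks are left out.

```perl
use strict;
use warnings;

# Hypothetical tool registry standing in for tools gathered from MCP servers.
my %tools = (
  add => sub { my ($args) = @_; return $args->{a} + $args->{b} },
);

# Hypothetical stand-in for the LLM: it requests a tool call first and
# answers with text once a tool result is present in the conversation.
sub fake_llm {
  my ($conversation) = @_;
  my ($result) = grep { $_->{role} eq 'tool' } @$conversation;
  return { content => "The sum is $result->{content}" } if $result;
  return { tool_calls => [ { name => 'add', arguments => { a => 2, b => 3 } } ] };
}

# Send the conversation, run any requested tools, append their results,
# and repeat until the model returns plain text.
sub chat_with_tools {
  my ($max_iterations, @messages) = @_;
  my @conversation = map { +{ role => 'user', content => $_ } } @messages;
  for (1 .. $max_iterations) {
    my $response = fake_llm(\@conversation);
    my @calls = @{ $response->{tool_calls} || [] };
    return $response->{content} unless @calls;   # final text response
    for my $call (@calls) {
      my $tool = $tools{ $call->{name} } or die "unknown tool: $call->{name}";
      push @conversation, { role => 'tool', content => $tool->( $call->{arguments} ) };
    }
  }
  die "tool loop did not finish within $max_iterations iterations";
}

my $answer = chat_with_tools(5, 'What is 2 + 3?');
print "$answer\n";   # prints "The sum is 5"
```

The iteration cap is a design choice of this sketch: it keeps a model that
never stops requesting tools from looping forever.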

lib/Langertha/Engine/AKI.pm  view on Meta::CPAN


Provides access to AKI.IO's native API for running LLM inference. AKI.IO is
a European AI model hub based in Germany; all inference runs on EU infrastructure,
fully GDPR-compliant with no data leaving the EU.

The native API sends the API key as a C<key> field in the JSON request body
(not as an HTTP header). Supports synchronous chat, temperature and sampling
controls, dynamic endpoint listing, MCP tool calling via
L<Langertha::Role::HermesTools>, and OpenAI-compatible access via L</openai>.

Streaming is not yet supported in the native API. For streaming, use the
OpenAI-compatible endpoint via C<< $aki->openai >>.
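
Calling code that must work with both kinds of engine can test for
C<chat_stream_request> before streaming, which is the same check the
streaming methods use before they croak. The sketch below is only an
illustration: C<My::Chunk>, C<My::NativeEngine>, C<My::StreamingEngine> and
C<chat_with_optional_stream> are hypothetical stand-ins so that it runs
without network access.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for real engines, used only so the sketch runs.
package My::Chunk {
  sub new     { my ($class, $text) = @_; return bless { content => $text }, $class }
  sub content { $_[0]{content} }
}
package My::NativeEngine {      # no chat_stream_request, like a non-streaming engine
  sub new         { return bless {}, shift }
  sub simple_chat { return 'full reply' }
}
package My::StreamingEngine {   # provides chat_stream_request
  sub new                 { return bless {}, shift }
  sub chat_stream_request { return 1 }
  sub simple_chat_stream {
    my ($self, $callback, @messages) = @_;
    my @chunks = map { My::Chunk->new($_) } ('strea', 'med');
    $callback->($_) for @chunks;
    return join '', map { $_->content } @chunks;
  }
}

# Stream when the engine can; otherwise deliver the whole reply in one call.
sub chat_with_optional_stream {
  my ($engine, $on_text, @messages) = @_;
  if ($engine->can('chat_stream_request')) {
    return $engine->simple_chat_stream(sub { $on_text->($_[0]->content) }, @messages);
  }
  my $reply = $engine->simple_chat(@messages);
  $on_text->($reply);
  return $reply;
}

my @seen;
my $streamed = chat_with_optional_stream(My::StreamingEngine->new, sub { push @seen, $_[0] }, 'Hi');
my $plain    = chat_with_optional_stream(My::NativeEngine->new,    sub { push @seen, $_[0] }, 'Hi');
print "$streamed / $plain (", scalar(@seen), " callbacks)\n";
```

With a real engine, the C<supports('streaming')> capability query is another
way to make the same decision.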

Get your API key at L<https://aki.io/> and set C<LANGERTHA_AKI_API_KEY>.

B<THIS API IS WORK IN PROGRESS>

=head2 api_key

The AKI.IO API key. If not provided, reads from C<LANGERTHA_AKI_API_KEY>
environment variable. Sent as a C<key> field in the JSON request body

lib/Langertha/Engine/AKI.pm  view on Meta::CPAN

Parses a native AKI.IO chat response. Dies with an API error message if
C<success> is false. Returns a L<Langertha::Response> with C<content>,
C<model>, C<timing>, and C<raw>.

=head2 openai

    my $oai = $aki->openai;
    my $oai = $aki->openai(model => 'llama3-chat-8b');

Returns a L<Langertha::Engine::AKIOpenAI> instance configured with the same
API key, system prompt, and temperature. Supports streaming and MCP tool
calling.

B<Note:> The native AKI model name is B<not> carried over automatically
because the C</v1> endpoint uses different model identifiers. If no C<model>
is passed, the AKIOpenAI default model is used and a warning is emitted.
Pass C<< model => '...' >> explicitly with a valid C</v1> model name to
suppress the warning.

=head1 SEE ALSO

lib/Langertha/Engine/AKIOpenAI.pm  view on Meta::CPAN

    );
    my $oai = $aki_native->openai;  # warns: model not mapped, uses default
    print $oai->simple_chat('Hello via OpenAI format!');

=head1 DESCRIPTION

Provides access to AKI.IO's OpenAI-compatible API at C<https://aki.io/v1>.
Composes L<Langertha::Role::OpenAICompatible> for the standard OpenAI format.

AKI.IO is a European AI model hub (Germany) — fully GDPR-compliant with all
inference on EU infrastructure. Supports chat completions (with SSE streaming)
and dynamic model listing. Composes L<Langertha::Role::HermesTools> for MCP
tool calling via XML tags (AKI's C</v1> endpoint does not support native tool
parameters).

Embeddings and transcription are not supported. For native AKI.IO API features
(C<top_k>, C<top_p>, C<max_gen_tokens>), use L<Langertha::Engine::AKI>.

Get your API key at L<https://aki.io/> and set C<LANGERTHA_AKI_API_KEY>.

B<THIS API IS WORK IN PROGRESS>

lib/Langertha/Engine/AnthropicBase.pm  view on Meta::CPAN


    sub _build_api_key { $ENV{MY_API_KEY} || die "MY_API_KEY required" }
    sub default_model { 'my-model-v1' }

    __PACKAGE__->meta->make_immutable;

=head1 DESCRIPTION

Intermediate base class for engines speaking the Anthropic-compatible
C</v1/messages> format. Extends L<Langertha::Engine::Remote> and composes
models/chat/streaming plus Anthropic-style tool calling and response parsing.

Concrete engines extending this class include
L<Langertha::Engine::Anthropic>, L<Langertha::Engine::MiniMax>, and
L<Langertha::Engine::LMStudioAnthropic>.

B<THIS API IS WORK IN PROGRESS>

=head2 api_key

Anthropic-compatible API key sent as C<x-api-key>. Subclasses typically

lib/Langertha/Engine/Cerebras.pm  view on Meta::CPAN

=head1 DESCRIPTION

Provides access to Cerebras Inference, a platform built for very fast AI inference.
Composes L<Langertha::Role::OpenAICompatible> with Cerebras's endpoint
(C<https://api.cerebras.ai/v1>) and API key handling.

Cerebras uses custom wafer-scale chips to deliver extremely fast inference
speeds. Available models include C<llama3.1-8b> (default), C<qwen-3-235b-a22b-instruct-2507>,
and C<gpt-oss-120b>.

Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.

Get your API key at L<https://cloud.cerebras.ai/> and set
C<LANGERTHA_CEREBRAS_API_KEY> in your environment.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

=over

lib/Langertha/Engine/Gemini.pm  view on Meta::CPAN


  # Same tool_choice translation as chat_request.
  if ( exists $extra{tool_choice} && defined $extra{tool_choice} ) {
    my $tc = Langertha::ToolChoice->from_hash( delete $extra{tool_choice} );
    if ($tc) {
      my $cfg = $tc->to_gemini;
      $extra{toolConfig} = $cfg if $cfg;
    }
  }

  # Convert messages to Gemini format (same as non-streaming)
  my @gemini_contents;
  my $system_instruction;

  for my $message (@{$messages}) {
    if ($message->{role} eq 'system') {
      $system_instruction .= "\n\n" if $system_instruction;
      $system_instruction .= $message->{content};
    } else {
      my $role = $message->{role} eq 'assistant' ? 'model' : $message->{role};
      push @gemini_contents, {
        role => $role,
        parts => [{ text => $message->{content} }],
      };
    }
  }

  # Build the URL for streaming endpoint
  my $model_name = $self->chat_model;
  my $url = $self->url . "/v1beta/models/${model_name}:streamGenerateContent?key=" . $self->api_key . "&alt=sse";

  my %request_body = (
    contents => \@gemini_contents,
  );

  if ($system_instruction) {
    $request_body{systemInstruction} = {
      parts => [{ text => $system_instruction }],

lib/Langertha/Engine/Gemini.pm  view on Meta::CPAN

    %request_body,
    %extra,
  );
}

sub parse_stream_chunk {
  my ( $self, $data, $event ) = @_;

  require Langertha::Stream::Chunk;

  # Gemini streaming format is similar to non-streaming
  my $candidates = $data->{candidates} || [];
  return undef unless @$candidates;

  my $candidate = $candidates->[0];
  my $content = $candidate->{content} || {};
  my $parts = $content->{parts} || [];

  my $text = '';
  # Use defined() so a chunk whose text is the string "0" is not dropped.
  $text = $parts->[0]->{text} if @$parts && defined $parts->[0]->{text};

lib/Langertha/Engine/HuggingFace.pm  view on Meta::CPAN

=head1 DESCRIPTION

Provides access to HuggingFace Inference Providers, a unified API gateway
for open-source models hosted on the HuggingFace Hub. The endpoint at
C<https://router.huggingface.co/v1> is 100% OpenAI-compatible.

Model names use C<org/model> format (e.g., C<Qwen/Qwen2.5-7B-Instruct>,
C<meta-llama/Llama-3.3-70B-Instruct>). No default model is set;
C<model> must be specified explicitly.

Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.

Get your API token at L<https://huggingface.co/settings/tokens> and set
C<LANGERTHA_HUGGINGFACE_API_KEY> in your environment.

B<THIS API IS WORK IN PROGRESS>

=head2 hub_url

Base URL for the HuggingFace Hub API. Default: C<https://huggingface.co>.

lib/Langertha/Engine/LlamaCpp.pm  view on Meta::CPAN


=head1 DESCRIPTION

Provides access to llama.cpp's built-in HTTP server, which exposes an
OpenAI-compatible API. Composes L<Langertha::Role::OpenAICompatible>.

Only C<url> is required. The URL must include the C</v1> path prefix
(e.g., C<http://localhost:8080/v1>). Since llama.cpp serves exactly one
model (loaded at server startup), no model name or API key is needed.

Supports chat, streaming, embeddings, and MCP tool calling.

See L<https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md>
for server setup.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

=over

lib/Langertha/Engine/MiniMax.pm  view on Meta::CPAN

latency.

=item * C<MiniMax-M2> — 200K context, 128K max output. Function calling
and agentic capabilities.

=back

See L<https://platform.minimax.io/docs/guides/models-intro> for the full
model catalog including audio, video, and music models.

Supports chat, streaming, tool calling, and structured output. Embeddings,
transcription, images, and documents are not supported via this endpoint.

Get your API key at L<https://platform.minimax.io/> and set
C<LANGERTHA_MINIMAX_API_KEY> in your environment.

=head1 SEE ALSO

=over

=item * L<Langertha::Engine::MiniMaxAnthropic> - MiniMax via legacy Anthropic-compatible endpoint

lib/Langertha/Engine/Ollama.pm  view on Meta::CPAN

    # Show running models
    my $running = $ollama->simple_ps;

=head1 DESCRIPTION

Provides access to Ollama, which runs large language models locally. Ollama
supports many popular open-source models including C<llama3.3> (default),
C<qwen2.5>, C<deepseek-coder-v2>, C<mixtral>, and C<mxbai-embed-large>
(default embedding model).

Supports chat, embeddings, streaming, MCP tool calling (OpenAI-compatible
format), and an OpenAI-compatible API via L</openai>. Not all models support
tool calling; known working models include C<qwen3:8b> and C<llama3.2:3b>.

For Hermes-format tool calling in models without API-level tool support,
compose L<Langertha::Role::HermesTools>; see that role's documentation for
details.

B<THIS API IS WORK IN PROGRESS>

=head2 openai

    my $oai = $ollama->openai;
    my $oai = $ollama->openai(model => 'different_model');

Returns a L<Langertha::Engine::OllamaOpenAI> instance configured for Ollama's
C</v1> OpenAI-compatible endpoint, inheriting the current model, embedding
model, system prompt, and temperature settings. Supports streaming, embeddings,
and MCP tool calling.

=head2 new_openai

    my $oai = Langertha::Engine::Ollama->new_openai(
        url   => 'http://localhost:11434',
        model => 'llama3.3',
        tools => \@mcp_tools,
    );

lib/Langertha/Engine/OllamaOpenAI.pm  view on Meta::CPAN

=head1 DESCRIPTION

Provides access to Ollama's OpenAI-compatible C</v1> API endpoint. Composes
L<Langertha::Role::OpenAICompatible> for the standard OpenAI format.

C<url> is required and must include the C</v1> path prefix (e.g.,
C<http://localhost:11434/v1>). When using L<Langertha::Engine::Ollama/openai>,
the C</v1> suffix is appended automatically. The API key defaults to
C<'ollama'> since Ollama does not require authentication.

Supports chat completions (SSE streaming), embeddings (default:
C<mxbai-embed-large>), MCP tool calling, and dynamic model listing.
Transcription is not supported.

For the native Ollama API with C<keep_alive>, C<seed>, C<context_size>,
NDJSON streaming, and Hermes tool calling, use L<Langertha::Engine::Ollama>.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

=over

=item * L<Langertha::Engine::Ollama> - Native Ollama API (with keep_alive, seed, context_size)

=item * L<Langertha::Role::OpenAICompatible> - OpenAI API format role composed by this engine

lib/Langertha/Engine/OpenAIBase.pm  view on Meta::CPAN

string. The base implementation croaks with a descriptive error message.

    sub default_model { 'gpt-4o-mini' }

=head1 SEE ALSO

=over

=item * L<Langertha::Engine::Remote> - Parent base class

=item * L<Langertha::Role::OpenAICompatible> - OpenAI API format (chat, embeddings, tools, streaming)

=item * L<Langertha::Role::Chat> - C<simple_chat>, C<simple_chat_f>, streaming methods

=item * L<Langertha::Role::Models> - C<model>, C<models>, C<list_models>

=item * L<Langertha::Role::Temperature> - C<temperature> attribute

=item * L<Langertha::Role::ResponseSize> - C<response_size> / C<max_tokens>

=item * L<Langertha::Role::SystemPrompt> - C<system_prompt> attribute

=item * L<Langertha::Role::Streaming> - SSE stream parsing

lib/Langertha/Engine/OpenAIResponses.pm  view on Meta::CPAN


=over 4

=item * B<Top-level> C<output[]> item:
C<< { type =E<gt> 'function_call', call_id =E<gt> 'call_abc', name =E<gt> 'foo',
arguments =E<gt> '{...}' } >>. This is what real reasoning models (e.g.
C<gpt-5.5-pro>) return for forced C<tool_choice>.

=item * B<Nested> inside a message item:
C<< output[type='message'].content[type='function_call'] >>. Historically
seen in streaming / older fixtures.

=back

C<chat_response>, C<response_tool_calls> and L<Langertha::ToolCall/extract>
walk both shapes. Streaming is not supported.
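
To make the two shapes concrete, here is a minimal sketch of walking both on
a decoded response hash. It is not the code used by C<chat_response> or
L<Langertha::ToolCall/extract>: C<collect_function_calls> is a hypothetical
helper, and the sample response, including the fields of the nested entry, is
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: gather function_call entries from both shapes.
sub collect_function_calls {
  my ($response) = @_;
  my @calls;
  for my $item (@{ $response->{output} || [] }) {
    my $type = $item->{type} // '';
    if ($type eq 'function_call') {
      push @calls, $item;                              # top-level shape
    }
    elsif ($type eq 'message') {
      push @calls, grep { ($_->{type} // '') eq 'function_call' }
        @{ $item->{content} || [] };                   # nested shape
    }
  }
  return \@calls;
}

# Made-up response containing one call in each shape.
my $response = {
  output => [
    { type => 'function_call', call_id => 'call_abc', name => 'foo', arguments => '{}' },
    { type => 'message', content => [
      { type => 'output_text', text => 'hi' },
      { type => 'function_call', call_id => 'call_def', name => 'bar', arguments => '{}' },
    ] },
  ],
};

my $calls = collect_function_calls($response);
print join(',', map { $_->{name} } @$calls), "\n";   # prints "foo,bar"
```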

=head1 SEE ALSO

=over

lib/Langertha/Engine/OpenRouter.pm  view on Meta::CPAN


Provides access to OpenRouter, a unified API gateway for 300+ models from
many providers (OpenAI, Anthropic, Google, Meta, Mistral, and more).
Composes L<Langertha::Role::OpenAICompatible> with OpenRouter's endpoint
(C<https://openrouter.ai/api/v1>).

Model names use C<provider/model> format (e.g., C<anthropic/claude-sonnet-4-6>,
C<openai/gpt-4o>, C<google/gemini-2.5-flash>). No default model is set;
C<model> must be specified explicitly.

Supports chat, streaming, and MCP tool calling. Embeddings and transcription
are not supported.

Get your API key at L<https://openrouter.ai/settings/keys> and set
C<LANGERTHA_OPENROUTER_API_KEY> in your environment.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

=over

lib/Langertha/Engine/Perplexity.pm  view on Meta::CPAN

Provides access to Perplexity's Sonar API. Composes
L<Langertha::Role::OpenAICompatible> with Perplexity's endpoint
(C<https://api.perplexity.ai>). Perplexity models are search-augmented
LLMs with real-time web access; responses include citations alongside
generated text.

Available models: C<sonar> (default, fast), C<sonar-pro> (deeper analysis),
C<sonar-reasoning> (chain-of-thought), C<sonar-reasoning-pro> (most capable).

Limitations: tool calling, embeddings, and transcription are not supported.
Only chat and streaming are available.

Get your API key at L<https://www.perplexity.ai/settings/api> and set
C<LANGERTHA_PERPLEXITY_API_KEY>.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

=over

lib/Langertha/Engine/Replicate.pm  view on Meta::CPAN


=head1 DESCRIPTION

Provides access to Replicate's OpenAI-compatible chat endpoint. Replicate
hosts thousands of open-source models with pay-per-use pricing.

Model names use C<owner/model> format (e.g., C<meta/llama-4-maverick>,
C<meta/llama-4-scout>). No default model is set; C<model> must be specified
explicitly.

Supports chat, streaming, and MCP tool calling via the OpenAI-compatible
endpoint at C<https://api.replicate.com/v1>. Embeddings and transcription
are not supported through this interface.

Get your API token at L<https://replicate.com/account/api-tokens> and set
C<LANGERTHA_REPLICATE_API_KEY> in your environment.

B<THIS API IS WORK IN PROGRESS>

=head1 SEE ALSO

lib/Langertha/Response.pm  view on Meta::CPAN

=head2 tokens_remaining

Returns the number of tokens remaining from rate limit headers, or C<undef>.

=head1 SEE ALSO

=over

=item * L<Langertha::RateLimit> - Rate limit data from response headers

=item * L<Langertha::Stream::Chunk> - Single chunk from a streaming response

=item * L<Langertha::Role::Chat> - Chat role that produces response objects

=item * L<Langertha::Role::OpenAICompatible> - Parses responses into this class

=back

=head1 SUPPORT

=head2 Issues

lib/Langertha/Role/Capabilities.pm  view on Meta::CPAN

package Langertha::Role::Capabilities;
# ABSTRACT: Engine-capability registry derived from composed roles
our $VERSION = '0.502';
use Moose::Role;


# Role-name => list of capability flag names that role contributes.
# Plus implicit:
#   chat            -> simple_chat works (Role::Chat is composed)
#   streaming       -> chat_stream_request is wired up (Role::Streaming)
#   tools_native    -> Role::Tools (the named flags below come too)
#   tools_hermes    -> Role::HermesTools
#   ... see %ROLE_TO_CAPS below.
my %ROLE_TO_CAPS = (
  'Langertha::Role::Chat'             => [qw( chat )],
  'Langertha::Role::Streaming'        => [qw( streaming )],
  'Langertha::Role::Tools'            => [qw(
    tools_native tool_choice_auto tool_choice_any tool_choice_none tool_choice_named
  )],
  'Langertha::Role::HermesTools'      => [qw( tools_hermes )],
  'Langertha::Role::ResponseFormat'   => [qw(
    response_format_json_object response_format_json_schema
  )],
  'Langertha::Role::Embedding'        => [qw( embedding )],
  'Langertha::Role::Transcription'    => [qw( transcription )],
  'Langertha::Role::ImageGeneration'  => [qw( image_generation )],

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

  my $result = $request->response_call->($response);
  if ($self->can('has_rate_limit') && $self->has_rate_limit && ref $result && $result->isa('Langertha::Response')) {
    $result = $result->clone_with(rate_limit => $self->rate_limit);
  }
  return $result;
}


sub chat_stream {
  my ( $self, @messages ) = @_;
  croak "".(ref $self)." does not support streaming"
    unless $self->can('chat_stream_request');
  return $self->chat_stream_request($self->chat_messages(@messages));
}


sub simple_chat_stream {
  my ( $self, $callback, @messages ) = @_;
  croak "simple_chat_stream requires a callback as first argument"
    unless ref $callback eq 'CODE';
  $log->debugf("[%s] simple_chat_stream (%s format)", ref $self, $self->stream_format);
  my $request = $self->chat_stream(@messages);
  my $chunks = $self->execute_streaming_request($request, $callback);
  $log->debugf("[%s] Stream completed: %d chunks", ref $self, scalar @$chunks);
  return join('', map { $_->content } @$chunks);
}


sub simple_chat_stream_iterator {
  my ( $self, @messages ) = @_;
  require Langertha::Stream;
  my $request = $self->chat_stream(@messages);
  my $chunks = $self->execute_streaming_request($request);
  return Langertha::Stream->new(chunks => $chunks);
}


# Future-based async methods

has _async_loop => (
  is => 'ro',
  lazy_build => 1,
);

lib/Langertha/Role/Chat.pm  view on Meta::CPAN


sub simple_chat_stream_f {
  my ($self, @messages) = @_;
  return $self->simple_chat_stream_realtime_f(undef, @messages);
}


async sub simple_chat_stream_realtime_f {
  my ($self, $chunk_callback, @messages) = @_;

  croak "".(ref $self)." does not support streaming"
    unless $self->can('chat_stream_request');

  my $request = $self->chat_stream_request($self->chat_messages(@messages));
  my @all_chunks;
  my $buffer = '';
  my $format = $self->stream_format;
  my $response_status;

  await $self->_async_http->do_request(
    request => $request,

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

        my $chunks = $self->_process_stream_buffer(\$buffer, $format);
        for my $chunk (@$chunks) {
          push @all_chunks, $chunk;
          $chunk_callback->($chunk) if $chunk_callback;
        }
      };
    },
  );

  unless ($response_status->is_success) {
    die "".(ref $self)." streaming request failed: ".$response_status->status_line;
  }

  # Process remaining buffer
  if ($buffer ne '') {
    my $chunks = $self->_process_stream_buffer(\$buffer, $format, 1);
    for my $chunk (@$chunks) {
      push @all_chunks, $chunk;
      $chunk_callback->($chunk) if $chunk_callback;
    }
  }

lib/Langertha/Role/Chat.pm  view on Meta::CPAN


    # Async with Future::AsyncAwait (recommended)
    use Future::AsyncAwait;

    async sub chat_example {
        my ($engine) = @_;
        my $response = await $engine->simple_chat_f('Hello');
        say $response;
    }

    # Async streaming with real-time callback
    async sub stream_example {
        my ($engine) = @_;
        my ($content, $chunks) = await $engine->simple_chat_stream_realtime_f(
            sub { print shift->content },
            'Tell me a story'
        );
        say "\nTotal chunks: ", scalar @$chunks;
    }

=head1 DESCRIPTION

This role provides chat functionality for LLM engines. It includes both
synchronous and asynchronous (L<Future>-based) methods for chat and streaming.

The Future-based C<_f> methods are implemented using L<Future::AsyncAwait> and
L<Net::Async::HTTP>. These modules are loaded lazily only when you call a C<_f>
method, so synchronous-only usage does not require them.

=head2 chat_model

The model name used for chat requests. Lazily defaults to C<default_chat_model>
if the engine provides it, otherwise falls back to the general C<model>
attribute from L<Langertha::Role::Models>.

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

    my $response = $engine->simple_chat(@messages);
    my $response = $engine->simple_chat('Hello, how are you?');

Sends a synchronous chat request and returns the response text. Blocks until
the request completes.

=head2 chat_stream

    my $request = $engine->chat_stream(@messages);

Builds and returns a streaming chat HTTP request object. Croaks if the engine
does not implement C<chat_stream_request>. Use L</simple_chat_stream> or
L</simple_chat_stream_iterator> to execute the request.

=head2 simple_chat_stream

    my $content = $engine->simple_chat_stream($callback, @messages);

    $engine->simple_chat_stream(sub {
        my ($chunk) = @_;
        print $chunk->content;
    }, 'Tell me a story');

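Sends a synchronous streaming chat request. Calls C<$callback> with each
L<Langertha::Stream::Chunk> as it is parsed. Returns the complete concatenated
content string when done. Blocks until the stream completes. The synchronous
path reads the full HTTP response before parsing it, so use
L</simple_chat_stream_realtime_f> when chunks must be delivered as they arrive
from the server.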

=head2 simple_chat_stream_iterator

    my $stream = $engine->simple_chat_stream_iterator(@messages);
    while (my $chunk = $stream->next) {
        print $chunk->content;
    }

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

response_format (currently L<Langertha::Engine::Perplexity>), the
request is automatically rewritten to use the JSON Schema path and the
response is loose-parsed; the resulting L<Langertha::Response> exposes
the parsed arguments via L<Langertha::Response/tool_call_args> with
C<synthetic =E<gt> 1> on the synthesized tool_call entry.

=head2 simple_chat_stream_f

    my ($content, $chunks) = $engine->simple_chat_stream_f(@messages)->get;

Async streaming without a real-time callback. Convenience wrapper around
L</simple_chat_stream_realtime_f> with C<undef> as the callback. Returns a
L<Future> that resolves to C<($content, \@chunks)>.

=head2 aggregate_tool_calls

    my $tool_calls = $engine->aggregate_tool_calls( $chunks );

Walks an ArrayRef of L<Langertha::Stream::Chunk> objects and returns
the flat list of L<Langertha::ToolCall> objects collected from any
chunks that carry C<tool_calls>. Returns an empty ArrayRef if none of
the chunks emitted tool calls.

This is the streaming counterpart to L<Langertha::Response/tool_calls>.
Engines that need to assemble fragmented tool-call deltas (OpenAI's
C<delta.tool_calls> stream, Anthropic's C<input_json_delta>) are
expected to do that assembly inside C<parse_stream_chunk> and attach
the finished L<Langertha::ToolCall> to the relevant chunk; this
helper just collects them.
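
The collection step amounts to flattening the C<tool_calls> of every chunk
that has any. A minimal sketch, assuming chunks expose a C<tool_calls>
accessor that returns an ArrayRef or C<undef>: C<My::Chunk> and
C<collect_tool_calls> are hypothetical, and plain strings stand in for
L<Langertha::ToolCall> objects.

```perl
use strict;
use warnings;

# Hypothetical chunk class, used only so the sketch runs on its own.
package My::Chunk {
  sub new        { my ($class, %args) = @_; return bless {%args}, $class }
  sub content    { $_[0]{content} // '' }
  sub tool_calls { $_[0]{tool_calls} }
}

# Flatten the tool calls of all chunks, keeping stream order.
sub collect_tool_calls {
  my ($chunks) = @_;
  return [ map { @{ $_->tool_calls || [] } } @$chunks ];
}

my @chunks = (
  My::Chunk->new(content => 'Hel'),
  My::Chunk->new(tool_calls => ['lookup']),
  My::Chunk->new(content => 'lo'),
  My::Chunk->new(tool_calls => ['search', 'fetch']),
);

my $tool_calls = collect_tool_calls(\@chunks);
print join(',', @$tool_calls), "\n";   # prints "lookup,search,fetch"
```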

=head2 simple_chat_stream_realtime_f

    # With async/await (recommended)
    use Future::AsyncAwait;

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

            sub { print shift->content },
            @messages
        );
        return $content;
    }

    # Traditional Future style
    my $future = $engine->simple_chat_stream_realtime_f($callback, @messages);
    my ($content, $chunks) = $future->get;

Async streaming with real-time callback. C<$callback> is called with each
L<Langertha::Stream::Chunk> as it arrives from the server. Returns a L<Future>
that resolves to C<($content, \@chunks)> where C<$content> is the full
concatenated text.

This is the recommended method for real-time streaming in async applications.
Pass C<undef> as the callback (or use L</simple_chat_stream_f>) if you only
need the final result.

=head2 content_format

    my $fmt = $engine->content_format;  # 'openai' | 'anthropic' | 'gemini'

Wire format for multimodal content blocks. Controls how
L<Langertha::Content> objects embedded in a message's C<content> arrayref
are serialized during L</chat_messages>. Defaults to C<'openai'>; overridden

lib/Langertha/Role/Chat.pm  view on Meta::CPAN

via C<around>) when the wire reality differs from the role inventory
— for example to clear C<tool_choice_named> on providers that only
accept string forms.

Common keys produced by the bundled roles:

=over

=item * C<chat> — C<simple_chat>/C<simple_chat_f> work

=item * C<streaming> — C<chat_stream_request> is wired up

=item * C<tools_native> — engine accepts a C<tools> array on the wire

=item * C<tools_hermes> — tools are injected via Hermes-style XML
prompt rather than (or in addition to) the native API

=item * C<tool_choice_auto> / C<tool_choice_any> / C<tool_choice_none> —
which string-form C<tool_choice> values are accepted

=item * C<tool_choice_named> — C<{type =E<gt> 'tool', name =E<gt> '...'}>

lib/Langertha/Role/HTTP.pm  view on Meta::CPAN

);
sub _build_user_agent {
  my ( $self ) = @_;
  return LWP::UserAgent->new(
    agent => $self->user_agent_agent,
    $self->has_user_agent_timeout ? ( timeout => $self->user_agent_timeout ) : (),
  );
}


sub execute_streaming_request {
  my ($self, $request, $chunk_callback) = @_;

  croak "execute_streaming_request requires Langertha::Role::Streaming"
    unless $self->does('Langertha::Role::Streaming');

  my $response = $self->user_agent->request($request);

  croak "".(ref $self)." streaming request failed: ".($response->status_line)
    unless $response->is_success;

  return $self->process_stream_data($response->content, $chunk_callback);
}



1;

__END__

lib/Langertha/Role/HTTP.pm  view on Meta::CPAN

=head2 user_agent_agent

The C<User-Agent> string sent with HTTP requests. Defaults to the engine's
class name.

=head2 user_agent

The L<LWP::UserAgent> instance used for synchronous HTTP requests. Built lazily
with C<user_agent_agent> and C<user_agent_timeout>.

=head2 execute_streaming_request

    my $chunks = $engine->execute_streaming_request($request, $chunk_callback);
    my $chunks = $engine->execute_streaming_request($request);

Executes a streaming HTTP request synchronously using L<LWP::UserAgent> and
delegates stream parsing to L<Langertha::Role::Streaming/process_stream_data>.
Requires the engine to also compose L<Langertha::Role::Streaming>. Returns an
ArrayRef of L<Langertha::Stream::Chunk> objects. If C<$chunk_callback> is
provided it is called with each chunk as it is parsed.
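
For readers unfamiliar with the parsing step, the sketch below splits a
complete SSE body into events and calls a callback for each one. It is an
illustration of the general technique only, not
L<Langertha::Role::Streaming/process_stream_data>: C<parse_sse_body> is a
hypothetical helper, the payloads are made up, and the C<[DONE]> end marker
is the OpenAI-style convention, which other providers may not use.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: parse a fully received SSE body into decoded events.
sub parse_sse_body {
  my ($body, $on_event) = @_;
  my @events;
  for my $block (split /\r?\n\r?\n/, $body) {        # events end with a blank line
    my @data;
    for my $line (split /\r?\n/, $block) {
      push @data, $1 if $line =~ /^data: ?(.*)$/;    # keep only data fields
    }
    next unless @data;
    my $payload = join "\n", @data;
    next if $payload eq '[DONE]';                    # OpenAI-style end marker
    my $event = decode_json($payload);
    push @events, $event;
    $on_event->($event) if $on_event;                # callback is optional
  }
  return \@events;
}

# Made-up body with two data events and an end marker.
my $body = join '',
  qq(data: {"text":"Hel"}\n\n),
  qq(data: {"text":"lo"}\n\n),
  qq(data: [DONE]\n\n);

my $streamed = '';
my $events = parse_sse_body($body, sub { $streamed .= $_[0]{text} });
print "$streamed (", scalar(@$events), " events)\n";   # prints "Hello (2 events)"
```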

=head1 SEE ALSO

=over

=item * L<Langertha::Role::JSON> - JSON encoding/decoding (required by this role)


