AI-Ollama-Client


lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

=over 4

=item C<< format >>

The format to return a response in. Currently the only accepted value is json.

Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.

Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.

=item C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

=over

=item -

If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.


lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

The format to return a response in. Currently the only accepted value is json.

Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.

Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.

=item C<< images >>

(optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)

=item C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

=over

=item -

If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.


lib/AI/Ollama/GenerateChatCompletionRequest.pm  view on Meta::CPAN


=cut

has 'format' => (
    is       => 'ro',
    isa      => Enum[
        "json",
    ],
);

=head2 C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.

=cut

has 'keep_alive' => (
    is       => 'ro',
    isa      => Int,
);
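
The four C<keep_alive> cases listed above can be sketched as a small lookup. This is only an illustration of the documented rules: C<describe_keep_alive> is a hypothetical helper, not part of AI::Ollama::Client, and it assumes the value is an integer number of minutes as the documentation states.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Ollama::Client): describe what a
# given keep_alive value means according to the documented rules.
sub describe_keep_alive {
    my ($minutes) = @_;
    return 'stay loaded for 5 minutes (default)' unless defined $minutes;
    return 'unload immediately once finished'    if $minutes == 0;
    return 'stay loaded indefinitely'            if $minutes < 0;
    return "stay loaded for $minutes minutes";
}

# Walk through the four documented cases: not set, 0, negative, positive.
for my $value (undef, 0, -1, 20) {
    my $label = defined $value ? $value : '(not set)';
    print "keep_alive $label: ", describe_keep_alive($value), "\n";
}
```

Leaving the attribute unset is distinct from passing 0, which is why the helper checks C<defined> before comparing numbers.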

=head2 C<< messages >>

The messages of the chat. This can be used to keep a chat memory.

=cut

lib/AI/Ollama/GenerateCompletionRequest.pm  view on Meta::CPAN


(optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)

=cut

has 'images' => (
    is       => 'ro',
    isa      => ArrayRef[Str],
);
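
Since C<images> is an C<ArrayRef[Str]> of Base64 strings, the values could be prepared with the core module MIME::Base64, roughly as below. C<encode_images> is a hypothetical helper and the bytes are a placeholder, not real image data.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: turn raw image bytes into the Base64 strings
# expected by the images attribute (an ArrayRef[Str]).
sub encode_images {
    my (@raw_images) = @_;
    # The second argument '' suppresses the line breaks that
    # encode_base64 otherwise inserts into its output.
    return [ map { encode_base64($_, '') } @raw_images ];
}

# Placeholder bytes stand in for image data read from a file in binmode.
my $images = encode_images("\x89PNG\r\n\x1a\n");
print "$_\n" for @$images;
```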

=head2 C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.

=cut

has 'keep_alive' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
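
As a minimal sketch of the `model:tag` convention, the following splits a name and applies the documented `latest` default. C<split_model_name> is a hypothetical helper, and it assumes plain names like the examples above (no registry host or port in front of the model name).

```perl
use strict;
use warnings;

# Hypothetical helper: split "model:tag" into its parts, applying the
# documented default tag "latest" when no tag is given.
sub split_model_name {
    my ($name) = @_;
    my ($model, $tag) = split /:/, $name, 2;
    $tag = 'latest' unless defined $tag && length $tag;
    return ($model, $tag);
}

for my $name ('orca-mini:3b-q4_1', 'llama2:70b', 'llama2') {
    my ($model, $tag) = split_model_name($name);
    print "$name -> model=$model tag=$tag\n";
}
```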

ollama/ollama-curated.json  view on Meta::CPAN

{"openapi":"3.0.3","components":{"schemas":{"PushModelResponse":{"properties":{"total":{"type":"integer","description":"total size of the model","example":"2142590208"},"status":{"$ref":"#/components/schemas/PushModelStatus"},"digest":{"example":"sha...

ollama/ollama-curated.yaml  view on Meta::CPAN

          type: boolean
          description: |
            If `true` no formatting will be applied to the prompt and no context will be returned.

            You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API, and are managing history yourself.
        stream:
          type: boolean
          description: &stream |
            If `false` the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
          default: false
        keep_alive:
          type: integer
          description: &keep_alive |
            How long (in minutes) to keep the model loaded in memory.

            - If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
            - If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
            - If set to 0, the model will be unloaded immediately once finished.
            - If not set, the model will stay loaded for 5 minutes by default
      required:
        - model
        - prompt
    RequestOptions:

ollama/ollama-curated.yaml  view on Meta::CPAN

          items:
            $ref: '#/components/schemas/Message'
        format:
          $ref: '#/components/schemas/ResponseFormat'
        options:
          $ref: '#/components/schemas/RequestOptions'
        stream:
          type: boolean
          description: *stream
          default: false
        keep_alive:
          type: integer
          description: *keep_alive
      required:
        - model
        - messages
    GenerateChatCompletionResponse:
      type: object
      description: The response class for the chat endpoint.
      properties:
        message:
          $ref: '#/components/schemas/Message'
        model:

t/generate.request  view on Meta::CPAN

Accept: application/x-ndjson
Accept-Encoding: gzip, deflate, br
Accept-Language: de,en-US;q=0.7,en;q=0.3
Cache-Control: no-cache
Connection: keep-alive
Content-Length: 942
Content-Type: application/json
DNT: 1
Host: localhost:11434
Origin: https://dhcode.github.io
Pragma: no-cache
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: cross-site
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0

t/generate.request  view on Meta::CPAN

    "use_mmap": true,
    "use_mlock": true,
    "embedding_only": true,
    "rope_frequency_base": 0,
    "rope_frequency_scale": 0,
    "num_thread": 0
  },
  "format": "json",
  "raw": true,
  "stream": false,
  "keep_alive": 0
}
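
A body with the same trailing fields as this fixture could be assembled with the core JSON::PP module, as sketched below. C<build_generate_body> is a hypothetical helper, the model and prompt values are placeholders, and the C<options> object from the fixture is left out for brevity.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: assemble a request body resembling the tail of
# the fixture. Model and prompt are supplied by the caller.
sub build_generate_body {
    my (%args) = @_;
    my %body = (
        model      => $args{model},
        prompt     => $args{prompt},
        format     => 'json',             # JSON mode
        raw        => JSON::PP::true(),   # no prompt templating
        stream     => JSON::PP::false(),  # single response object
        keep_alive => 0,                  # unload once finished
    );
    # canonical sorts the keys so the output is stable between runs.
    return JSON::PP->new->canonical->pretty->encode(\%body);
}

print build_generate_body(
    model  => 'llama2',
    prompt => 'List three colors. Respond using JSON.',
);
```

Using the JSON::PP boolean values matters here: a plain Perl 1 or 0 would be serialized as a number rather than as C<true> or C<false>.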


