AI-Ollama-Client


lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

The L<Mojo::UserAgent> to use

=head2 B<< server >>

The server to access

=cut

has 'schema_file' => (
    is => 'lazy',
    default => sub { require AI::Ollama::Client::Impl; module_file('AI::Ollama::Client::Impl', 'ollama-curated.yaml') },
);

has 'schema' => (
    is => 'lazy',
    default => sub {
        if( my $fn = $_[0]->schema_file ) {
            YAML::PP->new( boolean => 'JSON::PP' )->load_file($fn);
        }
    },
);

has 'validate_requests' => (
    is => 'rw',
    default => 1,
);

has 'validate_responses' => (
    is => 'rw',
    default => 1,
);

has 'openapi' => (
    is => 'lazy',
    default => sub {
        if( my $schema = $_[0]->schema ) {
            OpenAPI::Modern->new( openapi_schema => $schema, openapi_uri => '' )
        }
    },
);

# The HTTP stuff should go into a ::Role I guess
has 'ua' => (
    is => 'lazy',
    default => sub { Mojo::UserAgent->new },
);

has 'server' => (
    is => 'ro',
);
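
Taken together, these attributes are enough to set up a client. A minimal sketch (the `server` URL is an example value; only attributes documented above are used):

```perl
use AI::Ollama::Client;

# Point the client at a local Ollama server. Request and response
# validation against the bundled OpenAPI schema is on by default.
my $client = AI::Ollama::Client->new(
    server => 'http://127.0.0.1:11434/api',
);

# The validation flags are read-write accessors and can be toggled
# at runtime, e.g. to skip response validation:
$client->validate_responses(0);
```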

=head1 METHODS

=head2 C<< build_checkBlob_request >>

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.


=item -

If set to 0, the model will be unloaded immediately once finished.


=item -

If not set, the model will stay loaded for 5 minutes by default.


=back

=item C<< messages >>

The messages of the chat; these can be used to keep a chat memory

=item C<< model >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=item C<< options >>

Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.

=item C<< stream >>

If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.

=back
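
The options above map onto a plain hash of named arguments. A hypothetical call (a sketch: the C<generateChatCompletion> method name is inferred from the OpenAPI operation ids behind the C<build_*_request> helpers, and the message-hash shape follows the Ollama chat API):

```perl
my $client = AI::Ollama::Client->new(
    server => 'http://127.0.0.1:11434/api',
);

# Ask for a single response object rather than a stream.
my $res = $client->generateChatCompletion(
    model    => 'llama2:7b',
    messages => [
        { role => 'user', content => 'Why is the sky blue?' },
    ],
    stream   => 0,
);
```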

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

=over 4

=item C<< modelfile >>

The contents of the Modelfile.

=item C<< name >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=item C<< stream >>

If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.

=back

Returns an L<< AI::Ollama::CreateModelResponse >> on success.

=cut

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN



=head3 Options

=over 4

=item C<< name >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=back


=cut
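
The helper below only builds the request; it does not send it. A usage sketch (treat the exact return value as an assumption — it is whatever request object the C<Mojo::UserAgent>-based transport expects):

```perl
# Build (but do not dispatch) a DELETE request for a named model.
my $client = AI::Ollama::Client::Impl->new(
    server => 'http://127.0.0.1:11434/api',
);
my $tx = $client->build_deleteModel_request( name => 'llama2:7b' );
```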

sub build_deleteModel_request( $self, %options ) {
    my $method = 'DELETE';
    my $path = '/delete';
    my $url = Mojo::URL->new( $self->server . $path );

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN



=head3 Options

=over 4

=item C<< model >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=item C<< options >>

Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.

=item C<< prompt >>

Text to generate embeddings for.

=back
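
These options again form a plain hash. A hypothetical embeddings request built from the fields above (values are illustrative):

```perl
# Sketch of an embeddings request payload; field names come from
# the options list above, values are examples only.
my %request = (
    model   => 'llama2:7b',
    prompt  => 'Why is the sky blue?',
    options => { temperature => 0.0 },   # optional model parameters
);
```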

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.


=item -

If set to 0, the model will be unloaded immediately once finished.


=item -

If not set, the model will stay loaded for 5 minutes by default.


=back

=item C<< model >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=item C<< options >>

Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.

=item C<< prompt >>

The prompt to generate a response.

=item C<< raw >>

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN

=item C<< insecure >>

Allow insecure connections to the library.

Only use this if you are pulling from your own library during development.

=item C<< name >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=item C<< stream >>

If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.

=back

Returns an L<< AI::Ollama::PullModelResponse >> on success.

=cut
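
The same fields drive the request class in lib/AI/Ollama/PullModelRequest.pm, shown further down. A minimal construction (values are examples):

```perl
use AI::Ollama::PullModelRequest;

my $pull = AI::Ollama::PullModelRequest->new(
    name   => 'llama2:7b',
    stream => 1,   # receive download-progress objects as they arrive
);
```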

lib/AI/Ollama/Client/Impl.pm  view on Meta::CPAN



=head3 Options

=over 4

=item C<< name >>

The model name.

Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.

=back

Returns an L<< AI::Ollama::ModelInfo >> on success.

=cut

sub build_showModelInfo_request( $self, %options ) {
    my $method = 'POST';
    my $path = '/show';

lib/AI/Ollama/CreateModelRequest.pm  view on Meta::CPAN

has 'modelfile' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< name >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'name' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< stream >>

lib/AI/Ollama/DeleteModelRequest.pm  view on Meta::CPAN

sub as_hash( $self ) {
    return { $self->%* }
}

=head1 PROPERTIES

=head2 C<< name >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'name' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);


lib/AI/Ollama/GenerateChatCompletionRequest.pm  view on Meta::CPAN

    ],
);

=head2 C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.

=cut

has 'keep_alive' => (
    is       => 'ro',
    isa      => Int,
);
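
The four cases read more clearly with concrete values. A sketch (units are minutes, per the description above; the attribute values are examples):

```perl
use AI::Ollama::GenerateChatCompletionRequest;

my $req = AI::Ollama::GenerateChatCompletionRequest->new(
    model      => 'llama2:7b',
    messages   => [ { role => 'user', content => 'Hello' } ],
    keep_alive => -1,    # negative: stay loaded indefinitely
);
# keep_alive => 20 would keep the model loaded for 20 minutes;
# keep_alive => 0 unloads it immediately once finished;
# omitting it falls back to the 5-minute default.
```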

=head2 C<< messages >>

lib/AI/Ollama/GenerateChatCompletionRequest.pm  view on Meta::CPAN

has 'messages' => (
    is       => 'ro',
    isa      => ArrayRef[HashRef],
    required => 1,
);

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'model' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< options >>

lib/AI/Ollama/GenerateChatCompletionResponse.pm  view on Meta::CPAN


has 'message' => (
    is       => 'ro',
    isa      => HashRef,
);

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'model' => (
    is       => 'ro',
    isa      => Str,
);

=head2 C<< prompt_eval_count >>

lib/AI/Ollama/GenerateCompletionRequest.pm  view on Meta::CPAN

    isa      => ArrayRef[Str],
);

=head2 C<< keep_alive >>

How long (in minutes) to keep the model loaded in memory.

- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.

=cut

has 'keep_alive' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'model' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< options >>

lib/AI/Ollama/GenerateCompletionResponse.pm  view on Meta::CPAN


has 'load_duration' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'model' => (
    is       => 'ro',
    isa      => Str,
);

=head2 C<< prompt_eval_count >>

lib/AI/Ollama/GenerateEmbeddingRequest.pm  view on Meta::CPAN

sub as_hash( $self ) {
    return { $self->%* }
}

=head1 PROPERTIES

=head2 C<< model >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'model' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< options >>

lib/AI/Ollama/Model.pm  view on Meta::CPAN


has 'modified_at' => (
    is       => 'ro',
    isa      => Str,
);

=head2 C<< name >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'name' => (
    is       => 'ro',
    isa      => Str,
);

=head2 C<< size >>

lib/AI/Ollama/ModelInfoRequest.pm  view on Meta::CPAN

sub as_hash( $self ) {
    return { $self->%* }
}

=head1 PROPERTIES

=head2 C<< name >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'name' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);


lib/AI/Ollama/PullModelRequest.pm  view on Meta::CPAN

=cut

has 'insecure' => (
    is       => 'ro',
);

=head2 C<< name >>

The model name.

Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.

=cut

has 'name' => (
    is       => 'ro',
    isa      => Str,
    required => 1,
);

=head2 C<< stream >>

lib/AI/Ollama/RequestOptions.pm  view on Meta::CPAN


=cut

has 'main_gpu' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< mirostat >>

Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

=cut

has 'mirostat' => (
    is       => 'ro',
    isa      => Int,
);
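
Several of these tuning knobs can be combined in a single options object. A sketch using only attributes documented in this file (values are illustrative, not recommendations):

```perl
use AI::Ollama::RequestOptions;

my $opts = AI::Ollama::RequestOptions->new(
    mirostat    => 2,     # Mirostat 2.0 sampling
    temperature => 0.8,
    num_thread  => 8,     # e.g. match the physical core count
);
```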

=head2 C<< mirostat_eta >>

lib/AI/Ollama/RequestOptions.pm  view on Meta::CPAN


=cut

has 'num_ctx' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< num_gpu >>

The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable.

=cut

has 'num_gpu' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< num_gqa >>

lib/AI/Ollama/RequestOptions.pm  view on Meta::CPAN


=cut

has 'num_predict' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< num_thread >>

Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).

=cut

has 'num_thread' => (
    is       => 'ro',
    isa      => Int,
);

=head2 C<< numa >>

lib/AI/Ollama/RequestOptions.pm  view on Meta::CPAN


=cut

has 'temperature' => (
    is       => 'ro',
    isa      => Num,
);

=head2 C<< tfs_z >>

Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)

=cut

has 'tfs_z' => (
    is       => 'ro',
    isa      => Num,
);

=head2 C<< top_k >>

ollama/ollama-curated.json  view on Meta::CPAN

{"openapi":"3.0.3","components":{"schemas":{"PushModelResponse":{"properties":{"total":{"type":"integer","description":"total size of the model","example":"2142590208"},"status":{"$ref":"#/components/schemas/PushModelStatus"},"digest":{"example":"sha...

ollama/ollama-curated.yaml  view on Meta::CPAN

  schemas:
    GenerateCompletionRequest:
      type: object
      description: Request class for the generate endpoint.
      properties:
        model:
          type: string
          description: &model_name |
            The model name.

            Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
          example: llama2:7b
        prompt:
          type: string
          description: The prompt to generate a response.
          example: Why is the sky blue?
        images:
          type: array
          description: (optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
          items:
            type: string

ollama/ollama-curated.yaml  view on Meta::CPAN

        raw:
          type: boolean
          description: |
            If `true` no formatting will be applied to the prompt and no context will be returned.

            You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API, and are managing history yourself.
        stream:
          type: boolean
          description: &stream |
            If `false`, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
          default: false
        keep_alive:
          type: integer
          description: &keep_alive |
            How long (in minutes) to keep the model loaded in memory.

            - If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
            - If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
            - If set to 0, the model will be unloaded immediately once finished.
            - If not set, the model will stay loaded for 5 minutes by default.
      required:
        - model
        - prompt
    RequestOptions:
      type: object
      description: Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
      properties:
        num_keep:
          type: integer
          description: |

ollama/ollama-curated.yaml  view on Meta::CPAN

            Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
        top_p:
          type: number
          format: float
          description: |
            Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
        tfs_z:
          type: number
          format: float
          description: |
            Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
        typical_p:
          type: number
          format: float
          description: |
            Typical p is used to reduce the impact of less probable tokens from the output.
        repeat_last_n:
          type: integer
          description: |
            Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
        temperature:

ollama/ollama-curated.yaml  view on Meta::CPAN

          description: |
            Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
        frequency_penalty:
          type: number
          format: float
          description: |
            Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
        mirostat:
          type: integer
          description: |
            Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
        mirostat_tau:
          type: number
          format: float
          description: |
            Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
        mirostat_eta:
          type: number
          format: float
          description: |
            Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)

ollama/ollama-curated.yaml  view on Meta::CPAN

          type: integer
          description: |
            Sets the number of batches to use for generation. (Default: 1)
        num_gqa:
          type: integer
          description: |
            The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for `llama2:70b`.
        num_gpu:
          type: integer
          description: |
            The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable.
        main_gpu:
          type: integer
          description: |
            The GPU to use for the main model. Default is 0.
        low_vram:
          type: boolean
          description: |
            Enable low VRAM mode. (Default: false)
        f16_kv:
          type: boolean

ollama/ollama-curated.yaml  view on Meta::CPAN

          description: |
            The base of the rope frequency scale. (Default: 1.0)
        rope_frequency_scale:
          type: number
          format: float
          description: |
            The scale of the rope frequency. (Default: 1.0)
        num_thread:
          type: integer
          description: |
            Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
    ResponseFormat:
      type: string
      description: |
        The format to return a response in. Currently the only accepted value is json.

        Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.

        Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
      enum:
        - json

ollama/ollama-curated.yaml  view on Meta::CPAN

          description: The messages of the chat; these can be used to keep a chat memory
          items:
            $ref: '#/components/schemas/Message'
        format:
          $ref: '#/components/schemas/ResponseFormat'
        options:
          $ref: '#/components/schemas/RequestOptions'
        stream:
          type: boolean
          description: *stream
          default: false
        keep_alive:
          type: integer
          description: *keep_alive
      required:
        - model
        - messages
    GenerateChatCompletionResponse:
      type: object
      description: The response class for the chat endpoint.
      properties:

ollama/ollama-curated.yaml  view on Meta::CPAN

          type: string
          description: *model_name
          example: mario
        modelfile:
          type: string
          description: The contents of the Modelfile.
          example: FROM llama2\nSYSTEM You are mario from Super Mario Bros.
        stream:
          type: boolean
          description: *stream
          default: false
      required:
        - name
        - modelfile
    CreateModelResponse:
      description: Response object for creating a model. When finished, `status` is `success`.
      type: object
      properties:
        status:
          $ref: '#/components/schemas/CreateModelStatus'
    CreateModelStatus:

ollama/ollama-curated.yaml  view on Meta::CPAN

        name:
          type: string
          description: *model_name
          example: llama2:7b
        insecure:
          type: boolean
          description: |
            Allow insecure connections to the library.

            Only use this if you are pulling from your own library during development.
          default: false
        stream:
          type: boolean
          description: *stream
          default: false
      required:
        - name
    PullModelResponse:
      description: |
        Response class for pulling a model.

        The first object is the manifest. Then there is a series of downloading responses. Until any of the downloads is completed, the `completed` key may not be included.

        The number of files to be downloaded depends on the number of layers specified in the manifest.
      type: object

ollama/ollama-curated.yaml  view on Meta::CPAN

        name:
          type: string
          description: The name of the model to push in the form of <namespace>/<model>:<tag>.
          example: 'mattw/pygmalion:latest'
        insecure:
          type: boolean
          description: |
            Allow insecure connections to the library.

            Only use this if you are pushing to your library during development.
          default: false
        stream:
          type: boolean
          description: *stream
          default: false
      required:
        - name
    PushModelResponse:
      type: object
      description: Response class for pushing a model.
      properties:
        status:
          $ref: '#/components/schemas/PushModelStatus'
        digest:
          type: string

openapi/petstore-expanded.yaml  view on Meta::CPAN

            format: int32
      responses:
        '200':
          description: pet response
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/Pet'
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
    post:
      description: Creates a new pet in the store. Duplicates are allowed
      operationId: addPet
      requestBody:
        description: Pet to add to the store

openapi/petstore-expanded.yaml  view on Meta::CPAN

          application/json:
            schema:
              $ref: '#/components/schemas/NewPet'
      responses:
        '200':
          description: pet response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Pet'
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
  /pets/{id}:
    get:
      description: Returns a user based on a single ID, if the user does not have access to the pet
      operationId: find pet by id
      parameters:

openapi/petstore-expanded.yaml  view on Meta::CPAN

          schema:
            type: integer
            format: int64
      responses:
        '200':
          description: pet response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Pet'
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
    delete:
      description: deletes a single pet based on the ID supplied
      operationId: deletePet
      parameters:
        - name: id
          in: path
          description: ID of pet to delete
          required: true
          schema:
            type: integer
            format: int64
      responses:
        '204':
          description: pet deleted
        default:
          description: unexpected error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  schemas:
    Pet:
      allOf:
        - $ref: '#/components/schemas/NewPet'


