lib/AI/Ollama/Client/Impl.pm
The L<Mojo::UserAgent> to use
=head2 B<< server >>
The server to access
=cut
has 'schema_file' => (
is => 'lazy',
default => sub { require AI::Ollama::Client::Impl; module_file('AI::Ollama::Client::Impl', 'ollama-curated.yaml') },
);
has 'schema' => (
is => 'lazy',
default => sub {
if( my $fn = $_[0]->schema_file ) {
YAML::PP->new( boolean => 'JSON::PP' )->load_file($fn);
}
},
);
has 'validate_requests' => (
is => 'rw',
default => 1,
);
has 'validate_responses' => (
is => 'rw',
default => 1,
);
has 'openapi' => (
is => 'lazy',
default => sub {
if( my $schema = $_[0]->schema ) {
OpenAPI::Modern->new( openapi_schema => $schema, openapi_uri => '' )
}
},
);
# The HTTP stuff should go into a ::Role I guess
has 'ua' => (
is => 'lazy',
default => sub { Mojo::UserAgent->new },
);
has 'server' => (
is => 'ro',
);
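# A minimal construction sketch (the URL is an assumption; an Ollama
# server conventionally listens on http://127.0.0.1:11434 with its API
# under /api):
#
#   my $client = AI::Ollama::Client::Impl->new(
#       server => 'http://127.0.0.1:11434/api',
#   );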
=head1 METHODS
=head2 C<< build_checkBlob_request >>
lib/AI/Ollama/Client/Impl.pm
If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
=item -
If set to 0, the model will be unloaded immediately once finished.
=item -
If not set, the model will stay loaded for 5 minutes by default.
=back
=item C<< messages >>
The messages of the chat; these can be used to keep a chat memory
=item C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=item C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.
=item C<< stream >>
If C<false> the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
=back
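For illustration, the options above might be collected like this (a sketch; the C<role>/C<content> keys of a message follow the Ollama chat API):

  my %options = (
      model      => 'llama2:7b',
      messages   => [ { role => 'user', content => 'Why is the sky blue?' } ],
      stream     => !!0,   # a single response object rather than a stream
      keep_alive => 10,    # keep the model loaded for 10 minutes afterwards
  );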
lib/AI/Ollama/Client/Impl.pm
=over 4
=item C<< modelfile >>
The contents of the Modelfile.
=item C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=item C<< stream >>
If C<false> the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
=back
Returns an L<< AI::Ollama::CreateModelResponse >> on success.
=cut
lib/AI/Ollama/Client/Impl.pm
=head3 Options
=over 4
=item C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=back
=cut
sub build_deleteModel_request( $self, %options ) {
my $method = 'DELETE';
my $path = '/delete';
my $url = Mojo::URL->new( $self->server . $path );
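# $url now holds the configured server base with the endpoint path
# appended, e.g. http://127.0.0.1:11434/api/delete (the base URL here is
# an assumption for illustration)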
lib/AI/Ollama/Client/Impl.pm
=head3 Options
=over 4
=item C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=item C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.
=item C<< prompt >>
Text to generate embeddings for.
=back
lib/AI/Ollama/Client/Impl.pm
If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
=item -
If set to 0, the model will be unloaded immediately once finished.
=item -
If not set, the model will stay loaded for 5 minutes by default.
=back
=item C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=item C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as C<temperature>.
=item C<< prompt >>
The prompt to generate a response.
=item C<< raw >>
lib/AI/Ollama/Client/Impl.pm
=item C<< insecure >>
Allow insecure connections to the library.
Only use this if you are pulling from your own library during development.
=item C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=item C<< stream >>
If C<false> the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
=back
Returns an L<< AI::Ollama::PullModelResponse >> on success.
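As a sketch, a pull request might be built like this; the builder name follows the C<build_*_request> pattern used elsewhere in this class and is an assumption here:

  my $request = $client->build_pullModel_request(
      name   => 'llama2:7b',
      stream => 1,   # report download progress as a series of objects
  );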
=cut
lib/AI/Ollama/Client/Impl.pm
=head3 Options
=over 4
=item C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=back
Returns an L<< AI::Ollama::ModelInfo >> on success.
=cut
sub build_showModelInfo_request( $self, %options ) {
my $method = 'POST';
my $path = '/show';
lib/AI/Ollama/CreateModelRequest.pm
has 'modelfile' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< stream >>
lib/AI/Ollama/DeleteModelRequest.pm
sub as_hash( $self ) {
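# Shallow-copy all attributes of the (hash-based) object into a plain
# hashref, using the postfix hash dereference $self->%*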
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
lib/AI/Ollama/GenerateChatCompletionRequest.pm
],
);
=head2 C<< keep_alive >>
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.
=cut
has 'keep_alive' => (
is => 'ro',
isa => Int,
);
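# Sketch: a chat request that keeps the model loaded for 10 minutes
# afterwards (the model and messages values are placeholders):
#
#   AI::Ollama::GenerateChatCompletionRequest->new(
#       model      => 'llama2:7b',
#       messages   => [ { role => 'user', content => 'Hello' } ],
#       keep_alive => 10,
#   );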
=head2 C<< messages >>
lib/AI/Ollama/GenerateChatCompletionRequest.pm
has 'messages' => (
is => 'ro',
isa => ArrayRef[HashRef],
required => 1,
);
=head2 C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< options >>
lib/AI/Ollama/GenerateChatCompletionResponse.pm
has 'message' => (
is => 'ro',
isa => HashRef,
);
=head2 C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
);
=head2 C<< prompt_eval_count >>
lib/AI/Ollama/GenerateCompletionRequest.pm
isa => ArrayRef[Str],
);
=head2 C<< keep_alive >>
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.
=cut
has 'keep_alive' => (
is => 'ro',
isa => Int,
);
=head2 C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< options >>
lib/AI/Ollama/GenerateCompletionResponse.pm
has 'load_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
);
=head2 C<< prompt_eval_count >>
lib/AI/Ollama/GenerateEmbeddingRequest.pm
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< model >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
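# Sketch of a minimal embedding request built from the two fields
# documented here (the values are placeholders):
#
#   my $req = AI::Ollama::GenerateEmbeddingRequest->new(
#       model  => 'llama2:7b',
#       prompt => 'Why is the sky blue?',
#   );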
=head2 C<< options >>
lib/AI/Ollama/Model.pm
has 'modified_at' => (
is => 'ro',
isa => Str,
);
=head2 C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
);
=head2 C<< size >>
lib/AI/Ollama/ModelInfoRequest.pm
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
lib/AI/Ollama/PullModelRequest.pm
=cut
has 'insecure' => (
is => 'ro',
);
=head2 C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< stream >>
lib/AI/Ollama/RequestOptions.pm
=cut
has 'main_gpu' => (
is => 'ro',
isa => Int,
);
=head2 C<< mirostat >>
Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
=cut
has 'mirostat' => (
is => 'ro',
isa => Int,
);
=head2 C<< mirostat_eta >>
lib/AI/Ollama/RequestOptions.pm
=cut
has 'num_ctx' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_gpu >>
The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable.
=cut
has 'num_gpu' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_gqa >>
lib/AI/Ollama/RequestOptions.pm
=cut
has 'num_predict' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_thread >>
Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
=cut
has 'num_thread' => (
is => 'ro',
isa => Int,
);
=head2 C<< numa >>
lib/AI/Ollama/RequestOptions.pm
=cut
has 'temperature' => (
is => 'ro',
isa => Num,
);
=head2 C<< tfs_z >>
Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
=cut
has 'tfs_z' => (
is => 'ro',
isa => Num,
);
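# Sketch: combining several of the sampling options documented above
# (the values are illustrative, not recommendations):
#
#   my $options = AI::Ollama::RequestOptions->new(
#       temperature => 0.8,
#       mirostat    => 2,     # Mirostat 2.0 sampling
#       tfs_z       => 1.0,   # 1.0 disables tail free sampling
#       num_thread  => 8,     # e.g. the machine's physical core count
#   );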
=head2 C<< top_k >>
ollama/ollama-curated.json
{"openapi":"3.0.3","components":{"schemas":{"PushModelResponse":{"properties":{"total":{"type":"integer","description":"total size of the model","example":"2142590208"},"status":{"$ref":"#/components/schemas/PushModelStatus"},"digest":{"example":"sha...
ollama/ollama-curated.yaml
schemas:
GenerateCompletionRequest:
type: object
description: Request class for the generate endpoint.
properties:
model:
type: string
description: &model_name |
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
example: llama2:7b
prompt:
type: string
description: The prompt to generate a response.
example: Why is the sky blue?
images:
type: array
description: (optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
items:
type: string
ollama/ollama-curated.yaml
raw:
type: boolean
description: |
If `true` no formatting will be applied to the prompt and no context will be returned.
You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API, and are managing history yourself.
stream:
type: boolean
description: &stream |
If `false` the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
default: false
keep_alive:
type: integer
description: &keep_alive |
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default.
required:
- model
- prompt
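# A minimal body satisfying this schema (a sketch; only `model` and
# `prompt` are required):
#
#   { "model": "llama2:7b", "prompt": "Why is the sky blue?", "stream": false }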
RequestOptions:
type: object
description: Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
properties:
num_keep:
type: integer
description: |
ollama/ollama-curated.yaml
Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
top_p:
type: number
format: float
description: |
Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
tfs_z:
type: number
format: float
description: |
Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
typical_p:
type: number
format: float
description: |
Typical p is used to reduce the impact of less probable tokens from the output.
repeat_last_n:
type: integer
description: |
Sets how far back the model looks back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
temperature:
ollama/ollama-curated.yaml
description: |
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
frequency_penalty:
type: number
format: float
description: |
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
mirostat:
type: integer
description: |
Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
mirostat_tau:
type: number
format: float
description: |
Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
mirostat_eta:
type: number
format: float
description: |
Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)
ollama/ollama-curated.yaml
type: integer
description: |
Sets the number of batches to use for generation. (Default: 1)
num_gqa:
type: integer
description: |
The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for `llama2:70b`.
num_gpu:
type: integer
description: |
The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable.
main_gpu:
type: integer
description: |
The GPU to use for the main model. Default is 0.
low_vram:
type: boolean
description: |
Enable low VRAM mode. (Default: false)
f16_kv:
type: boolean
ollama/ollama-curated.yaml
description: |
The base of the rope frequency scale. (Default: 1.0)
rope_frequency_scale:
type: number
format: float
description: |
The scale of the rope frequency. (Default: 1.0)
num_thread:
type: integer
description: |
Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
ResponseFormat:
type: string
description: |
The format to return a response in. Currently the only accepted value is json.
Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.
Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
enum:
- json
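# Sketch: a generate request with JSON mode enabled
#   { "model": "llama2:7b", "prompt": "List three colors as JSON.", "format": "json" }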
ollama/ollama-curated.yaml
description: The messages of the chat; these can be used to keep a chat memory
items:
$ref: '#/components/schemas/Message'
format:
$ref: '#/components/schemas/ResponseFormat'
options:
$ref: '#/components/schemas/RequestOptions'
stream:
type: boolean
description: *stream
default: false
keep_alive:
type: integer
description: *keep_alive
required:
- model
- messages
GenerateChatCompletionResponse:
type: object
description: The response class for the chat endpoint.
properties:
ollama/ollama-curated.yaml
type: string
description: *model_name
example: mario
modelfile:
type: string
description: The contents of the Modelfile.
example: FROM llama2\nSYSTEM You are mario from Super Mario Bros.
stream:
type: boolean
description: *stream
default: false
required:
- name
- modelfile
CreateModelResponse:
description: Response object for creating a model. When finished, `status` is `success`.
type: object
properties:
status:
$ref: '#/components/schemas/CreateModelStatus'
CreateModelStatus:
ollama/ollama-curated.yaml
name:
type: string
description: *model_name
example: llama2:7b
insecure:
type: boolean
description: |
Allow insecure connections to the library.
Only use this if you are pulling from your own library during development.
default: false
stream:
type: boolean
description: *stream
default: false
required:
- name
PullModelResponse:
description: |
Response class for pulling a model.
The first object is the manifest. Then there is a series of downloading responses. Until a download is completed, the `completed` key may not be included.
The number of files to be downloaded depends on the number of layers specified in the manifest.
type: object
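# Sketch of the streamed response sequence (the status strings are
# illustrative, not verbatim server output):
#   {"status":"pulling manifest"}
#   {"status":"downloading ...","digest":"sha256:...","total":2142590208,"completed":1048576}
#   {"status":"success"}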
ollama/ollama-curated.yaml
name:
type: string
description: The name of the model to push in the form of <namespace>/<model>:<tag>.
example: 'mattw/pygmalion:latest'
insecure:
type: boolean
description: |
Allow insecure connections to the library.
Only use this if you are pushing to your library during development.
default: false
stream:
type: boolean
description: *stream
default: false
required:
- name
PushModelResponse:
type: object
description: Response class for pushing a model.
properties:
status:
$ref: '#/components/schemas/PushModelStatus'
digest:
type: string
openapi/petstore-expanded.yaml
format: int32
responses:
'200':
description: pet response
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/Pet'
default:
description: unexpected error
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
post:
description: Creates a new pet in the store. Duplicates are allowed
operationId: addPet
requestBody:
description: Pet to add to the store
openapi/petstore-expanded.yaml
application/json:
schema:
$ref: '#/components/schemas/NewPet'
responses:
'200':
description: pet response
content:
application/json:
schema:
$ref: '#/components/schemas/Pet'
default:
description: unexpected error
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
/pets/{id}:
get:
description: Returns a user based on a single ID, if the user does not have access to the pet
operationId: find pet by id
parameters:
openapi/petstore-expanded.yaml
schema:
type: integer
format: int64
responses:
'200':
description: pet response
content:
application/json:
schema:
$ref: '#/components/schemas/Pet'
default:
description: unexpected error
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
delete:
description: deletes a single pet based on the ID supplied
operationId: deletePet
parameters:
- name: id
in: path
description: ID of pet to delete
required: true
schema:
type: integer
format: int64
responses:
'204':
description: pet deleted
default:
description: unexpected error
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
components:
schemas:
Pet:
allOf:
- $ref: '#/components/schemas/NewPet'