0.05 2025-03-01
* Add File::ShareDir::Install to CONFIGURE_REQUIRES
This fixes GH #3
0.04 2025-02-28
* Regenerate with better OpenAPI converter
* Distribute our OpenAPI spec
* Request builders are now public
* HTTP events are emitted for easier tracing
- request
- response
0.03 2024-04-08
* Fix prerequisites
0.02 2024-04-07
* Regenerated documentation
* Fixed code for uploading a model
0.01 2024-04-05
* Released on an unsuspecting world
* Generated with OpenAPI::PerlGenerator
The Artistic License 2.0
Copyright (c) 2000-2006, The Perl Foundation.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
This license establishes the terms under which a given free software
Package may be copied, modified, distributed, and/or redistributed.
The intent is that the Copyright Holder maintains some artistic
control over the development of that Package while still keeping the
Package available as open source and free software.
You are always permitted to make arrangements wholly outside of this
license directly with the Copyright Holder of a given Package. If the
terms of this license do not permit the full use that you propose to
make of the Package, you should contact the Copyright Holder and seek
a different licensing arrangement.
Definitions
"Copyright Holder" means the individual(s) or organization(s)
named in the copyright notice for the entire Package.
"Contributor" means any party that has contributed code or other
material to the Package, in accordance with the Copyright Holder's
procedures.
"You" and "your" means any person who would like to copy,
distribute, or modify the Package.
"Package" means the collection of files distributed by the
Copyright Holder, and derivatives of that collection and/or of
those files. A given Package may consist of either the Standard
Version, or a Modified Version.
"Distribute" means providing a copy of the Package or making it
accessible to anyone else, or in the case of a company or
organization, to others outside of your company or organization.
"Distributor Fee" means any fee that you charge for Distributing
this Package or providing support for this Package to another
party. It does not mean licensing fees.
"Standard Version" refers to the Package if it has not been
modified, or has been modified only in ways explicitly requested
by the Copyright Holder.
"Modified Version" means the Package, if it has been changed, and
such changes were not explicitly requested by the Copyright
Holder.
"Original License" means this Artistic License as Distributed with
the Standard Version of the Package, in its current version or as
it may be modified by The Perl Foundation in the future.
"Source" form means the source code, documentation source, and
configuration files for the Package.
"Compiled" form means the compiled bytecode, object code, binary,
or any other form resulting from mechanical transformation or
translation of the Source form.
Permission for Use and Modification Without Distribution
(1) You are permitted to use the Standard Version and create and use
Modified Versions for any purpose without restriction, provided that
you do not Distribute the Modified Version.
Permissions for Redistribution of the Standard Version
Distribution of Modified Versions of the Package as Source
(4) You may Distribute your Modified Version as Source (either gratis
or for a Distributor Fee, and with or without a Compiled form of the
Modified Version) provided that you clearly document how it differs
from the Standard Version, including, but not limited to, documenting
any non-standard features, executables, or modules, and provided that
you do at least ONE of the following:
(a) make the Modified Version available to the Copyright Holder
of the Standard Version, under the Original License, so that the
Copyright Holder may include your modifications in the Standard
Version.
(b) ensure that installation of your Modified Version does not
prevent the user installing or running the Standard Version. In
addition, the Modified Version must bear a name that is different
from the name of the Standard Version.
(c) allow anyone who receives a copy of the Modified Version to
make the Source form of the Modified Version available to others
under
(i) the Original License or
(ii) a license that permits the licensee to freely copy,
modify and redistribute the Modified Version using the same
licensing terms that apply to the copy that the licensee
received, and requires that the Source form of the Modified
Version, and of any works derived from it, be made freely
available in that license fees are prohibited but Distributor
Fees are allowed.
Distribution of Compiled Forms of the Standard Version
or Modified Versions without the Source
(5) You may Distribute Compiled forms of the Standard Version without
the Source, provided that you include complete instructions on how to
get the Source of the Standard Version. Such instructions must be
valid at the time of your distribution. If these instructions, at any
time while you are carrying out such distribution, become invalid, you
{
"abstract" : "Client for AI::Ollama",
"author" : [
"Max Maischein <corion@cpan.org>"
],
"dynamic_config" : 0,
"generated_by" : "ExtUtils::MakeMaker version 7.64, CPAN::Meta::Converter version 2.150010",
"license" : [
"artistic_2"
],
"meta-spec" : {
"url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec",
"version" : 2
},
"name" : "AI-Ollama-Client",
"no_index" : {
"directory" : [
"t",
"inc"
]
},
"prereqs" : {
"build" : {
"requires" : {
"strict" : "0"
}
},
"configure" : {
"requires" : {
"ExtUtils::MakeMaker" : "0",
"File::ShareDir::Install" : "0"
}
},
"runtime" : {
"requires" : {
"Carp" : "0",
"File::ShareDir" : "0",
"Future" : "0",
"Future::Mojo" : "0",
"Future::Queue" : "0",
"Future::Utils" : "0",
"Mojo::JSON" : "0",
"Mojo::URL" : "0",
"Mojo::UserAgent" : "0",
"Mojolicious" : "9",
"Moo" : "2",
"OpenAPI::Modern" : "0",
"PerlX::Maybe" : "0",
"Role::EventEmitter" : "0",
"URI::Template" : "0",
"YAML::PP" : "0",
"experimental" : "0.031",
"perl" : "5.020",
"stable" : "0.031"
}
},
"test" : {
"requires" : {
"Test2::V0" : "0"
}
}
},
"release_status" : "stable",
"resources" : {
"bugtracker" : {
"web" : "https://github.com/Corion/AI-Ollama-Client/issues"
},
"license" : [
"https://dev.perl.org/licenses/"
],
"repository" : {
"type" : "git",
"url" : "git://github.com/Corion/AI-Ollama-Client.git",
"web" : "https://github.com/Corion/AI-Ollama-Client"
}
},
"version" : "0.05",
"x_serialization_backend" : "JSON::PP version 4.07",
"x_static_install" : 1
}
---
abstract: 'Client for AI::Ollama'
author:
- 'Max Maischein <corion@cpan.org>'
build_requires:
Test2::V0: '0'
strict: '0'
configure_requires:
ExtUtils::MakeMaker: '0'
File::ShareDir::Install: '0'
dynamic_config: 0
generated_by: 'ExtUtils::MakeMaker version 7.64, CPAN::Meta::Converter version 2.150010'
license: artistic_2
meta-spec:
url: http://module-build.sourceforge.net/META-spec-v1.4.html
version: '1.4'
name: AI-Ollama-Client
no_index:
directory:
- t
- inc
requires:
Carp: '0'
File::ShareDir: '0'
Future: '0'
Future::Mojo: '0'
Future::Queue: '0'
Future::Utils: '0'
Mojo::JSON: '0'
Mojo::URL: '0'
Mojo::UserAgent: '0'
Mojolicious: '9'
Moo: '2'
OpenAPI::Modern: '0'
PerlX::Maybe: '0'
Role::EventEmitter: '0'
URI::Template: '0'
YAML::PP: '0'
experimental: '0.031'
perl: '5.020'
stable: '0.031'
resources:
bugtracker: https://github.com/Corion/AI-Ollama-Client/issues
license: https://dev.perl.org/licenses/
repository: git://github.com/Corion/AI-Ollama-Client.git
version: '0.05'
x_serialization_backend: 'CPAN::Meta::YAML version 0.018'
x_static_install: 1
Makefile.PL view on Meta::CPAN
$eumm_version =~ s/_//;
my $module = 'AI::Ollama::Client';
(my $main_file = "lib/$module.pm" ) =~ s!::!/!g;
(my $distbase = $module) =~ s!::!-!g;
my $distlink = $distbase;
my @tests = map { glob $_ } 't/*.t', 't/*/*.t';
my %module = (
NAME => $module,
AUTHOR => q{Max Maischein <corion@cpan.org>},
VERSION_FROM => $main_file,
ABSTRACT_FROM => $main_file,
META_MERGE => {
"meta-spec" => { version => 2 },
resources => {
repository => {
web => "https://github.com/Corion/$distlink",
url => "git://github.com/Corion/$distlink.git",
type => 'git',
},
bugtracker => {
web => "https://github.com/Corion/$distbase/issues",
# mailto => 'meta-bugs@example.com',
},
license => "https://dev.perl.org/licenses/",
},
dynamic_config => 0, # we promise to keep META.* up-to-date
x_static_install => 1, # we are pure Perl and don't do anything fancy
},
MIN_PERL_VERSION => '5.020', # I use signatures
'LICENSE'=> 'artistic_2',
PL_FILES => {},
CONFIGURE_REQUIRES => {
'ExtUtils::MakeMaker' => 0,
'File::ShareDir::Install' => 0,
},
BUILD_REQUIRES => {
'strict' => 0,
},
PREREQ_PM => {
'stable' => '0.031',
'experimental' => '0.031',
'Carp' => 0,
'File::ShareDir' => 0,
'Future' => 0,
'Future::Mojo' => 0,
'Future::Utils' => 0,
'Future::Queue' => 0,
'Moo' => 2,
'Mojolicious' => 9,
'Mojo::JSON' => 0,
'Mojo::URL' => 0,
'Mojo::UserAgent' => 0,
'OpenAPI::Modern' => 0,
'PerlX::Maybe' => 0,
'Role::EventEmitter' => 0,
'URI::Template' => 0,
'YAML::PP' => 0,
},
TEST_REQUIRES => {
'Test2::V0' => 0,
},
dist => { COMPRESS => 'gzip -9f', SUFFIX => 'gz', },
clean => { FILES => "$distbase-*" },
test => { TESTS => join( ' ', @tests ) },
);
# This is so that we can do
# require 'Makefile.PL'
# and then call get_module_info
sub get_module_info { %module }
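# For example (a sketch; the relative path and the printed key are illustrative):
#
#   require './Makefile.PL';
#   my %info = get_module_info();
#   print $info{NAME}, "\n";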
if( ! caller ) {
require File::ShareDir::Install;
File::ShareDir::Install::install_share( module => "$module\::Impl" => 'ollama');
{
package MY;
require File::ShareDir::Install;
File::ShareDir::Install->import( qw( postamble ));
}
# I should maybe use something like Shipwright...
my $mm = WriteMakefile1(get_module_info);
my $version = $mm->parse_version($main_file);
regen_README($main_file, $version);
regen_EXAMPLES() if -d 'examples';
};
1;
sub WriteMakefile1 { #Written by Alexandr Ciornii, version 0.21. Added by eumm-upgrade.
my %params=@_;
my $eumm_version=$ExtUtils::MakeMaker::VERSION;
$eumm_version=eval $eumm_version;
die "EXTRA_META is deprecated" if exists $params{EXTRA_META};
die "License not specified" if not exists $params{LICENSE};
if ($params{BUILD_REQUIRES} and $eumm_version < 6.5503) {
#EUMM 6.5502 has problems with BUILD_REQUIRES
$params{PREREQ_PM}={ %{$params{PREREQ_PM} || {}} , %{$params{BUILD_REQUIRES}} };
delete $params{BUILD_REQUIRES};
}
if ($params{TEST_REQUIRES} and $eumm_version < 6.64) {
$params{PREREQ_PM}={ %{$params{PREREQ_PM} || {}} , %{$params{TEST_REQUIRES}} };
delete $params{TEST_REQUIRES};
}
delete $params{CONFIGURE_REQUIRES} if $eumm_version < 6.52;
delete $params{MIN_PERL_VERSION} if $eumm_version < 6.48;
delete $params{META_MERGE} if $eumm_version < 6.46;
delete $params{META_ADD} if $eumm_version < 6.46;
delete $params{LICENSE} if $eumm_version < 6.31;
delete $params{AUTHOR} if $] < 5.005;
delete $params{ABSTRACT_FROM} if $] < 5.005;
delete $params{BINARY_LOCATION} if $] < 5.005;
WriteMakefile(%params);
}
sub regen_README {
# README is the short version that just tells people what this is
# and how to install it
my( $file, $version ) = @_;
eval {
# Get description
my $readme = join "\n",
pod_section($file, 'NAME', 'no heading' ),
pod_section($file, 'DESCRIPTION' ),
<<VERSION,
This document describes version $version.
VERSION
<<INSTALL,
INSTALLATION
This is a Perl module distribution. It should be installed with whichever
tool you use to manage your installation of Perl, e.g. any of
cpanm .
cpan .
cpanp -i .
Consult https://www.cpan.org/modules/INSTALL.html for further instruction.
Should you wish to install this module manually, the procedure is
perl Makefile.PL
make
make test
make install
INSTALL
pod_section($file, 'REPOSITORY'),
pod_section($file, 'SUPPORT'),
pod_section($file, 'TALKS'),
pod_section($file, 'KNOWN ISSUES'),
pod_section($file, 'BUG TRACKER'),
pod_section($file, 'CONTRIBUTING'),
pod_section($file, 'SEE ALSO'),
pod_section($file, 'AUTHOR'),
pod_section($file, 'LICENSE' ),
pod_section($file, 'COPYRIGHT' ),
;
update_file( 'README', $readme );
};
# README.mkdn is the documentation that will be shown as the main
# page of the repository on Github. Hence we recreate the POD here
# as Markdown
eval {
require Pod::Markdown;
my $parser = Pod::Markdown->new();
# Read POD from Module.pm and write to README
$parser->parse_from_file($_[0]);
my $readme_mkdn = <<STATUS . $parser->as_markdown;
[windows](https://github.com/Corion/$distbase/actions?query=workflow%3Awindows)
[macos](https://github.com/Corion/$distbase/actions?query=workflow%3Amacos)
[linux](https://github.com/Corion/$distbase/actions?query=workflow%3Alinux)
STATUS
update_file( 'README.mkdn', $readme_mkdn );
};
}
sub pod_section {
my( $filename, $section, $remove_heading ) = @_;
open my $fh, '<', $filename
or die "Couldn't read '$filename': $!";
my @section =
grep { /^=head1\s+$section/.../^=/ } <$fh>;
# Trim the section
if( @section ) {
pop @section if $section[-1] =~ /^=/;
shift @section if $remove_heading;
pop @section
while @section and $section[-1] =~ /^\s*$/;
shift @section
while @section and $section[0] =~ /^\s*$/;
};
@section = map { $_ =~ s!^=\w+\s+!!; $_ } @section;
return join "", @section;
}
sub regen_EXAMPLES {
my $perl = $^X;
if ($perl =~/\s/) {
$perl = qq{"$perl"};
};
(my $example_file = $main_file) =~ s!\.pm$!/Examples.pm!;
my $examples = `$perl -w examples/gen_examples_pod.pl`;
if ($examples) {
warn "(Re)Creating $example_file\n";
$examples =~ s/\r\n/\n/g;
update_file( $example_file, $examples );
};
};
sub update_file {
my( $filename, $new_content ) = @_;
my $content;
if( -f $filename ) {
open my $fh, '<:raw:encoding(UTF-8)', $filename
or die "Couldn't read '$filename': $!";
local $/;
$content = <$fh>;
};
if( !defined $content or $content ne $new_content ) {
if( open my $fh, '>:raw:encoding(UTF-8)', $filename ) {
print $fh $new_content;
} else {
warn "Couldn't (re)write '$filename': $!";
};
};
}
This document describes version 0.05.
INSTALLATION
This is a Perl module distribution. It should be installed with whichever
tool you use to manage your installation of Perl, e.g. any of
cpanm .
cpan .
cpanp -i .
Consult https://www.cpan.org/modules/INSTALL.html for further instruction.
Should you wish to install this module manually, the procedure is
perl Makefile.PL
make
make test
make install
README.mkdn view on Meta::CPAN
[windows](https://github.com/Corion/AI-Ollama-Client/actions?query=workflow%3Awindows)
[macos](https://github.com/Corion/AI-Ollama-Client/actions?query=workflow%3Amacos)
[linux](https://github.com/Corion/AI-Ollama-Client/actions?query=workflow%3Alinux)
# NAME
AI::Ollama::Client - Client for AI::Ollama
# SYNOPSIS
use 5.020;
use AI::Ollama::Client;
my $client = AI::Ollama::Client->new(
server => 'https://example.com/',
);
my $res = $client->someMethod()->get;
say $res;
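The client also emits `request` and `response` events that can be used for tracing (added in 0.04). A minimal sketch, assuming the usual Role::EventEmitter convention of passing the emitting object to the callback first:
$client->on( request  => sub { my( $c, $tx  ) = @_; say '-> ' . $tx->req->url } );
$client->on( response => sub { my( $c, $res ) = @_; say '<- ' . $res->code } );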
# METHODS
## `checkBlob`
my $res = $client->checkBlob()->get;
Check to see if a blob exists on the Ollama server, which is useful when creating models.
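The underlying request requires a `digest` parameter, and the wrapped method resolves to a true/false value; a sketch (the digest is a placeholder):
my $exists = $client->checkBlob(
digest => 'sha256:0123456789abcdef...',
)->get;
say $exists ? 'blob exists' : 'blob missing';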
## `createBlob`
my $res = $client->createBlob()->get;
Create a blob from a file. Returns the server file path.
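A sketch, assuming the raw file contents are passed as the `body` option alongside the required `digest` (both values below are placeholders):
use Mojo::File 'path';
my $data = path('/path/to/layer.bin')->slurp;
my $res = $client->createBlob(
digest => 'sha256:0123456789abcdef...',
body   => $data,
)->get;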
## `generateChatCompletion`
my $res = $client->generateChatCompletion()->get;
Generate the next message in a chat with a provided model.
Returns a [AI::Ollama::GenerateChatCompletionResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3AGenerateChatCompletionResponse).
## `copyModel`
my $res = $client->copyModel()->get;
Creates a model with another name from an existing model.
## `createModel`
my $res = $client->createModel()->get;
Create a model from a Modelfile.
Returns a [AI::Ollama::CreateModelResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3ACreateModelResponse).
## `deleteModel`
my $res = $client->deleteModel()->get;
Delete a model and its data.
## `generateEmbedding`
my $res = $client->generateEmbedding()->get;
Generate embeddings from a model.
Returns a [AI::Ollama::GenerateEmbeddingResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3AGenerateEmbeddingResponse).
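A sketch with the required `model` and `prompt` parameters (the model name is a placeholder):
my $res = $client->generateEmbedding(
model  => 'llama2:latest',
prompt => 'Here is an article about llamas...',
)->get;
my @vector = $res->embedding->@*;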
## `generateCompletion`
use Future::Utils 'repeat';
my $responses = $client->generateCompletion();
repeat {
my ($res) = $responses->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Generate a response for a given prompt with a provided model.
Returns a [AI::Ollama::GenerateCompletionResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3AGenerateCompletionResponse).
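The call itself takes at least the required `model` and `prompt` fields (values below are placeholders); the streamed responses are then drained with the loop shown above:
my $responses = $client->generateCompletion(
model  => 'llama2:latest',
prompt => 'Why is the sky blue?',
);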
## `pullModel`
my $res = $client->pullModel(
name => 'llama',
)->get;
Download a model from the ollama library.
Returns a [AI::Ollama::PullModelResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3APullModelResponse).
## `pushModel`
my $res = $client->pushModel()->get;
Upload a model to a model library.
Returns a [AI::Ollama::PushModelResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3APushModelResponse).
## `showModelInfo`
my $info = $client->showModelInfo()->get;
say $info->modelfile;
Show details about a model including modelfile, template, parameters, license, and system prompt.
Returns a [AI::Ollama::ModelInfo](https://metacpan.org/pod/AI%3A%3AOllama%3A%3AModelInfo).
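A sketch that passes the model name (a placeholder value):
my $info = $client->showModelInfo(
name => 'llama2:latest',
)->get;
say $info->modelfile;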
## `listModels`
my $info = $client->listModels()->get;
for my $model ($info->models->@*) {
say $model->model; # llama2:latest
}
List models that are available locally.
Returns a [AI::Ollama::ModelsResponse](https://metacpan.org/pod/AI%3A%3AOllama%3A%3AModelsResponse).
lib/AI/Ollama/Client.pm view on Meta::CPAN
use MIME::Base64 'encode_base64';
extends 'AI::Ollama::Client::Impl';
=head1 NAME
AI::Ollama::Client - Client for AI::Ollama
=head1 SYNOPSIS
use 5.020;
use AI::Ollama::Client;
my $client = AI::Ollama::Client->new(
server => 'https://example.com/',
);
my $res = $client->someMethod()->get;
say $res;
=head1 METHODS
=head2 C<< checkBlob >>
my $res = $client->checkBlob()->get;
Check to see if a blob exists on the Ollama server, which is useful when creating models.
=cut
around 'checkBlob' => sub ( $super, $self, %options ) {
$super->( $self, %options )->then( sub( $res ) {
if( $res->code =~ /^2\d\d$/ ) {
return Future->done( 1 )
} else {
return Future->done( 0 )
}
});
};
=head2 C<< createBlob >>
my $res = $client->createBlob()->get;
Create a blob from a file. Returns the server file path.
=cut
=head2 C<< generateChatCompletion >>
my $res = $client->generateChatCompletion()->get;
Generate the next message in a chat with a provided model.
Returns a L<< AI::Ollama::GenerateChatCompletionResponse >>.
=cut
=head2 C<< copyModel >>
my $res = $client->copyModel()->get;
Creates a model with another name from an existing model.
=cut
=head2 C<< createModel >>
my $res = $client->createModel()->get;
Create a model from a Modelfile.
Returns a L<< AI::Ollama::CreateModelResponse >>.
=cut
=head2 C<< deleteModel >>
my $res = $client->deleteModel()->get;
Delete a model and its data.
=cut
=head2 C<< generateEmbedding >>
my $res = $client->generateEmbedding()->get;
Generate embeddings from a model.
Returns a L<< AI::Ollama::GenerateEmbeddingResponse >>.
=cut
=head2 C<< generateCompletion >>
use Future::Utils 'repeat';
my $responses = $client->generateCompletion();
repeat {
my ($res) = $responses->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Generate a response for a given prompt with a provided model.
Returns a L<< AI::Ollama::GenerateCompletionResponse >>.
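The C<images> option may also be given as hash references of the form C<< { filename => '...' } >>; the wrapper below slurps each file and Base64-encodes it before sending. A sketch (model name and path are placeholders):
my $responses = $client->generateCompletion(
model  => 'llava',
prompt => 'What is in this picture?',
images => [ { filename => '/path/to/picture.jpg' } ],
);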
=cut
around 'generateCompletion' => sub ( $super, $self, %options ) {
# Encode images as base64, if images exist:
# (but create a copy so we don't overwrite the input array)
if (my $images = $options{images}) {
# Allow { filename => '/etc/passwd' }
$options{images} = [
map {
my $item = $_;
if( ref($item) eq 'HASH' ) {
$item = Mojo::File->new($item->{filename})->slurp();
};
encode_base64($item)
} @$images ];
}
return $super->($self, %options);
};
=head2 C<< pullModel >>
my $res = $client->pullModel(
name => 'llama',
)->get;
Download a model from the ollama library.
Returns a L<< AI::Ollama::PullModelResponse >>.
=cut
=head2 C<< pushModel >>
my $res = $client->pushModel()->get;
Upload a model to a model library.
Returns a L<< AI::Ollama::PushModelResponse >>.
=cut
=head2 C<< showModelInfo >>
my $info = $client->showModelInfo()->get;
say $info->modelfile;
Show details about a model including modelfile, template, parameters, license, and system prompt.
Returns a L<< AI::Ollama::ModelInfo >>.
=cut
=head2 C<< listModels >>
my $info = $client->listModels()->get;
for my $model ($info->models->@*) {
say $model->model; # llama2:latest
}
List models that are available locally.
Returns a L<< AI::Ollama::ModelsResponse >>.
=cut
1;
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
use AI::Ollama::PullModelRequest;
use AI::Ollama::PullModelResponse;
use AI::Ollama::PushModelRequest;
use AI::Ollama::PushModelResponse;
use AI::Ollama::RequestOptions;
=encoding utf8
=head1 SYNOPSIS
my $client = AI::Ollama::Client::Impl->new(
schema_file => '...',
);
=head1 PROPERTIES
=head2 B<< schema_file >>
The OpenAPI schema file we use for validation
=head2 B<< schema >>
The OpenAPI schema data structure we use for validation. If not given,
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
The L<Mojo::UserAgent> to use
=head2 B<< server >>
The server to access
=cut
has 'schema_file' => (
is => 'lazy',
default => sub { require AI::Ollama::Client::Impl; module_file('AI::Ollama::Client::Impl', 'ollama-curated.yaml') },
);
has 'schema' => (
is => 'lazy',
default => sub {
if( my $fn = $_[0]->schema_file ) {
YAML::PP->new( boolean => 'JSON::PP' )->load_file($fn);
}
},
);
has 'validate_requests' => (
is => 'rw',
default => 1,
);
has 'validate_responses' => (
is => 'rw',
default => 1,
);
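# Both switches above are read-write and default to 1, so schema validation
# can be disabled at runtime, e.g. $client->validate_requests(0) and
# $client->validate_responses(0).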
has 'openapi' => (
is => 'lazy',
default => sub {
if( my $schema = $_[0]->schema ) {
OpenAPI::Modern->new( openapi_schema => $schema, openapi_uri => '' )
}
},
);
# The HTTP stuff should go into a ::Role I guess
has 'ua' => (
is => 'lazy',
default => sub { Mojo::UserAgent->new },
);
has 'server' => (
is => 'ro',
);
=head1 METHODS
=head2 C<< build_checkBlob_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< checkBlob >>
my $res = $client->checkBlob(
'digest' => '...',
)->get;
Check to see if a blob exists on the Ollama server, which is useful when creating models.
=head3 Parameters
=over 4
=item B<< digest >>
the SHA256 digest of the blob
=back
=cut
sub build_checkBlob_request( $self, %options ) {
croak "Missing required parameter 'digest'"
unless exists $options{ 'digest' };
my $method = 'HEAD';
my $template = URI::Template->new( '/blobs/{digest}' );
my $path = $template->process(
'digest' => delete $options{'digest'},
);
my $url = Mojo::URL->new( $self->server . $path );
my $tx = $self->ua->build_tx(
$method => $url,
{
}
);
$self->validate_request( $tx );
return $tx
}
sub checkBlob( $self, %options ) {
my $tx = $self->build_checkBlob_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Blob exists on the server
$res->done($resp);
} elsif( $resp->code == 404 ) {
# Blob was not found
$res->done($resp);
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_createBlob_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< createBlob >>
my $res = $client->createBlob(
'digest' => '...',
)->get;
Create a blob from a file. Returns the server file path.
=head3 Parameters
=over 4
=item B<< digest >>
the SHA256 digest of the blob
=back
=cut
sub build_createBlob_request( $self, %options ) {
croak "Missing required parameter 'digest'"
unless exists $options{ 'digest' };
my $method = 'POST';
my $template = URI::Template->new( '/blobs/{digest}' );
my $path = $template->process(
'digest' => delete $options{'digest'},
);
my $url = Mojo::URL->new( $self->server . $path );
my $body = delete $options{ body } // '';
my $tx = $self->ua->build_tx(
$method => $url,
{
"Content-Type" => 'application/octet-stream',
}
=> $body,
);
$self->validate_request( $tx );
return $tx
}
sub createBlob( $self, %options ) {
my $tx = $self->build_createBlob_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 201 ) {
# Blob was successfully created
$res->done($resp);
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_generateChatCompletion_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< generateChatCompletion >>
use Future::Utils 'repeat';
my $response = $client->generateChatCompletion();
my $streamed = $response->get();
repeat {
my ($res) = $streamed->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Generate the next message in a chat with a provided model.
This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
=head3 Options
=over 4
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=back
Returns a L<< AI::Ollama::GenerateChatCompletionResponse >> on success.
=cut
sub build_generateChatCompletion_request( $self, %options ) {
my $method = 'POST';
my $path = '/chat';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::GenerateChatCompletionRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/x-ndjson',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub generateChatCompletion( $self, %options ) {
my $tx = $self->build_generateChatCompletion_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
our @store; # we should use ->retain() instead
push @store, $r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $queue = Future::Queue->new( prototype => 'Future::Mojo' );
$res->done( $queue );
my $ct = $resp->headers->content_type;
return unless $ct;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/x-ndjson' ) {
# we only handle ndjson currently
my $handled_offset = 0;
$resp->on(progress => sub($msg,@) {
my $fresh = substr( $msg->body, $handled_offset );
my $body = $msg->body;
$body =~ s/[^\r\n]+\z//; # Strip any unfinished line
$handled_offset = length $body;
my @lines = split /\n/, $fresh;
for (@lines) {
my $payload = decode_json( $_ );
$self->validate_response( $payload, $tx );
$queue->push(
AI::Ollama::GenerateChatCompletionResponse->new($payload),
);
};
if( $msg->{state} eq 'finished' ) {
$queue->finish();
}
});
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d", $resp->code ), $resp);
}
});
my $_tx;
$tx->res->once( progress => sub($msg, @) {
$r1->resolve( $tx );
undef $_tx;
undef $r1;
});
$self->emit(request => $tx);
$_tx = $self->ua->start_p($tx);
return $res
}
=head2 C<< build_copyModel_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< copyModel >>
my $res = $client->copyModel()->get;
Creates a model with another name from an existing model.
=head3 Options
=over 4
=item C<< destination >>
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
=item C<< source >>
Name of the model to copy.
=back
=cut
sub build_copyModel_request( $self, %options ) {
my $method = 'POST';
my $path = '/copy';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::CopyModelRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub copyModel( $self, %options ) {
my $tx = $self->build_copyModel_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
$res->done($resp);
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_createModel_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< createModel >>
use Future::Utils 'repeat';
my $response = $client->createModel();
my $streamed = $response->get();
repeat {
my ($res) = $streamed->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Create a model from a Modelfile.
It is recommended to set C<modelfile> to the content of the Modelfile rather than just set C<path>. This is a requirement for remote create. Remote model creation should also create any file blobs, fields such as C<FROM> and C<ADAPTER>, explicitly wi...
=head3 Options
=over 4
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=back
Returns a L<< AI::Ollama::CreateModelResponse >> on success.
=cut
sub build_createModel_request( $self, %options ) {
my $method = 'POST';
my $path = '/create';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::CreateModelRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/x-ndjson',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub createModel( $self, %options ) {
my $tx = $self->build_createModel_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
our @store; # we should use ->retain() instead
push @store, $r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $queue = Future::Queue->new( prototype => 'Future::Mojo' );
$res->done( $queue );
my $ct = $resp->headers->content_type;
return unless $ct;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/x-ndjson' ) {
# we only handle ndjson currently
my $handled_offset = 0;
$resp->on(progress => sub($msg,@) {
my $fresh = substr( $msg->body, $handled_offset );
my $body = $msg->body;
$body =~ s/[^\r\n]+\z//; # Strip any unfinished line
$handled_offset = length $body;
my @lines = split /\n/, $fresh;
for (@lines) {
my $payload = decode_json( $_ );
$self->validate_response( $payload, $tx );
$queue->push(
AI::Ollama::CreateModelResponse->new($payload),
);
};
if( $msg->{state} eq 'finished' ) {
$queue->finish();
}
});
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d", $resp->code ), $resp);
}
});
my $_tx;
$tx->res->once( progress => sub($msg, @) {
$r1->resolve( $tx );
undef $_tx;
undef $r1;
});
$self->emit(request => $tx);
$_tx = $self->ua->start_p($tx);
return $res
}
=head2 C<< build_deleteModel_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< deleteModel >>
my $res = $client->deleteModel()->get;
Delete a model and its data.
=head3 Options
=over 4
=item C<< name >>
The model name.
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=back
=cut
sub build_deleteModel_request( $self, %options ) {
my $method = 'DELETE';
my $path = '/delete';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::DeleteModelRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub deleteModel( $self, %options ) {
my $tx = $self->build_deleteModel_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
$res->done($resp);
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_generateEmbedding_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< generateEmbedding >>
my $res = $client->generateEmbedding()->get;
Generate embeddings from a model.
=head3 Options
=over 4
=item C<< model >>
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
Text to generate embeddings for.
=back
Returns a L<< AI::Ollama::GenerateEmbeddingResponse >> on success.
=cut
sub build_generateEmbedding_request( $self, %options ) {
my $method = 'POST';
my $path = '/embeddings';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::GenerateEmbeddingRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/json',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub generateEmbedding( $self, %options ) {
my $tx = $self->build_generateEmbedding_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $ct = $resp->headers->content_type;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/json' ) {
my $payload = $resp->json();
$self->validate_response( $payload, $tx );
$res->done(
AI::Ollama::GenerateEmbeddingResponse->new($payload),
);
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_generateCompletion_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< generateCompletion >>
use Future::Utils 'repeat';
my $response = $client->generateCompletion();
my $streamed = $response->get();
repeat {
my ($res) = $streamed->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Generate a response for a given prompt with a provided model.
The final response object will include statistics and additional data from the request.
=head3 Options
=over 4
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
The full prompt or prompt template (overrides what is defined in the Modelfile).
=back
Returns a L<< AI::Ollama::GenerateCompletionResponse >> on success.
=cut
sub build_generateCompletion_request( $self, %options ) {
my $method = 'POST';
my $path = '/generate';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::GenerateCompletionRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/x-ndjson',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub generateCompletion( $self, %options ) {
my $tx = $self->build_generateCompletion_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
our @store; # we should use ->retain() instead
push @store, $r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $queue = Future::Queue->new( prototype => 'Future::Mojo' );
$res->done( $queue );
my $ct = $resp->headers->content_type;
return unless $ct;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/x-ndjson' ) {
# we only handle ndjson currently
my $handled_offset = 0;
$resp->on(progress => sub($msg,@) {
my $fresh = substr( $msg->body, $handled_offset );
my $body = $msg->body;
$body =~ s/[^\r\n]+\z//; # Strip any unfinished line
$handled_offset = length $body;
my @lines = split /\n/, $fresh;
for (@lines) {
my $payload = decode_json( $_ );
$self->validate_response( $payload, $tx );
$queue->push(
AI::Ollama::GenerateCompletionResponse->new($payload),
);
};
if( $msg->{state} eq 'finished' ) {
$queue->finish();
}
});
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d", $resp->code ), $resp);
}
});
my $_tx;
$tx->res->once( progress => sub($msg, @) {
$r1->resolve( $tx );
undef $_tx;
undef $r1;
});
$self->emit(request => $tx);
$_tx = $self->ua->start_p($tx);
return $res
}
=head2 C<< build_pullModel_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< pullModel >>
use Future::Utils 'repeat';
my $response = $client->pullModel();
my $streamed = $response->get();
repeat {
my ($res) = $streamed->shift;
if( $res ) {
my $str = $res->get;
say $str;
}
Future::Mojo->done( defined $res );
} until => sub($done) { $done->get };
Download a model from the ollama library.
Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.
=head3 Options
=over 4
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=back
Returns a L<< AI::Ollama::PullModelResponse >> on success.
=cut
sub build_pullModel_request( $self, %options ) {
my $method = 'POST';
my $path = '/pull';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::PullModelRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/x-ndjson',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub pullModel( $self, %options ) {
my $tx = $self->build_pullModel_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
our @store; # we should use ->retain() instead
push @store, $r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $queue = Future::Queue->new( prototype => 'Future::Mojo' );
$res->done( $queue );
my $ct = $resp->headers->content_type;
return unless $ct;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/x-ndjson' ) {
# we only handle ndjson currently
my $handled_offset = 0;
$resp->on(progress => sub($msg,@) {
my $fresh = substr( $msg->body, $handled_offset );
my $body = $msg->body;
$body =~ s/[^\r\n]+\z//; # Strip any unfinished line
$handled_offset = length $body;
my @lines = split /\n/, $fresh;
for (@lines) {
my $payload = decode_json( $_ );
$self->validate_response( $payload, $tx );
$queue->push(
AI::Ollama::PullModelResponse->new($payload),
);
};
if( $msg->{state} eq 'finished' ) {
$queue->finish();
}
});
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d", $resp->code ), $resp);
}
});
my $_tx;
$tx->res->once( progress => sub($msg, @) {
$r1->resolve( $tx );
undef $_tx;
undef $r1;
});
$self->emit(request => $tx);
$_tx = $self->ua->start_p($tx);
return $res
}
=head2 C<< build_pushModel_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< pushModel >>
my $res = $client->pushModel()->get;
Upload a model to a model library.
Requires registering for ollama.ai and adding a public key first.
=head3 Options
=over 4
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
If C<false>, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=back
Returns a L<< AI::Ollama::PushModelResponse >> on success.
=cut
sub build_pushModel_request( $self, %options ) {
my $method = 'POST';
my $path = '/push';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::PushModelRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/json',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub pushModel( $self, %options ) {
my $tx = $self->build_pushModel_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $ct = $resp->headers->content_type;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/json' ) {
my $payload = $resp->json();
$self->validate_response( $payload, $tx );
$res->done(
AI::Ollama::PushModelResponse->new($payload),
);
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_showModelInfo_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< showModelInfo >>
my $res = $client->showModelInfo()->get;
Show details about a model including modelfile, template, parameters, license, and system prompt.
=head3 Options
=over 4
=item C<< name >>
lib/AI/Ollama/Client/Impl.pm view on Meta::CPAN
Model names follow a C<model:tag> format. Some examples are C<orca-mini:3b-q4_1> and C<llama2:70b>. The tag is optional and, if not provided, will default to C<latest>. The tag is used to identify a specific version.
=back
Returns a L<< AI::Ollama::ModelInfo >> on success.
=cut
sub build_showModelInfo_request( $self, %options ) {
my $method = 'POST';
my $path = '/show';
my $url = Mojo::URL->new( $self->server . $path );
my $request = AI::Ollama::ModelInfoRequest->new( \%options )->as_hash;
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/json',
"Content-Type" => 'application/json',
}
=> json => $request,
);
$self->validate_request( $tx );
return $tx
}
sub showModelInfo( $self, %options ) {
my $tx = $self->build_showModelInfo_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $ct = $resp->headers->content_type;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/json' ) {
my $payload = $resp->json();
$self->validate_response( $payload, $tx );
$res->done(
AI::Ollama::ModelInfo->new($payload),
);
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
=head2 C<< build_listModels_request >>
Build an HTTP request as L<Mojo::Request> object. For the parameters see below.
=head2 C<< listModels >>
my $res = $client->listModels()->get;
List models that are available locally.
Returns a L<< AI::Ollama::ModelsResponse >> on success.
=cut
sub build_listModels_request( $self, %options ) {
my $method = 'GET';
my $path = '/tags';
my $url = Mojo::URL->new( $self->server . $path );
my $tx = $self->ua->build_tx(
$method => $url,
{
'Accept' => 'application/json',
}
);
$self->validate_request( $tx );
return $tx
}
sub listModels( $self, %options ) {
my $tx = $self->build_listModels_request(%options);
my $res = Future::Mojo->new();
my $r1 = Future::Mojo->new();
$r1->then( sub( $tx ) {
my $resp = $tx->res;
$self->emit(response => $resp);
# Should we validate using OpenAPI::Modern here?!
if( $resp->code == 200 ) {
# Successful operation.
my $ct = $resp->headers->content_type;
$ct =~ s/;\s+.*//;
if( $ct eq 'application/json' ) {
my $payload = $resp->json();
$self->validate_response( $payload, $tx );
$res->done(
AI::Ollama::ModelsResponse->new($payload),
);
} else {
# Unknown/unhandled content type
$res->fail( sprintf("unknown_unhandled content type '%s'", $resp->content_type), $resp );
}
} else {
# An unknown/unhandled response, likely an error
$res->fail( sprintf( "unknown_unhandled code %d: %s", $resp->code, $resp->body ), $resp);
}
})->retain;
# Start our transaction
$self->emit(request => $tx);
$tx = $self->ua->start_p($tx)->then(sub($tx) {
$r1->resolve( $tx );
undef $r1;
})->catch(sub($err) {
$self->emit(response => $tx, $err);
$r1->fail( $err => $tx );
undef $r1;
});
return $res
}
sub validate_response( $self, $payload, $tx ) {
if( $self->validate_responses
and my $openapi = $self->openapi ) {
my $results = $openapi->validate_response($payload, { request => $tx->req });
if( $results->{error}) {
say $results;
say $tx->res->to_string;
};
};
}
sub validate_request( $self, $tx ) {
if( $self->validate_requests
and my $openapi = $self->openapi ) {
my $results = $openapi->validate_request($tx->req);
if( $results->{error}) {
say $results;
say $tx->req->to_string;
};
};
}
1;
lib/AI/Ollama/CopyModelRequest.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::CopyModelRequest -
=head1 SYNOPSIS
my $obj = AI::Ollama::CopyModelRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< destination >>
Name of the new model.
=cut
has 'destination' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< source >>
Name of the model to copy.
=cut
has 'source' => (
is => 'ro',
isa => Str,
required => 1,
);
1;
lib/AI/Ollama/CreateModelRequest.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::CreateModelRequest -
=head1 SYNOPSIS
my $obj = AI::Ollama::CreateModelRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< modelfile >>
The contents of the Modelfile.
=cut
has 'modelfile' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< name >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< stream >>
If `false`, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=cut
has 'stream' => (
is => 'ro',
);
1;
lib/AI/Ollama/CreateModelResponse.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::CreateModelResponse -
=head1 SYNOPSIS
my $obj = AI::Ollama::CreateModelResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< status >>
Status creating the model
=cut
has 'status' => (
is => 'ro',
isa => Enum[
"creating system layer",
"parsing modelfile",
"success",
],
);
1;
lib/AI/Ollama/DeleteModelRequest.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::DeleteModelRequest -
=head1 SYNOPSIS
my $obj = AI::Ollama::DeleteModelRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< name >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
1;
lib/AI/Ollama/GenerateChatCompletionRequest.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateChatCompletionRequest -
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateChatCompletionRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< format >>
The format to return a response in. Currently the only accepted value is json.
Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.
Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
=cut
has 'format' => (
is => 'ro',
isa => Enum[
"json",
],
);
=head2 C<< keep_alive >>
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default
=cut
has 'keep_alive' => (
is => 'ro',
isa => Int,
);
=head2 C<< messages >>
The messages of the chat; these can be used to keep a chat memory
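A sketch of the expected structure; the C<role> and C<content> keys follow the Ollama chat API and are passed through as plain hash references:
messages => [
{ role => 'user',      content => 'Why is the sky blue?' },
{ role => 'assistant', content => 'Because of Rayleigh scattering.' },
{ role => 'user',      content => 'And why is the sunset red?' },
],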
=cut
has 'messages' => (
is => 'ro',
isa => ArrayRef[HashRef],
required => 1,
);
=head2 C<< model >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
=cut
has 'options' => (
is => 'ro',
isa => HashRef,
);
=head2 C<< stream >>
If `false`, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=cut
has 'stream' => (
is => 'ro',
);
1;
lib/AI/Ollama/GenerateChatCompletionResponse.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateChatCompletionResponse -
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateChatCompletionResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< created_at >>
Date on which a model was created.
=cut
has 'created_at' => (
is => 'ro',
isa => Str,
);
=head2 C<< done >>
Whether the response has completed.
=cut
has 'done' => (
is => 'ro',
);
=head2 C<< eval_count >>
Number of tokens in the response.
=cut
has 'eval_count' => (
is => 'ro',
isa => Int,
);
=head2 C<< eval_duration >>
Time in nanoseconds spent generating the response.
=cut
has 'eval_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< load_duration >>
Time spent in nanoseconds loading the model.
=cut
has 'load_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< message >>
A message in the chat endpoint
=cut
has 'message' => (
is => 'ro',
isa => HashRef,
);
=head2 C<< model >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
);
=head2 C<< prompt_eval_count >>
Number of tokens in the prompt.
=cut
has 'prompt_eval_count' => (
is => 'ro',
isa => Int,
);
=head2 C<< prompt_eval_duration >>
Time spent in nanoseconds evaluating the prompt.
=cut
has 'prompt_eval_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< total_duration >>
Time spent generating the response.
=cut
has 'total_duration' => (
is => 'ro',
isa => Int,
);
1;
lib/AI/Ollama/GenerateCompletionRequest.pm view on Meta::CPAN
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateCompletionRequest -
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateCompletionRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< context >>
The context parameter returned from a previous request to [generateCompletion]; this can be used to keep a short conversational memory.
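A sketch of carrying the context forward via AI::Ollama::Client (assuming C<$last> holds the final AI::Ollama::GenerateCompletionResponse taken from the previous stream):
my $next = $client->generateCompletion(
model   => 'llama2:latest',
prompt  => 'And why is the sunset red?',
context => $last->context,
);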
=cut
has 'context' => (
is => 'ro',
isa => ArrayRef[Int],
);
=head2 C<< format >>
The format to return a response in. Currently the only accepted value is json.
Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.
Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
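A sketch of requesting JSON mode via AI::Ollama::Client (the prompt deliberately asks for JSON, as recommended above):
my $responses = $client->generateCompletion(
model  => 'llama2:latest',
format => 'json',
prompt => 'List three primary colors. Respond using JSON.',
);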
=cut
has 'format' => (
is => 'ro',
isa => Enum[
"json",
],
);
=head2 C<< images >>
(optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
=cut
has 'images' => (
is => 'ro',
isa => ArrayRef[Str],
);
=head2 C<< keep_alive >>
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default
=cut
has 'keep_alive' => (
is => 'ro',
isa => Int,
);
=head2 C<< model >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
=cut
has 'options' => (
is => 'ro',
isa => HashRef,
);
=head2 C<< prompt >>
The prompt to generate a response.
=cut
has 'prompt' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< raw >>
If `true` no formatting will be applied to the prompt and no context will be returned.
You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API, and are managing history yourself.
=cut
has 'raw' => (
is => 'ro',
);
=head2 C<< stream >>
If `false`, the response will be returned as a single response object; otherwise the response will be streamed as a series of objects.
=cut
has 'stream' => (
is => 'ro',
);
=head2 C<< system >>
The system prompt (overrides what is defined in the Modelfile).
=cut
has 'system' => (
is => 'ro',
isa => Str,
);
=head2 C<< template >>
The full prompt or prompt template (overrides what is defined in the Modelfile).
=cut
has 'template' => (
is => 'ro',
isa => Str,
);
1;
lib/AI/Ollama/GenerateCompletionResponse.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateCompletionResponse - response class for the generate endpoint
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateCompletionResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< context >>
An encoding of the conversation used in this response, this can be sent in the next request to keep a conversational memory.
=cut
has 'context' => (
is => 'ro',
isa => ArrayRef[Int],
);
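As a sketch of that round trip, assuming `$res` is a previous response of this class, the returned context can be passed to the next request to preserve conversational memory:

    my $next = AI::Ollama::GenerateCompletionRequest->new(
        model   => $res->model,
        prompt  => 'And why does it turn red at sunset?',
        context => $res->context,   # integer encoding of the previous conversation
    );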
=head2 C<< created_at >>
Date on which a model was created.
=cut
has 'created_at' => (
is => 'ro',
isa => Str,
);
=head2 C<< done >>
Whether the response has completed.
=cut
has 'done' => (
is => 'ro',
);
=head2 C<< eval_count >>
Number of tokens in the response.
=cut
has 'eval_count' => (
is => 'ro',
isa => Int,
);
=head2 C<< eval_duration >>
Time in nanoseconds spent generating the response.
=cut
has 'eval_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< load_duration >>
Time spent in nanoseconds loading the model.
=cut
has 'load_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< model >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
);
=head2 C<< prompt_eval_count >>
Number of tokens in the prompt.
=cut
has 'prompt_eval_count' => (
is => 'ro',
isa => Int,
);
=head2 C<< prompt_eval_duration >>
Time spent in nanoseconds evaluating the prompt.
=cut
has 'prompt_eval_duration' => (
is => 'ro',
isa => Int,
);
=head2 C<< response >>
The response for a given prompt with a provided model.
=cut
has 'response' => (
is => 'ro',
isa => Str,
);
=head2 C<< total_duration >>
Time spent generating the response.
=cut
has 'total_duration' => (
is => 'ro',
isa => Int,
);
1;
lib/AI/Ollama/GenerateEmbeddingRequest.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateEmbeddingRequest - request class for generating embeddings from a model
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateEmbeddingRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< model >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'model' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< options >>
Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
=cut
has 'options' => (
is => 'ro',
isa => HashRef,
);
=head2 C<< prompt >>
Text to generate embeddings for.
=cut
has 'prompt' => (
is => 'ro',
isa => Str,
required => 1,
);
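A minimal sketch of constructing an embedding request from the two required attributes above (the prompt text mirrors the example in the OpenAPI spec):

    use AI::Ollama::GenerateEmbeddingRequest;

    my $req = AI::Ollama::GenerateEmbeddingRequest->new(
        model  => 'llama2:7b',
        prompt => 'Here is an article about llamas...',
    );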
1;
lib/AI/Ollama/GenerateEmbeddingResponse.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::GenerateEmbeddingResponse - response class returning the embedding information
=head1 SYNOPSIS
my $obj = AI::Ollama::GenerateEmbeddingResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< embedding >>
The embedding for the prompt.
=cut
has 'embedding' => (
is => 'ro',
isa => ArrayRef[Num],
);
1;
lib/AI/Ollama/Message.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::Message - a message in the chat endpoint
=head1 SYNOPSIS
my $obj = AI::Ollama::Message->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< content >>
The content of the message
=cut
has 'content' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< images >>
(optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
=cut
has 'images' => (
is => 'ro',
isa => ArrayRef[Str],
);
=head2 C<< role >>
The role of the message
=cut
has 'role' => (
is => 'ro',
isa => Enum[
"system",
"user",
"assistant",
],
required => 1,
);
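A short sketch of assembling a conversation with the roles listed above (the message content is illustrative):

    use AI::Ollama::Message;

    my @messages = (
        AI::Ollama::Message->new(
            role    => 'system',
            content => 'You are a helpful assistant.',
        ),
        AI::Ollama::Message->new(
            role    => 'user',
            content => 'Why is the sky blue?',
        ),
    );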
1;
lib/AI/Ollama/Model.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::Model - a model available locally
=head1 SYNOPSIS
my $obj = AI::Ollama::Model->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< modified_at >>
Model modification date.
=cut
has 'modified_at' => (
is => 'ro',
isa => Str,
);
=head2 C<< name >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
);
=head2 C<< size >>
Size of the model on disk.
=cut
has 'size' => (
is => 'ro',
isa => Int,
);
1;
lib/AI/Ollama/ModelInfo.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::ModelInfo - details about a model, including modelfile, template, parameters, license, and system prompt
=head1 SYNOPSIS
my $obj = AI::Ollama::ModelInfo->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< license >>
The model's license.
=cut
has 'license' => (
is => 'ro',
isa => Str,
);
=head2 C<< modelfile >>
The modelfile associated with the model.
=cut
has 'modelfile' => (
is => 'ro',
isa => Str,
);
=head2 C<< parameters >>
The model parameters.
=cut
has 'parameters' => (
is => 'ro',
isa => Str,
);
=head2 C<< template >>
The prompt template for the model.
=cut
has 'template' => (
is => 'ro',
isa => Str,
);
1;
lib/AI/Ollama/ModelInfoRequest.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::ModelInfoRequest - request class for the show model info endpoint
=head1 SYNOPSIS
my $obj = AI::Ollama::ModelInfoRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< name >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
1;
lib/AI/Ollama/ModelsResponse.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::ModelsResponse - response class for the list models endpoint
=head1 SYNOPSIS
my $obj = AI::Ollama::ModelsResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< models >>
List of models available locally.
=cut
has 'models' => (
is => 'ro',
isa => ArrayRef[HashRef],
);
1;
lib/AI/Ollama/PullModelRequest.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::PullModelRequest - request class for pulling a model
=head1 SYNOPSIS
my $obj = AI::Ollama::PullModelRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< insecure >>
Allow insecure connections to the library.
Only use this if you are pulling from your own library during development.
=cut
has 'insecure' => (
is => 'ro',
);
=head2 C<< name >>
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< stream >>
If `false` the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
=cut
has 'stream' => (
is => 'ro',
);
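For example, a non-streaming pull of a tagged model might be requested like this (a sketch using only the attributes documented above):

    use AI::Ollama::PullModelRequest;

    my $req = AI::Ollama::PullModelRequest->new(
        name   => 'llama2:7b',
        stream => 0,        # return a single response object rather than a stream
        # insecure => 1,    # only when pulling from your own library during development
    );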
1;
lib/AI/Ollama/PullModelResponse.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::PullModelResponse - response class for pulling a model
=head1 SYNOPSIS
my $obj = AI::Ollama::PullModelResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< completed >>
Total bytes transferred.
=cut
has 'completed' => (
is => 'ro',
isa => Int,
);
=head2 C<< digest >>
The model's digest.
=cut
has 'digest' => (
is => 'ro',
isa => Str,
);
=head2 C<< status >>
Status pulling the model.
=cut
has 'status' => (
is => 'ro',
isa => Enum[
"pulling manifest",
"downloading digestname",
"verifying sha256 digest",
"writing manifest",
"removing any unused layers",
"success",
],
);
=head2 C<< total >>
Total size of the model.
=cut
has 'total' => (
is => 'ro',
isa => Int,
);
1;
lib/AI/Ollama/PushModelRequest.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::PushModelRequest - request class for pushing a model
=head1 SYNOPSIS
my $obj = AI::Ollama::PushModelRequest->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< insecure >>
Allow insecure connections to the library.
Only use this if you are pushing to your library during development.
=cut
has 'insecure' => (
is => 'ro',
);
=head2 C<< name >>
The name of the model to push in the form of <namespace>/<model>:<tag>.
=cut
has 'name' => (
is => 'ro',
isa => Str,
required => 1,
);
=head2 C<< stream >>
If `false` the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
=cut
has 'stream' => (
is => 'ro',
);
1;
lib/AI/Ollama/PushModelResponse.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::PushModelResponse - response class for pushing a model
=head1 SYNOPSIS
my $obj = AI::Ollama::PushModelResponse->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< digest >>
the model's digest
=cut
has 'digest' => (
is => 'ro',
isa => Str,
);
=head2 C<< status >>
Status pushing the model.
=cut
has 'status' => (
is => 'ro',
isa => Enum[
"retrieving manifest",
"starting upload",
"pushing manifest",
"success",
],
);
=head2 C<< total >>
total size of the model
=cut
has 'total' => (
is => 'ro',
isa => Int,
);
1;
lib/AI/Ollama/RequestOptions.pm
use namespace::clean;
=encoding utf8
=head1 NAME
AI::Ollama::RequestOptions - additional model parameters such as `temperature`
=head1 SYNOPSIS
my $obj = AI::Ollama::RequestOptions->new();
...
=cut
sub as_hash( $self ) {
return { $self->%* }
}
=head1 PROPERTIES
=head2 C<< embedding_only >>
Enable embedding only. (Default: false)
=cut
has 'embedding_only' => (
is => 'ro',
);
=head2 C<< f16_kv >>
Enable f16 key/value. (Default: false)
=cut
has 'f16_kv' => (
is => 'ro',
);
=head2 C<< frequency_penalty >>
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
=cut
has 'frequency_penalty' => (
is => 'ro',
isa => Num,
);
=head2 C<< logits_all >>
Enable logits all. (Default: false)
=cut
has 'logits_all' => (
is => 'ro',
);
=head2 C<< low_vram >>
Enable low VRAM mode. (Default: false)
=cut
has 'low_vram' => (
is => 'ro',
);
=head2 C<< main_gpu >>
The GPU to use for the main model. Default is 0.
=cut
has 'main_gpu' => (
is => 'ro',
isa => Int,
);
=head2 C<< mirostat >>
Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
=cut
has 'mirostat' => (
is => 'ro',
isa => Int,
);
=head2 C<< mirostat_eta >>
Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)
=cut
has 'mirostat_eta' => (
is => 'ro',
isa => Num,
);
=head2 C<< mirostat_tau >>
Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
=cut
has 'mirostat_tau' => (
is => 'ro',
isa => Num,
);
=head2 C<< num_batch >>
Sets the number of batches to use for generation. (Default: 1)
=cut
has 'num_batch' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_ctx >>
Sets the size of the context window used to generate the next token.
=cut
has 'num_ctx' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_gpu >>
The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable.
=cut
has 'num_gpu' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_gqa >>
The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for `llama2:70b`.
=cut
has 'num_gqa' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_keep >>
Number of tokens to keep from the prompt.
=cut
has 'num_keep' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_predict >>
Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)
=cut
has 'num_predict' => (
is => 'ro',
isa => Int,
);
=head2 C<< num_thread >>
Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
=cut
has 'num_thread' => (
is => 'ro',
isa => Int,
);
=head2 C<< numa >>
Enable NUMA support. (Default: false)
=cut
has 'numa' => (
is => 'ro',
);
=head2 C<< penalize_newline >>
Penalize newlines in the output. (Default: false)
=cut
has 'penalize_newline' => (
is => 'ro',
);
=head2 C<< presence_penalty >>
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
=cut
has 'presence_penalty' => (
is => 'ro',
isa => Num,
);
=head2 C<< repeat_last_n >>
Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
=cut
has 'repeat_last_n' => (
is => 'ro',
isa => Int,
);
=head2 C<< repeat_penalty >>
Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
=cut
has 'repeat_penalty' => (
is => 'ro',
isa => Num,
);
=head2 C<< rope_frequency_base >>
The base of the rope frequency scale. (Default: 1.0)
=cut
has 'rope_frequency_base' => (
is => 'ro',
isa => Num,
);
=head2 C<< rope_frequency_scale >>
The scale of the rope frequency. (Default: 1.0)
=cut
has 'rope_frequency_scale' => (
is => 'ro',
isa => Num,
);
=head2 C<< seed >>
Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)
=cut
has 'seed' => (
is => 'ro',
isa => Int,
);
=head2 C<< stop >>
Sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
=cut
has 'stop' => (
is => 'ro',
isa => ArrayRef[Str],
);
=head2 C<< temperature >>
The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)
=cut
has 'temperature' => (
is => 'ro',
isa => Num,
);
=head2 C<< tfs_z >>
Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
=cut
has 'tfs_z' => (
is => 'ro',
isa => Num,
);
=head2 C<< top_k >>
Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
=cut
has 'top_k' => (
is => 'ro',
isa => Int,
);
=head2 C<< top_p >>
Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
=cut
has 'top_p' => (
is => 'ro',
isa => Num,
);
=head2 C<< typical_p >>
Typical p is used to reduce the impact of less probable tokens from the output.
=cut
has 'typical_p' => (
is => 'ro',
isa => Num,
);
=head2 C<< use_mlock >>
Enable mlock. (Default: false)
=cut
has 'use_mlock' => (
is => 'ro',
);
=head2 C<< use_mmap >>
Enable mmap. (Default: false)
=cut
has 'use_mmap' => (
is => 'ro',
);
=head2 C<< vocab_only >>
Enable vocab only. (Default: false)
=cut
has 'vocab_only' => (
is => 'ro',
);
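Tying several of these knobs together, a sketch of a deliberately conservative sampling configuration (the values are illustrative; defaults are given in the individual descriptions above):

    use AI::Ollama::RequestOptions;

    my $options = AI::Ollama::RequestOptions->new(
        temperature    => 0.2,    # lower than the 0.8 default: less creative output
        top_k          => 10,     # smaller candidate set than the default 40
        top_p          => 0.5,
        repeat_penalty => 1.1,
        seed           => 42,     # reproducible output for the same prompt
        num_ctx        => 4096,   # context window size in tokens
    );

The resulting object is intended to be passed as the `options` value of a completion, chat, or embedding request.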
1;
ollama/ollama-curated.yaml
openapi: 3.1.0
# https://github.com/davidmigloz/langchain_dart/blob/main/packages/ollama_dart/oas/ollama-curated.yaml
info:
title: Ollama API
description: API Spec for Ollama API. Please see https://github.com/jmorganca/ollama/blob/main/docs/api.md for more details.
version: 0.1.9
#servers:
# - url: http://localhost:11434/api
# description: Ollama server URL
tags:
- name: Completions
description: Given a prompt, the model will generate a completion.
- name: Chat
description: Given a list of messages comprising a conversation, the model will return a response.
- name: Embeddings
description: Get a vector representation of a given input.
- name: Models
description: List and describe the various models available.
paths:
/generate:
post:
operationId: generateCompletion
tags:
- Completions
summary: Generate a response for a given prompt with a provided model.
description: The final response object will include statistics and additional data from the request.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/GenerateCompletionRequest'
responses:
'200':
description: Successful operation.
content:
application/x-ndjson:
schema:
$ref: '#/components/schemas/GenerateCompletionResponse'
/chat:
post:
operationId: generateChatCompletion
tags:
- Chat
summary: Generate the next message in a chat with a provided model.
description: This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/GenerateChatCompletionRequest'
responses:
'200':
description: Successful operation.
content:
application/x-ndjson:
schema:
$ref: '#/components/schemas/GenerateChatCompletionResponse'
/embeddings:
post:
operationId: generateEmbedding
tags:
- Embeddings
summary: Generate embeddings from a model.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/GenerateEmbeddingRequest'
responses:
'200':
description: Successful operation.
content:
application/json:
schema:
$ref: '#/components/schemas/GenerateEmbeddingResponse'
/create:
post:
operationId: createModel
tags:
- Models
summary: Create a model from a Modelfile.
description: It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation should also create any file blobs, fields such as `FROM` and `ADAPTER`...
requestBody:
description: Create a new model from a Modelfile.
content:
application/json:
schema:
$ref: '#/components/schemas/CreateModelRequest'
responses:
'200':
description: Successful operation.
content:
application/x-ndjson:
schema:
$ref: '#/components/schemas/CreateModelResponse'
/tags:
get:
operationId: listModels
tags:
- Models
summary: List models that are available locally.
responses:
'200':
description: Successful operation.
content:
application/json:
schema:
$ref: '#/components/schemas/ModelsResponse'
/show:
post:
operationId: showModelInfo
tags:
- Models
summary: Show details about a model including modelfile, template, parameters, license, and system prompt.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/ModelInfoRequest'
responses:
'200':
description: Successful operation.
content:
application/json:
schema:
$ref: '#/components/schemas/ModelInfo'
/copy:
post:
operationId: copyModel
tags:
- Models
summary: Creates a model with another name from an existing model.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CopyModelRequest'
responses:
'200':
description: Successful operation.
/delete:
delete:
operationId: deleteModel
tags:
- Models
summary: Delete a model and its data.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/DeleteModelRequest'
responses:
'200':
description: Successful operation.
/pull:
post:
operationId: pullModel
tags:
- Models
summary: Download a model from the ollama library.
description: Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/PullModelRequest'
responses:
'200':
description: Successful operation.
content:
application/x-ndjson:
schema:
$ref: '#/components/schemas/PullModelResponse'
/push:
post:
operationId: pushModel
tags:
- Models
summary: Upload a model to a model library.
description: Requires registering for ollama.ai and adding a public key first.
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/PushModelRequest'
responses:
'200':
description: Successful operation.
content:
application/json:
schema:
$ref: '#/components/schemas/PushModelResponse'
"/blobs/{digest}":
head:
operationId: checkBlob
tags:
- Models
summary: Check to see if a blob exists on the Ollama server, which is useful when creating models.
parameters:
- in: path
name: digest
schema:
type: string
required: true
description: the SHA256 digest of the blob
example: sha256:c8edda1f17edd2f1b60253b773d837bda7b9d249a61245931a4d7c9a8d350250
responses:
'200':
description: Blob exists on the server
'404':
description: Blob was not found
post:
operationId: createBlob
tags:
- Models
summary: Create a blob from a file. Returns the server file path.
parameters:
- in: path
name: digest
schema:
type: string
required: true
description: the SHA256 digest of the blob
example: sha256:c8edda1f17edd2f1b60253b773d837bda7b9d249a61245931a4d7c9a8d350250
requestBody:
content:
application/octet-stream:
schema:
type: string
format: binary
responses:
'201':
description: Blob was successfully created
components:
schemas:
GenerateCompletionRequest:
type: object
description: Request class for the generate endpoint.
properties:
model:
type: string
description: &model_name |
The model name.
Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
example: llama2:7b
prompt:
type: string
description: The prompt to generate a response.
example: Why is the sky blue?
images:
type: array
description: (optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
items:
type: string
contentEncoding: base64
description: Base64-encoded image (for multimodal models such as llava)
example: iVBORw0KGgoAAAANSUhEUgAAAAkAAAANCAIAAAD0YtNRAAAABnRSTlMA/AD+APzoM1ogAAAAWklEQVR4AWP48+8PLkR7uUdzcMvtU8EhdykHKAciEXL3pvw5FQIURaBDJkARoDhY3zEXiCgCHbNBmAlUiyaBkENoxZSDWnOtBmoAQu7TnT+3WuDOA7KBIkAGAGwiNeqjusp/AAAAAElFTkSuQmCC
system:
type: string
description: The system prompt (overrides what is defined in the Modelfile).
template:
type: string
description: The full prompt or prompt template (overrides what is defined in the Modelfile).
context:
type: array
description: The context parameter returned from a previous request to [generateCompletion], this can be used to keep a short conversational memory.
items:
type: integer
options:
$ref: '#/components/schemas/RequestOptions'
format:
$ref: '#/components/schemas/ResponseFormat'
raw:
type: boolean
description: |
If `true` no formatting will be applied to the prompt and no context will be returned.
You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API, and are managing history yourself.
stream:
type: boolean
description: &stream |
If `false` the response will be returned as a single response object, otherwise the response will be streamed as a series of objects.
default: false
keep_alive:
type: integer
description: &keep_alive |
How long (in minutes) to keep the model loaded in memory.
- If set to a positive duration (e.g. 20), the model will stay loaded for the provided duration.
- If set to a negative duration (e.g. -1), the model will stay loaded indefinitely.
- If set to 0, the model will be unloaded immediately once finished.
- If not set, the model will stay loaded for 5 minutes by default
required:
- model
- prompt
RequestOptions:
type: object
description: Additional model parameters listed in the documentation for the Modelfile such as `temperature`.
properties:
num_keep:
type: integer
description: |
Number of tokens to keep from the prompt.
seed:
type: integer
description: |
Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)
num_predict:
type: integer
description: |
Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)
top_k:
type: integer
description: |
Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
top_p:
type: number
format: float
description: |
Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
tfs_z:
type: number
format: float
description: |
Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)
typical_p:
type: number
format: float
description: |
Typical p is used to reduce the impact of less probable tokens from the output.
repeat_last_n:
type: integer
description: |
Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
temperature:
type: number
format: float
description: |
The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)
repeat_penalty:
type: number
format: float
description: |
Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
presence_penalty:
type: number
format: float
description: |
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
frequency_penalty:
type: number
format: float
description: |
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
mirostat:
type: integer
description: |
Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
mirostat_tau:
type: number
format: float
description: |
Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
mirostat_eta:
type: number
format: float
description: |
Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)
penalize_newline:
type: boolean
description: |
Penalize newlines in the output. (Default: false)
stop:
type: array
description: Sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
items:
type: string
numa:
type: boolean
description: |
Enable NUMA support. (Default: false)
num_ctx:
type: integer
description: |
Sets the size of the context window used to generate the next token.
num_batch:
type: integer
description: |
Sets the number of batches to use for generation. (Default: 1)
num_gqa:
type: integer
description: |
The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for `llama2:70b`.
num_gpu:
type: integer
description: |
The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable.
main_gpu:
type: integer
description: |
The GPU to use for the main model. Default is 0.
low_vram:
type: boolean
description: |
Enable low VRAM mode. (Default: false)
f16_kv:
type: boolean
description: |
Enable f16 key/value. (Default: false)
logits_all:
type: boolean
description: |
Enable logits all. (Default: false)
vocab_only:
type: boolean
description: |
Enable vocab only. (Default: false)
use_mmap:
type: boolean
description: |
Enable mmap. (Default: false)
use_mlock:
type: boolean
description: |
Enable mlock. (Default: false)
embedding_only:
type: boolean
description: |
Enable embedding only. (Default: false)
rope_frequency_base:
type: number
format: float
description: |
The base of the rope frequency scale. (Default: 1.0)
rope_frequency_scale:
type: number
format: float
description: |
The scale of the rope frequency. (Default: 1.0)
num_thread:
type: integer
description: |
Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).
ResponseFormat:
type: string
description: |
The format to return a response in. Currently the only accepted value is json.
Enable JSON mode by setting the format parameter to json. This will structure the response as valid JSON.
Note: it's important to instruct the model to use JSON in the prompt. Otherwise, the model may generate large amounts of whitespace.
enum:
- json
GenerateCompletionResponse:
type: object
description: The response class for the generate endpoint.
properties:
model:
type: string
description: *model_name
example: llama2:7b
created_at:
type: string
format: date-time
description: Date on which a model was created.
example: 2023-08-04T19:22:45.499127Z
response:
type: string
description: The response for a given prompt with a provided model.
example: The sky appears blue because of a phenomenon called Rayleigh scattering.
done:
type: boolean
description: Whether the response has completed.
example: true
context:
type: array
description: |
An encoding of the conversation used in this response, this can be sent in the next request to keep a conversational memory.
items:
type: integer
example: [ 1, 2, 3 ]
total_duration:
type: integer
description: Time spent generating the response.
example: 5589157167
load_duration:
type: integer
description: Time spent in nanoseconds loading the model.
example: 3013701500
prompt_eval_count:
type: integer
description: Number of tokens in the prompt.
example: 46
prompt_eval_duration:
type: integer
description: Time spent in nanoseconds evaluating the prompt.
example: 1160282000
eval_count:
type: integer
description: Number of tokens in the response.
example: 113
eval_duration:
type: integer
description: Time in nanoseconds spent generating the response.
example: 1325948000
GenerateChatCompletionRequest:
type: object
description: Request class for the chat endpoint.
properties:
model:
type: string
description: *model_name
example: llama2:7b
messages:
type: array
description: The messages of the chat, this can be used to keep a chat memory
items:
$ref: '#/components/schemas/Message'
format:
$ref: '#/components/schemas/ResponseFormat'
options:
$ref: '#/components/schemas/RequestOptions'
stream:
type: boolean
description: *stream
default: false
keep_alive:
type: integer
description: *keep_alive
required:
- model
- messages
GenerateChatCompletionResponse:
type: object
description: The response class for the chat endpoint.
properties:
message:
$ref: '#/components/schemas/Message'
model:
type: string
description: *model_name
example: llama2:7b
created_at:
type: string
format: date-time
description: Date on which a model was created.
example: 2023-08-04T19:22:45.499127Z
done:
type: boolean
description: Whether the response has completed.
example: true
total_duration:
type: integer
description: Time spent generating the response.
example: 5589157167
load_duration:
type: integer
description: Time spent in nanoseconds loading the model.
example: 3013701500
prompt_eval_count:
type: integer
description: Number of tokens in the prompt.
example: 46
prompt_eval_duration:
type: integer
description: Time spent in nanoseconds evaluating the prompt.
example: 1160282000
eval_count:
type: integer
description: Number of tokens in the response.
example: 113
eval_duration:
type: integer
description: Time in nanoseconds spent generating the response.
example: 1325948000
Message:
type: object
description: A message in the chat endpoint
properties:
role:
type: string
description: The role of the message
enum: [ "system", "user", "assistant" ]
content:
type: string
description: The content of the message
example: Why is the sky blue?
images:
type: array
description: (optional) a list of Base64-encoded images to include in the message (for multimodal models such as llava)
items:
type: string
description: Base64-encoded image (for multimodal models such as llava)
example: iVBORw0KGgoAAAANSUhEUgAAAAkAAAANCAIAAAD0YtNRAAAABnRSTlMA/AD+APzoM1ogAAAAWklEQVR4AWP48+8PLkR7uUdzcMvtU8EhdykHKAciEXL3pvw5FQIURaBDJkARoDhY3zEXiCgCHbNBmAlUiyaBkENoxZSDWnOtBmoAQu7TnT+3WuDOA7KBIkAGAGwiNeqjusp/AAAAAElFTkSuQmCC
required:
- role
- content
GenerateEmbeddingRequest:
description: Generate embeddings from a model.
type: object
properties:
model:
type: string
description: *model_name
example: llama2:7b
prompt:
type: string
description: Text to generate embeddings for.
example: 'Here is an article about llamas...'
options:
$ref: '#/components/schemas/RequestOptions'
required:
- model
- prompt
GenerateEmbeddingResponse:
type: object
description: Returns the embedding information.
properties:
embedding:
type: array
description: The embedding for the prompt.
items:
type: number
format: double
example: [ 0.5670403838157654, 0.009260174818336964, ... ]
CreateModelRequest:
type: object
description: Create model request object.
properties:
name:
type: string
description: *model_name
example: mario
modelfile:
type: string
description: The contents of the Modelfile.
example: FROM llama2\nSYSTEM You are mario from Super Mario Bros.
stream:
type: boolean
description: *stream
default: false
required:
- name
- modelfile
CreateModelResponse:
description: Response object for creating a model. When finished, `status` is `success`.
type: object
properties:
status:
$ref: '#/components/schemas/CreateModelStatus'
CreateModelStatus:
type: string
description: Status creating the model
enum:
- creating system layer
- parsing modelfile
- success
ModelsResponse:
description: Response class for the list models endpoint.
type: object
properties:
models:
type: array
description: List of models available locally.
items:
$ref: '#/components/schemas/Model'
Model:
type: object
description: A model available locally.
properties:
name:
type: string
description: *model_name
example: llama2:7b
modified_at:
type: string
format: date-time
description: Model modification date.
example: 2023-08-02T17:02:23.713454393-07:00
size:
type: integer
description: Size of the model on disk.
example: 7323310500
ModelInfoRequest:
description: Request class for the show model info endpoint.
type: object
properties:
name:
type: string
description: *model_name
example: llama2:7b
required:
- name
ModelInfo:
description: Details about a model including modelfile, template, parameters, license, and system prompt.
type: object
properties:
license:
type: string
description: The model's license.
example: <contents of license block>
modelfile:
type: string
description: The modelfile associated with the model.
example: 'Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llama2:latest\n\nFROM /Users/username/.ollama/models/blobs/sha256:8daa9615cce30c259a9555b1cc250d461d1bc69980...
parameters:
type: string
description: The model parameters.
example: 'stop [INST]\nstop [/INST]\nstop <<SYS>>\nstop <</SYS>>'
template:
type: string
description: The prompt template for the model.
example: '[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>\n\n{{ end }}{{ .Prompt }} [/INST]'
CopyModelRequest:
description: Request class for copying a model.
type: object
properties:
source:
type: string
description: Name of the model to copy.
example: llama2:7b
destination:
type: string
description: Name of the new model.
example: llama2-backup
required:
- source
- destination
DeleteModelRequest:
description: Request class for deleting a model.
type: object
properties:
name:
type: string
description: *model_name
example: llama2:13b
required:
- name
PullModelRequest:
description: Request class for pulling a model.
type: object
properties:
name:
type: string
description: *model_name
example: llama2:7b
insecure:
type: boolean
description: |
Allow insecure connections to the library.
Only use this if you are pulling from your own library during development.
default: false
stream:
type: boolean
description: *stream
default: false
required:
- name
PullModelResponse:
description: |
Response class for pulling a model.
The first object is the manifest. Then there is a series of downloading responses. Until any of the download is completed, the `completed` key may not be included.
The number of files to be downloaded depends on the number of layers specified in the manifest.
type: object
properties:
status:
$ref: '#/components/schemas/PullModelStatus'
digest:
type: string
description: The model's digest.
example: 'sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711a'
total:
type: integer
description: Total size of the model.
example: 2142590208
completed:
type: integer
description: Total bytes transferred.
example: 2142590208
PullModelStatus:
type: string
description: Status pulling the model.
enum:
- pulling manifest
- downloading digestname
- verifying sha256 digest
- writing manifest
- removing any unused layers
- success
example: pulling manifest
PushModelRequest:
description: Request class for pushing a model.
type: object
properties:
name:
type: string
description: The name of the model to push in the form of <namespace>/<model>:<tag>.
example: 'mattw/pygmalion:latest'
insecure:
type: boolean
description: |
Allow insecure connections to the library.
Only use this if you are pushing to your library during development.
default: false
stream:
type: boolean
description: *stream
default: false
required:
- name
PushModelResponse:
type: object
description: Response class for pushing a model.
properties:
status:
$ref: '#/components/schemas/PushModelStatus'
digest:
type: string
description: the model's digest
example: 'sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711a'
total:
type: integer
description: total size of the model
example: 2142590208
PushModelStatus:
type: string
description: Status pushing the model.
enum:
- retrieving manifest
- starting upload
- pushing manifest
- success