AI-Embedding
lib/AI/Embedding.pm
    if (scalar keys %$vector1 != scalar keys %$vector2) {
        $self->{'error'} = 'Embeds are unequal length';
        return;
    }
    return $self->_compare_vector($vector1, $vector2);
}
# Compare 2 Vectors
sub _compare_vector {
    my ($self, $vector1, $vector2) = @_;
    my $cs = Data::CosineSimilarity->new;
    $cs->add( label1 => $vector1 );
    $cs->add( label2 => $vector2 );
    return $cs->similarity('label1', 'label2')->cosine;
}
1;
__END__
=encoding utf8
=head1 NAME
AI::Embedding - Perl module for working with text embeddings using various APIs
=head1 VERSION
Version 1.11
=head1 SYNOPSIS
    use AI::Embedding;
    my $embedding = AI::Embedding->new(
        api => 'OpenAI',
        key => 'your-api-key'
    );
    my $csv_embedding = $embedding->embedding('Some sample text');
    my $test_embedding = $embedding->test_embedding('Some sample text');
    my @raw_embedding = $embedding->raw_embedding('Some sample text');
    my $cmp = $embedding->comparator($csv_embedding2);
    my $similarity = $cmp->($csv_embedding1);
    my $similarity_with_other_embedding = $embedding->compare($csv_embedding1, $csv_embedding2);
=head1 DESCRIPTION
The L<AI::Embedding> module provides an interface for working with text embeddings using various APIs. It currently supports the L<OpenAI|https://www.openai.com> L<Embeddings API|https://platform.openai.com/docs/guides/embeddings/what-are-embeddings>...
Embeddings allow passages of text to be compared by meaning. This is often more natural and useful to humans than traditional keyword-based comparison.
An Embedding is a multi-dimensional vector representing the meaning of a piece of text. The Embedding vector is created by an AI model. The default model (OpenAI's C<text-embedding-ada-002>) produces a 1536-dimensional vector. The resulting vector...
=head2 Comparator
Embeddings are used to compare similarity of meaning between two passages of text. A typical use case is to store a number of pieces of text (e.g. articles or blogs) in a database and compare each one to some user-supplied search text. L<AI::Embed...
Alternatively, the C<comparator> method can be called with one Embedding. It returns a reference to a subroutine that takes a single Embedding and compares it to the Embedding from which the comparator was created.
When comparing multiple Embeddings to the same Embedding (such as search text) it is faster to use a C<comparator>.
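The comparator pattern described above can be sketched in plain Perl. This is only an illustration, not the module's implementation: it assumes the saving comes from parsing the reference Embedding and computing its norm once, and the helpers C<parse_csv_embedding>, C<norm> and C<make_comparator> are hypothetical names that do not exist in L<AI::Embedding>. The toy vectors have three dimensions rather than 1536.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Embedding): parse a comma-separated
# embedding string into an array reference of numbers.
sub parse_csv_embedding {
    my ($csv) = @_;
    return [ map { $_ + 0 } split /,/, $csv ];
}

# Hypothetical helper: Euclidean norm of a vector.
sub norm {
    my ($vec) = @_;
    my $sum = 0;
    $sum += $_ * $_ for @$vec;
    return sqrt $sum;
}

# Build a closure around one reference embedding. The reference vector is
# parsed and its norm computed once, not on every comparison.
sub make_comparator {
    my ($reference_csv) = @_;
    my $ref_vec  = parse_csv_embedding($reference_csv);
    my $ref_norm = norm($ref_vec);

    return sub {
        my ($other_csv) = @_;
        my $vec = parse_csv_embedding($other_csv);
        return if @$vec != @$ref_vec;    # unequal lengths cannot be compared

        # Dot product of the two vectors.
        my $dot = 0;
        $dot += $ref_vec->[$_] * $vec->[$_] for 0 .. $#$vec;

        # Cosine similarity; a zero-length vector is treated as 0 similarity.
        my $denominator = $ref_norm * norm($vec);
        return $denominator ? $dot / $denominator : 0;
    };
}

# Toy three-dimensional "embeddings" standing in for real ones.
my $cmp        = make_comparator('1,0,0');
my $same       = $cmp->('2,0,0');    # same direction as the reference
my $orthogonal = $cmp->('0,3,0');    # perpendicular to the reference
printf "%.2f %.2f\n", $same, $orthogonal;    # prints "1.00 0.00"
```

Each call to the returned code reference only has to parse and measure the new Embedding, which is why a closure of this shape suits comparing many stored texts against one search text.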
=head1 CONSTRUCTOR
=head2 new
    my $embedding = AI::Embedding->new(
        api => 'OpenAI',
        key => 'your-api-key',
        model => 'text-embedding-ada-002',
    );
Creates a new AI::Embedding object. The C<key> parameter, the API key issued by the service provider, is required.
Parameters:
=over
=item *
C<key> - B<required> The API Key
=item *
C<api> - The API to use. Currently only 'OpenAI' is supported and this is the default.
=item *
C<model> - The language model to use. Defaults to C<text-embedding-ada-002> - see L<OpenAI docs|https://platform.openai.com/docs/guides/embeddings/what-are-embeddings>
=back
=head1 METHODS
=head2 success
Returns true if the last method call was successful.
=head2 error
Returns the last error message, or an empty string if B<success> returned true.
=head2 embedding
    my $csv_embedding = $embedding->embedding('Some text passage', [$verbose]);
Generates an embedding for the given text and returns it as a comma-separated string. The C<embedding> method takes the text to embed and an optional C<verbose> flag.
Returns a (rather long) string that can be stored in a C<TEXT> database field.
If the method call fails it sets the L</"error"> message and returns C<undef>. If the optional C<verbose> parameter is true, the complete L<HTTP::Tiny> response object is also returned to aid with debugging issues when using this module.
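Because the embedding comes back as a comma-separated string, it can be turned back into numbers after being read from the database. The following is a minimal sketch under stated assumptions: C<csv_to_vector> is a hypothetical helper, not part of L<AI::Embedding>, and the short string stands in for a real embedding, which is much longer.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Embedding): turn a stored
# comma-separated embedding back into a list of numbers.
sub csv_to_vector {
    my ($csv) = @_;
    # A failed call returns undef, so return an empty list in that case.
    return unless defined $csv && length $csv;
    return map { $_ + 0 } split /,/, $csv;
}

# A short made-up string stands in for the real embedding.
my $stored = '0.25,-0.5,0.125';
my @vector = csv_to_vector($stored);
my @empty  = csv_to_vector(undef);

printf "%d dimensions, first value %s\n", scalar @vector, $vector[0];
# prints "3 dimensions, first value 0.25"
```

Checking for C<undef> before splitting keeps a failed API call from being mistaken for an empty embedding later on.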
=head2 raw_embedding
    my @raw_embedding = $embedding->raw_embedding('Some text passage', [$verbose]);
Generates an embedding for the given text and returns it as an array. The C<raw_embedding> method takes the text to embed and an optional C<verbose> flag.