SQLite-VecDB
    my @results = $coll->search(
        vector => [0.1, 0.2, ...],
        limit  => 5,
    );

    for my $r (@results) {
        say $r->id;        # 'doc1'
        say $r->distance;  # 0.042
        say $r->metadata;  # { title => 'Hello World' }
        say $r->content;   # 'Original text content'
    }
=head1 DESCRIPTION
SQLite::VecDB turns SQLite into a vector database using the
L<sqlite-vec|https://github.com/asg017/sqlite-vec> extension. It supports
storing vectors with metadata, KNN (k-nearest neighbor) search, and
optional automatic embedding generation via L<Langertha>.
=head2 db_file
Path to the SQLite database file. Use C<:memory:> for an in-memory database.
=head2 dimensions
The number of dimensions for vectors in this database. Must match the
embedding model you are using (e.g. 768 for nomic-embed-text, 1536 for
OpenAI text-embedding-3-small).
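A mismatch between C<dimensions> and the length of a vector is easy to catch before the vector is stored. The following is a minimal sketch in plain Perl, assuming nothing about how SQLite::VecDB itself validates input; the helper name C<check_dimensions> is hypothetical and not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SQLite::VecDB): die unless the
# vector is an array reference with exactly the expected length.
sub check_dimensions {
    my ($vector, $dimensions) = @_;
    die "vector must be an array reference\n"
        unless ref($vector) eq 'ARRAY';
    my $got = scalar @$vector;
    die "expected $dimensions dimensions, got $got\n"
        unless $got == $dimensions;
    return 1;
}

# A 3-element vector passes a 3-dimension check ...
check_dimensions([0.1, 0.2, 0.3], 3);

# ... and fails a 768-dimension check.
my $ok = eval { check_dimensions([0.1, 0.2, 0.3], 768) };
print "mismatch: $@" unless $ok;
```

Running such a check before storing a vector turns a model/database mismatch into an early, readable error.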
=head2 distance_metric
Distance metric for vector search. Defaults to C<cosine>. The metrics
supported by sqlite-vec are C<cosine>, C<l2>, and C<l1>.
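To make the three metrics concrete, here is a minimal pure-Perl sketch of their standard textbook definitions: cosine distance as one minus cosine similarity, l2 as Euclidean distance, and l1 as Manhattan distance. The function names are hypothetical and the code does not call sqlite-vec; the extension computes distances natively, so its results may differ in floating-point detail.

```perl
use strict;
use warnings;
use List::Util qw(sum0);

# l1 (Manhattan): sum of absolute component differences.
sub l1_distance {
    my ($x, $y) = @_;
    return sum0(map { abs($x->[$_] - $y->[$_]) } 0 .. $#$x);
}

# l2 (Euclidean): square root of the sum of squared differences.
sub l2_distance {
    my ($x, $y) = @_;
    return sqrt(sum0(map { my $d = $x->[$_] - $y->[$_]; $d * $d } 0 .. $#$x));
}

# cosine: 1 - (x . y) / (|x| * |y|); undefined for zero-norm vectors.
sub cosine_distance {
    my ($x, $y) = @_;
    my $dot = sum0(map { $x->[$_] * $y->[$_] } 0 .. $#$x);
    my $nx  = sqrt(sum0(map { $_ * $_ } @$x));
    my $ny  = sqrt(sum0(map { $_ * $_ } @$y));
    die "cosine distance is undefined for a zero vector\n"
        if $nx == 0 || $ny == 0;
    return 1 - $dot / ($nx * $ny);
}

# Two orthogonal unit vectors.
my ($p, $q) = ([1, 0], [0, 1]);
printf "l1=%.3f l2=%.3f cosine=%.3f\n",
    l1_distance($p, $q), l2_distance($p, $q), cosine_distance($p, $q);
# prints: l1=2.000 l2=1.414 cosine=1.000
```

Cosine distance ignores vector magnitude, which is why it is a common choice for text embeddings; l2 and l1 are sensitive to magnitude.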
=head2 embedding
Optional. A L<Langertha> engine instance that supports the
L<Langertha::Role::Embedding> role. When set, collections gain
C<add_text> and C<search_text> methods that automatically generate
embeddings.
=head2 sqlite_vec_path
Path to the sqlite-vec shared library. Auto-detected from
C<$ENV{SQLITE_VEC_PATH}> or L<Alien::sqlite_vec> if not specified.
=head2 collection
    my $coll = $vdb->collection('documents');
    my $coll = $vdb->collection;  # uses '_default'
Returns a L<SQLite::VecDB::Collection> for the given name. Creates the
underlying tables on first use.
=head2 collections
    my @names = $vdb->collections;
Returns the names of all existing collections.
=head1 WITH LANGERTHA - AUTOMATIC EMBEDDINGS
    use SQLite::VecDB;
    use Langertha::Engine::OpenAI;

    my $engine = Langertha::Engine::OpenAI->new(
        api_key => $ENV{OPENAI_API_KEY},
    );

    my $vdb = SQLite::VecDB->new(
        db_file    => 'vectors.db',
        dimensions => 1536,
        embedding  => $engine,
    );

    my $coll = $vdb->collection('docs');

    # Text is automatically embedded
    $coll->add_text(
        id   => 'doc1',
        text => 'Kubernetes is a container orchestration platform.',
    );

    # Query is automatically embedded
    my @results = $coll->search_text(
        text  => 'container management',
        limit => 5,
    );
=head1 EMBEDDING SETUP
SQLite::VecDB stores and searches raw vectors. To generate embeddings from
text, pass any L<Langertha> engine that supports L<Langertha::Role::Embedding>
as the C<embedding> attribute.
=head2 Local Embeddings with Ollama (Recommended for Getting Started)
The easiest way to generate embeddings locally, with no API key, no cloud
service, and no cost:
    # Start Ollama in Docker
    docker run -d -p 11434:11434 --name ollama ollama/ollama

    # Pull an embedding model (768 dimensions, ~270MB)
    docker exec ollama ollama pull nomic-embed-text
Then in Perl:
    use SQLite::VecDB;
    use Langertha::Engine::Ollama;

    my $engine = Langertha::Engine::Ollama->new(
        url             => 'http://localhost:11434',
        embedding_model => 'nomic-embed-text',
    );

    my $vdb = SQLite::VecDB->new(
        db_file    => 'my_vectors.db',
        dimensions => 768,
        embedding  => $engine,
    );
=head2 Popular Embedding Models
    Model                              Dimensions   Provider
    --------------------------------------------------------
    nomic-embed-text (Ollama)          768          Local
    all-minilm (Ollama)                384          Local
    mxbai-embed-large (Ollama)         1024         Local
    text-embedding-3-small (OpenAI)    1536         Cloud
    text-embedding-3-large (OpenAI)    3072         Cloud
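When the model name is chosen at run time, the dimensions listed above can be kept in a lookup so that C<dimensions> always matches the model. This is a minimal sketch: the hash C<%MODEL_DIMENSIONS> and the helper C<dimensions_for> are hypothetical names, not part of SQLite::VecDB, and the values are copied from the table.

```perl
use strict;
use warnings;

# Hypothetical lookup: model name => embedding dimensions (from the table).
my %MODEL_DIMENSIONS = (
    'nomic-embed-text'       => 768,
    'all-minilm'             => 384,
    'mxbai-embed-large'      => 1024,
    'text-embedding-3-small' => 1536,
    'text-embedding-3-large' => 3072,
);

# Return the dimensions for a known model, or die for an unknown one
# rather than guessing a value.
sub dimensions_for {
    my ($model) = @_;
    return $MODEL_DIMENSIONS{$model}
        // die "unknown embedding model '$model'\n";
}

print dimensions_for('nomic-embed-text'), "\n";   # prints 768
```

The result can then be passed as C<< dimensions => dimensions_for($model) >> when constructing the database object.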
=head2 Cloud Embeddings with OpenAI
    use Langertha::Engine::OpenAI;

    my $engine = Langertha::Engine::OpenAI->new(
        api_key => $ENV{OPENAI_API_KEY},
    );

    my $vdb = SQLite::VecDB->new(
        db_file    => 'vectors.db',
        dimensions => 1536,  # text-embedding-3-small default
        embedding  => $engine,
    );
=head1 SQLITE-VEC EXTENSION
The sqlite-vec extension must be available as a shared library. SQLite::VecDB
finds it in this order: