ClickHouse-Encoder

lib/ClickHouse/Encoder/TCP.pm


    # 1. Handshake
    print $s ClickHouse::Encoder::TCP->pack_hello(
        user => 'default', password => '', database => 'default');
    my $hello = ClickHouse::Encoder::TCP->read_packet($s, buffer => \$rbuf);
    die "expected Hello, got type $hello->{type}\n"
        unless $hello->{type} == ClickHouse::Encoder::TCP::SERVER_HELLO;

    # 2. Send the INSERT query
    print $s ClickHouse::Encoder::TCP->pack_query(
        query => 'INSERT INTO events FORMAT Native');

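    # 3. Skip TableColumns etc. until the server's empty Data packet arrives
    while (1) {
        my $p = ClickHouse::Encoder::TCP->read_packet($s, buffer => \$rbuf);
        die "server exception: $p->{message}\n"
            if $p->{type} == ClickHouse::Encoder::TCP::SERVER_EXCEPTION;
        last if $p->{type} == ClickHouse::Encoder::TCP::SERVER_DATA;
    }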

    # 4. Send our data block(s)
    my $enc = ClickHouse::Encoder->new(columns => [
        ['ev', 'String'], ['ts', 'DateTime']]);
    my $block = $enc->encode([ ['login', time()] ]);
    print $s ClickHouse::Encoder::TCP->pack_data($block);

    # 5. End of insert
    print $s ClickHouse::Encoder::TCP->pack_data_end();

    # 6. Wait for EndOfStream / Exception
    while (1) {
        my $p = ClickHouse::Encoder::TCP->read_packet($s, buffer => \$rbuf);
        last if $p->{type} == ClickHouse::Encoder::TCP::SERVER_END_OF_STREAM;
        die "server exception: $p->{message}\n"
            if $p->{type} == ClickHouse::Encoder::TCP::SERVER_EXCEPTION;
    }
    close $s;

=head1 DESCRIPTION

A pure-Perl helper module that packs the few client packets needed
to drive an insert pipeline over the ClickHouse native TCP protocol
(port 9000), plus a decoder for the most common server packets:
C<Hello>, C<Data>, C<Exception>, C<Progress>, C<Pong>,
C<EndOfStream>, C<ProfileInfo>, C<TableColumns>, C<ProfileEvents>.

Targets protocol revision 54429 - the same revision ClickHouse
clients have used since ~2020. Newer fields the server may include
(timezone, display_name, version_patch) are read opportunistically
when present.

Transport (socket / TLS / framing) is the caller's responsibility.
L</read_packet> is provided as a convenience for blocking
C<IO::Socket>-style use; for non-blocking transports, call
L</unpack_packet> on a sliding byte buffer directly.
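
The sliding-buffer pattern mentioned above can be sketched without
this module. The parser below is hypothetical: C<try_parse> and
C<feed> are illustrative names, and the toy record format (a 1-byte
length followed by the payload) only stands in for real packets. How
L</unpack_packet> reports an incomplete packet is not assumed here.

    use strict;
    use warnings;

    # Hypothetical parser: returns ($record, $bytes_consumed), or an empty
    # list when the buffer does not yet hold a complete record.
    sub try_parse {
        my ($bufref) = @_;
        return if length($$bufref) < 1;
        my $len = ord substr($$bufref, 0, 1);
        return if length($$bufref) < 1 + $len;
        return (substr($$bufref, 1, $len), 1 + $len);
    }

    # Append newly received bytes, then peel off every complete record,
    # leaving any partial tail in the buffer for the next call.
    sub feed {
        my ($bufref, $chunk) = @_;
        $$bufref .= $chunk;
        my @out;
        while (my ($rec, $used) = try_parse($bufref)) {
            push @out, $rec;
            substr($$bufref, 0, $used, '');
        }
        return @out;
    }

    my $buf    = '';
    my @first  = feed(\$buf, "\x03ab");        # incomplete record: nothing yet
    my @second = feed(\$buf, "c\x02hi\x01");   # completes 'abc', then 'hi'

After the second call one byte remains buffered, waiting for the
rest of its record.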

Out of scope:

=over 4

=item * Settings with typed values (newer flexible-setting wire form).

=item * SELECT result streaming (this module covers only what inserts need).

=item * Server's prepared-query parameters protocol.

=back

Wire compression is supported as an opt-in: L</pack_data> /
L</pack_data_end> accept C<<< compress => 'lz4' >>> or C<'zstd'>,
and L</read_packet> accepts C<<< compressed => 1 >>>. See L</CAVEATS>
for the negotiation handshake the caller must perform first.

For SELECT queries, prefer HTTP: it is simpler and well supported by
L<ClickHouse::Encoder>'s C<decode_block> / C<decode_stream>.

=head1 PACKET ENCODERS

=head2 pack_hello %opts

    my $bytes = ClickHouse::Encoder::TCP->pack_hello(
        client_name => 'my-app',  # default 'ClickHouse::Encoder'
        major       => 1,          # default 1
        minor       => 0,          # default 0
        revision    => 54429,      # default DEFAULT_REVISION
        database    => 'default',
        user        => 'default',
        password    => '',
    );
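
For orientation, the bytes of a Hello packet can be approximated by
hand. This sketch assumes the commonly described native-protocol
layout: packet type 0, then client name, major, minor, revision,
database, user and password, with integers as unsigned LEB128 varints
and strings as a varint length followed by the bytes. C<varuint>,
C<wire_string> and C<hello_bytes> are illustrative names, not part of
this module, and the module's real output may differ in detail.

    use strict;
    use warnings;

    # Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last.
    sub varuint {
        my ($n) = @_;
        my $out = '';
        while ($n >= 0x80) {
            $out .= chr(($n & 0x7f) | 0x80);
            $n >>= 7;
        }
        return $out . chr($n);
    }

    # Length-prefixed string; expects $s to already be a byte string.
    sub wire_string {
        my ($s) = @_;
        return varuint(length $s) . $s;
    }

    # Assumed field order of a client Hello packet (type 0).
    sub hello_bytes {
        my (%o) = @_;
        return join '',
            varuint(0),
            wire_string($o{client_name}),
            varuint($o{major}), varuint($o{minor}), varuint($o{revision}),
            map { wire_string($o{$_}) } qw(database user password);
    }

    my $hello = hello_bytes(
        client_name => 'my-app', major => 1, minor => 0, revision => 54429,
        database => 'default', user => 'default', password => '');
    printf "%s\n", unpack('H*', $hello);

Revision 54429 needs three varint bytes (C<9d a9 03>); every other
integer in this example fits in one.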

=head2 pack_query %opts

    my $bytes = ClickHouse::Encoder::TCP->pack_query(
        query    => 'INSERT INTO t FORMAT Native',
        query_id => '',                       # let server generate
        settings => { max_memory_usage => '1000000000' },  # optional
        stage    => STAGE_COMPLETE,           # default
        compression => COMPRESSION_DISABLE,   # default
    );

C<settings> may be a hashref (legacy string-value form) or a raw
byte string already in the right shape.

=head2 pack_data $block_bytes, %opts

    my $bytes = ClickHouse::Encoder::TCP->pack_data($block);
    my $bytes = ClickHouse::Encoder::TCP->pack_data($block,
        compress => 'lz4');   # or 'zstd'

Wraps an encoded Native block (from
L<ClickHouse::Encoder/encode>) in a Data packet with an optional
C<table_name> (usually empty for inserts).

With C<compress =E<gt> 'lz4'> (or C<'zstd'>, or C<'auto'> - any mode
L<ClickHouse::Encoder/compress_native_block> accepts) the block is
first wrapped in ClickHouse's compressed-block framing (16-byte
CityHash128 + 9-byte header + compressed payload) before being placed
inside the Data packet. The server must already be expecting
compressed data via the corresponding C<pack_query(... compression
=E<gt> COMPRESSION_ENABLE)>; sending compressed Data without
negotiating compression in the Query packet will be rejected as a
parse error by C<CompressedReadBuffer>. C<compress> absent, or
C<'none'> / C<'raw'>, emits the bare uncompressed block (the default).
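
The framing described above can be inspected with C<unpack>. This
sketch assumes the usual ClickHouse compressed-block layout: 16
checksum bytes, then a 9-byte header made of a 1-byte method code
(0x82 for LZ4, 0x90 for ZSTD, 0x02 for none), a 4-byte little-endian
compressed size that counts the header itself, and a 4-byte
little-endian uncompressed size. C<parse_frame_header> is an
illustrative name, not part of this module, and checksum verification
is left out.

    use strict;
    use warnings;

    my %METHOD = (0x02 => 'none', 0x82 => 'lz4', 0x90 => 'zstd');

    sub parse_frame_header {
        my ($frame) = @_;
        die "frame shorter than 25 bytes\n" if length($frame) < 25;
        my ($checksum, $method, $csize, $usize) = unpack 'a16 C V V', $frame;
        return {
            checksum          => $checksum,
            method            => $METHOD{$method} // sprintf('0x%02x', $method),
            compressed_size   => $csize,    # 9 header bytes + payload bytes
            uncompressed_size => $usize,
            payload           => substr($frame, 25, $csize - 9),
        };
    }

    # Synthetic frame: zeroed checksum, LZ4 method code, 5 payload bytes.
    my $frame = ("\0" x 16) . pack('C V V', 0x82, 9 + 5, 11) . 'xxxxx';
    my $h = parse_frame_header($frame);
    print "$h->{method} $h->{compressed_size} $h->{uncompressed_size}\n";

The synthetic payload is not valid LZ4 data; the example only shows
where each field sits in the frame.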

=head1 CONSTANTS

Client packet types: C<CLIENT_HELLO>, C<CLIENT_QUERY>, C<CLIENT_DATA>,
C<CLIENT_CANCEL>, C<CLIENT_PING>.

Server packet types: C<SERVER_HELLO>, C<SERVER_DATA>,
C<SERVER_EXCEPTION>, C<SERVER_PROGRESS>, C<SERVER_PONG>,
C<SERVER_END_OF_STREAM>, C<SERVER_PROFILE_INFO>, C<SERVER_TOTALS>,
C<SERVER_EXTREMES>, C<SERVER_TABLE_COLUMNS>, C<SERVER_PROFILE_EVENTS>.

Other: C<STAGE_COMPLETE>, C<STAGE_WITH_MERGEABLE>,
C<STAGE_FETCH_COLUMNS>, C<COMPRESSION_DISABLE>, C<COMPRESSION_ENABLE>,
C<DEFAULT_REVISION>.

All constants live in the package namespace and are not exported;
reference them as C<ClickHouse::Encoder::TCP::SERVER_DATA> etc.
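
Because nothing is exported, code that tests packet types often may
prefer short local names. One way to get them, sketched here with a
stand-in package so the example runs on its own, is to re-declare the
constants locally. C<Demo::Proto> and its numeric values are
placeholders; in real code the right-hand sides would be the fully
qualified C<ClickHouse::Encoder::TCP> constants.

    use strict;
    use warnings;

    # Stand-in for ClickHouse::Encoder::TCP; the values are placeholders.
    BEGIN {
        package Demo::Proto;
        use constant { SERVER_DATA => 1, SERVER_EXCEPTION => 2 };
    }

    # Re-declare the fully qualified constants under short local names.
    use constant {
        SERVER_DATA      => Demo::Proto::SERVER_DATA(),
        SERVER_EXCEPTION => Demo::Proto::SERVER_EXCEPTION(),
    };

    my $type = 2;
    print "exception\n" if $type == SERVER_EXCEPTION;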

=head1 CAVEATS

=over 4

=item * B<Modern server cutoff.> The default revision (54429) predates
the chunking-negotiation extension introduced in ClickHouse 24.10
(protocol revision E<gt>= 54475). Newer servers send a chunking offer
right after C<SERVER_HELLO> that this subset does not respond to;
the connection then fails fast with a protocol-mismatch error.
For integration with recent servers, prefer HTTP transport.

=item * B<String encoding.> Inputs to C<pack_string> (any string
field: query, names, settings) are encoded as UTF-8 bytes; passing a
byte-mode string with non-ASCII bytes will be reinterpreted as
Latin-1 by Perl's UTF-8 upgrade rules. If you need raw bytes, encode
to UTF-8 yourself first. C<unpack_string> conversely returns the
raw byte string the server sent - it does not set the UTF-8 flag,
so callers wanting characters should C<decode_utf8> the result.

=item * B<Settings values are strings.> Each setting is emitted with
flags=0 (ordinary) and the value as a string. Numeric/typed-setting
encoding (flexible settings, available at higher revisions) is out
of scope.

=item * B<Wire compression is opt-in.> C<pack_query> still defaults
to C<COMPRESSION_DISABLE>; to negotiate compression pass
C<compression =E<gt> COMPRESSION_ENABLE> in C<pack_query>, then
C<compress =E<gt> 'lz4'> (or C<'zstd'>) to both C<pack_data> and
C<pack_data_end>. Compressed Data packets coming back from the
server are decoded by C<read_packet($fh, compressed =E<gt> 1)>.
The compressed-block framing (16-byte CityHash128 v1.0.2 + 9-byte
header + payload) lives in L<ClickHouse::Encoder/compress_native_block>.

=back
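
The string-encoding caveat rests on core Perl behavior that can be
checked in isolation. The sketch below uses only L<Encode>; that
C<pack_string> applies the same upgrade rules is taken from the
caveat and is not verified here.

    use strict;
    use warnings;
    use Encode qw(decode_utf8 encode_utf8);

    # "e-acute" as one character (U+00E9) and as two UTF-8 encoded bytes.
    my $chars = "\x{e9}";
    my $bytes = "\xc3\xa9";

    # Encoding the character string yields the expected two bytes.
    my $once = encode_utf8($chars);

    # Encoding the byte string again treats each byte as a Latin-1
    # character, so the result is four bytes (double encoding).
    my $twice = encode_utf8($bytes);

    # Decoding first avoids that: bytes -> characters -> bytes round-trips.
    my $safe = encode_utf8(decode_utf8($bytes));

    printf "%s %s %s\n", map { unpack('H*', $_) } $once, $twice, $safe;
    # prints "c3a9 c383c2a9 c3a9"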

=head1 SEE ALSO

L<ClickHouse::Encoder> - the wire-format encoder these packets carry,
plus L<compress_native_block|ClickHouse::Encoder/compress_native_block>
/ L<decompress_native_block|ClickHouse::Encoder/decompress_native_block>
for the matching block-framing helpers.

L<EV::ClickHouse> - full async ClickHouse client (TCP + HTTP) for
SELECT result streaming, prepared queries with parameter binding,
and chunking negotiation against modern ClickHouse revisions.

=head1 AUTHOR

vividsnow

=head1 LICENSE

Same terms as Perl itself.

=cut
