Atomic-Pipe
## Constructor options

All constructors (`new`, `pair`, `from_fh`, `from_fd`, `read_fifo`,
`write_fifo`) accept:

- compression => 'zstd'

    Enable Zstd compression. Currently `'zstd'` is the only supported algorithm;
    any other value croaks at construction.

- compression\_level => $level

    Zstd compression level, defaults to 3. Only meaningful when `compression` is
    enabled.

- compression\_dictionary => $bytes

    Optional shared Zstd dictionary, supplied as raw bytes. Both ends must use the
    same dictionary content. Mutually exclusive with `compression_dictionary_file`.

- compression\_dictionary\_file => $path

    Same as `compression_dictionary` but loaded from a file via
    ["new\_from\_file" in Compress::Zstd::CompressionDictionary](https://metacpan.org/pod/Compress%3A%3AZstd%3A%3ACompressionDictionary#new_from_file). The file is read on
    demand.

- keep\_compressed => $bool

    When set together with `compression`, reads expose the on-wire compressed
    bytes alongside the decompressed payload. See ["read\_message"](#read_message) and
    ["get\_line\_burst\_or\_data"](#get_line_burst_or_data) for the exact return-shape changes. Has no effect
    without `compression`.
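As a minimal sketch of how the options above fit together, the following builds a connected pair with Zstd compression enabled. It assumes a release of `Atomic::Pipe` that supports these options is installed along with its Zstd backend, and that plain `write_message` / `read_message` work without further setup; the payload is purely illustrative.

```perl
use strict;
use warnings;
use Atomic::Pipe;

# Assumption: the installed Atomic::Pipe supports the compression options
# documented above. Both ends of the pair get the same settings.
my ($r, $w) = Atomic::Pipe->pair(
    compression       => 'zstd',
    compression_level => 3,    # the documented default, spelled out
);

# A small message fits in the kernel pipe buffer, so writing and then
# reading from the same process should not block.
$w->write_message('{"event":"login","user":"alice"}');
my $msg = $r->read_message;
print "$msg\n";
```

Passing `keep_compressed => 1` as well would change what the read methods return, so it is left out here.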

## Custom dictionary

Custom Zstd dictionaries can dramatically reduce frame size for small,
repetitive payloads. Either form (bytes or file) may be supplied at
construction or via ["set\_compression\_dictionary"](#set_compression_dictionary) /
["set\_compression\_dictionary\_file"](#set_compression_dictionary_file).

**Caveat:** raw Zstd dictionaries do not embed a dict-ID, so a
**mismatched** peer dictionary silently decodes to garbage rather than
failing. (Hard frame corruption -- truncated or invalid frames -- still raises
a fatal error.) Both ends must use byte-identical dictionary content.
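Because a mismatch is not detected by the decoder, one possible safeguard is to compare a digest of the dictionary bytes on both ends before exchanging compressed messages. The sketch below uses only core `Digest::SHA`; the helper names `dictionary_fingerprint` and `slurp_dictionary` are hypothetical and not part of `Atomic::Pipe`, and how the fingerprint travels between peers is left to the application.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: fingerprint the raw dictionary bytes so that two
# peers can compare them out of band.
sub dictionary_fingerprint {
    my ($bytes) = @_;
    return sha256_hex($bytes);
}

# Hypothetical helper: read a dictionary file as raw bytes.
sub slurp_dictionary {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                    # slurp mode
    my $bytes = <$fh>;
    close($fh);
    return $bytes;
}

# Illustrative stand-ins for the dictionaries held by each side.
my $writer_dict = "example dictionary bytes";
my $reader_dict = "example dictionary bytes";

die "Dictionary mismatch between peers\n"
    unless dictionary_fingerprint($writer_dict)
        eq dictionary_fingerprint($reader_dict);
```

The same bytes that pass this check would then be given to `compression_dictionary` on each side.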

## Performance

Compression is not just a wire-size optimization for `Atomic::Pipe`: when
messages exceed `PIPE_BUF` (typically 4096 bytes on Linux) the writer must
fragment them into multiple non-atomic chunks, and the reader must reassemble
them. Compressing the payload first frequently collapses a multi-part message
back into a single atomic burst, which avoids that per-message protocol
overhead entirely. As a result, on workloads dominated by larger-than-PIPE\_BUF
messages, compression is often **much faster end-to-end than no compression**,
even after accounting for the CPU cost of compress/decompress.

The kernel pipe buffer size (see ["resize"](#resize)) does **not** affect this --
fragmentation is keyed on the POSIX `PIPE_BUF` atomic-write threshold, not on
the buffer capacity.
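The fragmentation argument can be made concrete with a rough estimate of how many atomic writes a payload needs. This is a sketch under stated assumptions: the per-chunk framing overhead (`$ASSUMED_OVERHEAD`) is a hypothetical value, not taken from the `Atomic::Pipe` source, and `estimated_bursts` is an illustrative helper rather than part of the module.

```perl
use strict;
use warnings;
use POSIX ();

# PIPE_BUF may be unavailable on some platforms; 512 is the POSIX minimum.
my $PIPE_BUF = eval { POSIX::PIPE_BUF() } || 512;

# Hypothetical per-chunk framing overhead in bytes (an assumption).
my $ASSUMED_OVERHEAD = 16;

# Estimate how many atomic writes a payload of $len bytes would need.
sub estimated_bursts {
    my ($len, $pipe_buf, $overhead) = @_;
    my $room = $pipe_buf - $overhead;    # payload bytes per atomic write
    die "overhead must be smaller than PIPE_BUF\n" if $room <= 0;
    return 1 if $len <= $room;
    return int(($len + $room - 1) / $room);    # ceiling division
}

# A 5100-byte object versus the same object compressed to about 1541 bytes,
# using a 4096-byte PIPE_BUF for the comparison.
printf "uncompressed: %d burst(s)\n", estimated_bursts(5100, 4096, $ASSUMED_OVERHEAD);
printf "compressed:   %d burst(s)\n", estimated_bursts(1541, 4096, $ASSUMED_OVERHEAD);
```

Under these assumptions the uncompressed object needs two writes and the compressed one fits in a single atomic burst, which is the effect described above.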

### Benchmark: streaming JSON objects

Numbers below are from `bench/zstd_compression.pl` in the distribution. The
workload is a synthetic but representative stream of JSON log/event objects
sent in mixed-data mode via `write_message`. The corpus is generated once and
reused across all runs; sizes are JSON-encoded byte counts.

Two corpora were measured:

- Small JSON (10 MB total, 11785 objects)

    Object sizes 181 .. 1977 bytes, average ~890 B; ~37% of objects under 500 B.
    Most messages fit in a single `PIPE_BUF` burst regardless of compression.

        level     raw MB/s   wire MB    ratio   saved
        plain         9.74    10.00       -        -
        L-3          15.98     6.68    1.50x    33.2%
        L1           24.55     4.92    2.03x    50.8%
        L3 (def)     27.79     4.91    2.04x    50.9%
        L5           46.34     4.87    2.05x    51.3%
        L7           63.72     4.87    2.05x    51.3%
        L12          27.02     4.85    2.06x    51.5%
        L22          14.43     4.84    2.07x    51.6%

    For this size distribution, every measured level is faster than no
    compression, with levels 5..7 the fastest (pipe back-pressure on the
    uncompressed run still dominates).

- Larger JSON (100 MB total, 20407 objects)

    Object sizes 187 .. 10000 bytes, average ~5.1 KB, evenly distributed across
    the 1..10 KB range. Most objects exceed `PIPE_BUF`, so the uncompressed path
    pays the multi-part fragmentation cost on nearly every message.

        level     raw MB/s   wire MB    ratio   saved
        plain         0.29   100.00       -        -
        L-3         287.85    35.61    2.81x    64.4%
        L-1         273.56    33.92    2.95x    66.1%
        L1          237.04    30.56    3.27x    69.4%
        L3 (def)    207.61    30.25    3.31x    69.7%
        L5          113.02    30.01    3.33x    70.0%
        L9           39.35    29.93    3.34x    70.1%
        L18           7.81    28.14    3.55x    71.9%
        L22           7.85    28.14    3.55x    71.9%

    Here the uncompressed run collapses to ~0.29 MB/s, while levels up to 3
    achieve 200+ MB/s -- a roughly 700x to 1000x throughput improvement
    driven almost entirely by avoided fragmentation. Levels above ~5 trade
    significant CPU for negligible additional ratio.

- Pipe buffer size has minimal impact

    The same 100 MB corpus, holding mode constant and varying the kernel pipe
    buffer (32 KB, 128 KB, 512 KB, 1 MB), shows almost no movement in either
    direction. The bottleneck is `PIPE_BUF`-aligned framing, not buffer fill, so
    calling ["resize"](#resize) with a larger size will not rescue an uncompressed
    large-message workload.
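The `ratio` and `saved` columns in the tables above appear to follow directly from the raw and wire sizes, and the speedup figures from the throughput column. The sketch below reproduces them for two rows of the larger corpus; `compression_stats` and `speedup` are hypothetical helper names. Because the published figures are themselves rounded, other rows may differ in the last digit.

```perl
use strict;
use warnings;

# ratio = raw / wire; saved = percentage of raw bytes not sent on the wire.
sub compression_stats {
    my ($raw_mb, $wire_mb) = @_;
    my $ratio = $raw_mb / $wire_mb;
    my $saved = 100 * (1 - $wire_mb / $raw_mb);
    return (sprintf('%.2f', $ratio), sprintf('%.1f', $saved));
}

# Throughput improvement of a compressed run over the plain run.
sub speedup {
    my ($compressed_mbps, $plain_mbps) = @_;
    return sprintf('%.0f', $compressed_mbps / $plain_mbps);
}

# L1 row of the larger corpus: 100.00 MB raw, 30.56 MB on the wire.
my ($ratio, $saved) = compression_stats(100.00, 30.56);
print "ratio ${ratio}x, saved ${saved}%\n";    # ratio 3.27x, saved 69.4%

# L3 (default) versus plain on the larger corpus.
print "speedup ", speedup(207.61, 0.29), "x\n";    # speedup 716x
```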

### Practical guidance

- If your messages are routinely larger than `PIPE_BUF` (~4 KB), enabling
compression is almost always a throughput win, not just a bandwidth win.