
Atomic-Pipe
    accept:

    compression => 'zstd'

      Enable Zstd compression. Currently 'zstd' is the only supported
      algorithm; any other value croaks at construction.

    compression_level => $level

      Zstd compression level, defaults to 3. Only meaningful when
      compression is enabled.

    compression_dictionary => $bytes

      Optional shared Zstd dictionary, supplied as raw bytes. Both ends
      must use the same dictionary content. Mutually exclusive with
      compression_dictionary_file.

    compression_dictionary_file => $path

      Same as compression_dictionary but loaded from a file via
      "new_from_file" in Compress::Zstd::CompressionDictionary. The file is
      read on demand.

    keep_compressed => $bool

      When set together with compression, reads expose the on-wire
      compressed bytes alongside the decompressed payload. See
      "read_message" and "get_line_burst_or_data" for the exact
      return-shape changes. Has no effect without compression.

 Custom dictionary

    Custom Zstd dictionaries can dramatically reduce frame size for small,
    repetitive payloads. Either form (bytes or file) may be supplied at
    construction or via "set_compression_dictionary" /
    "set_compression_dictionary_file".

    Caveat: raw zstd dictionaries do not embed a dict-ID. As a result a
    mismatched peer dictionary will silently decode to garbage rather than
    fail. (Hard frame corruption -- truncated or invalid frames -- still
    raises fatally.) Both ends must agree on byte-identical dictionary
    content.
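
    Because a mismatch is silent, it can help to compare a digest of the
    dictionary bytes on both ends before exchanging messages. The sketch
    below is one way to do that with the core Digest::SHA module. The
    helper name dictionary_fingerprint and the sample dictionary strings
    are hypothetical, and how the fingerprint is exchanged between peers
    is left to the application.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: fingerprint the raw dictionary bytes so two peers
# can compare a short hex string instead of assuming their copies match.
sub dictionary_fingerprint {
    my ($bytes) = @_;
    return sha256_hex($bytes);
}

# Stand-in dictionary content; a real dictionary comes from zstd training.
my $writer_dict = "timestamp level message host pid";
my $reader_dict = "timestamp level message host pid";
my $stale_dict  = "timestamp level msg host pid";

my $same  = dictionary_fingerprint($writer_dict) eq dictionary_fingerprint($reader_dict);
my $stale = dictionary_fingerprint($writer_dict) eq dictionary_fingerprint($stale_dict);

print "reader matches writer: ",     ($same  ? "yes" : "no"), "\n";
print "stale copy matches writer: ", ($stale ? "yes" : "no"), "\n";
```

    A digest check only proves the bytes are identical; it does not
    validate that the dictionary is useful for the payloads being sent.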

 Performance

    Compression is not just a wire-size optimization for Atomic::Pipe: when
    messages exceed PIPE_BUF (typically 4096 bytes on Linux) the writer
    must fragment them into multiple non-atomic chunks, and the reader must
    reassemble them. Compressing the payload first frequently collapses a
    multi-part message back into a single atomic burst, which avoids that
    per-message protocol overhead entirely. As a result, on workloads
    dominated by larger-than-PIPE_BUF messages, compression is often much
    faster end-to-end than no compression, even after accounting for the
    CPU cost of compress/decompress.

    The kernel pipe buffer size (see "resize") does not affect this --
    fragmentation is keyed on the POSIX PIPE_BUF atomic-write threshold,
    not on the buffer capacity.
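
    The effect can be illustrated with simple arithmetic. The sketch below
    assumes a PIPE_BUF of 4096 bytes and a hypothetical 3x compression
    ratio, and it ignores the framing bytes that each burst carries, so
    the real usable payload per burst is somewhat smaller. The helper
    bursts_needed is illustrative and not part of Atomic::Pipe.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Assumption: 4096 is the common Linux PIPE_BUF. Framing overhead per
# burst is ignored here, so these counts are a lower bound.
my $PIPE_BUF = 4096;

# Number of atomic-sized writes needed to carry a message of this size.
sub bursts_needed {
    my ($message_bytes, $chunk_bytes) = @_;
    return 1 if $message_bytes <= $chunk_bytes;
    return ceil($message_bytes / $chunk_bytes);
}

my $raw_bytes  = 10_000;               # a payload well above PIPE_BUF
my $compressed = int($raw_bytes / 3);  # hypothetical 3x compression ratio

printf "raw:        %5d bytes -> %d burst(s)\n",
    $raw_bytes, bursts_needed($raw_bytes, $PIPE_BUF);
printf "compressed: %5d bytes -> %d burst(s)\n",
    $compressed, bursts_needed($compressed, $PIPE_BUF);
```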

  Benchmark: streaming JSON objects

    Numbers below are from bench/zstd_compression.pl in the distribution.
    The workload is a synthetic but representative stream of JSON log/event
    objects sent in mixed-data mode via write_message. The corpus is
    generated once and reused across all runs; sizes are JSON-encoded byte
    counts.

    Two corpora were measured:

    Small JSON (10 MB total, 11785 objects)

      Object sizes 181 .. 1977 bytes, average ~890 B; ~37% of objects under
      500 B. Most messages fit in a single PIPE_BUF burst regardless of
      compression.

        level     raw MB/s   wire MB    ratio   saved
        plain         9.74    10.00       -        -
        L-3          15.98     6.68    1.50x    33.2%
        L1           24.55     4.92    2.03x    50.8%
        L3 (def)     27.79     4.91    2.04x    50.9%
        L5           46.34     4.87    2.05x    51.3%
        L7           63.72     4.87    2.05x    51.3%
        L12          27.02     4.85    2.06x    51.5%
        L22          14.43     4.84    2.07x    51.6%

      For this size distribution, every measured level, from L-3 through
      L22, is faster than no compression (pipe back-pressure on the
      uncompressed run still dominates).

    Larger JSON (100 MB total, 20407 objects)

      Object sizes 187 .. 10000 bytes, average ~5.1 KB, evenly distributed
      across the 1..10 KB range. Most objects exceed PIPE_BUF, so the
      uncompressed path pays the multi-part fragmentation cost on the
      majority of messages.

        level     raw MB/s   wire MB    ratio   saved
        plain         0.29   100.00       -        -
        L-3         287.85    35.61    2.81x    64.4%
        L-1         273.56    33.92    2.95x    66.1%
        L1          237.04    30.56    3.27x    69.4%
        L3 (def)    207.61    30.25    3.31x    69.7%
        L5          113.02    30.01    3.33x    70.0%
        L9           39.35    29.93    3.34x    70.1%
        L18           7.81    28.14    3.55x    71.9%
        L22           7.85    28.14    3.55x    71.9%

      Here the uncompressed run collapses to ~0.29 MB/s, while levels up to
      the default (L3) achieve 200+ MB/s -- a roughly 700x to 1000x
      throughput improvement driven almost entirely by avoided
      fragmentation. Levels above ~5 trade significant CPU for negligible
      additional ratio.
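
      The ratio and saved columns, and the throughput gain relative to the
      plain run, can be recomputed from the table rows. The sketch below
      does so for three rows copied from the larger-corpus table; the
      helper name summarize is hypothetical and the script is not part of
      the distribution's benchmark.

```perl
use strict;
use warnings;

# Figures copied from the larger-corpus table above.
my $raw_mb     = 100.00;   # uncompressed corpus size in MB
my $plain_mbps = 0.29;     # throughput of the uncompressed run
my @rows = (
    # label       raw MB/s  wire MB
    ['L-3',       287.85,   35.61],
    ['L3 (def)',  207.61,   30.25],
    ['L22',         7.85,   28.14],
);

# Returns (compression ratio, percent saved, throughput relative to plain).
sub summarize {
    my ($raw_mb, $plain_mbps, $mbps, $wire_mb) = @_;
    return (
        $raw_mb / $wire_mb,
        100 * (1 - $wire_mb / $raw_mb),
        $mbps / $plain_mbps,
    );
}

for my $row (@rows) {
    my ($label, $mbps, $wire_mb) = @$row;
    my ($ratio, $saved, $gain) = summarize($raw_mb, $plain_mbps, $mbps, $wire_mb);
    printf "%-9s ratio %.2fx  saved %.1f%%  vs plain %.0fx\n",
        $label, $ratio, $saved, $gain;
}
```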

    Pipe buffer size has minimal impact

      The same 100 MB corpus, holding mode constant and varying the kernel
      pipe buffer (32 KB, 128 KB, 512 KB, 1 MB), shows almost no movement
      in either direction. The bottleneck is PIPE_BUF-aligned framing, not
      buffer fill, so calling "resize" with a larger size will not rescue
      an uncompressed large-message workload.