Compress-Zstd

 view release on metacpan or  search on metacpan

ext/zstd/doc/zstd_compression_format.md  view on Meta::CPAN

| 1 byte                    | 0-1 byte              | 0-4 bytes         | 0-8 bytes              |

#### `Frame_Header_Descriptor`

The first header's byte is called the `Frame_Header_Descriptor`.
It describes which other fields are present.
Decoding this byte is enough to tell the size of `Frame_Header`.

| Bit number | Field name                |
| ---------- | ----------                |
| 7-6        | `Frame_Content_Size_flag` |
| 5          | `Single_Segment_flag`     |
| 4          | `Unused_bit`              |
| 3          | `Reserved_bit`            |
| 2          | `Content_Checksum_flag`   |
| 1-0        | `Dictionary_ID_flag`      |

In this table, bit 7 is the highest bit, while bit 0 is the lowest one.

__`Frame_Content_Size_flag`__

This is a 2-bits flag (`= Frame_Header_Descriptor >> 6`),
specifying if `Frame_Content_Size` (the decompressed data size)
is provided within the header.
`Flag_Value` provides `FCS_Field_Size`,
which is the number of bytes used by `Frame_Content_Size`
according to the following table:

|  `Flag_Value`  |    0   |  1  |  2  |  3  |
| -------------- | ------ | --- | --- | --- |
|`FCS_Field_Size`| 0 or 1 |  2  |  4  |  8  |

When `Flag_Value` is `0`, `FCS_Field_Size` depends on `Single_Segment_flag` :
if `Single_Segment_flag` is set, `FCS_Field_Size` is 1.
Otherwise, `FCS_Field_Size` is 0 : `Frame_Content_Size` is not provided.

__`Single_Segment_flag`__

If this flag is set,
data must be regenerated within a single continuous memory segment.

In this case, `Window_Descriptor` byte is skipped,
but `Frame_Content_Size` is necessarily present.
As a consequence, the decoder must allocate a memory segment
of size equal or larger than `Frame_Content_Size`.

In order to preserve the decoder from unreasonable memory requirements,
a decoder is allowed to reject a compressed frame
which requests a memory size beyond decoder's authorized range.

For broader compatibility, decoders are recommended to support
memory sizes of at least 8 MB.
This is only a recommendation,
each decoder is free to support higher or lower limits,
depending on local limitations.

__`Unused_bit`__

A decoder compliant with this specification version shall not interpret this bit.
It might be used in any future version,
to signal a property which is transparent to properly decode the frame.
An encoder compliant with this specification version must set this bit to zero.

__`Reserved_bit`__

This bit is reserved for some future feature.
Its value _must be zero_.
A decoder compliant with this specification version must ensure it is not set.
This bit may be used in a future revision,
to signal a feature that must be interpreted to decode the frame correctly.

__`Content_Checksum_flag`__

If this flag is set, a 32-bits `Content_Checksum` will be present at frame's end.
See `Content_Checksum` paragraph.

__`Dictionary_ID_flag`__

This is a 2-bits flag (`= FHD & 3`),
telling if a dictionary ID is provided within the header.
It also specifies the size of this field as `DID_Field_Size`.

|`Flag_Value`    |  0  |  1  |  2  |  3  |
| -------------- | --- | --- | --- | --- |
|`DID_Field_Size`|  0  |  1  |  2  |  4  |

#### `Window_Descriptor`

Provides guarantees on minimum memory buffer required to decompress a frame.
This information is important for decoders to allocate enough memory.

The `Window_Descriptor` byte is optional.
When `Single_Segment_flag` is set, `Window_Descriptor` is not present.
In this case, `Window_Size` is `Frame_Content_Size`,
which can be any value from 0 to 2^64-1 bytes (16 ExaBytes).

| Bit numbers |     7-3    |     2-0    |
| ----------- | ---------- | ---------- |
| Field name  | `Exponent` | `Mantissa` |

The minimum memory buffer size is called `Window_Size`.
It is described by the following formulas :
```
windowLog = 10 + Exponent;
windowBase = 1 << windowLog;
windowAdd = (windowBase / 8) * Mantissa;
Window_Size = windowBase + windowAdd;
```
The minimum `Window_Size` is 1 KB.
The maximum `Window_Size` is `(1<<41) + 7*(1<<38)` bytes, which is 3.75 TB.

In general, larger `Window_Size` tend to improve compression ratio,
but at the cost of memory usage.

To properly decode compressed data,
a decoder will need to allocate a buffer of at least `Window_Size` bytes.

In order to preserve decoder from unreasonable memory requirements,
a decoder is allowed to reject a compressed frame
which requests a memory size beyond decoder's authorized range.



( run in 2.750 seconds using v1.01-cache-2.11-cpan-39bf76dae61 )