CBOR-XS
TODO: pack_keys?
TODO: document encode_cbor_sharing?
TODO: large negative integers
TODO: type cast tests.
TODO: round-tripping of types, such as float16 - maybe types::Serialiser support?
TODO: possibly implement https://peteroupc.github.io/CBOR/extended.html, but NaNs are nonportable. rely on libecb?
TODO: https://github.com/svaarala/cbor-specs/blob/master/cbor-nonutf8-string-tags.rst, but maybe that is overkill?
1.87 Fri 08 Sep 2023 22:14:18 CEST
- shared references were not decoded correctly: instead of getting
multiple references to the same object, you got the same
reference to the same object, causing a number of issues. For
example, modifying the reference would modify all places the
reference was used, and encoding the decoded structure would
unshare the previously shared hashes, as their reference count
would be 1. Fixing this was rather involved, as perl lacks the
ability to easily swap or copy arrays and hashes.
- \0, \1, \undef do not work, and were not intended to ever work, as
special values, despite being mentioned in the documentation (reported
by nuclightq).
- new feature: allow_weak_cycles.
1.86 Thu 04 Nov 2021 17:48:16 CET
- fixed a wrong printf format specifier (reported by Petr Písař).
$coder = CBOR::XS->new;
$binary_cbor_data = $coder->encode ($perl_value);
$perl_value = $coder->decode ($binary_cbor_data);
# prefix decoding
my $many_cbor_strings = ...;
while (length $many_cbor_strings) {
  my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
  # data was decoded
  substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
}
DESCRIPTION
This module converts Perl data structures to the Concise Binary Object
Representation (CBOR) and vice versa. CBOR is a fast binary
serialisation format that aims to use an (almost) superset of the JSON
data model, i.e. when you can represent something useful in JSON, you
should be able to represent it in CBOR.
In short, CBOR is a faster and quite compact binary alternative to JSON,
$enabled = $cbor->get_allow_sharing
If $enable is true (or missing), then "encode" will not
double-encode values that have been referenced before (e.g. when the
same object, such as an array, is referenced multiple times), but
instead will emit a reference to the earlier value.
This means that such values will only be encoded once, and will not
result in a deep cloning of the value on decode, in decoders
supporting the value sharing extension. This also makes it possible
to encode cyclic data structures (which need "allow_cycles" to be
enabled to be decoded by this module).
It is recommended to leave it off unless you know your communication
partner supports the value sharing extensions to CBOR
(<http://cbor.schmorp.de/value-sharing>), as without decoder
support, the resulting data structure might be unusable.
Detecting shared values incurs a runtime overhead when values are
encoded that have a reference counter larger than one, and might
unnecessarily increase the encoded size, as potentially shared
values are encoded as shareable whether or not they are actually
shared. At the moment, only targets of references can be shared (e.g.
scalars, arrays or hashes pointed to by a reference). Weirder
constructs, such as an array with multiple "copies" of the *same*
string, which are hard but not impossible to create in Perl, are not
supported (this is the same as with Storable).
If $enable is false (the default), then "encode" will encode shared
data structures repeatedly, unsharing them in the process. Cyclic
data structures cannot be encoded in this mode.
This option does not affect "decode" in any way - shared values and
references will always be decoded properly if present.
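The difference between the two modes can be sketched as follows. This is a minimal illustration that assumes CBOR::XS is installed. The data and variable names are made up for the example.

```perl
use strict;
use warnings;
use CBOR::XS;

# one array referenced from two places (illustrative data)
my $shared = [1, 2, 3];
my $data   = { a => $shared, b => $shared };

my $plain   = CBOR::XS->new;                 # allow_sharing off (the default)
my $sharing = CBOR::XS->new->allow_sharing;  # emit value-sharing tags

my $unshared_copy = $plain->decode ($plain->encode ($data));
my $shared_copy   = $sharing->decode ($sharing->encode ($data));

# comparing references numerically compares the addresses they point to
my $plain_is_shared   = $unshared_copy->{a} == $unshared_copy->{b} ? 1 : 0;
my $sharing_is_shared = $shared_copy->{a}   == $shared_copy->{b}   ? 1 : 0;

print "default:       ", ($plain_is_shared   ? "shared" : "unshared"), "\n";
print "allow_sharing: ", ($sharing_is_shared ? "shared" : "unshared"), "\n";
```

With the default coder the two hash values come back as independent arrays, while the coder with "allow_sharing" enabled yields two references to one array.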
$cbor = $cbor->allow_cycles ([$enable])
$enabled = $cbor->get_allow_cycles
If $enable is true (or missing), then "decode" will happily decode
self-referential (cyclic) data structures. By default these will not
be decoded, as they need manual cleanup to avoid memory leaks, so
code that isn't prepared for this will not leak memory.
If $enable is false (the default), then "decode" will throw an error
when it encounters a self-referential/cyclic data structure.
This option does not affect "encode" in any way - shared values and
references will always be encoded properly if present.
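A short sketch of a cyclic round trip, assuming CBOR::XS is installed. The structure is invented for the example. Encoding the cycle relies on "allow_sharing", decoding it relies on "allow_cycles", and the cleanup at the end is the manual step mentioned above.

```perl
use strict;
use warnings;
use CBOR::XS;

# a hash that refers to itself (illustrative data)
my $node = { name => "root" };
$node->{self} = $node;

# encoding the cycle needs allow_sharing, decoding it needs allow_cycles
my $cbor  = CBOR::XS->new->allow_sharing->allow_cycles;
my $bytes = $cbor->encode ($node);
my $copy  = $cbor->decode ($bytes);

# true when the decoded hash points back at itself
my $is_cyclic = $copy->{self} == $copy ? 1 : 0;

# break both cycles by hand, otherwise perl's reference counting
# never frees these hashes
delete $copy->{self};
delete $node->{self};
```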
$cbor = $cbor->allow_weak_cycles ([$enable])
$enabled = $cbor->get_allow_weak_cycles
It is recommended to leave it off unless you know your
communications partner supports the stringref extension to CBOR
(<http://cbor.schmorp.de/stringref>), as without decoder support,
the resulting data structure might not be usable.
If $enable is false (the default), then "encode" will encode strings
the standard CBOR way.
This option does not affect "decode" in any way - string references
will always be decoded properly if present.
$cbor = $cbor->text_keys ([$enable])
$enabled = $cbor->get_text_keys
If $enable is true (or missing), then "encode" will encode all perl
hash keys as CBOR text strings (UTF-8 strings), upgrading them as
needed.
If $enable is false (the default), then "encode" will encode hash
keys normally - upgraded perl strings (strings internally encoded as
UTF-8) as CBOR text strings, and downgraded perl strings as CBOR
supposedly valid UTF-8 will simply be dumped into the resulting CBOR
string without checking whether that is, in fact, true or not.
$cbor = $cbor->filter ([$cb->($tag, $value)])
$cb_or_undef = $cbor->get_filter
Sets or replaces the tagged value decoding filter (when $cb is
specified) or clears the filter (if no argument or "undef" is
provided).
The filter callback is called only during decoding, when a
non-enforced tagged value has been decoded (see "TAG HANDLING AND
EXTENSIONS" for a list of enforced tags). For specific tags, it's
often better to provide a default converter using the
%CBOR::XS::FILTER hash (see below).
The first argument is the numerical tag, the second is the (decoded)
value that has been tagged.
The filter function should return either exactly one value, which
will replace the tagged value in the decoded data structure, or no
values, which will result in default handling, which currently means
the decoder creates a "CBOR::XS::Tagged" object to hold the tag and
the value.
When the filter is cleared (the default state), the default filter
function, "CBOR::XS::default_filter", is used. This function simply
looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists
it must be a code reference that is called with tag and value, and
is responsible for decoding the value. If no entry exists, it
returns no values. "CBOR::XS" provides a number of default filter
functions already, and the %CBOR::XS::FILTER hash can be freely
extended with more.
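The two return conventions of a filter callback can be illustrated with a small sketch. It assumes CBOR::XS is installed. The tag numbers are arbitrary values picked for the example, and "CBOR::XS::tag" is used only to produce tagged input.

```perl
use strict;
use warnings;
use CBOR::XS;

my $cbor = CBOR::XS->new;

# tag numbers below are arbitrary values picked for this example
$cbor->filter (sub {
   my ($tag, $value) = @_;

   return uc $value if $tag == 1234567;  # exactly one value: replaces the tagged value
   return;                               # no values: default handling
});

my $bytes = $cbor->encode ([
   CBOR::XS::tag (1234567, "hello"),
   CBOR::XS::tag (7654321, "world"),
]);

my $decoded = $cbor->decode ($bytes);

print $decoded->[0], "\n";          # the filter's return value
print ref ($decoded->[1]), "\n";    # class of the default-handled value
```

The second element falls through to default handling, so it arrives as a "CBOR::XS::Tagged" object holding the tag and the value.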
A typical use case would be a network protocol that consists of sending
and receiving CBOR-encoded messages. The solution that works with CBOR
and just about anything else is to prepend a length to every CBOR value,
so the receiver knows how many octets to read. More compact (and
slightly slower) would be to just send CBOR values back-to-back, as
"CBOR::XS" knows where a CBOR value ends, and doesn't need an explicit
length.
The following methods help with this:
@decoded = $cbor->incr_parse ($buffer)
This method attempts to decode exactly one CBOR value from the
beginning of the given $buffer. The value is removed from the
$buffer on success. When $buffer doesn't contain a complete value
yet, it returns nothing. Finally, when the $buffer doesn't start
with something that could ever be a valid CBOR value, it raises an
exception, just as "decode" would. In the latter case the decoder
state is undefined and must be reset before being able to parse
further.
This method modifies the $buffer in place. When no CBOR value can be
decoded, the decoder stores the current string offset. On the next
call, it continues decoding at the place where it stopped before. For
this to make sense, the $buffer must begin with the same octets as
on previous unsuccessful calls.
You can call this method in scalar context, in which case it either
returns a decoded value or "undef". This makes it impossible to
distinguish between CBOR null values (which decode to "undef") and
an unsuccessful decode, which is often acceptable.
@decoded = $cbor->incr_parse_multiple ($buffer)
Same as "incr_parse", but attempts to decode as many CBOR values as
possible in one go, instead of at most one. Calls to "incr_parse"
and "incr_parse_multiple" can be interleaved.
$cbor->incr_reset
Resets the incremental decoder. This throws away any saved state, so
that subsequent calls to "incr_parse" or "incr_parse_multiple" start
to parse a new CBOR value from the beginning of the $buffer again.
This method can be called at any time, but it *must* be called if
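The incremental methods can be exercised without a real socket. The sketch below assumes CBOR::XS is installed and imitates partial network reads by feeding a pre-encoded stream in small chunks. The chunk size and the sample values are arbitrary choices for the example.

```perl
use strict;
use warnings;
use CBOR::XS;

my $cbor = CBOR::XS->new;

# three CBOR values sent back-to-back, without length prefixes
my $stream = "";
$stream .= $cbor->encode ($_) for [1, 2], { k => "v" }, "tail";

my $buffer = "";
my @messages;

# feed the stream in 3-octet chunks to imitate partial network reads
while (length $stream) {
   $buffer .= substr $stream, 0, 3, "";
   push @messages, $cbor->incr_parse_multiple ($buffer);
}

printf "%d messages decoded, %d octets left over\n",
   scalar @messages, length $buffer;
```

New octets are only ever appended to $buffer, so it always begins with the same octets as on the previous unsuccessful call, as required above.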
CBOR -> PERL
integers
CBOR integers become (numeric) perl scalars. On perls without 64 bit
support, 64 bit integers will be truncated or otherwise corrupted.
byte strings
Byte strings will become octet strings in Perl (the byte values
0..255 will simply become characters of the same value in Perl).
UTF-8 strings
UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
decoded into proper Unicode code points. At the moment, the validity
of the UTF-8 octets is not checked - corrupt input will
result in corrupted Perl strings.
arrays, maps
CBOR arrays and CBOR maps will be converted into references to a
Perl array or hash, respectively. The keys of the map will be
stringified during this process.
null
CBOR null becomes "undef" in Perl.
These methods *MUST NOT* change the data structure that is being
serialised. Failure to comply with this can result in memory corruption -
and worse.
If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail
with an error.
DECODING
Objects encoded via "TO_CBOR" cannot (normally) be automatically
decoded, but objects encoded via "FREEZE" can be decoded using the
following protocol:
When an encoded CBOR perl object is encountered by the decoder, it will
look up the "THAW" method, by using the stored classname, and will fail
if the method cannot be found.
After the lookup it will call the "THAW" method with the stored
classname as first argument, the constant string "CBOR" as second
argument, and all values returned by "FREEZE" as remaining arguments.
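A minimal round trip through this protocol might look like the sketch below. It assumes CBOR::XS is installed. "My::Point" is a made-up class, and its "FREEZE" method is written on the assumption that the encoder passes the string "CBOR" as its second argument, mirroring the "THAW" call described above.

```perl
use strict;
use warnings;
use CBOR::XS;

# My::Point is a made-up class for this example
package My::Point {
   sub new {
      my ($class, $x, $y) = @_;
      bless { x => $x, y => $y }, $class
   }

   # assumed to be called by the encoder as $object->FREEZE ("CBOR")
   sub FREEZE {
      my ($self, $serialiser) = @_;
      ($self->{x}, $self->{y})
   }

   # called by the decoder as My::Point->THAW ("CBOR", values from FREEZE)
   sub THAW {
      my ($class, $serialiser, $x, $y) = @_;
      $class->new ($x, $y)
   }
}

my $cbor  = CBOR::XS->new;
my $point = My::Point->new (3, 4);
my $copy  = $cbor->decode ($cbor->encode ($point));

print ref ($copy), " at ($copy->{x}, $copy->{y})\n";
```

The decoded value is a new object built by "THAW", not the original one.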
TAG HANDLING AND EXTENSIONS
This section describes how this module handles specific tagged values
and extensions. If a tag is not mentioned here and no additional filters
are provided for it, then the default handling applies (creating a
CBOR::XS::Tagged object on decoding, and only encoding the tag when
explicitly requested).
Tags not handled specifically are currently converted into a
CBOR::XS::Tagged object, which is simply a blessed array reference
consisting of the numeric tag value followed by the (decoded) CBOR
value.
Future versions of this module reserve the right to special case
additional tags (such as base64url).
ENFORCED TAGS
These tags are always handled when decoding, and their handling cannot
be overridden by the user.
26 (perl-object, <http://cbor.schmorp.de/perl-object>)
This tag is automatically created (and decoded) for serialisable
objects using the "FREEZE/THAW" methods (the Types::Serialiser object
serialisation protocol). See "OBJECT SERIALISATION" for details.
28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>)
These tags are automatically decoded when encountered (and they do
not result in a cyclic data structure, see "allow_cycles"),
resulting in shared values in the decoded object. They are only
encoded, however, when "allow_sharing" is enabled.
Not all shared values can be successfully decoded: values that
reference themselves will *currently* decode as "undef" (this is not
the same as a reference pointing to itself, which will be
represented as a value that contains an indirect reference to itself
- these will be decoded properly).
Note that considerably more shared value data structures can be
decoded than will be encoded - currently, only values pointed to by
references will be shared, others will not. While non-reference
shared values can be generated in Perl with some effort, they were
considered too unimportant to be supported in the encoder. The
decoder, however, will decode these values as shared values.
256, 25 (stringref-namespace, stringref,
<http://cbor.schmorp.de/stringref>)
These tags are automatically decoded when encountered. They are only
encoded, however, when "pack_strings" is enabled.
22098 (indirection, <http://cbor.schmorp.de/indirection>)
This tag is automatically generated when a reference is encountered
(with the exception of hash and array references). It is converted
to a reference when decoding.
55799 (self-describe CBOR, RFC 7049)
This value is not generated on encoding (unless explicitly requested
by the user), and is simply ignored when decoding.
When they result in decoding into a specific Perl class, the module
usually provides a corresponding "TO_CBOR" method as well.
When any of these need to load additional modules that are not part of
the perl core distribution (e.g. URI), it is (currently) up to the user
to provide these modules. The decoding usually fails with an exception
if the required module cannot be loaded.
0, 1 (date/time string, seconds since the epoch)
These tags are decoded into Time::Piece objects. The corresponding
"Time::Piece::TO_CBOR" method always encodes into tag 1 values
currently.
The Time::Piece API is generally surprisingly bad, and fractional
seconds are only accidentally kept intact, so watch out. On the plus
side, the module comes with perl since 5.10, which has to count for
something.
2, 3 (positive/negative bignum)
These tags are decoded into Math::BigInt objects. The corresponding
"Math::BigInt::TO_CBOR" method encodes "small" bigints into normal
CBOR integers, and others into positive/negative CBOR bignums.
4, 5, 264, 265 (decimal fraction/bigfloat)
Both decimal fractions and bigfloats are decoded into Math::BigFloat
objects. The corresponding "Math::BigFloat::TO_CBOR" method *always*
encodes into a decimal fraction (either tag 4 or 264).
NaN and infinities are not encoded properly, as they cannot be
represented in CBOR.
See "BIGNUM SECURITY CONSIDERATIONS" for more info.
30 (rational numbers)
This tag is decoded into Math::BigRat objects. The corresponding
"Math::BigRat::TO_CBOR" method encodes rational numbers with
denominator 1 via their numerator only, i.e., they become normal
integers or "bignums".
See "BIGNUM SECURITY CONSIDERATIONS" for more info.
21, 22, 23 (expected later JSON conversion)
CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore
these tags.
CBOR and JSON
CBOR is supposed to implement a superset of the JSON data model, and is,
with some coercion, able to represent all JSON texts (something that
other "binary JSON" formats such as BSON generally do not support).
CBOR implements some extra hints and support for JSON interoperability,
and the spec offers further guidance for conversion between CBOR and
JSON. None of this is currently implemented in CBOR::XS, and the guidelines
in the spec do not result in correct round-tripping of data. If JSON
interoperability is improved in the future, then the goal will be to
ensure that decoded JSON data will round-trip encoding and decoding to
CBOR intact.
SECURITY CONSIDERATIONS
Tl;dr... if you want to decode or encode CBOR from untrusted sources,
you should start with a coder object created via "new_safe" (which
implements the mitigations explained below):
my $coder = CBOR::XS->new_safe;
my $data = $coder->decode ($cbor_text);
even without bigints.
Disabling bigints will also partially or fully disable types that rely
on them, e.g. rational numbers that use bignums.
CBOR IMPLEMENTATION NOTES
This section contains some random implementation notes. They do not
describe guaranteed behaviour, but merely behaviour as implemented
right now.
64 bit integers are only properly decoded when Perl was built with 64
bit support.
Strings and arrays are encoded with a definite length. Hashes as well,
unless they are tied (or otherwise magical).
Only the double data type is supported for NV data types - when Perl
uses long double to represent floating point values, they might not be
encoded properly. Half precision types are accepted, but not encoded.
Strict mode and canonical mode are not implemented.
LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
On perls that were built without 64 bit integer support (these are rare
nowadays, even on 32 bit architectures, as all major Perl distributions
are built with 64 bit integer support), support for any kind of 64 bit
value in CBOR is very limited - most likely, these 64 bit values will be
truncated, corrupted, or otherwise not decoded correctly. This also
includes string, float, array and map sizes that are stored as 64 bit
integers.
THREADS
This module is *not* guaranteed to be thread safe and there are no plans
to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's *much* faster, cheaper, better).
(It might actually work, but you have been warned).
$coder = CBOR::XS->new;
$binary_cbor_data = $coder->encode ($perl_value);
$perl_value = $coder->decode ($binary_cbor_data);
# prefix decoding
my $many_cbor_strings = ...;
while (length $many_cbor_strings) {
  my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
  # data was decoded
  substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
}
=head1 DESCRIPTION
This module converts Perl data structures to the Concise Binary Object
Representation (CBOR) and vice versa. CBOR is a fast binary serialisation
format that aims to use an (almost) superset of the JSON data model, i.e.
when you can represent something useful in JSON, you should be able to
represent it in CBOR.
=item $enabled = $cbor->get_allow_sharing
If C<$enable> is true (or missing), then C<encode> will not double-encode
values that have been referenced before (e.g. when the same object, such
as an array, is referenced multiple times), but instead will emit a
reference to the earlier value.
This means that such values will only be encoded once, and will not result
in a deep cloning of the value on decode, in decoders supporting the value
sharing extension. This also makes it possible to encode cyclic data
structures (which need C<allow_cycles> to be enabled to be decoded by this
module).
It is recommended to leave it off unless you know your
communication partner supports the value sharing extensions to CBOR
(L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the
resulting data structure might be unusable.
Detecting shared values incurs a runtime overhead when values are encoded
that have a reference counter larger than one, and might unnecessarily
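increase the encoded size, as potentially shared values are encoded as
shareable whether or not they are actually shared. At the moment, only
targets of references can be shared (e.g. scalars, arrays or hashes
pointed to by a reference). Weirder constructs, such as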
an array with multiple "copies" of the I<same> string, which are hard but
not impossible to create in Perl, are not supported (this is the same as
with L<Storable>).
If C<$enable> is false (the default), then C<encode> will encode shared
data structures repeatedly, unsharing them in the process. Cyclic data
structures cannot be encoded in this mode.
This option does not affect C<decode> in any way - shared values and
references will always be decoded properly if present.
=item $cbor = $cbor->allow_cycles ([$enable])
=item $enabled = $cbor->get_allow_cycles
If C<$enable> is true (or missing), then C<decode> will happily decode
self-referential (cyclic) data structures. By default these will not be
decoded, as they need manual cleanup to avoid memory leaks, so code that
isn't prepared for this will not leak memory.
If C<$enable> is false (the default), then C<decode> will throw an error
when it encounters a self-referential/cyclic data structure.
This option does not affect C<encode> in any way - shared values and
references will always be encoded properly if present.
=item $cbor = $cbor->allow_weak_cycles ([$enable])
It is recommended to leave it off unless you know your
communications partner supports the stringref extension to CBOR
(L<http://cbor.schmorp.de/stringref>), as without decoder support, the
resulting data structure might not be usable.
If C<$enable> is false (the default), then C<encode> will encode strings
the standard CBOR way.
This option does not affect C<decode> in any way - string references will
always be decoded properly if present.
=item $cbor = $cbor->text_keys ([$enable])
=item $enabled = $cbor->get_text_keys
If C<$enable> is true (or missing), then C<encode> will encode all
perl hash keys as CBOR text strings (UTF-8 strings), upgrading them as needed.
If C<$enable> is false (the default), then C<encode> will encode hash keys
normally - upgraded perl strings (strings internally encoded as UTF-8) as
string without checking whether that is, in fact, true or not.
=item $cbor = $cbor->filter ([$cb->($tag, $value)])
=item $cb_or_undef = $cbor->get_filter
Sets or replaces the tagged value decoding filter (when C<$cb> is
specified) or clears the filter (if no argument or C<undef> is provided).
The filter callback is called only during decoding, when a non-enforced
tagged value has been decoded (see L<TAG HANDLING AND EXTENSIONS> for a
list of enforced tags). For specific tags, it's often better to provide a
default converter using the C<%CBOR::XS::FILTER> hash (see below).
The first argument is the numerical tag, the second is the (decoded) value
that has been tagged.
The filter function should return either exactly one value, which will
replace the tagged value in the decoded data structure, or no values,
which will result in default handling, which currently means the decoder
creates a C<CBOR::XS::Tagged> object to hold the tag and the value.
When the filter is cleared (the default state), the default filter
function, C<CBOR::XS::default_filter>, is used. This function simply
looks up the tag in the C<%CBOR::XS::FILTER> hash. If an entry exists
it must be a code reference that is called with tag and value, and is
responsible for decoding the value. If no entry exists, it returns no
values. C<CBOR::XS> provides a number of default filter functions already,
and the C<%CBOR::XS::FILTER> hash can be freely extended with more.
and receiving CBOR-encoded messages. The solution that works with CBOR and
just about anything else is to prepend a length to every CBOR value, so the
receiver knows how many octets to read. More compact (and slightly slower)
would be to just send CBOR values back-to-back, as C<CBOR::XS> knows where
a CBOR value ends, and doesn't need an explicit length.
The following methods help with this:
=over 4
=item @decoded = $cbor->incr_parse ($buffer)
This method attempts to decode exactly one CBOR value from the beginning
of the given C<$buffer>. The value is removed from the C<$buffer> on
success. When C<$buffer> doesn't contain a complete value yet, it returns
nothing. Finally, when the C<$buffer> doesn't start with something
that could ever be a valid CBOR value, it raises an exception, just as
C<decode> would. In the latter case the decoder state is undefined and
must be reset before being able to parse further.
This method modifies the C<$buffer> in place. When no CBOR value can be
decoded, the decoder stores the current string offset. On the next call,
it continues decoding at the place where it stopped before. For this to make
sense, the C<$buffer> must begin with the same octets as on previous
unsuccessful calls.
You can call this method in scalar context, in which case it either
returns a decoded value or C<undef>. This makes it impossible to
distinguish between CBOR null values (which decode to C<undef>) and an
unsuccessful decode, which is often acceptable.
=item @decoded = $cbor->incr_parse_multiple ($buffer)
Same as C<incr_parse>, but attempts to decode as many CBOR values as
possible in one go, instead of at most one. Calls to C<incr_parse> and
C<incr_parse_multiple> can be interleaved.
=item $cbor->incr_reset
Resets the incremental decoder. This throws away any saved state, so that
subsequent calls to C<incr_parse> or C<incr_parse_multiple> start to parse
a new CBOR value from the beginning of the C<$buffer> again.
CBOR integers become (numeric) perl scalars. On perls without 64 bit
support, 64 bit integers will be truncated or otherwise corrupted.
=item byte strings
Byte strings will become octet strings in Perl (the byte values 0..255
will simply become characters of the same value in Perl).
=item UTF-8 strings
UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
decoded into proper Unicode code points. At the moment, the validity of
the UTF-8 octets is not checked - corrupt input will result in
corrupted Perl strings.
=item arrays, maps
CBOR arrays and CBOR maps will be converted into references to a Perl
array or hash, respectively. The keys of the map will be stringified
during this process.
=item null
These methods I<MUST NOT> change the data structure that is being
serialised. Failure to comply with this can result in memory corruption -
and worse.
If an object supports neither C<TO_CBOR> nor C<FREEZE>, encoding will fail
with an error.
=head3 DECODING
Objects encoded via C<TO_CBOR> cannot (normally) be automatically decoded,
but objects encoded via C<FREEZE> can be decoded using the following
protocol:
When an encoded CBOR perl object is encountered by the decoder, it will
look up the C<THAW> method, by using the stored classname, and will fail
if the method cannot be found.
After the lookup it will call the C<THAW> method with the stored classname
as first argument, the constant string C<CBOR> as second argument, and all
values returned by C<FREEZE> as remaining arguments.
=head1 TAG HANDLING AND EXTENSIONS
This section describes how this module handles specific tagged values
and extensions. If a tag is not mentioned here and no additional filters
are provided for it, then the default handling applies (creating a
CBOR::XS::Tagged object on decoding, and only encoding the tag when
explicitly requested).
Tags not handled specifically are currently converted into a
L<CBOR::XS::Tagged> object, which is simply a blessed array reference
consisting of the numeric tag value followed by the (decoded) CBOR value.
Future versions of this module reserve the right to special case
additional tags (such as base64url).
=head2 ENFORCED TAGS
These tags are always handled when decoding, and their handling cannot be
overridden by the user.
=over 4
=item 26 (perl-object, L<http://cbor.schmorp.de/perl-object>)
This tag is automatically created (and decoded) for serialisable
objects using the C<FREEZE/THAW> methods (the L<Types::Serialiser> object
serialisation protocol). See L<OBJECT SERIALISATION> for details.
=item 28, 29 (shareable, sharedref, L<http://cbor.schmorp.de/value-sharing>)
These tags are automatically decoded when encountered (and they do not
result in a cyclic data structure, see C<allow_cycles>), resulting in
shared values in the decoded object. They are only encoded, however, when
C<allow_sharing> is enabled.
Not all shared values can be successfully decoded: values that reference
themselves will I<currently> decode as C<undef> (this is not the same
as a reference pointing to itself, which will be represented as a value
that contains an indirect reference to itself - these will be decoded
properly).
Note that considerably more shared value data structures can be decoded
than will be encoded - currently, only values pointed to by references
will be shared, others will not. While non-reference shared values can be
generated in Perl with some effort, they were considered too unimportant
to be supported in the encoder. The decoder, however, will decode these
values as shared values.
=item 256, 25 (stringref-namespace, stringref, L<http://cbor.schmorp.de/stringref>)
These tags are automatically decoded when encountered. They are only
encoded, however, when C<pack_strings> is enabled.
=item 22098 (indirection, L<http://cbor.schmorp.de/indirection>)
This tag is automatically generated when a reference is encountered (with
the exception of hash and array references). It is converted to a reference
when decoding.
=item 55799 (self-describe CBOR, RFC 7049)
When any of these need to load additional modules that are not part of the
perl core distribution (e.g. L<URI>), it is (currently) up to the user to
provide these modules. The decoding usually fails with an exception if the
required module cannot be loaded.
=over 4
=item 0, 1 (date/time string, seconds since the epoch)
These tags are decoded into L<Time::Piece> objects. The corresponding
C<Time::Piece::TO_CBOR> method always encodes into tag 1 values currently.
The L<Time::Piece> API is generally surprisingly bad, and fractional
seconds are only accidentally kept intact, so watch out. On the plus side,
the module comes with perl since 5.10, which has to count for something.
=item 2, 3 (positive/negative bignum)
These tags are decoded into L<Math::BigInt> objects. The corresponding
C<Math::BigInt::TO_CBOR> method encodes "small" bigints into normal CBOR
integers, and others into positive/negative CBOR bignums.
=item 4, 5, 264, 265 (decimal fraction/bigfloat)
Both decimal fractions and bigfloats are decoded into L<Math::BigFloat>
objects. The corresponding C<Math::BigFloat::TO_CBOR> method I<always>
encodes into a decimal fraction (either tag 4 or 264).
NaN and infinities are not encoded properly, as they cannot be represented
in CBOR.
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
=item 30 (rational numbers)
This tag is decoded into L<Math::BigRat> objects. The corresponding
C<Math::BigRat::TO_CBOR> method encodes rational numbers with denominator
C<1> via their numerator only, i.e., they become normal integers or
C<bignums>.
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
=item 21, 22, 23 (expected later JSON conversion)
CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore these
tags.
CBOR is supposed to implement a superset of the JSON data model, and is,
with some coercion, able to represent all JSON texts (something that other
"binary JSON" formats such as BSON generally do not support).
CBOR implements some extra hints and support for JSON interoperability,
and the spec offers further guidance for conversion between CBOR and
JSON. None of this is currently implemented in CBOR::XS, and the guidelines
in the spec do not result in correct round-tripping of data. If JSON
interoperability is improved in the future, then the goal will be to
ensure that decoded JSON data will round-trip encoding and decoding to
CBOR intact.
=head1 SECURITY CONSIDERATIONS
Tl;dr... if you want to decode or encode CBOR from untrusted sources, you
should start with a coder object created via C<new_safe> (which implements
the mitigations explained below):
my $coder = CBOR::XS->new_safe;
Disabling bigints will also partially or fully disable types that rely on
them, e.g. rational numbers that use bignums.
=head1 CBOR IMPLEMENTATION NOTES
This section contains some random implementation notes. They do not
describe guaranteed behaviour, but merely behaviour as implemented
right now.
64 bit integers are only properly decoded when Perl was built with 64 bit
support.
Strings and arrays are encoded with a definite length. Hashes as well,
unless they are tied (or otherwise magical).
Only the double data type is supported for NV data types - when Perl uses
long double to represent floating point values, they might not be encoded
properly. Half precision types are accepted, but not encoded.
Strict mode and canonical mode are not implemented.
=head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
On perls that were built without 64 bit integer support (these are rare
nowadays, even on 32 bit architectures, as all major Perl distributions
are built with 64 bit integer support), support for any kind of 64 bit
value in CBOR is very limited - most likely, these 64 bit values will
be truncated, corrupted, or otherwise not decoded correctly. This also
includes string, float, array and map sizes that are stored as 64 bit
integers.
=head1 THREADS
This module is I<not> guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's I<much> faster, cheaper, better).
break;
case CBOR_TAG_VALUE_SHAREABLE:
{
if (ecb_expect_false (!dec->shareable))
dec->shareable = (AV *)sv_2mortal ((SV *)newAV ());
if (ecb_expect_false (dec->cbor.flags & (F_ALLOW_CYCLES | F_ALLOW_WEAK_CYCLES)))
{
// if cycles are allowed, then we store an AV as value
// while it is being decoded, and gather unresolved
// references in it, to be resolved after decoding.
int idx, i;
AV *av = newAV ();
av_push (dec->shareable, (SV *)av);
idx = AvFILLp (dec->shareable);
sv = decode_sv (dec);
// the AV now contains \undef for all unresolved references,
// so we fix them up here.
sv = AvARRAY (dec->shareable)[idx];
// reference to cycle, we create a new \undef and use that, and also
// register it in the AV for later fixing
if (ecb_expect_false (SvTYPE (sv) == SVt_PVAV))
{
AV *av = (AV *)sv;
sv = newRV_noinc (&PL_sv_undef);
av_push (av, SvREFCNT_inc_NN (sv));
}
else if (ecb_expect_false (sv == &PL_sv_undef)) // not yet decoded, but cycles not allowed
ERR ("cyclic CBOR data structure found, but allow_cycles is not enabled");
else // we decoded the object earlier, no cycle
sv = newSVsv (sv);
}
break;
case CBOR_TAG_PERL_OBJECT:
{
if (dec->cbor.flags & F_FORBID_OBJECTS)
goto filter;
sv = decode_sv (dec);