CBOR-XS
TODO: pack_keys?
TODO: document encode_cbor_sharing?
TODO: large negative integers
TODO: type cast tests.
TODO: round-tripping of types, such as float16 - maybe types::Serialiser support?
TODO: possibly implement https://peteroupc.github.io/CBOR/extended.html, but NaNs are nonportable. rely on libecb?
TODO: https://github.com/svaarala/cbor-specs/blob/master/cbor-nonutf8-string-tags.rst, but maybe that is overkill?
1.87 Fri 08 Sep 2023 22:14:18 CEST
- shared references were not decoded correctly: instead of getting
multiple references to the same object, you got the same
reference to the same object, causing a number of issues. For
example, modifying the reference would modify all places the
reference was used, and encoding the decoded structure would
unshare the previously shared hashes, as their reference count
would be 1. Fixing this was rather involved, as perl lacks the
ability to easily swap or copy arrays and hashes.
- \0, \1, \undef do not work, and were not intended to ever work, as
special values, despite being mentioned in the documentation (reported
by nuclightq).
- new feature: allow_weak_cycles.
1.86 Thu 04 Nov 2021 17:48:16 CET
- fixed a wrong printf format specifier (reported by Petr Písař).
[...]
    $coder = CBOR::XS->new;
    $binary_cbor_data = $coder->encode ($perl_value);
    $perl_value       = $coder->decode ($binary_cbor_data);

    # prefix decoding

    my $many_cbor_strings = ...;
    while (length $many_cbor_strings) {
       my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
       # data was decoded
       substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
    }
DESCRIPTION
This module converts Perl data structures to the Concise Binary Object
Representation (CBOR) and vice versa. CBOR is a fast binary
serialisation format that aims to use an (almost) superset of the JSON
data model, i.e. when you can represent something useful in JSON, you
should be able to represent it in CBOR.
In short, CBOR is a faster and quite compact binary alternative to JSON,
[...]
$enabled = $cbor->get_allow_sharing

If $enable is true (or missing), then "encode" will not double-encode
values that have been referenced before (e.g. when the same object,
such as an array, is referenced multiple times), but instead will emit
a reference to the earlier value.

This means that such values will only be encoded once, and will not
result in a deep cloning of the value on decode, in decoders
supporting the value sharing extension. This also makes it possible
to encode cyclic data structures (which need "allow_cycles" to be
enabled to be decoded by this module).

It is recommended to leave it off unless you know your communication
partner supports the value sharing extensions to CBOR
(<http://cbor.schmorp.de/value-sharing>), as without decoder
support, the resulting data structure might be unusable.

Detecting shared values incurs a runtime overhead when values are
encoded that have a reference counter larger than one, and might
unnecessarily increase the encoded size, as potentially shared values
are encoded as shareable whether or not they are actually
[...]
scalars, arrays or hashes pointed to by a reference). Weirder
constructs, such as an array with multiple "copies" of the *same*
string, which are hard but not impossible to create in Perl, are not
supported (this is the same as with Storable).

If $enable is false (the default), then "encode" will encode shared
data structures repeatedly, unsharing them in the process. Cyclic
data structures cannot be encoded in this mode.

This option does not affect "decode" in any way - shared values and
references will always be decoded properly if present.
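The difference between the two "allow_sharing" modes can be seen in a small sketch. It only uses "new", "allow_sharing", "encode" and "decode"; the variable names are made up for illustration, and the behaviour shown is what the description above implies rather than a guarantee.

```perl
use strict;
use warnings;
use CBOR::XS;

# One array that is referenced twice from the same outer hash.
my $inner = [1, 2, 3];
my $data  = { a => $inner, b => $inner };

# Default mode: the array is written out twice, so the copies are unshared.
my $plain = CBOR::XS->new->encode ($data);

# allow_sharing: the second occurrence is emitted as a reference
# to the first one (value sharing extension, tags 28/29).
my $shared = CBOR::XS->new->allow_sharing->encode ($data);

# "decode" honours the sharing tags regardless of allow_sharing.
my $unshared_copy = CBOR::XS->new->decode ($plain);
my $shared_copy   = CBOR::XS->new->decode ($shared);

push @{ $unshared_copy->{a} }, 4;   # only affects {a}
push @{ $shared_copy->{a} },   4;   # visible through {b} as well

printf "unshared: %d elements in b\n", scalar @{ $unshared_copy->{b} };
printf "shared:   %d elements in b\n", scalar @{ $shared_copy->{b} };
```

Note that sharing does not always make the output smaller: for tiny values such as this three-element array the extra tags can outweigh the saving.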
$cbor = $cbor->allow_cycles ([$enable])
$enabled = $cbor->get_allow_cycles

If $enable is true (or missing), then "decode" will happily decode
self-referential (cyclic) data structures. By default these will not
be decoded, as they need manual cleanup to avoid memory leaks, so
code that isn't prepared for this will not leak memory.

If $enable is false (the default), then "decode" will throw an error
when it encounters a self-referential/cyclic data structure.

This option does not affect "encode" in any way - shared values and
references will always be encoded properly if present.
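A minimal sketch of the behaviour described for "allow_cycles" follows. It assumes the cyclic structure is produced with "allow_sharing" on the encoding side, and the manual cleanup at the end is just one possible way to break the cycle.

```perl
use strict;
use warnings;
use CBOR::XS;

# A hash that contains a reference back to itself.
my $node = { name => "root" };
$node->{self} = $node;

# Cyclic structures can only be encoded with allow_sharing enabled.
my $bytes = CBOR::XS->new->allow_sharing->encode ($node);

# Default decoder: the cycle is rejected with an exception.
my $default_ok = eval { CBOR::XS->new->decode ($bytes); 1 };
print "default decoder refused the cycle\n" unless $default_ok;

# With allow_cycles the cycle is reconstructed.
my $copy = CBOR::XS->new->allow_cycles->decode ($bytes);
my $is_cyclic = $copy->{self} == $copy;   # same underlying hash
print "cycle restored\n" if $is_cyclic;

# Break the cycles by hand so that both structures can be freed.
delete $copy->{self};
delete $node->{self};
```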
$cbor = $cbor->allow_weak_cycles ([$enable])
$enabled = $cbor->get_allow_weak_cycles
[...]
It is recommended to leave it off unless you know your
communications partner supports the stringref extension to CBOR
(<http://cbor.schmorp.de/stringref>), as without decoder support,
the resulting data structure might not be usable.

If $enable is false (the default), then "encode" will encode strings
the standard CBOR way.

This option does not affect "decode" in any way - string references
will always be decoded properly if present.
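The stringref encoding discussed above is switched on with "pack_strings" (the option named in the tag list for tags 256 and 25). The following sketch compares the encoded sizes for a deliberately repetitive array; the sample string is arbitrary.

```perl
use strict;
use warnings;
use CBOR::XS;

# Ten copies of the same 24-octet string.
my $string = "hello world, hello world";
my $data   = [ ($string) x 10 ];

my $plain  = CBOR::XS->new->encode ($data);
my $packed = CBOR::XS->new->pack_strings->encode ($data);

printf "plain: %d octets, packed: %d octets\n",
   length $plain, length $packed;

# Decoding needs no special option, string references are always understood.
my $copy    = CBOR::XS->new->decode ($packed);
my $matches = grep { $_ eq $string } @$copy;
print "round trip ok\n" if $matches == 10;
```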
$cbor = $cbor->text_keys ([$enable])
$enabled = $cbor->get_text_keys

If $enable is true (or missing), then "encode" will encode all perl
hash keys as CBOR text strings/UTF-8 strings, upgrading them as
needed.

If $enable is false (the default), then "encode" will encode hash
keys normally - upgraded perl strings (strings internally encoded as
UTF-8) as CBOR text strings, and downgraded perl strings as CBOR
[...]
supposedly valid UTF-8 will simply be dumped into the resulting CBOR
string without checking whether that is, in fact, true or not.
$cbor = $cbor->filter ([$cb->($tag, $value)])
$cb_or_undef = $cbor->get_filter

Sets or replaces the tagged value decoding filter (when $cb is
specified) or clears the filter (if no argument or "undef" is
provided).

The filter callback is called only during decoding, when a
non-enforced tagged value has been decoded (see "TAG HANDLING AND
EXTENSIONS" for a list of enforced tags). For specific tags, it's
often better to provide a default converter using the
%CBOR::XS::FILTER hash (see below).

The first argument is the numerical tag, the second is the (decoded)
value that has been tagged.

The filter function should return either exactly one value, which
will replace the tagged value in the decoded data structure, or no
values, which will result in default handling, which currently means
the decoder creates a "CBOR::XS::Tagged" object to hold the tag and
the value.

When the filter is cleared (the default state), the default filter
function, "CBOR::XS::default_filter", is used. This function simply
looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists
it must be a code reference that is called with tag and value, and
is responsible for decoding the value. If no entry exists, it
returns no values. "CBOR::XS" provides a number of default filter
[...]
and receiving CBOR-encoded messages. The solution that works with CBOR
and about anything else is by prepending a length to every CBOR value,
so the receiver knows how many octets to read. More compact (and
slightly slower) would be to just send CBOR values back-to-back, as
"CBOR::XS" knows where a CBOR value ends, and doesn't need an explicit
length.

The following methods help with this:
@decoded = $cbor->incr_parse ($buffer)

This method attempts to decode exactly one CBOR value from the
beginning of the given $buffer. The value is removed from the $buffer
on success. When $buffer doesn't contain a complete value yet, it
returns nothing. Finally, when the $buffer doesn't start with
something that could ever be a valid CBOR value, it raises an
exception, just as "decode" would. In the latter case the decoder
state is undefined and must be reset before being able to parse
further.

This method modifies the $buffer in place. When no CBOR value can be
decoded, the decoder stores the current string offset. On the next
call, it continues decoding at the place where it stopped before. For
this to make sense, the $buffer must begin with the same octets as
on previous unsuccessful calls.

You can call this method in scalar context, in which case it either
returns a decoded value or "undef". This makes it impossible to
distinguish between CBOR null values (which decode to "undef") and
an unsuccessful decode, which is often acceptable.
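As a rough illustration of "incr_parse", the sketch below feeds a stream of two back-to-back values to the decoder in two arbitrary chunks. The chunk boundaries and variable names are invented; in real code the chunks would come from a socket or file read.

```perl
use strict;
use warnings;
use CBOR::XS;

my $cbor = CBOR::XS->new;

# Two values sent back-to-back, split at an arbitrary position.
my $stream = $cbor->encode ([1, 2, 3]) . $cbor->encode ({ ok => 1 });
my @chunks = (substr ($stream, 0, 2), substr ($stream, 2));

my $buffer = "";
my @values;

for my $chunk (@chunks) {
   # Only append to the buffer: its beginning must stay unchanged
   # between unsuccessful calls.
   $buffer .= $chunk;

   # List context: an empty list means "need more data".
   while (my @decoded = $cbor->incr_parse ($buffer)) {
      push @values, @decoded;
   }
}

printf "decoded %d values, %d octets left\n",
   scalar @values, length $buffer;
```

List context is used here so that a decoded CBOR null (which becomes "undef") is not mistaken for an incomplete value.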
@decoded = $cbor->incr_parse_multiple ($buffer)

Same as "incr_parse", but attempts to decode as many CBOR values as
possible in one go, instead of at most one. Calls to "incr_parse" and
"incr_parse_multiple" can be interleaved.

$cbor->incr_reset

Resets the incremental decoder. This throws away any saved state, so
that subsequent calls to "incr_parse" or "incr_parse_multiple" start
to parse a new CBOR value from the beginning of the $buffer again.

This method can be called at any time, but it *must* be called if
[...]
CBOR -> PERL
integers
CBOR integers become (numeric) perl scalars. On perls without 64 bit
support, 64 bit integers will be truncated or otherwise corrupted.
byte strings
Byte strings will become octet strings in Perl (the Byte values
0..255 will simply become characters of the same value in Perl).
UTF-8 strings
UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
decoded into proper Unicode code points. At the moment, the validity
of the UTF-8 octets will not be validated - corrupt input will
result in corrupted Perl strings.
arrays, maps
CBOR arrays and CBOR maps will be converted into references to a
Perl array or hash, respectively. The keys of the map will be
stringified during this process.
null
CBOR null becomes "undef" in Perl.
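Some of the mappings listed above can be checked against hand-written CBOR octets. The byte strings in this sketch were assembled by hand from the CBOR encoding rules; they are illustrative inputs, not output captured from the module.

```perl
use strict;
use warnings;
use CBOR::XS;

# 0x83 = array of 3 items: integer 1, byte string "hi", null
my $array = decode_cbor "\x83\x01\x42hi\xf6";

# 0xa1 = map with 1 pair: integer key 10 => text string "ten"
my $map = decode_cbor "\xa1\x0a\x63ten";

print "integer:     $array->[0]\n";
print "byte string: $array->[1]\n";
print "null is undef\n" unless defined $array->[2];

# The integer map key has been stringified into the hash key "10".
print "map value:   $map->{10}\n";
```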
[...]
These methods *MUST NOT* change the data structure that is being
serialised. Failure to comply to this can result in memory corruption -
and worse.

If an object supports neither "TO_CBOR" nor "FREEZE", encoding will
fail with an error.
DECODING
Objects encoded via "TO_CBOR" cannot (normally) be automatically
decoded, but objects encoded via "FREEZE" can be decoded using the
following protocol:

When an encoded CBOR perl object is encountered by the decoder, it will
look up the "THAW" method, by using the stored classname, and will fail
if the method cannot be found.

After the lookup it will call the "THAW" method with the stored
classname as first argument, the constant string "CBOR" as second
argument, and all values returned by "FREEZE" as remaining arguments.
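The "FREEZE"/"THAW" protocol described above could look like this for a small class. "My::Point" and its fields are hypothetical and exist only for this sketch; the argument order of "THAW" follows the description above.

```perl
use strict;
use warnings;

package My::Point;   # hypothetical class, for illustration only

sub new {
   my ($class, $x, $y) = @_;
   bless { x => $x, y => $y }, $class
}

# Called by the encoder; must not modify the object being serialised.
sub FREEZE {
   my ($self, $serialiser) = @_;   # $serialiser is the string "CBOR"
   ($self->{x}, $self->{y})
}

# Called by the decoder with the class name, the string "CBOR",
# and all values that FREEZE returned.
sub THAW {
   my ($class, $serialiser, $x, $y) = @_;
   $class->new ($x, $y)
}

package main;

use CBOR::XS;

my $copy = decode_cbor encode_cbor (My::Point->new (3, 4));
print ref ($copy), " at ($copy->{x}, $copy->{y})\n";
```

Keeping the frozen form a flat list of plain values means "THAW" can rebuild the object through the normal constructor.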
[...]
TAG HANDLING AND EXTENSIONS
This section describes how this module handles specific tagged values
and extensions. If a tag is not mentioned here and no additional
filters are provided for it, then the default handling applies
(creating a CBOR::XS::Tagged object on decoding, and only encoding the
tag when explicitly requested).
Tags not handled specifically are currently converted into a
CBOR::XS::Tagged object, which is simply a blessed array reference
consisting of the numeric tag value followed by the (decoded) CBOR
value.
Future versions of this module reserve the right to special case
additional tags (such as base64url).
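The default handling for unknown tags can be tried out as follows. The tag number 65000 is an arbitrary choice for this sketch, picked on the assumption that neither the module nor a filter handles it; "CBOR::XS::tag" is used to request the tag explicitly when encoding.

```perl
use strict;
use warnings;
use CBOR::XS;

# Explicitly request a tagged value on encoding.
my $bytes = encode_cbor (CBOR::XS::tag (65000, "payload"));

# No filter and no special handling for this tag: default handling applies.
my $value = decode_cbor $bytes;

# The result is a blessed array reference: [ numeric tag, decoded value ].
my ($tag, $inner) = @$value;
print ref ($value), ": tag $tag, value $inner\n";
```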
ENFORCED TAGS
These tags are always handled when decoding, and their handling cannot
be overridden by the user.
26 (perl-object, <http://cbor.schmorp.de/perl-object>)
These tags are automatically created (and decoded) for serialisable
objects using the "FREEZE/THAW" methods (the Types::Serialiser object
serialisation protocol). See "OBJECT SERIALISATION" for details.
28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>)
These tags are automatically decoded when encountered (and they do
not result in a cyclic data structure, see "allow_cycles"),
resulting in shared values in the decoded object. They are only
encoded, however, when "allow_sharing" is enabled.

Not all shared values can be successfully decoded: values that
reference themselves will *currently* decode as "undef" (this is not
the same as a reference pointing to itself, which will be
represented as a value that contains an indirect reference to itself
- these will be decoded properly).

Note that considerably more shared value data structures can be
decoded than will be encoded - currently, only values pointed to by
references will be shared, others will not. While non-reference
shared values can be generated in Perl with some effort, they were
considered too unimportant to be supported in the encoder. The
decoder, however, will decode these values as shared values.
256, 25 (stringref-namespace, stringref, <http://cbor.schmorp.de/stringref>)
These tags are automatically decoded when encountered. They are only
encoded, however, when "pack_strings" is enabled.
22098 (indirection, <http://cbor.schmorp.de/indirection>)
This tag is automatically generated when a reference is encountered
(with the exception of hash and array references). It is converted
to a reference when decoding.
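For the indirection tag (22098), a scalar reference is the simplest case to try. This is a minimal sketch that only checks that the reference survives a round trip; it makes no claim about the exact octets produced.

```perl
use strict;
use warnings;
use CBOR::XS;

my $text = "hello";

# A reference to a scalar is neither a hash nor an array reference,
# so it is encoded using the indirection tag.
my $bytes = encode_cbor \$text;

# Decoding yields a reference again, pointing at a copy of the value.
my $copy = decode_cbor $bytes;
print "dereferenced: $$copy\n";
```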
55799 (self-describe CBOR, RFC 7049)
This value is not generated on encoding (unless explicitly requested
by the user), and is simply ignored when decoding.
[...]
When they result in decoding into a specific Perl class, the module
usually provides a corresponding "TO_CBOR" method as well.

When any of these need to load additional modules that are not part of
the perl core distribution (e.g. URI), it is (currently) up to the user
to provide these modules. The decoding usually fails with an exception
if the required module cannot be loaded.
0, 1 (date/time string, seconds since the epoch)
These tags are decoded into Time::Piece objects. The corresponding
"Time::Piece::TO_CBOR" method always encodes into tag 1 values
currently.

The Time::Piece API is generally surprisingly bad, and fractional
seconds are only accidentally kept intact, so watch out. On the plus
side, the module comes with perl since 5.10, which has to count for
something.
2, 3 (positive/negative bignum)
These tags are decoded into Math::BigInt objects. The corresponding
"Math::BigInt::TO_CBOR" method encodes "small" bigints into normal
CBOR integers, and others into positive/negative CBOR bignums.
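To see the bignum handling in action, one can decode a hand-built tag 2 value. The octets below encode 2**64 as a positive bignum and were written by hand for this sketch; a default (non-"new_safe") decoder is assumed, so that the bignum filter is active.

```perl
use strict;
use warnings;
use CBOR::XS;

# 0xc2 = tag 2 (positive bignum), 0x49 = byte string of 9 octets,
# followed by the big-endian magnitude 0x01 00 00 00 00 00 00 00 00.
my $bytes = "\xc2\x49\x01" . ("\x00" x 8);

my $big = decode_cbor $bytes;
print ref ($big), " = $big\n";
```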
4, 5, 264, 265 (decimal fraction/bigfloat)
Both decimal fractions and bigfloats are decoded into Math::BigFloat
objects. The corresponding "Math::BigFloat::TO_CBOR" method *always*
encodes into a decimal fraction (either tag 4 or 264).

NaN and infinities are not encoded properly, as they cannot be
represented in CBOR.

See "BIGNUM SECURITY CONSIDERATIONS" for more info.
30 (rational numbers)
These tags are decoded into Math::BigRat objects. The corresponding
"Math::BigRat::TO_CBOR" method encodes rational numbers with
denominator 1 via their numerator only, i.e., they become normal
integers or "bignums".

See "BIGNUM SECURITY CONSIDERATIONS" for more info.
21, 22, 23 (expected later JSON conversion)
CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore
these tags.
[...]
CBOR and JSON
CBOR is supposed to implement a superset of the JSON data model, and is,
with some coercion, able to represent all JSON texts (something that
other "binary JSON" formats such as BSON generally do not support).

CBOR implements some extra hints and support for JSON interoperability,
and the spec offers further guidance for conversion between CBOR and
JSON. None of this is currently implemented in CBOR::XS, and the
guidelines in the spec do not result in correct round-tripping of data.
If JSON interoperability is improved in the future, then the goal will
be to ensure that decoded JSON data will round-trip encoding and
decoding to CBOR intact.
SECURITY CONSIDERATIONS
Tl;dr... if you want to decode or encode CBOR from untrusted sources,
you should start with a coder object created via "new_safe" (which
implements the mitigations explained below):

    my $coder = CBOR::XS->new_safe;
    my $data = $coder->decode ($cbor_text);
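Since "decode" raises an exception on malformed input, code that handles untrusted data usually wraps the call in "eval". The helper "decode_untrusted" below is a hypothetical name used only for this sketch, and the truncated input was written by hand.

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new_safe;

# Hypothetical helper: returns (1, $value) on success, (0, $error) on failure.
sub decode_untrusted {
   my ($octets) = @_;

   my $value;
   return (1, $value) if eval { $value = $coder->decode ($octets); 1 };
   return (0, $@);
}

# 0x82 announces an array of two items, but only one follows.
my ($bad_ok, $error) = decode_untrusted ("\x82\x01");
print "rejected: $error" unless $bad_ok;

# A complete value (the integer 1) decodes normally.
my ($good_ok, $value) = decode_untrusted ("\x01");
print "decoded: $value\n" if $good_ok;
```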
[...]
even without bigints.
Disabling bigints will also partially or fully disable types that rely
CBOR IMPLEMENTATION NOTES
This section contains some random implementation notes. They do not
describe guaranteed behaviour, but merely behaviour as-is implemented
right now.

64 bit integers are only properly decoded when Perl was built with 64
bit support.

Strings and arrays are encoded with a definite length. Hashes as well,
unless they are tied (or otherwise magical).

Only the double data type is supported for NV data types - when Perl
uses long double to represent floating point values, they might not be
encoded properly. Half precision types are accepted, but not encoded.
Strict mode and canonical mode are not implemented.
LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
On perls that were built without 64 bit integer support (these are rare
nowadays, even on 32 bit architectures, as all major Perl distributions
are built with 64 bit integer support), support for any kind of 64 bit
value in CBOR is very limited - most likely, these 64 bit values will be
truncated, corrupted, or otherwise not decoded correctly. This also
includes string, float, array and map sizes that are stored as 64 bit
integers.
THREADS
This module is *not* guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's *much* faster, cheaper, better).

(It might actually work, but you have been warned).
[...]
 $coder = CBOR::XS->new;
 $binary_cbor_data = $coder->encode ($perl_value);
 $perl_value       = $coder->decode ($binary_cbor_data);

 # prefix decoding

 my $many_cbor_strings = ...;
 while (length $many_cbor_strings) {
    my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
    # data was decoded
    substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
 }
=head1 DESCRIPTION
This module converts Perl data structures to the Concise Binary Object
Representation (CBOR) and vice versa. CBOR is a fast binary serialisation
format that aims to use an (almost) superset of the JSON data model, i.e.
when you can represent something useful in JSON, you should be able to
represent it in CBOR.
[...]
=item $enabled = $cbor->get_allow_sharing
If C<$enable> is true (or missing), then C<encode> will not double-encode
values that have been referenced before (e.g. when the same object, such
as an array, is referenced multiple times), but instead will emit a
reference to the earlier value.
This means that such values will only be encoded once, and will not result
in a deep cloning of the value on decode, in decoders supporting the value
sharing extension. This also makes it possible to encode cyclic data
structures (which need C<allow_cycles> to be enabled to be decoded by this
module).
It is recommended to leave it off unless you know your
communication partner supports the value sharing extensions to CBOR
(L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the
resulting data structure might be unusable.
Detecting shared values incurs a runtime overhead when values are encoded
that have a reference counter larger than one, and might unnecessarily
increase the encoded size, as potentially shared values are encoded as
[...]
arrays or hashes pointed to by a reference). Weirder constructs, such as
an array with multiple "copies" of the I<same> string, which are hard but
not impossible to create in Perl, are not supported (this is the same as
with L<Storable>).

If C<$enable> is false (the default), then C<encode> will encode shared
data structures repeatedly, unsharing them in the process. Cyclic data
structures cannot be encoded in this mode.

This option does not affect C<decode> in any way - shared values and
references will always be decoded properly if present.
=item $cbor = $cbor->allow_cycles ([$enable])
=item $enabled = $cbor->get_allow_cycles
If C<$enable> is true (or missing), then C<decode> will happily decode
self-referential (cyclic) data structures. By default these will not be
decoded, as they need manual cleanup to avoid memory leaks, so code that
isn't prepared for this will not leak memory.
If C<$enable> is false (the default), then C<decode> will throw an error
when it encounters a self-referential/cyclic data structure.
This option does not affect C<encode> in any way - shared values and
references will always be encoded properly if present.
=item $cbor = $cbor->allow_weak_cycles ([$enable])
[...]
It is recommended to leave it off unless you know your
communications partner supports the stringref extension to CBOR
(L<http://cbor.schmorp.de/stringref>), as without decoder support, the
resulting data structure might not be usable.

If C<$enable> is false (the default), then C<encode> will encode strings
the standard CBOR way.

This option does not affect C<decode> in any way - string references will
always be decoded properly if present.
=item $cbor = $cbor->text_keys ([$enable])
=item $enabled = $cbor->get_text_keys
If C<$enable> is true (or missing), then C<encode> will encode all
perl hash keys as CBOR text strings/UTF-8 strings, upgrading them as needed.
If C<$enable> is false (the default), then C<encode> will encode hash keys
normally - upgraded perl strings (strings internally encoded as UTF-8) as
[...]
string without checking whether that is, in fact, true or not.
=item $cbor = $cbor->filter ([$cb->($tag, $value)])
=item $cb_or_undef = $cbor->get_filter
Sets or replaces the tagged value decoding filter (when C<$cb> is
specified) or clears the filter (if no argument or C<undef> is provided).
The filter callback is called only during decoding, when a non-enforced
tagged value has been decoded (see L<TAG HANDLING AND EXTENSIONS> for a
list of enforced tags). For specific tags, it's often better to provide a
default converter using the C<%CBOR::XS::FILTER> hash (see below).
The first argument is the numerical tag, the second is the (decoded) value
that has been tagged.
The filter function should return either exactly one value, which will
replace the tagged value in the decoded data structure, or no values,
which will result in default handling, which currently means the decoder
creates a C<CBOR::XS::Tagged> object to hold the tag and the value.
When the filter is cleared (the default state), the default filter
function, C<CBOR::XS::default_filter>, is used. This function simply
looks up the tag in the C<%CBOR::XS::FILTER> hash. If an entry exists
it must be a code reference that is called with tag and value, and is
responsible for decoding the value. If no entry exists, it returns no
values. C<CBOR::XS> provides a number of default filter functions already,
and the C<%CBOR::XS::FILTER> hash can be freely extended with more.
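A per-coder filter following the rules above might be sketched like this. The tag numbers 65001 and 65002 are arbitrary and assumed to be otherwise unhandled; the upper-casing is just a stand-in for a real conversion.

```perl
use strict;
use warnings;
use CBOR::XS;

my $cbor = CBOR::XS->new;

my $bytes = $cbor->encode ([
   CBOR::XS::tag (65001, "upper me"),
   CBOR::XS::tag (65002, "leave me"),
]);

$cbor->filter (sub {
   my ($tag, $value) = @_;

   # Exactly one return value: it replaces the tagged value.
   return uc $value if $tag == 65001;

   # No return values: default handling (a CBOR::XS::Tagged object).
   return;
});

my $decoded = $cbor->decode ($bytes);

print "converted: $decoded->[0]\n";
print "untouched: ", ref ($decoded->[1]), "\n";
```

A filter set this way applies to one coder object only, whereas entries in C<%CBOR::XS::FILTER> affect every coder that uses the default filter.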
[...]
and receiving CBOR-encoded messages. The solution that works with CBOR and
about anything else is by prepending a length to every CBOR value, so the
receiver knows how many octets to read. More compact (and slightly slower)
would be to just send CBOR values back-to-back, as C<CBOR::XS> knows where
a CBOR value ends, and doesn't need an explicit length.

The following methods help with this:
=over 4
=item @decoded = $cbor->incr_parse ($buffer)
This method attempts to decode exactly one CBOR value from the beginning
of the given C<$buffer>. The value is removed from the C<$buffer> on
success. When C<$buffer> doesn't contain a complete value yet, it returns
nothing. Finally, when the C<$buffer> doesn't start with something
that could ever be a valid CBOR value, it raises an exception, just as
C<decode> would. In the latter case the decoder state is undefined and
must be reset before being able to parse further.
This method modifies the C<$buffer> in place. When no CBOR value can be
decoded, the decoder stores the current string offset. On the next call,
it continues decoding at the place where it stopped before. For this to make
sense, the C<$buffer> must begin with the same octets as on previous
unsuccessful calls.
You can call this method in scalar context, in which case it either
returns a decoded value or C<undef>. This makes it impossible to
distinguish between CBOR null values (which decode to C<undef>) and an
unsuccessful decode, which is often acceptable.
=item @decoded = $cbor->incr_parse_multiple ($buffer)
Same as C<incr_parse>, but attempts to decode as many CBOR values as
possible in one go, instead of at most one. Calls to C<incr_parse> and
C<incr_parse_multiple> can be interleaved.
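The following sketch shows C<incr_parse_multiple> draining a buffer that holds three complete values and the first octet of a fourth. The buffer contents are constructed by hand for illustration.

```perl
use strict;
use warnings;
use CBOR::XS;

my $cbor = CBOR::XS->new;

# Three complete values, then 0x82: the start of a two-element array.
my $buffer = join "", (map { $cbor->encode ($_) } 1, "two", [3]), "\x82";

# All complete values are returned; the incomplete tail stays in $buffer.
my @values = $cbor->incr_parse_multiple ($buffer);
my $first_batch = @values;

# The rest of the fourth value arrives (the integers 4 and 5).
$buffer .= "\x04\x05";
push @values, $cbor->incr_parse_multiple ($buffer);

printf "first call: %d values, total: %d values\n",
   $first_batch, scalar @values;
```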
=item $cbor->incr_reset
Resets the incremental decoder. This throws away any saved state, so that
subsequent calls to C<incr_parse> or C<incr_parse_multiple> start to parse
a new CBOR value from the beginning of the C<$buffer> again.
[...]
CBOR integers become (numeric) perl scalars. On perls without 64 bit
support, 64 bit integers will be truncated or otherwise corrupted.
=item byte strings
Byte strings will become octet strings in Perl (the Byte values 0..255
will simply become characters of the same value in Perl).
=item UTF-8 strings
UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
decoded into proper Unicode code points. At the moment, the validity of
the UTF-8 octets will not be validated - corrupt input will result in
corrupted Perl strings.
=item arrays, maps
CBOR arrays and CBOR maps will be converted into references to a Perl
array or hash, respectively. The keys of the map will be stringified
during this process.
=item null
[...]
These methods I<MUST NOT> change the data structure that is being
serialised. Failure to comply to this can result in memory corruption -
and worse.

If an object supports neither C<TO_CBOR> nor C<FREEZE>, encoding will fail
with an error.
=head3 DECODING
Objects encoded via C<TO_CBOR> cannot (normally) be automatically decoded,
but objects encoded via C<FREEZE> can be decoded using the following
protocol:
When an encoded CBOR perl object is encountered by the decoder, it will
look up the C<THAW> method, by using the stored classname, and will fail
if the method cannot be found.
After the lookup it will call the C<THAW> method with the stored classname
as first argument, the constant string C<CBOR> as second argument, and all
values returned by C<FREEZE> as remaining arguments.
[...]
=head1 TAG HANDLING AND EXTENSIONS
This section describes how this module handles specific tagged values
and extensions. If a tag is not mentioned here and no additional filters
are provided for it, then the default handling applies (creating a
CBOR::XS::Tagged object on decoding, and only encoding the tag when
explicitly requested).
Tags not handled specifically are currently converted into a
L<CBOR::XS::Tagged> object, which is simply a blessed array reference
consisting of the numeric tag value followed by the (decoded) CBOR value.
Future versions of this module reserve the right to special case
additional tags (such as base64url).
=head2 ENFORCED TAGS
These tags are always handled when decoding, and their handling cannot be
overridden by the user.
=over 4
=item 26 (perl-object, L<http://cbor.schmorp.de/perl-object>)
These tags are automatically created (and decoded) for serialisable
objects using the C<FREEZE/THAW> methods (the L<Types::Serialiser> object
serialisation protocol). See L<OBJECT SERIALISATION> for details.
=item 28, 29 (shareable, sharedref, L<http://cbor.schmorp.de/value-sharing>)
These tags are automatically decoded when encountered (and they do not
result in a cyclic data structure, see C<allow_cycles>), resulting in
shared values in the decoded object. They are only encoded, however, when
C<allow_sharing> is enabled.
Not all shared values can be successfully decoded: values that reference
themselves will I<currently> decode as C<undef> (this is not the same
as a reference pointing to itself, which will be represented as a value
that contains an indirect reference to itself - these will be decoded
properly).
Note that considerably more shared value data structures can be decoded
than will be encoded - currently, only values pointed to by references
will be shared, others will not. While non-reference shared values can be
generated in Perl with some effort, they were considered too unimportant
to be supported in the encoder. The decoder, however, will decode these
values as shared values.
=item 256, 25 (stringref-namespace, stringref, L<http://cbor.schmorp.de/stringref>)
These tags are automatically decoded when encountered. They are only
encoded, however, when C<pack_strings> is enabled.
=item 22098 (indirection, L<http://cbor.schmorp.de/indirection>)
This tag is automatically generated when a reference is encountered (with
the exception of hash and array references). It is converted to a reference
when decoding.
=item 55799 (self-describe CBOR, RFC 7049)
[...]
When any of these need to load additional modules that are not part of the
perl core distribution (e.g. L<URI>), it is (currently) up to the user to
provide these modules. The decoding usually fails with an exception if the
required module cannot be loaded.
=over 4
=item 0, 1 (date/time string, seconds since the epoch)
These tags are decoded into L<Time::Piece> objects. The corresponding
C<Time::Piece::TO_CBOR> method always encodes into tag 1 values currently.
The L<Time::Piece> API is generally surprisingly bad, and fractional
seconds are only accidentally kept intact, so watch out. On the plus side,
the module comes with perl since 5.10, which has to count for something.
=item 2, 3 (positive/negative bignum)
These tags are decoded into L<Math::BigInt> objects. The corresponding
C<Math::BigInt::TO_CBOR> method encodes "small" bigints into normal CBOR
integers, and others into positive/negative CBOR bignums.
=item 4, 5, 264, 265 (decimal fraction/bigfloat)
Both decimal fractions and bigfloats are decoded into L<Math::BigFloat>
objects. The corresponding C<Math::BigFloat::TO_CBOR> method I<always>
encodes into a decimal fraction (either tag 4 or 264).
NaN and infinities are not encoded properly, as they cannot be represented
in CBOR.
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
=item 30 (rational numbers)
These tags are decoded into L<Math::BigRat> objects. The corresponding
C<Math::BigRat::TO_CBOR> method encodes rational numbers with denominator
C<1> via their numerator only, i.e., they become normal integers or
C<bignums>.
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
=item 21, 22, 23 (expected later JSON conversion)
CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore these
tags.
CBOR is supposed to implement a superset of the JSON data model, and is,
with some coercion, able to represent all JSON texts (something that other
"binary JSON" formats such as BSON generally do not support).
CBOR implements some extra hints and support for JSON interoperability,
and the spec offers further guidance for conversion between CBOR and
JSON. None of this is currently implemented in CBOR::XS, and the guidelines
in the spec do not result in correct round-tripping of data. If JSON
interoperability is improved in the future, then the goal will be to
ensure that decoded JSON data will round-trip encoding and decoding to
CBOR intact.
=head1 SECURITY CONSIDERATIONS
Tl;dr... if you want to decode or encode CBOR from untrusted sources, you
should start with a coder object created via C<new_safe> (which implements
the mitigations explained below):
my $coder = CBOR::XS->new_safe;
Disabling bigints will also partially or fully disable types that rely on
=head1 CBOR IMPLEMENTATION NOTES
This section contains some random implementation notes. They do not
describe guaranteed behaviour, but merely the behaviour as implemented
right now.
64 bit integers are only properly decoded when Perl was built with 64 bit
support.
Strings and arrays are encoded with a definite length. Hashes as well,
unless they are tied (or otherwise magical).
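To illustrate what "definite length" means on the wire, here is a minimal C sketch of how such a header can be written according to the CBOR spec. It is not taken from the module's source: C<put_definite_header> and the C<MAJOR_*> constants are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CBOR major types (RFC 7049, section 2.1) used in this sketch. */
enum { MAJOR_BYTES = 2, MAJOR_TEXT = 3, MAJOR_ARRAY = 4, MAJOR_MAP = 5 };

/* Hypothetical helper, not a CBOR::XS function: writes the initial byte
   and the length argument of a definite-length item into out (which must
   have room for 9 bytes) and returns the number of bytes written. */
static size_t
put_definite_header (uint8_t *out, unsigned major, uint64_t len)
{
  size_t extra, i;

  assert (major <= 7);

  if (len < 24)
    {
      /* small lengths are stored directly in the low five bits */
      out[0] = (uint8_t)(major << 5 | len);
      return 1;
    }

  /* otherwise the low five bits select a 1, 2, 4 or 8 byte length */
  if (len <= 0xff)
    { out[0] = (uint8_t)(major << 5 | 24); extra = 1; }
  else if (len <= 0xffff)
    { out[0] = (uint8_t)(major << 5 | 25); extra = 2; }
  else if (len <= 0xffffffffU)
    { out[0] = (uint8_t)(major << 5 | 26); extra = 4; }
  else
    { out[0] = (uint8_t)(major << 5 | 27); extra = 8; }

  /* the length follows in network byte order (big endian) */
  for (i = 0; i < extra; ++i)
    out[1 + i] = (uint8_t)(len >> (8 * (extra - 1 - i)));

  return 1 + extra;
}
```

An indefinite-length item, by contrast, starts with additional information 31 and is terminated by a "break" byte, which is what a tied hash would need when its size is not known up front.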
Only the double data type is supported for NV data types - when Perl uses
long double to represent floating point values, they might not be encoded
properly. Half precision types are accepted, but not encoded.
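As a rough illustration of what accepting half precision values involves, the following C sketch widens a big-endian IEEE 754 binary16 value to a double, along the lines of the example decoder in the appendix of RFC 7049. It is an assumption-laden sketch and not the module's own code: C<half_to_double> is a hypothetical name.

```c
#include <assert.h>
#include <math.h>   /* only for the INFINITY and NAN macros */
#include <stdint.h>

/* Hypothetical helper: converts the two bytes of a CBOR half-precision
   float (big endian) into a double. */
static double
half_to_double (const uint8_t *p)
{
  unsigned half, exp, mant;
  double val;
  int e;

  assert (p != 0);

  half = (unsigned)p[0] << 8 | p[1];
  exp  = half >> 10 & 0x1f;   /* 5 exponent bits */
  mant = half & 0x3ff;        /* 10 mantissa bits */

  if (exp == 0)
    val = mant / 16777216.0;  /* zero or subnormal: mant * 2^-24 */
  else if (exp != 31)
    {
      /* normal number: (1024 + mant) * 2^(exp - 25) */
      val = 1024 + mant;
      for (e = (int)exp - 25; e > 0; --e) val *= 2;
      for (e = (int)exp - 25; e < 0; ++e) val /= 2;
    }
  else
    val = mant == 0 ? INFINITY : NAN;

  return half & 0x8000 ? -val : val;
}
```

Every half precision value is exactly representable as a double, so decoding loses nothing; the reverse direction requires deciding whether a double fits, which is presumably why it is not done when encoding.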
Strict mode and canonical mode are not implemented.
=head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
On perls that were built without 64 bit integer support (these are rare
nowadays, even on 32 bit architectures, as all major Perl distributions
are built with 64 bit integer support), support for any kind of 64 bit
value in CBOR is very limited - most likely, these 64 bit values will
be truncated, corrupted, or otherwise not decoded correctly. This also
includes string, float, array and map sizes that are stored as 64 bit
integers.
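The kind of damage to expect can be sketched in a few lines of C. This is only an illustration of integer narrowing under the assumption that a 64 bit argument ends up in a 32 bit integer type; it does not reproduce what the module actually does on such perls, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reads the 8-byte big-endian argument that follows
   an initial byte with additional information 27. */
static uint64_t
read_be64 (const uint8_t *p)
{
  uint64_t v = 0;
  int i;

  assert (p != 0);

  for (i = 0; i < 8; ++i)
    v = v << 8 | p[i];

  return v;
}

/* What is left of the value once it is stored in a 32 bit integer:
   the upper four bytes are silently dropped. */
static uint32_t
narrow_to_32 (uint64_t v)
{
  return (uint32_t)v;
}
```

For example, an array length of 0x100000005 would turn into 5 after narrowing, so the decoder would read five elements and then misinterpret the rest of the data.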
=head1 THREADS
This module is I<not> guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's I<much> faster, cheaper, better).
          break;
        case CBOR_TAG_VALUE_SHAREABLE:
          {
            if (ecb_expect_false (!dec->shareable))
              dec->shareable = (AV *)sv_2mortal ((SV *)newAV ());
            if (ecb_expect_false (dec->cbor.flags & (F_ALLOW_CYCLES | F_ALLOW_WEAK_CYCLES)))
              {
                // if cycles are allowed, then we store an AV as value
                // while it is being decoded, and gather unresolved
                // references in it, to be resolved after decoding.
                int idx, i;
                AV *av = newAV ();
                av_push (dec->shareable, (SV *)av);
                idx = AvFILLp (dec->shareable);
                sv = decode_sv (dec);
                // the AV now contains \undef for all unresolved references,
                // so we fix them up here.
            sv = AvARRAY (dec->shareable)[idx];
            // register it in the AV for later fixing
            if (ecb_expect_false (SvTYPE (sv) == SVt_PVAV))
              {
                AV *av = (AV *)sv;
                sv = newRV_noinc (&PL_sv_undef);
                av_push (av, SvREFCNT_inc_NN (sv));
              }
            else if (ecb_expect_false (sv == &PL_sv_undef)) // not yet decoded, but cycles not allowed
              ERR ("cyclic CBOR data structure found, but allow_cycles is not enabled");
            else // we decoded the object earlier, no cycle
              sv = newSVsv (sv);
          }
          break;
        case CBOR_TAG_PERL_OBJECT:
          {
            if (dec->cbor.flags & F_FORBID_OBJECTS)
              goto filter;
            sv = decode_sv (dec);