CBOR-XS

Changes  view on Meta::CPAN

TODO: pack_keys?
TODO: document encode_cbor_sharing?
TODO: large negative integers
TODO: type cast tests.
TODO: round-tripping of types, such as float16 - maybe types::Serialiser support?
TODO: possibly implement https://peteroupc.github.io/CBOR/extended.html, but NaNs are nonportable. rely on libecb?
 
1.87 Fri 08 Sep 2023 22:14:18 CEST
        - shared references were not decoded correctly: instead of getting
          multiple references to the same object, you got the same
          reference to the same object, causing a number of issues. For
          example, modifying the reference would modify all places the
          reference was used, and encoding the decoded structure would
          unshare the previously shared hashes, as their reference count
          would be 1. Fixing this was rather involved, as perl lacks the
          ability to easily swap or copy arrays and hashes.
        - \0, \1, \undef do not work, and were not intended to ever work, as
          special values, despite being mentioned in the documentation (reported
          by nuclightq).
        - new feature: allow_weak_cycles.
 
1.86 Thu 04 Nov 2021 17:48:16 CET
        - fixed a wrong printf format specifier (reported by Petr Písař).

README  view on Meta::CPAN

     $coder = CBOR::XS->new;
     $binary_cbor_data = $coder->encode ($perl_value);
     $perl_value       = $coder->decode ($binary_cbor_data);
 
     # prefix decoding
 
     my $many_cbor_strings = ...;
     while (length $many_cbor_strings) {
        my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
        # data was decoded
        substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
     }
 
DESCRIPTION
    This module converts Perl data structures to the Concise Binary Object
    Representation (CBOR) and vice versa. CBOR is a fast binary
    serialisation format that aims to use an (almost) superset of the JSON
    data model, i.e. when you can represent something useful in JSON, you
    should be able to represent it in CBOR.
 
    In short, CBOR is a faster and quite compact binary alternative to JSON,

README  view on Meta::CPAN

$enabled = $cbor->get_allow_sharing
    If $enable is true (or missing), then "encode" will not
    double-encode values that have been referenced before (e.g. when the
    same object, such as an array, is referenced multiple times), but
    instead will emit a reference to the earlier value.
 
    This means that such values will only be encoded once, and will not
    result in a deep cloning of the value on decode, in decoders
    supporting the value sharing extension. This also makes it possible
    to encode cyclic data structures (which need "allow_cycles" to be
    enabled to be decoded by this module).
 
    It is recommended to leave it off unless you know your communication
    partner supports the value sharing extensions to CBOR
    (<http://cbor.schmorp.de/value-sharing>), as without decoder
    support, the resulting data structure might be unusable.
 
    Detecting shared values incurs a runtime overhead when values are
    encoded that have a reference counter larger than one, and might
    unnecessarily increase the encoded size, as potentially shared
    values are encoded as shareable whether or not they are actually

README  view on Meta::CPAN

    scalars, arrays or hashes pointed to by a reference). Weirder
    constructs, such as an array with multiple "copies" of the *same*
    string, which are hard but not impossible to create in Perl, are not
    supported (this is the same as with Storable).
 
    If $enable is false (the default), then "encode" will encode shared
    data structures repeatedly, unsharing them in the process. Cyclic
    data structures cannot be encoded in this mode.
 
    This option does not affect "decode" in any way - shared values and
    references will always be decoded properly if present.
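
    As a minimal sketch of the behaviour described above (the variable
    names and sample data are made up for illustration), the following
    encodes one array that is referenced twice, once with the default
    settings and once with "allow_sharing" enabled:

```perl
use strict;
use warnings;
use CBOR::XS;

# One array referenced from two places in the same structure.
my $shared = [1, 2, 3];
my $data   = { a => $shared, b => $shared };

# Default coder: the array is encoded twice, i.e. it is unshared.
my $plain = CBOR::XS->new->encode ($data);

# With allow_sharing, the second occurrence is emitted as a reference
# to the first one.
my $sharing = CBOR::XS->new->allow_sharing->encode ($data);

# Decoding needs no special option for shared (non-cyclic) values.
my $unshared = CBOR::XS->new->decode ($plain);
my $copy     = CBOR::XS->new->decode ($sharing);

print "plain:   ", ($unshared->{a} == $unshared->{b} ? "one array" : "two arrays"), "\n";
print "sharing: ", ($copy->{a}     == $copy->{b}     ? "one array" : "two arrays"), "\n";
```

    Comparing two references with "==" compares the addresses of the
    things they point to, which is why it can be used here to check
    whether both hash entries lead to the same array.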
 
$cbor = $cbor->allow_cycles ([$enable])
$enabled = $cbor->get_allow_cycles
    If $enable is true (or missing), then "decode" will happily decode
    self-referential (cyclic) data structures. By default these will not
    be decoded, as they need manual cleanup to avoid memory leaks, so
    code that isn't prepared for this will not leak memory.
 
    If $enable is false (the default), then "decode" will throw an error
    when it encounters a self-referential/cyclic data structure.
 
    This option does not affect "encode" in any way - shared values and
    references will always be encoded properly if present.
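
    The interaction between "allow_sharing" (needed to encode a cycle)
    and "allow_cycles" (needed to decode it) might look as follows. This
    is only a sketch; the hash layout is invented for the example:

```perl
use strict;
use warnings;
use CBOR::XS;

# A hash that contains a reference to itself.
my $node = { name => "root" };
$node->{self} = $node;

# Encoding a cyclic structure requires allow_sharing.
my $bytes = CBOR::XS->new->allow_sharing->encode ($node);

# A decoder without allow_cycles throws an error on such data.
my $ok = eval { CBOR::XS->new->decode ($bytes); 1 };
print "default decoder refused the cycle: $@" unless $ok;

# With allow_cycles, the self-reference is restored.
my $copy = CBOR::XS->new->allow_cycles->decode ($bytes);
print "cycle restored\n" if $copy->{self} == $copy;

# The cycle keeps the structure alive, so break it manually when done.
delete $copy->{self};
delete $node->{self};
```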
 
$cbor = $cbor->allow_weak_cycles ([$enable])
$enabled = $cbor->get_allow_weak_cycles

README  view on Meta::CPAN

    It is recommended to leave it off unless you know your
    communications partner supports the stringref extension to CBOR
    (<http://cbor.schmorp.de/stringref>), as without decoder support,
    the resulting data structure might not be usable.
 
    If $enable is false (the default), then "encode" will encode strings
    the standard CBOR way.
 
    This option does not affect "decode" in any way - string references
    will always be decoded properly if present.
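
    To get a feeling for the effect of the stringref extension, one can
    compare encoded sizes with and without "pack_strings". The record
    layout below is an assumption chosen so that hash keys repeat often:

```perl
use strict;
use warnings;
use CBOR::XS;

# Many records that repeat the same hash keys.
my @records = map { +{ hostname => "node$_.example.org", status => "ok" } } 1 .. 50;

my $plain  = CBOR::XS->new->encode (\@records);
my $packed = CBOR::XS->new->pack_strings->encode (\@records);

printf "plain: %d octets, packed: %d octets\n", length $plain, length $packed;

# No option is needed on the decoding side.
my $copy = CBOR::XS->new->decode ($packed);
print "first host: $copy->[0]{hostname}\n";
```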
 
$cbor = $cbor->text_keys ([$enable])
$enabled = $cbor->get_text_keys
    If $enable is true (or missing), then "encode" will encode all perl
    hash keys as CBOR text strings/UTF-8 strings, upgrading them as
    needed.
 
    If $enable is false (the default), then "encode" will encode hash
    keys normally - upgraded perl strings (strings internally encoded as
    UTF-8) as CBOR text strings, and downgraded perl strings as CBOR

README  view on Meta::CPAN

    supposedly valid UTF-8 will simply be dumped into the resulting CBOR
    string without checking whether that is, in fact, true or not.
 
$cbor = $cbor->filter ([$cb->($tag, $value)])
$cb_or_undef = $cbor->get_filter
    Sets or replaces the tagged value decoding filter (when $cb is
    specified) or clears the filter (if no argument or "undef" is
    provided).
 
    The filter callback is called only during decoding, when a
    non-enforced tagged value has been decoded (see "TAG HANDLING AND
    EXTENSIONS" for a list of enforced tags). For specific tags, it's
    often better to provide a default converter using the
    %CBOR::XS::FILTER hash (see below).
 
    The first argument is the numerical tag, the second is the (decoded)
    value that has been tagged.
 
    The filter function should return either exactly one value, which
    will replace the tagged value in the decoded data structure, or no
    values, which will result in default handling, which currently means
    the decoder creates a "CBOR::XS::Tagged" object to hold the tag and
    the value.
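
    A filter callback following these rules could be sketched like this.
    The tag number 1000000 is a hypothetical application tag picked for
    illustration, and it is assumed that no other handler is registered
    for it or for its neighbour:

```perl
use strict;
use warnings;
use CBOR::XS;

# Hypothetical application tag, not taken from any registry.
my $MY_TAG = 1_000_000;

my $coder = CBOR::XS->new;

$coder->filter (sub {
   my ($tag, $value) = @_;

   # Exactly one return value replaces the tagged value.
   return uc $value if $tag == $MY_TAG;

   # No return values request the default handling.
   return;
});

my $bytes = $coder->encode ([
   CBOR::XS::tag ($MY_TAG,     "hello"),
   CBOR::XS::tag ($MY_TAG + 1, "world"),
]);

my $data = $coder->decode ($bytes);

# $data->[0] is the string returned by the filter, while $data->[1]
# is a CBOR::XS::Tagged array reference of the form [tag, value].
print "$data->[0], tag $data->[1][0] => $data->[1][1]\n";
```

    Note that a filter set this way is used instead of the default
    filter, so a callback like the one above handles every non-enforced
    tag itself.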
 
    When the filter is cleared (the default state), the default filter
    function, "CBOR::XS::default_filter", is used. This function simply
    looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists
    it must be a code reference that is called with tag and value, and
    is responsible for decoding the value. If no entry exists, it
    returns no values. "CBOR::XS" provides a number of default filter

README  view on Meta::CPAN

A typical use case would be a network protocol that consists of sending
and receiving CBOR-encoded messages. The solution that works with CBOR
and about anything else is by prepending a length to every CBOR value,
so the receiver knows how many octets to read. More compact (and
slightly slower) would be to just send CBOR values back-to-back, as
"CBOR::XS" knows where a CBOR value ends, and doesn't need an explicit
length.
 
The following methods help with this:
 
@decoded = $cbor->incr_parse ($buffer)
    This method attempts to decode exactly one CBOR value from the
    beginning of the given $buffer. The value is removed from the
    $buffer on success. When $buffer doesn't contain a complete value
    yet, it returns nothing. Finally, when the $buffer doesn't start
    with something that could ever be a valid CBOR value, it raises an
    exception, just as "decode" would. In the latter case the decoder
    state is undefined and must be reset before being able to parse
    further.
 
    This method modifies the $buffer in place. When no CBOR value can be
    decoded, the decoder stores the current string offset. On the next
    call, it continues decoding at the place where it stopped before. For
    this to make sense, the $buffer must begin with the same octets as
    on previous unsuccessful calls.
 
    You can call this method in scalar context, in which case it either
    returns a decoded value or "undef". This makes it impossible to
    distinguish between CBOR null values (which decode to "undef") and
    an unsuccessful decode, which is often acceptable.
 
@decoded = $cbor->incr_parse_multiple ($buffer)
    Same as "incr_parse", but attempts to decode as many CBOR values as
    possible in one go, instead of at most one. Calls to "incr_parse"
    and "incr_parse_multiple" can be interleaved.
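
    A receive loop built on these methods might look roughly like the
    sketch below. The 3-octet chunks merely simulate partial reads from
    a socket; the sample messages are made up:

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new;

# Three values sent back-to-back, without any length prefix.
my $stream = join "", map { $coder->encode ($_) } [1, 2], { ok => 1 }, "bye";

my $buffer = "";
my @messages;

while (length $stream) {
   # Simulate a partial read by moving 3 octets into the buffer.
   $buffer .= substr $stream, 0, 3, "";

   # Decodes and removes every complete value at the start of $buffer;
   # an incomplete trailing value simply stays in the buffer.
   push @messages, $coder->incr_parse_multiple ($buffer);
}

printf "decoded %d messages, %d octets left over\n",
       scalar @messages, length $buffer;
```

    Real code would wrap the call in "eval" and call "incr_reset" after
    an exception, as the decoder state is undefined in that case.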
 
$cbor->incr_reset
    Resets the incremental decoder. This throws away any saved state, so
    that subsequent calls to "incr_parse" or "incr_parse_multiple" start
    to parse a new CBOR value from the beginning of the $buffer again.
 
    This method can be called at any time, but it *must* be called if

README  view on Meta::CPAN

CBOR -> PERL
  integers
      CBOR integers become (numeric) perl scalars. On perls without 64 bit
      support, 64 bit integers will be truncated or otherwise corrupted.
 
  byte strings
      Byte strings will become octet strings in Perl (the byte values
      0..255 will simply become characters of the same value in Perl).
 
  UTF-8 strings
      UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
      decoded into proper Unicode code points. At the moment, the UTF-8
      octets are not validated - corrupt input will result in corrupted
      Perl strings.
 
  arrays, maps
      CBOR arrays and CBOR maps will be converted into references to a
      Perl array or hash, respectively. The keys of the map will be
      stringified during this process.
 
  null
      CBOR null becomes "undef" in Perl.
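
      The mappings listed so far can be observed on a small hand-built
      CBOR value. The octets below are an illustrative example encoding
      the map {1: [h'0001ff', "hi"]}:

```perl
use strict;
use warnings;
use CBOR::XS;

# a1          map with one pair
#   01        key: integer 1
#   82        value: array with two elements
#     43 ..   byte string of 3 octets (00 01 ff)
#     62 ..   text string "hi"
my $bytes = pack "H*", "a10182430001ff626869";

my $data = decode_cbor $bytes;

# The integer map key has been stringified into a hash key.
my ($key) = keys %$data;

# The byte string became a 3-character octet string.
my ($octets, $text) = @{ $data->{$key} };

printf "key=%s octets=%d last=%d text=%s\n",
       $key, length $octets, ord (substr $octets, -1), $text;
```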

README  view on Meta::CPAN

These methods *MUST NOT* change the data structure that is being
 serialised. Failure to comply with this can result in memory corruption -
 and worse.
 
 If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail
 with an error.
 
DECODING
 Objects encoded via "TO_CBOR" cannot (normally) be automatically
 decoded, but objects encoded via "FREEZE" can be decoded using the
 following protocol:
 
 When an encoded CBOR perl object is encountered by the decoder, it will
 look up the "THAW" method, by using the stored classname, and will fail
 if the method cannot be found.
 
 After the lookup it will call the "THAW" method with the stored
 classname as first argument, the constant string "CBOR" as second
 argument, and all values returned by "FREEZE" as remaining arguments.
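
 The protocol can be illustrated with a small class. "My::Point" is a
 hypothetical class invented for this sketch, and it is assumed here
 that "FREEZE" is called with the object and the string "CBOR",
 mirroring the "THAW" call described above:

```perl
use strict;
use warnings;
use CBOR::XS;

package My::Point {
   sub new {
      my ($class, %args) = @_;
      return bless { %args }, $class;
   }

   # Called by the encoder; returns the values to store.
   sub FREEZE {
      my ($self, $serialiser) = @_;
      return ($self->{x}, $self->{y});
   }

   # Called by the decoder with the class name, the string "CBOR"
   # and the values that FREEZE returned.
   sub THAW {
      my ($class, $serialiser, $x, $y) = @_;
      return $class->new (x => $x, y => $y);
   }
}

my $coder = CBOR::XS->new;
my $bytes = $coder->encode (My::Point->new (x => 1, y => 2));
my $copy  = $coder->decode ($bytes);

printf "%s at (%d, %d)\n", ref $copy, $copy->{x}, $copy->{y};
```

 Neither method modifies the object, in line with the requirement
 stated earlier in this section.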

README  view on Meta::CPAN

TAG HANDLING AND EXTENSIONS
    This section describes how this module handles specific tagged values
    and extensions. If a tag is not mentioned here and no additional filters
    are provided for it, then the default handling applies (creating a
    CBOR::XS::Tagged object on decoding, and only encoding the tag when
    explicitly requested).
 
    Tags not handled specifically are currently converted into a
    CBOR::XS::Tagged object, which is simply a blessed array reference
    consisting of the numeric tag value followed by the (decoded) CBOR
    value.
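
    The default handling can be seen by round-tripping a value with a
    tag that nothing handles. The tag number below is hypothetical and
    is assumed not to have a registered filter:

```perl
use strict;
use warnings;
use CBOR::XS;

# Wrap a value in an (assumed unhandled) tag and encode it.
my $bytes = encode_cbor (CBOR::XS::tag (1_000_000, [1, 2, 3]));

# The decoder falls back to a CBOR::XS::Tagged object.
my $tagged = decode_cbor ($bytes);

# The object is a blessed array reference: tag first, then the value.
my ($tag, $value) = @$tagged;

printf "%s: tag %d with %d elements\n", ref $tagged, $tag, scalar @$value;
```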
 
    Future versions of this module reserve the right to special case
    additional tags (such as base64url).
 
  ENFORCED TAGS
    These tags are always handled when decoding, and their handling cannot
    be overridden by the user.
 
    26 (perl-object, <http://cbor.schmorp.de/perl-object>)
        These tags are automatically created (and decoded) for serialisable
        objects using the "FREEZE/THAW" methods (the Types::Serialiser object
        serialisation protocol). See "OBJECT SERIALISATION" for details.
 
    28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>)
        These tags are automatically decoded when encountered (and they do
        not result in a cyclic data structure, see "allow_cycles"),
        resulting in shared values in the decoded object. They are only
        encoded, however, when "allow_sharing" is enabled.
 
        Not all shared values can be successfully decoded: values that
        reference themselves will *currently* decode as "undef" (this is not
        the same as a reference pointing to itself, which will be
        represented as a value that contains an indirect reference to itself
        - these will be decoded properly).
 
        Note that considerably more shared value data structures can be
        decoded than will be encoded - currently, only values pointed to by
        references will be shared, others will not. While non-reference
        shared values can be generated in Perl with some effort, they were
        considered too unimportant to be supported in the encoder. The
        decoder, however, will decode these values as shared values.
 
    256, 25 (stringref-namespace, stringref, <http://cbor.schmorp.de/stringref>)
        These tags are automatically decoded when encountered. They are only
        encoded, however, when "pack_strings" is enabled.
 
    22098 (indirection, <http://cbor.schmorp.de/indirection>)
        This tag is automatically generated when a reference is encountered
        (with the exception of hash and array references). It is converted
        to a reference when decoding.
 
    55799 (self-describe CBOR, RFC 7049)
        This value is not generated on encoding (unless explicitly requested
        by the user), and is simply ignored when decoding.

README  view on Meta::CPAN

When they result in decoding into a specific Perl class, the module
usually provides a corresponding "TO_CBOR" method as well.
 
When any of these need to load additional modules that are not part of
the perl core distribution (e.g. URI), it is (currently) up to the user
to provide these modules. The decoding usually fails with an exception
if the required module cannot be loaded.
 
0, 1 (date/time string, seconds since the epoch)
    These tags are decoded into Time::Piece objects. The corresponding
    "Time::Piece::TO_CBOR" method always encodes into tag 1 values
    currently.
 
    The Time::Piece API is generally surprisingly bad, and fractional
    seconds are only accidentally kept intact, so watch out. On the plus
    side, the module comes with perl since 5.10, which has to count for
    something.
 
2, 3 (positive/negative bignum)
    These tags are decoded into Math::BigInt objects. The corresponding
    "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal
    CBOR integers, and others into positive/negative CBOR bignums.
 
4, 5, 264, 265 (decimal fraction/bigfloat)
    Both decimal fractions and bigfloats are decoded into Math::BigFloat
    objects. The corresponding "Math::BigFloat::TO_CBOR" method *always*
    encodes into a decimal fraction (either tag 4 or 264).
 
    NaN and infinities are not encoded properly, as they cannot be
    represented in CBOR.
 
    See "BIGNUM SECURITY CONSIDERATIONS" for more info.
 
30 (rational numbers)
    These tags are decoded into Math::BigRat objects. The corresponding
    "Math::BigRat::TO_CBOR" method encodes rational numbers with
    denominator 1 via their numerator only, i.e., they become normal
    integers or "bignums".
 
    See "BIGNUM SECURITY CONSIDERATIONS" for more info.
 
21, 22, 23 (expected later JSON conversion)
    CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore
    these tags.
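
As an illustration of the bignum handling (tags 2 and 3) described
above, the following sketch round-trips a large Math::BigInt value. It
assumes a default (not "new_safe") coder and that Math::BigInt is
available, which it is in the perl core:

```perl
use strict;
use warnings;
use CBOR::XS;
use Math::BigInt;

# 10**30 does not fit into a 64 bit integer.
my $big = Math::BigInt->new ("1" . "0" x 30);

# Math::BigInt::TO_CBOR turns this into a CBOR bignum, and the
# default filter turns the bignum back into a Math::BigInt object.
my $copy = decode_cbor (encode_cbor ($big));

print ref $copy, ": $copy\n";
```

A "small" bigint would come back as an ordinary perl number instead, as
it is encoded as a normal CBOR integer.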

README  view on Meta::CPAN

CBOR and JSON
    CBOR is supposed to implement a superset of the JSON data model, and is,
    with some coercion, able to represent all JSON texts (something that
    other "binary JSON" formats such as BSON generally do not support).
 
    CBOR implements some extra hints and support for JSON interoperability,
    and the spec offers further guidance for conversion between CBOR and
    JSON. None of this is currently implemented in CBOR::XS, and the guidelines
    in the spec do not result in correct round-tripping of data. If JSON
    interoperability is improved in the future, then the goal will be to
    ensure that decoded JSON data will round-trip encoding and decoding to
    CBOR intact.
 
SECURITY CONSIDERATIONS
    Tl;dr... if you want to decode or encode CBOR from untrusted sources,
    you should start with a coder object created via "new_safe" (which
    implements the mitigations explained below):
 
       my $coder = CBOR::XS->new_safe;
 
       my $data = $coder->decode ($cbor_text);

README  view on Meta::CPAN

    even without bigints.
 
    Disabling bigints will also partially or fully disable types that rely
    on them, e.g. rational numbers that use bignums.
 
CBOR IMPLEMENTATION NOTES
    This section contains some random implementation notes. They do not
    describe guaranteed behaviour, but merely the behaviour as it is
    implemented right now.
 
    64 bit integers are only properly decoded when Perl was built with 64
    bit support.
 
    Strings and arrays are encoded with a definite length. Hashes as well,
    unless they are tied (or otherwise magical).
 
    Only the double data type is supported for NV data types - when Perl
    uses long double to represent floating point values, they might not be
    encoded properly. Half precision types are accepted, but not encoded.
 
    Strict mode and canonical mode are not implemented.
 
LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
    On perls that were built without 64 bit integer support (these are rare
    nowadays, even on 32 bit architectures, as all major Perl distributions
    are built with 64 bit integer support), support for any kind of 64 bit
    value in CBOR is very limited - most likely, these 64 bit values will be
    truncated, corrupted, or otherwise not decoded correctly. This also
    includes string, float, array and map sizes that are stored as 64 bit
    integers.
 
THREADS
    This module is *not* guaranteed to be thread safe and there are no plans
    to change this until Perl gets thread support (as opposed to the
    horribly slow so-called "threads" which are simply slow and bloated
    process simulations - use fork, it's *much* faster, cheaper, better).
 
    (It might actually work, but you have been warned).

XS.pm  view on Meta::CPAN

 $coder = CBOR::XS->new;
 $binary_cbor_data = $coder->encode ($perl_value);
 $perl_value       = $coder->decode ($binary_cbor_data);
 
 # prefix decoding
 
 my $many_cbor_strings = ...;
 while (length $many_cbor_strings) {
    my ($data, $length) = $coder->decode_prefix ($many_cbor_strings);
    # data was decoded
    substr $many_cbor_strings, 0, $length, ""; # remove decoded cbor string
 }
 
=head1 DESCRIPTION
 
This module converts Perl data structures to the Concise Binary Object
Representation (CBOR) and vice versa. CBOR is a fast binary serialisation
format that aims to use an (almost) superset of the JSON data model, i.e.
when you can represent something useful in JSON, you should be able to
represent it in CBOR.

XS.pm  view on Meta::CPAN

=item $enabled = $cbor->get_allow_sharing
 
If C<$enable> is true (or missing), then C<encode> will not double-encode
values that have been referenced before (e.g. when the same object, such
as an array, is referenced multiple times), but instead will emit a
reference to the earlier value.
 
This means that such values will only be encoded once, and will not result
in a deep cloning of the value on decode, in decoders supporting the value
sharing extension. This also makes it possible to encode cyclic data
structures (which need C<allow_cycles> to be enabled to be decoded by this
module).
 
It is recommended to leave it off unless you know your
communication partner supports the value sharing extensions to CBOR
(L<http://cbor.schmorp.de/value-sharing>), as without decoder support, the
resulting data structure might be unusable.
 
Detecting shared values incurs a runtime overhead when values are encoded
that have a reference counter larger than one, and might unnecessarily
increase the encoded size, as potentially shared values are encoded as

XS.pm  view on Meta::CPAN

arrays or hashes pointed to by a reference). Weirder constructs, such as
an array with multiple "copies" of the I<same> string, which are hard but
not impossible to create in Perl, are not supported (this is the same as
with L<Storable>).
 
If C<$enable> is false (the default), then C<encode> will encode shared
data structures repeatedly, unsharing them in the process. Cyclic data
structures cannot be encoded in this mode.
 
This option does not affect C<decode> in any way - shared values and
references will always be decoded properly if present.
 
=item $cbor = $cbor->allow_cycles ([$enable])
 
=item $enabled = $cbor->get_allow_cycles
 
If C<$enable> is true (or missing), then C<decode> will happily decode
self-referential (cyclic) data structures. By default these will not be
decoded, as they need manual cleanup to avoid memory leaks, so code that
isn't prepared for this will not leak memory.
 
If C<$enable> is false (the default), then C<decode> will throw an error
when it encounters a self-referential/cyclic data structure.
 
This option does not affect C<encode> in any way - shared values and
references will always be encoded properly if present.
 
=item $cbor = $cbor->allow_weak_cycles ([$enable])

XS.pm  view on Meta::CPAN

It is recommended to leave it off unless you know your
communications partner supports the stringref extension to CBOR
(L<http://cbor.schmorp.de/stringref>), as without decoder support, the
resulting data structure might not be usable.
 
If C<$enable> is false (the default), then C<encode> will encode strings
the standard CBOR way.
 
This option does not affect C<decode> in any way - string references will
always be decoded properly if present.
 
=item $cbor = $cbor->text_keys ([$enable])
 
=item $enabled = $cbor->get_text_keys
 
If C<$enable> is true (or missing), then C<encode> will encode all
perl hash keys as CBOR text strings/UTF-8 strings, upgrading them as needed.
 
If C<$enable> is false (the default), then C<encode> will encode hash keys
normally - upgraded perl strings (strings internally encoded as UTF-8) as

XS.pm  view on Meta::CPAN

string without checking whether that is, in fact, true or not.
 
=item $cbor = $cbor->filter ([$cb->($tag, $value)])
 
=item $cb_or_undef = $cbor->get_filter
 
Sets or replaces the tagged value decoding filter (when C<$cb> is
specified) or clears the filter (if no argument or C<undef> is provided).
 
The filter callback is called only during decoding, when a non-enforced
tagged value has been decoded (see L<TAG HANDLING AND EXTENSIONS> for a
list of enforced tags). For specific tags, it's often better to provide a
default converter using the C<%CBOR::XS::FILTER> hash (see below).
 
The first argument is the numerical tag, the second is the (decoded) value
that has been tagged.
 
The filter function should return either exactly one value, which will
replace the tagged value in the decoded data structure, or no values,
which will result in default handling, which currently means the decoder
creates a C<CBOR::XS::Tagged> object to hold the tag and the value.
 
When the filter is cleared (the default state), the default filter
function, C<CBOR::XS::default_filter>, is used. This function simply
looks up the tag in the C<%CBOR::XS::FILTER> hash. If an entry exists
it must be a code reference that is called with tag and value, and is
responsible for decoding the value. If no entry exists, it returns no
values. C<CBOR::XS> provides a number of default filter functions already,
and the C<%CBOR::XS::FILTER> hash can be freely extended with more.

XS.pm  view on Meta::CPAN

and receiving CBOR-encoded messages. The solution that works with CBOR and
about anything else is by prepending a length to every CBOR value, so the
receiver knows how many octets to read. More compact (and slightly slower)
would be to just send CBOR values back-to-back, as C<CBOR::XS> knows where
a CBOR value ends, and doesn't need an explicit length.
 
The following methods help with this:
 
=over 4
 
=item @decoded = $cbor->incr_parse ($buffer)
 
This method attempts to decode exactly one CBOR value from the beginning
of the given C<$buffer>. The value is removed from the C<$buffer> on
success. When C<$buffer> doesn't contain a complete value yet, it returns
nothing. Finally, when the C<$buffer> doesn't start with something
that could ever be a valid CBOR value, it raises an exception, just as
C<decode> would. In the latter case the decoder state is undefined and
must be reset before being able to parse further.
 
This method modifies the C<$buffer> in place. When no CBOR value can be
decoded, the decoder stores the current string offset. On the next call,
it continues decoding at the place where it stopped before. For this to make
sense, the C<$buffer> must begin with the same octets as on previous
unsuccessful calls.
 
You can call this method in scalar context, in which case it either
returns a decoded value or C<undef>. This makes it impossible to
distinguish between CBOR null values (which decode to C<undef>) and an
unsuccessful decode, which is often acceptable.
 
=item @decoded = $cbor->incr_parse_multiple ($buffer)
 
Same as C<incr_parse>, but attempts to decode as many CBOR values as
possible in one go, instead of at most one. Calls to C<incr_parse> and
C<incr_parse_multiple> can be interleaved.
 
=item $cbor->incr_reset
 
Resets the incremental decoder. This throws away any saved state, so that
subsequent calls to C<incr_parse> or C<incr_parse_multiple> start to parse
a new CBOR value from the beginning of the C<$buffer> again.

XS.pm  view on Meta::CPAN

CBOR integers become (numeric) perl scalars. On perls without 64 bit
support, 64 bit integers will be truncated or otherwise corrupted.
 
=item byte strings
 
Byte strings will become octet strings in Perl (the byte values 0..255
will simply become characters of the same value in Perl).
 
=item UTF-8 strings
 
UTF-8 strings in CBOR will be decoded, i.e. the UTF-8 octets will be
decoded into proper Unicode code points. At the moment, the UTF-8 octets
are not validated - corrupt input will result in corrupted Perl
strings.
 
=item arrays, maps
 
CBOR arrays and CBOR maps will be converted into references to a Perl
array or hash, respectively. The keys of the map will be stringified
during this process.
 
=item null

XS.pm  view on Meta::CPAN

These methods I<MUST NOT> change the data structure that is being
serialised. Failure to comply with this can result in memory corruption -
and worse.
 
If an object supports neither C<TO_CBOR> nor C<FREEZE>, encoding will fail
with an error.
 
=head3 DECODING
 
Objects encoded via C<TO_CBOR> cannot (normally) be automatically decoded,
but objects encoded via C<FREEZE> can be decoded using the following
protocol:
 
When an encoded CBOR perl object is encountered by the decoder, it will
look up the C<THAW> method, by using the stored classname, and will fail
if the method cannot be found.
 
After the lookup it will call the C<THAW> method with the stored classname
as first argument, the constant string C<CBOR> as second argument, and all
values returned by C<FREEZE> as remaining arguments.

XS.pm  view on Meta::CPAN

=head1 TAG HANDLING AND EXTENSIONS
 
This section describes how this module handles specific tagged values
and extensions. If a tag is not mentioned here and no additional filters
are provided for it, then the default handling applies (creating a
CBOR::XS::Tagged object on decoding, and only encoding the tag when
explicitly requested).
 
Tags not handled specifically are currently converted into a
L<CBOR::XS::Tagged> object, which is simply a blessed array reference
consisting of the numeric tag value followed by the (decoded) CBOR value.
 
Future versions of this module reserve the right to special case
additional tags (such as base64url).
 
=head2 ENFORCED TAGS
 
These tags are always handled when decoding, and their handling cannot be
overridden by the user.
 
=over 4
 
=item 26 (perl-object, L<http://cbor.schmorp.de/perl-object>)
 
These tags are automatically created (and decoded) for serialisable
objects using the C<FREEZE/THAW> methods (the L<Types::Serialiser> object
serialisation protocol). See L<OBJECT SERIALISATION> for details.
 
=item 28, 29 (shareable, sharedref, L<http://cbor.schmorp.de/value-sharing>)
 
These tags are automatically decoded when encountered (and they do not
result in a cyclic data structure, see C<allow_cycles>), resulting in
shared values in the decoded object. They are only encoded, however, when
C<allow_sharing> is enabled.
 
Not all shared values can be successfully decoded: values that reference
themselves will I<currently> decode as C<undef> (this is not the same
as a reference pointing to itself, which will be represented as a value
that contains an indirect reference to itself - these will be decoded
properly).
 
Note that considerably more shared value data structures can be decoded
than will be encoded - currently, only values pointed to by references
will be shared, others will not. While non-reference shared values can be
generated in Perl with some effort, they were considered too unimportant
to be supported in the encoder. The decoder, however, will decode these
values as shared values.
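
A small sketch of the behaviour described above, assuming a hash that is
referenced twice from the same array. The variable names are made up for
this example.

```perl
use strict;
use warnings;
use CBOR::XS;

my $inner = { name => "shared" };
my $data  = [$inner, $inner];    # the same hash is referenced twice

# sharing is only encoded when allow_sharing is enabled
my $coder   = CBOR::XS->new->allow_sharing;
my $decoded = $coder->decode ($coder->encode ($data));

# both elements should point to the same hash again, so a change made
# through one of them is visible through the other
$decoded->[0]{name} = "changed";
print $decoded->[1]{name}, "\n";
```

Without C<allow_sharing>, the encoder would write the hash twice and the
decoded structure would contain two independent copies.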
 
=item 256, 25 (stringref-namespace, stringref, L<http://cbor.schmorp.de/stringref>)
 
These tags are automatically decoded when encountered. They are only
encoded, however, when C<pack_strings> is enabled.
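
The following sketch compares the encoded size with and without
C<pack_strings>. The sample data is invented for illustration, and the
actual savings depend entirely on how often strings repeat.

```perl
use strict;
use warnings;
use CBOR::XS;

# twenty copies of the same string, a best case for string packing
my @data = ("a-frequently-repeated-string") x 20;

my $plain  = CBOR::XS->new;
my $packed = CBOR::XS->new->pack_strings;

my $plain_cbor  = $plain->encode (\@data);
my $packed_cbor = $packed->encode (\@data);

printf "plain: %d bytes, packed: %d bytes\n",
   length $plain_cbor, length $packed_cbor;

# decoding needs no special option, the tags are always understood
my $back = $plain->decode ($packed_cbor);
```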
 
=item 22098 (indirection, L<http://cbor.schmorp.de/indirection>)
 
This tag is automatically generated when a reference is encountered (with
the exception of hash and array references). It is converted to a reference
when decoding.
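
As a short illustration, a scalar reference should survive a round trip as
a scalar reference (a sketch using the functional interface):

```perl
use strict;
use warnings;
use CBOR::XS;

my $text = "hello";
my $ref  = \$text;    # neither a hash nor an array reference

my $decoded = decode_cbor (encode_cbor ($ref));

# the decoded value is a new reference to a copy of the string
print ref $decoded, ": $$decoded\n";
```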
 
=item 55799 (self-describe CBOR, RFC 7049)

XS.pm  view on Meta::CPAN

When any of these need to load additional modules that are not part of the
perl core distribution (e.g. L<URI>), it is (currently) up to the user to
provide these modules. The decoding usually fails with an exception if the
required module cannot be loaded.
 
=over 4
 
=item 0, 1 (date/time string, seconds since the epoch)
 
These tags are decoded into L<Time::Piece> objects. The corresponding
C<Time::Piece::TO_CBOR> method always encodes into tag 1 values currently.
 
The L<Time::Piece> API is generally surprisingly bad, and fractional
seconds are only accidentally kept intact, so watch out. On the plus side,
the module comes with perl since 5.10, which has to count for something.
 
=item 2, 3 (positive/negative bignum)
 
These tags are decoded into L<Math::BigInt> objects. The corresponding
C<Math::BigInt::TO_CBOR> method encodes "small" bigints into normal CBOR
integers, and others into positive/negative CBOR bignums.
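
A minimal sketch of both cases, assuming L<Math::BigInt> is installed (it
ships with perl). The numbers are arbitrary examples.

```perl
use strict;
use warnings;
use Math::BigInt;
use CBOR::XS;

# too large for a native integer, so this is encoded as a CBOR bignum
my $big     = Math::BigInt->new ("123456789012345678901234567890");
my $decoded = decode_cbor (encode_cbor ($big));
print ref $decoded, " ", $decoded->bstr, "\n";

# a "small" bigint is encoded as a normal CBOR integer instead, so it
# comes back as a plain perl number, not as an object
my $small = decode_cbor (encode_cbor (Math::BigInt->new (42)));
print ref $small ? "object" : "plain number", ": $small\n";
```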
 
=item 4, 5, 264, 265 (decimal fraction/bigfloat)
 
Both decimal fractions and bigfloats are decoded into L<Math::BigFloat>
objects. The corresponding C<Math::BigFloat::TO_CBOR> method I<always>
encodes into a decimal fraction (either tag 4 or 264).
 
NaN and infinities are not encoded properly, as they cannot be represented
in CBOR.
 
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
 
=item 30 (rational numbers)
 
These tags are decoded into L<Math::BigRat> objects. The corresponding
C<Math::BigRat::TO_CBOR> method encodes rational numbers with denominator
C<1> via their numerator only, i.e., they become normal integers or
C<bignums>.
 
See L<BIGNUM SECURITY CONSIDERATIONS> for more info.
 
=item 21, 22, 23 (expected later JSON conversion)
 
CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore these
tags.

XS.pm  view on Meta::CPAN

CBOR is supposed to implement a superset of the JSON data model, and is,
with some coercion, able to represent all JSON texts (something that other
"binary JSON" formats such as BSON generally do not support).
 
CBOR implements some extra hints and support for JSON interoperability,
and the spec offers further guidance for conversion between CBOR and
JSON. None of this is currently implemented in CBOR::XS, and the guidelines
in the spec do not result in correct round-tripping of data. If JSON
interoperability is improved in the future, then the goal will be to
ensure that decoded JSON data will round-trip encoding and decoding to
CBOR intact.
 
 
=head1 SECURITY CONSIDERATIONS
 
Tl;dr... if you want to decode or encode CBOR from untrusted sources, you
should start with a coder object created via C<new_safe> (which implements
the mitigations explained below):
 
   my $coder = CBOR::XS->new_safe;
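
Since decoding throws an exception on malformed input, untrusted data is
usually decoded inside an C<eval>. The following is only a sketch: the
two-byte input is a deliberately truncated array made up for this example.

```perl
use strict;
use warnings;
use CBOR::XS;

my $coder = CBOR::XS->new_safe;

# array header announcing two elements, followed by only one of them
my $truncated = "\x82\x01";

my $value;
my $ok = eval { $value = $coder->decode ($truncated); 1 };

print $ok ? "decoded successfully\n" : "decode failed: $@";
```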

XS.pm  view on Meta::CPAN

Disabling bigints will also partially or fully disable types that rely on
them, e.g. rational numbers that use bignums.
 
 
=head1 CBOR IMPLEMENTATION NOTES
 
This section contains some random implementation notes. They do not
describe guaranteed behaviour, but merely the behaviour as currently
implemented.
 
64 bit integers are only properly decoded when Perl was built with 64 bit
support.
 
Strings and arrays are encoded with a definite length. Hashes as well,
unless they are tied (or otherwise magical).
 
Only the double data type is supported for NV data types - when Perl uses
long double to represent floating point values, they might not be encoded
properly. Half precision types are accepted, but not encoded.
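
Two of the notes above can be observed directly. This sketch relies on the
standard CBOR encodings only: C<0x83> is the header of a definite-length
array with three elements, and C<0xf9 0x3c 0x00> is the half precision
representation of 1.0.

```perl
use strict;
use warnings;
use CBOR::XS;

# arrays are written with a definite length
my $array_hex = unpack "H*", encode_cbor ([1, 2, 3]);
print "$array_hex\n";    # 83010203

# half precision floats are accepted when decoding
my $half = decode_cbor ("\xf9\x3c\x00");
print "$half\n";         # 1
```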
 
Strict mode and canonical mode are not implemented.
 
 
=head1 LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
 
On perls that were built without 64 bit integer support (these are rare
nowadays, even on 32 bit architectures, as all major Perl distributions
are built with 64 bit integer support), support for any kind of 64 bit
value in CBOR is very limited - most likely, these 64 bit values will
be truncated, corrupted, or otherwise not decoded correctly. This also
includes string, float, array and map sizes that are stored as 64 bit
integers.
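
Whether a given perl is affected can be checked with the core L<Config>
module. This is a sketch that uses the size of perl's integer type as the
indicator; it is not something the module does itself.

```perl
use strict;
use warnings;
use Config;

# ivsize is the size of perl's native integer type in bytes
my $ivsize = $Config{ivsize};

print $ivsize >= 8
   ? "this perl has 64 bit integer support\n"
   : "this perl lacks 64 bit integers, expect limited CBOR support\n";
```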
 
 
=head1 THREADS
 
This module is I<not> guaranteed to be thread safe and there are no
plans to change this until Perl gets thread support (as opposed to the
horribly slow so-called "threads" which are simply slow and bloated
process simulations - use fork, it's I<much> faster, cheaper, better).

XS.xs  view on Meta::CPAN

  break;
 
case CBOR_TAG_VALUE_SHAREABLE:
  {
    if (ecb_expect_false (!dec->shareable))
      dec->shareable = (AV *)sv_2mortal ((SV *)newAV ());
 
    if (ecb_expect_false (dec->cbor.flags & (F_ALLOW_CYCLES | F_ALLOW_WEAK_CYCLES)))
      {
        // if cycles are allowed, then we store an AV as value
        // while it is being decoded, and gather unresolved
        // references in it, to be resolved after decoding.
        int idx, i;
        AV *av = newAV ();
        av_push (dec->shareable, (SV *)av);
        idx = AvFILLp (dec->shareable);
 
        sv = decode_sv (dec);
 
        // the AV now contains \undef for all unresolved references,
        // so we fix them up here.

XS.xs  view on Meta::CPAN

    sv = AvARRAY (dec->shareable)[idx];
 
    // reference to cycle, we create a new \undef and use that, and also
    // register it in the AV for later fixing
    if (ecb_expect_false (SvTYPE (sv) == SVt_PVAV))
      {
        AV *av = (AV *)sv;
        sv = newRV_noinc (&PL_sv_undef);
        av_push (av, SvREFCNT_inc_NN (sv));
      }
    else if (ecb_expect_false (sv == &PL_sv_undef)) // not yet decoded, but cycles not allowed
      ERR ("cyclic CBOR data structure found, but allow_cycles is not enabled");
    else // we decoded the object earlier, no cycle
      sv = newSVsv (sv);
  }
  break;
 
case CBOR_TAG_PERL_OBJECT:
  {
    if (dec->cbor.flags & F_FORBID_OBJECTS)
      goto filter;
 
    sv = decode_sv (dec);


