Cpanel-JSON-XS
- support arbitrary stringification with encode, with convert_blessed
and allow_blessed.
- ithread support. Cpanel::JSON::XS is thread-safe, JSON::XS is not.
- is_bool can be called as a method, JSON::XS::is_bool cannot.
- performance optimizations for threaded Perls
- relaxed mode, allowing many popular extensions
- protect our magic object from corruption by wrong or missing external
methods, like FREEZE/THAW or serialization with other methods.
- additional fixes for:
- #208 - no security-relevant out-of-bounds reading of module memory
when decoding hash keys without ending ':'
- [cpan #88061] AIX atof without USE_LONG_DOUBLE
- #10 unshare_hek crash
- #7, #29 avoid re-blessing where possible. It fails in JSON::XS for
READONLY values, i.e. restricted hashes.
- #41 overloading of booleans, use the object not the reference.
- #62 -Dusequadmath conversion and no SEGV.
- #72 parsing of values followed by \0, like 1\0, does fail.
- #72 parsing of illegal unicode or non-unicode characters.
- #96 locale-insensitive numeric conversion.
- #154 numeric conversion fixed since 5.22, using the same strtold as perl5.
- #167 sort tied hashes with canonical.
- #212 fix utf8 object stringification
- public maintenance and bugtracker
- use ppport.h, sanify XS.xs comment styles, harness C coding style
- common::sense is optional. When available it is not used in the
published production module, just during development and testing.
- extended testsuite, passes all
http://seriot.ch/projects/parsing_json.html tests. In fact it is the
only known JSON decoder which does so, while also being the fastest.
- support many more options and methods from JSON::PP: stringify_infnan,
allow_unknown, allow_stringify, allow_barekey, encode_stringify,
allow_bignum, allow_singlequote, dupkeys_as_arrayref, sort_by
(partially), escape_slash, convert_blessed, ... optional decode_json(,
allow_nonref) arg. relaxed implements allow_dupkeys.
- support all 5 Unicode BOMs: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE,
UTF-32BE, encoding internally to UTF-8.
FUNCTIONAL INTERFACE
The following convenience methods are provided by this module. They are
exported by default:
$json_text = encode_json $perl_scalar, [json_type]
Converts the given Perl data structure to a UTF-8 encoded, binary
string (that is, the string contains octets only). Croaks on error.
This function call is functionally identical to:
$json_text = Cpanel::JSON::XS->new->utf8->encode ($perl_scalar, $json_type)
except being faster.
For the type argument see Cpanel::JSON::XS::Type.
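As a minimal sketch of the equivalence described above (the sample hash and the variable names are illustrative and not taken from the module's documentation), the functional and the object-oriented call should produce the same octets:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;    # exports encode_json and decode_json by default

# Illustrative data: a single key, so the output does not depend on hash ordering.
my $perl_scalar = { name => "caf\x{e9}" };

my $via_function = encode_json($perl_scalar);
my $via_object   = Cpanel::JSON::XS->new->utf8->encode($perl_scalar);

# Both results are binary strings; the e-acute is emitted as two UTF-8 octets.
print $via_function eq $via_object ? "identical\n" : "different\n";
print length($via_function), " octets\n";
```

The optional type argument is left out here; see Cpanel::JSON::XS::Type for it.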
$perl_scalar = decode_json $json_text [, $allow_nonref [, my $json_type
] ]
The opposite of "encode_json": expects a UTF-8 (binary) string
containing a JSON reference (an array or object) and tries to parse it
as a UTF-8 encoded JSON text, returning the resulting reference.
Croaks on error.
This function call is functionally identical to:
$perl_scalar = Cpanel::JSON::XS->new->utf8->decode ($json_text, $json_type)
except being faster.
Note that decode_json in Cpanel::JSON::XS versions older than 3.0116,
and in JSON::XS, did not set allow_nonref but still accepted
non-reference values due to a bug in the decoder.
If the new optional 2nd argument $allow_nonref is set and true,
the "allow_nonref" option will be set and the function will act as
described in the relaxed RFC 7159, allowing all values such as
objects, arrays, strings, numbers, "null", "true", and "false". See
""OLD" VS. "NEW" JSON (RFC 4627 VS. RFC 7159)" below for why you
don't want to do that.
For the 3rd optional type argument see Cpanel::JSON::XS::Type.
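The effect of the optional second argument can be sketched as follows. The inputs are made up for illustration, and the call without the flag is wrapped in eval because whether it croaks depends on the installed version, as noted above:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# A JSON reference (array or object) needs no extra argument.
my $list = decode_json('[1,2,3]');

# A bare string is a non-reference value, so request allow_nonref
# through the optional second argument.
my $string = decode_json('"hello"', 1);

# Without the flag, a non-reference text is expected to croak on
# versions that enforce the rule; eval keeps the sketch version-neutral.
my $strict = eval { decode_json('"hello"') };
my $status = defined $strict ? "accepted" : "rejected";

print scalar(@$list), " elements, string '$string', strict call $status\n";
```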
$is_boolean = Cpanel::JSON::XS::is_bool $scalar
Returns true if the passed scalar represents either "JSON::PP::true"
or "JSON::PP::false", two constants that act like 1 and 0,
respectively and are used to represent JSON "true" and "false"
values in Perl. (Also recognizes the booleans produced by JSON::XS.)
See MAPPING, below, for more information on how JSON values are
mapped to Perl.
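A short sketch of is_bool in use, assuming the default settings (that is, "unblessed_bool" not enabled); the input text is illustrative:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

# Two JSON booleans and one ordinary number.
my $decoded = decode_json('[true,false,1]');

for my $value (@$decoded) {
    # is_bool is called with its full package name here.
    my $kind = Cpanel::JSON::XS::is_bool($value) ? "JSON boolean" : "plain scalar";
    print "$value is a $kind\n";
}
```

The number 1 and the JSON "true" both behave as true values in Perl, so is_bool is the way to tell them apart after decoding.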
DEPRECATED FUNCTIONS
from_json
from_json has been renamed to decode_json
to_json
to_json has been renamed to encode_json
A FEW NOTES ON UNICODE AND PERL
# first parse the initial "["
for (;;) {
    sysread $fh, my $buf, 65536
        or die "read error: $!";
    $json->incr_parse ($buf); # void context, so no parsing
    # Exit the loop once we found and removed(!) the initial "[".
    # In essence, we are (ab-)using the $json object as a simple scalar
    # we append data to.
    last if $json->incr_text =~ s/^ \s* \[ //x;
}
# now we have skipped the initial "[", so continue
# parsing all the elements.
for (;;) {
    # in this loop we read data until we got a single JSON object
    for (;;) {
        if (my $obj = $json->incr_parse) {
            # do something with $obj
            last;
        }
        # add more data
        sysread $fh, my $buf, 65536
            or die "read error: $!";
        $json->incr_parse ($buf); # void context, so no parsing
    }
    # in this loop we read data until we either found and parsed the
    # separating "," between elements, or the final "]"
    for (;;) {
        # first skip whitespace
        $json->incr_text =~ s/^\s*//;
        # if we find "]", we are done
        if ($json->incr_text =~ s/^\]//) {
            print "finished.\n";
            exit;
        }
        # if we find ",", we can continue with the next element
        if ($json->incr_text =~ s/^,//) {
            last;
        }
        # if we find anything else, we have a parse error!
        if (length $json->incr_text) {
            die "parse error near ", $json->incr_text;
        }
        # else add more data
        sysread $fh, my $buf, 65536
            or die "read error: $!";
        $json->incr_parse ($buf); # void context, so no parsing
    }
}
This is a complex example, but most of the complexity comes from the
fact that we are trying to be correct (bear with me if I am wrong, I
never ran the above example :).
BOM
Detects all Unicode Byte Order Marks on decode: UTF-8, UTF-16LE,
UTF-16BE, UTF-32LE and UTF-32BE.
The BOM encoding is set only for one specific decode call, it does not
change the state of the JSON object.
Warning: With perls older than 5.20 you need to load the Encode module
before decoding input with a multibyte BOM, i.e. UTF-16 or UTF-32.
Otherwise an error is thrown. This is an implementation limitation and
might get fixed later.
See <https://tools.ietf.org/html/rfc7159#section-8.1> *"JSON text SHALL
be encoded in UTF-8, UTF-16, or UTF-32."*
*"Implementations MUST NOT add a byte order mark to the beginning of a
JSON text", "implementations (...) MAY ignore the presence of a byte
order mark rather than treating it as an error".*
See also <http://www.unicode.org/faq/utf_bom.html#BOM>.
Beware that Cpanel::JSON::XS is currently the only JSON module which
does accept and decode a BOM.
The latest JSON spec
<https://www.greenbytes.de/tech/webdav/rfc8259.html#character.encoding>
forbids the usage of UTF-16 or UTF-32; the character encoding must be
UTF-8. Thus in subsequent updates BOMs of UTF-16 or UTF-32 will throw
an error.
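A minimal sketch of BOM handling, limited to the UTF-8 BOM since the other encodings are slated to be rejected. The input strings are illustrative:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $json = Cpanel::JSON::XS->new->utf8;

# The three octets EF BB BF form the UTF-8 byte order mark.
my $with_bom = "\xEF\xBB\xBF" . '["a","b"]';
my $data = $json->decode($with_bom);
print scalar(@$data), " elements decoded\n";

# The same object still decodes BOM-less input afterwards,
# since the BOM only affects the one decode call.
my $plain = $json->decode('["c"]');
print "then: $plain->[0]\n";
```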
MAPPING
This section describes how Cpanel::JSON::XS maps Perl values to JSON
values and vice versa. These mappings are designed to "do the right
thing" in most circumstances automatically, preserving round-tripping
characteristics (what you put in comes out as something equivalent).
For the more enlightened: note that in the following descriptions,
lowercase *perl* refers to the Perl interpreter, while uppercase *Perl*
refers to the abstract Perl language itself.
JSON -> PERL
object
A JSON object becomes a reference to a hash in Perl. No ordering of
object keys is preserved (JSON does not preserve object key ordering
itself).
array
A JSON array becomes a reference to an array in Perl.
string
A JSON string becomes a string scalar in Perl - Unicode codepoints
in JSON are represented by the same codepoints in the Perl string,
so no manual decoding is necessary.
number
A JSON number becomes either an integer, numeric (floating point) or
string scalar in perl, depending on its range and any fractional
parts. On the Perl level, there is no difference between those as
Perl handles all the conversion details, but an integer may take
slightly less memory and might represent more values exactly than
floating point numbers.
If the number consists of digits only, Cpanel::JSON::XS will try to
represent it as an integer value. If that fails, it will try to
represent it as a numeric (floating point) value if that is possible
without loss of precision. Otherwise it will preserve the number as
a string value (in which case you lose roundtripping ability, as the
JSON number will be re-encoded to a JSON string).
Numbers containing a fractional or exponential part will always be
represented as numeric (floating point) values, possibly at a loss
of precision (in which case you might lose perfect roundtripping
ability, but the JSON number will still be re-encoded as a JSON
number).
Note that precision is not accuracy - binary floating point values
cannot represent most decimal fractions exactly, and when converting
from and to floating point, "Cpanel::JSON::XS" only guarantees
precision up to but not including the least significant bit.
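The object, array, string and number mappings described so far can be sketched in one call. The JSON text is made up for illustration, and the numbers are re-encoded before being used in any other way so that the round trip is not disturbed:

```perl
use strict;
use warnings;
use Cpanel::JSON::XS;

my $data = decode_json('{"list":[42,1.5],"text":"caf\u00e9"}');

print ref($data), "\n";             # object -> hash reference
print ref($data->{list}), "\n";     # array  -> array reference
print length($data->{text}), "\n";  # string length is counted in characters

# An integer and a number with a fractional part, encoded again as JSON.
my $numbers = encode_json($data->{list});
print "$numbers\n";
```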
true, false
When "unblessed_bool" is set to true, then JSON "true" becomes 1 and
JSON "false" becomes 0.
Otherwise these JSON atoms become "JSON::PP::true" and
"JSON::PP::false", respectively. They are "JSON::PP::Boolean"
objects and are overloaded to act almost exactly like the numbers 1
and 0. You can check whether a scalar is a JSON boolean by using the