MathML-Entities
# convert named HTML entities to character references:
$utf8 = name2utf8($html); # utf8
DESCRIPTION
MathML::Entities is a content conversion filter for named XHTML+MathML
entities. There are over two thousand named entities in the XHTML+MathML
DTD. All the entities defined in the XHTML+MathML DTD, except the five
"safe" ones ("&lt;", "&gt;", "&amp;", "&quot;", "&apos;"), will be
converted to the equivalent numeric character references or to utf-8
characters. Named entities which are not in the XHTML+MathML DTD are
escaped (for example, "&foo;" becomes "&amp;foo;"). This makes the
resulting XHTML (or XHTML+MathML) safe for consumption by non-validating
XML parsers.
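The three rules above (convert known names, pass the safe five through, escape unknown names) can be sketched in a few lines of Perl. This is only an illustration, not the module's implementation: the three-entry table, the helper names convert_one and sketch_name2numbered, and the zero-padded hexadecimal form of the references are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical three-entry excerpt; the real table holds over two
# thousand names. The zero-padded hex form is an assumption.
my %ENTITY = (
    copy => '&#x000A9;',
    nbsp => '&#x000A0;',
    oint => '&#x0222E;',
);

# The five "safe" names that XML parsers understand without a DTD.
my %SAFE = map { $_ => 1 } qw(lt gt amp quot apos);

# Decide what a single named entity turns into.
sub convert_one {
    my ($name) = @_;
    return "&$name;"      if $SAFE{$name};            # safe five: unchanged
    return $ENTITY{$name} if exists $ENTITY{$name};   # known: numeric reference
    return "&amp;$name;";                             # unknown: escaped
}

# Illustrative stand-in for a named-to-numeric filter.
sub sketch_name2numbered {
    my ($text) = @_;
    $text =~ s/&([A-Za-z][A-Za-z0-9]*);/convert_one($1)/ge;
    return $text;
}

print sketch_name2numbered('&copy;&nbsp;2004 &lt; &foo;'), "\n";
# prints: &#x000A9;&#x000A0;2004 &lt; &amp;foo;
```

Because every remaining named entity is one of the safe five, a parser that never reads the DTD can still process the output.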
Unlike in HTML::Entities, the mapping between MathML named entities and
codepoints is many-to-one. Therefore, there is little sense in providing
an inverse function that takes codepoints back to named entities.
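A small sketch may make the many-to-one point concrete. The hash below is a hypothetical four-entry excerpt written for this example (it is not the module's data structure); it relies on the fact that the MathML names rarr, rightarrow and RightArrow all denote U+2192.

```perl
use strict;
use warnings;

# Hypothetical excerpt: three MathML names share the codepoint U+2192.
my %CODEPOINT = (
    rarr       => 0x2192,
    rightarrow => 0x2192,
    RightArrow => 0x2192,
    oint       => 0x222E,
);

# Group the names by codepoint to expose the ambiguity of an inverse lookup.
my %names_for;
push @{ $names_for{ $CODEPOINT{$_} } }, $_ for sort keys %CODEPOINT;

for my $cp (sort { $a <=> $b } keys %names_for) {
    printf "U+%04X <- %s\n", $cp, join ', ', @{ $names_for{$cp} };
}
```

An inverse function would have to pick one of the three names for U+2192 arbitrarily, so a round trip could not be guaranteed to restore the original text.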
Based on: HTML::Entities by Koichi Taniguchi <taniguchi@livedoor.jp>
FUNCTIONS
The following functions are exported by default.
lib/MathML/Entities.pm
$utf8 = name2utf8($html); # utf8
=head1 DESCRIPTION
MathML::Entities is a content conversion filter for named
XHTML+MathML entities. There are over two thousand named entities in the
XHTML+MathML DTD. All the entities defined in the XHTML+MathML DTD,
except the five "safe" ones (C<&lt;>, C<&gt;>, C<&amp;>, C<&quot;>, C<&apos;>),
will be converted to the equivalent numeric character references or to utf-8 characters.
Named entities which are not in the XHTML+MathML DTD are escaped. This makes
the resulting XHTML (or XHTML+MathML) safe for consumption by non-validating
XML parsers.
Unlike in HTML::Entities, the mapping between MathML named entities and codepoints
is many-to-one. Therefore, there is little sense in providing an inverse
function that takes codepoints back to named entities.
Based on: HTML::Entities by Koichi Taniguchi E<lt>taniguchi@livedoor.jpE<gt>
=head1 FUNCTIONS
t/conversions.t
#!/usr/bin/perl -w
use Test::Simple tests => 12;
use MathML::Entities;
ok(name2numbered('&copy;&nbsp;2004') eq '&#x000A9;&#x000A0;2004', 'XHTML entities to numeric char refs');
ok(name2utf8('&copy;&nbsp;2004') eq chr(169).chr(160).'2004', 'XHTML entities to utf-8');
ok(name2numbered('by &foo;') eq 'by &amp;foo;', 'Unknown entities I');
ok(name2utf8('by &foo;') eq 'by &amp;foo;', 'Unknown entities II');
ok(name2numbered('&amp;, &lt;, &gt;, &apos; &quot;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Safe five I');
ok(name2utf8('&amp;, &lt;, &gt;, &apos; &quot;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Safe five II');
ok(name2numbered('&AMP;, &LT;, &GT;, &APOS; &QUOT;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Uppercase safe five I');
ok(name2utf8('&AMP;, &LT;, &GT;, &APOS; &QUOT;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Uppercase safe five II');
ok(name2numbered('&oint;d&Ffr;') eq '&#x0222E;d&#x1D509;', 'MathML entities to numeric char refs');
ok(name2utf8('&oint;d&Ffr;') eq chr(8750).'d'.chr(120073), 'MathML entities to utf-8');
ok(name2numbered('&ThickSpace;&bne;') eq '&#x0205F;&#x0200A;&#x0003D;&#x020E5;', 'Multiple character refs');
ok(name2utf8('&ThickSpace;&bne;') eq chr(8287).chr(8202).chr(61).chr(8421), 'Multiple utf-8 characters');