MathML-Entities

README

     # convert named HTML entities to utf-8 characters:
     $utf8 = name2utf8($html);    # utf8
 
DESCRIPTION
    MathML::Entities is a content conversion filter for named XHTML+MathML
    entities. There are over two thousand named entities in the XHTML+MathML
    DTD. All the entities defined in the XHTML+MathML DTD, except the five
    "safe" ones ("&lt;", "&gt;", "&amp;", "&quot;", "&apos;"), will be
    converted to the equivalent numeric character references or to utf-8
    characters. Named entities which are not in the XHTML+MathML DTD are
    escaped: the leading "&" becomes "&amp;", so "&foo;" turns into
    "&amp;foo;". This makes the resulting XHTML (or XHTML+MathML) safe for
    consumption by non-validating XML parsers.
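
    The behaviour described above can be illustrated with a minimal,
    self-contained sketch. It is not the module's implementation: the
    function name sketch_name2numbered and the four-entry table are
    hypothetical, with code points taken from the module's test suite.
    The real module covers the whole DTD and also normalizes uppercase
    forms such as "&AMP;", which this sketch leaves out.

```perl
use strict;
use warnings;

# Hypothetical four-entry excerpt; the real table holds over two
# thousand names. Code points are the ones used in t/conversions.t.
my %TABLE = (
    copy   => 0x000A9,
    nbsp   => 0x000A0,
    conint => 0x0222E,
    Ffr    => 0x1D509,
);

# The five "safe" entities pass through unchanged.
my %SAFE = map { $_ => 1 } qw(amp lt gt quot apos);

# Illustrative stand-in for name2numbered(), not the module's code.
sub sketch_name2numbered {
    my ($text) = @_;
    $text =~ s{&([A-Za-z][A-Za-z0-9]*);}{
        $SAFE{$1}         ? "&$1;"                             # keep safe five
      : exists $TABLE{$1} ? sprintf('&#x%05X;', $TABLE{$1})    # known name
      :                     "&amp;$1;"                         # unknown: escape
    }ge;
    return $text;
}

print sketch_name2numbered('&copy;&nbsp;2004 by &foo; &amp; co'), "\n";
# prints "&#x000A9;&#x000A0;2004 by &amp;foo; &amp; co"
```

    Escaping only the ampersand of an unknown name keeps the original
    text visible to the reader while ensuring that a non-validating
    parser never meets an undeclared entity.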

    Unlike in HTML::Entities, the mapping between MathML named entities and
    codepoints is many-to-one: several names can denote the same codepoint.
    There is therefore little sense in providing an inverse function that
    takes codepoints to named entities.
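
    A short sketch of why an inverse is ill-defined. The three names
    below are, to the best of my knowledge, all defined as U+222E in the
    MathML entity set ("conint" is confirmed by the module's tests); the
    hash names are hypothetical and this is not the module's own table.

```perl
use strict;
use warnings;

# Hypothetical excerpt: three entity names for U+222E (CONTOUR INTEGRAL).
my %name2codepoint = (
    conint          => 0x222E,
    oint            => 0x222E,
    ContourIntegral => 0x222E,
);

# Inverting the hash can keep only one name per code point, and which
# one survives depends on hash ordering.
my %codepoint2name = reverse %name2codepoint;

printf "%d names, %d code point(s)\n",
    scalar(keys %name2codepoint), scalar(keys %codepoint2name);
# prints "3 names, 1 code point(s)"
```

    Any inverse function would have to pick one of the names
    arbitrarily, so a round trip from name to codepoint and back cannot
    be expected to return the original name.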

    Based on: HTML::Entities by Koichi Taniguchi <taniguchi@livedoor.jp>

FUNCTIONS
    The following functions are exported by default.

lib/MathML/Entities.pm

 $utf8 = name2utf8($html);    # utf8 
 
=head1 DESCRIPTION

MathML::Entities is a content conversion filter for named
XHTML+MathML entities. There are over two thousand named entities in the
XHTML+MathML DTD. All the entities defined in the XHTML+MathML DTD,
except the five "safe" ones (C<&lt;>, C<&gt;>, C<&amp;>, C<&quot;>, C<&apos;>),
will be converted to the equivalent numeric character references or to utf-8 characters.
Named entities which are not in the XHTML+MathML DTD are escaped: the leading
C<&> becomes C<&amp;>, so C<&foo;> turns into C<&amp;foo;>. This makes
the resulting XHTML (or XHTML+MathML) safe for consumption by non-validating
XML parsers.

Unlike in HTML::Entities, the mapping between MathML named entities and codepoints
is many-to-one: several names can denote the same codepoint. There is therefore
little sense in providing an inverse function that takes codepoints to named entities.

Based on: HTML::Entities by Koichi Taniguchi E<lt>taniguchi@livedoor.jpE<gt>

=head1 FUNCTIONS

t/conversions.t

#!/usr/bin/perl -w

use Test::Simple tests => 12;

use MathML::Entities;

ok(name2numbered('&copy;&nbsp;2004') eq '&#x000A9;&#x000A0;2004', 'XHTML entities to numeric char refs');
ok(name2utf8('&copy;&nbsp;2004') eq chr(169).chr(160).'2004', 'XHTML entities to utf-8');
ok(name2numbered('by &foo;') eq 'by &amp;foo;', 'Unknown entities I');
ok(name2utf8('by &foo;') eq 'by &amp;foo;', 'Unknown entities II');
ok(name2numbered('&amp;, &lt;, &gt;, &apos; &quot;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Safe five I');
ok(name2utf8('&amp;, &lt;, &gt;, &apos; &quot;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Safe five II');
ok(name2numbered('&AMP;, &LT;, &GT;, &APOS; &QUOT;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Uppercase safe five I');
ok(name2utf8('&AMP;, &LT;, &GT;, &APOS; &QUOT;') eq '&amp;, &lt;, &gt;, &apos; &quot;', 'Uppercase safe five II');
ok(name2numbered('&conint;d&Ffr;') eq '&#x0222E;d&#x1D509;', 'MathML entities to numeric char refs');
ok(name2utf8('&conint;d&Ffr;') eq chr(8750).'d'.chr(120073), 'MathML entities to utf-8');
ok(name2numbered('&ThickSpace;&bne;') eq '&#x0205F;&#x0200A;&#x0003D;&#x020E5;', 'Multiple character refs');
ok(name2utf8('&ThickSpace;&bne;') eq chr(8287).chr(8202).chr(61).chr(8421), 'Multiple utf-8 characters');


