HTML-Detergent


lib/HTML/Detergent.pm

C<E<lt>metaE<gt>> elements into the C<E<lt>headE<gt>>. This enables
the inclusion of metadata and the re-association of the main content
with links that represent aspects of the page which have been removed
(e.g. navigation, copyright statement). In addition, if the
page's URI is supplied to the L</process> method, the
C<E<lt>baseE<gt>> element is either added or rewritten to reflect it,
and the URI attributes in the body are rewritten relative to the base.
If no URI is supplied, they are left alone.

The document returned is an L<XML::LibXML::Document> object using the
XHTML namespace, C<http://www.w3.org/1999/xhtml>, but does not profess
to validate against any particular schema. If DTD declarations
(including the empty C<E<lt>!DOCTYPE htmlE<gt>> recommended in HTML5)
are desired, they can be added afterward. Likewise, the object can
be serialized as HTML rather than XML using
L<XML::LibXML::Document/toStringHTML>.
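
As a minimal sketch of the two post-processing steps just described,
the following adds an empty doctype and serializes the result as HTML.
It does not call this module at all: the C<$doc> built here is a
hypothetical stand-in for the document object returned by
L</process>, and its content is invented for illustration. Only
L<XML::LibXML> methods are used.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical stand-in for the XML::LibXML::Document that the
# processor returns; the markup below is illustrative only.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>
XML

# Add an internal subset with no public or system identifier
# (the HTML5-style empty doctype), unless one is already present.
$doc->createInternalSubset('html', undef, undef)
    unless $doc->internalSubset;

# Serialize as XML (XHTML) ...
my $xhtml = $doc->toString;

# ... or as HTML, using libxml2's HTML serializer.
my $html = $doc->toStringHTML;

print $html;
```

Checking C<internalSubset> first avoids adding a second declaration to
a document that already carries one.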

=head1 METHODS

=head2 new %CONFIG | \%CONFIG | $CONFIG

Initialize the processor, either with a list of configuration

t/data/about.html


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head>

<title>Information Architecture Institute > About Us - Information Architecture Institute</title>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta http-equiv="Content-Language" content="en-us" />
<meta http-equiv="imagetoolbar" content="false" />
<meta name="MSSmartTagsPreventParsing" content="true" />
<meta name="google-site-verification" content="lKF0xMZ8xxuNdr8KfR9FmJe2gkZChc0LNR_Q7swbZQo" />


