HTML-Detoxifier
lib/HTML/Detoxifier.pm
and securely. Tags are divided into functional groups, each of which can be
allowed or disallowed as you wish. Additionally, HTML::Detoxifier knows how to
clean inline CSS; with HTML::Detoxifier, you can safely let users supply
style sheets without opening cross-site scripting vulnerabilities. (Yes, it is
possible to execute JavaScript from CSS!)

In addition to this main purpose, HTML::Detoxifier cleans up some common
HTML mistakes: all tags are closed, empty elements are written in valid XML
form (that is, with a trailing /), and images lacking the ALT text that
HTML 4.0 requires are given a plain ALT attribute. The module does its best to
emit valid XHTML 1.0; it even adds an XML declaration and a DOCTYPE
declaration where needed.
=cut
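
The remark about JavaScript in CSS refers to constructs such as Internet Explorer's expression(...), script URLs inside url(...), and the behavior and -moz-binding properties. The following is a minimal sketch of how such declarations might be filtered out of an inline style value. The helper name strip_css_scripts and its patterns are assumptions made for illustration; this is not the module's own remove_scripts_from_css, whose body does not appear in these excerpts.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Detoxifier): drop declarations
# from an inline style value that are known script vectors.
sub strip_css_scripts {
    my ($css) = @_;
    my @safe;
    for my $decl (split /;/, $css) {
        next unless $decl =~ /\S/;                              # skip empty pieces
        next if $decl =~ /expression\s*\(/i;                    # IE dynamic properties
        next if $decl =~ /(?:java|vb)script\s*:/i;              # script URLs in url(...)
        next if $decl =~ /^\s*(?:behavior|-moz-binding)\s*:/i;  # script bindings
        $decl =~ s/^\s+|\s+$//g;                                # trim whitespace
        push @safe, $decl;
    }
    return join '; ', @safe;
}

print strip_css_scripts(
    'color: red; width: expression(alert(1)); background: url(javascript:alert(2))'
), "\n";
```

Splitting on every semicolon is deliberately naive: it mishandles semicolons inside quoted strings, and it does not decode CSS escape sequences, so a production filter would need a real CSS tokenizer.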
use constant TAG_GROUPS => {
    links => {
        a => undef,
        area => undef,
        link => undef,
        map => undef
    },
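
A table of this shape can be used as a set: the values are all undef, so membership has to be tested with exists rather than by truth. The sketch below shows one way a group lookup could work. DEMO_GROUPS and tag_is_disallowed are hypothetical names, and the images group is an assumption; only the links entries are taken from the excerpt above.

```perl
use strict;
use warnings;

# Illustrative table in the same shape as TAG_GROUPS. Only the "links"
# entries come from the excerpt; the "images" group is an assumption.
use constant DEMO_GROUPS => {
    links  => { a => undef, area => undef, link => undef, map => undef },
    images => { img => undef },
};

# Hypothetical helper: true if $tag belongs to any group named in @disallowed.
sub tag_is_disallowed {
    my ($tag, @disallowed) = @_;
    for my $group (@disallowed) {
        my $members = DEMO_GROUPS->{$group} or next;   # unknown group: ignore
        return 1 if exists $members->{ lc $tag };      # values are undef, so use exists
    }
    return 0;
}

print tag_is_disallowed('A', 'links')           ? "drop\n" : "keep\n";
print tag_is_disallowed('p', 'links', 'images') ? "drop\n" : "keep\n";
```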
lib/HTML/Detoxifier.pm
    delete $attrs{style} if exists $attrs{style};
    delete $attrs{class} if exists $attrs{class};
    delete $attrs{id} if exists $attrs{id}
} elsif (exists $opts{disallow}{dynamic}) {
    $attrs{style} = remove_scripts_from_css $attrs{style} if
        $attrs{style}
}
if (lc $token->[1] eq 'html') {
    # Add a valid XML declaration and a doctype. HTML::Detoxifier
    # converts everything to XHTML 1.0, so we might as well
    # qualify it!
    $out = <<"ENDDECL" . $out;
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
ENDDECL
    $attrs{xmlns} = "http://www.w3.org/1999/xhtml"
        unless $attrs{xmlns};
    $attrs{lang} = "en-US" unless $attrs{lang};
}
$out .= "<" . lc $token->[1];
while (my ($key, $value) = each %attrs) {
    $value = encode_entities $value;
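
The excerpt stops inside the attribute loop, so the rest of the serialization is not shown. As a rough, self-contained sketch of the same idea (lowercase the tag name, entity-encode each attribute value, emit name="value" pairs), something like the following would work. escape_attr is a small stand-in for HTML::Entities::encode_entities, and start_tag is a hypothetical helper; neither is the module's own code. Keys are sorted here because each returns pairs in no guaranteed order; lowercasing attribute names is a choice made for this sketch, in line with XHTML's lowercase-name rule.

```perl
use strict;
use warnings;

# Minimal stand-in for HTML::Entities::encode_entities, covering only the
# characters that matter inside a double-quoted attribute value.
sub escape_attr {
    my ($value) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $value =~ s/([&<>"])/$entity{$1}/g;
    return $value;
}

# Hypothetical helper: build an XHTML-style start tag from a name and attributes.
sub start_tag {
    my ($name, %attrs) = @_;
    my $out = '<' . lc $name;
    for my $key (sort keys %attrs) {    # sorted so the output is reproducible
        $out .= ' ' . lc($key) . '="' . escape_attr($attrs{$key}) . '"';
    }
    return $out . '>';
}

print start_tag('A', href => '/search?q=a&b="c"', title => 'x < y'), "\n";
```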