HTML-Detoxifier


lib/HTML/Detoxifier.pm  view on Meta::CPAN

and securely. Tags are divided into functional groups, each of which can be
disallowed or allowed as you wish. Additionally, HTML::Detoxifier knows how to
clean inline CSS; with HTML::Detoxifier, you can securely allow users to use
style sheets without allowing cross-site scripting vulnerabilities. (Yes, it is
possible to execute JavaScript from CSS!)

In addition to this main purpose, HTML::Detoxifier cleans up some common
HTML mistakes: all tags are closed, empty tags are converted to valid
XML (that is, with a trailing /), and images that lack the ALT text required
by HTML 4.0 are given a plain ALT attribute. The module does its best to emit
valid XHTML 1.0; it even adds an XML declaration and a DOCTYPE declaration
where needed.
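
The cleanup rules described above can be sketched in a few lines of core Perl. This is only an illustration, not the module's implementation: the names `xhtml_start_tag`, `escape_attr`, and `%EMPTY` are hypothetical, the list of empty elements is an assumption, and `escape_attr` is a minimal stand-in for the `encode_entities` call the module uses.

```perl
use strict;
use warnings;

# Assumed set of elements that never have content (not the module's own list).
my %EMPTY = map { ($_ => 1) } qw(br hr img input meta link area);

# Minimal stand-in for entity encoding of attribute values.
sub escape_attr {
    my ($value) = @_;
    $value =~ s/&/&amp;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    $value =~ s/"/&quot;/g;
    return $value;
}

# Serialize one start tag as XHTML: lowercase names, quoted attributes,
# an alt attribute on every img, and a trailing slash on empty elements.
sub xhtml_start_tag {
    my ($name, %attrs) = @_;
    $name = lc $name;
    my %lc_attrs = map { (lc($_) => $attrs{$_}) } keys %attrs;
    $lc_attrs{alt} = '' if $name eq 'img' && !exists $lc_attrs{alt};

    my $out = "<$name";
    # Sort the keys so the output order is deterministic.
    for my $key (sort keys %lc_attrs) {
        $out .= qq{ $key="} . escape_attr($lc_attrs{$key}) . q{"};
    }
    $out .= $EMPTY{$name} ? ' />' : '>';
    return $out;
}

print xhtml_start_tag('IMG', SRC => 'a&b.png'), "\n";   # <img alt="" src="a&amp;b.png" />
print xhtml_start_tag('p', class => 'note'), "\n";      # <p class="note">
```

Sorting the attribute names is a choice made for this sketch so that the output is reproducible; iterating a hash with `each` gives no guaranteed order.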

=cut

use constant TAG_GROUPS => {
	links => {
		a => undef,
		area => undef,
		link => undef,
		map => undef
	},
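
As a rough sketch of how a hash of groups like this can be consulted: each group is used as a set, so the values are `undef` and membership has to be tested with `exists` rather than by truth value. The constant `DEMO_TAG_GROUPS`, the `images` entry, and the function `tag_is_disallowed` are hypothetical names chosen for this example and are not taken from the module.

```perl
use strict;
use warnings;

# Illustrative groups only; the module's real TAG_GROUPS constant is larger.
use constant DEMO_TAG_GROUPS => {
    links  => { a => undef, area => undef, link => undef, map => undef },
    images => { img => undef },    # assumed group, for the demo only
};

# Return 1 if $tag belongs to any group named in @disallowed, else 0.
sub tag_is_disallowed {
    my ($tag, @disallowed) = @_;
    $tag = lc $tag;
    for my $group (@disallowed) {
        my $members = DEMO_TAG_GROUPS->{$group} or next;    # skip unknown groups
        # Values are undef, so test for the key, not for a true value.
        return 1 if exists $members->{$tag};
    }
    return 0;
}

print tag_is_disallowed('A', 'links') ? "blocked\n" : "kept\n";    # blocked
print tag_is_disallowed('p', 'links') ? "blocked\n" : "kept\n";    # kept
```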

lib/HTML/Detoxifier.pm  view on Meta::CPAN

				delete $attrs{style} if exists $attrs{style};
				delete $attrs{class} if exists $attrs{class};
				delete $attrs{id} if exists $attrs{id}
			} elsif (exists $opts{disallow}{dynamic}) {
				$attrs{style} = remove_scripts_from_css $attrs{style} if
					$attrs{style}
			}
			
			if (lc $token->[1] eq 'html') {	
				# Add a valid XML declaration and a doctype. HTML::Detoxifier
				# converts everything to XHTML 1.0, so we might as well
				# qualify it!

				$out = <<"ENDDECL" . $out;
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
ENDDECL

				$attrs{xmlns} = "http://www.w3.org/1999/xhtml"
					unless $attrs{xmlns};
				$attrs{lang} = "en-US" unless $attrs{lang};	
			}

			$out .= "<" . lc $token->[1];
			while (my ($key, $value) = each %attrs) {
				$value = encode_entities $value;


