CSS-Inliner


README

    http://www.w3.org/TR/html5/syntax.html

    NOTE: If no charset can be identified, the library handles the content
    as a mix of UTF-8/CP-1252/8859-1/ASCII by attempting to use the
    Encoding::FixLatin module, as this combination is relatively common in
    the wild. Finally, if Encoding::FixLatin is unavailable, the content is
    treated as ASCII.
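
    The fallback order in the note above can be sketched roughly as
    follows. This is an illustration only, not the library's own code:
    decode_unknown_charset is a hypothetical helper name, and the sketch
    assumes that Encoding::FixLatin's fix_latin function is the entry
    point used when the module is installed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper for illustration; not part of CSS::Inliner.
sub decode_unknown_charset {
  my ($bytes) = @_;

  # Encoding::FixLatin is optional, so load it at runtime and
  # fall through quietly when it is not installed.
  if (eval { require Encoding::FixLatin; 1 }) {
    # Handles mixed UTF-8/CP-1252/8859-1/ASCII input
    return Encoding::FixLatin::fix_latin($bytes);
  }

  # Last resort: treat the content as ASCII. With no CHECK argument,
  # Encode substitutes non-ASCII bytes instead of dying.
  return decode('ascii', $bytes);
}

my $text = decode_unknown_charset("caf\xE9 au lait");
```

    Loading the optional module inside eval keeps the ASCII path usable on
    systems where Encoding::FixLatin is not installed.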

    Input Parameters:
     content - scalar presumably containing both html and css
     charset - (optional) programmer specified charset for the passed content
     ctcharset - (optional) content-type specified charset for content
       retrieved via a url

  decode_characters
    Implements the character decoding algorithm for HTML as outlined by the
    various working groups.

    In short, it applies best practices to determine the character encoding
    in use and decodes the content accordingly.

    It is expected that this method will be called before any calls to

lib/CSS/Inliner.pm

http://www.w3.org/TR/html5/syntax.html

NOTE: If no charset can be identified, the library handles the content as a mix of
UTF-8/CP-1252/8859-1/ASCII by attempting to use the Encoding::FixLatin module, as this combination
is relatively common in the wild. Finally, if Encoding::FixLatin is unavailable, the content is
treated as ASCII.

Input Parameters:
 content - scalar presumably containing both html and css
 charset - (optional) programmer specified charset for the passed content
 ctcharset - (optional) content-type specified charset for content retrieved via a url

=cut

sub detect_charset {
  my ($self,$params) = @_;

  $self->_check_object();

  unless ($params && $$params{content}) {
    croak "You must pass content for content character decoding";

lib/CSS/Inliner.pm

  $self->_configure_tree({ tree => $extract_tree });

  $extract_tree->parse_content($$params{content});

  my $head = $extract_tree->look_down("_tag", "head"); # there should only be one

  my $meta_charset;
  if ($head) {
    # pull key header meta elements
    my $meta_charset_elem = $head->look_down('_tag','meta','charset',qr/./);
    my $meta_equiv_charset_elem = $head->look_down('_tag','meta','http-equiv',qr/content-type/i,'content',qr/./);

    # assign meta charset; the meta http-equiv content-type takes precedence
    if ($meta_equiv_charset_elem) {
      my $meta_equiv_content = $meta_equiv_charset_elem->attr('content');

      # leverage charset allowable chars from https://tools.ietf.org/html/rfc2978
      if ($meta_equiv_content =~ /charset(?:\s*)=(?:\s*)([\w!#$%&'\-+^`{}~]+)/i) {
        $meta_charset = find_encoding($1);
      }
    }
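
The charset extraction in the excerpt above can be tried in isolation with a small standalone sketch. It is not the module's code: charset_from_content_type is a hypothetical helper name, and the `$` inside the character class is escaped here as a precaution so that it is matched literally. It relies only on the core Encode module, whose find_encoding returns an encoding object or undef for an unknown name.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper for illustration; not part of CSS::Inliner.
sub charset_from_content_type {
  my ($content) = @_;

  # Capture the token that follows "charset=", allowing optional
  # whitespace around "=" (allowed characters per RFC 2978).
  if ($content =~ /charset\s*=\s*([\w!#\$%&'\-+^`{}~]+)/i) {
    my $encoding = find_encoding($1);

    # find_encoding returns undef for names Encode does not know
    return $encoding ? $encoding->name : undef;
  }

  return;
}

my $name = charset_from_content_type('text/html; charset=ISO-8859-1');
print defined $name ? "$name\n" : "no charset found\n";
```

Returning the canonical name from Encode, rather than the raw captured token, means that aliases of the same encoding compare equal later on.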

t/html/relaxed.html

<!DOCTYPE html>
<html lang="en">
  <body>
    <span>dafdsfdsaF</span>
  </body>
  <section>dafdsfdsaF</section>
  <meta name="content-type" value="utf8" />
  <!--Stylesheet-->
  <style type="text/css">
    section { color: red; font-size: 20px }
    section { color: blue; font-size: 17px; } /* Comment */
  </style>
  <article>
    <button type="button">Click here!</button>
  </article>
  <nav>
    <a href="/html/">HTML</a>

t/html/relaxed_result.html

<!DOCTYPE html>
<html lang="en">
 <body><span>dafdsfdsaF</span></body><section style="font-size: 17px; color: #FF0000;">dafdsfdsaF</section><meta name="content-type" value="utf8" />
 <!--Stylesheet--><article style="border-radius: 22px;"><button style="font-family: Helvetica;" type="button">Click here!</button></article> <nav> <a href="/html/">HTML</a> <big>New!</big> <a href="/css/">CSS</a> <a href="/js/">JavaScript</a> <a href...
 </head>
 <body><samp>Sample output from a <mark>computer</mark> program</samp><meta content="HTML,CSS,XML,JavaScript" name="keywords" /> Text<wbr />break </body><section style="font-size: 17px; color: #FF0000;"> What about this text <hr /></section><script>
   document.write("Hello World!")
 </script>
</html>


