Apache-GuessCharset

MANIFEST

Changes
MANIFEST
Makefile.PL
README
lib/Apache/GuessCharset.pm
t/00_compile.t
t/01_guess.t
t/sjis.html

README

NAME
    Apache::GuessCharset - adds HTTP charset by guessing file's encoding

SYNOPSIS
      PerlModule Apache::GuessCharset
      SetHandler perl-script
      PerlFixupHandler Apache::GuessCharset

      # how many bytes to read for guessing (default 512)
      PerlSetVar GuessCharsetBufferSize 1024

      # list of encoding suspects
      PerlSetVar GuessCharsetSuspects euc-jp
      PerlAddVar GuessCharsetSuspects shiftjis
      PerlAddVar GuessCharsetSuspects 7bit-jis

DESCRIPTION
    Apache::GuessCharset is an Apache handler which adds the HTTP charset
    attribute by automatically guessing a file's encoding via Encode::Guess.

CONFIGURATION
    This module uses the following configuration variables.

    GuessCharsetSuspects
        A list of encodings for "Encode::Guess" to check. See the
        Encode::Guess manpage for details.

    GuessCharsetBufferSize
        Specifies how many bytes this module reads from the source file
        in order to guess its encoding. The default is 512.
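
    The two variables above can be illustrated outside Apache. The
    following is a minimal sketch, not the module's own code: read_head
    and guess_name are hypothetical helpers, and a temporary file stands
    in for the file being served. It reads at most 512 bytes and passes
    them to guess_encoding from Encode::Guess together with the suspects
    from the SYNOPSIS.

```perl
use strict;
use warnings;
use Encode::Guess;              # exports guess_encoding()
use File::Temp qw(tempfile);

# Hypothetical helper: read at most $size bytes from the start of $path.
sub read_head {
    my ($path, $size) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read $fh, my $chunk, $size;
    close $fh;
    return $chunk;
}

# Hypothetical helper: return the guessed encoding name, or undef when
# the chunk is empty or Encode::Guess cannot settle on one encoding.
sub guess_name {
    my ($chunk, @suspects) = @_;
    return undef unless defined $chunk && length $chunk;
    my $enc = guess_encoding($chunk, @suspects);
    # guess_encoding returns an object on success, an error string otherwise
    return ref $enc ? $enc->name : undef;
}

# A temporary file stands in for the served document (an assumption).
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "<html><body>plain ASCII text</body></html>\n";
close $fh;

my $chunk = read_head($path, 512);   # 512 mirrors the documented default
my $name  = guess_name($chunk, qw(euc-jp shiftjis 7bit-jis));
print defined $name ? "guessed: $name\n" : "no unambiguous guess\n";
```

    Returning undef instead of dying mirrors the way the handler simply
    declines the request when no encoding can be determined.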

AUTHOR
    Tatsuhiko Miyagawa <miyagawa@bulknews.net>

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    the Encode::Guess manpage, the Apache::File manpage

lib/Apache/GuessCharset.pm

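    my $r = shift;

    # Only handle main requests for existing text/* files that do not
    # already carry a charset, and only if some content could be read.
    return DECLINED if
        ! $r->is_main                  or
        $r->content_type !~ m@^text/@  or
        $r->content_type =~ /charset=/ or
        ! -e $r->finfo                 or
        -d _                           or
        !(my $chunk = read_chunk($r));

    # Ask Encode::Guess to choose among the configured suspects.
    my @suspects = $r->dir_config->get('GuessCharsetSuspects');
    my $enc  = guess_encoding($chunk, @suspects);
    unless (ref $enc) {
        # On failure guess_encoding returns an error string, not an object.
        warn "Couldn't guess encoding: $enc" if $DEBUG;
        return DECLINED;
    }

    # Map the Encode name to its IANA name, preferring an entry in
    # %Prefered_MIME when one exists.
    my $iana    = iana_charset_name($enc->name);
    my $charset = lc($Prefered_MIME{$iana} || $iana); # lowercased
    warn "Guessed: $charset" if $DEBUG;
    $r->content_type($r->content_type . "; charset=$charset");
    return OK;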
}
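
The content-type checks in the excerpt above can be tried without a running Apache. The sketch below is an illustration only: with_charset is a hypothetical helper, not part of the module, and it returns the content type unchanged where the real handler would return DECLINED.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the handler's content-type checks:
# append a lowercased charset to text/* types that lack one.
sub with_charset {
    my ($content_type, $charset) = @_;
    return $content_type if $content_type !~ m{^text/};   # not a text type
    return $content_type if $content_type =~ /charset=/;  # already labelled
    return $content_type . '; charset=' . lc $charset;
}

print with_charset('text/html', 'Shift_JIS'), "\n";
print with_charset('text/html; charset=utf-8', 'EUC-JP'), "\n";
print with_charset('image/png', 'EUC-JP'), "\n";
```

An existing charset is never overwritten, so a type set explicitly elsewhere in the configuration takes precedence over the guess.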

lib/Apache/GuessCharset.pm

    my $buffer_size = $r->dir_config('GuessCharsetBufferSize') || 512;
    read $fh, my($chunk), $buffer_size;
    return $chunk;
}

1;
__END__

=head1 NAME

Apache::GuessCharset - adds HTTP charset by guessing file's encoding

=head1 SYNOPSIS

  SetHandler perl-script
  PerlFixupHandler +Apache::GuessCharset

  # how many bytes to read for guessing (default 512)
  PerlSetVar GuessCharsetBufferSize 1024

  # list of encoding suspects
  PerlSetVar GuessCharsetSuspects euc-jp
  PerlAddVar GuessCharsetSuspects shiftjis
  PerlAddVar GuessCharsetSuspects 7bit-jis

=head1 DESCRIPTION

Apache::GuessCharset is an Apache fix-up handler which adds the HTTP
charset attribute by automatically guessing text files' encodings via
Encode::Guess.

=head1 CONFIGURATION

This module uses the following configuration variables.

=over 4

=item GuessCharsetSuspects

A list of encodings for C<Encode::Guess> to check. See
L<Encode::Guess> for details.

=item GuessCharsetBufferSize

Specifies how many bytes this module reads from the source file in
order to guess its encoding. The default is 512.

=back

=head1 AUTHOR

Tatsuhiko Miyagawa E<lt>miyagawa@bulknews.netE<gt>

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.


