Catalyst-Plugin-Params-Demoronize

Changes  view on Meta::CPAN

1.14 - Fri Nov  5 16:10:13 CDT 2010
 - switch to MRO::Compat (RT #54653)
 - switch to Dist::Zilla

1.13 - Tue Nov 17 15:31:14 CST 2009
 - fix warnings from undefined strings,
   courtesy of Cory Watson <gphat@cpan.org>
 - update MANIFEST

1.12 - Wed Apr 29 12:14:51 CDT 2009
 - add unicode tests, also courtesy of Chisel

1.11 - Mon Apr 27 16:02:55 CDT 2009
 - fix failing tests, thanks to the patch provided by
   Chisel Wright <chisel@cpan.org>

1.1 - Tue Feb 17 14:20:05 CST 2009
 - major fixes to the unicode portion of the conversion,
   courtesy of Cory Watson <gphat@cpan.org>

1.0 - Thu Nov 20 12:51:37 CST 2008
 - initial release

MANIFEST  view on Meta::CPAN

Changes
LICENSE
MANIFEST
META.yml
Makefile.PL
README
dist.ini
lib/Catalyst/Plugin/Params/Demoronize.pm
t/00-compile.t
t/demoronize_unicode.t
t/demoronize_windows1252.t
t/release-pod-syntax.t

README  view on Meta::CPAN

NAME
    Catalyst::Plugin::Params::Demoronize - convert common UTF-8 and
    Windows-1252 characters to their ASCII equivalents

SYNOPSIS
      # Be sure to use the Unicode plugin if you want to handle Unicode
      # replacement.
      use Catalyst qw(Unicode Demoronize);

      # Optionally enable replacement of common Unicode "smart" characters.
      MyApp->config->{demoronize} = { replace_unicode => 1 };

DESCRIPTION
    To borrow a few passages from the documentation packaged with John
    Walker's demoronizer.pl:

        ...as is usually the case when you encounter something shoddy in the
        vicinity of a computer, Microsoft incompetence and gratuitous
        incompatibility were to blame. Western language HTML documents are
        written in the ISO 8859-1 Latin-1 character set, with a specified
        set of escapes for special characters. Blithely ignoring this

README  view on Meta::CPAN

        single and double quotes are similarly transformed (even though
        ASCII already contains apostrophe and single open quote characters),
        and double hyphens are replaced by the incompatible em dash symbol.
        What other horrors occur, I know not. If the user notices this
        happening at all, their reaction might be "Thank you Billy-boy--that
        looks ever so much nicer," not knowing they've been set up to look
        like a moron to folks all over the world.

    These characters are commonly inserted into form elements via cut and
    paste operations. In many cases, they are converted to UTF-8 by the
    browser. This plugin will replace both the Unicode characters AND the
    Windows-1252 characters with sane ASCII equivalents.
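
    As a rough illustration of the Windows-1252 half of that conversion, the
    sketch below maps a handful of Windows-1252 punctuation bytes to ASCII.
    It is not the plugin's code: the name demoronize_cp1252 and the choice of
    table entries are assumptions made for this example. Only the byte values
    come from the Windows-1252 code page.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the plugin's API.
# Maps a few Windows-1252 punctuation bytes to ASCII stand-ins.
my %CP1252_TO_ASCII = (
    "\x91" => "'",    # LEFT SINGLE QUOTATION MARK
    "\x92" => "'",    # RIGHT SINGLE QUOTATION MARK
    "\x93" => '"',    # LEFT DOUBLE QUOTATION MARK
    "\x94" => '"',    # RIGHT DOUBLE QUOTATION MARK
    "\x95" => '*',    # BULLET
    "\x96" => '-',    # EN DASH
    "\x97" => '-',    # EM DASH
);

sub demoronize_cp1252 {
    my ($str) = @_;
    return $str unless defined $str;

    # Replace each byte found in the table; leave everything else alone.
    $str =~ s/([\x91-\x97])/$CP1252_TO_ASCII{$1}/g;
    return $str;
}

# Prints: pasted "smart quotes" string
print demoronize_cp1252("pasted \x93smart quotes\x94 string"), "\n";
```

    A lookup table keeps the mapping easy to extend, and the early return
    avoids warnings when a parameter is undefined.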

UNICODE
    Demoronize assumes that you are using Catalyst::Plugin::Unicode to
    convert incoming parameters into Unicode characters. If you enable the
    optional "replace_unicode" flag without it, the Unicode replacements
    may not be applied as expected.
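
    To see why decoding matters, the sketch below compares a raw UTF-8 byte
    string with its decoded form. It uses the core Encode module rather than
    Catalyst::Plugin::Unicode, and count_smart_quotes is a hypothetical
    helper written only for this illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for: LEFT DOUBLE QUOTATION MARK, "hi",
# RIGHT DOUBLE QUOTATION MARK.
my $bytes = "\xE2\x80\x9Chi\xE2\x80\x9D";

# Decoded character string, roughly what a decoding layer hands over.
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: count the smart double quote characters in a string.
sub count_smart_quotes {
    my ($str) = @_;
    my $count = () = $str =~ /[\x{201C}\x{201D}]/g;
    return $count;
}

printf "undecoded: %d match(es), length %d\n",
    count_smart_quotes($bytes), length $bytes;
printf "decoded:   %d match(es), length %d\n",
    count_smart_quotes($chars), length $chars;
```

    The undecoded string is eight bytes long and contains no U+201C or
    U+201D characters, so a character-based pattern finds nothing to
    replace. The decoded string is four characters long and matches twice.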

CONFIG
  replace_unicode
    If this flag is enabled (it is off by default) then commonly substituted
    Unicode characters will be converted to their ASCII equivalents.

  replace_map
    A map of Unicode characters and their ASCII equivalents that will be
    swapped. This can be overridden, but defaults to:

METHODS
    prepare_parameters
        Converts parameters.
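
    The interaction between "replace_unicode" and "replace_map" can be
    sketched as a stand-alone function. This is an approximation under
    stated assumptions, not the plugin's implementation: apply_replace_map is
    a hypothetical name, and the map entries below are illustrative rather
    than the plugin's defaults.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function; the plugin's real internals may differ.
sub apply_replace_map {
    my ($str, $config) = @_;

    # Do nothing for undefined values or when the flag is off.
    return $str unless defined $str && $config->{replace_unicode};

    my $map = $config->{replace_map} || {};
    for my $from (keys %{$map}) {
        # \Q...\E treats the key as a literal string, not a pattern.
        $str =~ s/\Q$from\E/$map->{$from}/g;
    }
    return $str;
}

# Example config with an overridden map (illustrative entries only).
my $config = {
    replace_unicode => 1,
    replace_map     => {
        "\x{201C}" => '"',      # LEFT DOUBLE QUOTATION MARK
        "\x{201D}" => '"',      # RIGHT DOUBLE QUOTATION MARK
        "\x{2026}" => '...',    # HORIZONTAL ELLIPSIS
    },
};

# Prints: "wait..."
print apply_replace_map("\x{201C}wait\x{2026}\x{201D}", $config), "\n";
```

    In a Catalyst application the same two keys would sit under
    MyApp->config->{demoronize}, as shown in the SYNOPSIS.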

lib/Catalyst/Plugin/Params/Demoronize.pm  view on Meta::CPAN

=head1 NAME

Catalyst::Plugin::Params::Demoronize - convert common UTF-8 and Windows-1252 characters to their ASCII equivalents

=head1 SYNOPSIS

  # Be sure to use the Unicode plugin if you want to handle Unicode
  # replacement.
  use Catalyst qw(Unicode Demoronize);

  # Optionally enable replacement of common Unicode "smart" characters.
  MyApp->config->{demoronize} = { replace_unicode => 1 };

=head1 DESCRIPTION

To borrow a few passages from the documentation packaged
with John Walker's demoronizer.pl:

=over 4

...as is usually the case when you encounter something
shoddy in the vicinity of a computer, Microsoft incompetence

lib/Catalyst/Plugin/Params/Demoronize.pm  view on Meta::CPAN

this happening at all, their reaction might be "Thank you
Billy-boy--that looks ever so much nicer," not knowing
they've been set up to look like a moron to folks all over
the world.

=back

These characters are commonly inserted into form elements
via cut and paste operations.  In many cases, they are
converted to UTF-8 by the browser.  This plugin will replace
both the Unicode characters AND the Windows-1252 characters
with sane ASCII equivalents.

=head1 UNICODE

Demoronize assumes that you are using L<Catalyst::Plugin::Unicode>
to convert incoming parameters into Unicode characters.  If you enable
the optional C<replace_unicode> flag without it, the Unicode
replacements may not be applied as expected.

=head1 CONFIG

=head2 replace_unicode

If this flag is enabled (it is off by default) then commonly substituted
Unicode characters will be converted to their ASCII equivalents.

=head2 replace_map

A map of Unicode characters and their ASCII equivalents that will be swapped.
This can be overridden, but defaults to:

=cut

lib/Catalyst/Plugin/Params/Demoronize.pm  view on Meta::CPAN

        '’' => "'",     # 92, RIGHT SINGLE QUOTATION MARK
        '“' => '"',     # 93, LEFT DOUBLE QUOTATION MARK
        '”' => '"',     # 94, RIGHT DOUBLE QUOTATION MARK
        '•' => '*',     # 95, BULLET
        '–' => '-',     # 96, EN DASH
        '—' => '-',     # 97, EM DASH
        '‹' => '<',     # 8B, SINGLE LEFT-POINTING ANGLE QUOTATION MARK
        '›' => '>',     # 9B, SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
    };

    # Only substitute when replace_unicode is enabled and there is a
    # defined string to work on.
    if ($config->{replace_unicode} && defined $str) {
        my $map = $config->{replace_map};

        foreach my $replace (keys %{$map}) {
            # \Q...\E keeps any regex metacharacters in a key literal.
            $str =~ s/\Q$replace\E/$map->{$replace}/g;
        }
    }

    return $str;
}

t/demoronize_unicode.t  view on Meta::CPAN

use Test::MockObject::Extends;
use Test::MockObject;

use_ok('Catalyst::Plugin::Params::Demoronize');

my $c      = Test::MockObject::Extends->new('Catalyst::Plugin::Params::Demoronize');
my $req    = Test::MockObject->new;
my $params = {};

$c->set_always(req => $req);
$c->set_always(config => { demoronize => {replace_unicode => 1} });
$req->set_always(params => $params);

# pasted smart quotes from:
# http://office.microsoft.com/en-gb/word/HA101732421033.aspx
$params->{string} = q{pasted “smart quotes” string};
$c->prepare_parameters;
is_deeply($params, { string => q{pasted "smart quotes" string} }, 'pasted smart quotes');

# unicode smart quotes from:
# http://office.microsoft.com/en-gb/word/HA101732421033.aspx
$params->{string} = qq<unicoded \x{201c}smart quotes\x{201d} string>;
$c->prepare_parameters;
is_deeply($params, { string => q{unicoded "smart quotes" string} }, 'unicoded smart quotes');

# pasted phrase from
# http://office.microsoft.com/en-gb/word/HA101732421033.aspx
$params->{string} = qq<Click the AutoFormat As You Type tab, and under Replace as you type, select or clear the "Straight quotes" with “smart quotes” check box.>;
$c->prepare_parameters;
is_deeply($params, { string => q{Click the AutoFormat As You Type tab, and under Replace as you type, select or clear the "Straight quotes" with "smart quotes" check box.} }, 'pasted phrase from Microsoft site');

__DATA__
You see, “state of the art” Microsoft Office applications sport a nifty
feature called “smart quotes”.  (Rule of thumb – every time Microsoft


