package Decode::ARGV;
use strict;
use warnings;
use Encode::Simple qw(decode decode_lax decode_utf8 decode_utf8_lax);
our $VERSION = '1.003';
sub import {
  my ($class, $mode, $encoding) = @_;

  # A single argument other than 'strict' or 'lax' names the encoding
  if (defined $mode and !defined $encoding and $mode ne 'strict' and $mode ne 'lax') {
    $encoding = $mode;
  }

  # Decode into a temporary array so that an exception thrown in strict mode
  # leaves @ARGV untouched
  my @args;
  if (defined $mode and $mode eq 'lax') {
    if (!defined $encoding or lc($encoding) eq 'utf-8') {
      @args = map { decode_utf8_lax $_ } @ARGV;
    } else {
      @args = map { decode_lax $encoding, $_ } @ARGV;
    }
  } else {
    if (!defined $encoding or lc($encoding) eq 'utf-8') {
      @args = map { decode_utf8 $_ } @ARGV;
    } else {
      @args = map { decode $encoding, $_ } @ARGV;
    }
  }

  # only munge @ARGV if we got this far
  @ARGV = @args;
  1;
}
1;

=encoding UTF-8

=head1 NAME

Decode::ARGV - Decode the command-line arguments to characters

=head1 SYNOPSIS

  use Decode::ARGV; # decodes from UTF-8
  use Decode::ARGV 'cp1252';

  $ perl -MDecode::ARGV -E'say "Argument contains only word characters" unless $ARGV[0] =~ m/\W/' 'слово'

=head1 DESCRIPTION

This module provides simple in-place decoding of command-line arguments in the global array
L<@ARGV|perlvar/"@ARGV">. As with most input and output, command-line arguments are provided to the
script as bytes, and must be decoded to characters before performing string operations like
C<length> or regex matches.

The C<-CA> switch for Perl performs a similar function, but it has some deficiencies. It assumes
via the internal C<:utf8> layer that the arguments are valid UTF-8 bytes, so it ignores invalid
Unicode and can even produce malformed strings if the bytes are not well-formed UTF-8. The switch
is also difficult to use in a script and cannot decode arguments from other encodings.

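The difference between bytes and characters can be illustrated with a short sketch. It uses only
the core L<Encode> module as a stand-in for what this module does to each element of C<@ARGV>,
and the hard-coded byte string is an assumption standing in for a real command-line argument.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# UTF-8 bytes of a five-letter Cyrillic word, as a shell would pass them
# (hard-coded here in place of a real $ARGV[0])
my $bytes = "\xD1\x81\xD0\xBB\xD0\xBE\xD0\xB2\xD0\xBE";

# Strict decode from bytes to characters
my $chars = decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC);

# length counts bytes before decoding and characters after
printf "%d bytes, %d characters\n", length($bytes), length($chars);

# The undecoded bytes contain non-word bytes; the decoded string does not
print "bytes contain a non-word character\n" if $bytes =~ m/\W/;
print "characters are all word characters\n" unless $chars =~ m/\W/;
```

The first line printed is C<10 bytes, 5 characters>, and both regex messages are printed,
which is why the SYNOPSIS one-liner only gives the intended answer on decoded arguments.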
=head1 PARAMETERS

  use Decode::ARGV;
  use Decode::ARGV 'lax';
  use Decode::ARGV 'Shift_JIS';
  use Decode::ARGV lax => 'Shift_JIS';

  $ perl -MDecode::ARGV ...
  $ perl -MDecode::ARGV=lax ...
  $ perl -MDecode::ARGV=Shift_JIS ...
  $ perl -MDecode::ARGV=lax,Shift_JIS ...

By default, C<Decode::ARGV> will decode C<@ARGV> in-place using L<Encode::Simple/decode_utf8>,
which will throw an exception if any argument is not valid, well-formed UTF-8. In that case
C<@ARGV> is left unmodified. C<lax> can be specified as the first (optional) import parameter to
instead use L<Encode::Simple/decode_utf8_lax>, leaving replacement characters in the resulting
strings instead of throwing an exception. C<strict> is also accepted as the first parameter and
selects the default behavior explicitly.

The next optional import parameter specifies an alternate encoding to expect from the command
line, in which case L<Encode::Simple/decode> or L<Encode::Simple/decode_lax> will be used to
decode the arguments from that encoding. If a single parameter is given and it is neither
C<strict> nor C<lax>, it is taken as the encoding.

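The practical difference between the strict and lax modes can be sketched with the core
L<Encode> module. This is only an approximation under the assumption that strict decoding
behaves like C<FB_CROAK> and lax decoding like C<FB_DEFAULT>; it is not the implementation of
L<Encode::Simple>, and the sample byte string is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK FB_DEFAULT LEAVE_SRC);

# Latin-1 bytes for "cafe" with an accented final letter: not valid UTF-8
my $bad = "caf\xE9";

# Strict: decoding throws, so eval returns undef and the input is untouched
my $strict = eval { decode('UTF-8', $bad, FB_CROAK | LEAVE_SRC) };
print "strict: ", (defined $strict ? "decoded" : "threw an exception"), "\n";

# Lax: the invalid byte becomes U+FFFD REPLACEMENT CHARACTER
my $lax = decode('UTF-8', $bad, FB_DEFAULT | LEAVE_SRC);
printf "lax: last character is U+%04X\n", ord(substr($lax, -1));
```

Strict mode suits scripts that should refuse malformed input outright, while lax mode suits
scripts that prefer to continue and show a visible marker where the input was invalid.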
=head1 BUGS

Report any issues on the public bugtracker.

=head1 AUTHOR

Dan Book <dbook@cpan.org>

=head1 COPYRIGHT AND LICENSE

This software is Copyright (c) 2021 by Dan Book.

This is free software, licensed under:

  The Artistic License 2.0 (GPL Compatible)

=head1 SEE ALSO

L<Encode::Simple>