App-psort
if ($unique) {
$code .= ' }';
}
eval $code;
die "Error while evaluating '$code': $@" if $@;
for (@data) {
print $_->[0];
}
}
sub add_psort {
    my($file_or_fh) = @_;
    my($fh, $diag); # $diag names the input in warnings
    # Accept either an already opened file handle or a file name.
    if (ref $file_or_fh eq 'GLOB') {
        $fh = $file_or_fh;
        if ($file_or_fh == \*STDIN) {
            $diag = 'standard input';
        } else {
            $diag = 'provided file handle'; # currently cannot happen
        }
    } else {
        my $file = $file_or_fh;
        $diag = $file;
        open $fh, '<', $file or die "Can't open $file: $!";
    }
    while(<$fh>) {
        my $line = $_;
        my $res = $cb->($_); # force scalar context
        if (!defined $res) {
            if (!$no_warnings) {
                warn "Uninitialized value returned in psort eval or regexp at $diag line $.\n";
            }
            $res = '';
        }
        # Normalize the sort key according to the -f, -b and -i options.
        $res = uc $res if $ignore_case;
        $res =~ s{^\s+}{} if $ignore_leading_blanks;
        $res =~ s{[[:^print:]]}{}g if $ignore_nonprinting;
        $res =~ s{\r?\n$}{}; # remove newline at end
        push @data, [$line, $res]; # keep the original line together with its key
    }
}
__END__
=head1 NAME
psort - a perl-enhanced sort
=head1 SYNOPSIS
psort [OPTION]... [FILE]...
=head1 DESCRIPTION
A perl-enhanced variant of L<sort(1)>. The specified files (or
standard input) are written sorted to standard output.
By default, sorting is done using perl's L<< cmp|perlop/cmp >>
operator, without any use of locales or encodings.
=head2 OPTIONS
=over
=item -b, --ignore-leading-blanks
Ignore any whitespace character (C<\s>) at the beginning of a line.
=item -c, --check
Do not output anything. Just check if the input is sorted and return
the exit value 0 for sorted and 1 for unsorted.
=item -C, --compare-function
Sort using a custom perl function. For convenience, omit the enclosing
"sub {" and "}"; only the function body is given. As in perl's sort,
the variables C<$a> and C<$b> are available.
Examples:
=over
=item * Reimplementing the C<-n> switch:
-C '$a <=> $b'
=item * Using locale comparisons:
-C 'use locale; $a cmp $b'
=back
Note that it is possible to put C<BEGIN { ... }> blocks into the
comparison function.
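As a rough illustration of what such an option has to do, the following sketch wraps a comparison snippet in C<sub { ... }> and compiles it with a string eval. The helper name C<make_comparator> is hypothetical and this is not necessarily how psort builds its code internally.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a user-supplied snippet in "sub { ... }" and
# compile it. $a and $b are exempt from "use strict", as in any sort block.
sub make_comparator {
    my ($snippet) = @_;
    my $cmp = eval "sub { $snippet }";
    die "Error while evaluating '$snippet': $@" if $@;
    return $cmp;
}

# The snippet from the -n example above.
my $numeric = make_comparator('$a <=> $b');

my @input  = (10, 9, 100, 1);
my @sorted = sort $numeric @input;
print "@sorted\n";
```

Because the snippet is compiled once, a C<BEGIN { ... }> block inside it also runs only once, which is what makes one-time setup in the comparison function practical.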
=item -e, --field-function
Extract the sorting field (or the sorting key) using a custom perl
function. For convenience, omit the enclosing "sub {" and "}"; only the
function body is given. The current line is available in the variable
C<$_>. The value of the last expression is used as the field for
comparisons.
Examples:
=over
=item * Using just the identity:
-e '$_'
=item * Using only the first four characters for comparisons:
-e 'substr($_, 0, 4)'
=item * Using a regular expression:
-e '/(\d+) wallclock/ && $1'
=back
Note that it is possible to put C<BEGIN { ... }> blocks into the
field function.
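The idea behind a field function can be sketched in plain perl: every line is paired with the key computed from C<$_>, the pairs are sorted by key, and the original lines are printed. The sample lines and the variable names are made up for this sketch, and the keys are compared numerically here, as C<-n> would do.

```perl
use strict;
use warnings;

# Stand-in for the -e snippet '/(\d+) wallclock/ && $1'; it reads the
# current line from $_, just like a field function does.
my $field = sub { /(\d+) wallclock/ && $1 };

# Made-up sample input.
my @lines = (
    "run b: 12 wallclock secs\n",
    "run a: 3 wallclock secs\n",
    "run c: 7 wallclock secs\n",
);

# Pair each line with its key, sort by key, then drop the key again.
my @sorted = map  { $_->[0] }
             sort { $a->[1] <=> $b->[1] }
             map  { [ $_, scalar $field->() ] } @lines;

print @sorted;
```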
=item --rx
Use a regular expression for extracting the sorting field. If a
capture group is detected in the regexp, then this capture group is
used for the extraction, otherwise the whole matched portion is used.
For example, the above-mentioned C<-e> snippet
    -e '/(\d+) wallclock/ && $1'
could be written as
    --rx '(\d+) wallclock'
Only the first capture group is used, others are ignored (for now).
The capture group detection code just uses a heuristic, which may fail
in special cases.
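The documented behaviour can be approximated as follows. Note the assumption: this sketch decides at match time whether a first capture group matched, whereas psort is described as using a heuristic on the regexp itself, so the two may differ in corner cases. The helper name C<extract_field> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group if it matched,
# otherwise the whole matched portion; nothing if there is no match.
sub extract_field {
    my ($rx, $line) = @_;
    $line =~ $rx or return;
    return defined $1 ? $1 : $&;
}

my $line = "took 42 wallclock secs";

my $with_group = extract_field(qr/(\d+) wallclock/, $line);
my $no_group   = extract_field(qr/\d+ wallclock/,   $line);

print "$with_group\n";
print "$no_group\n";
```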
=item -f, --ignore-case
Fold all characters to their uppercase versions for comparison.
=item -i, --ignore-nonprinting
Ignore non-printing characters (everything matching the C<<
[[:^print:]] >> character class) for comparison.
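The effect of C<-f>, C<-b> and C<-i> on a sort key can be sketched with a small helper that applies the same substitutions as the program source. The helper name C<normalize_key> and the option names in C<%opt> are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: normalize a sort key the way -f, -b and -i are
# documented to do. Only the key changes; the output line stays as it was.
sub normalize_key {
    my ($key, %opt) = @_;
    $key = uc $key             if $opt{ignore_case};
    $key =~ s{^\s+}{}          if $opt{ignore_leading_blanks};
    $key =~ s{[[:^print:]]}{}g if $opt{ignore_nonprinting};
    return $key;
}

# Leading blanks, mixed case and a BEL character (\x07) in the middle.
my $key = normalize_key(
    "  \tfoo\x07Bar",
    ignore_case           => 1,
    ignore_leading_blanks => 1,
    ignore_nonprinting    => 1,
);
print "$key\n";
```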
=item -Mmodule[=import]
Load a perl module. The syntax is the same as perl's C<-M> option.
=item -mmodule[=import]
Load a perl module without default import. The syntax is the same as perl's
C<-M> option.
=item -n, --numeric-sort
Sort numerically. This uses perl's L<< <=>|perlop/<=> >> operator.
=item -N, --natural-sort
Sort using L<Sort::Naturally>, if available.
=item -r, --reverse
Reverse the result of comparisons.
=item -u, --unique
Output is made unique for adjacent lines. If C<-c> is specified, then
check for strict ordering (adjacent equal lines are considered
unsorted).
=item -v, --version
Print psort's version.
=item -V, --version-sort
Sort versions using L<CPAN::Version>, if available.
=item -X, --no-warnings
By default psort warns if a custom field function or rx returns an
undefined value. These warnings may be suppressed with this option.
=back
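The interaction of C<-c> and C<-u> described above can be sketched as a check over the sort keys. The helper name C<check_sorted> is hypothetical, and the default C<cmp> comparison is assumed; the return value follows the documented exit values (0 for sorted, 1 for unsorted).

```perl
use strict;
use warnings;

# Hypothetical helper: return 0 if the keys are sorted, 1 otherwise.
# With $unique set, two equal neighbours also count as unsorted.
sub check_sorted {
    my ($unique, @keys) = @_;
    for my $i (1 .. $#keys) {
        my $c = $keys[$i - 1] cmp $keys[$i];
        return 1 if $c > 0 || ($unique && $c == 0);
    }
    return 0;
}

my @keys = qw(a b b c);
print "plain check:  ", check_sorted(0, @keys), "\n";
print "strict check: ", check_sorted(1, @keys), "\n";
```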
=head2 COMPATIBILITY
Some options found in GNU/POSIX sort are also available in psort, but
no attempt was made to make psort compatible with GNU/POSIX sort.
In particular, there's no locale support (but see above how to C<use
locale> in the C<-C> option). There's also no encoding support (though
it can probably be emulated by using L<Encode/decode> in the C<-e> or
C<-C> option).
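Why decoding can matter is easiest to see with a case-insensitive key. In the following sketch (sample data made up, UTF-8 input assumed) C<uc> leaves non-ASCII bytes untouched on raw input, but folds the characters correctly once the key has been decoded.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Two UTF-8 encoded lines (raw bytes), as read from a file without any
# encoding layer: "\x{c9}" is an uppercase, "\x{e9}" a lowercase e-acute.
my @lines = map { encode('UTF-8', $_) } ("\x{c9}b\n", "\x{e9}a\n");

# Case-insensitive key on raw bytes: uc() only folds the ASCII letters.
my @by_bytes = sort { uc($a) cmp uc($b) } @lines;

# Same key after decoding: both lines now start with the same character,
# so the second letter decides the order.
my @by_chars = map  { $_->[0] }
               sort { $a->[1] cmp $b->[1] }
               map  { [ $_, uc decode('UTF-8', $_) ] } @lines;

print @by_bytes;
print @by_chars;
```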
=head2 TODO
Here are some ideas for future options:
=over
=item C<--encoding>
Specify the input and output encoding.
=item Unicode sorting
An option to use L<Unicode::Collate>.
Currently the following longish one-liner has to be used:
    psort -MUnicode::Collate -MEncode=decode -e 'decode("utf-8", $_)' -C 'BEGIN { $Collator = Unicode::Collate->new } $Collator->cmp($a,$b)'
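What the collator changes compared to the default C<cmp> can be seen in a small standalone sketch. The word list is made up; the variable name C<$Collator> is taken from the one-liner.

```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new;

my @words = ("b", "B", "a", "A");

# Default comparison: plain code point order, uppercase before lowercase.
my @plain = sort { $a cmp $b } @words;

# Collation order: letters are grouped regardless of case.
my @collated = sort { $Collator->cmp($a, $b) } @words;

print "plain:    @plain\n";
print "collated: @collated\n";
```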
=item Sort specific columns (C<-k>)
Currently one has to use something like the following to sort by
columns:
    psort -e '@F=split; $F[...]'
=item C<--locale>
Specify a locale.
=item C<-o>
Instead of writing to standard output, write to the specified output file.
=item C<-m>
Assume that input files are already sorted.
=item C<-u>
Output only unique lines.
=back
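Until a C<-k> option exists, the column workaround mentioned in the list above amounts to the following. This is only a sketch: the column index C<$column>, the sample lines and the numeric comparison are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical choice: zero-based index of the column to sort by.
my $column = 1;

# Made-up sample input with two whitespace-separated columns.
my @lines = ("carol 30\n", "alice 25\n", "bob 4\n");

# Split each line on whitespace, use the chosen column as key and
# compare the keys numerically.
my @sorted = map  { $_->[0] }
             sort { $a->[1] <=> $b->[1] }
             map  { my @F = split; [ $_, $F[$column] ] } @lines;

print @sorted;
```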
=head1 AUTHOR
Slaven ReziE<x0107>
=head1 COPYRIGHT AND LICENSE
Copyright (C) 2009,2011,2013,2015,2016,2018,2019,2022,2023 by Slaven ReziE<x0107>
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.8.8 or,
at your option, any later version of Perl 5 you may have available.
=head1 SEE ALSO
L<sort(1)>, L<Sort::Naturally>, L<CPAN::Version>.
An alternative perl-enhanced sort program: L<subsort> (in L<App::subsort>).
=cut