App-psort

 view release on metacpan or  search on metacpan

bin/psort  view on Meta::CPAN

    if ($unique) {
	$code .= ' }';
    }
    eval $code;
    die "Error while evaluating '$code': $@" if $@;

    for (@data) {
	print $_->[0];
    }
}

sub add_psort {
    my($file_or_fh) = @_;
    my($fh, $diag);
    if (ref $file_or_fh eq 'GLOB') {
	$fh = $file_or_fh;
	if ($file_or_fh == \*STDIN) {
	    $diag = 'standard input';
	} else {
	    $diag = 'provided file handle'; # currently cannot happen
	}
    } else {
	my $file = $file_or_fh;
	$diag = $file;
	open $fh, '<', $file or die "Can't open $file: $!";
    }

    while(<$fh>) {
	my $line = $_;
	my $res = $cb->($_); # force scalar context
	if (!defined $res) {
	    if (!$no_warnings) {
		warn "Uninitialized value returned in psort eval or regexp at $diag line $.\n";
	    }
	    $res = '';
	}
	$res = uc $res            if $ignore_case;
	$res =~ s{^\s+}{}         if $ignore_leading_blanks;
	$res =~ s{[[:^print:]]}{}g if $ignore_nonprinting;
	$res =~ s{\r?\n$}{}; # remove newline at end
	push @data, [$line, $res];
    }
}

__END__

=head1 NAME

psort - a perl-enhanced sort

=head1 SYNOPSIS

    psort [OPTION]... [FILE]...

=head1 DESCRIPTION

A perl-enhanced variant of L<sort(1)>. The specified files (or
standard input) are written sorted to standard output.

By default, sorting is done using perl's L<< cmp|perlop/cmp >>
operator, without any use of locales or encodings.

=head2 OPTIONS

=over

=item -b, --ignore-leading-blanks

Ignore any whitespace character (C<\s>) at the beginning of a line.

=item -c, --check

Do not output anything. Just check if the input is sorted and return
the exit value 0 for sorted and 1 for unsorted.

=item -C, --compare-function

Sort using a custom perl function. For your convenience, the enclosing
"sub {" and "}" must not be specified. Like in perl's sort,
the variables C<$a> and C<$b> are available.

Examples

=over

=item * Reimplementing the C<-n> switch:

    -C '$a <=> $b'

=item * Using locale comparisons:

    -C 'use locale; $a cmp $b'

=back

Note that it is possible to put C<BEGIN { ... }> blocks into the
comparison function.

=item -e, --field-function

Extract the sorting field (or the sorting key) using a custom perl
function. For your convenience, the enclosing "sub {" and "}" must not
be specified. The current line is available in the variable C<$_>. It
is expected that the last expression is the field to be used for
comparisons.

Examples:

=over

=item * Using just the identity:

    -e '$_'

=item * Using only the first four characters for comparisons:

    -e 'substr($_, 0, 4)'

=item * Using a regular expression:

    -e '/(\d+) wallclock/ && $1'

=back

Note that it is possible to put C<BEGIN { ... }> blocks into the
comparison function.

=item --rx

Use a regular expression for extracting the sorting field. If a
capture group is detected in the regexp, then this capture group is
used for the extraction, otherwise the whole matched portion is used.

For example, the above mentioned C<-e> snippet

    -e '/(\d+) wallclock/ && $1'

could be written as

    --rx '(\d+) wallclock'

Only the first capture group is used, others are ignored (for now).

The capture group detection code just uses a heuristic, which may fail
in special cases.

=item -f, --ignore-case

Fold all characters to its uppercased version for comparison.

=item -i, --ignore-nonprinting

Ignore non-printing characters (everything matching the C<<
[[:^print]] >> character class) for comparison.

=item -Mmodule[=import]

Load a perl module. The syntax is the same like perl's C<-M> option.

=item -mmodule[=import]

Load a perl module without default import. The syntax is the same like perl's
C<-M> option.

=item -n, --numeric-sort

Sort numerically. It is using perl's L<< <=>|perlop/<=> >> operator.

=item -N, --natural-sort

Sort using L<Sort::Naturally>, if available.

=item -r, --reverse

Reverse the result of comparisons.

=item -u, --unique

Output is made unique for adjacent lines. If -c is specified, then
check for strict ordering (adjacent equal lines are considered as
unsorted).

=item -v, --version

Print psort's version.

=item -V, --version-sort

Sort versions using L<CPAN::Version>, if available.

=item -X, --no-warnings

By default psort warns if a custom field function or rx returns an
undefined value. These warnings may be suppressed with this option.

=back

=head2 COMPATIBILITY

Some options found in GNU/POSIX sort are also available in psort. But
no attempt was done to make psort compatible to GNU/POSIX sort.
Especially there's no locale support (but see above how to C<use
locale> in the C<-C> option). There's also no encoding support (though
it probably can be emulated by using C<<Encode/decode> in the C<-e> or
C<-C> option).

=head2 TODO

Here are some ideas for future options:

=over

=item C<--encoding>

Specify the input and output encoding.

=item Unicode sorting

An option to use L<Unicode::Collate>.

Currently the longish one-liner has to be used:

    psort -MUnicode::Collate -MEncode=decode -e 'decode("utf-8", $_)' -C 'BEGIN { $Collator = Unicode::Collate->new } $Collator->cmp($a,$b)'

=item Sort specific columns (<-k>)

Currently one has to use something like the following to sort by
columns:

    psort -e '@F=split; $F[...]'

=item C<--locale>

Specify a locale.

=item C<-o>

Instead writing to standard output, write to the specified output file.

=item C<-m>

Assume that input files are already sorted.

=item C<-u>

Output only unique lines.

=back

=head1 AUTHOR

Slaven ReziE<x0107>

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2009,2011,2013,2015,2016,2018,2019,2022,2023 by Slaven ReziE<x0107>

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself, either Perl version 5.8.8 or,
at your option, any later version of Perl 5 you may have available.

=head1 SEE ALSO

L<sort(1)>, L<Sort::Naturally>, L<CPAN::Version>.

An alternative perl-enhanced sort program: L<subsort> (in L<App::subsort>).

=cut



( run in 2.015 seconds using v1.01-cache-2.11-cpan-ceb78f64989 )