treating C<-xyz> as a bareword. On the other hand, in Perl 5, I<all>
method names are essentially in the unrecognized category until run
time, so it would be impossible to tell whether to parse the minus sign
as a real negation. Optional type declarations in Perl 6 would only
help the compiler with variables that are actually declared to have a
type. Fortunately, a negated 1 is still true, so even if we parsed the
negation as a real negation, it might still end up doing the right
thing. But it's all very tacky.

So I'm thinking of a different tack. Instead of bundling the letters:

    -drwx $file

let's think about the trick of returning the value of C<$file> for a
true value. Then we'd write nested unary operators like this:

    -d -r -w -x $file

One tricky thing about that is that the operators are applied right to
left. And they don't really short circuit the way stacked C<&&> would
(though the optimizer could probably fix that). So I expect we could do
this for the default, and if you want the C<-drwx> as an autoloaded
backstop, you can explicitly declare that.
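
For comparison, Perl 5 itself (as of 5.10) accepts stacked filetest operators, treating them as shorthand for an C<&&> chain over the C<_> stat buffer. The sketch below is not the Perl 6 design under discussion; it only illustrates the right-to-left ordering, using a scratch directory whose name (C<$dir>) is purely illustrative.

```perl
use strict;
use warnings;
use 5.010;    # stacked filetest operators arrived in Perl 5.10

use File::Temp qw(tempdir);

# A scratch directory we own, so it should be a readable, writable,
# searchable directory for the current user.
my $dir = tempdir(CLEANUP => 1);

# Stacked form: applied right to left, so -x runs first, then -w, -r, -d.
my $stacked = (-d -r -w -x $dir) ? 1 : 0;

# The same tests spelled out with the special "_" filehandle, which
# reuses the stat buffer filled in by the first test.
my $longhand = (-x $dir && -w _ && -r _ && -d _) ? 1 : 0;

print "stacked=$stacked longhand=$longhand\n";
```

Unlike the nested unary operators described above, the Perl 5 form does short-circuit, because it is defined in terms of C<&&>.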

In any event, the proposed C<filetest> built-in need not be built in.
It can just be a universal method. (Or maybe just common to strings and
filehandles?)

My one hesitation in making cascading operators work like that is that
people might be tempted to get cute with the returned filename:

    $handle = open -r -w -x $file or die;

That might be terribly confusing to a lot of people. The solution to
this conundrum is presented at the end of the next section.

=head2 RFC 290: Better english names for -X

This RFC proposes long names as aliases for the various filetest
operators, so that instead of saying:

    -r $file

you might say something like:

    use english;
    freadable($file)

Actually, there's no need for the C<use english>, I expect. These names
could merely be universal (or nearly universal) methods. In any case, we
should start getting used to the idea that C<mumble($foo)> is
equivalent to C<$foo.mumble()>, at least in the absence of a local
subroutine definition to the contrary. So I expect that we'll see both:

    is_readable($file)

and:

    $file.is_readable

Similar to the cascaded filetest ops in the previous section, one
approach might be that the boolean methods return the object in
question for success so that method calls could be stacked without
repeating the object:

    if ($file.is_dir
             .is_readable
             .is_writable
             .is_executable) {

[Update: the syntax above is now illegal.]

But C<-drwx $file> could still be construed as more readable, for some
definition of readability. And cascading methods aren't really
short-circuited. Plus, the value returned would have to be something
like C<$file is true>, to prevent confusion over a file named C<"0">.

There is also the question of whether this really saves us anything
other than a little notational convenience. If each of those methods
has to do a I<stat> on the filename, it will be rather slow. To fix
that, what we'd actually have to return would be not the filename, but
some object containing the stat buffer (represented in Perl 5 by the
C<_> character). If we did that, we wouldn't have to play C<$file is
true> games, because a valid stat buffer object would (presumably)
always be true (at least until it's false).

The same argument would apply to cascaded filetest operators we talked
about earlier. An autoloaded C<-drwx> handler would presumably be smart
enough to do a single stat. But we'd likely lose the speed gain by
invoking the autoload mechanism. So cascaded operators (either C<-X>
style or C<.is_XXX> style) are the way to go. They just return objects
that know how to be either boolean or stat buffer objects in context.
This implies you could even say

    $statbuf = -f $file or die "Not a regular file: $file";
    if (-r -w $statbuf) { ... }

This allows us to simplify the special case in Perl 5 represented by
the C<_> token, which was always rather difficult to explain. And
returning a stat buffer instead of C<$file> prevents the confusing:

    $handle = open -r -w -x $file or die;

Unless, of course, we decide to make a stat buffer object return the
filename in a string context. C<:-)>
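
The idea of a result that is both a boolean and a stat buffer can be approximated in Perl 5 with C<overload>. What follows is a minimal sketch under stated assumptions: the C<StatBuf> class and its C<is_*> methods are hypothetical names, not part of any real module, and the permission checks look only at the owner bits rather than doing what C<-r> and C<-w> really do.

```perl
use strict;
use warnings;

# Hypothetical class: wraps one stat() call and offers chainable tests.
package StatBuf;

use Fcntl qw(:mode);    # S_ISDIR, S_ISREG, S_IRUSR, S_IWUSR

# Truth comes from the "ok" flag, never from the filename,
# so a file named "0" causes no confusion.
use overload 'bool' => sub { $_[0]{ok} }, fallback => 1;

sub new {
    my ($class, $file) = @_;
    my @st = stat $file;    # the only stat call; empty list on failure
    return bless { file => $file, mode => $st[2], ok => (@st ? 1 : 0) }, $class;
}

# Run one test against the cached mode. A failed object stays failed,
# so a chain of calls never ends up calling a method on undef.
sub _check {
    my ($self, $test) = @_;
    return $self unless $self->{ok};
    return $self if $test->($self->{mode});
    my %failed = (%$self, ok => 0);
    return bless \%failed, ref $self;
}

sub is_dir  { my $self = shift; $self->_check(sub { my $m = shift; S_ISDIR($m) }) }
sub is_file { my $self = shift; $self->_check(sub { my $m = shift; S_ISREG($m) }) }

# Simplification: only the owner permission bits are inspected.
sub is_readable { my $self = shift; $self->_check(sub { my $m = shift; $m & S_IRUSR }) }
sub is_writable { my $self = shift; $self->_check(sub { my $m = shift; $m & S_IWUSR }) }

package main;

use File::Temp qw(tempfile tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($fh, $file) = tempfile(DIR => $dir);

my $sb      = StatBuf->new($file);
my $file_ok = $sb->is_file->is_readable->is_writable ? 1 : 0;
my $dir_ok  = $sb->is_dir->is_readable ? 1 : 0;    # fails at is_dir
my $missing = StatBuf->new("$dir/no-such-file") ? 1 : 0;

print "file_ok=$file_ok dir_ok=$dir_ok missing=$missing\n";
```

As with the cascaded methods discussed earlier, nothing here short-circuits: every method in the chain is still called, and a failed object simply passes itself along.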

=head2 RFC 283: C<tr///> in array context should return a histogram

Yes, but ...

While it's true that I put that item into the Todo list ages ago, I
think that histograms should probably have their own interface, since
the histogram should probably be returned as a complete hash in scalar
context, but we can't guess that they'll want a histogram for an
ordinary scalar C<tr///>. On the other hand, it could just be a C</h>
modifier. But we've already done violence to C<tr///> to make it do
character counting without transliterating, so maybe this isn't so
far-fetched.

One problem with this RFC is that it does the histogram over the input
rather than the output string. The original Todo entry did not specify
this, but it was what I had intended. But it's more useful to do it on
the resulting characters because then you can use the C<tr///> itself
to categorize characters into, say, vowels and consonants, and then
count the resulting V's and C's.
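
In today's Perl 5 the same effect takes two steps: transliterate a copy, then count the result. A rough sketch, with C<$word>, C<$classes> and C<%histogram> as illustrative names only:

```perl
use strict;
use warnings;

my $word = "apocalypse";

# Categorize on a copy so the original string is left alone.
(my $classes = $word) =~ tr/a-z/VCCCVCCCVCCCCCVCCCCCVCCCCC/;

# Histogram over the *output* characters, not the input.
my %histogram;
$histogram{$_}++ for split //, $classes;

print "$word -> $classes\n";
print "$_: $histogram{$_}\n" for sort keys %histogram;
```

A histogram over the input would instead count each distinct letter of C<$word>, which says nothing about vowels versus consonants.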

On the other hand, I'm thinking that the C<tr///> interface is really
rather lousy, and getting lousier every day. The whole C<tr///>
interface is kind of sucky for any sort of dynamically generated data.
But even without dynamic data, there are serious problems. It was bad
enough when the character set was just ASCII. The basic problem is that
the notation is inside out from what it should be, in the sense that it
doesn't actually show which characters correspond, so you have to count
characters. We made some progress on that in Perl 5 when, instead of:

    tr/abcdefghijklmnopqrstuvwxyz/VCCCVCCCVCCCCCVCCCCCVCCCCC/

we allowed you to say:

    tr[abcdefghijklmnopqrstuvwxyz]
      [VCCCVCCCVCCCCCVCCCCCVCCCCC]

There are also shenanigans you can play if you know that duplicates on
the left side prefer the first mention to subsequent mentions:

    tr/aeioua-z/VVVVVC/

But you're still working against the notation. We need a more explicit
way to put character classes into correspondence.
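
The three spellings shown above can be checked against each other in Perl 5. This is only a sanity check of the existing notation, not a proposal for the new one; the variable names are illustrative.

```perl
use strict;
use warnings;

my $alphabet = join '', 'a' .. 'z';

# Original form: both lists on one line, correspondence found by counting.
(my $slashes = $alphabet)
    =~ tr/abcdefghijklmnopqrstuvwxyz/VCCCVCCCVCCCCCVCCCCCVCCCCC/;

# Bracketed form: the two lists can be lined up vertically.
(my $brackets = $alphabet) =~ tr[abcdefghijklmnopqrstuvwxyz]
                                [VCCCVCCCVCCCCCVCCCCCVCCCCC];

# Short form: the first mention of a letter wins, so the vowels keep V,
# and the final C is repeated to cover the rest of the a-z range.
(my $short = $alphabet) =~ tr/aeioua-z/VVVVVC/;

my $same = ($slashes eq $brackets && $brackets eq $short) ? 1 : 0;

print "$short\n";
print "all three agree: $same\n";
```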