FunctionalPerl

docs/intro.md  view on Meta::CPAN

assuming that those files or directories are not modified while
reading them (lazily!). If that assumption doesn't hold up, then you
may end up being surprised that you're getting different values than
you were expecting when first opening the directory or
stream. Nonetheless, it's what people in Clojure and Perl 6 and surely
other languages with streams are doing all the time. (Haskell ditched
this approach a while ago for its unsafety, and instead now provides
alternatives that can't break the guarantees that its type system
gives.)

Example:

    fperl> system("echo 'Hello\nWorld.\n' > ourtestfile.txt")
    $VAR1 = 0;
    fperl> our $l = xfile_lines("ourtestfile.txt")
    $VAR1 = lazy { "DUMMY" };
    fperl> $l->first
    $VAR1 = 'Hello
    ';
    fperl> $l->rest
    $VAR1 = lazy { "DUMMY" };

At this point it might not yet have read the second line from the
file ("might" because Perl probably buffers the input file in larger
blocks). In any case, you can do something like the following without
perl trying to read an infinite amount of data into process memory:

    fperl> our $l = fh_to_chunks xopen_read("/dev/zero"), 10
    $VAR1 = lazy { "DUMMY" };
    fperl> $l->first
    $VAR1 = '^@^@^@^@^@^@^@^@^@^@';
    fperl> $l->drop(1000)->first
    $VAR1 = '^@^@^@^@^@^@^@^@^@^@';

(Or replace /dev/zero with /dev/urandom.)

For more examples using lazy evaluation and streams, see
`FP::IOStream`, `FP::Text::CSV`, `FP::DBI`,
[functional_XML](../functional_XML/README.md) and the [example
scripts](../examples/).

The nice thing about this is that you can stop writing for or while loops
now, and you can build up a processing chain similar to how you can
write pipelines in the shell. You can write a function that takes a
stream and returns a processed stream, and pass that to another
function that does some other processing, and group those two
functions into one which you can then group together with other
grouped-up ones. Just like you can write shell scripts that use a
pipeline and then pipe up those scripts themselves as if they were
"atoms".
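
To make the grouping idea concrete without depending on any particular
library API, here is a minimal sketch in core Perl. It is an
assumption-laden stand-in, not how functional-perl implements streams:
a "stream" is simply a closure that returns the next value on each
call (or undef when exhausted), and the names `naturals`, `smap`,
`sfilter`, `stake`, `even_squares`, `first_even_squares` and `to_list`
are made up for this illustration.

```perl
use strict;
use warnings;

# A "stream" here is a closure yielding the next value per call, or undef
# when exhausted (hypothetical stand-in, not the functional-perl API).
sub naturals {
    my $n = 0;
    return sub { $n++ };    # infinite: 0, 1, 2, ...
}

# Each stage takes a stream and returns a new, still lazy, stream.
sub smap {
    my ($f, $s) = @_;
    return sub {
        my $v = $s->();
        defined $v ? $f->($v) : undef;
    };
}

sub sfilter {
    my ($pred, $s) = @_;
    return sub {
        while (defined(my $v = $s->())) {
            return $v if $pred->($v);
        }
        return undef;
    };
}

sub stake {
    my ($n, $s) = @_;
    # Stops pulling from $s once $n values have been handed out.
    return sub { $n-- > 0 ? $s->() : undef };
}

# Two stages grouped into one reusable "atom" ...
sub even_squares {
    my ($s) = @_;
    return smap(sub { $_[0] * $_[0] }, sfilter(sub { $_[0] % 2 == 0 }, $s));
}

# ... which is in turn grouped with another stage.
sub first_even_squares {
    my ($n, $s) = @_;
    return stake($n, even_squares($s));
}

# Force a (finite) stream into a Perl list.
sub to_list {
    my ($s) = @_;
    my @out;
    while (defined(my $v = $s->())) { push @out, $v }
    return @out;
}

my @result = to_list(first_even_squares(3, naturals()));
print "@result\n";    # prints "0 4 16"
```

The only loops are hidden inside the plumbing; the code that describes
the actual processing (`even_squares`, `first_even_squares`) consists
purely of function composition, and nothing is read from the infinite
source beyond what `stake` asks for.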

There's a catch, though, currently: unlike programming language
implementations that have been written explicitly to deal with the
functional programming style, the Perl implementation does not release
variables and subroutine arguments as early as theoretically possible.
When calling subroutines that consume streams (like `drop`), the head
of the stream would therefore not be released while walking it, and
the program could run out of memory. The functional-perl libraries go
to some pains to work around the issue by weakening the subroutine
argument slots (in `@_`). More concretely, this means that after
calling `drop` in the example above, `$l` has been weakened, and if
there's no other strong reference holding the head of the stream, it
becomes undef. So when you try to run the same expression again, you
get:

    fperl> $l->drop(1000)->first
    Exception: 'Can\'t call method "drop" on an undefined value at (eval 147) line 1.
    '
    fperl 1> 

You can prevent this manually by protecting `$l` using the `Keep` function:

    fperl> our $l = fh_to_chunks xopen_read("/dev/urandom"), 10
    $VAR1 = lazy { "DUMMY" };
    fperl> Keep($l)->drop(1000)->first
    $VAR1 = '<94> )&m^C<8C>ESC<AB>A';
    fperl> Keep($l)->drop(1000)->first
    $VAR1 = '<94> )&m^C<8C>ESC<AB>A';

There is hope that we might find a better way to deal with this
(implement variable life time analysis as a pragma/module), but no
promises here!
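
The mechanism can be imitated in core Perl with `Scalar::Util::weaken`.
The following is only a sketch of the idea, not the library's actual
code: `last_value` is a hypothetical subroutine that walks a small
linked list of `[value, rest]` cells, and keeping an extra strong
reference in `$keep` is assumed here to play the role that `Keep`
plays above.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Walk a linked list of [value, rest] cells and return the last value.
# (Hypothetical helper for illustration only.)
sub last_value {
    my $cell = $_[0];    # strong reference held only inside this sub
    weaken($_[0]);       # @_ aliases the caller's variable, so this
                         # weakens the caller's reference
    # Each step drops the previous cell, so already-walked cells can be
    # freed even though the caller's variable still "points" at the head.
    $cell = $cell->[1] while defined $cell->[1];
    return $cell->[0];
}

# Without protection: the caller's variable becomes undef.
my $head = [1, [2, [3, undef]]];
my $last = last_value($head);
print "last: $last, head still defined: ",
    (defined $head ? "yes" : "no"), "\n";     # last: 3, ... no

# With another strong reference around, the head survives.
my $head2 = [1, [2, [3, undef]]];
my $keep  = $head2;
my $last2 = last_value($head2);
print "last: $last2, head still defined: ",
    (defined $head2 ? "yes" : "no"), "\n";    # last: 3, ... yes
```

The first call shows the surprise described above (the variable is
undef afterwards); the second shows that a second strong reference is
all it takes to keep the head alive, at the price of holding the whole
walked structure in memory.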


## Fresh lexicals and closures

Let's get a better understanding of functions, and first try the
following:

    fperl> our ($f1,$f2) = do { our $a = 10; my $f1 = sub { $a }; $a = 11; my $f2 = sub { $a }; ($f1,$f2) }
    $VAR1 = sub { "DUMMY" };
    $VAR2 = sub { "DUMMY" };
    fperl> &$f1
    $VAR1 = 11;
    fperl> &$f2
    $VAR1 = 11;

The two subroutines are both referring to the same instance of a
variable, and setting that variable to a new value also changes what
the first subroutine sees.

In this case, the reference to the variable is implemented by perl by
simply embedding the variable name in the code: the "our" variables
are package globals which only exist once with the same (fully
qualified) name in the whole program, hence it's enough to store that
name in the program code itself (i.e. only once over the program
lifetime).

Let's try a lexical variable instead (`my $a`):

    fperl> our ($f1,$f2) = do { my $a = 10; my $f1 = sub { $a }; $a = 11; my $f2 = sub { $a }; ($f1,$f2) }
    $VAR1 = sub { "DUMMY" };
    $VAR2 = sub { "DUMMY" };
    fperl> &$f1
    $VAR1 = 11;

Still the same result: the two subroutines are still referring to the
same instance of a variable. Since `$a` only lives lexically in the do
block though, the subroutines now need to store a reference to it
(this is implemented by storing both a pointer to the compiled code
and a pointer to the variable together in the CODE ref data
structure).
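
As a script-level counterpart to the REPL session, the following
sketch in plain Perl shows both sides: two closures created in the
same block share one instance of the lexical, while each new entry
into a subroutine body creates a fresh instance. The names `$get`,
`$inc` and `make_counter` are made up for the illustration.

```perl
use strict;
use warnings;

# Both closures capture the same instance of $n.
my ($get, $inc) = do {
    my $n = 10;
    (sub { $n }, sub { $n++ });
};
$inc->();
my $shared = $get->();
print "shared: $shared\n";    # shared: 11

# Every call to make_counter runs `my $n` anew, so each returned
# closure owns its own, fresh instance of $n.
sub make_counter {
    my $n = 0;
    return sub { ++$n };
}
my $c1 = make_counter();
my $c2 = make_counter();
$c1->();
my $first  = $c1->();
my $second = $c2->();
print "c1: $first, c2: $second\n";    # c1: 2, c2: 1
```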


        &$ourlist($start)->for_each(\&xprintln);
    }

Note that this first declares `$ourlist`, then assigns it; this is so
that the expression that generates the value to be held by the
variable can see the variable, too (so that the function can call
itself). As always, assigning to a variable after introducing it is
dangerous: here it creates a cycle from the internal data structure
representing the subroutine to itself, preventing perl from
deallocating `$ourlist` after exiting the `hello` subroutine. One
solution is to add

    use FP::Weak;

to the imports and change the last line of `hello` into:

        Weakened($ourlist)->($start)->for_each(\&xprintln);
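
The effect of the cycle, and of breaking it with a weak reference, can
be observed in core Perl. The following is a sketch that uses
`Scalar::Util::weaken` directly rather than `FP::Weak`, so it only
approximates what `Weakened` does; the `Canary` class and the `leaky`
and `clean` subroutines are hypothetical names made up for this
illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical helper: sets a flag when destroyed, so we can observe
# whether the closure that captured it has been freed.
package Canary {
    sub new { my ($class, $flag) = @_; bless { flag => $flag }, $class }
    sub DESTROY { my $flag = $_[0]{flag}; $$flag = 1 if ref $flag }
}

sub leaky {
    my ($flag) = @_;
    my $canary = Canary->new($flag);
    my $count;
    $count = sub {
        my ($n) = @_;
        my $seen = $canary;    # mention $canary so the closure captures it
        $n <= 0 ? 0 : 1 + $count->($n - 1);
    };
    # The closure captures $count, which holds the closure: a cycle.
    return $count->(3);
}

sub clean {
    my ($flag) = @_;
    my $canary = Canary->new($flag);
    my $count;
    my $strong = $count = sub {
        my ($n) = @_;
        my $seen = $canary;
        $n <= 0 ? 0 : 1 + $count->($n - 1);
    };
    weaken($count);    # the captured variable is now weak: no cycle
    return $strong->(3);
}

my ($leaky_freed, $clean_freed) = (0, 0);
my $r1 = leaky(\$leaky_freed);
my $r2 = clean(\$clean_freed);
print "leaky: result $r1, freed: $leaky_freed\n";    # result 3, freed: 0
print "clean: result $r2, freed: $clean_freed\n";    # result 3, freed: 1
```

In `leaky` the closure and everything it captured stay allocated until
the program exits; in `clean` they are released as soon as the
subroutine returns.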

Another solution, and the one preferred by the author of this text, is
to use `fix`, the fixpoint combinator, which is a function that takes
a function as its argument and returns a different function that when
called calls the original function with itself as the first
argument. That was a mouthful, let's see how it looks:

    use FP::fix;

    sub hello ($start, $end) {
        my $inverse = fun ($x) { lazy { 1 / $x } };

        my $ourlist = fix fun ($self, $i) {
            $i < $end ? null
              : cons &$inverse($i), &$self($i-1)
        };

        &$ourlist($start)->for_each(\&xprintln);
    }

When `$ourlist` is called, it calls the nameless function that is the
argument to `fix`, and passes it `$ourlist` (or an equivalent thereof)
and `$start`; our function can then call "itself" through `$self` and
still only needs to pass the "real" argument (the new value for
`$i`). In real world use you would usually rename `$self` to
`$ourlist`, too; they are given different names here just for
illustration.
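
To see that there is no magic involved, here is a minimal fixpoint
combinator written in plain Perl. It is a sketch under the assumption
that this simple definition is enough for illustration; `FP::fix` may
well be implemented differently. The name `my_fix` is made up, and
ordinary Perl lists are used instead of `cons`/`null` so that the
example runs without the library.

```perl
use strict;
use warnings;

# Minimal fixpoint combinator (illustrative sketch, not FP::fix itself):
# returns a function that calls $f with "itself" as the first argument.
sub my_fix {
    my ($f) = @_;
    return sub { $f->(my_fix($f), @_) };
}

my $end = 1;

# Counts down from $i to $end, returning the inverses as a flat list.
my $countdown = my_fix(sub {
    my ($self, $i) = @_;
    $i < $end ? () : (1 / $i, $self->($i - 1));
});

my @inverses = $countdown->(4);
print join(", ", @inverses), "\n";
```

The anonymous subroutine never refers to a variable that holds itself,
so no reference cycle is created; each recursion step just builds a
new small wrapper around `$f`.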

Do you think that's hard to understand or use? I suggest you play with
it a bit and see whether it grows on you. BTW, a nice property of fix
is that the outer `$ourlist` variable can actually be avoided in cases
such as this one--the result from fix can be called immediately:

        fix (fun ($self, $i) {
            $i < $end ? null
              : cons &$inverse($i), &$self($i-1)
        })
          ->($start)->for_each(\&xprintln);

Another idea for a syntactical improvement implemented via a module
would be a recursive variant of `my`, i.e. one where the expression to
the right sees the variable directly, and then applies the `fix` or
weakening transparently, but, like the other ideas mentioned above,
this will take some effort and may only be feasible if there is enough
interest (and hence some form of at least moral support).


## More on functions

Pure functions (and methods) are good building blocks for modular
programming, i.e. they are a good approach to making small reusable
pieces that combine easily: their simple API makes them easy to
understand, and their reliability (no side effects, hence no
surprises) makes them easy to reuse. It helps to be aware of a few
functional "patterns" for good reusability:

### Higher-order functions

These are functions that take other functions as arguments. Examples
are many sequence processing functions (or methods), like some of
those which we have already seen: `map`, `fold`, `fold_right`,
`filter`. The function they take as an argument may be one that
handles a single value, and they "augment" it to work on all values in
a sequence. Or the function argument may change the way that the
higher-order function works.
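
Both roles of the function argument can be shown with Perl's built-in
`map` and `sort`. In this small sketch, `apply_to_all` and `sorted_by`
are hypothetical names chosen for the illustration, not functions from
functional-perl.

```perl
use strict;
use warnings;

# The argument handles a single value; the higher-order function
# "augments" it to work on every value of a sequence.
sub apply_to_all {
    my ($f, @xs) = @_;
    return map { $f->($_) } @xs;
}

# The argument changes how the higher-order function works: here it
# decides the ordering used for sorting.
sub sorted_by {
    my ($cmp, @xs) = @_;
    return sort { $cmp->($a, $b) } @xs;
}

my @doubled    = apply_to_all(sub { 2 * $_[0] }, 1, 2, 3);
my @descending = sorted_by(sub { $_[1] <=> $_[0] }, 3, 1, 2);

print "@doubled\n";       # prints "2 4 6"
print "@descending\n";    # prints "3 2 1"
```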

In 'Writing a list-generating function' we have written a function
`ourlist` that builds a list while calling `inverse` on every `$i` it
goes through. Let's turn that into a reusable function by making it
higher-order:

    fperl> sub inverse ($x) { lazy { 1 / $x } }
    fperl> sub ourlist ($f, $from, $to) { $from >= $to ? cons &$f($from), ourlist($f, $from - 1, $to) : null }
    fperl> F ourlist (\&inverse, 4, 1)
    $VAR1 = list('0.25', '0.333333333333333', '0.5', '1');

It would now be better to rename it, perhaps to something like
`downwards_iota_map`. But if we're using lazy evaluation, we could
also split the function into `downwards_iota` and `map` parts and use
those separately. In fact both are already available in
functional-perl:

    fperl> F stream_step_range(-1, 4, 1)->map(\&inverse)
    $VAR1 = list('0.25', '0.333333333333333', '0.5', '1');

(The naming of these more exotic functions like `stream_step_range` is
still open to changes: hints about how other languages/libraries name
those are very welcome.)

A special kind of higher-order function is the combinator.

### Combinators

> *A combinator is a higher-order function that uses only function
> application and earlier defined combinators to define a result from
> its arguments.*
> ([Wikipedia](https://en.wikipedia.org/wiki/Combinator))

There are already a number of such functions defined in
`FP::Combinators`. The two most commonly used ones are `flip`, which
takes a function expecting 2 arguments and returns a function
expecting them in reverse order, and `compose`, which takes two (or
more) functions, and returns a function that nests them (i.e. calls


