FunctionalPerl
### Variable life times
Lexical variables in the current implementation of the perl
interpreter live until the scope in which they are defined is
exited. Note explicitly that this means they may still reference
their data even at points in the scope after which they can never
be used again. Example:

    {
        my $s = ["Hello"];
        print $$s[0], "\n";
        main_event_loop(); # the array remains allocated till the event
                           # loop returns, even though never (normally)
                           # accessible
    }

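The following stand-alone sketch makes the timing visible. It is not part of FunctionalPerl: the `Tracker` class and its `note` and `events` methods are hypothetical helpers introduced only for this illustration. The object records when its destructor runs, relative to the last point at which the variable is used.

```perl
use strict;
use warnings;

# Hypothetical helper class: logs events, including its own destruction.
package Tracker {
    my @events;
    sub new     { my ($class, $name) = @_; bless { name => $name }, $class }
    sub DESTROY { push @events, "destroy " . $_[0]{name} }
    sub note    { push @events, $_[1] }
    sub events  { @events }
}

{
    my $s = Tracker->new("s");
    Tracker->note("last use of s");
    # $s is never used again, but it still holds its object here:
    Tracker->note("long-running work");
}
Tracker->note("scope exited");

print join("\n", Tracker->events), "\n";
```

The log shows "destroy s" after "long-running work" and before "scope exited": the object is released when the block is left, not after the last use of `$s`.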
You may ask why you should care about a little data staying
around. The first answer is that the data might be big. The second
answer, more important in the context of functional programming, is
that the data might be a hierarchical data structure like a
linked list that is passed on to other code, which then appends to it
(by way of mutation or, in the case of lazy functional programming, by
way of mutation hidden in promises). The called code might release the
top (or head, in the case of linked lists) of the data structure as
time goes on, but the variable in the calling scope will still hold on
to it, so the retained structure will grow, possibly without
bounds. Example:

    {
        my $s = xfile_lines $path; # lazy linked list of lines
        print "# ".$s->first."\n";
        $s->for_each (sub { print "> $_[0]\n" });
    }

Without further measures, this would retain all lines of the file at
`$path` via `$s` while `for_each` forces (and itself releases) line
after line.
This is a problem that many programming language implementations (those
not written to support lazy evaluation) have. Luckily, in the case
of Perl, it can be worked around by assigning `undef` to the caller's
variable or, better, weakening it from within the called method:

    sub for_each {
        my ($s, $proc) = @_;
        weaken $_[0]; # $_[0] is an alias of the caller's variable
        ...
    }

`weaken` is a bit more friendly than `$_[0] = undef;` in that it
leaves the variable set if there's still another reference to the head
around.
With this trick (which is used in all of the relevant
functions/methods in `FP::Stream`), the above example actually *does*
release the head of the stream in a timely manner.
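The mechanism can be tried out without FunctionalPerl. The sketch below is an illustration under simplifying assumptions, not the code from `FP::Stream`: `make_list` and `for_each_weak` are hypothetical names, the list is a plain chain of hashes rather than a lazy stream, and `weaken` is imported from the core module `Scalar::Util`.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for a stream: a singly linked list of hashes.
sub make_list {
    my ($n) = @_;
    my $list;
    $list = { value => $_, rest => $list } for reverse 1 .. $n;
    return $list;
}

sub for_each_weak {
    my ($s, $proc) = @_;
    weaken $_[0];    # the caller's variable no longer keeps the head alive
    while ($s) {
        $proc->($s->{value});
        $s = $s->{rest};    # drops this sub's strong reference to the old head
    }
}

# Case 1: the caller's variable is the only other reference to the head.
my $s = make_list(3);
my @seen;
for_each_weak($s, sub { push @seen, $_[0] });
print "seen: @seen\n";
print "\$s is ", (defined $s ? "still set" : "undef"), "\n";

# Case 2: another strong reference to the head exists elsewhere.
my $t     = make_list(3);
my $other = $t;
for_each_weak($t, sub { });
print "\$t is ", (defined $t ? "still set" : "undef"), "\n";
```

In the first case `$s` ends up undefined, because the head was freed as soon as the loop moved past it. In the second case `$t` stays set, since `$other` still holds the head; this is the friendlier behaviour compared to `$_[0] = undef;` mentioned above.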
Now there may be situations where you really do want to keep
`$s` alive. In such a case, you can protect its value from being
clobbered by passing it through the `Keep` function from `FP::Weak`:

    {
        my $s = xfile_lines $path; # lazy linked list of lines
        print "# ".$s->first."\n";
        Keep($s)->for_each (sub { print "> ".$_[0]."\n" });
        $s->for_each (sub { print "again: > ".$_[0]."\n" });
    }

Of course this *will* keep the whole file in memory until it reaches
the second `for_each`! So perhaps you'd really want to do the
following:

    {
        my $s = xfile_lines $path; # lazy linked list of lines
        print "# ".$s->first."\n";
        $s->for_each (sub { print "> ".$_[0]."\n" });
        $s = xfile_lines $path; # reopen the file from the start
        $s->for_each (sub { print "again: > ".$_[0]."\n" });
    }

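Why passing a value through a function protects the caller's variable can be illustrated with plain Perl. The sketch below does not use `FP::Weak`; it assumes that it is enough for the callee's `$_[0]` to alias a temporary copy instead of the caller's variable. `keep_copy`, `make_list` and `for_each_weak` are hypothetical names made up for this example, and the list is a plain chain of hashes rather than a lazy stream.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical helper: returns a copy of its argument, so that a callee's
# $_[0] aliases that temporary copy rather than the caller's variable.
sub keep_copy {
    my ($v) = @_;
    return $v;
}

# Hypothetical stand-in for a stream: a singly linked list of hashes.
sub make_list {
    my ($n) = @_;
    my $list;
    $list = { value => $_, rest => $list } for reverse 1 .. $n;
    return $list;
}

sub for_each_weak {
    my ($s, $proc) = @_;
    weaken $_[0];    # weakens whatever the first argument aliases
    while ($s) {
        $proc->($s->{value});
        $s = $s->{rest};
    }
}

my $s = make_list(3);
my (@first, @second);

for_each_weak(keep_copy($s), sub { push @first, $_[0] });
my $kept = defined $s;    # only the temporary copy was weakened

for_each_weak($s, sub { push @second, $_[0] });
my $cleared = !defined $s;    # this time $s itself was weakened

print "first pass:  @first (\$s kept: ",    ($kept    ? "yes" : "no"), ")\n";
print "second pass: @second (\$s cleared: ", ($cleared ? "yes" : "no"), ")\n";
```

The first traversal leaves `$s` intact, which is also why the whole list stays in memory during it; the second traversal releases the list as it goes.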
Managing variable lifetimes by hand like this is probably the ugliest
part of programming functionally in Perl. Perhaps the interpreter could
be changed (or a low-level module written) so that lexical variables are
automatically cleared upon their last access (and something like
`@_ = ()` is enough to clear a value from the perl calling stack, if
that is not automatic). An argument against this is inspection using
debuggers or modules like `PadWalker`, so it would have to be enabled
explicitly (lexically scoped).
### Stack memory and tail calls
Another, closely related, place where the perl interpreter does not
release memory in a timely manner (timely enough for some programs,
that is) is subroutine calls in tail position. The tail position is the
place of the last expression or statement in a (subroutine) scope. At
that point there is no need to remember the current context (other than,
again, to aid inspection for debugging), and hence the current context
could be released and the tail-called subroutine be made to return
directly to the parent context, but the interpreter doesn't do that.

    sub sum_map_to {
        my ($fn, $start, $end, $total) = @_;
        # this example only contains an expression in tail position
        # (ignoring the variable binding statement).
        $start < $end ?
            sum_map_to ($fn, $start + 1, $end, $total + &$fn($start))
          : $total
    }

This causes code using recursion to allocate stack memory proportional
to the number of recursive calls, even if the calls are all in tail
position. It keeps around a chain of return addresses, but also (due
to the issue described in the previous section) references to possibly
unused data.
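As one possible illustration using only core Perl, the tail call in `sum_map_to` can be written with `goto &NAME`, which replaces the running subroutine call instead of nesting a new one. This is a sketch, not necessarily the approach the linked documents recommend. The callback below additionally measures the call stack depth with `caller`; that measurement and the `$max_depth` variable are additions made only for this example.

```perl
use strict;
use warnings;

sub sum_map_to {
    my ($fn, $start, $end, $total) = @_;
    return $total unless $start < $end;
    # Set up the arguments for the next call, then jump to it. The current
    # call is replaced rather than kept waiting on the stack.
    @_ = ($fn, $start + 1, $end, $total + $fn->($start));
    goto &sum_map_to;
}

my $max_depth = 0;
my $fn = sub {
    # Count the active subroutine frames at the time of this call.
    my $depth = 0;
    $depth++ while caller($depth);
    $max_depth = $depth if $depth > $max_depth;
    $_[0] * 2;
};

my $total = sum_map_to($fn, 0, 100_000, 0);
print "total: $total\n";
print "maximum call depth seen: $max_depth\n";
```

The total is 9999900000, and the measured depth does not grow with the number of iterations. Note that `goto &NAME` passes the current `@_` to the target, which is why the arguments are assigned to `@_` first.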
See [`intro/tailcalls`](../intro/tailcalls) and
[`intro/more_tailcalls`](../intro/more_tailcalls) for solutions to
this problem.