  pushmark
    args ...
  gv => subname
  entersub

=head2 Call a method

Here there are several combinations for specifying the package and the
method name. Each is either known at compile time (static, as a constant
string) or dynamic, as a B<GV> (for the method name) or a B<PADSV> (for the
package name).

B<method_named> holds the method name as C<sv> if it is known at compile time.
If not, B<gv> (of the name) and B<method> are used.
B<pushmark> opens a new argument frame on the stack, and the package name
(or object) is the first item pushed after it.

1. Static compile time package ("class") and method:

Class->subname(args...) =>

  pushmark
  const => PV "Class"
    args ...
  method_named => PV "subname"
  entersub

2. Run-time package ("object") and compile-time method:

$obj->meth(args...) =>

  pushmark
  padsv => GV *packagename
    args ...
  method_named => PV "meth"
  entersub

3. Run-time package and run-time method:

$obj->$meth(args...) =>

  pushmark
  padsv => GV *packagename
    args ...
  gvsv => GV *meth
  method
  entersub

4. Compile-time package ("class") and run-time method:

Class->$meth(args...) =>

  pushmark
  const => PV "Class"
    args ...
  gvsv => GV *meth
  method
  entersub
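
The forms above can be checked against a running perl with L<B::Concise>.
The following is a minimal sketch, not part of the original text: the class
C<Class>, the method C<subname> and the hash C<%form> are made up for
illustration, and the exact op names and flags printed depend on the perl
version.

  use strict;
  use warnings;
  use B::Concise ();

  # Hypothetical subs, one per call form; they are only compiled, never called.
  my $meth = 'subname';
  my %form = (
      static  => sub { Class->subname(1) },    # form 1, expect method_named
      dynamic => sub { $_[0]->$meth(1) },      # form 3, expect method
  );

  our %dump;
  for my $name (sort keys %form) {
      B::Concise::walk_output(\my $buf);               # capture into a string
      B::Concise::compile('-exec', $form{$name})->();  # ops in execution order
      $dump{$name} = $buf;
      print "== $name ==\n$buf";
  }

From the command line, C<perl -MO=Concise,-exec -e 'Class-E<gt>subname(1)'>
gives the same kind of listing for a one-liner.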

=head1 Hooks

=head2 Special execution blocks BEGIN, CHECK, UNITCHECK, INIT, END

Perl keeps special arrays of subroutines that are executed at the
beginning and at the end of a running Perl program and its program
units. These subroutines correspond to the special code blocks:
C<BEGIN>, C<CHECK>, C<UNITCHECK>, C<INIT> and C<END>. (See basics at
L<perlmod/basics>.)

Such arrays belong to Perl's internals that you're not supposed to
see. Entries in these arrays get consumed by the interpreter as it
enters distinct compilation phases, triggered by statements like
C<require>, C<use>, C<do>, C<eval>, etc.  To be as safe as possible,
the only allowed operations are adding entries to the start and to
the end of these arrays.

BEGIN and INIT are FIFO (first-in, first-out) blocks, while CHECK,
UNITCHECK and END are LIFO (last-in, first-out).

L<Devel::Hook> allows adding code to the start or end of these
blocks. L<Manip::END> even tries to remove certain entries.
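
The execution order of the blocks can be observed directly. This is a small
sketch, assuming the code is run as the main program (C<CHECK> and C<INIT>
blocks are not run when they are compiled too late, e.g. in a string
C<eval>). The array C<@order> is just an illustrative name.

  use strict;
  use warnings;

  our @order;
  BEGIN { push @order, 'BEGIN 1' }
  BEGIN { push @order, 'BEGIN 2' }
  CHECK { push @order, 'CHECK 1' }
  CHECK { push @order, 'CHECK 2' }
  INIT  { push @order, 'INIT 1' }
  INIT  { push @order, 'INIT 2' }
  END   { print "END 1\n" }
  END   { print "END 2\n" }

  # BEGIN and INIT run in definition order, CHECK in reverse order.
  print join(', ', @order), "\n";
  # prints: BEGIN 1, BEGIN 2, CHECK 2, CHECK 1, INIT 1, INIT 2
  # at exit: END 2, then END 1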

=head3 The BEGIN block

A special array of code at C<PL_beginav> that is executed before
C<main_start>, the first op, which by definition is C<ENTER>.
E.g. C<use module;> adds its require and importer code to the BEGIN
block.

=head3 The CHECK block

The B compiler starting block at C<PL_checkav>. This hooks into the
check function, which is executed for every op created, in bottom-up,
basic order.

=head3 The UNITCHECK block

A new block since Perl 5.10 at C<PL_unitcheckav>. It runs right after
the compilation unit that defines it has been compiled, and so before
the main program's CHECK blocks, to separate possible B compilation
hooks from other checks.

=head3 The INIT block

At C<PL_initav>.

=head3 The END block

At C<PL_endav>.

L<Manip::END> started to mess around with this block.

The array contains an C<undef> for each block that has been
encountered. It's not really an C<undef> though, it's a kind of raw
coderef that's not wrapped in a scalar ref. This leads to funky error
messages like C<Bizarre copy of CODE in sassign> when you try to assign
one of these values to another variable. See L<Manip::END> for how to
manipulate the values in this array.
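
From perl, the array can be looked at read-only through L<B>, which hands
out each entry as an object. A minimal sketch, using C<B::end_av> from the
core B module. The two empty C<END> blocks and the variable names are only
there for illustration.

  use strict;
  use warnings;
  use B ();

  END { }
  END { }

  # B::end_av returns a B::AV for PL_endav, or a B::SPECIAL if it is NULL.
  my $endav = B::end_av();
  our @end_cvs = $endav->isa('B::AV') ? $endav->ARRAY : ();

  # Each entry shows up as a B::CV object, not as a plain scalar.
  printf "%d END blocks, first is a %s\n", scalar(@end_cvs), ref $end_cvs[0];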

=head2 B and O module. The perl compiler.

Malcolm Beattie's B modules hooked into the early op tree stages to
represent the internal ops as perl objects and added the perl compiler
backends. See L<B> and L<perlcompile>.

The three main compiler backends are still B<Bytecode>, B<C> and B<CC>.

I<Todo: Describe B's object representation a little bit deeper, its
CHECK hook, its internal transformers for Bytecode (asm and vars) and
C (the sections).>

=head2 MAD

MAD stands for "Miscellaneous Attribute Decoration".

Larry Wall worked on a new MAD compiler backend outside of the B
approach, dumping the internal op tree representation as B<XML> or
B<YAML>, not as a tree of perl B objects.

The idea is that all the information needed to recreate the original source is
stored in the op tree. To do this, the tokens are associated with the ops they
produce. These madprops are a list of key-value pairs, where the key is a
character as listed at the end of F<op.h> and the value is normally a string,
but it may also be an op, as in the case of an optimized op ('O'). The keys '_'
(whitespace before) and '#' (whitespace after) are special: they hold the
whitespace or comment before/after the previous key-value pair.

Things that are normally compiled out, like a BEGIN block, which normally does
not result in any ops, instead create a NULLOP with madprops that are used to
recreate the object.

I<Is there any documentation on this?>

Why this awful XML and not the rich tree of perl objects?

Well there's an advantage.
The MAD XML can be seen as some kind of XML Storable/Freeze of the B

It's probably best to copy one of the existing runops functions and
change it to suit your needs. Then, in the C<BOOT> section of your XS
file, add the line:

  PL_runops = my_runops;

This function should be as efficient as possible to keep your programs
running as fast as possible. See L<Jit> for an even faster just-in-time 
compilation runloop.

=head3 Walkers or runops

The standard op tree B<walker> or B<runops> is as simple as the fast
C<Perl_runops_standard()> in F<run.c>. It starts with C<main_start> and walks
the C<op_next> chain until the end. There is no need to check other fields;
it goes strictly linearly through the tree.

  int
  Perl_runops_standard(pTHX)
  {
      dVAR;
      while ((PL_op = CALL_FPTR(PL_op->op_ppaddr)(aTHX))) {
          PERL_ASYNC_CHECK(); /* until 5.13.2 */
      }
      TAINT_NOT;
      return 0;
  }

To inspect the op tree within a perl program, you can also hook C<PL_runops> (see
above at L</"Pluggable runops">) to your own perl walker (see e.g. L<B::Utils>
for various useful walkers), but you cannot modify the tree from within the B
accessors, only via XS, or via L<B::Generate> as explained in Simon Cozens'
"Hacking the Optree for Fun..." L<http://www.perl.com/pub/a/2002/05/07/optree.html>.

I<Todo: Show the other runloops, and esp. the B::Utils ones.>
I<Todo: Describe the dumper, the debugging and more extended walkers.>
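
The linear walk described above can be sketched in pure perl with the
read-only L<B> accessors. This is only an illustration, not one of the
B::Utils walkers: the sub C<demo> is a made-up example, and following
C<next> alone does not visit branch targets such as C<op_other>.

  use strict;
  use warnings;
  use B ();

  sub demo { my $x = shift; $x + 1 }     # hypothetical sub to inspect

  # Start at the first op of the sub and follow op_next until the NULL op,
  # whose underlying address ($$op) is 0.
  my $cv = B::svref_2object(\&demo);
  our @names;
  for (my $op = $cv->START; $$op; $op = $op->next) {
      push @names, $op->name;
  }
  print join(' -> ', @names), "\n";

The list of names depends on the perl version, but it ends with C<leavesub>
for a sub like this one.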

=head1 SEE ALSO

=head2 Internal and external modifications

See the short description of the internal optimizer in the "Brief Summary".

I<Todo: Describe the exported variables and functions which can be
hooked, besides simply adding code to the blocks.>

Via L</"Pluggable runops"> you can provide your own walker function, as it
is done in most B modules. Best see L<B::Utils>.

You may also create custom ops at runtime (well, strictly speaking at
compile-time) via L<B::Generate>.

=head2 Modules

The most important op tree module is L<B::Concise> by Stephen McCamant.

L<B::Utils> provides abstract-enough op tree grep's and walkers with
callbacks from the perl level.

L<Devel::Hook> allows adding perl hooks into the BEGIN, CHECK,
UNITCHECK, INIT blocks.

L<Devel::TypeCheck> tries to verify possible static typing for
expressions and variables, a pretty hard problem for compilers,
esp. with such dynamic and untyped variables as Perl 5.

Reini Urban maintains the interactive op tree debugger L<B::Debugger>, 
the Compiler suite (B::C, B::CC, B::Bytecode), L<B::Generate> and 
is working on L<Jit>.

=head2 Various Articles

The best source of information is the source. It is very well documented.

There are some pod files from talks and workshops in F<ramblings/>.
From YAPC EU 2010 there is a good screencast at L<http://vimeo.com/14058377>.

Simon Cozens has posted the course material for NetThink's training
course at
L<http://books.simon-cozens.org/index.php/Perl_5_Internals#The_Lexer_and_the_Parser>.
This is currently the best available description of that subject.

"Hacking the Optree for Fun..." at
L<http://www.perl.com/pub/a/2002/05/07/optree.html> is the next step by
Simon Cozens.

Scott Walters added more details at L<http://perldesignpatterns.com/?PerlAssembly>.

Joshua ben Jore wrote a 50 minute presentation on "Perl 5
VM guts" at L<http://diotalevi.isa-geek.net/~josh/Presentations/Perl%205%20VM/>
focusing on the op tree for SPUG, the Seattle Perl User's Group.

Eric Wilhelm wrote a brief tour through the perl compiler backends for
the impatient refactorerer. The perl_guts_tour as mp3
L<http://scratchcomputing.com/developers/perl_guts_tour.html> or as
pdf L<http://scratchcomputing.com/developers/perl_guts_tour.pdf>

This text was created in this wiki article:
L<http://www.perlfoundation.org/perl5/index.cgi?optree_guts>
The version released with B::C should be more up to date.

=head1 Conclusion

So this is about 30% of the basic op tree information so far, not to speak of
the guts. Simon Cozens and Scott Walters have another 30%, another 10% is in
the source to copy and paste, and the rest is in the compilers and the
run-time information. I hope that with the help of some hackers we'll get it
done, so that some people will begin poking around in the B backends and write
the wonderful new C<dump>/C<undump> functionality (which actually worked in
the early years on Solaris) to save-image and load-image at runtime as in
LISP, analyse and optimize the output, output PIR (parrot code), emit LLVM or
other JIT-optimized code, or even write assemblers. I have a simple one at
home. :)

Written 2008 on the perl5 wiki with socialtext and pod in parallel 
by Reini Urban, CPAN ID C<rurban>.
