Acme-OneHundredNotOut

OneHundredNotOut.pm

Mark-Jason Dominus, another huge influence in the development of my CPAN
career, has written C<Tie::File>, which not only has a better name but
is actually efficient too.

=head2 The Internals Phase

1999-2000 were disastrous years for me personally but magnificent years
Perl-sonally. Stuck in a boring job and a tiny flat in the middle of
Tokyo, I had plenty of time to get stuck into more Perl development. I
felt that getting involved with C<perl5-porters> would be a good way of
getting to know more about Perl, and so I needed a hobby horse - an
issue of Perl's development that I cared about. Since I was in Japan and
working a lot with non-Latin text, Unicode support seemed a good thing
to work on, and so L<Unicode::Decompose> appeared, while I fixed up a
substantial part of the post-5.6 core Unicode support.

I'd recommend this way to anyone who wants to get more involved in the
Perl community, although I was very lucky in terms of who else happened
to be around at the time: Gurusamy Sarathy was extremely gracious in
helping me turn my fledgling C code into something fit for the Perl
core, and he also helped me understand the C<perl5-porters> etiquette
(yes, there was some at the time) and what makes a good patch, while
Jarkko Hietaniemi was always good for suggestions of interesting things
for keen people to work on. Seriously, get involved. If I can do it,
anyone can.

Anyway, this fixation with understanding the Perl 5 internals, and
especially the Perl 5 compiler (due to yet another of my Perl
influences, the great Malcolm Beattie), led to quite a torrent of
modules, from L<ByteCache>, an implementation of just-in-time
compilation for Perl modules, through L<B::Flags> and L<B::Tree> to help
visualise the Perl op tree, to L<uninit>, L<B::Generate>, L<optimizer>
and L<B::Utils> for modifying it.

=head2 Perl About The House

Now we abandon chronological order somewhat and take a look at the
various areas in which I've used Perl. One of these areas has been the
automation of everyday life: checking my bank balance with
L<Finance::Bank::LloydsTSB> (the first Perl module to interface to
personal internet banking, no less) and my phone bill with a release of
Tony Bowden's L<Data::BT::PhoneBill>. 

L<Finance::Bank::LloydsTSB> was meant to go with L<Finance::QIF>, my
Quicken file parser, to produce another now-abandoned idea, a Perl
finances manager. It seemed that I was only capable of producing modules,
not full standalone applications - or at least, it seemed that way until
I produced L<Bryar>, my blogging software, based on the concepts from
Rael Dornfest's C<blosxom> and beginning my adventures with Andy
Wardley's Template Toolkit. Bryar also tuned me in to the
Model-View-Controller framework idea, of which more later.

Another project I briefly played with was a personal robot, using the
C<Sphinx>/C<Festival> speech handling and recognition modules from
Cepstral and Kevin Lenzo. I didn't have X10, so I couldn't shout
"lights" into the air in a wonderfully scifi way, but I could shout
"mail" and have a summary of my inbox read to me, "news" to get the
latest BBC news headlines, and "time" to hear the time. Of course,
getting computers to tell the time nicely takes a little bit of work. I
don't like "It's eleven oh-three pee em", since that's not what someone
would say if you asked them the time. I wanted my robot to say "It's
just after eleven", and that's what L<Time::Human> does. Shame about the
localisation.

=head2 Messing About With Classes

One of the things that continues to amaze me about Perl is its
flexibility; the way you can change core parts of its operation, even
from pure Perl. This led to quite a few modules, many of which were
mere proofs of concept.

L<Sub::Versive>, for instance, was the first module on CPAN to handle
pre- and post-hooks for a subroutine; it has since been joined by a
plethora of imitators. It was written, though, in response to a peculiar
scenario. I was writing a module (C<Safety::First>) which provided
additional built-in-like functions for Perl to encourage and facilitate
defensive programming and intelligible error reporting. ("Couldn't open
file? Why not?") These built-ins had to be available from every
package, which meant playing with C<UNIVERSAL::AUTOLOAD>. But what if
another package was already using C<UNIVERSAL::AUTOLOAD>? Hence,
C<Sub::Versive> wrapped it in a pre-hook. Of course, with the
interesting bit of the problem solved, C<Safety::First> was abandoned.

L<Class::Dynamic> was an interesting attempt to provide support for code
references in C<@ISA>, analogous to code references in C<@INC>. It
works, but of course I could never find any practical use for it.

L<Class::Wrap> was written as a lazy profiler. A certain application I
was writing for my employer of the time, Kasei, made use of the (IMHO
evil) C<Mail::Message> module. How do we isolate all calls to that
class? There are plenty of modules out there for instrumenting
individual methods, including of course C<Sub::Versive>. But the whole
class? C<Class::Wrap> takes a wonderfully brute-force but workable
approach to the problem. A real profiler, however, can be constructed
from L<Devel::DProfPP>, which is sort of a profiler toolkit.

I wrote a couple of other modules with Kasei in this category,
particularly while working on our Plucene port of the Lucene search
engine. (I guess I could claim C<Plucene> as one of my 100 modules, but
that would be to deny Marc Kerr the recognition he deserves for the work
he put in to packaging, documenting and providing tests for my insane
and scrambled code.) I wrote L<Bit::Vector::Minimal>, for instance, as I
ported C<org.apache.lucene.util.BitVector>; L<Tie::Array::Sorted>, which
I'm amazed wasn't already implemented on CPAN, provided the Perl
equivalent of C<org.apache.lucene.util.PriorityQueue>.
L<Lucene::QueryParser>, of course, does what it says on the tin. (I also
produced a couple of add-ons for Plucene after leaving Kasei when I was
doing a bit of Plucene consultancy:
L<Plucene::Plugin::Analyzer::PorterAnalyzer> and
L<Plucene::Plugin::WeightedQueryParser>.)

Another module produced in the course of writing Plucene was
L<Class::HasA>, a handy little utility module which works well with Tony
Bowden's C<Class::Accessor> and merely dispatches certain method calls
to objects contained within your object.

And speaking of C<Class::Accessor>, L<Class::Accessor::Assert> would
have been a godsend while writing Plucene, as it's a version of accessor
handling which typechecks what you're putting into the accessor slots.
When you're converting a typed language into an untyped one, occasional
checks that you're handling the right kind of object don't go amiss. I
learnt my lesson eventually, though, and wrote the module after Plucene
was done.

Another Java-influenced module was C<Attribute::Final>, which was written
for my book Advanced Perl Programming as an example of both attributes
and messing about with the class model - by marking some subroutines as
C<:final>, you get an error if a derived class attempts to override them.
As with many of my proof-of-concept modules, this isn't something I'd
ever use myself, but I know others have used it. I'll let you into a
secret - over the past few months I've settled on giving modules a
version number of C<0.x> if I've never used them myself and C<1.x> if I
have.

Java wasn't the only language to influence my Perl coding activities.
Ruby is a wonderful little language I first encountered in Japan, but
didn't really get into until around 2003. Of course, when you see
another language has some good ideas, you steal them, which is what I
did with L<rubyisms>, L<SUPER>, and L<Class::SingletonMethod> - all of
which, by the way, are B<excellent> examples of what you can do to the
behaviour of Perl just from pure Perl. C<SUPER> is the kind of module
I've so often wanted to use in production code but never dared.

=head2 Smart Perl

My views on human-computer interface and computer usability have been
unchanged since I wrote C<Tie::DiscoveryHash> way back in the mists of
time. The underlying principle behind that module was simple: the user
should B<never> tell the computer anything it already knows or can
reasonably be expected to work out. C<Tie::DiscoveryHash> was all about
having the computer find out stuff for itself.

This has influenced a number of my modules, which have focussed on
trying to make everything as simple as possible for the user (or more
usually, for the programmer using my modules) and then a bit simpler.

So, for instance, I found the whole process of keeping values persistent
between runs of Perl a bit of a nightmare - I could never remember the
syntax for tying to C<DB_File>, and I would always forget to use the
extremely handy C<MLDBM> module. I just wanted to say "keep this
variable around". L<Attribute::Persistent> does just that, cleanly and
simply. It even works out a sensible place to put the database, so you
don't have to.

Similarly, L<Config::Auto> works out where your application might keep a
configuration file, works out what format it's in, parses it, and hands
you back a hash. No muss, no fuss. And more importantly, no need to even
think about writing a config file parser again. It's done once, forever.
L<Getopt::Auto> applies the same design principles to handling command
line arguments - I hate forgetting how to use C<Getopt::Long>.

Other attempts at making things simple for the end-user weren't that
successful. As part of writing my (first) mail archiving and indexing
program, C<Mail::Miner>, of which more later, I wanted a nice way for
users to specify a time period in which they're looking for mails - "a
week ago", "sometime last summer", "near the beginning of last month" -
and so on. L<Date::PeriodParser> would take these descriptions and turn
them into a start and end time in which to search. Except, of course,
that this is a very hard thing to do and requires a lot of heuristics,
and while I started off quite well, as ever, I got distracted with other
interesting and considerably more tractable problems.

=head2 Mail Handling

A good number of my Perl modules focussed on mail handling, so many that
I was actually able to get a job basically doing mail processing in
Perl. It all started with L<Mail::Audit>. I was introduced to
F<procmail> at University, and it was useful enough, but it kept having
locking problems and losing my mail, and I didn't really understand it,
to be honest, so I wanted to write my mail filtering rules in Perl.
C<Mail::Audit> worked well for a couple of years before it grew into an
obese monster. I actually only use a very old version of C<Mail::Audit>
on my production server.

As part of the attempt to slim it back down again, I abstracted out one
of the major parts of its functionality, delivering an email to a local
mailbox. Now I only use mbox files, so it was reasonably easy for me,
but people wanted me to add Maildir and whatever to C<Mail::Audit>, so I
kicked it all out to L<Mail::LocalDelivery> instead.

But I found that I still wasn't able to filter my mail adequately and
find the stuff I needed from it. Attachments were a big problem, since
they both made ordinary search with C<grep> or C<grepmail> much slower,
and they weren't always easy to find anyway. So I wrote something to
remove attachments from mail and stick them in a database and, while I
was at it, index mail for quick retrieval. And then it grew to identifying
"interesting" features of an email and searching for them too, and then
L<Mail::Miner> was born.

Finally, I got into web display of archived email, and needed a way of
displaying threads. Amazingly, nobody had coded up JWZ's mail threading
algorithm in Perl yet, so I did that too: L<Mail::Thread>.

But then I decided that C<Mail::*> was in a very sick state. I had been
working with the mail handling modules from CPAN - including my own -
and grown to hate them; they were all too slow, too complicated, too
buggy or all three. It was time for action, and the Perl Email Project
was born. 

L<Email::Simple> was the first thing to come out of this, and is 


