Acme-OneHundredNotOut

a huge amount of other cool stuff you can do with it. 

I swear the day I wrote L<Text::Ngram>, there were no other modules on
CPAN which extracted n-grams, but as soon as I released it, it looked
like there were three or four there all along. (Including one from
Jarkko, no less.) Anyway, I wanted to see if I could still remember how
to write XS modules, especially since I'd just written a book about it.

L<Lingua::EN::Inflect::Number> is a terrible hack, but it works. I
needed it to make C<Class::DBI::Relationship> (of which more later)
more human-friendly. L<Lingua::EN::FindNumber> is another hack written
for APP; I was a little surprised that C<Lingua::EN::Words2Nums>, which
is a fantastic module in its own right, can turn English descriptions of
numbers into digits, but it can't actually pull the numbers out of a
text in the first place. So I fixed that.

=head2 Text Munging, and Some More Mail Stuff

Applying my linguistic experience to the problems of intelligent mail
indexing, searching and displaying led to churning out another set of
modules.

The first problem was what to do with search results. You know those
little snippets that Google and other search engines display when you
search for some terms? They contextualise the terms in the body of the
document and highlight them in a snippet that best represents how
they're used in the document. This is actually a really hard problem,
and it took me several goes to get L<Text::Context> right. It uses
L<Text::Context::EitherSide> as an "emergency" contextualizer if it
can't get anything right at all, but the algorithm itself is a bit of a
swine. I actually had to prototype this module in Ruby to get my
thinking clear enough to code it up in Perl...
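To give a feel for the "emergency" fallback idea, here is a minimal sketch in core Perl of an either-side contextualizer: keep a few words on each side of every hit and join the pieces with ellipses. The function name C<context_either_side> and its arguments are made up for illustration; this is not the actual interface or algorithm of either module.

```perl
use strict;
use warnings;

# Hypothetical helper: keep $n words either side of each word containing $term.
sub context_either_side {
    my ($text, $term, $n) = @_;
    my @words = split ' ', $text;
    my %keep;
    for my $i (0 .. $#words) {
        # Case-insensitive substring match against each word.
        next unless index(lc($words[$i]), lc($term)) >= 0;
        my $from = $i - $n < 0       ? 0       : $i - $n;
        my $to   = $i + $n > $#words ? $#words : $i + $n;
        $keep{$_} = 1 for $from .. $to;
    }
    my (@out, $last);
    for my $i (sort { $a <=> $b } keys %keep) {
        # Mark any gap between kept runs of words with an ellipsis.
        push @out, '...' if defined $last && $i != $last + 1;
        push @out, $words[$i];
        $last = $i;
    }
    return join ' ', @out;
}

print context_either_side(
    'the quick brown fox jumps over the lazy dog', 'fox', 1
), "\n";
```

The hard part, which this sketch ignores entirely, is choosing which hits to show when the snippet has a length budget and several terms compete for it.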

L<Text::Quoted> was another mail display problem - it's nice to
display different layers of quoted text in an email in different
colours. Identifying the quoted text isn't that hard, but working out
how a particular bit nests is surprisingly tricky. So I sorted it out.
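The easy half of the problem can be sketched in a few lines of core Perl, assuming that quoting is marked only by leading C<E<gt>> characters. The helpers C<quote_depth> and C<group_by_depth> are hypothetical names for this illustration, not part of the module's interface; real mail also has prefixes like C<Simon E<gt>>, wrapped lines and mixed quote characters, which is where it gets tricky.

```perl
use strict;
use warnings;

# Hypothetical helper: count the leading '>' markers on one line.
sub quote_depth {
    my ($line) = @_;
    # Capture the run of optional whitespace plus '>' at the start of the line.
    my ($prefix) = $line =~ /^((?:\s*>)*)/;
    return ($prefix =~ tr/>//);
}

# Group consecutive lines of equal depth into chunks, ready for colouring.
sub group_by_depth {
    my (@lines) = @_;
    my @chunks;
    for my $line (@lines) {
        my $depth = quote_depth($line);
        if (@chunks && $chunks[-1]{depth} == $depth) {
            push @{ $chunks[-1]{lines} }, $line;
        }
        else {
            push @chunks, { depth => $depth, lines => [$line] };
        }
    }
    return @chunks;
}

my @chunks = group_by_depth('> > first', '> reply', '> more reply', 'new text');
printf "%d: %s\n", $_->{depth}, join(' | ', @{ $_->{lines} }) for @chunks;
```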

The next problem I had to solve led on from this. Suppose you've got
some mail, which is plain text, and you're going to display it as HTML.
Along the way, you want to turn any URIs into links (maybe using
something like L<URI::Find::Schemeless::Stricter> to find things which
look like URLs, but which doesn't think that numbered lists are IP
addresses), escape any non-HTML-safe characters, highlight search terms,
put different quoted regions in different colours, and maybe do other
things too. The thing is, you have to be very careful about the order in
which you do this. Once you've escaped the HTML, you might mess up your
colouring of quoted text, but if you've turned the URIs into links
first, you'll mess them up when you escape all the HTML entities.
L<Text::Decorator> allows you to do all these transformations in a nice,
safe way, "layering" things like URI escaping, highlighting, and so on,
and then rendering to text or HTML or whatever when all the layers have
been applied.
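One way to picture the layering idea, sketched here in core Perl under my own assumptions rather than taken from the module: never rewrite the string in place. Instead, split it into segments, record what each segment is, and only escape and wrap at render time. The names C<segment_uris> and C<render_html> are invented for this example, and the URL pattern is deliberately crude.

```perl
use strict;
use warnings;

# Escape the characters that are unsafe in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Layer 1: split plain text into segments and flag the URL-looking ones.
# Nothing is rewritten yet, so later layers still see the raw text.
sub segment_uris {
    my ($text) = @_;
    # With one capture group, split returns the matched URLs at odd indices.
    my @pieces = split m{(https?://[^\s<>"]+)}, $text;
    my @segments;
    for my $i (0 .. $#pieces) {
        next if $pieces[$i] eq '';
        push @segments, { text => $pieces[$i], uri => $i % 2 };
    }
    return @segments;
}

# Render: escape each segment's raw text once, then add the markup around it.
sub render_html {
    my (@segments) = @_;
    return join '', map {
        my $safe = escape_html($_->{text});
        $_->{uri} ? qq{<a href="$safe">$safe</a>} : $safe;
    } @segments;
}

print render_html(segment_uris('See http://example.com/?a=1&b=2 <now>')), "\n";
```

Because escaping happens exactly once, at the end, the link markup is never escaped and the ampersand inside the URL is never left raw.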

C<Text::Decorator> was written in a meta-programming system I wrote
called L<pool>, which I should probably use more. It writes the boring
bit of OO classes for you given a simple description of the methods and
attributes.

Oh, and if you're not contextualising search terms in a mail snippet,
you probably just want to display the original content rather than the
first few lines, which invariably contain lots of quoting of another
message. L<Text::Original>, extracted from the code of the Mariachi
project and so actually only packaged by me and written by Richard Clamp
and Simon Wistow, does just this.

L<WWW::Hotmail> was an attempt to solve the problem of how to import all
the mail a user already has into our archiving program, a problem Gmail
is now dealing with. Actually, Gmail's currently dealing with pretty
much all the problems we looked at last year. It's quite funny, really.

=head2 SIMON Hits The Web

I hate web programming. HTML is boring, CGI is boring, and I tried
avoiding it for as long as I could. This stopped when I worked for
Oxford University, handling their webmail service, which led to
L<Bundle::WING>. Also at Oxford, I had to work with C<AxKit>, which
caused me innumerable headaches but I finally got some working XSP
applications written, not without writing the
L<Apache::AxKit::Language::XSP::ObjectTaglib> and
L<AxKit::XSP::Minisession> helper modules. I also did some playing
around with C<mod_perl>, thanks to the rather wonderful I<mod_perl
Cookbook>, and came up with L<Apache::OneTimeURL> when, during a
particularly paranoid phase, I wanted to give out my physical address
in URLs that would self-destruct after a single reading.

After leaving, though, I discovered the C<Class::DBI>/Template Toolkit
pair which has dominated my web programming since then. If you haven't
played with these two modules yet, you really need to, since they
work so well together, and with other modules like C<CGI::Untaint>, that 
they simplify so much of web and database work. I extended
C<CGI::Untaint> with a bunch of extra patterns while at Kasei and
afterwards, including L<CGI::Untaint::ipaddress>,
L<CGI::Untaint::upload> and L<CGI::Untaint::html>.
I also wrote a whole plethora of C<CDBI> extensions:
L<Class::DBI::AsForm>, L<Class::DBI::Plugin::Type>,
L<Class::DBI::Loader::GraphViz> (reflecting my penchant for data
visualization), and L<Class::DBI::Loader::Relationship>, which applies
the "as simple as possible and a bit simpler" approach to defining data
relationships.

The whole culmination of C<CDBI>, TT, and all these other technologies
came when I sat down and wrote L<Maypole>, a Model-View-Controller
framework with, again, emphasis on making things very simple to get
working. The Perl Foundation's sponsorship of Maypole development has
been one of the proudest achievements in my CPAN career, and led not
only to a stonking big manual and loads of examples, but also to
L<Maypole::Authentication::UserSessionCookie> and L<Maypole::Component>.

Template Toolkit and XML came back together again in a recent project
where I've had to render some XML as part of a Maypole application.
Amazingly, there wasn't an XSLT filter for the Template Toolkit, so
L<Template::Plugin::XSLT> was born.

=head2 Games, Diversions and Toys

It was only when I got back from Japan that I learnt to play Go. How
stupid was that. For a year I had access to some of the best Go clubs
and professional teachers and players in the world, and then I only pick
the bloody game up when I get back to England. Anyway, any computer
programmer who learns to play Go, and they all do sooner or later,


