HTML-TableExtractor


=head1 DESCRIPTION

Parses HTML looking for table-related elements (table, tr, td and th as of 
version 0.1).

Three callbacks can be registered for each element. These callbacks,
described below, are executed whenever an element of a particular type is
encountered.
  
  o  start_${tagname}  Called whenever an opening $tagname is encountered.
  o  ${tagname}        Called immediately after start_${tagname}, and again
                       immediately before end_${tagname}.
  o  end_${tagname}    Called whenever a closing $tagname is encountered.
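
The firing order described above can be illustrated with a small
self-contained sketch. It does not use this module and is not its
implementation: dispatch_table_tags() is a hypothetical stand-in, and its
regex-based tag scan is an assumption that only suits a tidy sample string.
It exists solely to show which callback names fire, and in what order, for
one opened and closed row.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dispatch step, written only to illustrate
# the documented callback order. This is NOT HTML::TableExtractor's code.
sub dispatch_table_tags {
    my ($html, %callbacks) = @_;
    my %wanted = map { $_ => 1 } qw(table tr td th);

    # Crude tag scan: enough for the sample below, not a real HTML parser.
    while ($html =~ m{<(/?)(\w+)[^>]*>}g) {
        my ($closing, $tag) = ($1, lc $2);
        next unless $wanted{$tag};

        # Opening tag: start_X, then X. Closing tag: X, then end_X.
        my @names = $closing ? ($tag, "end_$tag") : ("start_$tag", $tag);
        for my $name (@names) {
            $callbacks{$name}->() if $callbacks{$name};
        }
    }
    return;
}

my @log;
dispatch_table_tags(
    '<table><tr><td>x</td></tr></table>',
    start_tr => sub { push @log, 'start_tr' },
    tr       => sub { push @log, 'tr' },
    end_tr   => sub { push @log, 'end_tr' },
);
print join(', ', @log), "\n";   # prints: start_tr, tr, tr, end_tr
```

Note that the plain tr callback appears twice in the log, once for the
opening tag and once for the closing tag, so a callback registered under the
bare tag name cannot tell the two events apart without keeping its own state.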


=head2 EXAMPLE

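  use HTML::TableExtractor;

  # $html holds the HTML document to parse.
  my $p = HTML::TableExtractor->new();
  $p->parse($html,
      # Called when an opening <table> tag is encountered.
      start_table => sub {
        my ($attr, $origtext) = @_;
        print "Table border is $attr->{border}\n";
      },
      # Called both after a <tr> opens and before it closes.
      tr => sub { print "Row opened or closed.\n" },
      );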

	
=head1 METHODS

=over 4

=item start($parser, $tag, $attr, $attrseq, $origtext);

Called whenever a recognised start tag is encountered. This module
recognises these tags: <table>, <tr>, <td> and <th>.

This method will be called by the parser and is not intended to be called from
an application. 

=item end($parser, $tag, $origtext); 

Called whenever a particular end tag is encountered.

This method will be called by the parser and is not intended to be called from
an application. 

=item $p->parse($html, tag_type => \&coderef, ...);

This is the only method an application normally needs to call. Pass it the
HTML to parse, followed by a callback for each tag type of interest. The
callbacks are executed as described in the DESCRIPTION section.


=back

=head2 EXPORTS


=head2 CAVEATS, BUGS, and TODO

o  parse() currently accepts only a string of HTML. It should also handle
other data sources, such as streams and file handles.


=head2 SEE ALSO

HTML::Parser, HTML::TableContentParser

=head1 AUTHOR

Simon Drabble  E<lt>simon@thebigmachine.orgE<gt>

(C) 2002  Simon Drabble  

This software is released under the same terms as Perl itself.

=cut


