HTML-Parser
experimental features. The interface to these might still change.
* Implemented filters to reduce the number of callbacks generated:
- $p->ignore_tags()
- $p->report_only_tags()
- $p->ignore_elements()
* New @attr argspec. Less overhead than 'attr' and allows
compatibility with XML::Parser style start events.
* The whole argspec can be wrapped up in @{...} to signal
flattening. Only makes a difference when the target is an
array.
3.19 2001-03-09
* Avoid the entity2char global. That should make the module
more thread safe. Patch by Gurusamy Sarathy <gsar@ActiveState.com>.
3.18 2001-02-24
* There was a C++ style comment left in util.c. Strict C
compilers do not like that kind of stuff.
3.17 2001-02-23
* The 3.16 release broke MULTIPLICITY builds. Fixed.
3.16 2001-02-22
* The unbroken_text option now works across ignored tags.
* Fix casting of pointers on some 64 bit platforms.
* Fix decoding of Unicode entities. Only optionally available for
perl-5.7.0 or better.
* Expose internal decode_entities() function at the Perl level.
* Reindented some code.
3.15 2000-12-26
* HTML::TokeParser's get_tag() method now takes multiple
tags to match. Hopefully the documentation is also a bit clearer.
* #define PERL_NO_GET_CONTEXT: Should speed up things for thread
enabled versions of perl.
* Quote some more entities that also happen to be perl keywords.
This avoids warnings on perl-5.004.
* Unicode entities only triggered for perl-5.7.0 or higher.
3.14 2000-12-03
* If a handler triggered by flushing text at eof called the
eof method then infinite recursion occurred. Fixed.
Bug discovered by Jonathan Stowe <gellyfish@gellyfish.com>.
* Allow <!doctype ...> to be parsed as declaration.
3.13 2000-09-17
* Experimental support for decoding of Unicode entities.
3.12 2000-09-14
* Some tweaks to get it to compile with "Optimierender Microsoft (R)
32-Bit C/C++-Compiler, Version 12.00.8168, fuer x86."
Patch by Matthias Waldorf <matthias.waldorf@zoom.de>.
* HTML::Entities documentation spelling patch by
David Dyck <dcd@tc.fluke.com>.
3.11 2000-08-22
* HTML::LinkExtor and eg/hrefsub now obtain %linkElements from
the HTML::Tagset module.
3.10 2000-06-29
* Avoid core dump when stack gets relocated as the result of
text handler invocation while $p->unbroken_text is enabled.
Needed to refresh the stack pointer.
3.09 2000-06-28
* Avoid core dump if somebody clobbers the aliased $self argument of
a handler.
* HTML::TokeParser documentation update suggested by
Paul Makepeace <Paul.Makepeace@realprogrammers.com>.
3.08 2000-05-23
* Fix core dump for large start tags.
Bug spotted by Alexander Fraser <green795@hotmail.com>
* Added yet another example program: eg/hanchors
* Typo fix by Jamie McCarthy <jamie@mccarthy.org>
3.07 2000-03-20
* Fix perl5.004 builds (was broken in 3.06)
* Declaration parsing mode now only triggers for <!DOCTYPE ...> and
<!ENTITY ...>. Based on patch by la mouton <kero@3sheep.com>.
3.06 2000-03-06
* Multi-threading/MULTIPLICITY compilation fix.
Both Doug MacEachern <dougm@pobox.com> and
Matthias Urlichs <smurf@noris.net> provided a patch.
* Avoid some "statement not reached" warnings from picky
compilers.
* Remove final commas in enums as ANSI C does not allow
them and some compilers actually care.
Patch by James Walden <jamesw@ichips.intel.com>
* Added eg/htextsub example program.
3.05 2000-01-22
* Implemented $p->unbroken_text option
* Don't parse content of certain HTML elements as CDATA when
xml_mode is enabled.
* Offset was reported with wrong sign for text at end of chunk.
3.04 2000-01-15
* Backed out 3.03-patch that checked for legal handler and attribute
names in the HTML::Parser constructor.
* Documentation typo fixed by Michael.
3.03 2000-01-14
* We did not get out of comment mode for comments ending with an
odd number of "-" before ">". Patch by la mouton <kero@3sheep.com>
* Documentation patch by Michael.
3.02 1999-12-21
* Hide ~-magic IV-pointer to 'struct p_state' behind a reference.
This allows copying of the internal _hparser_xs_state element, and
will make HTML-Tree-0.61 work again.
* Introduced $p->init() which might be useful for subclasses that
only want the initialization part of the constructor.
* Filled out DIAGNOSTICS section of the HTML::Parser POD.
3.01 1999-12-19
* Rely on ~-magic instead of a DESTROY method to deallocate
the internal 'struct p_state'. This avoids memory leaks
when people simply wipe the content of the object hash.
* One of the assertions in hparser.c had opposite logic. This made