Search-Tools


Changes

 - optimizations to HeatMap and Snipper sentence detection, which has the
   nice side effect of avoiding breaking HTML entities in snipped HTML. To
   take advantage, use as_sentences => 1.

0.77 15 Aug 2012
 - add stemming support for Query->matches_html and Query->matches_text
 - add HiLiter->html_stemmer with passthrough to plain_stemmer until
   failing test cases materialize.
 - some fixes for stemming support, mostly turning off optimizations based
   on regular expressions.

0.76 7 Aug 2012
 - finally(!) add real stemming tests and support to Snipper and HiLiter 

0.75 6 Aug 2012
 - add some tests for Perl 5.17.x test failures
 - fix edge case where short snip generated spurious ellipses

0.74 21 May 2012
 - yank some meta data from a test doc to avoid security scan problems on
   CPAN

0.73 13 May 2012 (Happy Mothers Day)
 - fix edge case with snipping phrases that contain non-word characters
   other than spaces.

0.72 30 April 2012
 - more fixes, similar to 0.71 (for now missing Keywords class)

0.71 28 Feb 2012
 - fix failing tests due to removed classes in 0.70

0.70 23 Feb 2012
 - refactor XML->escape for some performance gain
 - remove long-deprecated Keywords classes

0.69 22 Feb 2012
 - fix XML->escape() to preserve UTF-8 flag on the returned SV*

0.68 15 Jan 2012
 - add missing dTHX macro per
   https://rt.cpan.org/Ticket/Display.html?id=74022

0.67 12 Jan 2012
 - bolster Tokenizer sentence detection, adding list of abbreviations from
   Lingua::EN::Tagger.
 - fix missing 'lang' param for SpellCheck
 - fix placement of dSP macro in tokenize() C func to properly scope stack
   variables.
 - add slurp() method to Search::Tools

0.66 05 Dec 2011
 - undo 0.65 change, since HTML entities are case sensitive
   (http://www.w3.org/TR/html4/charset.html#h-5.3.2)

0.65 02 Dec 2011
 - lowercase named entity matches. patch from Adam Lesperance.

0.64 02 Dec 2011
 - optimizations to regex matching in Query->matches and HiLiter
 - according to the Unicode spec, \xfeff (BOM) is deprecated as a
   whitespace character in favor of \x2060. HTML whitespace definition
   changed accordingly.
 - fix edge case in HiLiter where match on single letter could cause
   infinite loop.
 - add Query->fields method to see the fields searched for.
 - fix XML->unescape_named to support entities with \d in them, and
   case-insensitive. https://rt.cpan.org/Ticket/Display.html?id=72904

0.63 06 Oct 2011
 - change __func__ macro to use FUNCTION__ instead since Perl core
   implements that portable macro.

0.62 26 Aug 2011
 - remove ';' as sentence boundary character (it was marked as TODO in
   search-tools.c) because character entities use it (e.g. &amp;).

0.61 29 July 2011
 - add term_min_length option to QueryParser, to ignore terms unless they
   are N chars or longer. Useful for skipping single-character words when
   Snipping or HiLiting. For backwards compatibility the default is 1.
 - fix treat_uris_like_phrases regex to add / character in addition to @.\

0.60 13 July 2011
 - fix whitespace def to include &nbsp; (broke HTML::HiLiter)

0.59 19 June 2011
 - add normalize_whitespace feature to XML->no_html() method.
 - add several Unicode whitespace defs to $whitespace regex in XML class
   per http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters

0.58 27 May 2011
 - fix unescaped string in regex in HiLiter

0.57 22 Feb 2011
 - extend bug-fix from 0.56 to prevent false matches on match markers.

0.56 10 Feb 2011
 - fix bug where query terms 'span' or 'style' were breaking hiliting by
   "double-dipping"

0.55 25 Oct 2010
 - disable one more test for perl >= 5.14 (see 0.54)

0.54 24 Oct 2010
 - fixes for Search::Query 0.18
 - disabled some tests that break under perl >= 5.14.  See
   https://rt.cpan.org/Ticket/Display.html?id=62417

0.53 26 June 2010
 - add ->matches_text and ->matches_html methods to Query class

0.52 22 June 2010
 - tweak locale tests because some OSes (linux) use 'UTF8' instead of
   'UTF-8' naming.
 - small optimizations to HiLiter

0.51 23 May 2010
 - singularizer in XML->perl_to_xml will now treat common English plurals

0.50 19 May 2010


