BioPerl

 view release on metacpan or  search on metacpan

Bio/DB/GFF/Adaptor/dbi/pg_fts.pm  view on Meta::CPAN


=head1 NOTES ABOUT TSearch2 SEARCHING

You should know a few things about how searching with TSearch2 works in
the GBrowse environment:

=over

=item 1

TSearch2 does not do wild cards, so you should encourage your users not
to use them.  If wild cards are used, the adaptor will fall back on 
an ILIKE search, which will be much slower.

=item 2

However, TSearch2 does do 'word stemming'.  That is, if you search
for 'copy', it will find 'copy', 'copies', and 'copied'.

=item 3

TSearch2 does not do phrase searching; all of the terms in the
search string are ANDed together.

=back

=head1 ACKNOWLEDGEMENTS

Special thanks to Russell Smithies and Paul Smale at AgResearch in
New Zealand for giving me their recipe for doing full text indexing
in a GFF database.

=head1 BUGS

Please report bugs to the BioPerl and/or GBrowse mailing lists
(L<mailto:bioperl-l@lists.open-bio.org> and L<mailto:gmod-gbrowse@lists.sourceforge.net>
respectively).

=head1 SEE ALSO

Please see L<Bio::DB::GFF::Adaptor::dbi::pg> for more information
about tuning your PostgreSQL server for GFF data, and for general
information about GFF database access, see L<Bio::DB::GFF>.

=head1 AUTHOR

Scott Cain, cain@cshl.edu

=head1 APPENDIX

=cut

# a simple postgres adaptor
use strict;
use Bio::DB::GFF::Adaptor::dbi;
use base qw(Bio::DB::GFF::Adaptor::dbi::pg);

use constant FULLTEXTSEARCH => <<END;
SELECT distinct gclass,gname,fattribute_value,fmethod,fsource
    FROM fgroup,fattribute_to_feature,fdata,ftype
     WHERE fgroup.gid=fdata.gid
       AND fdata.fid=fattribute_to_feature.fid
       AND fdata.ftypeid=ftype.ftypeid
       AND (fattribute_to_feature.idxfti @@ to_tsquery('default', ?))
END
;

use constant FULLTEXTWILDCARD => <<END;
SELECT distinct gclass,gname,fattribute_value,fmethod,fsource
    FROM fgroup,fattribute_to_feature,fdata,ftype
     WHERE fgroup.gid=fdata.gid
       AND fdata.fid=fattribute_to_feature.fid
       AND fdata.ftypeid=ftype.ftypeid
       AND lower(fattribute_to_feature.fattribute_value) LIKE lower(?)
END
;

sub new {
  my $class = shift;
  my $self  = $class->SUPER::new(@_);
  return $self;
}

=head2 search_notes

 Title   : search_notes
 Usage   : @search_results = $db->search_notes("full text string",$limit)
 Function: Search the notes for a text string, using PostgreSQL TSearch2
 Returns : array of results
 Args    : full text search string, and an optional row limit
 Status  : public

This is based on the mysql-specific method that makes use of the TSearch2
functionality in PosgreSQL's contrib directory. Given a search string,
it performs a full-text search of the notes table and returns an array
of results.  Each row of the returned array is a arrayref containing
the following fields:

  column 1   A Bio::DB::GFF::Featname object, for passing to segment()
  column 2   The text of the note
  column 3   A relevance score.

=cut

sub search_notes {
  my $self = shift;
  my ($search_string,$limit) = @_;

  my @terms = split /\s+/, $search_string;

  my $sth;
  if ($search_string =~ /\*/) {
      $search_string =~ tr/*/%/s;
      my $query = FULLTEXTWILDCARD;
      $query   .= " limit $limit" if defined $limit;
      $sth      = $self->dbh->do_query($query,$search_string);
  }
  elsif (@terms == 1) {
      my $query = FULLTEXTSEARCH;
      $query   .= " limit $limit" if defined $limit;
      $sth      = $self->dbh->do_query($query,$search_string);
  }
  else {
      my $query = FULLTEXTSEARCH;
      my $andstring = join (' & ', @terms);
#      $query   .= qq{ AND (fattribute_to_feature.fattribute_value ILIKE '\%$search_string%')};
      $query   .= " LIMIT $limit" if defined $limit;
      $sth      = $self->dbh->do_query($query,$andstring);
  } 
  
  my @results;



( run in 0.572 second using v1.01-cache-2.11-cpan-5735350b133 )