BioPerl


Bio/DB/GFF.pm

 Args    : a set of named parameters
 Status  : Public

This method can be used to initialize an empty database.  It takes the following
named arguments:

  -erase     A boolean value.  If true the database will be wiped clean if it
             already contains data.

Other named arguments may be recognized by subclasses.  They become database
meta values that control various settable options.

As a shortcut (and for backward compatibility) a single true argument
is the same as initialize(-erase=E<gt>1).
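
The way default meta values combine with caller-supplied ones can be
sketched in isolation.  This is a minimal illustration, not the module's
own code: merge_meta and the sample keys are hypothetical, and the sketch
assumes the caller's keys are already uppercase.

```perl

  use strict;
  use warnings;

  # Hypothetical helper: uppercase the keys of a flat (key, value, ...)
  # default list, then let caller-supplied values override the defaults.
  sub merge_meta {
      my ($defaults, $meta) = @_;
      my @default = @$defaults;
      for my $i (0 .. $#default) {
          $default[$i] = uc $default[$i] if $i % 2 == 0;   # even index = key
      }
      my %values = (@default, %{ $meta || {} });           # later pairs win
      return \%values;
  }

  # Sample keys are made up for illustration.
  my $values = merge_meta([ max_bin => 100_000, fulltext => 1 ],
                          { FULLTEXT => 0 });

```

Because the caller's pairs come last in the list, they replace any
default stored under the same key.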

=cut

sub initialize {
  my $self = shift;

  my ($erase,$meta) = rearrange(['ERASE'],@_);
  $meta ||= {};

  # initialize (possibly erasing)
  return unless $self->do_initialize($erase);
  my @default = $self->default_meta_values;

  # uppercase the even-indexed elements, i.e. the keys of the key/value list
  # (necessary for case-insensitive SQL databases)
  for (my $i=0; $i<@default; $i++) {
    $default[$i] = uc $default[$i] if !($i % 2);
  }

  my %values = (@default,%$meta);
  foreach (keys %values) {
    $self->meta($_ => $values{$_});
  }
  1;
}


=head2 load_gff

 Title   : load_gff
 Usage   : $db->load_gff($file|$directory|$filehandle [,$verbose]);
 Function: load GFF data into database
 Returns : count of records loaded
 Args    : a directory, a file, a list of files, 
           or a filehandle
 Status  : Public

This method takes a single overloaded argument, which can be any of:

=over 4

=item *

a scalar corresponding to a GFF file on the system

A pathname to a local GFF file.  Any files ending with the .gz, .Z, or
.bz2 suffixes will be transparently decompressed with the appropriate
command-line utility.

=item *

an array reference containing a list of GFF files on the system

For example ['/home/gff/gff1.gz','/home/gff/gff2.gz']

=item *

directory path

The indicated directory will be searched for all files ending in the
suffixes .gff, .gff.gz, .gff.Z or .gff.bz2.

=item *

filehandle

An open filehandle from which to read the GFF data.  Tied filehandles
are also supported.

=item *

a pipe expression

A pipe expression will also work. For example, a GFF file on a remote
web server can be loaded with an expression like this:

  $db->load_gff("lynx -dump -source http://stein.cshl.org/gff_test |");

=back
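
The argument forms listed above can be told apart with a few simple
checks.  The following is only a sketch of that idea: classify_source is
a hypothetical helper and does not reproduce the module's internal
dispatch.

```perl

  use strict;
  use warnings;
  use Scalar::Util qw(blessed);

  # Hypothetical helper: name the documented form that an argument takes.
  sub classify_source {
      my $src = shift;
      return 'list of files' if ref($src) eq 'ARRAY';
      return 'filehandle'    if ref($src) eq 'GLOB';
      return 'filehandle'    if blessed($src) && $src->can('getline');
      return 'pipe'          if $src =~ /\|\s*$/;   # trailing "|"
      return 'directory'     if -d $src;
      return 'file';                                # anything else: a path
  }

  my @kinds = map { classify_source($_) } (
      [ '/home/gff/gff1.gz', '/home/gff/gff2.gz' ],
      \*STDIN,
      'lynx -dump -source http://stein.cshl.org/gff_test |',
      '.',
  );

```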

The optional second argument, if true, turns on verbose status
messages that report loading progress.

If successful, the method will return the number of GFF lines
successfully loaded.

NOTE: this method used to be called load().  The old name is still
recognized as an alias.

=cut

sub load_gff {
  my $self              = shift;
  my $file_or_directory = shift || '.';
  my $verbose           = shift;

  local $self->{__verbose__} = $verbose;
  return $self->do_load_gff($file_or_directory) if ref($file_or_directory) 
                                                   && tied *$file_or_directory;

  my $tied_stdin = tied(*STDIN);
  open my $SAVEIN, "<&STDIN" unless $tied_stdin;
  local @ARGV = $self->setup_argv($file_or_directory,'gff','gff3') or return;  # to play tricks with reader
  my $result = $self->do_load_gff('ARGV');
  open STDIN, '<&', $SAVEIN unless $tied_stdin;  # restore STDIN by duplicating the saved handle
  return $result;
}

*load = \&load_gff;

=head2 load_gff_file

 Title   : load_gff_file
 Usage   : $db->load_gff_file($file [,$verbose]);
 Function: load GFF data into database
 Returns : count of records loaded
 Args    : a path to a file
 Status  : Public

This is provided as an alternative to load_gff().  It doesn't munge
STDIN or play tricks with ARGV.
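
The difference between the two styles of reading can be shown without
the module.  This is a rough sketch under stated assumptions: both
counting subroutines and the temporary files are hypothetical stand-ins,
not part of Bio::DB::GFF.

```perl

  use strict;
  use warnings;
  use File::Temp qw(tempfile);

  # Two small temporary files stand in for GFF input.
  my @paths;
  for my $line ("chr1\tfirst\n", "chr1\tsecond\n") {
      my ($fh, $path) = tempfile(UNLINK => 1);
      print {$fh} $line;
      close $fh or die "close: $!";
      push @paths, $path;
  }

  # Reading from the magic ARGV handle walks every file named in @ARGV.
  sub count_lines_via_argv {
      local @ARGV = @_;
      my $count = 0;
      while (defined(my $line = <ARGV>)) { $count++ }
      return $count;
  }

  # Reading from an explicit handle leaves @ARGV and STDIN alone.
  sub count_lines_via_handle {
      my $path = shift;
      open my $fh, '<', $path or die "open $path: $!";
      my $count = 0;
      while (defined(my $line = <$fh>)) { $count++ }
      close $fh;
      return $count;
  }

  my $argv_count   = count_lines_via_argv(@paths);
  my $handle_count = count_lines_via_handle($paths[0]);

```

The explicit-handle version is easier to reason about when the caller
also reads from STDIN or relies on its own @ARGV.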

=cut

sub load_gff_file {
  my $self     = shift;
  my $file     = shift;
  my $verbose  = shift;
  local $self->{__verbose__} = $verbose;  # honor the documented $verbose flag
  my $fh = IO::File->new($file) or return;
  return $self->do_load_gff($fh);
}

=head2 load_fasta

 Title   : load_fasta
 Usage   : $db->load_fasta($file|$directory|$filehandle [,$verbose]);
 Function: load FASTA data into database
 Returns : count of records loaded
 Args    : a directory, a file, a list of files, 
           or a filehandle
 Status  : Public

This method takes a single overloaded argument, which can be any of:

=over 4

=item *

scalar corresponding to a FASTA file on the system

A pathname to a local FASTA file.  Any files ending with the .gz, .Z, or
.bz2 suffixes will be transparently decompressed with the appropriate
command-line utility.

=item *

array reference containing a list of FASTA files on the
system

For example ['/home/fasta/genomic.fa.gz','/home/fasta/genomic.fa.gz']

=item *

path to a directory

The indicated directory will be searched for all files ending in the
suffixes .fa, .fa.gz, .fa.Z or .fa.bz2.

=item *

filehandle

An open filehandle from which to read the FASTA data.

=item *

pipe expression

A pipe expression will also work. For example, a FASTA file on a remote
web server can be loaded with an expression like this:

  $db->load_fasta("lynx -dump -source http://stein.cshl.org/fasta_test.fa |");

=back
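
Transparent decompression and pipe expressions are closely related: a
compressed file name can be rewritten as a pipe that runs a
decompressor.  The sketch below illustrates the idea only.
as_pipe_expression is a hypothetical helper, and the utility names
(gunzip, uncompress, bunzip2) are assumptions about what is installed,
not a statement of what the module invokes.

```perl

  use strict;
  use warnings;

  # Assumed command-line utilities for each suffix.
  my %decompressor = (
      gz  => 'gunzip -c',
      Z   => 'uncompress -c',
      bz2 => 'bunzip2 -c',
  );

  # Hypothetical helper: rewrite a compressed file name as a pipe expression.
  # No shell quoting is done here; real code should quote the path.
  sub as_pipe_expression {
      my $path = shift;
      if ($path =~ /\.(gz|Z|bz2)$/) {
          return "$decompressor{$1} $path |";
      }
      return $path;   # uncompressed files are passed through unchanged
  }

  my @sources = map { as_pipe_expression($_) }
                qw(genomic.fa genomic.fa.gz genomic.fa.bz2);

```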

=cut

sub load_fasta {
  my $self              = shift;
  my $file_or_directory = shift || '.';
  my $verbose           = shift;

  local $self->{__verbose__} = $verbose;
  return $self->load_sequence($file_or_directory) if ref($file_or_directory)
                                                     && tied *$file_or_directory;

  my $tied = tied(*STDIN);
  open my $SAVEIN, "<&STDIN" unless $tied;
  local @ARGV = $self->setup_argv($file_or_directory,'fa','dna','fasta') or return;  # to play tricks with reader
  my $result = $self->load_sequence('ARGV');
  open STDIN, '<&', $SAVEIN unless $tied;  # restore STDIN by duplicating the saved handle
  return $result;
}


=head2 load_fasta_file

 Title   : load_fasta_file
 Usage   : $db->load_fasta_file($file [,$verbose]);
 Function: load FASTA data into database
 Returns : count of records loaded
 Args    : a path to a file


