Alt-CWB-ambs

 view release on metacpan or  search on metacpan

lib/CWB/CQP.pm  view on Meta::CPAN

Command output can then be read with the B<getline>, B<getlines> and B<getrow> methods.
If asynchronous communication is desired, use B<ready> to check whether output is available.

It is an error to B<run> a new command before the output of the previous command has completely
been processed.

=cut

sub run {
  croak 'USAGE:  $cqp->run($cmd]);'
    unless @_ == 2;
  my $self = shift;
  my $cmd = shift;
  my $debug = $self->{'debug'};

  $cmd =~ s/\n+/ /g;            # make sure there are no newline characters (to be on the safe side)
  $cmd =~ s/(;\s*)+$//;         # ";" will be added when $cmd is sent to CQP

  my $active_cmd = $self->{'command'};
  croak "Error: new CQP command issued while '$active_cmd' is still being processed"
    if $active_cmd;

  $self->{'command'} = "$cmd;";
  $self->{'status'} = 'ok';
  $self->{'buffer'} = "";
  $self->{'lines'} = [];
  $self->{'error_message'} = [];

  print "CQP << $cmd;\n"
    if $debug;
  $self->{'in'}->print("$cmd;\n .EOL.;\n"); # append .EOL. command to mark end of CQP output
}

=item I<$num_of_lines> = I<$cqp>->B<ready>;

=item I<$num_of_lines> = I<$cqp>->B<ready>(I<$timeout>);

Check if output from current CQP command is available for reading with B<getline> etc.,
returning the number of lines currently held in the input buffer (possibly including an
end-of-output marker line that will not be returned by B<getline> etc.).  If there is no
active command, returns B<undef>.

The first form of the command returns immediately.  The second form waits up to I<$timeout>
seconds for CQP output to become available.  Use a negative I<$timeout> for blocking mode.

=cut

sub ready {
  my $self = shift;
  my $timeout = shift;

  my $lines = @{$self->{'lines'}};
  return $lines            # output has already been buffered => ready to read
    if $lines > 0;
  return undef             # no command active => undefined state
    unless $self->{'command'};
  return $self->_update($timeout); # try to read from CQP process & return number of lines available (NB: line buffer was empty before)
}

## INTERNAL: reset internal status after command has been completed, check that there is no extra output
sub _eol {
  my $self = shift;
  while ($self->_update > 0) { 1 } # check for any pending output from CQP process
  carp "CWB::CQP:  Unexpected CQP output after end of command:\n",
    map {" | $_\n"} @{$self->{'lines'}},
      "(command was: ".$self->{'command'}.")"
        if @{$self->{'lines'}} > 0;
  $self->{'lines'} = [];
  $self->{'command'} = undef; # no active command now
}

=item I<$line> = I<$cqp>->B<getline>;

Read one line of output from CQP process, blocking if necessary until output beomes available.
Returns B<undef> when all output from the current CQP command has been read.

=cut

sub getline {
  croak 'USAGE:  $line = $cqp->getline;'
    unless @_ == 1;
  my $self = shift;
  croak 'CWB::CQP:  $cqp->getline called without active CQP command'
    unless $self->{'command'};
  my $debug = $self->{'debug'};

  $self->_update(-1)            # fill line buffer if necessary (blocking mode)
    unless @{$self->{'lines'}} > 0;

  my $line = shift @{$self->{'lines'}};
  if ($line eq '-::-EOL-::-') { 
    ## special line printed by ".EOL.;" marks end of CQP output
    print "CQP ", "-" x 60, "\n"
      if $debug;
    $self->_eol;
    return undef;               # undef return value marks end of output
  }
  else {
    print "CQP >> $line\n"
      if $debug;
    return $line;
  }
}

=item I<@lines> = I<$cqp>->B<getlines>(I<$n>);

Read I<$n> lines of output from the CQP process, blocking as long as necessary.  An explicit B<undef> element is included at the end of the output of a CQP command.  Note that B<getlines> may return fewer than I<$n> lines if the end of output is reac...

Set C<I<$n> = 0> to read all complete lines currently held in the input buffer (as indicated by the B<ready> method), or specify a negative value to read the complete output of the active CQP command.

=cut

sub getlines {
  croak 'USAGE:  @lines = $cqp->getlines($n);'
    unless @_ == 2 and $_[1] =~ /^-?[0-9]+$/ and wantarray;
  my $self = shift;
  my $n_lines = shift;
  my @lines = ();
  if ($n_lines == 0) {
    while (my $line = shift @{$self->{'lines'}}) {
      if ($line eq '-::-EOL-::-') {
        $self->_eol;
        push @lines, undef;
      }
      else {
        push @lines, $line;
      }
    }
  }
  else {
    while ($n_lines != 0) {     # if $n_lines < 0, reads complete output of CQP command
      while ($n_lines != 0 and @{$self->{'lines'}} > 0) {
        my $line = shift @{$self->{'lines'}};
        if ($line eq '-::-EOL-::-') {
          $self->_eol;
          push @lines, undef;
          $n_lines = 0;
        }
        else {
          push @lines, $line;
          $n_lines--;
        }
      }
      $self->_update(-1)     # wait for CQP output to become available (in $self's input buffer)
        if $n_lines != 0;
    }
    return (wantarray) ? @lines : shift @lines;
  }
}

=item I<@lines> = I<$cqp>->B<exec>(I<$cmd>);

A convenience function that executes CQP command I<$cmd>, waits for it to complete, and returns all lines of
output from the command.

Fully equivalent to the following two commands, except that the trailing B<undef> returnd by B<getlines> is not included in the output:

  $cqp->run($cmd);
  @lines = $cqp->getlines(-1);

=cut

sub exec {
  croak 'USAGE:  $cqp->exec($cmd);'
    unless @_ == 2;
  my $self = shift;
  my $cmd = shift;
  my $debug = $self->{'debug'};

  $self->run($cmd);
  my @result = $self->getlines(-1);
  my $eol = pop @result;
  if (defined $eol) {
    die "CWB::CQP:  INTERNAL ERROR in _exec() -- missing 'undef' at end of command output (ignored)";
    push @result, $eol; # seems to be regular line, so push it back onto result list
  }
  return @result;
}

=item I<@fields> = I<$cqp>->B<getrow>;

=item I<@rows> = I<$cqp>->B<exec_rows>(I<$cmd>);

Convenience functions for reading TAB-delimited tables, which are generated by CQP commands such as B<count>, B<group>, B<tabulate> and B<show cd>.

B<getrow> returns a single row of output, split into TAB-delimited fields.  If the active CQP command has completed, it returns an empty list.

B<exec_rows> executes the CQP command I<$cmd>, waits for it to complete, and then returns the TAB-delimited table as an array of array references.  You can then use multiple indices to access a specific element of the table, e.g. C<I<@rows>[41][2]> f...

=cut

sub getrow {
  croak 'USAGE:  @fields = $cqp->getrow;'
    unless @_ == 1 and wantarray;
  my $self = shift;
  my $line = $self->getline;
  return ()
    unless defined $line;
  return split /\t/, $line;
}

sub exec_rows {
  croak 'USAGE:  @rows = $cqp->exec_rows($cmd);'
    unless @_ == 2 and wantarray;
  my $self = shift;
  my $cmd = shift;
  my @lines = $self->exec($cmd); ## **TODO** this function could be optimised to collect arrayrefs directly
  return map { [ split /\t/ ] } @lines;
}

=item I<$cqp>->B<begin_query>;

=item I<$cqp>->B<end_query>;

Enter/exit query lock mode for safe execution of CQP queries entered by an untrusted user (e.g. from a Web interface).  In query lock mode, all interactive CQP commands are temporarily disabled; in particular, it is impossible to access files or exec...

=cut

sub begin_query {
  my $self = shift;
  croak 'CWB::CQP:  $cqp->begin_query; has been called while query lock mode is already active'
    if $self->{'query_lock'};
  my $key = 1 + int rand(1_000_000); # make sure this is a TRUE value
  $self->exec("set QueryLock $key");
  $self->{'query_lock'} = $key;
}

sub end_query {
  my $self = shift;
  my $key = $self->{'query_lock'};
  if ($key) {
    $self->exec("unlock $key");
    $self->{'query_lock'} = undef;
  }
  else {



( run in 0.345 second using v1.01-cache-2.11-cpan-8f98c5d2c55 )