Alt-CWB-ambs


lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

reentrant, but multiple parsers for the same grammar can be run in parallel.
The return value I<$grammar> is an object of class B<MyGrammar>.

=cut

sub new {
  my $class = shift;
  my $self = {
              'PARAM_DEFAULTS' => {},  # globally set default values for parameters
              'PARAM' => undef,        # working copies of parameters during parse
              'INPUT' => undef,        # input string (defined while parsing)
              'ERROR' => undef,        # error message generated by last parse (undef = no error)
              'CALLSTACK' => [],       # call stack for backtrace in case of error
              'GROUPS' => undef,       # group structure for shift-reduce parser (undef if not active)
              'GROUPSTACK' => undef,   # stack of nested bracketing groups (undef if not active)
             };
  bless($self, $class);
}

=item I<$result> = I<$grammar>->B<Parse>(I<$string> [, I<$rule>]);

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

B<undef> is returned.

=cut

sub Parse {
  croak 'Usage:  $result = $grammar->Parse($string [, $rule]);'
    unless @_ == 2 or @_ == 3;
  my ($self, $input, $rule) = @_;
  $rule = "default"
    unless defined $rule;
  confess "CWB::CEQL::Parser: Parse() method is not re-entrant\n(tried to parse '$input' while parsing '".$self->{INPUT}."')"
    if defined $self->{INPUT};

  $self->{INPUT} = $input;
  %{$self->{PARAM}} = %{$self->{PARAM_DEFAULTS}}; # shallow copy of hash
  $self->{CALLSTACK} = [];              # re-initialise call stack (should destroy information from last parse)
  $self->{GROUPS} = undef;              # indicate that shift-reduce parser is not active
  $self->{CURRENT_GROUP} = undef;
  $self->{GROUPSTACK} = undef;
  $self->{ERROR} = undef;               # clear previous errors

  my $result = eval { $self->Call($rule, $input) }; # catch exceptions from parse errors
  if (not defined $result) {
    my $error = $@;
    chomp($error);                      # remove trailing newline
    $error = "parse of '' ".$self->{INPUT}." '' returned no result (reason unknown)"
      if $error eq "";
    $error =~ s/\s*\n\s*/ **::** /g;
    $self->{ERROR} = $error;
  }

  $self->{INPUT} = undef;               # no active parse
  $self->{PARAM} = undef;               # restore global parameter values (PARAM_DEFAULTS)
  return $result;                       # undef if parse failed
}

=item I<@lines_of_text> = I<$grammar>->B<ErrorMessage>;

If the last parse failed, returns a detailed error message and backtrace of
the callstack as a list of text lines (without newlines).  Otherwise, returns
an empty list.

=cut

sub ErrorMessage {
  my $self = shift;
  my $error = $self->{ERROR};
  return ()
    unless defined $error;

  my @lines = "**Error:** $error";
  my $previous_frame = { RULE => "", INPUT => "" }; # init to dummy frame to avoid special case below
  foreach my $frame (reverse @{$self->{CALLSTACK}}) {
    my $rule = $frame->{RULE};
    if ($rule eq "APPLY") {
      my @done = @{$frame->{APPLY_DONE}};
      my @remain = @{$frame->{APPLY_ITEMS}};
      push @lines, " - at this location: '' @done ''**<==**'' @remain ''";
    }
    else {
      my $input = $frame->{INPUT};
      my $previous_input = $previous_frame->{INPUT} || "";
      if (($previous_input eq $input) and ($previous_frame->{RULE} ne "APPLY")) {
        $lines[-1] .= ", **$rule**";
      }
      else {
        push @lines, " - when parsing '' $input '' as **$rule**";
      }
    }
    $previous_frame = $frame;
  }
  return @lines;

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

class (which I<$grammar> is an instance of) and should be described in the
grammar's documentation.

=cut

sub SetParam {
  croak 'Usage:  $grammar->SetParam($name, $value)'
    unless @_ == 3;
  my ($self, $name, $value) = @_;
  ## select either global parameter values (user level) or working copy (during parse)
  my $param_set = (defined $self->{INPUT}) ? $self->{PARAM} : $self->{PARAM_DEFAULTS};
  croak "CWB::CEQL::Parser: parameter '$name' does not exist"
    unless exists $param_set->{$name};
  $param_set->{$name} = $value;
}

sub GetParam {
  croak 'Usage:  $grammar->GetParam($name)'
    unless @_ == 2;
  my ($self, $name) = @_;
  my $param_set = (defined $self->{INPUT}) ? $self->{PARAM} : $self->{PARAM_DEFAULTS};
  croak "CWB::CEQL::Parser: parameter '$name' does not exist"
    unless exists $param_set->{$name};
  return $param_set->{$name};
}

=back


=head1 METHODS USED BY GRAMMAR AUTHORS

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

will be created in the working copy of the parameter set and will only be
available during the current parse.

=cut

sub NewParam {
  confess 'Usage:  $self->NewParam($name, $default_value)'
    unless @_ == 3;
  my ($self, $name, $value) = @_;
  ## select either global parameter values (user level) or working copy (during parse)
  my $param_set = (defined $self->{INPUT}) ? $self->{PARAM} : $self->{PARAM_DEFAULTS};
  confess "CWB::CEQL::Parser: parameter '$name' already exists, cannot create with NewParam()"
    if exists $param_set->{$name};
  $param_set->{$name} = $value;
}

=item I<$result> = I<$self>->B<Call>(I<$rule>, I<$input>);

Apply rule I<$rule> to input string I<$input>.  The return value I<$result>
depends on the grammar rule, but is usually a string containing a translated
version of I<$input>.  Grammar rules may also annotate this string with

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

Note that B<Call> never returns B<undef>.  In case of an error, the entire
parse is aborted.
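
The control flow described above can be illustrated without the module.  The
following is a minimal sketch, not the module's implementation: the functions
C<call_rule> and C<parse> are hypothetical stand-ins for B<Call> and B<Parse>,
and the rule body (upper-casing the input) is invented for illustration.  A
failing rule dies instead of returning B<undef>, and the top-level C<eval>
turns that exception into an undefined result plus an error message.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Call(): returns a defined value or dies.
sub call_rule {
  my ($rule, $input) = @_;
  die "rule **$rule** cannot parse '$input'\n"
    if $input eq "";
  return uc $input;                 # always a defined value on success
}

# Hypothetical stand-in for Parse(): catches the exception at top level.
sub parse {
  my ($input) = @_;
  my $result = eval { call_rule("default", $input) };
  my $error;
  if (not defined $result) {
    $error = $@;
    chomp($error);                  # remove trailing newline
  }
  return ($result, $error);
}

my ($ok, $no_error)  = parse("dog");  # successful parse
my ($failed, $error) = parse("");     # aborted parse
```

Because every rule either returns a defined value or dies, nested rule calls
need no error checks of their own; only the outermost caller inspects the
result.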

=cut

sub Call {
  confess 'Usage:  $result = $self->Call($rule, $input);'
    unless @_ == 3;
  my ($self, $rule, $input) = @_;
  confess "Sorry, we're not parsing yet"
    unless defined $self->{INPUT};
  my $method = $self->can("$rule");
  confess "the rule **$rule** does not exist in grammar **".ref($self)."** (internal error)\n"
    unless defined $method;
  my $frame = {RULE => $rule,
               INPUT => $input};
  push @{$self->{CALLSTACK}}, $frame; 
  my $result = $method->($self, $input);
  die "rule **$rule** failed to return a result (internal error)\n"
    unless defined $result;
  my $return_frame = pop @{$self->{CALLSTACK}};
  die "call stack has been corrupted (internal error)"
    unless $return_frame eq $frame;
  return $result;
}

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

this is based on flat backup copies, so complex data structures may have been
altered destructively).

=cut

sub Try {
  confess 'Usage:  $result = $self->Try($rule, $input);'
    unless @_ == 3;
  my ($self, $rule, $input) = @_;
  confess "Sorry, we're not parsing yet"
    unless defined $self->{INPUT};

  ## make flat backup copies of important data structures and ensure they are restored upon return
  ## (this is not completely safe, but should undo most changes that a failed parse may have made)
  my $back_param = { %{$self->{PARAM}} };   # PARAM is a hashref, so the flat copy must be a hash
  my $back_callstack = [ @{$self->{CALLSTACK}} ]; 
  my ($back_groups, $back_current_group, $back_groupstack) = (undef, undef, undef);
  if (defined $self->{GROUPS}) {
    $back_groups = [ @{$self->{GROUPS}} ];
    $back_current_group = [ @{$back_groups->[0]} ]
      if @$back_groups > 0;

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

automatically removed from the list of return values.

=cut

sub Apply {
  confess 'Usage:  @results = $self->Apply($rule, @items);'
    unless @_ >= 2;
  my $self = shift;
  my $rule = shift;
  my $frame = {RULE => "APPLY",
               INPUT => undef,
               APPLY_ITEMS => [ @_ ],
               APPLY_DONE => []};
  push @{$self->{CALLSTACK}}, $frame;

  ## data structures for nested groups and result values must be restored on exit (in case of nested Apply())
  local $self->{GROUPS} = [ [] ];   # set up data structure to collect result values of nested groups
  local $self->{GROUPSTACK} = [];   # stack of nested groups (keeps track of nesting depth and ensures proper nesting)

  ## process each input item in turn
  while (@{$frame->{APPLY_ITEMS}}) {

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

parse, the B<NewParam>, B<SetParam> and B<GetParam> methods operate on this
working copy.

The C<PARAM> variable is re-initialised before each parse with a flat copy of
the C<PARAM_DEFAULTS> hashref.  Therefore, care has to be taken when modifying
complex parameter values within grammar rules, as the changes will affect the
global values in C<PARAM_DEFAULTS>.  If complex values need to be changed
internally, the grammar rule should always update the parameter with
B<SetParam> and a deep copy of the previous parameter value.
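
The pitfall can be reproduced with plain hashes, independently of the parser.
This is a minimal sketch under stated assumptions: the parameter names
C<limit> and C<tags> and all variables are hypothetical, and C<dclone> from
the core B<Storable> module is only one of several ways to obtain a deep copy.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical defaults; 'tags' holds a complex (arrayref) value.
my %defaults = (limit => 10, tags => ['NN', 'VB']);

# Flat copy, as made before each parse: nested references are shared.
my %param = %defaults;
$param{limit} = 20;            # scalar value: only the working copy changes
push @{$param{tags}}, 'JJ';    # arrayref is shared: the defaults change too

# Safer variant: replace the value with a modified deep copy.
my %safe_defaults = (tags => ['NN', 'VB']);
my %safe_param = %safe_defaults;
my $copy = dclone($safe_param{tags});
push @$copy, 'JJ';
$safe_param{tags} = $copy;     # in a grammar rule this would go through SetParam
```

After this code has run, C<%defaults> has silently gained a third tag, whereas
C<%safe_defaults> still holds the original two.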

=item INPUT

The current input string passed to the B<Parse> method.  This variable is
mostly used to indicate whether the parser is currently active or not (e.g. in
order to avoid nested B<Parse> calls).

=item ERROR

Error message generated by the last parse, or B<undef> if the parse was
successful.  This error message is returned by B<ErrorMessage> and
B<HtmlErrorMessage> together with a backtrace of the parser's call stack.

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

fields:

=over 4

=item RULE 

Name of the grammar rule (i.e. Perl B<method>) invoked.  When the shift-reduce
parser is called with B<Apply>, a special rule named C<APPLY> is pushed on the
stack.

=item INPUT

Input string for the grammar rule (which should be a constituent of the
respective type).

=item APPLY_ITEMS (optional, "APPLY" rule only)

List (arrayref) of items passed to B<Apply> for processing by the shift-reduce
parser.  This field is only present in the call stack entry for the special
C<APPLY> rule.  Items are shifted from this list to C<APPLY_DONE> as they are
processed by the shift-reduce parser.
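
How the two lists evolve, and how they yield the location line of an error
backtrace, can be sketched with a hand-built frame.  The items and the point
of failure below are hypothetical, and the sketch assumes that an item is
moved to C<APPLY_DONE> before it is processed; the excerpt shown here does
not confirm the exact order.

```perl
use strict;
use warnings;

# Hypothetical call stack entry, shaped like the one pushed by Apply().
my $frame = {
  RULE        => "APPLY",
  INPUT       => undef,
  APPLY_ITEMS => [ "a", "b", "c", "d" ],
  APPLY_DONE  => [],
};

# Move items one by one; pretend that the third item triggers an error.
my $location;
while (@{$frame->{APPLY_ITEMS}}) {
  my $item = shift @{$frame->{APPLY_ITEMS}};
  push @{$frame->{APPLY_DONE}}, $item;
  if (@{$frame->{APPLY_DONE}} == 3) {
    my @done   = @{$frame->{APPLY_DONE}};
    my @remain = @{$frame->{APPLY_ITEMS}};
    # same format as the APPLY branch of ErrorMessage
    $location = " - at this location: '' @done ''**<==**'' @remain ''";
    last;
  }
}
print "$location\n";
```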


