Alt-CWB-ambs


data/vrt/VeryShortStories.vrt  view on Meta::CPAN

that	WDT	that
allowed	VBD	allow
him	PP	him
to	TO	to
choose	VB	choose
from	IN	from
more	JJR	more
than	IN	than
fifty	CD	fifty
different	JJ	different
fonts	NNS	font
for	IN	for
the	DT	the
time	NN	time
display	NN	display
and	CC	and
other	JJ	other
messages	NNS	message
.	SENT	.
</s>
<s>
Ed	NP	Ed
switched	VBD	switch
to	TO	to
the	DT	the
``	``	``
Standard	NP	Standard
''	''	''
font	NN	font
,	,	,
the	DT	the
only	JJ	only
one	NN	one
that	WDT	that
was	VBD	be
even	RB	even
remotely	RB	remotely
legible	JJ	legible
.	SENT	.

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

the automatic Perl stack trace, which provides no useful information for
grammar users and is likely to be confusing.  B<CWB::CEQL::Parser> will add
its own stack trace of subrule invocations so that users can pinpoint the
precise location of the syntax error.  In order to make this stack trace
readable and informative, DPP rules should always be given descriptive names: use
C<wildcard_expression> or C<part_of_speech> rather than C<rule1723a>.

The B<HtmlErrorMessage> method will automatically convert HTML metacharacters
and non-ASCII characters to entities, so it is safe to include the returned
HTML code directly in a Web page.  Error messages may use basic wiki-style
formatting: C<''...''> for typewriter font, C<//...//> for italics and
C<**...**> for bold font.  Note that such markup is non-recursive and nested
formatting will be ignored.  User input should always be enclosed in
C<''...''> in error messages so that C<//> and C<**> sequences in the input
are not mistaken for formatting instructions.
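
As a small illustration of this advice, an error message could be assembled as sketched below.  The helper C<quote_input> is hypothetical and not part of B<CWB::CEQL::Parser>; it only shows where the C<''...''> wrapper goes.  The sketch does not address input that itself contains C<''>.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CWB::CEQL::Parser: wraps user input in
# ''...'' so that // and ** inside it are shown literally.
sub quote_input {
  my ($input) = @_;
  return "''" . $input . "''";
}

my $query = "**ing //N";   # user input that happens to contain marker sequences
my $message = "**Syntax error** in query " . quote_input($query);
print "$message\n";
```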

=head2 Calling subrules

Most DPP rules divide the input string into one or more subconstituents,
similar to the rules of a standard context-free grammar.  The main difference
is that a DPP rule has to settle on the specific positions and categories
of the subconstituents, rather than just listing possible category sequences.
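
The idea of committing to one split position can be sketched in plain Perl.  This is a standalone illustration under stated assumptions, not the module's API: the rule names, the underscore convention and the bracketed output format are all invented for the example, and the subrules are called as ordinary functions rather than through the grammar object.

```perl
use strict;
use warnings;

# Hypothetical subrules with descriptive names; each handles one category.
sub wordform_pattern {
  my ($input) = @_;
  die "empty word form\n" if $input eq "";
  return 'word="' . $input . '"';
}

sub part_of_speech {
  my ($input) = @_;
  die "invalid part-of-speech tag ''" . $input . "''\n"
    unless $input =~ /^[A-Z]+$/;
  return 'pos="' . $input . '"';
}

# The rule commits to a single split position (the last underscore) and to
# the category of each part, instead of trying alternative analyses.
sub token_expression {
  my ($input) = @_;
  if ($input =~ /^(.+)_([^_]+)$/) {
    my ($word, $tag) = ($1, $2);
    return "[" . wordform_pattern($word) . " & " . part_of_speech($tag) . "]";
  }
  return "[" . wordform_pattern($input) . "]";
}

print token_expression("fonts_NNS"), "\n";
```

Because the split is decided once, a failure in C<part_of_speech> can be reported against exactly one substring of the input.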

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

}

=item I<$html_code> = I<$grammar>->B<HtmlErrorMessage>;

If the last parse failed, returns an HTML-formatted error message and a
backtrace of the call stack.  The string I<$html_code> is valid HTML and can
be included directly in a generated Web page.  In particular, unsafe and
non-ASCII characters have been encoded as HTML entities.  Simple, non-recursive
wiki-style markup in an error message is interpreted in the following way:

  **<text>**    <text> is shown in bold font (<b> ... </b>)
  //<text>//    <text> is displayed in italics (<i> ... </i>)
  ''<text>''    <text> is shown in typewriter font (<code> ... </code>)

Lines starting with C< - > (note the two blanks) are converted into list items.

=cut

sub HtmlErrorMessage {
  my $self = shift;
  my @text_lines = $self->ErrorMessage();
  if (@text_lines > 0) {
    return $self->formatHtmlText(@text_lines);

lib/CWB/CEQL/Parser.pm  view on Meta::CPAN

=item *

C<< **<text>** >> - <text> is displayed in bold face (C<< <b> ... </b> >>)

=item *

C<< //<text>// >> - <text> is displayed in italics (C<< <i> ... </i> >>)

=item *

C<< ''<text>'' >> - <text> is shown in typewriter font (C<< <code> ... </code> >>)

=item *

lines starting with C< - > (note the two blanks before and after the
hyphen) are converted into list items

=item *

all other lines are formatted as separate paragraphs (C<< <p> ... </p> >>)

=back

The wiki markup is non-recursive, i.e. no substitutions will be applied to
the text wrapped in C<''...''> etc.  This behaviour is intentional, so that
e.g. C<**> in a query expression will not be mistaken for a bold face marker
(as long as the query is displayed in typewriter font, i.e. as C<< ''<query>'' >>).
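
The rules above can be sketched as a standalone converter.  This is not the module's implementation: the function names C<escape_html>, C<format_inline> and C<format_html_text> are made up for the example, and only the list-item test C<^ -\s+> is taken from the source.  Non-recursion is obtained here by handling all three markers in a single substitution pass, so text inside one marker pair is never rescanned.

```perl
use strict;
use warnings;

# Escape HTML metacharacters and encode non-ASCII characters as entities.
sub escape_html {
  my ($text) = @_;
  $text =~ s/&/&amp;/g;
  $text =~ s/</&lt;/g;
  $text =~ s/>/&gt;/g;
  $text =~ s/"/&quot;/g;
  $text =~ s/([^\x00-\x7F])/sprintf("&#%d;", ord($1))/ge;
  return $text;
}

# Convert **...**, //...// and ''...'' in one pass (non-recursive).
sub format_inline {
  my ($text) = @_;
  my %tag = ("**" => "b", "//" => "i", "''" => "code");
  $text = escape_html($text);
  $text =~ s{(\*\*|//|'')(.*?)\1}{"<$tag{$1}>$2</$tag{$1}>"}ge;
  return $text;
}

# Lines starting with " - " become list items, all others paragraphs.
sub format_html_text {
  my @html = ();
  my $in_list = 0;
  foreach my $line (@_) {
    my $text = $line;
    my $is_item = ($text =~ s{^ -\s+}{}) ? 1 : 0;
    push @html, "<ul>"  if $is_item and not $in_list;
    push @html, "</ul>" if $in_list and not $is_item;
    $in_list = $is_item;
    my $body = format_inline($text);
    push @html, $is_item ? "<li>$body</li>" : "<p>$body</p>";
  }
  push @html, "</ul>" if $in_list;
  return join("\n", @html);
}

print format_html_text("**Error** in ''a ** b // c''", " - first", " - second"), "\n";
```

In the sample call, the C<**> and C<//> inside the typewriter span are left untouched because the substitution never revisits text it has already replaced.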

=cut

sub formatHtmlText {
  my $self = shift;
  my @html_lines = ();
  my $in_list = 0;
  while (@_) {
    my $line = shift;
    my $list_item = ($line =~ s{^ -\s+}{}) ? 1 : 0;


