XML-Checker
lib/XML/Checker.pm view on Meta::CPAN
{
my ($self, $tag) = @_;
#?? if first tag, check with root element - or does expat check this already?
my $context = $self->{Context};
$context->[0]->Start ($self, $tag);
my $erule = $self->{ERule}->{$tag};
if (defined $erule)
{
unshift @$context, $erule->context;
}
else
{
# It's not a real error according to the XML Spec.
$self->fail (101, "undefined ELEMENT [$tag]");
unshift @$context, XML::Checker::Context::ANY->new;
}
#?? what about ARule ??
my $arule = $self->{ARule}->{$tag};
if (defined $arule)
{
$self->{CurrARule} = $arule;
$arule->StartAttr;
}
}
# PerlSAX API
sub end_element
{
shift->End;
}
sub End
{
my ($self) = @_;
my $context = $self->{Context};
$context->[0]->End ($self);
shift @$context;
}
# PerlSAX API
sub characters
{
my ($self, $hash) = @_;
my $data = $hash->{Data};
if ($self->{InCDATA})
{
$self->CData ($data);
}
else
{
$self->Char ($data);
}
}
# PerlSAX API
sub start_cdata
{
$_[0]->{InCDATA} = 1;
}
# PerlSAX API
sub end_cdata
{
$_[0]->{InCDATA} = 0;
}
sub Char
{
my ($self, $text) = @_;
my $context = $self->{Context};
# NOTE: calls to isWS may set this to 1.
$INSIGNIF_WS = 0;
$context->[0]->Char ($self, $text);
}
# Treat CDATASection same as Char (Text)
sub CData
{
my ($self, $cdata) = @_;
my $context = $self->{Context};
$context->[0]->Char ($self, $cdata);
# CDATASection can never be insignificant whitespace
$INSIGNIF_WS = 0;
#?? I'm not sure if this assumption is correct
}
# PerlSAX API
sub comment
{
my ($self, $hash) = @_;
$self->Comment ($hash->{Data});
}
sub Comment
{
# ?? what can be checked here?
}
# PerlSAX API
sub entity_reference
{
my ($self, $hash) = @_;
$self->EntityRef ($hash->{Name}, 0);
#?? parameter entities (like %par;) are NOT supported!
# PerlSAX::handle_default should be fixed!
}
sub EntityRef
{
my ($self, $ref, $isParam) = @_;
if ($isParam)
{
# expand to "%name;"
print STDERR "XML::Checker::Entity - parameter Entity (%ent;) not implemented\n";
}
else
{
# Treat same as Char - for now
my $context = $self->{Context};
$context->[0]->Char ($self, "&$ref;");
$INSIGNIF_WS = 0;
#?? I could count the number of times each Entity is referenced
}
}
# PerlSAX API
sub unparsed_entity_decl
{
my ($self, $hash) = @_;
$self->Unparsed ($hash->{Name});
#?? what about Base, SystemId, PublicId ?
}
sub Unparsed
{
my ($self, $entity) = @_;
# print "ARule::Unparsed $entity\n";
if ($self->{Unparsed}->{$entity})
{
lib/XML/Checker.pm view on Meta::CPAN
releases. The Start handler works a little differently (see below) and I
added Attr, InitDomElem, FinalDomElem, CDATA and EntityRef handlers.
See L<XML::Parser> for a description of the handlers that are not listed below.
Note that this interface may disappear when the PerlSAX interface stabilizes.
=over 4
=item Start ($tag)
$checker->Start($tag);
Call this when an Element with the specified $tag name is encountered.
Unlike the Start handler in L<XML::Parser>, no attributes are passed in
(use the Attr handler for those).
=item Attr ($tag, $attrName, $attrValue, $isSpecified)
$checker->Attr($tag,$attrName,$attrValue,$spec);
Checks an attribute with the specified $attrName and $attrValue against the
ATTLIST definition of the element with the specified $tag name.
$isSpecified indicates whether the attribute was specified (1) or defaulted (0).
=item EndAttr ()
$checker->EndAttr;
This should be called after all attributes are passed with Attr().
It will check which of the #REQUIRED attributes were not specified and generate
the appropriate error (159) for each one that is missing.
=item CDATA ($text)
$checker->CDATA($text);
This should be called whenever CDATASections are encountered.
Similar to the Char handler (but it might perform different checks in a later release).
=item EntityRef ($entity, $isParameterEntity)
$checker->EntityRef($entity,$isParameterEntity);
Checks the ENTITY reference. Set $isParameterEntity to 1 for
entity references that start with '%'.
=item InitDomElem () and FinalDomElem ()
Used by XML::DOM::Element::check() to initialize (and cleanup) the
context stack when checking a single element.
=back
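The handlers above are meant to be called in document order. The following is a minimal sketch of that order for the fragment C<< <a href="x">text</a> >>. To keep it runnable without a DTD, it uses a hypothetical stand-in class, C<Trace::Checker>, which is not part of XML::Checker and only records each call; with a real checker object the same calls would presumably be made on C<$checker> in the same sequence.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an XML::Checker object (assumption for
# illustration): it records each handler call instead of validating.
package Trace::Checker;

sub new { return bless { Log => [] }, shift }

# Generate one recording method per handler name used below.
for my $name (qw(Start Attr EndAttr Char End)) {
    no strict 'refs';
    *{"Trace::Checker::$name"} = sub {
        my ($self, @args) = @_;
        push @{ $self->{Log} }, "$name(" . join(', ', @args) . ")";
    };
}

package main;

my $checker = Trace::Checker->new;

# Events for: <a href="x">text</a>
$checker->Start('a');                 # element name only, no attributes
$checker->Attr('a', 'href', 'x', 1);  # 1 = specified in the document
$checker->EndAttr;                    # all attributes have been passed
$checker->Char('text');               # character data inside <a>
$checker->End;                        # closes the current element

print "$_\n" for @{ $checker->{Log} };
```

Keeping attributes out of Start means a caller can feed attributes one at a time and let EndAttr report the #REQUIRED ones that never arrived.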
=head2 PerlSAX interface
XML::Checker now also supports the PerlSAX interface, so you can use XML::Checker
wherever you use PerlSAX handlers.
XML::Checker implements the following methods: start_document, end_document,
start_element, end_element, characters, processing_instruction, comment,
start_cdata, end_cdata, entity_reference, notation_decl, unparsed_entity_decl,
entity_decl, element_decl, attlist_decl, doctype_decl, xml_decl
Not implemented: set_document_locator, ignorable_whitespace
See PerlSAX.pod for details. (It is called lib/PerlSAX.pod in the libxml-perl
distribution, which can be found on CPAN.)
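PerlSAX handlers receive their arguments as a hash reference, for example C<< { Data => ... } >> for character data. The sketch below mirrors the routing visible in the module source, where a flag set by start_cdata and cleared by end_cdata decides whether characters are passed to CData or to Char. The class name C<Sketch::Handler> and its C<Seen> list are hypothetical and exist only for illustration; this is not the module's own implementation.

```perl
use strict;
use warnings;

# Hypothetical handler class (assumption for illustration only).
package Sketch::Handler;

sub new { return bless { InCDATA => 0, Seen => [] }, shift }

# PerlSAX-style events that toggle the CDATA flag.
sub start_cdata { $_[0]->{InCDATA} = 1 }
sub end_cdata   { $_[0]->{InCDATA} = 0 }

# PerlSAX-style event: the text arrives in the Data key of a hash ref.
sub characters {
    my ($self, $event) = @_;
    my $data = $event->{Data};
    # Route by the flag maintained by start_cdata/end_cdata.
    if ($self->{InCDATA}) { $self->CData($data) }
    else                  { $self->Char($data) }
}

# The targets simply record what they were given.
sub Char  { push @{ $_[0]->{Seen} }, "Char: $_[1]" }
sub CData { push @{ $_[0]->{Seen} }, "CData: $_[1]" }

package main;

my $handler = Sketch::Handler->new;
$handler->characters({ Data => 'plain text' });
$handler->start_cdata;
$handler->characters({ Data => '<raw>' });
$handler->end_cdata;
$handler->characters({ Data => 'more text' });

print "$_\n" for @{ $handler->{Seen} };
```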
=head1 CAVEATS
This is an alpha release. It is not actively maintained; patches are accepted and
incorporated in new releases, but that's about it. If you are interested in taking
over maintenance of the module, email tjmather@tjmather.com.
For a much faster and correct DTD validator, see L<XML::LibXML>. It
uses the libxml2 library to validate documents against a DTD.
=head1 AUTHOR
Enno Derksen is the original author.
Send patches to T.J. Mather at
<F<tjmather@tjmather.com>>.
=head1 SEE ALSO
L<XML::LibXML> provides parsers that validate against a DTD
and is recommended over XML::Checker, since it uses the libxml2 library, which is
fast and well tested.
The XML spec (Extensible Markup Language 1.0) at L<http://www.w3.org/TR/REC-xml>
The L<XML::Parser> and L<XML::Parser::Expat> manual pages.
The other packages that come with XML::Checker:
L<XML::Checker::Parser>, L<XML::DOM::ValParser>
The DOM Level 1 specification at L<http://www.w3.org/TR/REC-DOM-Level-1>
The PerlSAX specification. It is currently in lib/PerlSAX.pod in the
libxml-perl distribution by Ken MacLeod.
The original SAX specification (Simple API for XML) can be found at
L<http://www.megginson.com/SAX> and L<http://www.megginson.com/SAX/SAX2>