Treex-PML

lib/Treex/PML/Instance.pm


XML namespace URI for PML instances

=item PML_SCHEMA_NS

XML namespace URI for PML schemas

=item SUPPORTED_PML_VERSIONS

space-separated list of supported PML-schema version numbers

=back

=item :diagnostics

Imports the internal diagnostic commands C<_die>, C<_warn>, and C<_debug>.

=back


=head1 CONFIGURATION

The C<config> option of the load() and save() methods accepts a
parsed configuration file. The configuration file is a PML instance whose
PML schema is defined in the file C<pmlbackend_conf_schema.xml>,
distributed with L<Treex::PML> in
C<Treex/PML/Backend/pmlbackend_conf_schema.xml>.

This file can set defaults for some options of load() and save(). It
can also define rules for pre-processing input documents before they
are parsed as PML and for post-processing output documents after they
are serialized as PML. Currently, pre-processing can use XSLT 1.0,
Perl, or an external command; post-processing supports XSLT 1.0 only.

The C<PMLTransform> backend, when initialized (e.g. by calling
C<AddBackend('PMLTransform')>), automatically loads the
first configuration file named C<pmlbackend_conf.xml> that it finds in
C<Treex::PML>'s resource paths. Additionally, it searches the resource
paths for all configuration files named C<pmlbackend_conf.inc>
and merges their transformation rules into the in-memory image of the main
configuration file. C<PMLTransform> then uses the resulting configuration
for all load/save operations.
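
The lookup described above (first match for the main file, every match
for the include files) can be sketched in plain Perl. This is only an
illustration of the search order, not the backend's actual code; the
helper names C<first_in_paths> and C<all_in_paths> are hypothetical, and
the caller is assumed to supply the list of resource directories.

```perl
use strict;
use warnings;
use File::Spec;

# Return the first existing file called $name in @paths, or nothing.
# (Hypothetical helper; mirrors the "first file it finds" rule.)
sub first_in_paths {
    my ($name, @paths) = @_;
    for my $dir (@paths) {
        my $file = File::Spec->catfile($dir, $name);
        return $file if -f $file;
    }
    return;
}

# Return every existing file called $name in @paths, in path order.
# (Hypothetical helper; mirrors the "all ... .inc files" rule.)
sub all_in_paths {
    my ($name, @paths) = @_;
    return grep { -f $_ } map { File::Spec->catfile($_, $name) } @paths;
}
```

With two resource directories that both contain C<pmlbackend_conf.inc>,
C<all_in_paths> returns both files, while C<first_in_paths> stops at the
earliest directory that holds C<pmlbackend_conf.xml>.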

IMPORTANT NOTE: add the C<PMLTransform> backend as the last I/O
backend, because its test() method automatically accepts any XML file
(with the prospect of attempting to transform it during the read()
phase). It B<must> therefore come after all other backends that work
with XML-based formats in the I/O backends list.
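
Why the order matters can be shown with a small stand-alone sketch. It
assumes, as the note above implies, that the backend list is consulted
in order and that the first backend whose test() succeeds handles the
file. The backend records and the C<pick_backend> helper are
hypothetical stand-ins, not part of C<Treex::PML>.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for I/O backends. Each has a name and a test()
# callback that reports whether it accepts the given document text.
my %specific = (
    name => 'SomeXMLFormat',                      # made-up backend
    test => sub { $_[0] =~ /<someformat\b/ },
);
my %catch_all = (
    name => 'PMLTransform',
    test => sub { $_[0] =~ /^\s*</ },             # accepts anything XML-like
);

# Return the name of the first backend whose test() accepts the input,
# or undef if none does.
sub pick_backend {
    my ($text, @backends) = @_;
    for my $backend (@backends) {
        return $backend->{name} if $backend->{test}->($text);
    }
    return undef;
}

my $doc = '<someformat version="1"/>';
print pick_backend($doc, \%specific, \%catch_all), "\n";   # SomeXMLFormat
print pick_backend($doc, \%catch_all, \%specific), "\n";   # PMLTransform
```

In the second call the catch-all backend shadows the specific one, which
is the situation the note warns against.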

Here is an example of a configuration file (see the schema for more
details).

    <?xml version="1.0" encoding="utf-8"?>
    <pmlbackend xmlns="http://ufal.mff.cuni.cz/pdt/pml/">
      <head>
        <schema href="pmlbackend_conf_schema.xml"/>
      </head>
      <options>
        <load>
          <validate_cdata>1</validate_cdata>
          <use_resources>1</use_resources>
        </load>
        <save>
          <indent>4</indent>
          <validate_cdata>1</validate_cdata>
          <write_single_LM>1</write_single_LM>
        </save>
      </options>
      <transform_map>
        <transform id="alpino" test="alpino_ds[@version='1.1' or @version='1.2']">
          <in type="xslt" href="alpino2pml.xsl"/>
          <out type="xslt" href="pml2alpino.xsl"/>
        </transform>
        <transform id="sdata" root="sdata" ns="http://ufal.mff.cuni.cz/pdt/pml/">
          <in type="perl" command="require SDataMerge; return SDataMerge::transform(@_);"/>
        </transform>
        <transform id="tei" test="*[namespace-uri()='http://www.tei-c.org/ns/1.0']">
          <in type="pipe" command="tei2pml.sh">
            <param name="--stdin" />
            <param name="--stdout" />
          </in>
        </transform>
      </transform_map>
    </pmlbackend>

=head1 METHODS

=over 3

=item Treex::PML::Instance->new ()

NOTE: Don't call this constructor directly, use
Treex::PML::Factory->createPMLInstance() instead!

Create a new empty PML instance object.

=item Treex::PML::Instance->load (\%opts)

=item $pml->load (\%opts)

NOTE: Don't call this method as a constructor directly, use
Treex::PML::Factory->createPMLInstance() instead!

Read a PML instance from a file, filehandle, string, or DOM.  This
method may be used either on an existing object (in which case it
operates on and returns this object) or as a constructor (in which
case it creates a new C<Treex::PML::Instance> object and returns it).
Possible options are:

  {
    filename => $filename,   # and/or
    fh => \*FH,              # or
    string => $xml_string,   # or
    dom => $document,        # (XML::LibXML::Document)

    config => $cfg_pml,      # (Treex::PML::Instance)

    parser_options => \%opt, # (XML::LibXML parser options)
    no_trees => $bool,
    no_references => $bool,
    no_knit => $bool,
    selected_references => { name => $bool, ... },
    selected_knits => { name => $bool, ... }
  }
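
A minimal usage sketch follows, assuming C<Treex::PML> is installed and
that a file named F<sample.pml> exists (the file name, the
C<build_load_opts> helper, and the choice of flags are illustrative
assumptions). It only uses option keys listed above and follows the
advice to obtain the object from the factory.

```perl
use strict;
use warnings;

# Assemble the option hash for load(). Every key used here is one of
# the options listed above; the helper itself is hypothetical.
sub build_load_opts {
    my ($filename, $config) = @_;
    my %opts = (
        filename => $filename,
        no_knit  => 1,                  # illustrative choice of flag
    );
    # Pass a parsed configuration only when the caller supplies one.
    $opts{config} = $config if defined $config;
    return \%opts;
}

my $opts = build_load_opts('sample.pml');   # hypothetical file name

# Attempt the load only if the file exists and Treex::PML is available.
if (-e $opts->{filename} && eval { require Treex::PML; 1 }) {
    my $pml = Treex::PML::Factory->createPMLInstance();
    $pml->load($opts);
}
```

Keeping the options in a separate hash makes it easy to reuse the same
settings for several files and to add a C<config> instance only when one
has been parsed.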


