XML-XSH2
WARNING: XSH2 redirection syntax is not yet finished. It is currently the same as in XSH1, but this may change in future releases.
Output redirection can be used to pipe output of some XSH B<command> to some external program, or to capture it to a variable. Redirection of output of more than one XSH command can be achieved using the B<do> command.
=head2 Redirect output to an external program
The syntax for redirecting the output of an XSH command to an external program is B<xsh-command E<verbar> shell-command ;>, where B<xsh-command> is any XSH2 command and B<shell-command> is any command (or code) recognized by the default shell interpr...
Example: Use well-known UNIX commands to filter XPath-based XML listing from a document and count the results
xsh> ls //something/* | grep foo | wc
=head2 Capture output to a variable
The syntax for capturing the output of an XSH command to a variable is B<xsh-command E<verbar>E<gt> $variable>, where B<xsh-command> is any XSH B<command> and B<$variable> is any valid name for a B<variable>.
Example: Store the number of all words in a variable named count.
xsh> count //words |> $count
=head1 GLOBAL SETTINGS
The commands listed below can be used to modify the default behavior of the XML parser or XSH2 itself. Some of the commands switch between two different modes according to a given expression (which is expected to result either in zero or non-zero val...
The B<encoding> and B<query-encoding> settings allow you to specify the character encodings of the user's input and XSH2's own output. This is particularly useful when you work with UTF-8 encoded documents on a console which only supports 8-bit characters.
The B<settings> command displays current settings by means of XSH2 commands. Thus it can be used not only to review current values, but also to store them for future use, e.g. in the ~E<sol>.xsh2rc file.
Example:
xsh> settings | cat > ~/.xsh2rc
=head2 RELATED COMMANDS
backups, debug, empty-tags, encoding, indent, keep-blanks, load-ext-dtd, nobackups, nodebug, parser-completes-attributes, parser-expands-entities, parser-expands-xinclude, pedantic-parser, query-encoding, quiet, recovering, register-function, registe...
=head1 INTERACTING WITH PERL AND SHELL
Along with XPath, Perl is one of two XSH2 expression languages, and lends XSH2 its great expressive power. Perl is a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on t...
=head2 Calling Perl
Perl B<expressions or blocks of code> can be used as arguments to any XSH2 command. One of them is the B<perl> command, which simply evaluates the given Perl block. Other commands, such as B<map>, even require a Perl expression argument and allow qui...
To prevent conflict between XSH2 internals and the evaluated Perl code, XSH2 runs such code in the context of a special namespace B<XML::XSH2::Map>. As described in the section B<Variables>, XSH2 string variables may be accessed and possibly assigned...
The interaction between XSH2 and Perl actually works the other way round as well, so that you may call back XSH2 from the evaluated Perl code. For this, Perl function B<xsh> is defined in the B<XML::XSH2::Map> namespace. All parameters passed to this...
Moreover, the following Perl helper functions are defined:
B<xsh(string,....)> - evaluates given string(s) as XSH2 commands.
B<call(name)> - call a given XSH2 subroutine.
B<count(string)> - evaluates given string as an XPath expression and returns either literal value of the result (in case of boolean, string and float result type) or number of nodes in a returned node-set.
B<literal(stringE<verbar>object)> - if passed a string, evaluates it as an XSH2 expression and returns the literal value of the result; if passed an object, returns the literal value of the object. For example, B<literal('$docE<sol>expression')> returns t...
B<serialize(stringE<verbar>object)> - if passed a string, it first evaluates the string as an XSH2 expression to obtain a node-list object. Then it serializes the object into XML. The resulting string is equal to the output of the XSH2 command B<ls> a...
B<type(stringE<verbar>object)> - if passed a string, it first evaluates the string as an XSH2 expression to obtain a node-list object. It returns a list of strings representing the types of nodes in the node-list (ordered in the canonical document order...
B<nodelist(stringE<verbar>object,...)> - converts its arguments to objects if necessary and returns a node-list consisting of the objects.
B<xpath(string, node?)> - evaluates a given string as an XPath expression in the context of a given node and returns the result.
B<echo(string,...)> - prints the given strings on XSH2 output. Note that in interactive mode, XSH2 redirects all output to a specific terminal file handle stored in the variable B<$OUT>. So, if you, for example, mean to pipe the result to a shell comm...
In the following examples we use Perl to populate Middle-Earth with Hobbits whose names are read from a text file called B<hobbits.txt>, unless there are some Hobbits in Middle-Earth already.
Example: Use Perl to read text files
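  unless (//creature[@race='hobbit']) {
    perl {
      # read one Hobbit name per line; chomp strips the trailing newlines
      open my $fh, "<", "hobbits.txt" or die "Cannot open hobbits.txt: $!";
      chomp(@hobbits = <$fh>);
      close $fh;
    }
    foreach { @hobbits } {
      copy xsh:new-element("creature","name",.,"race","hobbit")
        into m:/middle-earth/creatures;
    }
  }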
Example: The same code as a single Perl block
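  perl {
    # count() expects the XPath expression as a string
    unless (count(q{//creature[@race='hobbit']})) {
      open my $file, "<", "hobbits.txt" or die "Cannot open hobbits.txt: $!";
      foreach (<$file>) {
        chomp;  # drop the newline so it does not end up in the name attribute
        xsh(qq{insert element "<creature name='$_' race='hobbit'>"
          into m:/middle-earth/creatures});
      }
      close $file;
    }
  };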
=head2 Writing your own XPath extension functions in Perl
XSH2 allows users to extend the set of XPath functions by providing extension functions written in Perl. This can be achieved using the B<register-function> command. The perl code implementing an extension function works as a usual perl routine accep...
The arguments passed to the perl implementation by the XPath engine are simple scalars for string, boolean and float argument types and B<XML::LibXML::NodeList> objects for node-set argument types. The implementation is responsible for checking the a...
Extension functions SHOULD NOT MODIFY the document DOM tree. Doing so could not only confuse the XPath engine but possibly even result in a critical error (such as a segmentation fault). Calling XSH2 commands from extension function implementations is...
The extension function implementation must return a single value, which can be of one of the following types: simple scalar (a number or string), B<XML::LibXML::Boolean> object reference (result is a boolean value), B<XML::LibXML::Literal> object ref...
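The argument conventions described above can be illustrated with a minimal sketch in plain Perl. The names C<arg_kind> and C<my_upper> are hypothetical and not part of XSH2. The sketch assumes only what the text states, namely that node-set arguments arrive as B<XML::LibXML::NodeList> objects and other arguments as simple scalars. How such a routine is registered with B<register-function> is not shown here.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: classify one argument as described above
# (node-sets arrive as XML::LibXML::NodeList objects, string, boolean
# and float arguments arrive as simple scalars).
sub arg_kind {
    my ($arg) = @_;
    return 'node-set' if blessed($arg) && $arg->isa('XML::LibXML::NodeList');
    return 'scalar'   if !ref($arg);
    return 'other';
}

# Hypothetical extension-function body: it checks its own arguments,
# does not touch any DOM tree, and returns a single simple scalar.
sub my_upper {
    my @args = @_;
    die "my_upper expects exactly one string argument\n"
        unless @args == 1 && arg_kind($args[0]) eq 'scalar';
    return uc $args[0];
}

print my_upper("bilbo"), "\n";   # prints BILBO
```

Returning a plain scalar keeps the sketch within the first of the return types listed above; the object-valued return types would need the corresponding B<XML::LibXML> classes.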
=head2 Calling the System Shell
In the interactive mode, XSH2 interprets all lines starting with the exclamation mark (B<!>) as shell commands and invokes the system shell to interpret the line (this is to mimic FTP and similar command-line interpreters).
Example:
xsh> !ls -l
-rw-rw-r-- 1 pajas pajas 6355 Mar 14 17:08 Artistic
drwxrwxr-x 2 pajas users 128 Sep 1 10:09 CVS
-rw-r--r-- 1 pajas pajas 14859 Aug 26 15:19 ChangeLog
-rw-r--r-- 1 pajas pajas 2220 Mar 14 17:03 INSTALL
-rw-r--r-- 1 pajas pajas 18009 Jul 15 17:35 LICENSE
-rw-rw-r-- 1 pajas pajas 417 May 9 15:16 MANIFEST
-rw-rw-r-- 1 pajas pajas 126 May 9 15:16 MANIFEST.SKIP
<span><a/><b/></span>
<span><a/><b/></span>
</root>
=item See also:
xinsert_command insert_command move_command xmove_command
=back
=head2 xcopy
=over 4
=item Usage:
xcopy [--respectiveE<verbar>:r] [--preserve-orderE<verbar>:p] B<expression> B<location> B<expression>
=item Aliases:
xcp
=item Description:
xcopy is similar to B<copy>, but copies all nodes in the first node-list B<expression> to all destinations determined by the B<location> directive relative to the second node-list B<expression>. See B<copy> for detailed description of B<xcopy> argume...
If the B<--respectiveE<verbar>:r> option is used, then the target node-list B<expression> is evaluated in the context of the source node being copied.
The B<--preserve-orderE<verbar>:p> option can be used to ensure that the copied nodes are in the same relative order as the corresponding source nodes. Otherwise, if B<location> is B<after> or B<prepend>, the relative order of the copied nodes will b...
Example: Copy all middle-earth creatures within the document $a into every world of the document $b.
xsh> xcopy $a/middle-earth/creature into $b//world
=item See also:
copy_command move_command xmove_command insert_command xinsert_command
=back
=head2 xinsert
=over 4
=item Usage:
xinsert [--namespace B<expression>] B<node-type> B<expression> B<location> B<xpath>
=item Aliases:
xadd
=item Description:
Create new nodes of the B<node-type> given in the 1st argument, with the name specified in the 2nd argument, and insert them at B<location>s relative to nodes in the node-list specified in the 4th argument.
For element nodes, the 2nd argument B<expression> should evaluate to something like "E<lt>element-name att-name='attvalue' ...E<gt>". The B<E<lt>> and B<E<gt>> characters are optional. If no attributes are used, the expression may simply consist ...
Attribute nodes use the following syntax: "att-name='attvalue' [...]".
For the other types of nodes (text, cdata, comments) the expression should contain the node's literal content. Again, it is necessary to quote all whitespace and special characters as in any expression argument.
The B<location> argument should be one of: B<after>, B<before>, B<into>, B<replace>, B<append> or B<prepend>. See documentation of the B<location> argument type for more detail.
Optionally, for element and attribute nodes, a namespace may be specified with B<--namespace> or B<:n>. If used, the expression should evaluate to the desired namespace URI and the name of the element or attribute being inserted must have a prefix.
The command returns a node-list consisting of nodes it created.
Note that instead of B<xinsert>, you can alternatively use one of B<xsh:new-attribute>, B<xsh:new-cdata>, B<xsh:new-chunk>, B<xsh:new-comment>, B<xsh:new-element>, B<xsh:new-element-ns>, B<xsh:new-pi>, and B<xsh:new-text> together with the command B...
Example: Give each chapter a provisional title element.
xsh> my $new_titles := xinsert element "<title font-size=large underline=yes>" \
into /book/chapter
xsh> xinsert text "Change me!" into $new_titles;
Example: Same as above, using xcopy and xsh:new-... instead of xinsert
xsh> my $new_titles := xcopy xsh:new-element("title","font-size","large","underline","yes") \
into /book/chapter
xsh> xcopy xsh:new-text("Change me!") into $new_titles;
=item See also:
insert_command move_command xmove_command
=back
=head2 xmove
=over 4
=item Usage:
xmove [--respectiveE<verbar>:r] [--preserve-orderE<verbar>:p] B<xpath> B<location> B<xpath>
=item Aliases:
xmv
=item Description:
Like B<xcopy>, except that B<xmove> removes the source nodes after a successful copy. Remember that the moved nodes are actually different nodes from the original ones (which may not be obvious when moving nodes within a single document into location...
This command returns a node-list consisting of all nodes it created on the target locations.
If the B<--respectiveE<verbar>:r> option is used, then the target node-list B<expression> is evaluated in the context of the source node being copied.
The B<--preserve-orderE<verbar>:p> option can be used to ensure that the copied nodes are in the same relative order as the corresponding source nodes. Otherwise, if B<location> is B<after> or B<prepend>, the relative order of the copied nodes will b...
See B<xcopy> for more details on how the copies of the moved nodes are created.
The following example demonstrates how B<xmove> can be used to get rid of HTML B<E<lt>fontE<gt>> elements while preserving their content. As an exercise, try to figure out why simple B<foreach E<sol>E<sol>font { xmove node() replace . }> would not wo...
Example: Get rid of all E<lt>fontE<gt> tags
while //font {
foreach //font {
xmove node() replace .;
}
}
=item See also:
move_command copy_command xcopy_command insert_command xinsert_command
=back
=head2 xpath-axis-completion
$c = string($b[1]/@name) # $c contains string value of //creature[1]/@name (e.g. Bilbo)
echo //creature # prints: //creature
echo (//creature) # evaluates (//creature) as XPath and prints the
# text content of the resulting node-set
echo { join(",",split(//,$a)) } # prints: b,a,r
echo ${{ join(",",split(//,$a)) }} # the same
echo "${{ join(",",split(//,$a)) }}" # the same
echo "${(//creature[1]/@name)}" # prints e.g.: Bilbo
echo ${(//creature[1]/@name)} # the same
echo //creature[1]/@name # the same
echo string(//creature[1]/@name) # the same
echo (//creature[1]/@name) # the same
Example: In-line documents
$a="bar"
echo foo <<END baz;
xx ${a} yy
END
# prints foo xx bar yy baz
echo foo <<"END" baz;
xx ${a} yy
END
# same as above
echo foo <<'END' baz;
xx ${a} yy
END
# prints foo xx $a yy baz
Example: Expressions returning result of a XSH2 command
copy &{ sort --key @best_score --numeric //player } into .;
=item B<filename>
An B<expression> which evaluates to a valid filename or URL. As long as the expression contains no whitespace, no brackets of any type, quotes, double-quotes, B<$> character nor B<@> character, it is treated as a literal token which evaluates to itse...
=item B<location>
One of: B<after>, B<before>, B<into>, B<append>, B<prepend>, B<replace>.
This argument is required by all commands that insert nodes into a document at a destination described by an XPath expression. The meaning of the values listed above is supposed to be obvious in most cases, however the exact semantics for loca...
B<afterE<sol>before> place the node right afterE<sol>before the destination node, except for when the destination node is a document node or one of the source nodes is an attribute: If the destination node is a document node, the source node is attac...
B<appendE<sol>prepend> appendsE<sol>prepends the source node to the destination node. If the destination node can contain other nodes (i.e. it is an element or a document node) then the entire source node is attached to it. In case of other destinati...
B<into> can also be used to place the source node to the end of an element (in the same way as B<append>), to attach an attribute to an element, or, if the destination node is a text node, cdata section, processing-instruction, attribute or comment, ...
B<replace> replaces the entire destination node with the source node except for the case when the destination node is an attribute and the source node is not. In such a case only the value of the destination attribute is replaced with the textual con...
=item B<node-type>
One of: element, attribute, text, cdata, comment, chunk and (EXPERIMENTALLY!) entity_reference. A chunk is a character string which forms a well-balanced piece of XML.
Example:
add element hobbit into //middle-earth/creatures;
add attribute 'name="Bilbo"' into //middle-earth/creatures/hobbit[last()];
add chunk '<hobbit name="Frodo">A small guy from <place>Shire</place>.</hobbit>'
into //middle-earth/creatures;
=item B<nodename>
An B<expression> which evaluates to a valid name of an element, attribute or processing-instruction node. As long as the expression contains no whitespace, no brackets of any type, quotes, double-quotes, B<$> character, nor B<@> character, it is trea...
=item B<perl-code>
A block of Perl code enclosed in braces. All XSH2 variables are transparently accessible from the Perl code as well.
For more information about embedded Perl code in XSH2, predefined functions etc., see B<Perl_shell>.
Example:
xsh> $i={ "foo" };
xsh> perl { echo "$i-bar\n"; } # prints foo-bar
xsh> echo { "$i-bar" } # very much the same as above
=item B<subroutine>
A sub-routine name is an identifier matching the following regular expression B<[a-zA-Z_][a-zA-Z0-9_]*>, i.e., it must be at least one character long, must begin with a letter or underscore, and may only contain letters, underscores, and digit...
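As a quick illustration of the naming rule, the following plain-Perl sketch checks candidate names against the pattern quoted above, anchored so that the whole name must match. The helper C<is_valid_sub_name> is hypothetical and not part of XSH2.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XSH2): test a candidate sub-routine
# name against the documented pattern, anchored to the whole string.
sub is_valid_sub_name {
    my ($name) = @_;
    return (defined($name) && $name =~ /\A[a-zA-Z_][a-zA-Z0-9_]*\z/) ? 1 : 0;
}

# add_hobbit and _tmp2 are accepted; 2fast starts with a digit and
# my-sub contains a hyphen, so both are rejected.
for my $name (qw(add_hobbit _tmp2 2fast my-sub)) {
    printf "%-12s %s\n", $name, is_valid_sub_name($name) ? "ok" : "rejected";
}
```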
=item B<xpath>
XSH2 can evaluate XPath expressions as defined in W3C recommendation at http:E<sol>E<sol>www.w3.orgE<sol>TRE<sol>xpath with only a little limitation on use of syntactically ignorable whitespace. (Nice interactive XPath tutorials and references can be...
In order to allow XSH2 to use white-space as a command argument delimiter (which is far more convenient to type than, say, commas), the use of white-space in XPath is slightly restricted. Thus, in XSH2, white-space can only occur in those parts of an...
/ foo / bar [ @baz = "bar" ]
should in XSH2 be written as either of
/foo/bar[ @baz = "bar" ]
avoiding any white-space outside the square brackets, or completely enclosed in brackets as in
( / foo / bar [ @baz = "bar" ] ).
XSH2 provides a number of powerful XPath extension functions, listed below and described in separate sections. XPath extension functions by default belong to XSH2 namespace B<http:E<sol>E<sol>xsh.sourceforge.netE<sol>xshE<sol>> with a namespace prefi...
XPath extension functions defined in XSH2: xsh:base-uri, xsh:context, xsh:current, xsh:doc, xsh:document, xsh:document-uri, xsh:documents, xsh:evaluate, xsh:filename, xsh:grep, xsh:id2, xsh:if, xsh:join, xsh:lc, xsh:lcfirst, xsh:lineno, xsh:lookup, x...
Example: Open a document and count all sections containing a subsection
xsh $scratch/> $v := open mydocument1.xml;
xsh $v/> $k := open mydocument2.xml;
xsh $k/> count //section[subsection]; # searches k
xsh $k/> count $v//section[subsection]; # searches v
=back
=head1 XPATH EXTENSION FUNCTION REFERENCE
=head2 xsh:base-uri
=over 4
=item Usage:
string xsh:base-uri(node-set?)
=item Description:
Returns base URI of the first node in the node-set (or the current node). The function should work on both XML and HTML documents even if base mechanisms for these are completely different. It returns the base as defined in RFC 2396 sections "5.1.1. ...
=back
=head2 xsh:context
=over 4
=item Usage:
node-set xsh:context(node-set NODE, float BEFORE, float AFTER)
=item Description:
Returns a node-set of sibling nodes surrounding NODE. The span consists of (up to) BEFORE-many nodes immediately preceding NODE, the NODE itself, and (up to) AFTER-many nodes immediately following NODE. If the AFTER is not given, AFTER is set equal t...
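The "(up to)" wording can be pictured with a small plain-Perl model, given here only as a sketch. It assumes the siblings are a simple list and NODE is identified by its index in that list; C<context_window> is a hypothetical name, and both BEFORE and AFTER are passed explicitly.

```perl
use strict;
use warnings;

# Hypothetical model: siblings are a plain list and NODE is given by its
# index. The window is clipped at both ends of the sibling list, which is
# what "(up to)" BEFORE-many and AFTER-many nodes amounts to.
sub context_window {
    my ($siblings, $index, $before, $after) = @_;
    my $first = $index - $before;
    $first = 0 if $first < 0;
    my $last = $index + $after;
    $last = $#$siblings if $last > $#$siblings;
    return @{$siblings}[$first .. $last];
}

my @siblings = qw(a b c d e f);
print join(",", context_window(\@siblings, 2, 1, 2)), "\n";   # prints b,c,d,e
print join(",", context_window(\@siblings, 0, 3, 1)), "\n";   # prints a,b
```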
=back
=head2 xsh:current
=over 4
=item Usage:
node-set xsh:current()
=item Description:
This function (very similar to XSLT B<current()> extension function) returns a node-set having the current node as its only member.
=back
=head2 xsh:doc
=over 4
=item Usage:
=back
=head2 xsh:matches
=over 4
=item Usage:
boolean xsh:matches(string STR,string PATTERN)
=item Description:
Returns B<true> if B<STR> matches the regular expression B<PATTERN>. Otherwise returns B<false>.
=back
=head2 xsh:max
=over 4
=item Usage:
float xsh:max(object EXPRESSION, ...)
=item Description:
Returns the maximum of numeric values computed from given B<EXPRESSION>(s). If B<EXPRESSION> evaluates to a node-set, string values of individual nodes are used.
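The following plain-Perl sketch models this behaviour under stated assumptions: a plain scalar stands for a string or number argument, and an array reference stands for a node-set and holds the string values of its nodes. C<max_of> is a hypothetical name, and the sketch only handles values that look like numbers.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical model of the arguments: scalars are used directly, array
# references stand for node-sets and contribute the string value of
# every node they contain.
sub max_of {
    my @values = map { ref($_) eq 'ARRAY' ? @$_ : $_ } @_;
    return max(map { $_ + 0 } @values);   # force numeric interpretation
}

print max_of(3, ["10", "7.5"], "4"), "\n";   # prints 10
```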
=back
=head2 xsh:min
=over 4
=item Usage:
float xsh:min(object EXPRESSION, ...)
=item Description:
Returns the minimum of numeric values computed from given B<EXPRESSION>(s). If B<EXPRESSION> evaluates to a node-set, string values of individual nodes are used.
=back
=head2 xsh:new-attribute
=over 4
=item Usage:
node-set xsh:new-attribute(string NAME1, string VALUE1, [string NAME2, string VALUE2, ...])
=item Description:
Return a node-set consisting of newly created attribute nodes with given names and respective values.
=back
=head2 xsh:new-cdata
=over 4
=item Usage:
node-set xsh:new-cdata(string DATA)
=item Description:
Create a new cdata section node filled with given B<DATA> and return a node-set containing the new node as its only member.
=back
=head2 xsh:new-chunk
=over 4
=item Usage:
node-set xsh:new-chunk(string XML)
=item Description:
This is just an alias for B<xsh:parse>. It parses given piece of XML and returns a node-set consisting of the top-level element within the parsed tree.
=back
=head2 xsh:new-comment
=over 4
=item Usage:
node-set xsh:new-comment(string DATA)
=item Description:
Create a new comment node containing given B<DATA> and return a node-set containing the new node as its only member.
=back
=head2 xsh:new-element
=over 4
=item Usage:
node-set xsh:new-element(string NAME, [string ATTR-NAME1, string ATTR-VALUE1, ...])
=item Description:
Create a new element node with given B<NAME> and optionally attributes with given names and values and return a node-set containing the new node as its only member.
=back
=head2 xsh:new-element-ns
=over 4
=item Usage:
node-set xsh:new-element-ns(string NAME, string NS, [string ATTR-NAME1, string ATTR-VALUE1, ...])
=item Description:
Create a new element node with given B<NAME> and namespace-uri B<NS> and optionally attributes with given names and values and return a node-set containing the new node as its only member.
=back