XML-Comma

lib/XML/Comma/docs/guide.html  view on Meta::CPAN

&lt;User&gt;
&lt;username&gt; kwindla &lt;/username&gt;
&lt;full_name&gt; Kwindla Hultman Kramer &lt;/full_name&gt;
&lt;/User&gt;

&lt;bio&gt; Kwin is a programmer who likes &lt;a href="http://use.perl.org"&gt;Perl&lt;/a&gt;
and &lt;a href="http://www.motorola.com/mcu"&gt;6812&lt;/a&gt;
assembly language. &lt;/bio&gt;
</pre>

<p> The above Doc is perfectly fine. Because the two <i>a</i> tags are
balanced, the parser has no problem reading in the Doc. After parsing
is finished the content of the <i>bio</i> element is treated just like
any other "flat" piece of content. </p>

<p> We will run into problems, however, if we're not extremely careful
about the HTML we try to store in the <i>bio</i> element. For example,
HTML includes a number of "empty" tags that are usually used in a
non-balanced fashion -- <i>img</i> and <i>br</i>, for example. Unless
we force the use of XHTML syntax, which mandates XML-compatible tag
usage, we'll need to either escape all mark-up characters or wrap
content in a CDATA section. </p>

<p> The utility methods <b>XML_basic_escape</b> and
<b>XML_basic_unescape</b> handle simple escaping and unescaping of
markup characters. </p>

<pre>
use XML::Comma::Util qw( XML_basic_escape XML_basic_unescape );
# escape the markup characters in a string
$escaped = XML_basic_escape ( '&lt;img src="picture.png"&gt;' );
# reverse the escaping to recover the original string
$unescaped = XML_basic_unescape ( $escaped );
</pre>
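<p> As a rough illustration of what "simple escaping" means, the sketch
below escapes and unescapes the three basic markup characters. The
functions <b>sketch_escape</b> and <b>sketch_unescape</b> are
hypothetical stand-ins written for this example; they are not the
XML::Comma::Util source, and the exact set of characters the real
utilities handle is an assumption. </p>

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration only; not the XML::Comma::Util code.
sub sketch_escape {
    my ($str) = @_;
    $str =~ s/&/&amp;/g;    # ampersand first, so the entities added below are not re-escaped
    $str =~ s/</&lt;/g;
    $str =~ s/>/&gt;/g;
    return $str;
}

sub sketch_unescape {
    my ($str) = @_;
    $str =~ s/&lt;/</g;
    $str =~ s/&gt;/>/g;
    $str =~ s/&amp;/&/g;    # ampersand last, mirroring the escape order
    return $str;
}

my $html    = '<img src="picture.png"> & more';
my $escaped = sketch_escape($html);
print "$escaped\n";
print sketch_unescape($escaped) eq $html ? "round trip ok\n" : "round trip failed\n";
```

<p> The ordering matters: escaping the ampersand first and unescaping
it last keeps already-escaped text such as <i>&amp;lt;</i> intact
through a round trip. </p>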

<p> The <b>set()</b> and <b>get()</b> methods can escape and unescape
strings as part of the set and get operations. If <b>set()</b> is
called with additional arguments following the <i>content</i> arg,
they are interpreted as parameters that affect how the set is
performed. The argument <b>escape=&gt;1</b> forces the content string
to be escaped before the other pieces of the set routine (validation,
etc.) go to work. Similarly, calling <b>get()</b> with the
parameterized arg <b>unescape=&gt;1</b> unescapes the content string
before it is returned. </p>

<pre>
# safe set()
$doc-&gt;element('bio')-&gt;set ( $html_stuff, escape=&gt;1 );

# get() bio content in a string that we can incorporate directly into
# a web page
$doc-&gt;element('bio')-&gt;get ( unescape=&gt;1 );
</pre>
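<p> To make the order of operations concrete, here is a toy model of
an element whose <b>set()</b> escapes before storing and whose
<b>get()</b> unescapes before returning. The <b>ToyElement</b> package
and its helpers are hypothetical and exist only for this sketch; the
real Comma element classes do considerably more (validation, hooks and
so on). </p>

```perl
use strict;
use warnings;

package ToyElement;    # hypothetical; not an XML::Comma class

sub new {
    my ($class) = @_;
    return bless { content => '' }, $class;
}

sub _escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g; $s =~ s/</&lt;/g; $s =~ s/>/&gt;/g;
    return $s;
}

sub _unescape {
    my ($s) = @_;
    $s =~ s/&lt;/</g; $s =~ s/&gt;/>/g; $s =~ s/&amp;/&/g;
    return $s;
}

# set ( $content, %args ): escape first when asked, then store.
sub set {
    my ($self, $content, %args) = @_;
    $content = _escape($content) if $args{escape};
    # validation and other set-time work would see the escaped string here
    $self->{content} = $content;
    return $content;
}

# get ( %args ): unescape the stored string just before returning it.
sub get {
    my ($self, %args) = @_;
    return $args{unescape} ? _unescape($self->{content}) : $self->{content};
}

package main;

my $bio = ToyElement->new;
$bio->set('Likes <br> tags', escape => 1);
print $bio->get(), "\n";                  # stored, escaped form
print $bio->get(unescape => 1), "\n";     # markup restored for a web page
```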

<p> Our other option, as mentioned above, is to "wrap" the bio
element's content in an XML CDATA section. The CDATA envelope forces
an XML parser to treat the characters inside it as plain text. Comma
allows an element to be flagged as CDATA-fied, meaning that on output
the entire contents will be wrapped in a CDATA section. Comma treats
this CDATA facility as high-impact and coarse-grained. As a result the
declaration is a one-way street: once a CDATA element, always a CDATA
element. The <b>cdata_wrap()</b> method flips the switch, so to
speak. </p>

<pre>
# configure the bio element so that it always CDATA-wraps its content
$doc-&gt;element('bio')-&gt;cdata_wrap();
# now we can set() the bio element with impunity
$doc-&gt;element('bio')-&gt;set ( $messy_html );
</pre>

<p> The <b>to_string()</b> method on the CDATA-set element will
produce output that looks something like this: </p>

<pre>
&lt;bio&gt;&lt;![CDATA[Kwin is a programmer who likes &lt;a href="http://use.perl.org"&gt;Perl&lt;/a&gt;
and &lt;a href="http://www.motorola.com/mcu"&gt;6812&lt;/a&gt;
assembly language.]]&gt;&lt;/bio&gt;
</pre>
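<p> The wrapping itself can be sketched in a few lines of Perl. The
helper <b>sketch_cdata_wrap</b> below is hypothetical and is not
Comma's implementation. It also handles one case the text above does
not discuss: a literal <i>]]&gt;</i> inside the content would end the
section early, so the sketch splits it across two adjacent CDATA
sections. Whether Comma itself does this is not stated here. </p>

```perl
use strict;
use warnings;

# Hypothetical helper: wrap text in a CDATA section so that a parser
# treats markup characters inside it as plain text.
sub sketch_cdata_wrap {
    my ($text) = @_;
    # Split any "]]>" so that it cannot terminate the section prematurely.
    $text =~ s/\]\]>/]]]]><![CDATA[>/g;
    return '<![CDATA[' . $text . ']]>';
}

print sketch_cdata_wrap('<img src="a.png"><br>'), "\n";
print sketch_cdata_wrap('a ]]> b'), "\n";
```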

<h2>Flexible and Automatic Escape/Unescape</h2>

<p> Escaping and unescaping element content is common enough that each
Element in a Def can configure the following: </p>

<ol>
<li> The code that performs the <b>escape</b> operation</li>
<li> The code that performs the <b>unescape</b> operation</li>
<li> Whether to automatically escape element content on a <b>set()</b></li>
<li> Whether to automatically unescape element content on a <b>get()</b></li>
</ol>

<p> Here is a (silly) example of a custom escape/unescape pair as part
of an Element's definition: </p>

<pre>
&lt;element&gt;
  &lt;name&gt;Xs_are_dangerous&lt;/name&gt;
  &lt;escapes&gt;
    &lt;escape_code&gt; 
      sub { my $str=shift; $str =~ s:X:--x--:g; return $str; }
    &lt;/escape_code&gt;
    &lt;unescape_code&gt;
      sub { my $str=shift; $str =~ s:--x--:X:g; return $str; }
    &lt;/unescape_code&gt;
    &lt;auto&gt;1&lt;/auto&gt;
  &lt;/escapes&gt;
&lt;/element&gt;
</pre>

<p>Within the <b>escapes</b> section, <b>escape_code</b> specifies
the code that performs the escape, and <b>unescape_code</b> specifies
the code that performs the unescape. They default, respectively, to: </p>

<pre>
   \&amp;XML::Comma::Util::XML_basic_escape
   \&amp;XML::Comma::Util::XML_basic_unescape
</pre>
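<p> The custom pair from the <i>Xs_are_dangerous</i> example can be
tried outside of a Def as two plain code references. This is only a
stand-alone check of the two subs; the variable names are chosen for
the example. </p>

```perl
use strict;
use warnings;

# The two subs from the Xs_are_dangerous definition, as plain code refs.
my $escape   = sub { my $str = shift; $str =~ s:X:--x--:g; return $str; };
my $unescape = sub { my $str = shift; $str =~ s:--x--:X:g; return $str; };

my $stored = $escape->('XML and XSLT');
print "$stored\n";                    # every X replaced by --x--
print $unescape->($stored), "\n";     # original string restored
```

<p> Note that this (silly) pair is not a perfect inverse: content that
already contains the literal text <i>--x--</i> comes back from a round
trip with an <i>X</i> in its place. A custom pair meant for real use
should avoid that kind of collision. </p>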

<p> The <b>auto</b> element controls behaviors 3 and 4, from the list
above. The content of <b>auto</b> is eval'ed at Def load time, and if
<b>auto</b> contains a scalar value, that value sets the default for
both escaping and unescaping. If <b>auto</b> contains a listref, the
first value in the list controls escaping, and the second
unescaping. <b>auto</b> defaults to "0". </p>
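<p> The scalar-versus-listref rule can be sketched as a small helper
that turns the text of <b>auto</b> into a pair of flags. The function
<b>sketch_auto_flags</b> is hypothetical and only models the rule as
described above; it is not how Comma loads a Def. </p>

```perl
use strict;
use warnings;

# Hypothetical helper: eval the content of <auto> and return
# ( escape_on_set, unescape_on_get ) as 1/0 flags.
sub sketch_auto_flags {
    my ($auto_text) = @_;
    # Fall back to the documented default of "0" when nothing is given.
    $auto_text = '0' unless defined $auto_text && $auto_text =~ /\S/;
    my $value = eval $auto_text;
    die "bad auto value: $@" if $@;
    return ref($value) eq 'ARRAY'
        ? ( $value->[0] ? 1 : 0, $value->[1] ? 1 : 0 )    # [ escape, unescape ]
        : ( $value ? 1 : 0, $value ? 1 : 0 );             # one scalar sets both
}

for my $auto ( '1', '0', '[1,0]' ) {
    my ($on_set, $on_get) = sketch_auto_flags($auto);
    print "$auto => escape on set: $on_set, unescape on get: $on_get\n";
}
```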

<p> In the example above, <b>auto</b> is "1", so content is silently

    <li>$doc = XML::Comma::Doc-&gt;new ( file =&gt; )</li>
    <li>$doc = XML::Comma::Doc-&gt;retrieve ( key, [timeout=&gt;&lt;seconds&gt;] )</li>
    <li>$doc = XML::Comma::Doc-&gt;retrieve ( store =&gt;, type =&gt;, id =&gt;, [timeout=&gt;&lt;seconds&gt;] )</li>
    <li>$doc || undef = XML::Comma::Doc-&gt;retrieve_no_wait ( key )</li>
    <li>$doc || undef = XML::Comma::Doc-&gt;retrieve_no_wait ( store =&gt;, type =&gt;, id =&gt; )</li>
    <li>$doc = XML::Comma::Doc-&gt;read ( key )</li>
    <li>$doc = XML::Comma::Doc-&gt;read ( &lt;retrieve arguments&gt; )</li>
    <li>$doc = $doc-&gt;get_lock ( [timeout=&gt;&lt;seconds&gt;] );</li>
    <li>$doc || undef = $doc-&gt;get_lock_no_wait();</li>
    <li>$string = $doc-&gt;to_string()</li>
    <li>$string = $doc-&gt;comma_hash()</li>
    <li>@elements = $doc-&gt;get_leaf_nodes( [ include =&gt; [ path_1 ... path_n ] ], [ exclude =&gt; [ path_1 ... path_n ] ])</li>
    <li>$string = $doc-&gt;full_field_texts( [ same args as get_leaf_nodes ] );</li>
    <li>$self = $doc-&gt;store ( store=&gt;, [keep_open=&gt;], [no_hooks=&gt;], [args...] )</li>
    <li>$self = $doc-&gt;erase()</li>
    <li>$self = $doc-&gt;copy()</li>
    <li>$self = $doc-&gt;copy ( &lt;store arguments&gt; )</li>
    <li>$self = $doc-&gt;move()</li>
    <li>$self = $doc-&gt;move ( &lt;store arguments&gt; )</li>
    <li>$store  = $doc-&gt;doc_store()</li>
    <li>$string = $doc-&gt;doc_location()</li>
    <li>$string = $doc-&gt;doc_id()</li>
    <li>$string = $doc-&gt;doc_key()</li>
    <li>$string = $doc-&gt;doc_source_file()</li>
    <li>$bool   = $doc-&gt;doc_is_locked()</li>
    <li>$bool   = $doc-&gt;doc_is_new()</li>
    <li>$int    = $doc-&gt;doc_last_modified()</li>
    <li>$doc = $doc-&gt;index_update ( index=&gt;$index )</li>
    <li>$doc = $doc-&gt;index_remove ( index=&gt;$index )</li>
  </ul></li>

  <li>all elements<ul>
    <li>$string = $el-&gt;tag()</li>
    <li>$string = $el-&gt;tag_up_path()</li>
    <li>$def = $el-&gt;def()</li>
    <li>$return_val = $el-&gt;method ( $name, [ @args...] )</li>
    <li>null = $el-&gt;set_attr ( $name =&gt; $value, [ $name =&gt; $value ... ] );</li>
    <li>$string = $el-&gt;get_attr ( $name );</li>
    <li>$hash_ref = $def-&gt;def_pnotes();</li>
    <li>@names = $def-&gt;applied_macros();</li>
    <li>1/undef = $def-&gt;applied_macros ( @names );</li>
    <li>$hashref = $el-&gt;pnotes();</li>
  </ul></li>

  <li>blob elements<ul>
    <li>$string = $el-&gt;set( $string )</li>
    <li>$string = $el-&gt;get()</li>
    <li>'' = $el-&gt;set_from_file ( $filename )</li>
    <li>'' = $el-&gt;validate()</li>
    <li>$string = $el-&gt;append ( $more_string )</li>
    <li>$string = $el-&gt;get_location()</li>
  </ul></li>

  <li>simple elements<ul>
    <li>$string = $el-&gt;get( [unescape=&gt;], [%args] )</li>
    <li>$string = $el-&gt;get_without_default()</li>
    <li>$string = $el-&gt;set ( $string, [escape=&gt;], [%args] )</li>
    <li>$string = $el-&gt;append ( $more_string )</li>
    <li>$string = $el-&gt;validate()</li>
    <li>$string = $el-&gt;validate_content ( $string )</li>
    <li>1 = $el-&gt;cdata_wrap();</li>
  </ul></li>

  <li>nested elements<ul>
    <li>@els/[] = $el-&gt;elements ( [@tags] )</li>
    <li>$el = $el-&gt;element ( $tag )</li>
    <li>$el = $el-&gt;add_element ( $tag )</li>
    <li>$el = $el-&gt;delete_element ( $tag )</li>
    <li>@strings/[] = $el-&gt;elements_group_get ( $tag )</li>
    <li>@strings/[] = $el-&gt;elements_group_add ( $tag, @strings )</li>
    <li>@els/[] = $el-&gt;elements_group_delete ( $tag, @strings ) </li>
    <li>$bool = $el-&gt;elements_group_lists ( $tag, $string )</li>
    <li>$bool = $el-&gt;element_is_plural ( $tag )</li>
    <li>$bool = $el-&gt;element_is_defined ( $tag )</li>
    <li>$bool = $el-&gt;element_is_nested ( $tag )</li>
    <li>$bool = $el-&gt;element_is_blob ( $tag )</li>
    <li>$bool = $el-&gt;element_is_required ( $tag )</li>
    <li>'' = $el-&gt;validate()</li>
    <li>[DEPRECATED] '' = $el-&gt;validate_structure()</li>
    <li>@els = $el-&gt;get_all_blobs()</li>
    <li>$el = $el-&gt;group_elements();</li>
    <li>@els/[] = $el-&gt;sort_elements ( [@tags] )</li>
  </ul></li>

  <li>XML::Comma::Def<ul>
    <li>$def = XML::Comma::Def-&gt;read ( name =&gt; )</li>
    <li>@names = $def-&gt;store_names();</li>
    <li>$store = $def-&gt;get_store ( $name );</li>
    <li>@names = $def-&gt;index_names();</li>
    <li>@names = $def-&gt;method_names();</li>
    <li>$index = $def-&gt;get_index ( $name );</li>
    <li>$hash_ref = $def-&gt;def_pnotes();</li>
    <li>$code_ref = $def-&gt;add_hook ( $hook_type, $string || $code_ref );</li>
    <li>$code_ref = $def-&gt;add_method ( $name, $string || $code_ref );</li>
    <li>$code_ref || undef = $def-&gt;method_code ( $name );</li>
    <li>@return/[] = $def-&gt;method ( $name, @args );</li>
    <li>@names = $def-&gt;applied_macros();</li>
    <li>1/undef = $def-&gt;applied_macros ( @names );</li>
    <li>@els/[] = $def-&gt;def_sub_elements();</li>
    <li>$el = $def-&gt;def_by_name ( $element_name );</li>
    <li>1/undef = $def-&gt;is_required();</li>
    <li>1/undef = $def-&gt;is_plural();</li>
    <li>1/undef = $def-&gt;is_nested();</li>
    <li>1/undef = $def-&gt;is_blob();</li>
    <li>1/undef = $def-&gt;is_ignore_for_hash();</li>
    <li>1/undef = $def-&gt;has_property( [ ignore_for_hash |
include_for_hash | plural | required | nested | blob | enum | boolean |
range | timestamp | timestamp_created | timestamp_last_modified |
doc_key | single_line ] );</li>
  </ul></li>

  <li>XML::Comma::Indexing::Index<ul>
    <li>@names = $index-&gt;field_names();</li>
    <li>@names = $index-&gt;sort_names(); [ DEPRECATED ]</li>
    <li>@names = $index-&gt;collection_names();</li>
    <li>@names = $index-&gt;textsearch_names();</li>
    <li>@names = $index-&gt;method_names();</li>
    <li>$type_name = $index-&gt;collection_type ( $collection_name );</li>
    <li>$iterator = $index-&gt;iterator ( [%args] );</li>
    <li>$iterator/undef = $index-&gt;single ( [%args] );</li>
    <li>$doc/undef = $index-&gt;single_read ( [%args] );</li>


