XML-Twig

 view release on metacpan or  search on metacpan

Changes  view on Meta::CPAN

3.30 - 2007-11-06 

- fixed: a couple of bugs in namespace handling, spotted by
  Shlomo Yonas (see https://rt.cpan.org/Ticket/Display.html?id=27617
  and http://www.perlmonks.org/?node_id=624830)

- added: the XML::Twig::Elt fields method which returns a list of
  fields 

- added: the normalize method in XML::Twig and XML::Twig::Elt,
  which merge together consecutive pcdata elements. As much as 
  possible (so far after a cut, delete or erase), the twig is 
  kept normalized, eg there are no consecutive #PCDATA elements
  in it. Suggestion of someone whose name (and emails) I can't
  find at the moment.

- added: the indented_a / cvs format for pretty_print, that makes the
  output friendly to line-oriented version control tools, as described
  in http://tinyurl.com/2kwscq (RT #24954). Thanks to Sjur Moshagen
  for a patch that I adapted to the current version.

Changes  view on Meta::CPAN

- fixed: a bug in xml_pp when pretty printing a 
  file in place in a different file system

3.24 - 2006-05-09 

- added: loading the text of entities stored in 
  separate files (using SYSTEM) when the (awfully
  named!) expand_external_ents option is used.
  Thanks to jhx for spotting this.

- changed: set_cdata, set_pi and set_comment so that
  if you call them on an element of the wrong kind,
  everything works as expected, instead of swallowing
  silently the data. Bug spotted by cmccutcheon

- fixed: a whole bunch of things to make the module 
  run and the tests pass on VMS, thanks to Peter 
  (Stig) Edwards who reported bug RT #18655 and 
  provided a patch.

- fixed: bug on get_xpath( '/root[1]') expressions,

Changes  view on Meta::CPAN

- fixed: bug in the nsgmls pretty printer that output
  invalid XML (an extra \n was added in the end tag)
  found by Lee Goddard

- fixed: test 284 in test_additional to make it pass 
  in RedHat's version of perl 5.8.0, thanks to
  rdhayes for debugging and fixing that test.

- improved: first shot at getting Pis and comments back in the
  proper place, even in 'keep' mode. At the moment
  using set_pcdata  (or set_cdata) removes all
  embedded comments/pis

- fixed: a bug with pi's in keep mode (pi's would not
  be copied if they were within an element) found by
  Pascal Sternis

- added: a fix to get rid of spurious warnings, sent
  by Anthony Persaud 

- added: the remove_cdata option to the XML::Twig new
  method, that will output CDATA sections as regular
  (escaped) PCDATA

- added: the index option to the XML::Twig new method,
  and the associated XML::Twig index method, which 
  generates a list of element matching a condition 
  during parsing

- added: the XML::Twig::Elt first_descendant method

Changes  view on Meta::CPAN

- improved: the conversion functions (errors are now reported when the
  function is created and not when it is first used)

- added: the output_encoding option to XML::Twig->new, which allows 
  specifying an encoding for the output: the conversion filter is
  created using Encode (perl 5.8.0) Text::Iconv or Unicode::* The
  XML declaration is also updated

- added: #CDATA and #ENT can now be used in handler expressions

- added: XML::Twig::Elt remove_cdata method, which turns CDATA sections
  into regular PCDATA elements

- improved: set_asis can now be used to output CDATA sections un-escaped (and without
  the CDATA section markers) 

3.04 - 2002-04-01

- fixed: handlers for XML::Parser 2.27 so the module can pass the tests

3.03 - 2002-03-26

Changes  view on Meta::CPAN

3.00 - 2002-01-09

- COMPATIBILITY 
  WARNING: THIS CHANGE IS NOT BACKWARD COMPATIBLE

  But it is The Right Thing To Do

  In normal mode (when KeepEncoding is not used) the XML data is
  now stored as parsed by XML::Parser, ie the base entities are
  expanded. The "print" methods (print, sprint and flush, plus the
  new xml_string, pcdata_xml_string and att_xml_string) return the
  data in XML-escaped form: & and < are escaped in PCDATA and
  &, < and the quote (" by default) are turned to &amp; &lt; and
  &quot; (or &apos; if the quote is '). The "text" methods (text,
  att and pcdata) return the stored text as is.
  So if you want to output XML you should use the "print" methods
  and if you want to output text you should use the "text" methods.

  Note that this breaks the trick consisting in adding tags to the
  content of an element: $elt->prefix( "<b>") no longer adds a <b>
  tag before an element. $elt->print will now output "&lt;b>...".
  (but you can still use it by marking those elements as 'asis').
  It also fixes the annoying &apos; thingie that used to replace '
  in the data. 

Changes  view on Meta::CPAN

  used by the object (useful if you don't have WeakRef installed)

- added: XML::Twig and XML::Twig::Elt ignore methods, which can be called
  from a start_tag_handlers handler and cause the element (or the
  current element if called on a twig) to be ignored by the 
  parsing

- added: XML::Twig parse_start_tag option that overrides the default function
  used to parse start tags when KeepEncoding is used

- added: XML::Twig::Elt xml_string, pcdata_xml_string and att_xml_string
  all return an XML-escaped string for an element (including 
  sub-elements and their tags but not the enclosing tags for the 
  element), a #PCDATA element and an attribute

- added: XML::Twig::Elt methods tag and set_tag, equivalent respectively
  to gi and set_gi
  
- added: XML::Twig and XML::Twig::Elt set_keep_encoding methods can be used
  to set the keep_encoding value if you use several twigs with 
  different keep_encoding options

MANIFEST  view on Meta::CPAN

t/test_with_lwp.t
t/test_with_lwp.xml
t/test_with_lwp_not_wf.xml
t/test_attregexp_cond.t
t/test_xpath_cond.t
t/test_erase.t
t/test_even_more_coverage.t
t/test_keep_atts_order.t
t/test_mark.t
t/test_ignore_elts.t
t/test_cdata.t
t/test_twig_roots.t
t/test_spaces.t
t/test_simplify.t
t/test_entities.t
t/test_pi_handler.t
t/test_comment_handler.t
t/test_pos.t
t/test_variables.t
t/test_drop_comments.t
t/test_unique_xpath.t

lib/XML/Twig.pm  view on Meta::CPAN

# handlers used in regular mode
my %twig_handlers=( Start      => \&_twig_start,
                    End        => \&_twig_end,
                    Char       => \&_twig_char,
                    Entity     => \&_twig_entity,
                    Notation   => \&_twig_notation,
                    XMLDecl    => \&_twig_xmldecl,
                    Doctype    => \&_twig_doctype,
                    Element    => \&_twig_element,
                    Attlist    => \&_twig_attlist,
                    CdataStart => \&_twig_cdatastart,
                    CdataEnd   => \&_twig_cdataend,
                    Proc       => \&_twig_pi,
                    Comment    => \&_twig_comment,
                    Default    => \&_twig_default,
                    ExternEnt  => \&_twig_extern_ent,
      );

# handlers used when twig_roots is used and we are outside of the roots
my %twig_handlers_roots=
  ( Start      => \&_twig_start_check_roots,
    End        => \&_twig_end_check_roots,

lib/XML/Twig.pm  view on Meta::CPAN

      }

    if( $args{OutputFilter})
      { $self->set_output_filter( $args{OutputFilter});
        delete $args{OutputFilter};
      }
    else
      { $self->set_output_filter( 0); }

    if( $args{RemoveCdata})
      { $self->set_remove_cdata( $args{RemoveCdata});
        delete $args{RemoveCdata};
      }
    else
      { $self->set_remove_cdata( 0); }

    if( $args{OutputTextFilter})
      { $self->set_output_text_filter( $args{OutputTextFilter});
        delete $args{OutputTextFilter};
      }
    else
      { $self->set_output_text_filter( 0); }

    if( $args{KeepAttsOrder})
      { $self->{keep_atts_order}= $args{KeepAttsOrder};

lib/XML/Twig.pm  view on Meta::CPAN

                  else
                    { $$xml=~ s{^(<\?xml.*?\?>)?}{}s;
                      #warn "  converting to utf8 from $encoding\n";
                      $$xml= _to_utf8( $encoding, $$xml);
                    }
                }
            }
        }

      # some versions of HTML::TreeBuilder escape CDATA sections
      $$xml=~ s{(&lt;!\[CDATA\[.*?\]\]&gt;)}{_unescape_cdata( $1)}eg;
  }

  sub _xml_parser_encodings
    { my @encodings=( 'iso-8859-1'); # this one is included by default, there is no map for it in @INC
      foreach my $inc (@INC)
        { push @encodings, map { basename( $_, '.enc') } glob( File::Spec->catdir( $inc => XML => Parser => Encodings => '*.enc')); }
      return map { $_ => 1 } @encodings;
    }
}

sub _unescape_cdata
  { my( $cdata)= @_;
    $cdata=~s{&lt;}{<}g;
    $cdata=~s{&gt;}{>}g;
    $cdata=~s{&amp;}{&}g;
    return $cdata;
  }

sub _as_XML {

    # fork of HTML::Element::as_XML, which is a little too buggy and inconsistent between versions for my liking
    my ($elt) = @_;
    my $xml= '';
    my $empty_element_map = $elt->_empty_element_map;

    my ( $tag, $node, $start );    # per-iteration scratch

lib/XML/Twig.pm  view on Meta::CPAN

    return $@ ? $t->_reset_twig_after_error : $t;
  }

# restore a twig in a proper state so it can be reused for a new parse
sub _reset_twig
  { my $t= shift;
    $t->{twig_parsing}= 0;
    delete $t->{twig_current};
    delete $t->{extra_data};
    delete $t->{twig_dtd};
    delete $t->{twig_in_pcdata};
    delete $t->{twig_in_cdata};
    delete $t->{twig_stored_space};
    delete $t->{twig_entity_list};
    $t->root->delete if( $t->root);
    delete $t->{twig_root};
    return $t;
  }

sub _reset_twig_after_error
  { my $t= shift;
    $t->_reset_twig;

lib/XML/Twig.pm  view on Meta::CPAN

    $t->{twig_right_after_root}=0; #XX

    my $current= $t->{twig_current} or return; # ugly hack, with ignore on, twig_current can disappear
    return unless length $t->{twig_stored_spaces};
    my $current_gi= $current->gi;

    if( ! $t->{twig_discard_all_spaces})
      { if( ! defined( $t->{twig_space_policy}->{$current_gi}))
          { $t->{twig_space_policy}->{$current_gi}= _space_policy( $t, $current_gi); }
        if(    $t->{twig_space_policy}->{$current_gi} || ($t->{twig_stored_spaces}!~ m{\n}) || $t->{twig_preserve_space})
          { _insert_pcdata( $t, $t->{twig_stored_spaces} ); }
      }

    $t->{twig_stored_spaces}='';

    return;
  }

# the default twig handlers, which build the tree
sub _twig_start
   { # warn " in _twig_start...\n"; # DEBUG handler

    #foreach my $s (@_) { next if ref $s; warn "$s: ", is_utf8( $s) ? "has flag" : "FLAG NOT SET"; } # YYY

    my ($p, $gi, @att)= @_;
    my $t=$p->{twig};

    # empty the stored pcdata (space stored in case they are really part of
    # a pcdata element) or stored it if the space policy dictates so
    # create a pcdata element with the spaces if need be
    _add_or_discard_stored_spaces( $t);
    my $parent= $t->{twig_current};

    # if we were parsing PCDATA then we exit the pcdata
    if( $t->{twig_in_pcdata})
      { $t->{twig_in_pcdata}= 0;
        $t->_trigger_text_handler();

        $parent->del_twig_current;
        $parent= $parent->_parent;
      }

    # if we choose to keep the encoding then we need to parse the tag
    if( my $func = $t->{parse_start_tag})
      { ($gi, @att)= &$func($p->original_string); }
    elsif( $t->{twig_entities_in_attribute})

lib/XML/Twig.pm  view on Meta::CPAN

      }
    return $t;
  }

sub _twig_end
   { # warn " in _twig_end...\n"; # DEBUG handler
    my ($p, $gi)  = @_;

    my $t=$p->{twig};

    if( $t->{twig_in_pcdata} )
      { $t->_trigger_text_handler(); }

    if( $t->{twig_map_xmlns}) { $gi= $t->_replace_prefix( $gi); }

    _add_or_discard_stored_spaces( $t);

    # the new twig_current is the parent
    my $elt= $t->{twig_current};
    $elt->del_twig_current;

    # if we were parsing PCDATA then we exit the pcdata too
    if( $t->{twig_in_pcdata})
      {
        $t->{twig_in_pcdata}= 0;
        $elt= $elt->_parent if($elt->_parent);
        $elt->del_twig_current;
      }

    # parent is the new current element
    my $parent= $elt->_parent;
    $t->{twig_current}= $parent;

    if( $parent)
      { $parent->set_twig_current;

lib/XML/Twig.pm  view on Meta::CPAN

      { return $name; }
  }

sub _twig_char
   { # warn " in _twig_char...\n"; # DEBUG handler

    my ($p, $string)= @_;
    my $t=$p->{twig};

    if( $t->{twig_keep_encoding})
      { if( !$t->{twig_in_cdata})
          { $string= $p->original_string(); }
        else
          {
            use bytes;
            if( length( $string) < 1024)
              { $string= $p->original_string(); }
            else
              { #warn "dodgy case";
                # TODO original_string does not hold the entire string, but $string is wrong
                # I believe due to a bug in XML::Parser
                # for now, we use the original string, even if it means that it's been converted to utf8
              }
          }
      }

    if( $t->{twig_input_filter}) { $string= $t->{twig_input_filter}->( $string); }
    if( $t->{twig_char_handler}) { $string= $t->{twig_char_handler}->( $string); }

    my $elt= $t->{twig_current};

    if(    $t->{twig_in_cdata})
      { # text is the continuation of a previously created cdata
        $elt->append_cdata( $t->{twig_stored_spaces} . $string);
      }
    elsif( $t->{twig_in_pcdata})
      { # text is the continuation of a previously created pcdata
        if( $t->{extra_data})
          { $elt->_push_extra_data_in_pcdata( $t->{extra_data}, length( $elt->{pcdata}));
            $t->{extra_data}='';
          }
        $elt->append_pcdata( $string);
      }
    else
      {
        # text is just space, which might be discarded later
        if( $string=~/\A\s*\Z/s)
          {
            if( $t->{extra_data})
              { # we got extra data (comment, pi), lets add the spaces to it
                $t->{extra_data} .= $string;
              }
            else
              { # no extra data, just store the spaces
                $t->{twig_stored_spaces}.= $string;
              }
          }
        else
          { my $new_elt= _insert_pcdata( $t, $t->{twig_stored_spaces}.$string);
            $elt->del_twig_current;
            $new_elt->set_twig_current;
            $t->{twig_current}= $new_elt;
            $t->{twig_in_pcdata}=1;
            if( $t->{extra_data})
              { $new_elt->_push_extra_data_in_pcdata( $t->{extra_data}, 0);
                $t->{extra_data}='';
              }
          }
      }
    return;
  }

sub _twig_cdatastart
   { # warn " in _twig_cdatastart...\n"; # DEBUG handler

    my $p= shift;
    my $t=$p->{twig};

    $t->{twig_in_cdata}=1;
    my $cdata=  $t->{twig_elt_class}->new( $CDATA);
    my $twig_current= $t->{twig_current};

    if( $t->{twig_in_pcdata})
      { # create the node as a sibling of the PCDATA
        $cdata->set_prev_sibling( $twig_current);
        $twig_current->set_next_sibling( $cdata);
        my $parent= $twig_current->_parent;
        $cdata->set_parent( $parent);
        $parent->set_last_child( $cdata);
        $t->{twig_in_pcdata}=0;
      }
    else
      { # we have to create a PCDATA element if we need to store spaces
        if( $t->_space_policy($twig_current->gi) && $t->{twig_stored_spaces})
          { _insert_pcdata( $t, $t->{twig_stored_spaces}); }
        $t->{twig_stored_spaces}='';

        # create the node as a child of the current element
        $cdata->set_parent( $twig_current);
        if( my $prev_sibling= $twig_current->_last_child)
          { $cdata->set_prev_sibling( $prev_sibling);
            $prev_sibling->set_next_sibling( $cdata);
          }
        else
          { $twig_current->set_first_child( $cdata); }
        $twig_current->set_last_child( $cdata);
      }

    $twig_current->del_twig_current;
    $t->{twig_current}= $cdata;
    $cdata->set_twig_current;
    if( $t->{extra_data}) { $cdata->set_extra_data( $t->{extra_data}); $t->{extra_data}='' };
    return;
  }

sub _twig_cdataend
   { # warn " in _twig_cdataend...\n"; # DEBUG handler

    my $p= shift;
    my $t=$p->{twig};

    $t->{twig_in_cdata}=0;

    my $elt= $t->{twig_current};
    $elt->del_twig_current;
    my $cdata= $elt->cdata;
    $elt->_set_cdata( $cdata);

    push @{$t->{_twig_context_stack}}, { $ST_TAG => $CDATA };

    if( $t->{twig_handlers})
      { # look for handlers
        my @handlers= _handler( $t, $t->{twig_handlers}, $CDATA);
        local $_= $elt; # so we can use $_ in the handlers
        foreach my $handler ( @handlers) { $handler->($t, $elt) || last; }
      }

    pop @{$t->{_twig_context_stack}};

    $elt= $elt->_parent;
    $t->{twig_current}= $elt;
    $elt->set_twig_current;

    $t->{twig_long_cdata}=0;
    return;
  }

sub _pi_elt_handlers
  { my( $t, $pi)= @_;
    my $pi_handlers= $t->{twig_handlers}->{pi_handlers} || return;
    foreach my $handler ( $pi_handlers->{$pi->target}, $pi_handlers->{''})
      { if( $handler) { local $_= $pi; $handler->( $t, $pi) || last; } }
  }

lib/XML/Twig.pm  view on Meta::CPAN

        my $elt= $t->{twig_elt_class}->new( $type);
        $elt->$set( @parser_args);
        if( $t->{extra_data})
          { $elt->set_extra_data( $t->{extra_data});
            $t->{extra_data}='';
          }

        if( ! $t->root)
          { $t->_add_cpi_outside_of_root( leading_cpi => $elt);
          }
        elsif( $t->{twig_in_pcdata})
          { $t->_trigger_text_handler();
            # create the node as a sibling of the PCDATA
            $elt->paste_after( $twig_current);
            $t->{twig_in_pcdata}=0;
          }
        elsif( $twig_current)
          { # we have to create a PCDATA element if we need to store spaces
            if( $t->_space_policy($twig_current->gi) && $t->{twig_stored_spaces})
              { _insert_pcdata( $t, $t->{twig_stored_spaces}); }
            $t->{twig_stored_spaces}='';
            # create the node as a child of the current element
            $elt->paste_last_child( $twig_current);
          }
        else
          { $t->_add_cpi_outside_of_root( trailing_cpi => $elt); }

        if( $twig_current)
          { $twig_current->del_twig_current;
            my $parent= $elt->_parent;

lib/XML/Twig.pm  view on Meta::CPAN


    # tries to clean-up (probably not very well at the moment)
    #undef $p->{twig};
    undef $t->{twig_parser};
    delete $t->{twig_parsing};
    @{$t}{ qw( twig_parser twig_parsing _twig_context_stack twig_current) }=();

    return $t;
  }

sub _insert_pcdata
  { my( $t, $string)= @_;
    # create a new PCDATA element
    my $parent= $t->{twig_current};    # always defined
    my $elt;
    if( exists $t->{twig_alt_elt_class})
      { $elt=  $t->{twig_elt_class}->new( $PCDATA);
        $elt->_set_pcdata( $string);
      }
    else
      { $elt= bless( { gi => $XML::Twig::gi2index{$PCDATA}, pcdata => $string }, 'XML::Twig::Elt'); }

    my $prev_sibling= $parent->_last_child;
    if( $prev_sibling)
      { $prev_sibling->set_next_sibling( $elt);
        $elt->set_prev_sibling( $prev_sibling);
      }
    else
      { $parent->set_first_child( $elt); }

    $elt->set_parent( $parent);

lib/XML/Twig.pm  view on Meta::CPAN


    my $t= $p->{twig};

    # we need to process the data in 2 cases: entity, or spaces after the closing tag

    # after the closing tag (no twig_current and root has been created)
    if(  ! $t->{twig_current} && $t->{twig_root} && $string=~ m{^\s+$}m) { $t->{twig_stored_spaces} .= $string; }

    # process only if we have an entity
    if( $string=~ m{^&([^;]*);$})
      { # the entity has to be pure pcdata, or we have a problem
        if( ($p->original_string=~ m{^<}) && ($p->original_string=~ m{>$}) )
          { # string is a tag, entity is in an attribute
            $t->{twig_entities_in_attribute}=1 if( $t->{twig_do_not_escape_amp_in_atts});
          }
        else
          { my $ent;
            if( $t->{twig_keep_encoding})
              { _twig_char( $p, $string);
                $ent= substr( $string, 1, -1);
              }

lib/XML/Twig.pm  view on Meta::CPAN

  {
    my( $t, $string)=@_;

    my $twig_current= $t->{twig_current};

    my $ent=  $t->{twig_elt_class}->new( $ENT);
    $ent->set_ent( $string);

    _add_or_discard_stored_spaces( $t);

    if( $t->{twig_in_pcdata})
      { # create the node as a sibling of the #PCDATA

        $ent->set_prev_sibling( $twig_current);
        $twig_current->set_next_sibling( $ent);
        my $parent= $twig_current->_parent;
        $ent->set_parent( $parent);
        $parent->set_last_child( $ent);
        # the twig_current is now the parent
        $twig_current->del_twig_current;
        $t->{twig_current}= $parent;
        # we left pcdata
        $t->{twig_in_pcdata}=0;
      }
    else
      { # create the node as a child of the current element
        $ent->set_parent( $twig_current);
        if( my $prev_sibling= $twig_current->_last_child)
          { $ent->set_prev_sibling( $prev_sibling);
            $prev_sibling->set_next_sibling( $ent);
          }
        else
          { if( $twig_current) { $twig_current->set_first_child( $ent); } }

lib/XML/Twig.pm  view on Meta::CPAN

      }

    my $p=$t->{twig_parser};
    if( $t->{twig_keep_encoding})
      { $p->setHandlers( %twig_handlers_finish_print); }
    else
      { $p->setHandlers( %twig_handlers_finish_print_original); }
    return $t;
  }

sub set_remove_cdata { return XML::Twig::Elt::set_remove_cdata( @_); }

sub output_filter          { return XML::Twig::Elt::output_filter( @_);          }
sub set_output_filter      { return XML::Twig::Elt::set_output_filter( @_);      }

sub output_text_filter     { return XML::Twig::Elt::output_text_filter( @_);     }
sub set_output_text_filter { return XML::Twig::Elt::set_output_text_filter( @_); }

sub set_input_filter
  { my( $t, $input_filter)= @_;
    my $old_filter= $t->{twig_input_filter};

lib/XML/Twig.pm  view on Meta::CPAN


    # if a gi is passed then use it
    my $gi= shift;
    $elt->set_gi( $gi);

    my $atts= ref $_[0] eq 'HASH' ? shift : undef;

    if( $atts && defined $atts->{$CDATA})
      { delete $atts->{$CDATA};

        my $cdata= $class->new( $CDATA => @_);
        return $class->new( $gi, $atts, $cdata);
      }

    if( $gi eq $PCDATA)
      { if( grep { ref $_ } @_) { croak "element $PCDATA can only be created from text"; }
        $elt->_set_pcdata( join '', @_);
      }
    elsif( $gi eq $ENT)
      { $elt->set_ent( shift); }
    elsif( $gi eq $CDATA)
      { if( grep { ref $_ } @_) { croak "element $CDATA can only be created from text"; }
        $elt->_set_cdata( join '', @_);
      }
    elsif( $gi eq $COMMENT)
      { if( grep { ref $_ } @_) { croak "element $COMMENT can only be created from text"; }
        $elt->_set_comment( join '', @_);
      }
    elsif( $gi eq $PI)
      { if( grep { ref $_ } @_) { croak "element $PI can only be created from text"; }
        $elt->_set_pi( shift, join '', @_);
      }
    else

lib/XML/Twig.pm  view on Meta::CPAN

        if( defined $atts->{$ASIS})  { $elt->set_asis(  $atts->{$ASIS} ); delete $atts->{$ASIS};  }
        if( defined $atts->{$EMPTY}) { $elt->set_empty( $atts->{$EMPTY}); delete $atts->{$EMPTY}; }
        if( keys %$atts) { $elt->set_atts( $atts); }
        $elt->_set_id( $atts->{$ID}) if( $atts->{$ID});
      }

    return $elt;
  }

# optimized version of $elt->new( PCDATA, $text);
sub _new_pcdata
  { my $class= $_[0];
    $class= ref $class || $class;
    my $elt  = {};
    bless $elt, $class;
    $elt->set_gi( $PCDATA);
    $elt->_set_pcdata( $_[1]);
    return $elt;
  }

# this function creates an XM:::Twig::Elt from a string
# it is quite clumsy at the moment, as it just creates a
# new twig then returns its root
# there might also be memory leaks there
# additional arguments are passed to new XML::Twig
sub parse
  { my $class= shift;

lib/XML/Twig.pm  view on Meta::CPAN

  }

# return the gi if it's a "real" element, 0 otherwise
sub is_elt
  { if(  $_[0]->{gi} >=  $XML::Twig::SPECIAL_GI)
     { return $_[0]->gi; }
    else
      { return 0; }
  }

sub is_pcdata
  { my $elt= shift;
    return (exists $elt->{'pcdata'});
  }

sub is_cdata
  { my $elt= shift;
    return (exists $elt->{'cdata'});
  }

sub is_pi
  { my $elt= shift;
    return (exists $elt->{'target'});
  }

sub is_comment
  { my $elt= shift;
    return (exists $elt->{'comment'});
  }

sub is_ent
  { my $elt= shift;
    return (exists $elt->{ent} || $elt->{ent_name});
  }

sub is_text
  { my $elt= shift;
    return (exists( $elt->{'pcdata'}) || (exists $elt->{'cdata'}));
  }

sub is_empty
  { return $_[0]->{empty} || 0; }

sub set_empty
  { $_[0]->{empty}= defined( $_[1]) ? $_[1] : 1; return $_[0]; }

sub set_not_empty
  { delete $_[0]->{empty} if( $_[0]->is_empty); return $_[0]; }

sub set_asis
  { my $elt=shift;

    foreach my $descendant ($elt, $elt->_descendants )
      { $descendant->{asis}= 1;
        if( $descendant->is_cdata)
          { $descendant->set_gi( $PCDATA);
            $descendant->_set_pcdata( $descendant->cdata);
          }

      }
    return $elt;
  }

sub set_not_asis
  { my $elt=shift;
    foreach my $descendant ($elt, $elt->descendants)
      { delete $descendant->{asis} if $descendant->{asis};}

lib/XML/Twig.pm  view on Meta::CPAN

  { return $_[0]->{asis}; }

sub closed
  { my $elt= shift;
    my $t= $elt->twig || return;
    my $curr_elt= $t->{twig_current};
    return 1 unless( $curr_elt);
    return $curr_elt->in( $elt);
  }

sub set_pcdata
  { my( $elt, $pcdata)= @_;

    if( $elt->_extra_data_in_pcdata)
      { _try_moving_extra_data( $elt, $pcdata);
      }
    $elt->{pcdata}= $pcdata;
    return $elt;
  }

sub _extra_data_in_pcdata      { return $_[0]->{extra_data_in_pcdata}; }
sub _set_extra_data_in_pcdata  { $_[0]->{extra_data_in_pcdata}= $_[1]; return $_[0]; }
sub _del_extra_data_in_pcdata  { delete $_[0]->{extra_data_in_pcdata}; return $_[0]; }
sub _unshift_extra_data_in_pcdata
    { my $e= shift;
      $e->{extra_data_in_pcdata}||=[];
      unshift @{$e->{extra_data_in_pcdata}}, { text => shift(), offset => shift() };
    }
sub _push_extra_data_in_pcdata
  { my $e= shift;
    $e->{extra_data_in_pcdata}||=[];
    push @{$e->{extra_data_in_pcdata}}, { text => shift(), offset => shift() };
  }

sub _extra_data_before_end_tag     { return $_[0]->{extra_data_before_end_tag} || ''; }
sub _set_extra_data_before_end_tag { $_[0]->{extra_data_before_end_tag}= $_[1]; return $_[0]}
sub _del_extra_data_before_end_tag { delete $_[0]->{extra_data_before_end_tag}; return $_[0]}
sub _prefix_extra_data_before_end_tag
  { my( $elt, $data)= @_;
    if($elt->{extra_data_before_end_tag})
      { $elt->{extra_data_before_end_tag}= $data . $elt->{extra_data_before_end_tag}; }
    else
      { $elt->{extra_data_before_end_tag}= $data; }
    return $elt;
  }

# internal, in cases where we know there is no extra_data (inlined anyway!)
sub _set_pcdata { $_[0]->{pcdata}= $_[1]; }

# try to figure out if we can keep the extra_data around
sub _try_moving_extra_data
  { my( $elt, $modified)=@_;
    my $initial= $elt->{pcdata};
    my $cpis= $elt->_extra_data_in_pcdata;

    if( (my $offset= index( $modified, $initial)) != -1)
      { # text has been added
        foreach (@$cpis) { $_->{offset}+= $offset; }
      }
    elsif( ($offset= index( $initial, $modified)) != -1)
      { # text has been cut
        my $len= length( $modified);
        foreach my $cpi (@$cpis) { $cpi->{offset} -= $offset; }
        $elt->_set_extra_data_in_pcdata( [ grep { $_->{offset} >= 0 && $_->{offset} < $len } @$cpis ]);
      }
    else
      {    _match_extra_data_words( $elt, $initial, $modified)
        || _match_extra_data_chars( $elt, $initial, $modified)
        || $elt->_del_extra_data_in_pcdata;
      }
  }

sub _match_extra_data_words
  { my( $elt, $initial, $modified)= @_;
    my @initial= split /\b/, $initial;
    my @modified= split /\b/, $modified;

    return _match_extra_data( $elt, length( $initial), \@initial, \@modified);
  }

lib/XML/Twig.pm  view on Meta::CPAN

  { my( $elt, $initial, $modified)= @_;
    my @initial= split //, $initial;
    my @modified= split //, $modified;

    return _match_extra_data( $elt, length( $initial), \@initial, \@modified);
  }

sub _match_extra_data
  { my( $elt, $length, $initial, $modified)= @_;

    my $cpis= $elt->_extra_data_in_pcdata;

    if( @$initial <= @$modified)
      {
        my( $ok, $positions, $offsets)= _pos_offset( $initial, $modified);
        if( $ok)
          { my $offset=0;
            my $pos= shift @$positions;
            foreach my $cpi (@$cpis)
              { while( $cpi->{offset} >= $pos)
                  { $offset= shift @$offsets;

lib/XML/Twig.pm  view on Meta::CPAN


            foreach my $cpi (@$cpis)
              { while( $cpi->{offset} >= $pos)
                  { $offset= shift @$offsets;
                    $prev_pos= $pos;
                    $pos= shift @$positions || $length +1;
                  }
                $cpi->{offset} -= $offset;
                if( $cpi->{offset} < $prev_pos) { delete $cpi->{text}; }
              }
            $elt->_set_extra_data_in_pcdata( [ grep { exists $_->{text} } @$cpis ]);
            return 1;
          }
      }
    return 0;
  }

sub _pos_offset
  { my( $short, $long)= @_;
    my( @pos, @offset);
    my( $s_length, $l_length)=(0,0);

lib/XML/Twig.pm  view on Meta::CPAN

            push @pos, $s_length;
            push @offset, $l_length - $s_length;
          }
        my $length= length( $s_word);
        $s_length += $length;
        $l_length += $length;
      }
    return( 1, \@pos, \@offset);
  }

sub append_pcdata
  { $_[0]->{'pcdata'}.= $_[1];
    return $_[0];
  }

sub pcdata        { return $_[0]->{pcdata}; }

sub append_extra_data
  {  $_[0]->{extra_data}.= $_[1];
     return $_[0];
  }

sub set_extra_data
  { $_[0]->{extra_data}= $_[1];
    return $_[0];
  }

lib/XML/Twig.pm  view on Meta::CPAN

    $c=~ s{^-}{ -};
    $c=~ s{-$}{- };
    $c=~ s{--}{- -}g;
    return $c;
  }

sub set_ent  { $_[0]->{ent}= $_[1]; return $_[0]; }
sub ent      { return $_[0]->{ent}; }
sub ent_name { return substr( $_[0]->ent, 1, -1);}

sub set_cdata
  { my $elt= shift;
    unless( $elt->{gi} == $XML::Twig::gi2index{$CDATA})
      { $elt->cut_children;
        $elt->insert_new_elt( first_child => $CDATA, @_);
        return $elt;
      }
    $elt->_set_cdata( $_[0]);
    return $_[0];
  }

sub _set_cdata
  { $_[0]->{cdata}= $_[1];
    return $_[0];
  }

sub append_cdata
  { $_[0]->{cdata}.= $_[1];
    return $_[0];
  }
sub cdata { return $_[0]->{cdata}; }

sub contains_only_text
  { my $elt= shift;
    return 0 unless $elt->is_elt;
    foreach my $child ($elt->_children)
      { return 0 if $child->is_elt; }
    return $elt;
  }

sub contains_only

lib/XML/Twig.pm  view on Meta::CPAN

sub _move_extra_data_after_erase
  { my( $elt)= @_;
    # extra_data
    if( my $extra_data= $elt->{extra_data})
      { my $target= $elt->_first_child || $elt->_next_sibling;
        if( $target)
          {
            if( $target->is( $ELT))
              { $target->set_extra_data( $extra_data . ($target->extra_data || '')); }
            elsif( $target->is( $TEXT))
              { $target->_unshift_extra_data_in_pcdata( $extra_data, 0); }  # TO CHECK
          }
        else
          { my $parent= $elt->parent; # always exists or the erase cannot be performed
            $parent->_prefix_extra_data_before_end_tag( $extra_data);
          }
      }

     # extra_data_before_end_tag
    if( my $extra_data= $elt->_extra_data_before_end_tag)
      { if( my $target= $elt->_next_sibling)
          { if( $target->is( $ELT))
              { $target->set_extra_data( $extra_data . ($target->extra_data || '')); }
            elsif( $target->is( $TEXT))
              {
                $target->_unshift_extra_data_in_pcdata( $extra_data, 0);
             }
          }
        elsif( my $parent= $elt->parent)
          { $parent->_prefix_extra_data_before_end_tag( $extra_data); }
       }

    return $elt;

  }
BEGIN

lib/XML/Twig.pm  view on Meta::CPAN

       }
  }

# split a text element at a given offset
sub split_at
  { my( $elt, $offset)= @_;
    my $text_elt= $elt->is_text ? $elt : $elt->first_child( $TEXT) || return '';
    my $string= $text_elt->text;
    my $left_string= substr( $string, 0, $offset);
    my $right_string= substr( $string, $offset);
    $text_elt->set_pcdata( $left_string);
    my $new_elt= $elt->new( $elt->gi, $right_string);
    $new_elt->paste( after => $elt);
    return $new_elt;
  }

# split an element or its text descendants into several, in place
# all elements (new and untouched) are returned
sub split
  { my $elt= shift;
    my @text_chunks;

lib/XML/Twig.pm  view on Meta::CPAN

  { my( $e1, $e2)= @_;
    croak "invalid merge: can only merge 2 elements"
        unless( isa( $e2, 'XML::Twig::Elt'));
    croak "invalid merge: can only merge 2 text elements"
        unless( $e1->is_text && $e2->is_text && ($e1->gi eq $e2->gi));

    my $t1_length= length( $e1->text);

    $e1->set_text( $e1->text . $e2->text);

    if( my $extra_data_in_pcdata= $e2->_extra_data_in_pcdata)
      { foreach my $data (@$extra_data_in_pcdata) { $e1->_push_extra_data_in_pcdata( $data->{text}, $data->{offset} + $t1_length); } }

    $e2->delete;

    return $e1;
  }

sub merge
  { my( $e1, $e2)= @_;
    my @e2_children= $e2->_children;
    if(     $e1->_last_child && $e1->_last_child->is_pcdata
        &&  @e2_children && $e2_children[0]->is_pcdata
      )
      { my $t1_length= length( $e1->_last_child->{pcdata});
        my $child1= $e1->_last_child;
        my $child2= shift @e2_children;
        $child1->{pcdata} .= $child2->{pcdata};

        my $extra_data= $e1->_extra_data_before_end_tag . $e2->extra_data;

        if( $extra_data)
          { $e1->_del_extra_data_before_end_tag;
            $child1->_push_extra_data_in_pcdata( $extra_data, $t1_length);
          }

        if( my $extra_data_in_pcdata= $child2->_extra_data_in_pcdata)
          { foreach my $data (@$extra_data_in_pcdata) { $child1->_push_extra_data_in_pcdata( $data->{text}, $data->{offset} + $t1_length); } }

        if( my $extra_data_before_end_tag= $e2->_extra_data_before_end_tag)
          { $e1->_set_extra_data_before_end_tag( $extra_data_before_end_tag); }
      }

    foreach my $e (@e2_children) { $e->move( last_child => $e1); }

    $e2->delete;
    return $e1;
  }

lib/XML/Twig.pm  view on Meta::CPAN

# recursively copy an element and returns the copy (can be huge and long)
sub copy
  { my $elt= shift;
    my $copy= $elt->new( $elt->gi);

    if( $elt->extra_data) { $copy->set_extra_data( $elt->extra_data); }
    if( $elt->_extra_data_before_end_tag) { $copy->_set_extra_data_before_end_tag( $elt->_extra_data_before_end_tag); }

    if( $elt->is_asis)   { $copy->set_asis; }

    if( $elt->is_pcdata)
      { $copy->set_pcdata( $elt->pcdata);
        if( $elt->_extra_data_in_pcdata) { $copy->_set_extra_data_in_pcdata( $elt->_extra_data_in_pcdata); }
      }
    elsif( $elt->is_cdata)
      { $copy->_set_cdata( $elt->cdata);
        if( $elt->_extra_data_in_pcdata) { $copy->_set_extra_data_in_pcdata( $elt->_extra_data_in_pcdata); }
      }
    elsif( $elt->is_pi)
      { $copy->_set_pi( $elt->target, $elt->data); }
    elsif( $elt->is_comment)
      { $copy->_set_comment( $elt->comment); }
    elsif( $elt->is_ent)
      { $copy->set_ent( $elt->ent); }
    else
      { my @children= $elt->_children;
        if( my $atts= $elt->atts)

lib/XML/Twig.pm  view on Meta::CPAN

  { my $elt= shift;
    my $t= $elt->twig;
    $t->ignore( $elt, @_);
  }

BEGIN {
  my $pretty                    = 0;
  my $quote                     = '"';
  my $INDENT                    = '  ';
  my $empty_tag_style           = 0;
  my $remove_cdata              = 0;
  my $keep_encoding             = 0;
  my $expand_external_entities  = 0;
  my $keep_atts_order           = 0;
  my $do_not_escape_amp_in_atts = 0;
  my $WRAP                      = '80';
  my $REPLACED_ENTS             = qq{&<};

  my ($NSGMLS, $NICE, $INDENTED, $INDENTEDCT, $INDENTEDC, $WRAPPED, $RECORD1, $RECORD2, $INDENTEDA)= (1..9);
  my %KEEP_TEXT_TAG_ON_ONE_LINE= map { $_ => 1 } ( $INDENTED, $INDENTEDCT, $INDENTEDC, $INDENTEDA, $WRAPPED);
  my %WRAPPED =  map { $_ => 1 } ( $WRAPPED, $INDENTEDA, $INDENTEDC);

lib/XML/Twig.pm  view on Meta::CPAN


  my %quote_style=
    ( double  => '"',
      single  => "'",
      # smart  => "smart",
    );

  my $xml_space_preserve; # set when an element includes xml:space="preserve"

  my $output_filter;      # filters the entire output (including < and >)
  my $output_text_filter; # filters only the text part (tag names, attributes, pcdata)

  my $replaced_ents= $REPLACED_ENTS;

  # returns those pesky "global" variables so you can switch between twigs
  sub global_state ## no critic (Subroutines::ProhibitNestedSubs);
    { return
       { pretty                    => $pretty,
         quote                     => $quote,
         indent                    => $INDENT,
         empty_tag_style           => $empty_tag_style,
         remove_cdata              => $remove_cdata,
         keep_encoding             => $keep_encoding,
         expand_external_entities  => $expand_external_entities,
         output_filter             => $output_filter,
         output_text_filter        => $output_text_filter,
         keep_atts_order           => $keep_atts_order,
         do_not_escape_amp_in_atts => $do_not_escape_amp_in_atts,
         wrap                      => $WRAP,
         replaced_ents             => $replaced_ents,
        };
    }

  # restores the global variables
  sub set_global_state
    { my $state= shift;
      $pretty                    = $state->{pretty};
      $quote                     = $state->{quote};
      $INDENT                    = $state->{indent};
      $empty_tag_style           = $state->{empty_tag_style};
      $remove_cdata              = $state->{remove_cdata};
      $keep_encoding             = $state->{keep_encoding};
      $expand_external_entities  = $state->{expand_external_entities};
      $output_filter             = $state->{output_filter};
      $output_text_filter        = $state->{output_text_filter};
      $keep_atts_order           = $state->{keep_atts_order};
      $do_not_escape_amp_in_atts = $state->{do_not_escape_amp_in_atts};
      $WRAP                      = $state->{wrap};
      $replaced_ents             = $state->{replaced_ents},
    }

  # sets global state to defaults
  sub init_global_state
    { set_global_state(
       { pretty                    => 0,
         quote                     => '"',
         indent                    => $INDENT,
         empty_tag_style           => 0,
         remove_cdata              => 0,
         keep_encoding             => 0,
         expand_external_entities  => 0,
         output_filter             => undef,
         output_text_filter        => undef,
         keep_atts_order           => undef,
         do_not_escape_amp_in_atts => 0,
         wrap                      => $WRAP,
         replaced_ents             => $REPLACED_ENTS,
        });
    }

lib/XML/Twig.pm  view on Meta::CPAN

    { return (sort { $pretty_print_style{$a} <=> $pretty_print_style{$b} || $a cmp $b } keys %pretty_print_style); }

  sub set_quote
    { my $style= $_[1] || $_[0];
      my $old_quote= $quote;
      croak "invalid quote '$style'" unless( exists $quote_style{$style});
      $quote= $quote_style{$style};
      return $old_quote;
    }

  sub set_remove_cdata
    { my $new_value= defined $_[1] ? $_[1] : $_[0];
      my $old_value= $remove_cdata;
      $remove_cdata= $new_value;
      return $old_value;
    }

  sub set_indent
    { my $new_value= defined $_[1] ? $_[1] : $_[0];
      my $old_value= $INDENT;
      $INDENT= $new_value;
      return $old_value;
    }

lib/XML/Twig.pm  view on Meta::CPAN

            { print $elt->end_tag;
              $elt->{end_tag_flushed}=1;
              $elt->_set_flushed;
            }
          $xml_space_preserve-- if $preserve;
          # used for pretty printing
          if( my $parent= $elt->parent) { $parent->{has_flushed_child}= 1; }
        }
      else # text or special element
        { my $text;
          if( $elt->is_pcdata)     { $text= $elt->pcdata_xml_string;
                                     if( my $parent= $elt->parent)
                                       { $parent->{contains_text}= 1; }
                                   }
          elsif( $elt->is_cdata)   { $text= $elt->cdata_string;
                                     if( my $parent= $elt->parent)
                                       { $parent->{contains_text}= 1; }
                                   }
          elsif( $elt->is_pi)      { $text= $elt->pi_string;          }
          elsif( $elt->is_comment) { $text= $elt->comment_string;     }
          elsif( $elt->is_ent)     { $text= $elt->ent_string;         }

          print $output_filter ? $output_filter->( $text) : $text;
        }
    }

lib/XML/Twig.pm  view on Meta::CPAN


      my $string='';

      if( ($elt->{gi} >= $XML::Twig::SPECIAL_GI) )
        { # sprint the children
          my $child= $elt->_first_child || '';
          while( $child)
            { $string.= $child->xml_text;
            } continue { $child= $child->_next_sibling; }
        }
      elsif( $elt->is_pcdata)  { $string .= $output_filter ?  $output_filter->($elt->pcdata_xml_string)
                                                           : $elt->pcdata_xml_string;
                               }
      elsif( $elt->is_cdata)   { $string .= $output_filter ?  $output_filter->($elt->cdata_string)
                                                           : $elt->cdata_string;
                               }
      elsif( $elt->is_ent)     { $string .= $elt->ent_string; }

      return $string;
    }

  sub xml_text_only
    { return join '', map { $_->xml_text if( $_->is_text || $_->is_ent) } $_[0]->_children; }

  # same as print but except... it does not print but rather returns the string

lib/XML/Twig.pm  view on Meta::CPAN

          my $child= $elt->_first_child;
          while( $child)
            { $child->_sprint;
              $child= $child->_next_sibling;
            }
          push @sprint, $elt->end_tag unless( $no_tag);
          $xml_space_preserve-- if $preserve;
        }
      else
        { push @sprint, $elt->{extra_data} if( $elt->{extra_data}) ;
          if(    $elt->is_pcdata)  { push @sprint, $elt->pcdata_xml_string; }
          elsif( $elt->is_cdata)   { push @sprint, $elt->cdata_string;      }
          elsif( $elt->is_pi)      { if( ($pretty >= $INDENTED) && !$elt->parent->{contains_text}) { push @sprint, "\n" . $INDENT x $elt->level; }
                                     push @sprint, $elt->pi_string;
                                   }
          elsif( $elt->is_comment) { if( ($pretty >= $INDENTED) && !$elt->parent->{contains_text}) { push @sprint, "\n" . $INDENT x $elt->level; }
                                     push @sprint, $elt->comment_string;
                                   }
          elsif( $elt->is_ent)     { push @sprint, $elt->ent_string;        }
        }

      return;
    }

  # just a shortcut to $elt->sprint( 1)
  sub xml_string
    { my $elt= shift;
      isa( $_[0], 'HASH') ?  $elt->sprint( shift(), 1) : $elt->sprint( 1);
    }

  sub pcdata_xml_string
    { my $elt= shift;
      if( defined( my $string= $elt->{pcdata}) )
        {
          if( ! $elt->_extra_data_in_pcdata)
            {
              $string=~ s/([$replaced_ents])/$XML::Twig::base_ent{$1}/g unless( !$replaced_ents || $keep_encoding || $elt->{asis});
              $string=~ s{\Q]]>}{]]&gt;}g;
            }
          else
            { _gen_mark( $string); # used by _(un)?protect_extra_data
              foreach my $data (reverse @{$elt->_extra_data_in_pcdata})
                { my $substr= substr( $string, $data->{offset});
                  if( $keep_encoding || $elt->{asis})
                    { substr( $string, $data->{offset}, 0, $data->{text}); }
                  else
                    { substr( $string, $data->{offset}, 0, _protect_extra_data( $data->{text})); }
                }
              unless( $keep_encoding || $elt->{asis})
                {
                  $string=~ s{([$replaced_ents])}{$XML::Twig::base_ent{$1}}g ;
                  $string=~ s{\Q]]>}{]]&gt;}g;

lib/XML/Twig.pm  view on Meta::CPAN

      { my( $extra_data)= @_;
        $extra_data=~ s{([<&>])}{:$mark:$char2ent{$1}:}g;
        return $extra_data;
      }

    sub _unprotect_extra_data
      { $_[0]=~ s{:$mark:(\w+):}{$ent2char{$1}}g; }

  }

  sub cdata_string
    { my $cdata= $_[0]->cdata;
      unless( defined $cdata) { return ''; }
      if( $remove_cdata)
        { $cdata=~ s/([$replaced_ents])/$XML::Twig::base_ent{$1}/g; }
      else
        { # if the CDATA includes the end of CDATA marker, we need to split it
          $cdata=~ s{$CDATA_END}{]]$CDATA_END$CDATA_START>}g;
          $cdata= $CDATA_START . $cdata . $CDATA_END;
        }
      return $cdata;
   }

  sub att_xml_string
    { my $elt= shift;
      my $att= shift;

      my $replace= $replaced_ents . "$quote\n\r\t";
      if($_[0] && $_[0]->{escape_gt} && ($replace!~ m{>}) ) { $replace .='>'; }

      if( defined (my $string= $elt->{att}->{$att}))

lib/XML/Twig.pm  view on Meta::CPAN


  # returns just the text, no tags, for an element
  sub text
    { my( $elt, @options)= @_;

      if( @options && grep { lc( $_) eq 'no_recurse' } @options) { return $elt->text_only; }
      my $sep = (@options && grep { lc( $_) eq 'sep' } @options) ? ' ' : '';

      my $string;

      if( $elt->is_pcdata)     { return  $elt->pcdata    . $sep;  }
      elsif( $elt->is_cdata)   { return  $elt->cdata     . $sep;  }
      elsif( $elt->is_pi)      { return  $elt->pi_string . $sep;  }
      elsif( $elt->is_comment) { return  $elt->comment   . $sep;  }
      elsif( $elt->is_ent)     { return  $elt->ent       . $sep ; }

      my $child= $elt->_first_child ||'';
      while( $child)
        {
          my $child_text= $child->text( @options);
          $string.= defined( $child_text) ? $sep . $child_text : '';
        } continue { $child= $child->_next_sibling; }

lib/XML/Twig.pm  view on Meta::CPAN

    { my $elt= shift;
      my $text= $elt->text( @_);
      $text=~ s{\s+}{ }sg;
      $text=~ s{^\s}{};
      $text=~ s{\s$}{};
      return $text;
    }

  sub trim
    { my( $elt)= @_;
      my $pcdata= $elt->first_descendant( $TEXT);
      (my $pcdata_text= $pcdata->text)=~ s{^\s+}{}s;
      $pcdata->set_text( $pcdata_text);
      $pcdata= $elt->last_descendant( $TEXT);
      ($pcdata_text= $pcdata->text)=~ s{\s+$}{};
      $pcdata->set_text( $pcdata_text);
      foreach my $pcdata ($elt->descendants( $TEXT))
        { ($pcdata_text= $pcdata->text)=~ s{\s+}{ }g;
          $pcdata->set_text( $pcdata_text);
        }
      return $elt;
    }

  # remove cdata sections (turns them into regular pcdata) in an element
  sub remove_cdata
    { my $elt= shift;
      foreach my $cdata ($elt->descendants_or_self( $CDATA))
        { if( $keep_encoding)
            { my $data= $cdata->cdata;
              $data=~ s{([&<"'])}{$XML::Twig::base_ent{$1}}g;
              $cdata->set_pcdata( $data);
            }
          else
            { $cdata->set_pcdata( $cdata->cdata); }
          $cdata->set_gi( $PCDATA);
          undef $cdata->{cdata};
        }
    }

sub _is_private      { return _is_private_name( $_[0]->gi); }
sub _is_private_name { return $_[0]=~ m{^#(?!default:)};                }

} # end of block containing package globals ($pretty_print, $quotes, keep_encoding...)

# merges consecutive #PCDATAs in am element
sub normalize
  { my( $elt)= @_;
    my @descendants= $elt->descendants( $PCDATA);
    while( my $desc= shift @descendants)
      { if( ! length $desc->{pcdata}) { $desc->delete; next; }
        while( @descendants && $desc->_next_sibling && $desc->_next_sibling== $descendants[0])
          { my $to_merge= shift @descendants;
            $desc->merge_text( $to_merge);
          }
      }
    return $elt;
  }

# SAX export methods
sub toSAX1

lib/XML/Twig.pm  view on Meta::CPAN

          { unless( $elt->_flushed) { $start_element->( $handler, $data); } }

        foreach my $child ($elt->_children)
          { $child->_toSAX( $handler, $start_tag_data, $end_tag_data); }

        if( (my $data= $end_tag_data->( $elt)) && (my $end_element = $handler->can( 'end_element')) )
          { $end_element->( $handler, $data); }
        _end_prefix_mapping( $elt, $handler);
      }
    else # text or special element
      { if( $elt->is_pcdata && (my $characters= $handler->can( 'characters')))
          { $characters->( $handler, { Data => $elt->pcdata });  }
        elsif( $elt->is_cdata)
          { if( my $start_cdata= $handler->can( 'start_cdata'))
              { $start_cdata->( $handler); }
            if( my $characters= $handler->can( 'characters'))
              { $characters->( $handler, {Data => $elt->cdata });  }
            if( my $end_cdata= $handler->can( 'end_cdata'))
              { $end_cdata->( $handler); }
          }
        elsif( ($elt->is_pi)  && (my $pi= $handler->can( 'processing_instruction')))
          { $pi->( $handler, { Target =>$elt->target, Data => $elt->data });  }
        elsif( ($elt->is_comment)  && (my $comment= $handler->can( 'comment')))
          { $comment->( $handler, { Data => $elt->comment });  }
        elsif( ($elt->is_ent))
          {
            if( my $se=   $handler->can( 'skipped_entity'))
              { $se->( $handler, { Name => $elt->ent_name });  }
            elsif( my $characters= $handler->can( 'characters'))

lib/XML/Twig.pm  view on Meta::CPAN

sub contains_text
  { my $elt= shift;
    my $child= $elt->_first_child;
    while ($child)
      { return 1 if( $child->is_text || $child->is_ent);
        $child= $child->_next_sibling;
      }
    return 0;
  }

# creates a single pcdata element containing the text as child of the element
# options:
#   - force_pcdata: when set to a true value forces the text to be in a #PCDATA
#                   even if the original element was a #CDATA
sub set_text
  { my( $elt, $string, %option)= @_;

    if( $elt->gi eq $PCDATA)
      { return $elt->set_pcdata( $string); }
    elsif( $elt->gi eq $CDATA)
      { if( $option{force_pcdata})
          { $elt->set_gi( $PCDATA);
            $elt->_set_cdata('');
            return $elt->set_pcdata( $string);
          }
        else
          { $elt->_set_cdata( $string);
            return $string;
          }
      }
    elsif( $elt->contains_a_single( $PCDATA) )
      { # optimized so we have a slight chance of not losing embedded comments and pi's
        $elt->_first_child->set_pcdata( $string);
        return $elt;
      }

    foreach my $child (@{[$elt->_children]})
      { $child->delete; }

    my $pcdata= $elt->_new_pcdata( $string);
    $pcdata->paste( $elt);

    $elt->set_not_empty;

    return $elt;
  }

# set the content of an element from a list of strings and elements
sub set_content
  { my $elt= shift;

lib/XML/Twig.pm  view on Meta::CPAN

      { $elt->set_empty; return $elt; }

    # case where we really want to do a set_text, the element is '#PCDATA'
    # or contains a single PCDATA and we only want to add text in it
    if( ($elt->gi eq $PCDATA || $elt->contains_a_single( $PCDATA))
        && (@_ == 1) && !( ref $_[0]))
      { $elt->set_text( $_[0]);
        return $elt;
      }
    elsif( ($elt->gi eq $CDATA) && (@_ == 1) && !( ref $_[0]))
      { $elt->_set_cdata( $_[0]);
        return $elt;
      }

    # delete the children
    foreach my $child (@{[$elt->_children]})
      { $child->delete; }

    if( @_) { $elt->set_not_empty; }

    foreach my $child (@_)
      { if( ref( $child) && isa( $child, 'XML::Twig::Elt'))
          { # argument is an element
            $child->paste( 'last_child', $elt);
          }
        else
          { # argument is a string
            if( (my $pcdata= $elt->_last_child) && $elt->_last_child->is_pcdata)
              { # previous child is also pcdata: just concatenate
                $pcdata->set_pcdata( $pcdata->pcdata . $child)
              }
            else
              { # previous child is not a string: create a new pcdata element
                $pcdata= $elt->_new_pcdata( $child);
                $pcdata->paste( 'last_child', $elt);
              }
          }
      }

    return $elt;
  }

# inserts an element (whose gi is given) as child of the element
# all children of the element are now children of the new element
# returns the new element

lib/XML/Twig.pm  view on Meta::CPAN

  }

# move an element, same syntax as paste, except the element is first cut
sub move
  { my $elt= shift;
    $elt->cut;
    $elt->paste( @_);
    return $elt;
  }

# adds a prefix to an element, creating a pcdata child if needed
sub prefix
  { my ($elt, $prefix, $option)= @_;
    my $asis= ($option && ($option eq 'asis')) ? 1 : 0;
    if( $elt->is_pcdata
        && (($asis && $elt->{asis}) || (!$asis && ! $elt->{asis}))
      )
      { $elt->set_pcdata( $prefix . $elt->pcdata); }
    elsif( $elt->_first_child && $elt->_first_child->is_pcdata
        && (   ($asis && $elt->_first_child->{asis})
            || (!$asis && ! $elt->_first_child->{asis}))
         )
      {
        $elt->_first_child->set_pcdata( $prefix . $elt->_first_child->pcdata);
      }
    else
      { my $new_elt= $elt->_new_pcdata( $prefix);
        my $pos= $elt->is_pcdata ? 'before' : 'first_child';
        $new_elt->paste( $pos => $elt);
        if( $asis) { $new_elt->set_asis; }
      }
    return $elt;
  }

# adds a suffix to an element, creating a pcdata child if needed
sub suffix
  { my ($elt, $suffix, $option)= @_;
    my $asis= ($option && ($option eq 'asis')) ? 1 : 0;
    if( $elt->is_pcdata
        && (($asis && $elt->{asis}) || (!$asis && ! $elt->{asis}))
      )
      { $elt->set_pcdata( $elt->pcdata . $suffix); }
    elsif( $elt->_last_child && $elt->_last_child->is_pcdata
        && (   ($asis && $elt->_last_child->{asis})
            || (!$asis && ! $elt->_last_child->{asis}))
         )
      { $elt->_last_child->set_pcdata( $elt->_last_child->pcdata . $suffix); }
    else
      { my $new_elt= $elt->_new_pcdata( $suffix);
        my $pos= $elt->is_pcdata ? 'after' : 'last_child';
        $new_elt->paste( $pos => $elt);
        if( $asis) { $new_elt->set_asis; }
      }
    return $elt;
  }

# create a path to an element ('/root/.../gi)
sub path
  { my $elt= shift;
    my @context= ( $elt, $elt->ancestors);

lib/XML/Twig.pm  view on Meta::CPAN

          { $dump .= ' ' . join( ' ', map { qq{$_="} . $elt->att( $_) . qq{"} } @atts); }

        $dump .= "\n";
        if( $extra) { $dump .= $elt->_dump_extra_data( $indent, $indent_sp, $short_text); }

        if( exists $option->{depth}) { $option->{depth}--; }
        $dump .= join( "", map { $_->_dump( $option) } $elt->_children) unless exists $option->{depth} && $option->{depth} <= 0;
      }
    else
      {
        if( $elt->is_pcdata)
          { $dump .= "$indent|-PCDATA:  '"  . _short_text( $elt->pcdata, $short_text) . "'\n" }
        elsif( $elt->is_ent)
          { $dump .= "$indent|-ENTITY:  '" . _short_text( $elt->ent, $short_text) . "'\n" }
        elsif( $elt->is_cdata)
          { $dump .= "$indent|-CDATA:   '" . _short_text( $elt->cdata, $short_text) . "'\n" }
        elsif( $elt->is_comment)
          { $dump .= "$indent|-COMMENT: '" . _short_text( $elt->comment_string, $short_text) . "'\n" }
        elsif( $elt->is_pi)
          { $dump .= "$indent|-PI:      '"      . $elt->target . "' - '" . _short_text( $elt->data, $short_text) . "'\n" }
        if( $extra) { $dump .= $elt->_dump_extra_data( $indent, $indent_sp, $short_text); }
      }
    return $dump;
  }

sub _dump_extra_data
  { my( $elt, $indent, $indent_sp, $short_text)= @_;
    my $dump='';
    if( $elt->extra_data)
      { my $extra_data = $indent . "|-- (cpi before) '" . _short_text( $elt->extra_data, $short_text) . "'";
        $extra_data=~ s{\n}{$indent_sp}g;
        $dump .= $extra_data . "\n";
      }
    if( $elt->_extra_data_in_pcdata)
      { foreach my $data ( @{$elt->_extra_data_in_pcdata})
          { my $extra_data = $indent . "|-- (cpi offset $data->{offset}) '" . _short_text( $data->{text}, $short_text) . "'";
            $extra_data=~ s{\n}{$indent_sp}g;
            $dump .= $extra_data . "\n";
          }
      }
    if( $elt->_extra_data_before_end_tag)
      { my $extra_data = $indent . "|-- (cpi end) '" . _short_text( $elt->_extra_data_before_end_tag, $short_text) . "'";
        $extra_data=~ s{\n}{$indent_sp}g;
        $dump .= $extra_data . "\n";
      }

lib/XML/Twig.pm  view on Meta::CPAN

Same as output_filter, except it doesn't apply to the brackets and quotes
around attribute values. This is useful for all filters that could change
the tagging, basically anything that does not just change the encoding of
the output. C<html>, C<safe> and C<safe_hex> are better used with this option.

=item input_filter

This option is similar to C<output_filter> except the filter is applied to
the characters before they are stored in the twig, at parsing time.

=item remove_cdata

Setting this option to a true value will force the twig to output C<#CDATA>
sections as regular (escaped) C<#PCDATA>.

=item parse_start_tag

If you use the C<keep_encoding> option, then this option can be used to replace
the default parsing function. You should provide a coderef (a reference to a
subroutine) as the argument. This subroutine takes the original tag (given
by XML::Parser::Expat C<original_string()> method) and returns a tag and the

lib/XML/Twig.pm  view on Meta::CPAN

elements with textual content are not broken as the \n is the significant).

B<WARNING>: This option leaves the document well-formed but might make it
invalid (not conformant to its DTD). If you have elements declared as:

  <!ELEMENT foo (#PCDATA|bar)>

then a C<foo> element including a C<bar> one will be printed as:

  <foo>
  <bar>bar is just pcdata</bar>
  </foo>

This is invalid, as the parser will take the line break after the C<foo> tag
as a sign that the element contains PCDATA, it will then die when it finds the
C<bar> tag. This may or may not be important for you, but be aware of it!

=item indented

Same as C<nice> (and with the same warning) but indents elements according to
their level.

lib/XML/Twig.pm  view on Meta::CPAN

though.

B<Note>: Comments in the middle of a text element such as:

  <p>text <!-- comment --> more text --></p>

are kept at their original position in the text. Using "print"
methods like C<print> or C<sprint> will return the comments in the
text. Using C<text> or C<field> on the other hand will not.

Any use of C<set_pcdata> on the C<#PCDATA> element (directly or
through other methods like C<set_content>) will delete the comment(s).

=item process

Comments are loaded in the twig and are treated as regular elements
with a C<tag> value of C<#COMMENT>. This can interfere with processing if you
expect C<< $elt->{first_child} >> to be an element but find a comment there.
Schema validation will not protect you from this as comments can happen anywhere.
You can use C<< $elt->first_child( 'tag') >> (which is a good habit anyway)
to get what you want.

lib/XML/Twig.pm  view on Meta::CPAN


=item set_empty_tag_style ($style)

Sets the empty tag display style ('C<normal>', 'C<html>' or 'C<expand>'). As
with C<L<set_pretty_print> >, this sets a global flag.

C<normal> outputs an empty tag 'C<< <tag/> >>', C<html> adds a space
'C<< <tag /> >>' for elements that can be empty in XHTML, and C<expand> outputs
'C<< <tag></tag> >>'.

=item set_remove_cdata ($flag)

Sets (or unsets) the flag that forces the twig to output C<#CDATA> sections as
regular (escaped) C<#PCDATA>.

=item print_prolog ($optional_filehandle, %options)

Prints the prolog (XML declaration + DTD + entity declarations) of a document.

Options: see C<L<flush> >.

lib/XML/Twig.pm  view on Meta::CPAN

=item in ($potential_parent)

Returns true (the potential parent element) if the element is in the potential_parent (C<$potential_parent> is
an element), otherwise false (0).

=item in_context ($cond, $optional_level)

Returns true (the matching including element) if the element is included in an element which passes C<$cond>
optionally within C<$optional_level> levels, otherwise false (0).

=item pcdata

Returns the text of a C<#PCDATA> element or C<undef> if the element is not
C<#PCDATA>.

=item pcdata_xml_string

Returns the text of a C<#PCDATA> element or C<undef> if the element is not C<#PCDATA>.
The text is "XML-escaped" ('&' and '<' are replaced by '&amp;' and '&lt;').

=item set_pcdata ($text)

Sets the text of a C<#PCDATA> element. This method does not check that the element is
indeed a C<#PCDATA> so usually you should use C<L<set_text> > instead.

=item append_pcdata ($text)

Adds the text at the end of a C<#PCDATA> element.

=item is_cdata

Returns true (1) if the element is a C<#CDATA> element, returns false ('') otherwise.

=item is_text

Returns true (1) if the element is a C<#CDATA> or C<#PCDATA> element, returns false ('') otherwise.

=item cdata

Returns the text of a C<#CDATA> element or C<undef> if the element is not
C<#CDATA>.

=item cdata_string

Returns the XML string of a C<#CDATA> element, including the opening and
closing markers.

=item set_cdata ($text)

Sets the text of a C<#CDATA> element.

=item append_cdata ($text)

Adds the text at the end of a C<#CDATA> element.

=item remove_cdata

Turns all C<#CDATA> sections in the element into regular C<#PCDATA> elements. This is useful
when converting XML to HTML, as browsers do not support CDATA sections.

=item extra_data

Returns the extra_data (comments and PI's) attached to an element.

=item set_extra_data ($extra_data)

lib/XML/Twig.pm  view on Meta::CPAN


=item contains_a_single ($exp)

Returns the (the matching child) if the element contains a single child that matches the expression C<$exp>,
otherwise returns false (0).

=item is_field

Same as C<contains_only_text>.

=item is_pcdata

Returns true (1) if the element is a C<#PCDATA> element, otherwise returns false ('').

=item is_ent

Returns true (1) if the element is an entity (an unexpanded entity) element,
otherwise returns false ('').

=item is_empty

lib/XML/Twig.pm  view on Meta::CPAN


Sets the method to output empty tags. Values are 'C<normal>' (default), 'C<html>',
and 'C<expand>'.

C<normal> outputs an empty tag 'C<< <tag/> >>', C<html> adds a space
'C<< <tag /> >>' for elements that can be empty in XHTML and C<expand> outputs
'C<< <tag></tag> >>'.

Returns the previous setting.

=item set_remove_cdata ($flag)

Sets (or unsets) the flag that forces the twig to output C<#CDATA> sections as
regular (escaped) C<#PCDATA>.

Returns the previous setting.

=item set_indent ($string)

Sets the indentation for the indented pretty print style (default is two spaces).

speedup.pl  view on Meta::CPAN

#!/usr/bin/perl 

use 5.010;

my $FIELD     = join( '|', qw( parent first_child last_child prev_sibling next_sibling pcdata cdata ent data target cdata pcdata comment flushed));
my $PRIVATE   = join( '|', qw( parent first_child last_child prev_sibling next_sibling pcdata cdata comment 
                               extra_data_in_pcdata extra_data_before_end_tag
                             )
                    ); # _$private is inlined
my $FORMER    = join( '|', qw( parent prev_sibling next_sibling)); # former_$former is inlined
my $SET_FIELD = join( '|', qw( first_child next_sibling ent data pctarget comment flushed));
my $SET_NOT_EMPTY= join( '|', qw( pcdata cdata comment)); # set the field

my $var= '(\$[a-z_]+(?:\[\d\])?|\$t(?:wig)?->root|\$t(?:wig)?->twig_current|\$t(?:wig)?->\{\'?twig_root\'?\}|\$t(?:wig)?->\{\'?twig_current\'?\})';

my $set_to = '(?:undef|\$\w+|\$\w+->\{\w+\}|\$\w+->\w+|\$\w+->\w+\([^)]+\))';
my $elt    = '\$(?:elt|new_elt|child|cdata|ent|_?parent|twig_current|next_sibling|first_child|prev_sibling|last_child|ref|elt->_parent)';


my %gi2index=( '', 0, PCDATA =>  1, CDATA =>  2, PI => 3, COMMENT => 4, ENT => 5);

(my $version= $])=~ s{\.}{}g;

my $in_pod = 0; # do not change the POD!

while( <>)
  {

speedup.pl  view on Meta::CPAN


    s{($elt)->former_($FORMER)}{($1\->{former} && $1\->{former}\->{$2})}g;

    s{($elt)->set_(parent|prev_sibling)\(\s*($set_to)\s*\)}{$1\->\{$2\}=$3; if( \$XML::Twig::weakrefs) { weaken( $1\->\{$2\});} }g;
    s{($elt)->set_(first_child)\(\s*($set_to)\s*\)}{ $1\->set_not_empty; $1\->\{$2\}=$3; }g;
    s{($elt)->set_(next_sibling)\(\s*($set_to)\s*\)}{ $1\->\{$2\}=$3; }g;
    s{($elt)->set_(last_child)\(\s*($set_to)\s*\)}{ $1\->set_not_empty; $1\->\{$2\}=$3; if( \$XML::Twig::weakrefs) { weaken( $1\->\{$2\});} }g;

    s/$var->atts/$1\->{att}/g;

    s/$var->append_(pcdata|cdata)\(([^)]*)\)/$1\->\{$2\}.= $3/g;
    s/$var->set_($SET_NOT_EMPTY)\(([^)]*)\)/$1\->\{$2\}= (delete $1->\{empty\} || 1) && $3/g;
    s/$var->_set_($SET_NOT_EMPTY)\s*\(([^)]*)\)/$1\->{$2}= $3/g;

    s/(\$[a-z][a-z_]*(?:\[\d\])?)->gi/\$XML::Twig::index2gi\[$1\->{'gi'}\]/g;

    s/$var->id/$1\->{'att'}->{\$ID}/g;
    s/$var->att\(\s*([^)]+)\)/$1\->{'att'}->\{$2\}/g;

    s/(\$[a-z][a-z_]*(?:\[\d\])?)->is_pcdata/(exists $1\->{'pcdata'})/g; 
    s/(\$[a-z][a-z_]*(?:\[\d\])?)->is_cdata/(exists $1\->{'cdata'})/g; 
    s/$var->is_pi/(exists $1\->{'target'})/g; 
    s/$var->is_comment/(exists $1\->{'comment'})/g; 
    s/$var->is_ent/(exists $1\->{'ent'})/g; 
    s/(\$,a-z][a-z_]*(?:\[\d\])?)->is_text/((exists $1\->{'pcdata'}) || (exists $1\->{'cdata'}))/g; 

    s/$var->is_empty/$1\->{'empty'}/g;
    s/$var->set_empty(?:\(([^)]*)\))?(?!_)/"$1\->{empty}= " . ($2 || 1)/ge;
    s/$var->set_not_empty/delete $1\->{empty}/g;

    s/$var->_is_private/( (substr( \$XML::Twig::index2gi\[$1\->{'gi'}\], 0, 1) eq '#') && (substr( \$XML::Twig::index2gi\[$1\->{'gi'}\], 0, 9) ne '#default:') )/g;
    s/_is_private_name\(\s*$var\s*\)/( $1=~ m{^#(?!default:)} )/g;

    s{_is_fh\(\s*$var\)}{isa( $1, 'GLOB') || isa( $1, 'IO::Scalar')}g;

t/test1.t  view on Meta::CPAN

$t6->parse( $st6); 
$doc= $t6->root;
$doc->prefix( 'p1:');
sttest( $t6->root,'<doc>p1:<el1>text</el1><el2>more text</el2></doc>', 
        "prefix doc"); 
my $el1= $doc->first_child( 'el1');
$el1->prefix( 'p2:');
sttest( $t6->root,'<doc>p1:<el1>p2:text</el1><el2>more text</el2></doc>',
        "prefix el1"); 
my $el2= $doc->first_child( 'el2');
my $pcdata= $el2->first_child( PCDATA);
$pcdata->prefix( 'p3:');
sttest( $t6->root,'<doc>p1:<el1>p2:text</el1><el2>p3:more text</el2></doc>', 
        "prefix pcdata"); 

exit 0;
__END__

t/test3.t  view on Meta::CPAN

use strict;


use File::Spec;
use lib File::Spec->catdir(File::Spec->curdir,"t");
use tools;

# This just tests a complete twig, no callbacks
# additional tests for element creation/parse and 
# space policy
# plus test for the is_pcdata method

$|=1;

use XML::Twig;

my $i=0;
my $failed=0;

my $TMAX=23; # do not forget to update!

t/test3.t  view on Meta::CPAN


my $string3= "<doc>\n<p>para</p>\n<p>\n</p>\n</doc>";
my $p11= XML::Twig::Elt->parse( $string3, KeepSpaces => 1);
sttest( $p4, $string, 'KeepSpaces');
my $p12= XML::Twig::Elt->parse( $string3, KeepSpacesIn => [ 'doc']);
sttest( $p12, "<doc>\n<p>para</p>\n<p></p>\n</doc>", 'KeepSpacesIn');
my $p13= XML::Twig::Elt->parse( $string3, KeepSpaces => 1);
sttest( $p13, "<doc>\n<p>para</p>\n<p>\n</p>\n</doc>", 'KeepSpaces');

my $p14= XML::Twig::Elt->parse( $string2);
my $is_pcdata= $p14->is_pcdata;
ok( $is_pcdata ? 0 : 1, "is_pcdata on a <para>");
my $pcdata= $p14->first_child( PCDATA);
$is_pcdata=  $pcdata->is_pcdata;
ok( $pcdata->is_pcdata, "is_pcdata on PCDATA");

my $erase_string='<?xml version="1.0"?><doc><elt id="elt1"><selt id="selt1"
>text 1</selt><selt id="selt2"><selt id="selt3"> text 2</selt></selt
><selt id="selt4"><selt id="selt5"> text 3</selt> text 4</selt
></elt></doc>';
my $er_t= new XML::Twig( TwigHandlers => { selt => sub { $_[1]->erase; } });
$er_t->parse( $erase_string);
sttest( $er_t->root, '<doc><elt id="elt1">text 1 text 2 text 3 text 4</elt></doc>',
 "erase");

# test whether Twig packs strings
my $br_pcdata= "line 1\nline 2\nline 3\n";
my $doc_br_pcdata= "<doc>$br_pcdata</doc>";
my $t_br_pcdata= new XML::Twig();
$t_br_pcdata->parse( $doc_br_pcdata);
$pcdata= $t_br_pcdata->root->first_child->pcdata;
stest( $pcdata, $br_pcdata, "multi-line pcdata");

exit 0;

t/test5.t  view on Meta::CPAN

$t3->parse( $doc);
if( $id2 eq $exp_id2) 
  { print "ok 20\n"; } else { print "not ok 20\n"; warn "$id2 instead of $exp_id2\n"; }

$id3=''; $exp_id3= 'p2_2p2_7';
$t3= new XML::Twig( TwigRoots => { 'p3/p2'    => sub { $id3.= $_[1]->id;} } );
$t3->parse( $doc);
if( $id3 eq $exp_id3) 
  { print "ok 21\n"; } else { print "not ok 21\n"; warn "$id3 instead of $exp_id3\n"; }

# test what happens to 0 in pcdata/cdata
my $pcdata= '<test><text>0</text></test>';
my $cdata= '<test><text><![CDATA[0]]></text></test>';
my $t4= new XML::Twig;

$t4->parse( $pcdata);
if( my $res= $t4->sprint eq $pcdata) { print "ok 22\n"; } 
else { print "not ok 22\n"; warn "sprint returns $res instead of $pcdata\n"; }

$t4->parse( $pcdata);
if( my $res= $t4->root->text eq '0') { print "ok 23\n"; } 
else { print "not ok 23\n"; warn "sprint returns $res instead of '0'\n"; }

$t4->parse( $cdata);
if( my $res= $t4->sprint eq $cdata) { print "ok 24\n"; } 
else { print "not ok 23\n"; warn "sprint returns $res instead of $cdata\n"; }

$t4->parse( $cdata);
if( my $res= $t4->root->text eq '0') { print "ok 25\n"; } 
else { print "not ok 25\n"; warn "sprint returns $res instead of '0'\n"; }

my $test_inherit=
'<doc att1="doc1" att2="doc2" att3="doc3"><elt att1="elt1" att_null="0">
  <subelt att1="subelt1" att2="subelt2"></subelt>
</elt></doc>';

my $t5= new XML::Twig;
$t5->parse( $test_inherit);

t/test_3_24.t  view on Meta::CPAN

  matches( $@, qr/^element #PCDATA can only be created from text/, "error in creating PCDATA element");

  eval { my $elt= XML::Twig::Elt->new( '#COMMENT', "foo", []); };
  matches( $@, qr/^element #COMMENT can only be created from text/, "error in creating COMMENT element");

  eval { my $elt= XML::Twig::Elt->new( '#PI', "foo", [], "bah!"); };
  matches( $@, qr/^element #PI can only be created from text/, "error in creating PI element");

}

{ # set_cdata on non CDATA element
  my $elt = XML::Twig::Elt->new("qux");
  $elt->set_cdata("test this '<' & this '>'");
  is( $elt->sprint, q{<qux><![CDATA[test this '<' & this '>']]></qux>}, "set_cdata on non CDATA element");
}

{ # set_comment on non comment element
  my $elt = XML::Twig::Elt->new(qux => "toto");
  $elt->set_comment( " booh ");
  is( $elt->sprint, q{<!-- booh -->}, "set_comment on non comment element");
}

{ # set_pi on non pi element
  my $elt = XML::Twig::Elt->new(qux => "toto");

t/test_3_27.t  view on Meta::CPAN

                    "problem with writing $tmp, likely linked to missing write permission on the current directory" );
            }
        }
    }
}

{
    my $doc = qq{<d><e1 id="e1">foo<e id="e">bar</e>baz</e1><e1 id="e2">toto <![CDATA[tata]]> tutu</e1></d>};
    my $t   = XML::Twig->parse($doc);
    is( $t->elt_id("e1")->text('no_recurse'), 'foobaz',         "text_only" );
    is( $t->elt_id("e2")->text_only,          'toto tata tutu', "text_only (cdata section)" );
    is( $t->elt_id("e")->text_only,           'bar',            "text_only (no embedded elt)" );
}

{
    my $doc = qq{<!DOCTYPE d SYSTEM "dummy.dtd" []><d><e1 id="e1">tutu &lt;&ent; <b>no</b>tata</e1></d>};
    my $t   = XML::Twig->parse($doc);
    is( $t->elt_id("e1")->text(),                 'tutu <&ent; notata',    "text with ent" );
    is( $t->elt_id("e1")->text('no_recurse'),     'tutu <&ent; tata',      "text no_recurse with ent" );
    is( $t->elt_id("e1")->xml_text(),             'tutu &lt;&ent; notata', "xml_text with ent" );
    is( $t->elt_id("e1")->xml_text('no_recurse'), 'tutu &lt;&ent; tata',   "xml_text no_recurse with ent" );

t/test_3_40.t  view on Meta::CPAN

}

{ my $t=XML::Twig->parse( '<root/>');
  $t->root->wrap_in( 'nroot');
  is( $t->sprint, '<nroot><root/></nroot>', 'wrapping the root');
}

{
my $t=XML::Twig->new;
XML::Twig::_set_weakrefs(0);
my $doc='<doc>\n  <e att="a">text</e><e>text <![CDATA[cdata text]]> more text <e>foo</e>\n more</e></doc>';
$t->parse( $doc);

$doc=~ s{\n  }{}; # just the first one
is( $t->sprint, $doc, 'parse with no weakrefs');

$t->root->insert_new_elt( first_child => x => 'text');
$doc=~ s{<doc>}{<doc><x>text</x>};
is( $t->sprint, $doc, 'insert first child with no weakrefs');

$t->root->insert_new_elt( last_child => x => 'text');

t/test_3_44.t  view on Meta::CPAN

  my $r= $t->root;
  $r->suffix( '&1', 'opt' );
  is( $t->sprint, '<d>f&amp;1</d>', 'suffix, non asis option');
  $r->suffix( '&2', 'asis');
  is( $t->sprint, '<d>f&amp;1&2</d>', 'suffix, asis option');
  $r->suffix( '&3');
  is( $t->sprint, '<d>f&amp;1&2&amp;3</d>', 'suffix, after a suffix with an asis option');
}
{ my $t= XML::Twig->parse( '<d>f</d>');
  $t->root->last_child->suffix( '&1', 'opt' );
  is( $t->sprint, '<d>f&amp;1</d>', 'pcdata suffix, non asis option');
  $t->root->last_child->suffix( '&2', 'asis');
  is( $t->sprint, '<d>f&amp;1&2</d>', 'pcdata suffix, asis option');
  $t->root->last_child->suffix( '&3', 'asis');
  is( $t->sprint, '<d>f&amp;1&2&3</d>', 'pcdata suffix, asis option, after an asis element');
  $t->root->last_child->suffix( '&4');
  is( $t->sprint, '<d>f&amp;1&2&3&amp;4</d>', 'pcdata suffix, after a suffix with an asis option');
}

{ my $t= XML::Twig->parse( '<d>f</d>');
  my $r= $t->root;
  $r->prefix( '&1', 'opt' );
  is( $t->sprint, '<d>&amp;1f</d>', 'prefix, non asis option');
  $r->prefix( '&2', 'asis');
  is( $t->sprint, '<d>&2&amp;1f</d>', 'prefix, asis option');
  $r->prefix( '&3');
  is( $t->sprint, '<d>&amp;3&2&amp;1f</d>', 'prefix, after a prefix with an asis option');
}
{ my $t= XML::Twig->parse( '<d>f</d>');
  $t->root->first_child->prefix( '&1', 'opt' );
  is( $t->sprint, '<d>&amp;1f</d>', 'pcdata prefix, non asis option');
  $t->root->first_child->prefix( '&2', 'asis');
  is( $t->sprint, '<d>&2&amp;1f</d>', 'pcdata prefix, asis option');
  $t->root->first_child->prefix( '&3', 'asis');
  is( $t->sprint, '<d>&3&2&amp;1f</d>', 'pcdata prefix, asis option, before an asis element');
  $t->root->first_child->prefix( '&4');
  is( $t->sprint, '<d>&amp;4&3&2&amp;1f</d>', 'pcdata prefix, after a prefix with an asis option');
}

{ my $weakrefs= XML::Twig::_weakrefs();
  XML::Twig::_set_weakrefs(0);

  my $t= XML::Twig->parse( '<d><e>f</e></d>');
  my $e= $t->first_elt( 'e');
  XML::Twig::Elt->new( x => 'g')->replace( $e);
  is( $t->sprint, '<d><x>g</x></d>', 'replace non root element without weakrefs');
  XML::Twig::Elt->new( y => 'h')->replace( $t->root);

t/test_3_45.t  view on Meta::CPAN

               );
foreach my $module ( sort keys  %html_conv)
  { SKIP: 
      { eval "use $module";
        skip "$module not available", 3 if $@ ;

        my $parser= XML::Twig->new( %{$html_conv{$module}});
        my $xml = $parser->safe_parse_html($html);
        print $@ if $@;

        my @cdata = $xml->get_xpath('//#CDATA');
        ok(@cdata == 1, "1 CDATA section found (using $module)");

        ok(((index $xml->sprint, "//]]>") >= 0), "end of cdata ok in doc (using $module)");
        #diag "\n", $xml->sprint, "\n";

        my @elts = $xml->get_xpath('//script');

        foreach my $el (@elts) 
          { #diag $el->sprint;
            ok(((index $el->sprint, "//]]>") >= 0), "end of cdata ok in script element (using $module)");
          }
      }
  }

# test & in HTML (RT #86633)
my $html_with_amp='<h1>Marco&amp;company</h1>';
my $expected_body= '<body><h1>Marco&amp;company</h1></body>';

SKIP: 
{ eval "use HTML::Tidy";

t/test_3_45.t  view on Meta::CPAN


SKIP:
{ eval "use HTML::TreeBuilder";
  skip "HTML::TreeBuilder not available", 1 if $@ ;
  my $parserh= XML::Twig->new();
  my $html = $parserh->safe_parse_html("<h1>Marco&amp;company</h1>");
  diag $@ if $@;
  is( $html->first_elt( 'body')->sprint , $expected_body, "&amp; in text, converting html with treebuilder");
}

is( XML::Twig::_unescape_cdata( '&lt;tag att="foo&amp;bar&amp;baz"&gt;&gt;&gt;&lt;/tag&gt;'), '<tag att="foo&bar&baz">>></tag>', '_unescape_cdata');

SKIP:
{ skip "safe_print_to_file method does not work on Windows", 6 if $^O =~ m{win}i;
  # testing safe_print_to_file
  my $tmp= "safe_print_to_file.xml";
  my $doc= "<doc>foo</doc>";
  unlink( $tmp); # no check, it could not be there
  my $t1= XML::Twig->nparse( $doc)->safe_print_to_file( $tmp);
  ok( -f $tmp, "safe_print_to_file created document");
  my $t2= XML::Twig->nparse( $tmp);

t/test_3_50.t  view on Meta::CPAN

  $e2->set_prev_sibling( $e1);
  $e1->set_next_sibling( $e2);
  is( $t->sprint, '<d><e1/><e2/></d>', 'set_first_child');

  my $e3= XML::Twig::Elt->new( 'e3');
  $d->set_last_child( $e3);
  $e2->set_next_sibling( $e3);
  $e3->set_prev_sibling( $e2);
  is( $t->sprint, '<d><e1/><e2/><e3/></d>', 'set_last_child');

  $e2->insert_new_elt( first_child => '#PCDATA')->_set_pcdata( 'foo');
  is( $t->sprint, '<d><e1/><e2>foo</e2><e3/></d>', '_set_pcdata');

  $e1->insert_new_elt( first_child => '#CDATA')->_set_cdata( 'bar');
  is( $t->sprint, '<d><e1><![CDATA[bar]]></e1><e2>foo</e2><e3/></d>', '_set_cdata');
}

exit;



t/test_additional.t  view on Meta::CPAN

      { $open= eval( 'sub { open( $_[0], $_[1], $_[2]) }'); }
  }

my $TMAX=663; 

print "1..$TMAX\n";

{
my $t= XML::Twig->new->parse( q{
  <doc>
    <cdata><![CDATA[cdata 01]]></cdata>
    <cdata>foo <![CDATA[cdata <02>]]> bar </cdata>
  </doc>
});

# use CDATA
my $cdata= $t->first_elt( CDATA)->text;
is( $cdata, 'cdata 01', 'first_elt( CDATA)');# test 1
is( $t->first_elt( CDATA)->cdata_string, '<![CDATA[cdata 01]]>', 'cdata_string');# test 2
is( $t->root->cdata_string, '', 'cdata_string for non cdata element');# test 3

my $cdata2= $t->root->first_child( 'cdata[2]')->next_elt( CDATA)->text;
is( $cdata2, 'cdata <02>', 'first_child( cdata[2])');# test 4
}

# test warning for invalid options
my $old_warning_handler= $SIG{__WARN__};

{
my $warning="";
$SIG{__WARN__} = sub { $warning.= join '', @_ };
XML::Twig->new( dummy_opt => 1);
$SIG{__WARN__}= $old_warning_handler;

t/test_additional.t  view on Meta::CPAN

  is( $count, 0, "children_count( 'none')");# test 21
  $count= $t->root->children_count;
  is( $count, 2, "children_count");# test 22
  ok( $t->root->first_child_matches( 'elt'), "first_child_matches");# test 23

  $t->root->insert_new_elt( 'p');
  nok( $t->root->all_children_are( 'elt'), "all_children_are( 'elt') (with p child)");# test 24

}

# test cdata append_cdata, append_extra_data, append_pcdata
{
my $t=XML::Twig->new->parse( '<doc><elt>text <![CDATA[some cdata]]> more text</elt></doc>');

my $cdata= $t->root->next_elt( CDATA)->cdata;
is( $cdata, 'some cdata', 'created CDATA element');# test 25

$t->root->next_elt( CDATA)->append_cdata( ' appended<>');
$t->root->next_elt( PCDATA)->append_pcdata( 'more ');
$t->root->first_child( 'elt')->append_extra_data( '<!-- comment -->');

is( $t->sprint, '<doc><!-- comment --><elt>text more <![CDATA[some cdata appended<>]]> more text</elt></doc>', "append_extra_data");# test 26
}

# test att_names and att_to_field
{ 
my $t= XML::Twig->new->parse( '<doc><elt att1="foo" att2="bar"/></doc>');
my $elt= $t->root->first_child_matches( 'elt');
ok( $elt, "first_child_matches");# test 27
my $att_names= join ':', sort $elt->att_names;
is( $att_names, 'att1:att2', "att_names");# test 28
$elt->att_to_field( 'att1');

t/test_additional.t  view on Meta::CPAN

is( $t->sprint, '<doc><elt>&lt;p>bar&lt;/p></elt></doc>', "set_not_asis");# test 51
$elt->set_asis;
is( $elt->is_asis ? 'asis' : 'not asis', 'asis', "is_asis (set, yes)");# test 52
is( $t->sprint, '<doc><elt><p>bar</p></elt></doc>', "set_asis");# test 53

$root->cut_children;
$root->insert_new_elt( first_child => '#CDATA' => "toto");
is( $t->sprint, '<doc><![CDATA[toto]]></doc>', "create CDATA");# test 54
is($root->last_child_matches( '#CDATA') ? "match" : "no match", "match", "last_child_matches (yes)");# test 55
is($root->last_child_matches( "foo") ? "match" : "no match", "no match", "last_child_matches (no)");# test 56
my $cdata= $root->last_child_matches( '#CDATA');
ok( $cdata->is_cdata, "cdata is_cdata");# test 57
nok( $cdata->is_comment, "cdata is_comment");# test 58
nok( $cdata->is_pi, "cdata is_pi");# test 59
nok( $cdata->is_empty, "cdata is_empty");# test 60
nok( $cdata->is_ent, "cdata is_ent");# test 61
ok( $cdata->is_first_child, "cdata is_first_child");# test 62
ok( $cdata->is_last_child, "cdata is_last_child");# test 63

}

# test field last_child_text last_child_trimmed_text
{ my $t= XML::Twig->new->parse( '<doc><field1>val1</field1><field2>val2</field2></doc>');
  my $root= $t->root;
  $root->set_field( field2 => "new  val2 ");
  is( $root->last_child_text( 'field2'), "new  val2 ", "set_field");# test 64
  is( $root->last_child_trimmed_text( 'field2'), "new val2", "set_field (trimmed text)");# test 65
  is( $root->last_child_text( 'field1'), "val1", "last_child_text");# test 66

t/test_additional.t  view on Meta::CPAN

$elt2->move( before => $elt1);
is( $t->sprint, '<doc><elt2/><elt1/></doc>', "cut");# test 209
$elt2->cut;
is( $t->sprint, '<doc><elt1/></doc>', "cut");# test 210
$elt2->replace( $elt1);
is( $t->sprint, '<doc><elt2/></doc>', "replace");# test 211
$elt2->set_content( "toto");
$elt2->suffix( ":foo");
is( $elt2->xml_string, "toto:foo", "suffix");# test 212
$elt2->first_child( '#TEXT')->suffix( 'bar');
is( $elt2->xml_string, "toto:foobar", "suffix on pcdata elt");# test 213
$elt2->replace_with( $elt1);
is( $t->sprint, '<doc><elt1/></doc>', "replace_with");# test 214
$elt1->set_content( "tto");
my $o= XML::Twig::Elt->new( b => "oo");
$o->paste_within( $elt1, 1);
is( $t->sprint, '<doc><elt1>t<b>oo</b>to</elt1></doc>', "replace_with");# test 215
$o->new( t => {a => 1 }, 'ta')->paste_within( $t->first_elt( 'b')->first_child, 1);
is( $t->sprint, '<doc><elt1>t<b>o<t a="1">ta</t>o</b>to</elt1></doc>', "replace_with");# test 216

}

t/test_additional.t  view on Meta::CPAN

is( $t->prolog( UpdateDTD => 1), $prolog, "prolog, updated DTD");# test 259

$t->entity_list->delete( 'ent3');
is( join( ':', sort $t->entity_names), "ent1:ent2:ent4:ent5", "entity_names");# test 260
$t->entity_list->delete( ($t->entity_list->list)[0]);
is( join( ':', sort $t->entity_names), "ent2:ent4:ent5", "entity_names");# test 261
}

{
my $t= XML::Twig->new( comments => 'process', pi =>'process')
                ->parse( '<doc><!--comment--><?target pi?>text<![CDATA[cdata]]></doc>');
is( $t->root->first_child( '#COMMENT')->get_type, "#COMMENT", "get_type #COMMENT");# test 262
is( $t->root->first_child( '#PI')->get_type, "#PI", "get_type #PI");# test 263
is( $t->root->first_child( '#CDATA')->get_type, "#CDATA", "get_type #CDATA");# test 264
is( $t->root->first_child( '#PCDATA')->get_type, "#PCDATA", "get_type #PCDATA");# test 265
is( $t->root->get_type, "#ELT", "get_type #ELT");# test 266
my $cdata= $t->root->first_child( '#CDATA');
$cdata->set_cdata( "new cdata");
is( $cdata->sprint, "<![CDATA[new cdata]]>", "set_cdata");# test 267
my $copy= $t->root->copy;
is( $copy->sprint, $t->root->sprint, 'copy of an element with extra data');# test 268

is( $t->sprint( pretty_print => 'indented'),# test 269
    qq{<doc><!--comment--><?target pi?>text<![CDATA[new cdata]]></doc>\n},
    'indented elt');

}


{ 
my $t= XML::Twig->new->parse( '<!DOCTYPE doc SYSTEM "dummy.dtd"><doc> text &ent; more</doc>');
my $ent= $t->first_elt( '#ENT');
is( $ent->get_type, "#ENT", "get_type");# test 270
is( $ent->ent, '&ent;', "ent");# test 271

t/test_additional.t  view on Meta::CPAN

                 ->parse( q{<doc att="foo">text</doc>});
  is( $t->sprint, q{<qbp ngg="sbb">grkg</qbp>}, "input filter");# test 329
  $t=XML::Twig->new;
  $t->parse( q{<doc att="foo">text</doc>});
  is( $t->sprint, q{<doc att="foo">text</doc>}, "input filter (none)");# test 330
  $t->set_input_filter( \&rot13);
  $t->parse( q{<qbp ngg="sbb">grkg</qbp>});
  is( $t->sprint, q{<doc att="foo">text</doc>}, "set_input_filter");# test 331
  $t->parse( '<doc><?target data?><elt/><!-- silly hey? --><elt/></doc>');
  is( $t->sprint, '<qbp><?gnetrg qngn?><ryg/><!-- fvyyl url? --><ryg/></qbp>',# test 332
      "set_input_filter on comments and cdata");  
 


}

sub rot13 { $_[0]=~ tr/a-z/n-za-m/; $_[0]; }

# test global_state methods
{ my $doc= q{<doc att="foo"><p>p 1</p><p>p 2</p></doc>};
  my $t=XML::Twig->new->parse( $doc);

t/test_additional.t  view on Meta::CPAN

  else
    { eval "require XML::Filter::BufferText;";
      if( $@)
        { skip(5, "XML::Filter::BufferText not available"); }
      else
        { import XML::SAX::Writer;
          import XML::Filter::BufferText;
          my $output='';
          my $writer = XML::SAX::Writer->new( Output => \$output);
          my $xmldecl= qq{<?xml version="1.0" encoding="UTF-8"?>};
          my $body= qq{<doc><!-- comment --><p att="p1">text</p><?target pi ?><ns xmlns:foo="uri2"><foo:e foo:att="bar">foo:e text</foo:e></ns><ns xmlns="uri2"><e att="tata">t</e></ns><p><![CDATA[ some cdata]]></p>[</doc>};
          my $doc= $xmldecl.$body;
          my $xfbtv= $XML::Filter::BufferText::VERSION;  
          if( $xfbtv < 1.01)
            { skip( 2, "XML::Filter::BufferText version $xfbtv has a bug in CDATA processing"); }
          else
            {
              my $t= XML::Twig->new( comments =>'process', pi => 'process')->parse( $doc);
              # add private data
              $t->root->set_att( '#priv' => 'private');
              $t->root->insert_new_elt( last_child => '#private');

t/test_additional.t  view on Meta::CPAN

  my $doc4=q{<?xml version="1.0"?><!-- comment 1 --><?pi data?><doc><elt/><?pi2 data2?>text more text</doc>};
  $t= XML::Twig->new->parse( $doc4);
  is( _hash( normalize_xml( $t->sprint)), _hash( $doc4), 'comment before PI (2 PIs, no comments)');# test 524

  my $doc3=q{<?xml version="1.0"?><!-- comment 1 --><?pi data?><doc><?pi2 data2?>text more text</doc>};
  $t= XML::Twig->new->parse( $doc3);
  is( _hash( normalize_xml( $t->sprint)), _hash( $doc3), 'comment before PI (2 PIs, no comments)');# test 525

  my $doc1=q{<?xml version="1.0"?><!-- comment 1 --><?pi data?><doc>t<?pi2 data2?>text<!--comment--> more text</doc>};
  $t= XML::Twig->new->parse( $doc1);
  is( _hash( normalize_xml( $t->sprint)), _hash( $doc1), 'comment before PI (2 PIs, pcdata before pi)');# test 526

  my $doc2=q{<?xml version="1.0"?><!-- comment 1 --><?pi data?><doc> <?pi2 data2?>text<!--comment--> more text</doc>};
  $t= XML::Twig->new->parse( $doc2);
  is( _hash( normalize_xml( $t->sprint)), _hash( $doc2), 'comment before PI (2 PIs)');# test 527


  $t= XML::Twig->new->parse( $doc);
  is( _hash( normalize_xml( $t->sprint)), _hash( $doc), 'comment before PI (3 PIs)');# test 528
}

t/test_additional.t  view on Meta::CPAN

                             'elt3/elt4'       => sub { $res .= 'E5'; },
                             '*[@att="c"]'     => sub { $res .= 'E6'; },
                             '*[@att=~/d/]'    => sub { $res .= 'E7'; },
                             _default_         => sub { $res .= 'E0'; }
                           },
                       )->parse( $doc);
  is( $res => 'E0E1E2E3E4E5E6E7E0', 'all types of handlers on start_tags');# test 529

}                  

{ my $doc= q{<doc>  <![CDATA[cdata]]></doc>};
  my $t= XML::Twig->new( keep_spaces => 1)->parse( $doc);
  is( $t->sprint, $doc, 'spaces before cdata');# test 530

}

{ my $doc= q{<doc>  <![CDATA[cdata]]>  <elt/>  <![CDATA[more cdata]]></doc>};
  my $t= XML::Twig->new( keep_spaces => 1)->parse( $doc);
  is( $t->sprint, $doc, '2 cdata sections');# test 531

}

{ my $doc= q{<doc>  <![CDATA[cdata]]>  <elt/>  <!-- comment --> <![CDATA[more cdata]]></doc>};
  my $t= XML::Twig->new( keep_spaces => 1, comments => 'process')->parse( $doc);
  is( $t->sprint, $doc, 'spaces and extra data before cdata');# test 532

}

{ # fun with suffix and asis
  my $t=XML::Twig->new->parse( '<doc>to</doc>');
  $t->root->suffix( 'to');
  is( $t->sprint, '<doc>toto</doc>', 'regular suffix');# test 533

  $t=XML::Twig->new->parse( '<doc><b>to</b></doc>');
  $t->root->suffix( 'to');

t/test_additional.t  view on Meta::CPAN


{ my $doc= '<?xml version="a.0"?><!DOCTYPE doc SYSTEM "no_dtd" []> <doc att="val"><p att="val">toto &ent; <![CDATA[ toto]]></p></doc>';
  my $t= XML::Twig->new->parse( $doc);
  my $alt_root= $t->root->copy;
  is( $alt_root->sprint, $t->root->sprint, 'copy with entity');# test 558

}

{ my $doc= '<doc>toto</doc>';
  my $t= XML::Twig->new->parse( $doc);
  my $pcdata= $t->first_elt( '#TEXT');
  my $start_tag= $pcdata->start_tag;
  nok( $start_tag, 'start_tag for a text element');# test 559
  $t->root->set_att( '#priv_att' => 1);
  is( $t->sprint, $doc, 'private attributes');# test 560
  my $priv_elt= $t->root->insert( '#priv_elt');
  is( $t->sprint, $doc, 'private element');# test 561
  $priv_elt->set_gi( 'foo');
  is( $t->sprint, '<doc><foo>toto</foo></doc>', 'private element');# test 562
  $priv_elt->set_gi( '#priv');
  is( $t->sprint, $doc, 'private element');# test 563
  $priv_elt->set_att( att => "val");

t/test_additional.t  view on Meta::CPAN

{ my $e= XML::Twig::Elt->new( 'toto');
  nok( scalar $e->_is_private, 'private elt (not)');# test 567
  $e->set_tag( '#toto');
  ok( scalar $e->_is_private, 'private elt (yes)');# test 568
  ok( scalar XML::Twig::Elt::_is_private_name( '#toto'), '_is_private_name (yes)');# test 569
  nok( scalar XML::Twig::Elt::_is_private_name( 'toto'), '_is_private_name (no)');# test 570
}

{ my $t= XML::Twig->new->parse( '<doc><![CDATA[toto]]></doc>');
  my $text_elt= $t->first_elt( '#TEXT');
  is( $text_elt->xml_string, '<![CDATA[toto]]>', 'xml_string for cdata');# test 571
  $text_elt->set_text( '<>');
  is( normalize_xml( $t->sprint), '<doc><![CDATA[<>]]></doc>', 'set_text on CDATA');# test 572
  $text_elt->set_text( '<>', force_pcdata => 1);
  is( normalize_xml( $t->sprint), '<doc>&lt;></doc>', 'set_text on CDATA (with force_pcdata)');# test 573
  $t->root->set_content( { att => "val" }, 'toto ', 'tata');
  is( $t->root->sprint, '<doc att="val">toto tata</doc>', 'set_content with attributes');# test 574
  $text_elt= $t->first_elt( '#TEXT');
  $text_elt->set_content( 'titi');
  is( $t->root->sprint, '<doc att="val">titi</doc>', 'set_content on text elt');# test 575
}

{ my $t=XML::Twig->new->parse( '<doc><elt>text 1</elt><elt>text 2</elt><elt>text 3</elt></doc>');
  my $elt1= $t->root->first_child( 'elt[1]');
  my $elt2= $t->root->first_child( 'elt[2]');

t/test_additional.t  view on Meta::CPAN

}
  
{ 
  if( $perl < 5.008)  
    { skip( 1, "need perl 5.8 or above to perform these tests (you have $perl)"); }
  else
    { my $doc= '<doc><![CDATA[toto]]>tata<!-- comment -->t<?pi data?> more</doc>';
      my $out=''; $open->( my $fh, ">", \$out);
      my $t= XML::Twig->new( comments => 'process', pi => 'process')->parse( $doc);
      $t->flush( $fh);
      is( $out, $doc, 'flush with cdata');# test 578
    }
}

{ my $doc=<<END;
<doc>
  <elt>text</elt><indent>this</indent>
  <pre>text to 
keep spaces
  in like
    this

t/test_additional.t  view on Meta::CPAN

  nok( $root->child_matches( 1, 'toto'), 'child_matches(not)');# test 620
  nok( $root->prev_sibling_matches( 'toto'), 'prev_sibling_matches(not)');# test 621
  nok( $root->prev_elt_text( 'toto'), 'prev_elt_text(not)');# test 622
  nok( $root->sibling_text( 1, 'toto'), 'prev_elt_text(not)');# test 623
  nok( $root->prev_elt_trimmed_text( 'toto'), 'prev_elt_trimmed_text(not)');# test 624
  nok( $root->prev_elt_matches( 'toto'), 'prev_elt_matches(not)');# test 625
  nok( $root->next_elt_trimmed_text( 'toto'), 'next_elt_trimmed_text(not)');# test 626
  nok( $root->next_elt_matches( 'toto'), 'next_elt_matches(not)');# test 627
  nok( $root->parent_text( 'toto'), 'parent_text(not)');# test 628
  nok( $root->parent_trimmed_text( 'toto'), 'parent_trimmed_text(not)');# test 629
  nok( $root->pcdata_xml_string, 'pcdata_xml_string of a non pcdata elt');# test 630
  nok( $root->att_xml_string( 'foo'), 'att_xml_string of a non existing att');# test 631
}

{ my $doc=<<END;
<doc>
  <elt xml:space="preserve">
    <sub id="s1">
      <sub>text 1</sub>
      <sub>text 2</sub>
    </sub>

t/test_additional.t  view on Meta::CPAN


  my $t=XML::Twig->new(pretty_print => 'none')->parse( $doc);
  is( $t->sprint, $expected_doc, 'doc with xml:space="preserve"');# test 632
  is( $t->get_xpath( '//*[@id="s1"]', 0)->sprint, $expected_s1, 'sub element of an xml:space="preserve" element');# test 633
  is( $t->get_xpath( '//*[@id="s2"]', 0)->sprint, $expected_s2, 'regular sub element');# test 634
}

{ my $e= XML::Twig::Elt->parse( '<elt/>');
  is( $e->xml_text, '', 'xml_text of an empty elt');# test 635
  $e= XML::Twig::Elt->parse( '<elt>toto</elt>')->first_child;
  is( $e->xml_text, 'toto', 'xml_text of a pcdata');# test 636
  $e->set_content();
  is( $e->xml_text, 'toto', 'empty set_content');# test 637
  $e= XML::Twig::Elt->parse( '<elt><![CDATA[toto]]></elt>')->first_child;
  is( $e->xml_text, '<![CDATA[toto]]>', 'xml_text of a cdata');# test 638
}

{ my $doc=   q{<doc xmlns:ns1="uri1" xmlns:ns2="uri2"><ns1:elt>toto</ns1:elt>}
           . q{<ns2:elt>tata</ns2:elt></doc>};
  my $expected_keep= $doc;
  $expected_keep=~ s{toto}{foo};
  $expected_keep=~ s{tata}{bar};
  my $t= XML::Twig->new( map_xmlns => { uri1 => "ns_1", uri2 => "ns_2" },
                         keep_original_prefix => 1,
                         twig_handlers => { 'ns_1:elt' => sub { $_->set_text( 'foo'); },

t/test_bugs_3_15.t  view on Meta::CPAN

}
{
  # check that leading spaces NOT after a \n are kept around
  my $doc= "<p>  <b>foo</b>bar</p>";
  my $result=  XML::Twig->new->parse( $doc)->sprint;
  is( $result => $doc, 'leading spaces kept when not after a \n');
}

{
my $t= XML::Twig->new->parse( "<doc><elt>  elt  1 </elt> <elt>  elt   2 </elt></doc>");
is( scalar $t->descendants( '#PCDATA'), 3, 'properly parsed pcdata');
}

{
my $t= XML::Twig->new->parse( "<doc>\n  <elt>  elt  1 </elt>\n  <elt>  elt   2 </elt>\n</doc>");
is( scalar $t->descendants( '#PCDATA'), 2, 'properly parsed pcdata');
}

{ # bug RT 8137
  my $doc= q{<doc  att="val"/>};
  (my $expected= $doc)=~ s{  }{ };
  is( XML::Twig->new( keep_encoding => 1)->parse( $doc)->sprint, $expected, 
      'keep_encoding and 2 spaces between gi and attribute'
    );
}

t/test_bugs_3_18.t  view on Meta::CPAN

{
#bug with long CDATA

# get an accented char in iso-8859-1
my $char_file=File::Spec->catfile('t', "latin1_accented_char.iso-8859-1");
open( CHARFH, "<$char_file") or die "cannot open $char_file: $!";
my $latin1_char=<CHARFH>;
chomp $latin1_char;
close CHARFH;

my %cdata=( "01- 1023 chars" => 'x' x 1022 . 'a',
            "02- 1024 chars" => 'x' x 1023 . 'a',
            "03- 1025 chars" => 'x' x 1024 . 'a',
            "04- 1026 chars" => 'x' x 1025 . 'a',
            "05- 2049 chars" => 'x' x 2048 . 'a',
            "06- 1023 chars spaces" => 'x' x 1020 . '  a',
            "07- 1024 chars spaces" => 'x' x 1021 . '  a',
            "08- 1025 chars spaces" => 'x' x 1022 . '  a',
            "09- 1026 chars spaces" => 'x' x 1023 . '  a',
            "10- 2049 chars spaces" => 'x' x 2048 . '  a',
            "11- 1023 accented chars" => $latin1_char x 1022 . 'a',

t/test_bugs_3_18.t  view on Meta::CPAN

            "19- 1026 accented chars spaces" => $latin1_char x 1023 . '  a',
            "20- 2049 accented chars spaces" => $latin1_char x 2048 . '  a', 
            "21- 511 accented chars" => $latin1_char x 511 . 'a',
            "22- 512 accented chars" => $latin1_char x 512 . 'a',
            "23- 513 accented chars" => $latin1_char x 513 . 'a',
            #"00- lotsa chars" => 'x' x 2000000 . 'a', # do not try this at home
                                                       # but if you do with a higher number, let me know!
            );

if( ($] == 5.008) || ($] < 5.006) || ($XML::Parser::VERSION <= 2.27) )
  { skip( scalar keys %cdata,   "KNOWN BUG in 5.8.0 and 5.005 or with XML::Parser 2.27 with keep_encoding and long (>1024 char) CDATA, "
                              . "see RT #14008 at http://rt.cpan.org/Ticket/Display.html?id=14008"
        );
  }
elsif( perl_io_layer_used())
  { skip( scalar keys %cdata, "cannot test parseurl when UTF8 perIO layer used "
                            . "(due to PERL_UNICODE being set or -C command line option being used)\n"
        );
  }
else
  {
    foreach my $test (sort keys %cdata)
      { my $cdata=$cdata{$test};
        my $doc= qq{<?xml version="1.0" encoding="iso-8859-1" ?><doc><![CDATA[$cdata]]></doc>};
        my $twig= XML::Twig->new( keep_encoding => 1)->parse($doc);
        my $res = $twig->root->first_child->cdata;
        is( $res, $cdata, "long CDATA with keep_encoding $test");
      }
  }
}


# subs_text on text with new lines
{ my $doc= "<doc> foo1 \n foo2 </doc>";
   my $t= XML::Twig->new->parse( $doc);
   (my $expected= $doc)=~ s{foo}{bar}g;
   $t->subs_text( qr{foo}, "bar");

t/test_bugs_3_18.t  view on Meta::CPAN


{ my $doc=q{<doc><?t1 d1?><elt/><?t2 d2?></doc>};
  my $res='';
  XML::Twig->new( pi => 'process', twig_handlers => { '?' => sub { $res.=$_->data } })->parse( $doc);
  is( $res => 'd1d2', '? (any pi) handler');
}

{ my $doc=q{<doc><elt>foo <!--commment--> bar</elt></doc>};
  my $t= XML::Twig->new->parse( $doc);
  is( $t->sprint, $doc, 'embedded comments, output asis');
  $t->root->first_child( 'elt')->first_child->set_pcdata( 'toto');
  is( $t->sprint, '<doc><elt>toto</elt></doc>', 'embedded comment removed');
}


{ my $doc=q{<?xml version="1.0" ?>
            <!DOCTYPE doc [ <!ELEMENT doc (#PCDATA)>
                            <!ENTITY  ent "foo">
                          ]
            >
            <doc> a &ent; is here</doc>

t/test_bugs_3_18.t  view on Meta::CPAN

                    '<doc><elt><ERS>foo<!-- edbet --></ERS><!-- edbet 2 --></elt></doc>',
                  )
    { my $t=XML::Twig->new->parse( $doc);
      $t->first_elt( 'ERS')->erase;
      (my $expected= $doc)=~ s{</?ERS/?>}{}g;
     is( $t->sprint, $expected, "erase in $doc");
    }
}
  
{ my $t=XML::Twig->new->parse( '<doc><p>toto</p></doc>');
  my $pcdata= $t->first_elt( '#PCDATA');
  $pcdata->split_at( 2);
  is( $t->sprint => '<doc><p>toto</p></doc>', 'split_at');
}

{ my $doc= q{<doc>tototata<e>tu</e></doc>};
  my $t= XML::Twig->new->parse( $doc);
  $t->subs_text( qr/(to)ta/, '&elt(p => $1)ti');
  is( $t->sprint,'<doc>to<p>to</p>tita<e>tu</e></doc>' , 'subs_text');
  $t->subs_text( qr/(to)ta/, '&elt(p => $1)ti');
  is( $t->sprint,'<doc>to<p>to</p>tita<e>tu</e></doc>' , 'subs_text (2cd try, same exp)');
  $t->subs_text( qr/(ta)/, '&elt(p1 => $1)ti');

t/test_bugs_3_19.t  view on Meta::CPAN

print "1..$TMAX\n";

{
#bug with long CDATA

# get an accented char in iso-8859-1
my $latin1_char= perl_io_layer_used() ? '' : slurp( File::Spec->catfile('t', "latin1_accented_char.iso-8859-1"));
chomp $latin1_char;

            
my %cdata=( "01- 1025 chars"                    => 'x' x 1025 . 'a',
            "02- short CDATA with nl"           =>  "first line\nsecond line",
            "03- short CDATA with ]"            =>  "first part]second part",
            "04- short CDATA with ] and spaces" =>  "first part ] second part",
            "05- 1024 chars with accent"        =>  $latin1_char x 1023 . 'a',
            "06- 1025 chars with accent"        =>  $latin1_char x 1024 . 'a',
            "07- 1023 chars, last a nl"         => 'x' x 1022 . "\n",
            "08- 1023 chars, last a ]"          => 'x' x 1022 . "]",
            "09- 1024 chars, last a nl"         => 'x' x 1023 . "\n",
            "10- 1024 chars, last a ]"          => 'x' x 1023 . "]",
            "11- 1025 chars, last a nl"         => 'x' x 1024 . "\n",

t/test_bugs_3_19.t  view on Meta::CPAN

            '17- 1060 chars, ] and \n'          => ('1' x 1024) . ('2' x 26) . "\n  \n ]\n]]  ",
            '18- 1060 chars, ] and \n'          => ('1' x 1024) . ('2' x 26) . "\n \n ]\n]]  a",
            '19- 1060 chars, ] and \n'          => '1' x 500 . "\n  \n  ]\n]] a" . '2' x 500 . "\n  \n  ]\n]] a", 
            "20- 800 chars with accent"         =>  $latin1_char x 800,
            "21- 800 chars with accent"         =>  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa$latin1_char" x 16,
            "22- 1600 chars with accent"        =>  "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa$latin1_char" x 32,
            '23- 1600 chars with accent and \n' =>  "aaaaaaaa]aaaaaaaaaaaaaaaaaaaaaaaaa\naaaaaaaaaaaaaaa$latin1_char" x 32,
          );

if( ($] == 5.008) || ($] < 5.006) )
  { skip( scalar keys %cdata,   "KNOWN BUG in 5.8.0 and 5.005 with keep_encoding and long (>1024 char) CDATA, "
                              . "see http://rt.cpan.org/Ticket/Display.html?id=14008"
        );
  }
elsif( perl_io_layer_used())
  { skip( scalar keys %cdata, "cannot test parseurl when UTF8 perIO layer used "
                            . "(due to PERL_UNICODE or -C option used)\n"
        );
  }
else
  {
    foreach my $test (sort keys %cdata)
      { my $cdata=$cdata{$test};
        my $doc= qq{<?xml version="1.0" encoding="iso-8859-1" ?><doc><![CDATA[$cdata]]></doc>};
        my $twig= XML::Twig->new( keep_encoding => 1)->parse($doc);
        my $res = $twig->root->first_child->cdata;
        is( $res, $cdata, "long CDATA with keep_encoding $test");
      }
  }
}

{ # testing _dump
  my $doc= q{<doc><!-- comment --><elt att="xyz">foo</elt><elt>bar<![CDATA[baz]]></elt><?t pi?><elt2>toto<b>tata</b>titi</elt2><elt3 /><elt>and now a long (more than 40 characters) text to see if it gets shortened by default (or not)</elt></doc>};
  my $t= XML::Twig->new->parse( $doc);
  my $dump= q{document
|-doc
| |-elt att="xyz"

t/test_bugs_3_22.t  view on Meta::CPAN

}

{ is(  XML::Twig->nparse( '<doc/>')->root->get_xpath( '.', 0)->gi, 'doc', 'get_xpath: .'); }

{ my $t= XML::Twig->nparse( '<doc><![CDATA[foo]]></doc>');
  $t->first_elt( '#CDATA')->set_text( 'bar');
  is( $t->sprint, '<doc><![CDATA[bar]]></doc>', " set_text on CDATA");
  $t->root->set_text( 'bar');
  is( $t->sprint, '<doc>bar</doc>', " set_text on elt containing CDATA");
  $t= XML::Twig->nparse( '<doc><![CDATA[foo]]></doc>');
  $t->first_elt( '#CDATA')->set_text( 'bar', force_pcdata => 1);
  is( $t->sprint, '<doc>bar</doc>', " set_text on CDATA with force_pcdata");}

  # print/flush entity
  # SAX export entity

{ my $enc= "a_non_existent_encoding_bwaaahhh";
  eval { XML::Twig->iconv_convert( $enc); };
  matches( $@, "^(Unsupported|Text::Iconv not available|Can't locate)", "unsupported encoding");
}

{ # test comments handlers

t/test_cdata.t  view on Meta::CPAN


my( $t, $result, $expected_result);

$t= XML::Twig->new( twig_handlers => { 'ehtml/#CDATA' => sub { $_->set_asis; } });
$t->parse( $xml);
$result= $t->sprint;
($expected_result=<DATA>)=~ s{\n*$}{}s; 
if( $result eq $expected_result) { print "ok 1\n"; }
else { print "not ok 1\n"; warn "expected: $expected_result\n result  : $result"; }

$t= XML::Twig->new( twig_handlers => { 'ehtml/#CDATA' => sub { $_->remove_cdata; } });
$t->parse( $xml);
$result= $t->sprint;
($expected_result=<DATA>)=~ s{\n*$}{}s; 
if( $result eq $expected_result) { print "ok 2\n"; }
else { print "not ok 2\n"; warn "expected: $expected_result\n result  : $result"; }

$t= XML::Twig->new( keep_encoding => 1, twig_handlers => { 'ehtml/#CDATA' => sub { $_->set_asis; } });
$t->parse( $xml);
$result= $t->sprint;
($expected_result=<DATA>)=~ s{\n*$}{}s; 
if( $result eq $expected_result) { print "ok 3\n"; }
else { print "not ok 3\n"; warn "test keep_encoding / asis\n  expected: $expected_result\n  result  : $result"; }

$t= XML::Twig->new( keep_encoding => 1, twig_handlers => { 'ehtml/#CDATA' => sub { $_->remove_cdata; } });
$t->parse( $xml);
$result= $t->sprint;
($expected_result=<DATA>)=~ s{\n*$}{}s; 
if( $result eq $expected_result) { print "ok 4\n"; }
else { print "not ok 4\n"; warn "test keep_encoding / remove_cdata\n  expected: $expected_result\n  result  : $result"; }

exit 0;

__DATA__
<doc>
  <elt>text</elt>
  <ehtml><![CDATA[hello<br>world & all]]></ehtml>
</doc>

<doc><elt>text</elt><ehtml>hello<br>world & all</ehtml></doc>

t/test_errors.t  view on Meta::CPAN

  my $p1= $t->root->first_child( 'p');
  my $p2= $t->root->first_child( 'p[2]');
  eval { $p1->merge_text( 'toto'); } ;
  matches( $@, "invalid merge: can only merge 2 elements", 'merge elt and string');
  eval { $p1->merge_text( $p2); } ;
  matches( $@, "invalid merge: can only merge 2 text elements", 'merge non text elts');
  $p1->first_child->merge_text( $p2->first_child);
  is( $t->sprint, '<doc><p>text1text2</p><p></p></doc>', 'merge_text');
  my $p3= XML::Twig::Elt->new( '#CDATA' => 'foo');
  eval { $p1->first_child->merge_text( $p3); };
  matches( $@, "invalid merge: can only merge 2 text elements", 'merge cdata and pcdata elts');
  
}

{ my $t= XML::Twig->new;
  $t->save_global_state;
  eval { $t->set_pretty_print( 'foo'); };
  matches( $@, "invalid pretty print style 'foo'", 'invalid pretty_print style');
  eval { $t->set_pretty_print( 987); };
  matches( $@, "invalid pretty print style 987", 'invalid pretty_print style');
  eval { $t->set_empty_tag_style( 'foo'); };

t/test_need_io_scalar.t  view on Meta::CPAN

  is( $out3, 'should be in $out3', 'restoring initial fh'); 

}


{ my $doc= '<doc><![CDATA[toto]]>tata<!-- comment -->t<?pi data?> more</doc>';
  my $out;
  my $fh= IO::String->new( \$out);
  my $t= XML::Twig->new( comments => 'process', pi => 'process')->parse( $doc);
  $t->flush( $fh);
  is( $out, $doc, 'flush with cdata');
}

{ my $out=''; 

  my $fh= IO::String->new( \$out);
  my $doc='<doc><elt>text</elt><elt1/><elt2/><elt3>text</elt3></doc>';
  my $t= XML::Twig->new( twig_roots=> { elt2 => 1 },
                          start_tag_handlers => { elt  => sub { print $fh '<e1/>'; } },  
                          end_tag_handlers   => { elt3 => sub { print $fh '<e2/>'; } },  
                          twig_print_outside_roots => $fh,

t/test_new_features_3_16.t  view on Meta::CPAN

  is( $t->index( 'target', -1)->text, 't2', 'index');
 
  my $index= $t->index( 'target');
  is( $index->[0]->text, 't1', 'index');
  is( $index->[ 1]->text, 't2', 'index');
  is_undef( $index->[ 2], 'index');
  is( $index->[-1]->text, 't2', 'index');
}


# test the remove_cdata option
{ my $doc        = q{<doc><![CDATA[<tag&>]]></doc>};
  my $escaped_doc= q{<doc>&lt;tag&amp;></doc>};
  my $t= XML::Twig->new( remove_cdata => 1)->parse( $doc);
  is( $t->sprint, $escaped_doc, 'remove_cdata on');
  $t= XML::Twig->new( remove_cdata => 0)->parse( $doc);
  is( $t->sprint, $doc, 'remove_cdata off');
}

# test the create_accessors method
if( $] < 5.006)
  { skip( 11 => "create_accessors not tested with perl < 5.006"); }
else
{ my $doc= '<doc att1="1" att3="foo"/>';
  my $t= XML::Twig->new->parse( $doc);
  $t->create_accessors( qw(att1 att2));
  my $root= $t->root;

t/test_new_features_3_16.t  view on Meta::CPAN

  is( $t->sprint, '<doc att1="4" att2="bar" att3="foo"/>', 'final output');
  eval { $t->create_accessors( 'tag'); };
  matches( $@, q{^attempt to redefine existing method tag using att_accessors }, 'duplicate accessor');
  $@='';
  eval { XML::Twig->create_accessors( 'att2'); };
  is( $@, '', 'redefining existing accessor');
}
  
{ # test embedded comments/pis
  foreach my $doc ( 
                    q{<doc>text <!--cdata coming--><![CDATA[here]]></doc>},
                    q{<doc>text<!--comment-->more</doc>},
                    q{<doc>text<!--comment-->more<!--comment2--></doc>},
                    q{<doc>text<!--comment-->more<!--comment2-->more2</doc>},
                    q{<doc><!--comment-->more<!--comment2-->more2</doc>},
                    q{<doc><!--comment--></doc>},
                    q{<doc>tata<!--comment & all-->toto</doc>},
                    q{<doc>tata &lt;<!--comment &amp; tu &lt; all-->toto &lt;</doc>},
                    q{<doc>text<!--comment-->more &amp; even more<!--comment2-->more2</doc>},
                    q{<doc>text <!--cdata coming--> <![CDATA[here]]></doc>},
                    q{<doc> <!--comment--> more <!--comment2--> more2 </doc>},
                    q{<doc><!--comment--> more <!--comment2--> more2</doc>},
                  )
    { my $t= XML::Twig->new->parse( $doc);
      is( $t->sprint, $doc, "comment within pcdata ($doc)");
      my $t2= XML::Twig->new( keep_encoding => 1)->parse( $doc);
      is( $t2->sprint, $doc, "comment within pcdata in keep encoding mode($doc)");
      my $doc_pi= $doc;
      $doc_pi=~ s{<!--}{<?pi}g; $doc_pi=~ s{-->}{?>}g;
      my $t3= XML::Twig->new->parse( $doc_pi);
      is( $t3->sprint, $doc_pi, "pi within pcdata ($doc_pi)");
      my $t4= XML::Twig->new( keep_encoding => 1)->parse( $doc_pi);
      is( $t4->sprint, $doc_pi, "pi within pcdata in keep encoding mode($doc_pi)");
    }
}

{ # test processing of embedded comments/pis 
  my $doc= q{<doc><elt>foo<!--comment-->bar</elt><elt>foobar</elt></doc>};
  my $t=  XML::Twig->new->parse( $doc);
  my @elt= $t->findnodes( '//elt[string()="foobar"]');
  is( scalar( @elt), 2, 'searching on text with embedded comments');
  foreach my $elt (@elt) { $elt->set_text( 'toto'); }
  is( $t->sprint, q{<doc><elt>toto</elt><elt>toto</elt></doc>}, "set_text");

t/xmlxpath_test1.t  view on Meta::CPAN

$t6->parse( $st6); 
$doc= $t6->root;
$doc->prefix( 'p1:');
sttest( $t6->root,'<doc>p1:<el1>text</el1><el2>more text</el2></doc>', 
        "prefix doc"); 
my $el1= $doc->first_child( 'el1');
$el1->prefix( 'p2:');
sttest( $t6->root,'<doc>p1:<el1>p2:text</el1><el2>more text</el2></doc>',
        "prefix el1"); 
my $el2= $doc->first_child( 'el2');
my $pcdata= $el2->first_child( PCDATA);
$pcdata->prefix( 'p3:');
sttest( $t6->root,'<doc>p1:<el1>p2:text</el1><el2>p3:more text</el2></doc>', 
        "prefix pcdata"); 

is( $t6->node_cmp( 1), -1, "compare twig with scalar");
ok( UNIVERSAL::isa( $t->root->getParentNode, 'XML::Twig::XPath'), 'getParentNode on the root');
ok( UNIVERSAL::isa( $t->root->first_child->getParentNode, 'XML::Twig::XPath::Elt'), 'getParentNode on an elt');
eval '$t6->root->node_cmp( []);';
matches( $@, "^unknown node type ", "compare elt with scalar");
my $elt= XML::Twig::XPath::Elt->new( elt => { att1 => 1, att2 => 2 }, "99");
my( $att1, $att2)= $elt->getAttributes;
is( $att1->node_cmp( $att2), -1, "attribute comparison");
is( $att2->node_cmp( $att1),  1, "attribute comparison (reverse order)");

tools/xml_spellcheck/xml_spellcheck  view on Meta::CPAN


   
    open( $tmp_fh, "<$tmp_file") or die "cannot open temp file $tmp_file: $!";
    while( <$tmp_fh>)
      { chomp;
        my( $id, $text)= split /:/, $_, 2;
        my $wrap= $id2elt->{$id};
        $text=~ s{<\\n>}{\n}g;
        my $text_elt= $wrap->first_child or die "internal error 100\n";
        if( $text_elt->gi eq '#PCDATA')
          { $text_elt->set_pcdata( $text); }
        elsif( $text_elt->gi eq '#CDATA')
          { $text_elt->set_cdata( $text); }
        else 
          { die "internal error 101\n"; }
        $wrap->erase;
      }
    close $tmp_fh;

    rename( $file, "$file$ext") or die "cannot save backup file $file$ext: $!";
    open( FILE, ">$file")       or die "cannot save spell checked file $file: $!";
    $t->print( \*FILE);
    close FILE;



( run in 0.425 second using v1.01-cache-2.11-cpan-ec4f86ec37b )