Bio-Phylo

lib/Bio/Phylo/NeXML/Entities.pm

        elsif ( $escape{$c} and $c eq '&' ) {
            my $maybe_entity = '';
            FIND_SEMI: for my $j ( $i .. $#string ) {
                $maybe_entity .= $string[$j];
                last FIND_SEMI if $string[$j] eq ';';
            }
            if ( not exists $entity2char{$maybe_entity} ) {
                $string[$i] = $char2entity{$c};
            }
        }
        elsif( $escape{$c} and $c eq ';' ) {
            my $maybe_entity = '';
            FIND_AMP: for ( my $j = $i; $j >= 0; $j-- ) {
                $maybe_entity = $string[$j] . $maybe_entity;
                last FIND_AMP if $string[$j] eq '&';
            }
            if ( not exists $entity2char{$maybe_entity} ) {
                $string[$i] = $char2entity{$c};
            }
        }
    }
    return join '', @string;
}

sub decode_entities {
    my @results;
    for my $string ( @_ ) {
        my @string = split //, $string;
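        for my $i ( 0 .. $#string ) {
            # the range is fixed up front, but the splice below shrinks
            # @string, so stop once $i has run past the current end
            last if $i > $#string;
            my $c = $string[$i];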
            if ( $c eq '&' ) {
                my $maybe_entity = '';
                my $length = 0;
                FIND_SEMI: for my $j ( $i .. $#string ) {
                    $maybe_entity .= $string[$j];
                    last FIND_SEMI if $string[$j] eq ';';
                    $length++;
                }
                if ( exists $entity2char{$maybe_entity} ) {
                    $string[$i] = $entity2char{$maybe_entity};
                    splice( @string, $i + 1, $length );
                }                
            }
        }
        push @results, join '', @string;
    }
    return wantarray ? @results : $results[0];
}

1;

__END__

=head1 NAME

Bio::Phylo::NeXML::Entities - Functions for dealing with XML entities

=head1 DESCRIPTION

This package provides subroutines for dealing with characters that need to be
encoded as XML entities, and decoded in other formats. For example: C<&> needs
to be encoded as C<&amp;> in XML. The subroutines have the same signatures and
the same names as those in the commonly-used module L<HTML::Entities>. They are
re-implemented here to avoid introducing dependencies.
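
As a rough illustration of the encoding step, the sketch below escapes
characters with a single regex substitution. It is not this module's
implementation, which scans character by character and leaves a C<&> alone
when it already begins a known entity. The table, the helper name
C<sketch_encode>, and its default character set are assumptions made for
this example.

```perl
use strict;
use warnings;

# Assumed table: the five entities predefined by XML. The module's own
# lookup tables are not shown in this excerpt and may differ.
my %char2entity = (
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    q{'} => '&apos;',
);

# Hypothetical helper: escape each character of $string that appears in
# $chars (default: every key of the table above).
sub sketch_encode {
    my ( $string, $chars ) = @_;
    $chars = join '', keys %char2entity unless defined $chars;
    return $string unless length $chars;
    my $class = quotemeta $chars;

    # Single pass, so the '&' of a freshly inserted entity is not re-escaped.
    $string =~ s/([$class])/exists $char2entity{$1} ? $char2entity{$1} : $1/ge;
    return $string;
}

print sketch_encode('fish & chips'), "\n";      # fish &amp; chips
print sketch_encode('a > b & c', '>'), "\n";    # a &gt; b & c
```

Unlike the module's scan, this sketch would escape the C<&> of an entity
that is already present in the input, so it should only be fed raw text.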

=head1 SUBROUTINES

The following subroutines are utility functions that can be imported using:

 use Bio::Phylo::NeXML::Entities '/entities/';

=over

=item encode_entities

Encodes problematic characters as XML entities

 Type    : Utility function
 Title   : encode_entities
 Usage   : my $encoded = encode_entities('string with & or >','>&')
 Function: Encodes entities in first argument string
 Returns : Modified string
 Args    : Required, first argument: a string to encode
           Optional, second argument: a string that specifies
           which characters to encode

=item decode_entities

Decodes XML entities into the characters they code for

 Type    : Utility function
 Title   : decode_entities
 Usage   : my $decoded = decode_entities('string with &amp; or &gt;')
 Function: Decodes encoded entities in argument string(s)
 Returns : List of decoded strings in list context,
           the first decoded string in scalar context
 Args    : One or more encoded strings

=back
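
The list versus scalar return behaviour of C<decode_entities> can be
sketched as follows. The regex-based C<sketch_decode> and its five-entry
table are stand-ins invented for this example, not the module's own code or
data. Only the C<wantarray> return mirrors the source above.

```perl
use strict;
use warnings;

# Assumed table: the five entities predefined by XML.
my %entity2char = (
    '&amp;'  => '&',
    '&lt;'   => '<',
    '&gt;'   => '>',
    '&quot;' => '"',
    '&apos;' => q{'},
);

# Hypothetical helper: decode known entities in each argument and leave
# unknown ones (such as '&bogus;') untouched.
sub sketch_decode {
    my @results;
    for my $string (@_) {
        ( my $copy = $string )
          =~ s/(&\w+;)/exists $entity2char{$1} ? $entity2char{$1} : $1/ge;
        push @results, $copy;
    }

    # List context gets every string, scalar context only the first.
    return wantarray ? @results : $results[0];
}

my @all   = sketch_decode( 'a &lt; b', 'c &amp; d' );    # ('a < b', 'c & d')
my $first = sketch_decode( 'a &lt; b', 'c &amp; d' );    # 'a < b'
print join( ' | ', @all ), "\n", $first, "\n";
```

Calling the function in scalar context with several arguments silently
drops all but the first result, so list context is the safer choice when
decoding more than one string.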

=head1 SEE ALSO

There is a mailing list at L<https://groups.google.com/forum/#!forum/bio-phylo> 
for any user or developer questions and discussions.

=over

=item L<Bio::Phylo::Manual>

Also see the manual: L<Bio::Phylo::Manual> and L<http://rutgervos.blogspot.com>.

=back

=head1 CITATION

If you use Bio::Phylo in published research, please cite it:

B<Rutger A Vos>, B<Jason Caravas>, B<Klaas Hartmann>, B<Mark A Jensen>
and B<Chase Miller>, 2011. Bio::Phylo - phyloinformatic analysis using Perl.
I<BMC Bioinformatics> B<12>:63.
L<http://dx.doi.org/10.1186/1471-2105-12-63>



=cut


