Algorithm-Diff

 view release on metacpan or  search on metacpan

lib/Algorithm/Diff.pm  view on Meta::CPAN


    @diffs     = diff( \@seq1, \@seq2 );
    $diffs_ref = diff( \@seq1, \@seq2 );

C<diff> computes the smallest set of additions and deletions necessary
to turn the first sequence into the second, and returns a description
of these changes.  The description is a list of I<hunks>; each hunk
represents a contiguous section of items which should be added,
deleted, or replaced.  (Hunks containing unchanged items are not
included.)

The return value of C<diff> is a list of hunks, or, in scalar context, a
reference to such a list.  If there are no differences, the list will be
empty.

Here is an example.  Calling C<diff> for the following two sequences:

    a b c e h j l m n p
    b c d e f j k l m r s t

would produce the following list:

    (
      [ [ '-', 0, 'a' ] ],

      [ [ '+', 2, 'd' ] ],

      [ [ '-', 4, 'h' ],
        [ '+', 4, 'f' ] ],

      [ [ '+', 6, 'k' ] ],

      [ [ '-',  8, 'n' ],
        [ '-',  9, 'p' ],
        [ '+',  9, 'r' ],
        [ '+', 10, 's' ],
        [ '+', 11, 't' ] ],
    )

There are five hunks here.  The first hunk says that the C<a> at
position 0 of the first sequence should be deleted (C<->).  The second
hunk says that the C<d> at position 2 of the second sequence should
be inserted (C<+>).  The third hunk says that the C<h> at position 4
of the first sequence should be removed and replaced with the C<f>
from position 4 of the second sequence.  And so on.

C<diff> may be passed an optional third parameter; this is a CODE
reference to a key generation function.  See L</KEY GENERATION
FUNCTIONS>.

Additional parameters, if any, will be passed to the key generation
routine.

=head2 C<sdiff>

    @sdiffs     = sdiff( \@seq1, \@seq2 );
    $sdiffs_ref = sdiff( \@seq1, \@seq2 );

C<sdiff> computes all necessary components to show two sequences
and their minimized differences side by side, just like the
Unix-utility I<sdiff> does:

    same             same
    before     |     after
    old        <     -
    -          >     new

It returns a list of array refs, each pointing to an array of
display instructions. In scalar context it returns a reference
to such a list. If there are no differences, the list will have one
entry per item, each indicating that the item was unchanged.

Display instructions consist of three elements: A modifier indicator
(C<+>: Element added, C<->: Element removed, C<u>: Element unmodified,
C<c>: Element changed) and the value of the old and new elements, to
be displayed side-by-side.

An C<sdiff> of the following two sequences:

    a b c e h j l m n p
    b c d e f j k l m r s t

results in

    ( [ '-', 'a', ''  ],
      [ 'u', 'b', 'b' ],
      [ 'u', 'c', 'c' ],
      [ '+', '',  'd' ],
      [ 'u', 'e', 'e' ],
      [ 'c', 'h', 'f' ],
      [ 'u', 'j', 'j' ],
      [ '+', '',  'k' ],
      [ 'u', 'l', 'l' ],
      [ 'u', 'm', 'm' ],
      [ 'c', 'n', 'r' ],
      [ 'c', 'p', 's' ],
      [ '+', '',  't' ],
    )

C<sdiff> may be passed an optional third parameter; this is a CODE
reference to a key generation function.  See L</KEY GENERATION
FUNCTIONS>.

Additional parameters, if any, will be passed to the key generation
routine.

=head2 C<compact_diff>

C<compact_diff> is much like C<sdiff> except it returns a much more
compact description consisting of just one flat list of indices.  An
example helps explain the format:

    my @a = qw( a b c   e  h j   l m n p      );
    my @b = qw(   b c d e f  j k l m    r s t );
    @cdiff = compact_diff( \@a, \@b );
    # Returns:
    #   @a      @b       @a       @b
    #  start   start   values   values
    (    0,      0,   #       =
         0,      0,   #    a  !
         1,      0,   #  b c  =  b c



( run in 0.508 second using v1.01-cache-2.11-cpan-df04353d9ac )