Algorithm-Diff
lib/Algorithm/Diff.pm
    @diffs     = diff( \@seq1, \@seq2 );
    $diffs_ref = diff( \@seq1, \@seq2 );
C<diff> computes the smallest set of additions and deletions necessary
to turn the first sequence into the second, and returns a description
of these changes. The description is a list of I<hunks>; each hunk
represents a contiguous section of items which should be added,
deleted, or replaced. (Hunks containing unchanged items are not
included.)
The return value of C<diff> is a list of hunks, or, in scalar context, a
reference to such a list. If there are no differences, the list will be
empty.
Here is an example. Calling C<diff> for the following two sequences:
    a b c e h j l m n p
    b c d e f j k l m r s t
would produce the following list:
    (
      [ [ '-', 0, 'a' ] ],
      [ [ '+', 2, 'd' ] ],
      [ [ '-', 4, 'h' ],
        [ '+', 4, 'f' ] ],
      [ [ '+', 6, 'k' ] ],
      [ [ '-', 8, 'n' ],
        [ '-', 9, 'p' ],
        [ '+', 9, 'r' ],
        [ '+', 10, 's' ],
        [ '+', 11, 't' ] ],
    )
There are five hunks here. The first hunk says that the C<a> at
position 0 of the first sequence should be deleted (C<->). The second
hunk says that the C<d> at position 2 of the second sequence should
be inserted (C<+>). The third hunk says that the C<h> at position 4
of the first sequence should be removed and replaced with the C<f>
from position 4 of the second sequence. And so on.
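The hunk structure described above can be walked with two nested loops. The following is a minimal sketch rather than part of the module: C<describe_hunks> is a hypothetical helper introduced only for illustration, and it assumes that Algorithm::Diff is installed.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

my @seq1 = qw( a b c e h j l m n p );
my @seq2 = qw( b c d e f j k l m r s t );

# List context: diff returns one array ref per hunk.
my @diffs = diff( \@seq1, \@seq2 );

# Hypothetical helper (not provided by Algorithm::Diff): turn every
# change in every hunk into a one-line description.
sub describe_hunks {
    my @hunks = @_;
    my @lines;
    for my $i ( 0 .. $#hunks ) {
        for my $change ( @{ $hunks[$i] } ) {
            # Each change is [ sign, position, item ].
            my ( $sign, $pos, $item ) = @$change;
            my $verb = $sign eq '-' ? 'delete' : 'insert';
            my $seq  = $sign eq '-' ? 'first'  : 'second';
            push @lines,
                sprintf( 'hunk %d: %s %s (position %d of the %s sequence)',
                    $i + 1, $verb, $item, $pos, $seq );
        }
    }
    return @lines;
}

my @report = describe_hunks(@diffs);
print "$_\n" for @report;
```

A C<-> position indexes the first sequence and a C<+> position indexes the second, so the two must not be mixed when the changes are applied.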
C<diff> may be passed an optional third parameter; this is a CODE
reference to a key generation function. See L</KEY GENERATION
FUNCTIONS>.
Additional parameters, if any, will be passed to the key generation
routine.
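As an illustration of the optional parameters, here is a hedged sketch with two key generation functions. Both C<fold_case> and C<key_by_field> are hypothetical names chosen for this example; the only assumption about the module is the behavior stated above, namely that the function receives the element first, followed by any additional parameters.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

# Hypothetical key generator: compare strings case-insensitively.
sub fold_case { return lc $_[0] }

my @old = qw( Apple banana Cherry );
my @new = qw( apple BANANA cherry date );

# With folded keys only 'date' is new; the hunk still carries the
# original element, not its key.
my @folded = diff( \@old, \@new, \&fold_case );

# Hypothetical key generator taking an extra parameter: the name of
# the hash field that identifies a record.
sub key_by_field {
    my ( $row, $field ) = @_;
    return $row->{$field};
}

my @rows1 = ( { id => 1, name => 'x' },       { id => 2, name => 'y' } );
my @rows2 = ( { id => 1, name => 'renamed' }, { id => 3, name => 'z' } );

# 'id' is handed to key_by_field after each element.
my @by_id = diff( \@rows1, \@rows2, \&key_by_field, 'id' );

printf "%s %d %s\n", @$_ for map { @$_ } @folded;
printf "%s %d id=%d\n", $_->[0], $_->[1], $_->[2]{id} for map { @$_ } @by_id;
```

Because only the keys are compared, the two records with C<id> 1 count as equal even though their C<name> fields differ.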
=head2 C<sdiff>
    @sdiffs     = sdiff( \@seq1, \@seq2 );
    $sdiffs_ref = sdiff( \@seq1, \@seq2 );
C<sdiff> computes all the components needed to show two sequences
and their minimized differences side by side, just as the
Unix utility I<sdiff> does:
    same             same
    before     |     after
    old        <     -
    -          >     new
It returns a list of array refs, each holding one display
instruction. In scalar context it returns a reference
to such a list. If there are no differences, the list still has one
entry per item, each indicating that the item was unchanged.
Each display instruction consists of three elements: a modifier indicator
(C<+>: element added, C<->: element removed, C<u>: element unmodified,
C<c>: element changed), followed by the values of the old and new
elements, to be displayed side by side. For C<+> and C<->, the side with
no counterpart holds the empty string.
An C<sdiff> of the following two sequences:
    a b c e h j l m n p
    b c d e f j k l m r s t
results in
    ( [ '-', 'a', ''  ],
      [ 'u', 'b', 'b' ],
      [ 'u', 'c', 'c' ],
      [ '+', '',  'd' ],
      [ 'u', 'e', 'e' ],
      [ 'c', 'h', 'f' ],
      [ 'u', 'j', 'j' ],
      [ '+', '',  'k' ],
      [ 'u', 'l', 'l' ],
      [ 'u', 'm', 'm' ],
      [ 'c', 'n', 'r' ],
      [ 'c', 'p', 's' ],
      [ '+', '',  't' ],
    )
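The display instructions map directly onto a side-by-side listing. The sketch below is one possible rendering, not something the module provides: the C<%gutter> table and the column width are assumptions made for this example, chosen to mimic the markers of the Unix I<sdiff> utility.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(sdiff);

my @seq1 = qw( a b c e h j l m n p );
my @seq2 = qw( b c d e f j k l m r s t );

# Assumed mapping from modifier indicator to gutter character.
my %gutter = ( 'u' => ' ', 'c' => '|', '-' => '<', '+' => '>' );

# Each instruction is [ indicator, old value, new value ].
my @rows = map {
    my ( $flag, $old, $new ) = @$_;
    sprintf( '%-6s %s %s', $old, $gutter{$flag}, $new );
} sdiff( \@seq1, \@seq2 );

print "$_\n" for @rows;
```

Unlike C<diff>, the result contains unchanged items as well, so the number of rows equals the number of lines in the side-by-side view.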
C<sdiff> may be passed an optional third parameter; this is a CODE
reference to a key generation function. See L</KEY GENERATION
FUNCTIONS>.
Additional parameters, if any, will be passed to the key generation
routine.
=head2 C<compact_diff>
C<compact_diff> is much like C<sdiff>, except that it returns a far more
compact description: a single flat list of indices. An
example helps explain the format:
    my @a = qw( a b c e h j l m n p );
    my @b = qw( b c d e f j k l m r s t );
    @cdiff = compact_diff( \@a, \@b );
    # Returns:
    #   @a      @b       @a       @b
    #  start   start   values   values
    (    0,      0,   #       =
         0,      0,   #    a  !
         1,      0,   #  b c  =  b c