Algorithm-Diff-Any


lib/Algorithm/Diff/Any.pm  view on Meta::CPAN


use strict;
use warnings;
use Carp ();

use Exporter 'import';
our @EXPORT_OK = qw(
  prepare
  LCS
  LCSidx
  LCS_length
  diff
  sdiff
  compact_diff
  traverse_sequences
  traverse_balanced
);

=head1 NAME

Algorithm::Diff::Any - Perl module to find differences between files

lib/Algorithm/Diff/Any.pm  view on Meta::CPAN

=head1 DESCRIPTION

This is a simple module to select the best available implementation of the
standard C<diff> algorithm, which works by solving the Longest Common
Subsequence (LCS) problem. The algorithm is described in:
I<A Fast Algorithm for Computing Longest Common Subsequences>, CACM, vol.20,
no.5, pp.350-353, May 1977.

However, solving the LCS problem is computationally expensive; for an
arbitrary number of input sequences, it is an NP-hard problem. Even comparing
just two sequences of lengths I<n> and I<m> takes B<O(n x m)> time with the
straightforward dynamic-programming approach. Consequently, the algorithm
necessarily has some tight loops, which, for a dynamic language like Perl,
can be slow.
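
To make the B<O(n x m)> cost concrete, here is a minimal sketch of the textbook dynamic-programming computation of the LCS length. It is not the Hunt-Szymanski algorithm cited above and not this module's implementation; the name C<lcs_length_dp> and the sample inputs are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Textbook dynamic-programming LCS length: O(n x m) time, O(m) extra space.
# lcs_length_dp is a hypothetical name used for this illustration only.
sub lcs_length_dp {
    my ($seq_a, $seq_b) = @_;

    # $prev[$j] is the LCS length of the elements of @$seq_a seen so far
    # and the first $j elements of @$seq_b.
    my @prev = (0) x (@$seq_b + 1);

    for my $i (0 .. $#$seq_a) {
        my @curr = (0);    # column 0: empty prefix of @$seq_b
        for my $j (0 .. $#$seq_b) {
            if ($seq_a->[$i] eq $seq_b->[$j]) {
                # Matching elements extend the LCS of both shorter prefixes.
                push @curr, $prev[$j] + 1;
            }
            else {
                # Otherwise carry over the better of the two neighbours.
                my ($up, $left) = ($prev[ $j + 1 ], $curr[$j]);
                push @curr, $up > $left ? $up : $left;
            }
        }
        @prev = @curr;
    }
    return $prev[-1];
}

# Sample data chosen for illustration.
my @seq_a = qw(a b c e h j l m n p);
my @seq_b = qw(b c d e f j k l m r s t);
print lcs_length_dp(\@seq_a, \@seq_b), "\n";    # prints 6 (b c e j l m)
```

The two nested loops each run once per element, which is where the tight inner loop and the quadratic running time come from.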

To speed up processing, a fast (C/XS-based) implementation of the
algorithm's core loop was written. It can confer a noticeable performance
advantage (benchmarks show a 54x speedup for the C<compact_diff> routine).
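
Selecting "the best available implementation" can be sketched as trying to load each candidate backend in order of preference. The helper C<first_loadable> below is hypothetical and is not taken from this module's source; the candidate list assumes the XS backend is distributed as C<Algorithm::Diff::XS> and the pure-Perl one as C<Algorithm::Diff>.

```perl
use strict;
use warnings;

# Return the first module in the list that can be loaded, or undef if none can.
# first_loadable is a hypothetical helper, not part of Algorithm::Diff::Any.
sub first_loadable {
    my @candidates = @_;
    for my $module (@candidates) {
        # Turn Foo::Bar into Foo/Bar.pm so that require treats it as a file.
        (my $file = "$module.pm") =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

# Prefer the XS backend and fall back to the pure-Perl one.
my $backend = first_loadable('Algorithm::Diff::XS', 'Algorithm::Diff');
print defined $backend
    ? "Using $backend\n"
    : "No diff backend installed\n";
```

Wrapping C<require> in C<eval> keeps a missing optional dependency from being fatal, so the caller can decide how to proceed when only the slower backend is present.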

=head1 SYNOPSIS

  use Algorithm::Diff::Any;

lib/Algorithm/Diff/Any.pm  view on Meta::CPAN

The following functions are available for import into your namespace:

=over

=item * prepare

=item * LCS

=item * LCSidx

=item * LCS_length

=item * diff

=item * sdiff

=item * compact_diff

=item * traverse_sequences

=item * traverse_balanced

lib/Algorithm/Diff/Any.pm  view on Meta::CPAN


=item *

Neither the Pure Perl nor C/XS-based implementations of this module would
have been possible without the work of James W. Hunt (Stanford University)
and Thomas G. Szymanski (Princeton University), authors of the often-cited
paper for computing longest common subsequences.

In their abstract, they claim that a running time of B<O(n log n)> can be
expected, with a worst-case time of B<O(n^2 log n)> for two sequences of
length I<n>.

=back

=head1 SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Algorithm::Diff::Any

You can also look for information at:

t/10base.t  view on Meta::CPAN

		[ '-', 9,  'p' ],
		[ '+', 10, 's' ],
		[ '+', 11, 't' ],
	]
];

# Result of LCS must be as long as @a
my @result = Algorithm::Diff::_longestCommonSubsequence( \@a, \@b );
ok( scalar(grep { defined } @result),
	scalar(@correctResult),
	"length of _longestCommonSubsequence" );

# result has b[] line#s keyed by a[] line#
# print "result =", join(" ", map { defined($_) ? $_ : 'undef' } @result), "\n";

my @aresult = map { defined( $result[$_] ) ? $a[$_] : () } 0 .. $#result;
my @bresult =
  map { defined( $result[$_] ) ? $b[ $result[$_] ] : () } 0 .. $#result;

ok( "@aresult", $correctResult, "A results" );
ok( "@bresult", $correctResult, "B results" );
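
The index mapping checked above can be illustrated in isolation. This is a small sketch with a hand-written match map; the arrays and the C<@match> values are made up for illustration and are not output captured from C<_longestCommonSubsequence>.

```perl
use strict;
use warnings;

my @a = qw(a b c e);
my @b = qw(b c d e);

# Hypothetical match map in the shape described above: element $i holds the
# index in @b matched by $a[$i], or undef when $a[$i] is not part of the LCS.
my @match = (undef, 0, 1, 3);

# Walk the map once per side to recover the common subsequence.
my @from_a = map { defined( $match[$_] ) ? $a[$_] : () } 0 .. $#match;
my @from_b = map { defined( $match[$_] ) ? $b[ $match[$_] ] : () } 0 .. $#match;

print "@from_a\n";    # b c e
print "@from_b\n";    # b c e
```

Both projections yield the same elements, which is the property the two C<ok> checks above assert.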

t/99portability.t  view on Meta::CPAN

  if ($ENV{RELEASE_TESTING}) {
    die 'Could not load release-testing module ' . $module;
  }
  else {
    plan skip_all => $module . ' not available for testing';
  }
}

options(
  # For descriptions of what these do, consult Test::Portability::Files
  test_amiga_length   => 1,
  test_ansi_chars     => 1,
  test_case           => 1,
  test_dir_noext      => 1,
  test_dos_length     => 0,
  test_mac_length     => 1,
  test_one_dot        => 0,
  test_space          => 1,
  test_special_chars  => 1,
  test_symlink        => 1,
  test_vms_length     => 1,
  use_file_find       => 0,
);

run_tests();


