view release on metacpan or search on metacpan
Revision history for Perl extension Algorithm-CurveFit.
1.06 Thu Jan 9 17:44 2025
- Fix formula_sub args when calculating derivatives. The parameters
were passed into formula_sub incorrectly in one location in the code,
which only appeared as a bug after the change from Debian jessie to
Debian stretch. See the bug report RT#121352 for many more details.
1.05 Sat Jun 5 10:14 2010
- Compile functions to Perl subs when possible for
performance.
- Fall back to numeric derivative using five-point stencil
if the symbolic derivative has a pole (or otherwise breaks
down).
1.02 Thu Oct 06 21:14 2005
- Distribution upgrade
- Now uses Module::Build.
- POD coverage tests.
- Better META.yml
- etc.
1.01 Tue Sep 13 21:58 2005
- Added additional parameter checks and more helpful error messages.
- Now also accepts Math::Symbolic trees as formulas.
1.00 Mon Apr 25 16:25 2005
- original version as uploaded to CPAN
NAME
Algorithm::CurveFit - Nonlinear Least Squares Fitting
SYNOPSIS
use Algorithm::CurveFit;
# Known form of the formula
my $formula = 'c + a * x^2';
my $variable = 'x';
my @xdata = read_file('xdata'); # The data corresponsing to $variable
my @ydata = read_file('ydata'); # The data on the other axis
my @parameters = (
# Name Guess Accuracy
['a', 0.9, 0.00001], # If an iteration introduces smaller
['c', 20, 0.00005], # changes that the accuracy, end.
);
my $max_iter = 100; # maximum iterations
my $square_residual = Algorithm::CurveFit->curve_fit(
formula => $formula, # may be a Math::Symbolic tree instead
params => \@parameters,
variable => $variable,
xdata => \@xdata,
ydata => \@ydata,
maximum_iterations => $max_iter,
);
use Data::Dumper;
print Dumper \@parameters;
# Prints
* Increasing the number of free parameters decreases the quality and
convergence speed.
* Make sure that there are no correlated parameters such as in 'a + b *
e^(c+x)'. (The example can be rewritten as 'a + b * e^c * e^x' in
which 'c' and 'b' are basically equivalent parameters.
The curve fitting algorithm is accessed via the 'curve_fit' subroutine.
It requires the following parameters as 'key => value' pairs:
formula
The formula should be a string that can be parsed by Math::Symbolic.
Alternatively, it can be an existing Math::Symbolic tree. Please refer
to the documentation of that module for the syntax.
Evaluation of the formula for a specific value of the variable
(X-Data) and the parameters (see below) should yield the associated
Y-Data value in case of perfect fit.
variable
The 'variable' is the variable in the formula that will be replaced
with the X-Data points for evaluation. If omitted in the call to
"curve_fit", the name 'x' is default. (Hence 'xdata'.)
params
The parameters are the symbols in the formula whose value is varied by
the algorithm to find the best fit of the curve to the data. There may
be one or more parameters, but please keep in mind that the number of
parameters not only increases processing time, but also decreases the
quality of the fit.
The value of this options should be an anonymous array. This array
should hold one anonymous array for each parameter. That array should
hold (in order) a parameter name, an initial guess, and optionally an
accuracy measure.
SUBROUTINES
This is a list of public subroutines
curve_fit
This subroutine implements the curve fitting as explained in
DESCRIPTION above.
NOTES AND CAVEATS
* When computing the derivative symbolically using "Math::Symbolic", the
formula simplification algorithm can sometimes fail to find the
equivalent of "(x-x_0)/(x-x_0)". Typically, these would be hidden in a
more complex product. The effect is that for "x -> x_0", the
evaluation of the derivative becomes undefined.
Since version 1.05, we fall back to numeric differentiation using
five-point stencil in such cases. This should help with one of the
primary complaints about the reliability of the module.
* This module is NOT fast. For slightly better performance, the formulas
are compiled to Perl code if possible.
SEE ALSO
The algorithm implemented in this module was taken from:
Eric W. Weisstein. "Nonlinear Least Squares Fitting." From MathWorld--A
Wolfram Web Resource.
http://mathworld.wolfram.com/NonlinearLeastSquaresFitting.html
New versions of this module can be found on http://steffen-mueller.net
examples/examplefit.pl view on Meta::CPAN
#!/usr/bin/perl
use strict;
use warnings;
use lib '../lib';
use Algorithm::CurveFit;
my $formula = 'b*cos(x/10)+c*sin(x/10)';
my $variable = 'x';
my @xdata;
my @ydata;
unless (@ARGV) {
die <<HERE;
Usage: $0 DATAFILE
Fits b*cos(x/10)+c*sin(x/10) to the data in DATAFILE.
HERE
}
examples/examplefit.pl view on Meta::CPAN
push @ydata, $ary[1];
}
my @parameters = (
# Name Guess Accuracy
['b', 10, 0.0001],
['c', 2, 0.0005],
);
my $max_iter = 100; # maximum iterations
my $square_residual = Algorithm::CurveFit->curve_fit(
formula => $formula, # may be a Math::Symbolic tree instead
params => \@parameters,
variable => $variable,
xdata => \@xdata,
ydata => \@ydata,
maximum_iterations => $max_iter,
);
use Data::Dumper;
print Dumper \@parameters;
print Dumper $square_residual;
lib/Algorithm/CurveFit.pm view on Meta::CPAN
sub curve_fit {
shift @_ if not ref $_[0] and defined $_[0] and $_[0] eq 'Algorithm::CurveFit';
confess('Uneven number of arguments to Algorithm::CurveFit::curve_fit.')
if @_ % 2;
my %args = @_;
# Formula
confess("Missing 'formula' parameter.") if not defined $args{formula};
my $formula;
if (ref($args{formula}) =~ /^Math::Symbolic/) {
$formula = $args{formula};
}
else {
eval { $formula = parse_from_string( $args{formula} ); };
confess( "Cannot parse formula '" . $args{formula} . "'. ($@)" )
if not defined $formula or $@;
}
# Variable (optional)
my $variable = $args{variable};
$variable = 'x' if not defined $variable;
confess("Formula '"
. $args{formula}
. "' not explicitly dependent on "
. "variable '$variable'." )
if not grep { $_ eq $variable } $formula->explicit_signature();
# Parameters
my $params = $args{params};
confess("Parameter 'params' has to be an array reference.")
if not defined $params
or not ref($params) eq 'ARRAY';
my @parameters = @$params;
confess('No parameters specified.') if not @parameters;
confess('Individual parameters need to be array references.')
lib/Algorithm/CurveFit.pm view on Meta::CPAN
confess("Weird parameter\n'"
. Dumper($p)
. "' Should have the format\n"
. "[ NAME_STRING, GUESSED_VALUE, ACCURACY ]\n"
. "With the accuracy being optional. See docs." )
if @$p > 3
or @$p < 2
or grep { not defined $_ } @$p;
confess("Formula '"
. $args{formula}
. "' not explicitly dependent on "
. "parameter '"
. $p->[0]
. "'." )
if not grep { $_ eq $p->[0] } $formula->explicit_signature();
}
# XData
my $xdata = $args{xdata};
confess('X-Data missing.')
if not defined $xdata
or not ref($xdata) eq 'ARRAY'
or not @$xdata;
my @xdata = @$xdata;
lib/Algorithm/CurveFit.pm view on Meta::CPAN
foreach my $param (@parameters) {
push @$param, 0 if @$param < 3;
}
# Array holding all first order partial derivatives of the function in respect
# to the parameters in order.
my @derivatives;
my @param_names = ($variable, map {$_->[0]} @parameters);
foreach my $param (@parameters) {
my $deriv =
Math::Symbolic::Operator->new( 'partial_derivative', $formula,
$param->[0] );
$deriv = $deriv->simplify()->apply_derivatives()->simplify();
my ($sub, $trees) = Math::Symbolic::Compiler->compile_to_sub($deriv, \@param_names);
if ($trees) {
push @derivatives, $deriv; # residual trees, need to evaluate
} else {
push @derivatives, $sub;
}
}
# if not compilable, close over a ->value call for convenience later on
my $formula_sub = do {
my ($sub, $trees) = Math::Symbolic::Compiler->compile_to_sub($formula, \@param_names);
$trees
? sub {
$formula->value(
map { ($param_names[$_] => $_[$_]) } 0..$#param_names
)
}
: $sub
};
my $dbeta;
# Iterative approximation of the parameters
my $iteration = 0;
lib/Algorithm/CurveFit.pm view on Meta::CPAN
foreach my $param (@parameters) {
my $deriv = $derivatives[ $pno++ ];
my @ary;
if (ref $deriv eq 'CODE') {
foreach my $x ( 0 .. $#xdata ) {
my $xv = $xdata[$x];
my $value = $deriv->($xv, @par_values);
if (not defined $value) { # fall back to numeric five-point stencil
my $h = SQRT_EPS*$xv; my $t = $xv + $h; $h = $t-$xv; # numerics. Cf. NR
$value = $formula_sub->($xv, map { $_->[1] } @parameters)
}
push @ary, $value;
}
}
else {
$deriv = $deriv->new; # better safe than sorry
foreach my $x ( 0 .. $#xdata ) {
my $xv = $xdata[$x];
my $value = $deriv->value(
$variable => $xv,
map { ( @{$_}[ 0, 1 ] ) } @parameters # a, guess
);
if (not defined $value) { # fall back to numeric five-point stencil
my $h = SQRT_EPS*$xv; my $t = $xv + $h; $h = $t-$xv; # numerics. Cf. NR
$value = $formula_sub->($xv, map { $_->[1] } @parameters)
}
push @ary, $value;
}
}
push @cols, \@ary;
}
# Prepare matrix of datapoints X parameters
my $A = Math::MatrixReal->new_from_cols( \@cols );
# transpose
my $AT = ~$A;
my $M = $AT * $A;
# residuals
my @beta =
map {
$ydata[$_] - $formula_sub->(
$xdata[$_],
map { $_->[1] } @parameters
)
} 0 .. $#xdata;
$dbeta = Math::MatrixReal->new_from_cols( [ \@beta ] );
my $N = $AT * $dbeta;
# Normalize before solving => better accuracy.
my ( $matrix, $vector ) = $M->normalize($N);
lib/Algorithm/CurveFit.pm view on Meta::CPAN
my $dlambda = $x->element( $pno, 1 );
$last = 0 if abs($dlambda) > $parameters[ $pno - 1 ][2];
$parameters[ $pno - 1 ][1] += $dlambda;
}
last if $last;
}
# Recalculate dbeta for the squared residuals.
my @beta =
map {
$ydata[$_] - $formula_sub->(
$xdata[$_],
map { $_->[1] } @parameters
)
} 0 .. $#xdata;
$dbeta = Math::MatrixReal->new_from_cols( [ \@beta ] );
my $square_residual = $dbeta->scalar_product($dbeta);
return $square_residual;
}
lib/Algorithm/CurveFit.pm view on Meta::CPAN
__END__
=head1 NAME
Algorithm::CurveFit - Nonlinear Least Squares Fitting
=head1 SYNOPSIS
use Algorithm::CurveFit;
# Known form of the formula
my $formula = 'c + a * x^2';
my $variable = 'x';
my @xdata = read_file('xdata'); # The data corresponsing to $variable
my @ydata = read_file('ydata'); # The data on the other axis
my @parameters = (
# Name Guess Accuracy
['a', 0.9, 0.00001], # If an iteration introduces smaller
['c', 20, 0.00005], # changes that the accuracy, end.
);
my $max_iter = 100; # maximum iterations
my $square_residual = Algorithm::CurveFit->curve_fit(
formula => $formula, # may be a Math::Symbolic tree instead
params => \@parameters,
variable => $variable,
xdata => \@xdata,
ydata => \@ydata,
maximum_iterations => $max_iter,
);
use Data::Dumper;
print Dumper \@parameters;
# Prints
lib/Algorithm/CurveFit.pm view on Meta::CPAN
(The example can be rewritten as 'a + b * e^c * e^x' in which 'c' and 'b' are
basically equivalent parameters.
=back
The curve fitting algorithm is accessed via the 'curve_fit' subroutine.
It requires the following parameters as 'key => value' pairs:
=over 2
=item formula
The formula should be a string that can be parsed by Math::Symbolic.
Alternatively, it can be an existing Math::Symbolic tree.
Please refer to the documentation of that module for the syntax.
Evaluation of the formula for a specific value of the variable (X-Data)
and the parameters (see below) should yield the associated Y-Data value
in case of perfect fit.
=item variable
The 'variable' is the variable in the formula that will be replaced with the
X-Data points for evaluation. If omitted in the call to C<curve_fit>, the
name 'x' is default. (Hence 'xdata'.)
=item params
The parameters are the symbols in the formula whose value is varied by the
algorithm to find the best fit of the curve to the data. There may be
one or more parameters, but please keep in mind that the number of parameters
not only increases processing time, but also decreases the quality of the fit.
The value of this options should be an anonymous array. This array should
hold one anonymous array for each parameter. That array should hold (in order)
a parameter name, an initial guess, and optionally an accuracy measure.
Example:
lib/Algorithm/CurveFit.pm view on Meta::CPAN
=back
=head1 NOTES AND CAVEATS
=over 2
=item *
When computing the derivative symbolically using C<Math::Symbolic>, the
formula simplification algorithm can sometimes fail to find the equivalent
of C<(x-x_0)/(x-x_0)>. Typically, these would be hidden in a more complex
product. The effect is that for C<x -E<gt> x_0>, the evaluation of the
derivative becomes undefined.
Since version 1.05, we fall back to numeric differentiation
using five-point stencil in such cases. This should help with one of the
primary complaints about the reliability of the module.
=item *
This module is NOT fast.
For slightly better performance, the formulas are compiled to
Perl code if possible.
=back
=head1 SEE ALSO
The algorithm implemented in this module was taken from:
Eric W. Weisstein. "Nonlinear Least Squares Fitting." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/NonlinearLeastSquaresFitting.html
t/01basic.t view on Meta::CPAN
'2.02298710522281',
'2.06016785210408',
'2.08654762730258',
'2.24630881136672',
'2.24898854508052',
'2.10667677725939',
'2.27627593183404'
];
my $formula = 'c+a*foo^2';
my $max_iter = 100;
my $variable = 'foo';
my @parameters = (
['a', 2, 0.0000001],
['c', 20, 0.0000005],
);
my $sqr;
eval {
$sqr = Algorithm::CurveFit::curve_fit(
variable => $variable,
formula => $formula,
params => \@parameters,
maximum_iterations => $max_iter,
xdata => $xdata,
ydata => $ydata,
);
};
ok(!$@, "Call didn't die ($@).");
ok($parameters[0][1] > 0.2-0.1, "Param 1 not too small.");
ok($parameters[0][1] < 0.2+0.1, "Param 1 not too big.");
ok($parameters[1][1] > 2-0.5, "Param 2 not too small.");
t/02bad_deriv.t view on Meta::CPAN
my @parameters = (
[ 's', 4.0, 0.01 ],
[ 'c', 0.01, 0.001 ],
[ 'y0', 0.1, 0.001 ],
[ 'x0', 1.0535, 0.01 ],
);
use Algorithm::CurveFit;
Algorithm::CurveFit->curve_fit(
formula => 'sqrt( s*(x-x0)^2 + c) + y0',
variable => 'x',
params => \@parameters,
xdata => \@xdata,
ydata => \@ydata,
maximum_iterations => 20,
);
#use Data::Dumper;
#print Dumper \@parameters;