Algorithm-CurveFit

 view release on metacpan or  search on metacpan

Changes  view on Meta::CPAN

Revision history for Perl extension Algorithm-CurveFit.

1.06  Thu Jan  9 17:44 2025
  - Fix formula_sub args when calculating derivatives.  The parameters
    were passed into formula_sub incorrectly in one location in the code,
    which only appeared as a bug after the change from Debian jessie to
    Debian stretch.  See the bug report RT#121352 for many more details.

1.05  Sat Jun  5 10:14 2010
  - Compile functions to Perl subs when possible for
    performance.
  - Fall back to numeric derivative using five-point stencil
    if the symbolic derivative has a pole (or otherwise breaks
    down).

Changes  view on Meta::CPAN


1.02  Thu Oct 06 21:14 2005
  - Distribution upgrade
  - Now uses Module::Build.
  - POD coverage tests.
  - Better META.yml
  - etc.

1.01  Tue Sep 13 21:58 2005
  - Added additional parameter checks and more helpful error messages.
  - Now also accepts Math::Symbolic trees as formulas.

1.00  Mon Apr 25 16:25 2005
  - original version as uploaded to CPAN

README  view on Meta::CPAN

NAME
    Algorithm::CurveFit - Nonlinear Least Squares Fitting

SYNOPSIS
      use Algorithm::CurveFit;

      # Known form of the formula
      my $formula = 'c + a * x^2';
      my $variable = 'x';
      my @xdata = read_file('xdata'); # The data corresponsing to $variable
      my @ydata = read_file('ydata'); # The data on the other axis
      my @parameters = (
          # Name    Guess   Accuracy
          ['a',     0.9,    0.00001],  # If an iteration introduces smaller
          ['c',     20,     0.00005],  # changes that the accuracy, end.
      );
      my $max_iter = 100; # maximum iterations

      my $square_residual = Algorithm::CurveFit->curve_fit(
          formula            => $formula, # may be a Math::Symbolic tree instead
          params             => \@parameters,
          variable           => $variable,
          xdata              => \@xdata,
          ydata              => \@ydata,
          maximum_iterations => $max_iter,
      );

      use Data::Dumper;
      print Dumper \@parameters;
      # Prints

README  view on Meta::CPAN

    * Increasing the number of free parameters decreases the quality and
      convergence speed.

    * Make sure that there are no correlated parameters such as in 'a + b *
      e^(c+x)'. (The example can be rewritten as 'a + b * e^c * e^x' in
      which 'c' and 'b' are basically equivalent parameters.

    The curve fitting algorithm is accessed via the 'curve_fit' subroutine.
    It requires the following parameters as 'key => value' pairs:

    formula
      The formula should be a string that can be parsed by Math::Symbolic.
      Alternatively, it can be an existing Math::Symbolic tree. Please refer
      to the documentation of that module for the syntax.

      Evaluation of the formula for a specific value of the variable
      (X-Data) and the parameters (see below) should yield the associated
      Y-Data value in case of perfect fit.

    variable
      The 'variable' is the variable in the formula that will be replaced
      with the X-Data points for evaluation. If omitted in the call to
      "curve_fit", the name 'x' is default. (Hence 'xdata'.)

    params
      The parameters are the symbols in the formula whose value is varied by
      the algorithm to find the best fit of the curve to the data. There may
      be one or more parameters, but please keep in mind that the number of
      parameters not only increases processing time, but also decreases the
      quality of the fit.

      The value of this options should be an anonymous array. This array
      should hold one anonymous array for each parameter. That array should
      hold (in order) a parameter name, an initial guess, and optionally an
      accuracy measure.

README  view on Meta::CPAN


  SUBROUTINES
    This is a list of public subroutines

    curve_fit
      This subroutine implements the curve fitting as explained in
      DESCRIPTION above.

NOTES AND CAVEATS
    * When computing the derivative symbolically using "Math::Symbolic", the
      formula simplification algorithm can sometimes fail to find the
      equivalent of "(x-x_0)/(x-x_0)". Typically, these would be hidden in a
      more complex product. The effect is that for "x -> x_0", the
      evaluation of the derivative becomes undefined.

      Since version 1.05, we fall back to numeric differentiation using
      five-point stencil in such cases. This should help with one of the
      primary complaints about the reliability of the module.

    * This module is NOT fast. For slightly better performance, the formulas
      are compiled to Perl code if possible.

SEE ALSO
    The algorithm implemented in this module was taken from:

    Eric W. Weisstein. "Nonlinear Least Squares Fitting." From MathWorld--A
    Wolfram Web Resource.
    http://mathworld.wolfram.com/NonlinearLeastSquaresFitting.html

    New versions of this module can be found on http://steffen-mueller.net

examples/examplefit.pl  view on Meta::CPAN

#!/usr/bin/perl

use strict;
use warnings;

use lib '../lib';
use Algorithm::CurveFit;

my $formula = 'b*cos(x/10)+c*sin(x/10)';
my $variable = 'x';
my @xdata;
my @ydata;

unless (@ARGV) {
    die <<HERE;
Usage: $0 DATAFILE
Fits b*cos(x/10)+c*sin(x/10) to the data in DATAFILE.
HERE
}

examples/examplefit.pl  view on Meta::CPAN

	push @ydata, $ary[1];
}
my @parameters = (
    # Name    Guess   Accuracy
    ['b',     10,    0.0001],
    ['c',     2,     0.0005],
);
my $max_iter = 100; # maximum iterations
  
my $square_residual = Algorithm::CurveFit->curve_fit(
    formula            => $formula, # may be a Math::Symbolic tree instead
    params             => \@parameters,
    variable           => $variable,
    xdata              => \@xdata,
    ydata              => \@ydata,
    maximum_iterations => $max_iter,
);
  
use Data::Dumper;
print Dumper \@parameters;
print Dumper $square_residual;

lib/Algorithm/CurveFit.pm  view on Meta::CPAN


sub curve_fit {
    shift @_ if not ref $_[0] and defined $_[0] and $_[0] eq 'Algorithm::CurveFit';

    confess('Uneven number of arguments to Algorithm::CurveFit::curve_fit.')
      if @_ % 2;

    my %args = @_;

    # Formula
    confess("Missing 'formula' parameter.") if not defined $args{formula};
    my $formula;
    if (ref($args{formula}) =~ /^Math::Symbolic/) {
        $formula = $args{formula};
    }
    else {
        eval { $formula = parse_from_string( $args{formula} ); };
        confess( "Cannot parse formula '" . $args{formula} . "'. ($@)" )
          if not defined $formula or $@;
    }

    # Variable (optional)
    my $variable = $args{variable};
    $variable = 'x' if not defined $variable;
    confess("Formula '"
          . $args{formula}
          . "' not explicitly dependent on "
          . "variable '$variable'." )
      if not grep { $_ eq $variable } $formula->explicit_signature();

    # Parameters
    my $params = $args{params};

    confess("Parameter 'params' has to be an array reference.")
      if not defined $params
      or not ref($params) eq 'ARRAY';
    my @parameters = @$params;
    confess('No parameters specified.') if not @parameters;
    confess('Individual parameters need to be array references.')

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

        confess("Weird parameter\n'"
              . Dumper($p)
              . "' Should have the format\n"
              . "[ NAME_STRING, GUESSED_VALUE, ACCURACY ]\n"
              . "With the accuracy being optional. See docs." )
          if @$p > 3
          or @$p < 2
          or grep { not defined $_ } @$p;

        confess("Formula '"
              . $args{formula}
              . "' not explicitly dependent on "
              . "parameter '"
              . $p->[0]
              . "'." )
          if not grep { $_ eq $p->[0] } $formula->explicit_signature();
    }

    # XData
    my $xdata = $args{xdata};
    confess('X-Data missing.')
      if not defined $xdata
      or not ref($xdata) eq 'ARRAY'
      or not @$xdata;
    my @xdata = @$xdata;

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

    foreach my $param (@parameters) {
        push @$param, 0 if @$param < 3;
    }

    # Array holding all first order partial derivatives of the function in respect
    # to the parameters in order.
    my @derivatives;
    my @param_names = ($variable, map {$_->[0]} @parameters);
    foreach my $param (@parameters) {
        my $deriv =
          Math::Symbolic::Operator->new( 'partial_derivative', $formula,
            $param->[0] );
        $deriv = $deriv->simplify()->apply_derivatives()->simplify();
        my ($sub, $trees) = Math::Symbolic::Compiler->compile_to_sub($deriv, \@param_names);
        if ($trees) {
            push @derivatives, $deriv; # residual trees, need to evaluate
        } else {
            push @derivatives, $sub;
        }
    }

    # if not compilable, close over a ->value call for convenience later on
    my $formula_sub = do {
        my ($sub, $trees) = Math::Symbolic::Compiler->compile_to_sub($formula, \@param_names);
        $trees
        ? sub {
              $formula->value(
                map { ($param_names[$_] => $_[$_]) } 0..$#param_names
              )
          }
        : $sub
    };

    my $dbeta;

    # Iterative approximation of the parameters
    my $iteration = 0;

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

        foreach my $param (@parameters) {
            my $deriv = $derivatives[ $pno++ ];

            my @ary;
            if (ref $deriv eq 'CODE') {
                foreach my $x ( 0 .. $#xdata ) {
                    my $xv = $xdata[$x];
                    my $value = $deriv->($xv, @par_values);
                    if (not defined $value) { # fall back to numeric five-point stencil
                        my $h = SQRT_EPS*$xv; my $t = $xv + $h; $h = $t-$xv; # numerics. Cf. NR
                        $value = $formula_sub->($xv, map { $_->[1] } @parameters)
                    }
                    push @ary, $value;
                }
            }
            else {
                $deriv = $deriv->new; # better safe than sorry
                foreach my $x ( 0 .. $#xdata ) {
                    my $xv = $xdata[$x];
                    my $value = $deriv->value(
                      $variable => $xv,
                      map { ( @{$_}[ 0, 1 ] ) } @parameters # a, guess
                    );
                    if (not defined $value) { # fall back to numeric five-point stencil
                        my $h = SQRT_EPS*$xv; my $t = $xv + $h; $h = $t-$xv; # numerics. Cf. NR
                        $value = $formula_sub->($xv, map { $_->[1] } @parameters)
                    }
                    push @ary, $value;
                }
            }
            push @cols, \@ary;
        }

        # Prepare matrix of datapoints X parameters
        my $A = Math::MatrixReal->new_from_cols( \@cols );

        # transpose
        my $AT = ~$A;
        my $M  = $AT * $A;

        # residuals
        my @beta =
          map {
            $ydata[$_] - $formula_sub->(
                $xdata[$_],
                map { $_->[1] } @parameters
              )
          } 0 .. $#xdata;
        $dbeta = Math::MatrixReal->new_from_cols( [ \@beta ] );

        my $N = $AT * $dbeta;

        # Normalize before solving => better accuracy.
        my ( $matrix, $vector ) = $M->normalize($N);

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

            my $dlambda = $x->element( $pno, 1 );
            $last = 0 if abs($dlambda) > $parameters[ $pno - 1 ][2];
            $parameters[ $pno - 1 ][1] += $dlambda;
        }
        last if $last;
    }

    # Recalculate dbeta for the squared residuals.
    my @beta =
      map {
        $ydata[$_] - $formula_sub->(
            $xdata[$_],
            map { $_->[1] } @parameters
          )
      } 0 .. $#xdata;
    $dbeta = Math::MatrixReal->new_from_cols( [ \@beta ] );

    my $square_residual = $dbeta->scalar_product($dbeta);
    return $square_residual;
}

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

__END__

=head1 NAME

Algorithm::CurveFit - Nonlinear Least Squares Fitting

=head1 SYNOPSIS

  use Algorithm::CurveFit;

  # Known form of the formula
  my $formula = 'c + a * x^2';
  my $variable = 'x';
  my @xdata = read_file('xdata'); # The data corresponsing to $variable
  my @ydata = read_file('ydata'); # The data on the other axis
  my @parameters = (
      # Name    Guess   Accuracy
      ['a',     0.9,    0.00001],  # If an iteration introduces smaller
      ['c',     20,     0.00005],  # changes that the accuracy, end.
  );
  my $max_iter = 100; # maximum iterations

  my $square_residual = Algorithm::CurveFit->curve_fit(
      formula            => $formula, # may be a Math::Symbolic tree instead
      params             => \@parameters,
      variable           => $variable,
      xdata              => \@xdata,
      ydata              => \@ydata,
      maximum_iterations => $max_iter,
  );

  use Data::Dumper;
  print Dumper \@parameters;
  # Prints

lib/Algorithm/CurveFit.pm  view on Meta::CPAN

(The example can be rewritten as 'a + b * e^c * e^x' in which 'c' and 'b' are
basically equivalent parameters.

=back

The curve fitting algorithm is accessed via the 'curve_fit' subroutine.
It requires the following parameters as 'key => value' pairs:

=over 2

=item formula

The formula should be a string that can be parsed by Math::Symbolic.
Alternatively, it can be an existing Math::Symbolic tree.
Please refer to the documentation of that module for the syntax.

Evaluation of the formula for a specific value of the variable (X-Data)
and the parameters (see below) should yield the associated Y-Data value
in case of perfect fit.

=item variable

The 'variable' is the variable in the formula that will be replaced with the
X-Data points for evaluation. If omitted in the call to C<curve_fit>, the
name 'x' is default. (Hence 'xdata'.)

=item params

The parameters are the symbols in the formula whose value is varied by the
algorithm to find the best fit of the curve to the data. There may be
one or more parameters, but please keep in mind that the number of parameters
not only increases processing time, but also decreases the quality of the fit.

The value of this options should be an anonymous array. This array should
hold one anonymous array for each parameter. That array should hold (in order)
a parameter name, an initial guess, and optionally an accuracy measure.

Example:

lib/Algorithm/CurveFit.pm  view on Meta::CPAN


=back

=head1 NOTES AND CAVEATS

=over 2

=item *

When computing the derivative symbolically using C<Math::Symbolic>, the
formula simplification algorithm can sometimes fail to find the equivalent
of C<(x-x_0)/(x-x_0)>. Typically, these would be hidden in a more complex
product. The effect is that for C<x -E<gt> x_0>, the evaluation of the
derivative becomes undefined.

Since version 1.05, we fall back to numeric differentiation
using five-point stencil in such cases. This should help with one of the
primary complaints about the reliability of the module.

=item *

This module is NOT fast.
For slightly better performance, the formulas are compiled to
Perl code if possible.

=back

=head1 SEE ALSO

The algorithm implemented in this module was taken from:

Eric W. Weisstein. "Nonlinear Least Squares Fitting." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/NonlinearLeastSquaresFitting.html

t/01basic.t  view on Meta::CPAN

          '2.02298710522281',
          '2.06016785210408',
          '2.08654762730258',
          '2.24630881136672',
          '2.24898854508052',
          '2.10667677725939',
          '2.27627593183404'
];


my $formula = 'c+a*foo^2';
my $max_iter = 100;
my $variable = 'foo';
my @parameters = (
	['a', 2, 0.0000001],
	['c', 20, 0.0000005],
);

my $sqr;
eval {
	$sqr = Algorithm::CurveFit::curve_fit(
		variable => $variable,
		formula => $formula,
		params => \@parameters,
		maximum_iterations => $max_iter,
		xdata => $xdata,
		ydata => $ydata,
	);
};
ok(!$@, "Call didn't die ($@).");
ok($parameters[0][1] > 0.2-0.1, "Param 1 not too small.");
ok($parameters[0][1] < 0.2+0.1, "Param 1 not too big.");
ok($parameters[1][1] > 2-0.5, "Param 2 not too small.");

t/02bad_deriv.t  view on Meta::CPAN

my @parameters = ( 
                [ 's', 4.0, 0.01 ],
                [ 'c', 0.01, 0.001 ],
                [ 'y0', 0.1, 0.001 ],
                [ 'x0', 1.0535, 0.01 ],
              );

use Algorithm::CurveFit;

Algorithm::CurveFit->curve_fit(
  formula            => 'sqrt( s*(x-x0)^2 + c) + y0',
  variable           => 'x',
  params             => \@parameters,
  xdata              => \@xdata,
  ydata              => \@ydata,
  maximum_iterations => 20,
);

#use Data::Dumper;
#print Dumper \@parameters;



( run in 0.701 second using v1.01-cache-2.11-cpan-3cd7ad12f66 )