App-Chart
The following are algorithms and calculations shared among various indicators
and averages.
@menu
* Linear Regression::
* Standard Deviation::
* True Range::
@end menu
@c ---------------------------------------------------------------------------
@node Linear Regression, Standard Deviation, Common Calculations, Common Calculations
@section Linear Regression
@cindex Linear regression
@cindex Least squares, line
The ``least squares'' or ``linear regression'' algorithm produces a best
fitting straight line through the middle of a set of N data points
@m{x_1@comma{}y_1@comma{} ...@comma{} x_N@comma{}y_N, x1@comma{}y1@comma{}
...@comma{} xN@comma{}yN}. In Chart this means a set of prices Y, and dates X
(with non-trading days collapsed out).
For a possible fitted line @m{L(X)=a+bX,L(X)= a + b*X}, the vertical distance
from the line to each point is squared, and those squares are added up to form
a total deviation.
@tex
$$ SumSquares = (y_1 - L(x_1))^2 + \cdots + (y_N - L(x_N))^2 $$
@end tex
@ifnottex
@example
SumSquares = (y1 - L(x1))^2 + ... + (yN - L(xN))^2
@end example
@end ifnottex
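As a rough illustration, the SumSquares total for a candidate line can be
computed as in the following Python sketch. The function name
@code{sum_squares} and the sample data are made up for this example and are
not taken from Chart.

```python
def sum_squares(xs, ys, a, b):
    """Total squared vertical distance from the line L(x) = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data: day numbers and closing prices.
xs = [0, 1, 2, 3]
ys = [10.0, 11.0, 13.0, 14.0]

print(sum_squares(xs, ys, 10.0, 1.0))  # candidate line L(x) = 10 + x
print(sum_squares(xs, ys, 9.9, 1.4))   # a closer-fitting candidate
```

The second candidate gives the smaller total, so by this measure it's the
better fit to these four points.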
The line parameters @math{a} and @math{b} are then chosen to make SumSquares
as small as possible (hence the name ``least squares''), and there's just one
line with that smallest SumSquares. The calculation is made easier if the X
coordinates are shifted so that @math{Mean(X)=0}. With that the formulas for
@math{a} and @math{b} are
@tex
$$ a = Mean \; Y = { y_1 + \cdots + y_N \over N } $$
$$ b = { x_1 y_1 + \cdots + x_N y_N \over x_1^2 + \cdots + x_N^2 } $$
@end tex
@ifnottex
@example
             y1 + ... + yN
a = Mean Y = -------------
                   N

    x1*y1 + ... + xN*yN
b = -------------------
     x1^2 + ... + xN^2
@end example
@end ifnottex
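The formulas above might be sketched in Python as follows. This is only an
illustration under the stated assumption that the X values are first shifted
to make their mean zero; the name @code{fit_line} and the sample data are
hypothetical, and this is not Chart's own code.

```python
def fit_line(xs, ys):
    """Least squares fit, returning (a, b, mean_x).

    The fitted line is L(x) = a + b*(x - mean_x), i.e. a and b apply to
    X coordinates shifted so that Mean(X) = 0.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    shifted = [x - mean_x for x in xs]  # shifted X values have mean zero
    a = sum(ys) / n                     # a = Mean Y
    b = (sum(x * y for x, y in zip(shifted, ys))
         / sum(x * x for x in shifted))
    return a, b, mean_x

# Hypothetical sample data: day numbers and closing prices.
a, b, mean_x = fit_line([0, 1, 2, 3], [10.0, 11.0, 13.0, 14.0])
print(a, b, mean_x)
```

The sketch needs at least two different X values, since otherwise the
denominator for @math{b} is zero.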
A least squares fit is ``best'' under certain mathematical assumptions:
basically that the data points were a straight line to which normally
distributed random amounts (positive or negative) have been added. Of course
an underlying straight line is unlikely in market price data, or in economics
generally, and in particular any cyclical component invalidates the
assumptions. Even so the algorithm is quite widely used because it offers an
objective basis for fitting a line.
@cindex Linear regression slope indicator
@cindex Regression coefficient
@cindex Coefficient, regression
@anchor{Linear Regression Slope}
@subsection Slope
The slope of the linear regression line, the @math{b} above, is sometimes
called the @dfn{regression coefficient}. This is available as an indicator
(Linear Regression Slope), to show how steep the fitted trend line is. The
units are price change per day, which is negative for a downward sloping line.
This may or may not be particularly useful so it's under ``Low Priority'' in
the indicator lists.
@cindex Standard error
@anchor{Linear Regression Standard Error}
@subsection Standard Error
Standard error (stderr) is a statistical measure of how much values differ
from an assumed underlying curve. It's calculated as the quadratic mean of
the vertical distances from each point to the curve.
Standard error from a linear regression line @math{y=a+bx} is
@tex
$$ Stderr = \sqrt { (y_1 - (a + bx_1))^2 + \cdots + (y_N - (a + bx_N))^2
\over N } $$
@end tex
@ifnottex
@example
              / (y1 - (a+b*x1))^2 + ... + (yN - (a+b*xN))^2 \
Stderr = sqrt | ------------------------------------------- |
              \                      N                      /
@end example
@end ifnottex
Notice the numerator is the same SumSquares which was minimized above.
Standard error is similar to standard deviation (@pxref{Standard Deviation});
but where stddev takes differences from a horizontal line (the @math{Y} mean),
stderr here goes from the sloping linear regression line.
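The difference between the two might be sketched in Python as follows. The
function name @code{stderr_and_stddev} and the sample data are hypothetical,
and dividing by @math{N} in the standard deviation is an assumption made here
to match the stderr formula above.

```python
import math

def stderr_and_stddev(xs, ys):
    """Return (stderr, stddev) of ys, both using a divisor of N."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # X shifted so its mean is zero
    b = sum(d * y for d, y in zip(dx, ys)) / sum(d * d for d in dx)
    # Distances from the sloping regression line, then from the flat mean.
    from_line = [y - (mean_y + b * d) for d, y in zip(dx, ys)]
    from_mean = [y - mean_y for y in ys]
    stderr = math.sqrt(sum(r * r for r in from_line) / n)
    stddev = math.sqrt(sum(r * r for r in from_mean) / n)
    return stderr, stddev

# Hypothetical sample data: day numbers and closing prices.
stderr, stddev = stderr_and_stddev([0, 1, 2, 3], [10.0, 11.0, 13.0, 14.0])
print(stderr, stddev)
```

For steadily rising prices like these the stderr comes out much smaller than
the stddev, since the sloping line follows the trend and the flat mean line
doesn't.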
For reference, there's no need to actually calculate the linear regression
@math{a} and @math{b}; the stderr can be formed directly as
@tex
$$ Stderr = \sqrt { Variance(Y) - { Covariance(X,Y)^2 \over Variance(X) }} $$
@end tex
@ifnottex
@example
              /               Covariance(X,Y)^2 \
Stderr = sqrt | Variance(Y) - ----------------- |
              \                  Variance(X)    /
@end example
@end ifnottex
@noindent
where variance and covariance are as follows (and notice they simplify if
@math{X} values are chosen to make @math{Mean(X)} zero),