Linear regression fits a straight line to an arbitrary number of supporting points (x, y). The line has the form

$$y = a + b x$$

with a as the offset, where the line crosses the zero line of x (that is, the value of y at x = 0), and b as the rise (slope) of the line. The line is placed so that the sum of the squared deviations from the supporting points is as small as possible.
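The quantity being minimized can be sketched in a few lines of C#. This is only an illustration: the class name Deviation and the method name SumOfSquares are made up for this example and are not part of the original code.

```csharp
using System;

public static class Deviation
{
    // Sum of the squared vertical deviations between the points (xp[i], yp[i])
    // and the line y = a + b * x. The names used here are hypothetical.
    public static double SumOfSquares(double[] xp, double[] yp, double a, double b)
    {
        double sum = 0;
        for (int i = 0; i < xp.Length; i++)
        {
            double deviation = yp[i] - (a + b * xp[i]);
            sum += deviation * deviation;
        }
        return sum;
    }
}
```

For the points (0, 1), (1, 3), (2, 5), the line with a = 1 and b = 2 passes through every point, so the sum is 0. Shifting the offset to a = 0 leaves a deviation of 1 at each point and a sum of 3. Linear regression looks for the pair (a, b) that makes this sum smallest.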
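Basically, linear regression is a special case of the method of least squares (see Method of the least squares). If the method of least squares is set up for a polynomial of order 2 (two coefficients, a and b) with, for instance, 3 supporting points, that gives

$$a + b x_1 = y_1, \qquad a + b x_2 = y_2, \qquad a + b x_3 = y_3$$

Expressed as a matrix equation:

$$\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$

According to the method of least squares, both sides of this matrix equation are multiplied from the left by A^T, the transpose of the coefficient matrix A:

$$A^T A \begin{pmatrix} a \\ b \end{pmatrix} = A^T \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$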
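For the left side that means:

$$\begin{pmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ 1 & x_3 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}$$

This can be written as:

$$\begin{pmatrix} 3 & x_1 + x_2 + x_3 \\ x_1 + x_2 + x_3 & x_1^2 + x_2^2 + x_3^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}$$

The same pattern holds for n supporting points:

$$\begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}$$

and for the right side:

$$\begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}$$

Both sides together:

$$\begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}$$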
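The two products can also be checked numerically. The following sketch builds A^T A and A^T y row by row from the coefficient matrix A, whose i-th row is (1, x_i), so the entries come out as n, the sum of x, the sum of x squared, the sum of y and the sum of x times y. The class and method names are assumptions made for this illustration only.

```csharp
using System;

public static class NormalEquations
{
    // Builds the 2x2 matrix A^T A, where the i-th row of A is (1, xp[i]).
    public static double[,] BuildAtA(double[] xp)
    {
        double[,] ata = new double[2, 2];
        for (int i = 0; i < xp.Length; i++)
        {
            double[] row = { 1.0, xp[i] };   // i-th row of A
            for (int r = 0; r < 2; r++)
                for (int c = 0; c < 2; c++)
                    ata[r, c] += row[r] * row[c];
        }
        return ata;
    }

    // Builds the 2-element vector A^T y.
    public static double[] BuildAtY(double[] xp, double[] yp)
    {
        double[] aty = new double[2];
        for (int i = 0; i < xp.Length; i++)
        {
            aty[0] += yp[i];            // sum of y
            aty[1] += xp[i] * yp[i];    // sum of x * y
        }
        return aty;
    }
}
```

For the points (0, 1), (1, 3), (2, 5) this gives the matrix with rows (3, 3) and (3, 5) and the right-hand side (9, 13).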
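To solve this for b, the first element of the left side must be eliminated to get rid of a. This is done by multiplying the first equation by the sum of x and the second equation by n, which makes the first elements equal, and then subtracting the second equation from the first:

$$n \sum x_i \, a + \Big(\sum x_i\Big)^2 b = \sum x_i \sum y_i$$

$$n \sum x_i \, a + n \sum x_i^2 \, b = n \sum x_i y_i$$

That is

$$\Big(\Big(\sum x_i\Big)^2 - n \sum x_i^2\Big)\, b = \sum x_i \sum y_i - n \sum x_i y_i$$

and solved for b:

$$b = \frac{\sum x_i \sum y_i - n \sum x_i y_i}{\big(\sum x_i\big)^2 - n \sum x_i^2}$$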
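For a, the first equation of the matrix can be used:

$$n\, a + b \sum x_i = \sum y_i$$

Solved for a:

$$a = \frac{\sum y_i - b \sum x_i}{n}$$

To keep the sums as small as possible, the numerator and the denominator of the expression for b are both divided by n:

$$b = \frac{\frac{1}{n} \sum x_i \sum y_i - \sum x_i y_i}{\frac{1}{n} \big(\sum x_i\big)^2 - \sum x_i^2}$$

This is the form in which the linear regression formula is mostly shown in the literature.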
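A short worked example may help to check the formulas. The three points (1, 2), (2, 3) and (3, 5) are made up for illustration; b is computed from the sums divided by n, and a from the first of the two equations.

```latex
\begin{align*}
n &= 3, \quad \sum x_i = 6, \quad \sum y_i = 10, \quad \sum x_i^2 = 14, \quad \sum x_i y_i = 23 \\
b &= \frac{\frac{1}{n}\sum x_i \sum y_i - \sum x_i y_i}{\frac{1}{n}\big(\sum x_i\big)^2 - \sum x_i^2}
   = \frac{\frac{6 \cdot 10}{3} - 23}{\frac{6^2}{3} - 14}
   = \frac{20 - 23}{12 - 14} = 1.5 \\
a &= \frac{\sum y_i - b \sum x_i}{n} = \frac{10 - 1.5 \cdot 6}{3} = \frac{1}{3}
\end{align*}
```

The fitted line is y = 1/3 + 1.5 x. Its deviations at the three points are 1/6, -1/3 and 1/6, which sum to zero, as expected for a least squares line with an offset term.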
In a small C# function that’s:
private void CalcLinearReg(double[] xp, double[] yp)
{
    // samples, rise and offset are not declared in this function; they are
    // assumed to be fields of the enclosing class, with samples == xp.Length.
    double sumX = 0;
    double sumY = 0;
    double sumXX = 0;
    double sumYY = 0;   // accumulated but not needed for rise and offset
    double sumXY = 0;
    double Sxx = 0;
    double Sxy = 0;

    // Accumulate the sums over all supporting points.
    for (int i = 0; i < xp.Length; i++)
    {
        sumX = sumX + xp[i];
        sumY = sumY + yp[i];
        sumXX = sumXX + xp[i] * xp[i];
        sumYY = sumYY + yp[i] * yp[i];
        sumXY = sumXY + xp[i] * yp[i];
    }

    // Denominator and numerator of b, each divided by the number of samples.
    Sxx = sumX * sumX / samples - sumXX;
    Sxy = sumX * sumY / samples - sumXY;
    rise = Sxy / Sxx;                           // b
    offset = (sumY - rise * sumX) / samples;    // a
}
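For experimenting outside the original class, the same calculation can be sketched as a self-contained static method. The class name LinearRegression, the method name Fit, the tuple return value and the argument checks are assumptions added for this illustration; they are not part of the function above.

```csharp
using System;

public static class LinearRegression
{
    // Returns (Offset, Rise) of the least squares line y = Offset + Rise * x.
    // Hypothetical wrapper around the same sums used in CalcLinearReg.
    public static (double Offset, double Rise) Fit(double[] xp, double[] yp)
    {
        if (xp.Length != yp.Length)
            throw new ArgumentException("xp and yp must have the same length.");
        int samples = xp.Length;
        if (samples < 2)
            throw new ArgumentException("At least two supporting points are needed.");

        double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
        for (int i = 0; i < samples; i++)
        {
            sumX += xp[i];
            sumY += yp[i];
            sumXX += xp[i] * xp[i];
            sumXY += xp[i] * yp[i];
        }

        double sxx = sumX * sumX / samples - sumXX;
        double sxy = sumX * sumY / samples - sumXY;

        // If all x values are identical the rise is undefined. This is an exact
        // comparison; with rounding errors a small tolerance may be preferable.
        if (sxx == 0)
            throw new ArgumentException("All x values are identical.");

        double rise = sxy / sxx;
        double offset = (sumY - rise * sumX) / samples;
        return (offset, rise);
    }
}
```

Called as `var fit = LinearRegression.Fit(new double[] { 0, 1, 2 }, new double[] { 1, 3, 5 });`, the points lie exactly on y = 1 + 2x, so fit.Offset is 1 and fit.Rise is 2.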