Alston Scott Householder was a real genius, I think. The Householder transformation he invented is a masterpiece of mathematics.
The Householder transformation is an orthogonal transformation that eliminates more than one element of a given column of a matrix at once. It builds a transformation matrix

U = I – 2 * ω * ωT

with I as the unit matrix and ω as a direction vector with length 1. It's quite a cool approach to solving a matrix equation, and a very smart one when it comes to the method of least squares for approximating some data points.
Most descriptions say that what the transformation matrix does is basically mirror a vector at the hyperplane which is orthogonal to ω and runs through the origin. In my opinion this can be a bit misleading. At least it misled me in the beginning: the vector is not mirrored on the surface of the hyperplane the way a light beam is reflected by a bathroom mirror. The hyperplane is just a plane, and it is actually the axis of the mirroring. That means the vector is flipped to the other side of this hyperplane. That's an important point, in my opinion.
The orientation of this hyperplane is chosen in such a way that all elements of the resulting vector will be 0 except the first one.
Therefore, as a first step, we have a look at a vector s which is orthogonal to ω. That means it is lying flat on the hyperplane. We multiply it by the Householder matrix and get:

U * s = (I – 2 * ω * ωT) * s = s – 2 * ω * (ωT * s) = s

Remark that ωT * s is 0 because of the orthogonality of ω and s. So the transformation does not change this vector at all.
For the second step we have a look at a vector z which is parallel to ω. It is standing upright on the hyperplane, so it can be substituted by a constant multiplied by ω: z = c * ω

U * z = (I – 2 * ω * ωT) * c * ω = c * ω – 2 * c * ω * (ωT * ω) = –c * ω = –z

with I * ω = ω and ωT * ω = 1. The transformation flips the direction of this vector.
Now any vector x can be regarded as a sum of s and z: x = s + z, and the transformation of this vector x is:

U * x = U * s + U * z = s – z

The s part of this vector, which lies flat on the hyperplane, is not affected, and the z part, which stands upright on the hyperplane, switches its sign.
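This flip-on-the-hyperplane behaviour can be checked with a few lines of code. A minimal sketch (the Reflect helper and the sample vectors are my own, not part of the article's code):

```csharp
using System;

class HouseholderDemo
{
    // Apply U = I - 2 * w * wT to a vector x without building U explicitly:
    // U * x = x - 2 * w * (wT * x)
    static double[] Reflect(double[] w, double[] x)
    {
        double dot = 0;
        for (int i = 0; i < x.Length; i++)
            dot += w[i] * x[i];
        double[] r = new double[x.Length];
        for (int i = 0; i < x.Length; i++)
            r[i] = x[i] - 2 * w[i] * dot;
        return r;
    }

    static void Main()
    {
        // w is a unit vector; the hyperplane is everything orthogonal to it
        double[] w = { 1.0, 0.0 };
        double[] s = { 0.0, 3.0 };  // lies flat on the hyperplane (orthogonal to w)
        double[] z = { 2.0, 0.0 };  // parallel to w (upright on the hyperplane)

        double[] us = Reflect(w, s); // unchanged: (0, 3)
        double[] uz = Reflect(w, z); // sign flipped: (-2, 0)
        Console.WriteLine($"U*s = ({us[0]}, {us[1]})");
        Console.WriteLine($"U*z = ({uz[0]}, {uz[1]})");
    }
}
```

Note that U never has to be built as a full matrix here; applying x – 2 * ω * (ωT * x) directly gives the same result.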
Graphically that looks like this:
With the blue rectangle marking a part of the hyperplane.
(I didn’t mark the coordinate origin. It’s where ω, x and x’ cross)
Now the interesting part is how to get the parameters of ω. For that we start with the first column of the main matrix in the matrix equation.

This column we regard as a vector m1. This vector shall be transformed into a vector γ * e1 in which only the first element is not 0. It consists of the unit vector e1 multiplied by a constant γ. So the equation for the first transformation matrix is

U1 * m1 = γ * e1

and as e1 has the length 1, γ must be the length of m1:

γ = ±sqrt(m11² + m21² + ... + mN1²)
But as a square root always has 2 solutions, γ could be positive as well as negative. So it is taken as positive if m11 is positive and negative if m11 is negative. This is done to avoid cancellation: the quantity m11 + γ appears later on, and with matching signs the two terms add up instead of nearly cancelling each other.
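A tiny sketch of this sign rule (the sample vector and the helper name are made up for illustration):

```csharp
using System;

class GammaDemo
{
    // gamma = length of m1, taken with the sign of the first element m11
    static double SignedGamma(double[] m1)
    {
        double gamma = 0;
        for (int i = 0; i < m1.Length; i++)
            gamma += m1[i] * m1[i];
        gamma = Math.Sqrt(gamma);
        return m1[0] < 0 ? -gamma : gamma;
    }

    static void Main()
    {
        double[] m1 = { 3.0, 4.0 };        // made-up example column
        double gamma = SignedGamma(m1);    // = 5, same sign as m11 = 3

        // m11 + gamma never cancels: 3 + 5 = 8;
        // with the opposite sign it would be 3 - 5 = -2, which is
        // cancellation-prone when m11 and |m1| are nearly equal
        double h1 = m1[0] + gamma;
        Console.WriteLine($"gamma = {gamma}, h1 = {h1}");
    }
}
```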
That was the first step. For the next step it might be helpful to look at the graphical situation in a 2-dimensional space:
To flip the vector x to x' we basically need to find the location of the hyperplane (the blue line here) at which x is mirrored. In a 2-dimensional space that is the angle bisector between x and x'. In a 3-dimensional space it's the same, and in higher-dimensional spaces too, even if we cannot picture that anymore. But in the end we are not interested in the hyperplane itself but in ω, which occurs in the Householder matrix. The vector ω is orthogonal to the hyperplane and points in the same direction as x – x'. As x' should be γ * e1 and x is m1, the direction of ω will be m1 – γ * e1, and as γ can be negative as well, we can just as well say m1 + γ * e1 for the direction.
Now there is a substitution: m1 + γ * e1 = h, and we need to correct the length for ω: h has the right direction but not length 1 yet, so we need to fix that. Here there is an interesting trick. The length of h is:

|h| = sqrt(h1² + h2² + ... + hN²)

and |h|² is the same as

hT * h

the transposed vector of h multiplied by h. Written with m1 + γ * e1 this is

hT * h = (m1 + γ * e1)T * (m1 + γ * e1)

and expanded

hT * h = m1T * m1 + γ * e1T * m1 + γ * m1T * e1 + γ² * e1T * e1

In the special case here we can say

m1T * m1 = γ²,  e1T * e1 = 1,  m1T * e1 = e1T * m1 = m11

and get

hT * h = γ² + 2 * γ * m11 + γ² = 2 * γ * (γ + m11)

and so

|h| = sqrt(2 * γ * (γ + m11))

and with this we get for ω

ω = (m1 + γ * e1) / sqrt(2 * γ * (γ + m11))

as a vector, and the elements

ωk = (mk1 + γ * e1k) / sqrt(2 * γ * (γ + m11))

for k = 1 ... N (with e11 = 1 and e1k = 0 for k > 1).
Now we can put all this into an algorithm to calculate the elements of U = I – 2 * ω * ωT. For a 5 * 5 matrix that looks like:
This matrix multiplied by the matrix M eliminates all the elements in the first column except the element m11:
To eliminate the elements m32, m42 and m52 in the second column, we now have to do the same again using the sub-matrix from m22 to m55.
And so on.
This I put into a C# function:
public void RunHouseholder(int col)
{
    int i, j;
    double lambda = 0;
    double beta;
    double[] w = new double[samples];
    double[,] I = new double[samples, samples];
    // calc parameters: lambda is gamma, the signed length of the column
    for (i = col; i < samples; i++)
    {
        lambda = lambda + m[i, col] * m[i, col];
    }
    lambda = Math.Sqrt(lambda);
    if (m[col, col] < 0)
        lambda = -lambda;   // same sign as m[col, col] to avoid cancellation
    // beta = |h| = sqrt(2 * gamma * (gamma + m11))
    beta = Math.Sqrt(2 * lambda * (lambda + m[col, col]));
    w[col] = (lambda + m[col, col]) / beta;
    for (i = col + 1; i < samples; i++)
        w[i] = m[i, col] / beta;
    // calc matrix parameters: r = I - 2 * w * wT
    for (i = 0; i < samples; i++) // unit matrix
    {
        for (j = 0; j < samples; j++)
        {
            I[i, j] = 0.0;
        }
        I[i, i] = 1.0;
    }
    for (i = 0; i < samples; i++)
    {
        if (i >= col)
        {
            for (j = 0; j < samples; j++)
            {
                if (j >= col)
                    r[i, j] = I[i, j] - 2 * w[i] * w[j];
                else
                    r[i, j] = I[i, j];
            }
        }
        else
        {
            for (j = 0; j < samples; j++)
                r[i, j] = I[i, j];
        }
    }
    MultMatrix(r);
}
With
public void MultMatrix(double[,] m1)
{
    int i, j, k;
    double[,] al = new double[size, size];
    double[] yl = new double[size];
    // al = m1 * m and yl = m1 * y
    for (i = 0; i < size; i++)
    {
        yl[i] = 0.0;
        for (j = 0; j < size; j++)
        {
            yl[i] = yl[i] + (y[j] * m1[i, j]);
            al[i, j] = 0.0;
            for (k = 0; k < size; k++)
            {
                al[i, j] = al[i, j] + (m[k, j] * m1[i, k]);
            }
        }
    }
    // copy the results back into m and y
    for (i = 0; i < size; i++)
    {
        y[i] = yl[i];
        for (j = 0; j < size; j++)
        {
            m[i, j] = al[i, j];
        }
    }
}
for (i = 0; i < MaxOrder - 1; i++)
    RunHouseholder(i);
And the matrix equation C''x = y can be solved by backward substitution.
public double[] Solve()
{
    int k, l;
    double[] x = new double[maxOrder];
    // backward substitution from the last row up
    for (k = maxOrder - 1; k >= 0; k--)
    {
        for (l = maxOrder - 1; l > k; l--)
        {
            y[k] = y[k] - x[l] * m[k, l];
        }
        if (m[k, k] != 0)
            x[k] = y[k] / m[k, k];
        else
            x[k] = 0;
    }
    return x;
}
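To see the whole procedure end to end, here is a compact self-contained sketch that follows the same steps as RunHouseholder, MultMatrix and Solve for a small square system. The class, the method layout and the 2 * 2 example data are my own, not the article's:

```csharp
using System;

class HouseholderSolver
{
    // Solves M * x = y for a square matrix via Householder transformations
    // and backward substitution. U is never built as a full matrix here;
    // each reflection is applied as v - 2 * w * (wT * v).
    static double[] Solve(double[,] m, double[] y)
    {
        int n = y.Length;
        for (int col = 0; col < n - 1; col++)
        {
            // gamma = length of the column from the diagonal down,
            // with the sign of the diagonal element
            double gamma = 0;
            for (int i = col; i < n; i++)
                gamma += m[i, col] * m[i, col];
            gamma = Math.Sqrt(gamma);
            if (m[col, col] < 0) gamma = -gamma;
            if (gamma == 0) continue; // column already eliminated

            // w = (m1 + gamma * e1) / |h|, with |h| = sqrt(2*gamma*(gamma + m11))
            double beta = Math.Sqrt(2 * gamma * (gamma + m[col, col]));
            double[] w = new double[n];
            w[col] = (gamma + m[col, col]) / beta;
            for (int i = col + 1; i < n; i++)
                w[i] = m[i, col] / beta;

            // apply U = I - 2 * w * wT to every column of m and to y
            for (int j = 0; j < n; j++)
            {
                double dot = 0;
                for (int i = col; i < n; i++)
                    dot += w[i] * m[i, j];
                for (int i = col; i < n; i++)
                    m[i, j] -= 2 * w[i] * dot;
            }
            double dy = 0;
            for (int i = col; i < n; i++)
                dy += w[i] * y[i];
            for (int i = col; i < n; i++)
                y[i] -= 2 * w[i] * dy;
        }

        // backward substitution on the now upper triangular system
        double[] x = new double[n];
        for (int k = n - 1; k >= 0; k--)
        {
            double s = y[k];
            for (int l = n - 1; l > k; l--)
                s -= x[l] * m[k, l];
            x[k] = m[k, k] != 0 ? s / m[k, k] : 0;
        }
        return x;
    }

    static void Main()
    {
        double[,] m = { { 2, 1 }, { 1, 3 } };
        double[] y = { 5, 10 };             // the solution is x = (1, 3)
        double[] x = Solve(m, y);
        Console.WriteLine($"x = ({x[0]:F4}, {x[1]:F4})");
    }
}
```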
A beautiful algorithm, isn't it?