Next: Linear Models in R Up: stat6341 Previous: Practical Guidelines for Monte

# Least Squares and the QR Decomposition

Suppose we would like to solve the linear system of equations,

where n>p. Since there are more equations than unknowns, an exact solution does not exist. In this case we look for an approximate solution by minimizing a norm of the difference,

When the 2-norm is used here, this is referred to as the least squares problem. This section considers the case in which A is full rank. Less than full rank problems are discussed in later sections.

An algebraic expression for the solution can be derived from the normal equations,

These equations can be solved using the Cholesky decomposition of :
1. Compute .
2. Obtain Cholesky decomposition of S, , where G is lower triangular.
3. Let , solve the lower triangular system via forward substitution.
4. Solve the upper triangular system, via backward substitution.
The biggest problem associated with this approach is the increased floating point error introduced in the first step.

A more numerically stable approach is provided by the QR decomposition. Let

where Q is an orthogonal matrix and R is an upper triangular matrix. If A is full rank, then all of the diagonal elements of R are non-zero. This decomposition always exists but it is not necessarily unique. Multiplication of the ith column of Q and the ith row of R by -1 does not change the product. Therefore, the QRD is unique if a requirement is added that diagonal elements of R must be positive.

Properties of the QRD. Let

1. The columns of form an orthonormal basis for and the columns of form an orthonormal basis for . More generally, let

denote submatrices of A,Q, respectively. Then

2. , where contains the first p rows of R. Note that the remaining rows of R are 0 and that is upper triangular. This is referred to as the reduced QRD.

3. Hence, is the Cholesky factor of the symmetric matrix .
4. is the projection matrix onto .
5. is the projection matrix onto .

To show how these properties generate solutions to least squares problems, let be the full QRD of A. Use the column partition of Q defined above to obtain

is an orthogonal matrix and so

Since d does not involve x, this is minimized by solving the square, upper triangular system of equations

Note that just the reduced QRD is needed to obtain the LS solution.

Fitted values are given by

so fitted values are the projection of b onto . Residuals are given by

which is the projection of b onto .

Next: Linear Models in R Up: stat6341 Previous: Practical Guidelines for Monte
ammann
2017-12-10