Multivariate Analyses
15 February 2002
Finding latent variables in a multivariate data set
We have been studying the relationship between the original data world and the principal component world. No information is lost between these two worlds: we can go back and forth between them by using the eigenvectors of the covariance matrix of the original data.
Given the covariance matrix of the real world ($\Sigma_X$) and its spectral decomposition,
$$\Sigma_X = U \Lambda U^T,$$
the principal component world can be expressed as
$$Y = XU,$$
and we get back to the real world by
$$X = YU^T$$
($U$ is the matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues).
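The back-and-forth between the two worlds can be sketched numerically. This is a minimal illustration, assuming the mapping Y = XU and X = YU^T with U taken from the spectral decomposition of the covariance matrix; the data, the `mixing` matrix, and all variable names are hypothetical and not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of 3 correlated variables.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing
X = X - X.mean(axis=0)               # center each column

sigma_x = np.cov(X, rowvar=False)    # covariance matrix of the real world

# eigh is meant for symmetric matrices and returns eigenvalues in
# ascending order, so reorder them from largest to smallest.
eigvals, U = np.linalg.eigh(sigma_x)
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]
Lam = np.diag(eigvals)

Y = X @ U          # real world -> principal component world
X_back = Y @ U.T   # principal component world -> real world

print(np.allclose(sigma_x, U @ Lam @ U.T))  # spectral decomposition holds
print(np.allclose(X, X_back))               # the round trip loses nothing
```

The round trip is exact because U is orthogonal, so multiplying by U and then by its transpose returns the original matrix.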
There are a couple of good properties in the principal component world compared with the real world:
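1) The covariance matrix of Y is diagonal (it equals Λ), so it is non-singular whenever all eigenvalues are positive; components with zero eigenvalues can simply be dropped.
2) All principal components are orthogonal, that is, mutually uncorrelated.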
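Both properties can be checked on simulated data. The sketch below assumes Y is computed as XU from the eigenvectors of the sample covariance matrix; the random data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated data: 500 observations of 4 variables.
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))

eigvals, U = np.linalg.eigh(np.cov(X, rowvar=False))
Y = X @ U

sigma_y = np.cov(Y, rowvar=False)
off_diagonal = sigma_y - np.diag(np.diag(sigma_y))

print(np.abs(off_diagonal).max())               # numerically zero
print(np.allclose(np.diag(sigma_y), eigvals))   # variances of Y are the eigenvalues
print(np.allclose(U.T @ U, np.eye(4)))          # eigenvectors are orthonormal
```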
To predict a variable z, we can use either the real world (X) or the principal component world (Y). For example,
$$z = Xw + e$$
in the real world, whereas
$$z = Yr + e$$
in the principal component world, where w and r are vectors of weights, or coefficients, and e is a vector of random errors. These prediction formulas are ordinary multiple regression problems. Prediction in the PC world may be better than using the original X matrix because the covariance matrix of X can be singular, whereas the covariance matrix of Y, restricted to the components with non-zero eigenvalues, is always non-singular.
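When the covariance matrix of X is non-singular and all components are kept, the two regressions give the same predictions, and the coefficients are related by r = U^T w. The following sketch illustrates this with ordinary least squares, assuming Y = XU and no intercept term; the simulated data and `w_true` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical predictors (300 observations, 3 variables) and response z.
X = rng.normal(size=(300, 3)) @ np.array([[1.0, 0.5, 0.2],
                                          [0.0, 1.0, 0.4],
                                          [0.0, 0.0, 1.0]])
X = X - X.mean(axis=0)
w_true = np.array([1.5, -2.0, 0.7])
z = X @ w_true + rng.normal(scale=0.3, size=300)

# Principal component world.
_, U = np.linalg.eigh(np.cov(X, rowvar=False))
Y = X @ U

# Least-squares fits in each world.
w_hat, *_ = np.linalg.lstsq(X, z, rcond=None)
r_hat, *_ = np.linalg.lstsq(Y, z, rcond=None)

print(np.allclose(X @ w_hat, Y @ r_hat))   # identical predictions
print(np.allclose(r_hat, U.T @ w_hat))     # r = U^T w
```

The advantage of the PC world appears only when some components are dropped, either because their eigenvalues are zero or because they explain little variability.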
When the covariance matrix of X is singular, some of its eigenvalues are zero; in other words, some of the x variables are linear combinations of others. Consequently, the principal components with zero eigenvalues carry no variance, and we don’t need them for the prediction.
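A small simulated case makes the singular situation concrete. In this sketch the third variable is constructed as the sum of the first two, which is an assumption chosen for illustration; the tolerance `tol` used to decide that an eigenvalue is zero is likewise a hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(3)

x1 = rng.normal(size=400)
x2 = rng.normal(size=400)
X = np.column_stack([x1, x2, x1 + x2])   # third variable is redundant

sigma_x = np.cov(X, rowvar=False)
eigvals, U = np.linalg.eigh(sigma_x)
eigvals, U = eigvals[::-1], U[:, ::-1]   # largest eigenvalue first

# Treat eigenvalues below a small relative tolerance as zero.
tol = 1e-10 * eigvals.max()
keep = eigvals > tol

Y_kept = X @ U[:, keep]                  # drop the zero-variance component
sigma_y = np.cov(Y_kept, rowvar=False)

print(eigvals)                           # the last eigenvalue is numerically zero
print(keep.sum())                        # number of components retained
print(np.linalg.cond(sigma_y))           # small condition number: safely invertible
```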
If we let $\lambda$ be the vector of eigenvalues of the covariance matrix of X (sorted from largest to smallest), we may decide that only the first few eigenvalues are important; say, more than 90% of the total variability is explained by the first two principal components. Then we may use only the first two principal components for the prediction of z.
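The 90% rule can be written as a cumulative sum over the eigenvalues. The eigenvalues below are made-up numbers chosen only to illustrate the calculation, and the 0.90 threshold follows the example in the text.

```python
import numpy as np

# Hypothetical eigenvalues, sorted from largest to smallest.
eigvals = np.array([5.0, 4.2, 0.5, 0.3])

explained = eigvals / eigvals.sum()      # share of total variability per component
cumulative = np.cumsum(explained)        # share explained by the first k components

# Smallest k whose cumulative share exceeds 90%.
k = int(np.argmax(cumulative > 0.90)) + 1

print(cumulative)
print(k)   # 2 for these values
```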
Let’s rewrite the spectral decomposition of the covariance matrix as follows:
$$\Sigma_X = \sum_i \lambda_i u_i u_i^T = \sum_i \Sigma_i,$$
where $u_i$ is the i-th eigenvector and $\lambda_i$ is the corresponding i-th eigenvalue. The i-th covariance matrix $\Sigma_i = \lambda_i u_i u_i^T$ is called the theory matrix.
By using the first few, say two, eigenvalues and eigenvectors, we can write a reduced covariance matrix of X:
$$\Sigma_{1,2} = \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T = \Sigma_1 + \Sigma_2.$$
This covariance matrix is called the theory partition matrix.
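The decomposition into theory matrices can be verified term by term. This sketch assumes each theory matrix is the rank-one product of an eigenvalue with the outer product of its eigenvector; the data and the names `theory` and `sigma_12` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical correlated data: 250 observations of 4 variables.
X = rng.normal(size=(250, 4)) @ rng.normal(size=(4, 4))
sigma_x = np.cov(X, rowvar=False)

eigvals, U = np.linalg.eigh(sigma_x)
eigvals, U = eigvals[::-1], U[:, ::-1]   # largest eigenvalue first

# One theory matrix per eigenpair (the rows of U.T are the eigenvectors).
theory = [lam * np.outer(u, u) for lam, u in zip(eigvals, U.T)]

# Theory partition matrix built from the first two eigenpairs.
sigma_12 = theory[0] + theory[1]

print(np.allclose(sum(theory), sigma_x))        # theory matrices add up to the covariance
print(np.trace(sigma_12) / np.trace(sigma_x))   # share of variability retained
```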
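Taking this concept further, we can conduct a principal component analysis by using this partitioned covariance matrix:
$$Y_{1,2} = X U_{1,2},$$
where $U_{1,2}$ is the eigenvector matrix containing only the first two columns. Unfortunately, we cannot post-multiply this equation by $U_{1,2}^T$ to recover X, because $U_{1,2}$ is not square and so $U_{1,2} U_{1,2}^T$ is not the identity matrix. The post-multiplication results in a new X matrix:
$$\hat{X} = Y_{1,2} U_{1,2}^T = X U_{1,2} U_{1,2}^T.$$
This is an approximation of X.
The covariance matrix of $\hat{X}$ should be identical to the theory partition matrix:
$$\Sigma_{\hat{X}} = \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T.$$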
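The claim about the approximated X matrix can be checked directly. The sketch below assumes the approximation is obtained by projecting onto the first two eigenvectors and mapping back; the simulated data and the names `U12`, `Y12`, and `X_hat` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical correlated data: 250 observations of 4 variables.
X = rng.normal(size=(250, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)

eigvals, U = np.linalg.eigh(np.cov(X, rowvar=False))
eigvals, U = eigvals[::-1], U[:, ::-1]   # largest eigenvalue first

U12 = U[:, :2]          # first two eigenvectors (4 x 2, not square)
Y12 = X @ U12           # first two principal components
X_hat = Y12 @ U12.T     # approximation of X

# Theory partition matrix from the first two eigenpairs.
sigma_12 = U12 @ np.diag(eigvals[:2]) @ U12.T

print(np.allclose(U12.T @ U12, np.eye(2)))    # columns are orthonormal
print(np.allclose(U12 @ U12.T, np.eye(4)))    # but this product is not the identity
print(np.allclose(np.cov(X_hat, rowvar=False), sigma_12))
```

The product of U12 with its transpose is a projection onto the space spanned by the first two eigenvectors, which is why X_hat only approximates X.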
In summary:
Principal components extract signals from the original variables and filter out noise specific to the stations where the data were collected. The first eigenvector captures effects that act on several variables at once.