Lecture Notes  Feb 06

 

Always make plots of components

While spherical and horseshoe shapes may produce similar eigenvectors, their interpretations are very different.  A spherical plot is commonly interpreted as noise, with no linear pattern.  All further eigenvalues/eigenvectors that show the spherical shape will account for approximately the same variance.
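The claim that spherical noise gives roughly equal variances can be checked numerically. The sketch below is only an illustration: the data are made-up isotropic Gaussian noise, and the sample size and seed are arbitrary assumptions.

```python
import numpy as np

# Made-up data: 5000 points of isotropic ("spherical") noise in 3 dimensions.
rng = np.random.default_rng(0)
noise = rng.normal(size=(5000, 3))

# Eigenvalues of the sample covariance matrix, largest first.
noise_eigvals = np.linalg.eigvalsh(np.cov(noise, rowvar=False))[::-1]

# Ratio of largest to smallest eigenvalue; close to 1 when no direction dominates.
spread = noise_eigvals.max() / noise_eigvals.min()
print(noise_eigvals, spread)
```

With no preferred direction in the data, the eigenvalues come out nearly equal, so no single component stands out from the others.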

 

Organizing Results:

Each element of an eigenvector can be thought of as a weight.  It is helpful to show the results initially in a table, as shown below.

Order                       I       II      III     IV      V
Variance                    0.3     0.2     0.18    0.15
Cumulative Variance         0.3     0.5     0.68    0.83
Ranking by values in I
Ranking by values in II
Ranking by values in III

The order in the first row comes from sorting the eigenvalues in decreasing order.  The variance in the second row is the corresponding eigenvalue.  The cumulative variance is the running sum of the variances from left to right across the columns.
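The first three rows of the table can be produced along the following lines. This is a minimal sketch, not the lecture's own code: the data matrix X is made up, the function name variance_table is hypothetical, and it assumes the variance row is reported as a fraction of the total variance, which the values 0.3, 0.2, ... suggest.

```python
import numpy as np

def variance_table(X):
    """Eigenvalues (decreasing), eigenvectors, variance fractions, cumulative fractions."""
    cov = np.cov(X, rowvar=False)             # covariance matrix of the columns of X
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # indices that sort eigenvalues in decreasing order
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]               # keep eigenvector columns aligned with eigenvalues
    proportion = eigvals / eigvals.sum()      # variance as a fraction of the total (assumption)
    cumulative = np.cumsum(proportion)        # running sum across columns
    return eigvals, eigvecs, proportion, cumulative

# Made-up data: 200 observations of 5 variables with different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.5, 1.0, 0.5])

eigvals, eigvecs, proportion, cumulative = variance_table(X)
print(np.round(proportion, 2))
print(np.round(cumulative, 2))
```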

It is rarely necessary to deal with more than the first three eigenvectors, so the ranking rows in the table above cover only the first three.  The block of rows titled “Ranking by values in I” lists the weights (elements) of eigenvector I sorted in decreasing order.  Within that block, the column for eigenvector I is simply eigenvector I sorted in decreasing order, and the column for eigenvector II holds the weights of eigenvector II rearranged into the same order, so that each row still refers to a single variable.  After every eigenvector has been arranged this way across the chart, the ranking process is repeated with the rows sorted by the values in eigenvector II, and then by the values in eigenvector III.
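The ranking blocks can be built with a single sort per block. In the sketch below the weights in W and the variable names are hypothetical values chosen only to show the sorting (they are not real eigenvectors), and ranking_block is an illustrative helper, not something from the lecture.

```python
import numpy as np

def ranking_block(weights, names, by, n_show=3):
    """Rows of (variable name, weights on the first n_show eigenvectors),
    sorted in decreasing order of the weight on eigenvector number `by`."""
    order = np.argsort(weights[:, by])[::-1]   # variable indices, largest weight first
    return [(names[i], weights[i, :n_show].tolist()) for i in order]

# Hypothetical weights: rows are variables, columns are eigenvectors I, II, III.
W = np.array([
    [ 0.6,  0.1, -0.5],
    [ 0.5, -0.7,  0.2],
    [-0.2,  0.6,  0.7],
    [ 0.3,  0.4, -0.1],
])
names = ["a", "b", "c", "d"]

block_I = ranking_block(W, names, by=0)    # "Ranking by values in I"
block_II = ranking_block(W, names, by=1)   # "Ranking by values in II"

for name, row in block_I:
    print(name, row)
```

Each row keeps one variable's weights together, so reading across a row shows how that variable loads on I, II and III at once.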

The sign of an eigenvector conveys no information: an eigenvector multiplied by -1 is still an eigenvector with the same eigenvalue.  The most contrast is usually found in the first column, and contrast usually decreases across columns.  It is possible to do a multiple regression on PCA scores (try using the Y matrix from the last lecture as the independent variable).
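Regression on PCA scores, and the fact that the sign of each eigenvector is arbitrary, can be checked together. The sketch below is an illustration under assumptions: X and y are made-up data, the scores are used as the predictors, and fit is a hypothetical helper around ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))                               # made-up data
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=100)

Xc = X - X.mean(axis=0)                                     # center the variables
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
V = eigvecs[:, ::-1][:, :3]                                 # first three eigenvectors
scores = Xc @ V                                             # PCA scores

def fit(scores, y):
    """Least-squares regression of y on an intercept plus the scores."""
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef

coef, fitted = fit(scores, y)
coef_flipped, fitted_flipped = fit(-scores, y)              # every eigenvector negated
```

Negating the eigenvectors negates the slope coefficients but leaves the intercept and the fitted values unchanged, which is why the sign carries no information.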

 

***IMPORTANT NOTE: PCA operates on the covariance matrix, not the correlation matrix, so changing the scale of a variable changes the results.  There are two ways around this: the first is to standardize each variable before doing the analysis; the second is to group variables on the same scale into an initial PCA, then group all standardized variables into a final PCA.
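The effect of scale, and the first remedy (standardizing each variable), can be seen in a small experiment. This is a sketch under assumptions: the data are made up, the factor of 100 stands in for a change of units, and leading_eigvec and standardize are hypothetical helpers.

```python
import numpy as np

def leading_eigvec(M):
    """Eigenvector of the largest eigenvalue, with its sign fixed for comparison."""
    vals, vecs = np.linalg.eigh(M)
    v = vecs[:, -1]
    return v if v[np.argmax(np.abs(v))] > 0 else -v

def standardize(X):
    """Center each column and scale it to unit sample standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))                    # made-up data
X[:, 1] += 0.5 * X[:, 0]                         # correlate the first two variables
X_rescaled = X * np.array([1.0, 1.0, 100.0])     # third variable in different units

# PCA on the covariance matrix: the rescaled variable takes over component I.
v_cov = leading_eigvec(np.cov(X, rowvar=False))
v_cov_rescaled = leading_eigvec(np.cov(X_rescaled, rowvar=False))

# PCA after standardizing: the change of units no longer matters.
v_std = leading_eigvec(np.cov(standardize(X), rowvar=False))
v_std_rescaled = leading_eigvec(np.cov(standardize(X_rescaled), rowvar=False))

# Covariance of standardized data equals the correlation matrix of the original data.
same_as_corr = np.allclose(np.cov(standardize(X), rowvar=False),
                           np.corrcoef(X, rowvar=False))
```

Standardizing first is equivalent to running the analysis on the correlation matrix, which is why it removes the dependence on units.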