Friday, March 8, 2013

Why is Principal Component Analysis always better than Factor Analysis?

Principal Component Analysis

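If x \sim (\mu, \Sigma) where \Sigma = A \Lambda A', \Lambda is the diagonal matrix of eigenvalues of \Sigma and AA' = I, let y = A'(x - \mu). Then y_i = a'_i(x - \mu), where a_i is the ith column of A, y_i is the ith principal component, and Cov(y_i, y_j) = 0 for i \neq j.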
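
The zero covariance follows directly from the decomposition. A short derivation, writing \Lambda = diag(\lambda_1, ..., \lambda_p) for the diagonal eigenvalue matrix in the decomposition of \Sigma and assuming A is square and orthogonal, so that A'A = AA' = I:

```latex
\begin{aligned}
\mathrm{Cov}(y) &= A'\,\mathrm{Cov}(x - \mu)\,A = A' \Sigma A = A'(A \Lambda A')A = \Lambda, \\
\mathrm{Var}(y_i) &= \lambda_i, \qquad \mathrm{Cov}(y_i, y_j) = 0 \quad (i \neq j), \\
\sum_{i=1}^{p} \mathrm{Var}(y_i) &= \mathrm{tr}(\Lambda) = \mathrm{tr}(\Sigma) = \sum_{j=1}^{p} \mathrm{Var}(x_j).
\end{aligned}
```

The last line is the sense in which the full set of principal components accounts for all of the variance of x.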

Factor Analysis

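If x \sim (\mu, \Sigma) where \Sigma = \Lambda \Lambda' + \Psi, with \Lambda now the matrix of factor loadings and \Psi the diagonal matrix of unique variances, then x = \mu + \Lambda f + \eta, and the common factors f are independent of \eta.
For both methods, the proportion of the variance of x_j that is explained is called the communality of x_j. In Factor Analysis the communality is generally not equal to 1, because the unique variance in \Psi is left unexplained, while in Principal Component Analysis it is always equal to 1 when all components are kept.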
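
For reference, the communality can be written out under the factor model above. This sketch assumes the common factors are uncorrelated with unit variance, which the post does not state explicitly; \ell_{jk} denotes the loading of x_j on factor k, m the number of common factors, and \psi_j the jth diagonal entry of \Psi:

```latex
\begin{aligned}
\mathrm{Var}(x_j) &= \sum_{k=1}^{m} \ell_{jk}^2 + \psi_j = h_j^2 + \psi_j, \\
\text{communality of } x_j &= \frac{h_j^2}{\mathrm{Var}(x_j)} = 1 - \frac{\psi_j}{\mathrm{Var}(x_j)}.
\end{aligned}
```

For standardized variables Var(x_j) = 1, so the communality is simply h_j^2 = 1 - \psi_j, which is below 1 whenever \psi_j > 0.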

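We can use PROC FACTOR and the small SASHELP.CLASS dataset as an example. The only difference between the two runs is the PRIORS option: PRIORS=SMC uses the squared multiple correlations as the prior communality estimates (Factor Analysis), while PRIORS=ONE sets them all to 1 (Principal Component Analysis). In Factor Analysis, starting from one common factor, the communality is greater than 1, which is difficult to explain.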
******factor analysis***********************************;
proc factor data=sashelp.class priors=smc plots=scree ;
run;

******principal component analysis*********************;
proc factor data=sashelp.class priors=one plots=scree;
run;
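
As a cross-check on the PRIORS=ONE run, the same analysis can be requested from PROC PRINCOMP. This is a minimal sketch, not part of the original comparison; the explicit VAR statement and METHOD=PRINCIPAL are added here for clarity, on the assumption that Age, Height and Weight are the numeric variables in SASHELP.CLASS.

```sas
/* Principal components directly, from the correlation matrix (the default). */
proc princomp data=sashelp.class;
   var age height weight;   /* assumed numeric variables of SASHELP.CLASS */
run;

/* Same analysis through PROC FACTOR: principal method, prior communalities of 1. */
proc factor data=sashelp.class method=principal priors=one;
   var age height weight;
run;
```

The eigenvalue tables of the two procedures should agree. PROC PRINCOMP prints unit-length eigenvectors, while the factor pattern from PROC FACTOR is scaled by the square roots of the eigenvalues.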
As a result, Principal Component Analysis is easier to interpret and runs faster. That is why PCA is much more popular than FA in practice.