MACS 30405: Exploring Cultural Space
University of Chicago
| | Variable 1 | … | Variable \(p\) |
|---|---|---|---|
| Observation 1 | … | … | … |
| … | … | … | … |
| Observation \(n\) | … | … | … |

Dimension: \(n \times p\) (usually \(n\) is far greater than \(p\)).

| | Sex | Age | … | Attitude on Abortion |
|---|---|---|---|---|
| Individual 1 | 0 | 25 | … | 3 |
| … | … | … | … | … |
| Individual \(n\) | 1 | 48 | … | 7 |

| | a | the | … | power | … |
|---|---|---|---|---|---|
| Document 1 | 928 | 824 | … | 8 | … |
| … | … | … | … | … | … |
| Document \(n\) | 451 | 552 | … | 5 | … |

Then, we want to find a second linear combination vector (the second principal component) that likewise maximizes the variance of the newly formed variable \(a_{21}\vec{x_1} + a_{22}\vec{x_2} + ... + a_{2p}\vec{x_p}\), subject to the condition that this variable is uncorrelated with the first newly formed variable.
Following the same logic, the \(k\)th principal component is the \(k\)th linear combination vector that maximizes the variance of the newly formed variable \(a_{k1}\vec{x_1} + a_{k2}\vec{x_2} + ... + a_{kp}\vec{x_p}\), subject to the condition that this variable is uncorrelated with all previously formed variables.
The procedure can be repeated until \(k\) reaches \(\min(n, p)\).
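
The sequential procedure above can be illustrated with a minimal NumPy sketch. Two things here are assumptions rather than part of the slides. First, the code does not search for each vector one at a time; it uses the eigendecomposition of the sample covariance matrix, a standard shortcut whose eigenvectors, ordered by decreasing eigenvalue, satisfy the same maximize-variance-while-uncorrelated conditions. Second, the function name `principal_components` and the synthetic data are hypothetical and only serve the illustration.

```python
import numpy as np


def principal_components(X):
    """Return (A, variances) for an n x p data matrix X.

    Column k of A holds the weights (a_k1, ..., a_kp) of the k-th
    principal component; variances[k] is the variance of the
    corresponding newly formed variable.
    """
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]       # reorder: largest variance first
    return eigvecs[:, order], eigvals[order]


# Synthetic data (illustration only): n observations of p correlated variables.
rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

A, variances = principal_components(X)

# Newly formed variables: column k is a_k1*x_1 + ... + a_kp*x_p (centered).
Z = (X - X.mean(axis=0)) @ A

# Off-diagonal covariances are numerically zero, so the new variables are
# mutually uncorrelated, and their variances are non-increasing in k.
cov_Z = np.cov(Z, rowvar=False)
print(np.round(cov_Z, 6))
print(variances)
```

Printing `cov_Z` should show a matrix that is diagonal up to rounding, with the diagonal entries matching `variances`. This is the uncorrelatedness condition from the text, checked for all pairs of components at once.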