Principal Component Analysis (PCA)

* [[Dimensional Reduction]]
* [[Unsupervised]] Learning
* [[Embedding]]
** [[T-Distributed Stochastic Neighbor Embedding (t-SNE)]]  ...non-linear
* [http://machinelearningmastery.com/calculate-principal-component-analysis-scratch-python/ How to Calculate Principal Component Analysis (PCA) from Scratch in Python | Jason Brownlee - Machine Learning Mastery]  
 
* [http://towardsdatascience.com/data-science-concepts-explained-to-a-five-year-old-ad440c7b3cbd Data Science Concepts Explained to a Five-year-old | Megan Dibble - Toward Data Science]
 

Revision as of 19:58, 26 June 2023




Principal Component Analysis (PCA) is a data reduction technique that simplifies multidimensional data sets to 2 or 3 dimensions for plotting purposes and visual variance analysis.



  1. Center (and standardize) the data
  2. First principal component axis
    1. Passes through the centroid of the data cloud
    2. Oriented so that the distance of each point to the line is minimized; equivalently, the line follows the maximum variation of the data cloud
  3. Second principal component axis
    1. Orthogonal to the first principal component
    2. Along the next-largest variation in the data
  4. The first PCA axis becomes the x-axis and the second PCA axis the y-axis
  5. Continue the process until the necessary number of principal components is obtained
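The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: it standardizes the data, takes the eigenvectors of the covariance matrix as the principal component axes (ordered by explained variance), and projects onto the top two so they become the x- and y-axes. The function name `pca_2d` and the toy data are hypothetical, chosen for the example.

```python
import numpy as np

def pca_2d(X):
    """Project data onto its first two principal components."""
    X = np.asarray(X, dtype=float)

    # Step 1: center (and standardize) the data
    X_centered = X - X.mean(axis=0)
    X_std = X_centered / X_centered.std(axis=0)

    # Steps 2-3: the principal component axes are the eigenvectors of
    # the covariance matrix; larger eigenvalue = more variance captured
    cov = np.cov(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # returned in ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort descending
    components = eigvecs[:, order[:2]]       # keep the top-2 orthogonal axes

    # Step 4: project so PC1 becomes the x-axis and PC2 the y-axis
    return X_std @ components

# Hypothetical toy data: 5 samples, 3 correlated features
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.9],
              [2.2, 2.9, 0.1],
              [1.9, 2.2, 0.8],
              [3.1, 3.0, 0.2]])
Z = pca_2d(X)
print(Z.shape)  # (5, 2): each sample reduced to 2 plottable coordinates
```

Using `np.linalg.eigh` (rather than the general `eig`) is the natural choice here because the covariance matrix is symmetric, which guarantees real eigenvalues and orthogonal eigenvectors; for step 5, slicing `order[:k]` instead of `order[:2]` keeps k components.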


[[File:principal-component-analysis-basics-scatter-plot-data-mining-1.png]]


NumXL