Dimensional Reduction

Youtube search... ...Google search
 
* [[Kernel Trick]]
 
 
* [[Isomap]]
 
* [[Local Linear Embedding (LLE)]]
 
* [[t-Distributed Stochastic Neighbor Embedding (t-SNE)]]
 
 
* [[Softmax]]
 
 
* [http://github.com/JonTupitza/Data-Science-Process/blob/master/06-Dimensionality-Reduction.ipynb Dimensionality Reduction Techniques Jupyter Notebook] | [http://github.com/jontupitza Jon Tupitza]
 
To identify the most important Features to address:

* reduce the amount of computing resources required
* 2D & 3D intuition often fails in higher dimensions
* distances tend to become relatively the 'same' as the number of dimensions increases
 
** [http://en.wikipedia.org/wiki/Projection_pursuit Projection Pursuit]
 
 
** [http://en.wikipedia.org/wiki/Sammon_mapping Sammon Mapping/Projection]
 
 
** [[Local Linear Embedding (LLE)]]
 
** [[T-Distributed Stochastic Neighbor Embedding (t-SNE)]]  ...similar objects are modeled by nearby points
** [http://arxiv.org/pdf/1802.03426.pdf Uniform Manifold Approximation and Projection (UMAP) | L. McInnes, J. Healy, and J. Melville] ... a dimension reduction technique that can be used for visualisation similarly to [[T-Distributed Stochastic Neighbor Embedding (t-SNE) | t-SNE]], but also for general non-linear dimension reduction (see the usage sketch after this list).
*** [http://github.com/lmcinnes/umap UMAP]...[[Python]] version
*** [http://github.com/pair-code/umap-js UMAP-JS] ...[[Javascript]] version
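The reference implementation linked above ships as the umap-learn package and follows the scikit-learn fit/transform convention. A minimal usage sketch, assuming umap-learn and scikit-learn are installed; the digits dataset and the parameter values are illustrative assumptions, not taken from the sources above:

<syntaxhighlight lang="python">
# Minimal UMAP sketch, assuming the umap-learn package
# (http://github.com/lmcinnes/umap) is installed; the digits
# dataset is only an illustrative choice.
from sklearn.datasets import load_digits
import umap

digits = load_digits()                      # 1,797 samples x 64 pixel features

# n_components=2 projects to 2-D for visualisation; n_neighbors and
# min_dist trade off local vs. global structure in the embedding.
reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1,
                    random_state=42)
embedding = reducer.fit_transform(digits.data)

print(embedding.shape)                      # (1797, 2)
</syntaxhighlight>

Larger n_neighbors values preserve more of the global layout at the cost of fine local detail; smaller values do the opposite.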
  
 
Related:
 
 
Some datasets contain so many variables that they become very hard to handle. Systems now often collect data at a very detailed level because storage and computing resources are plentiful, so a dataset may contain thousands of variables, many of them unnecessary. In that situation it is almost impossible to identify by hand the variables that have the most impact on a prediction. Dimensional Reduction algorithms are used in these cases; they can also leverage other algorithms, such as Random Forest or Decision Tree, to identify the most important variables. [http://towardsdatascience.com/10-machine-learning-algorithms-you-need-to-know-77fb0055fe0 10 Machine Learning Algorithms You need to Know | Sidath Asir @ Medium]
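As a concrete illustration of the Random Forest approach mentioned above: scikit-learn's RandomForestClassifier exposes a feature_importances_ attribute that can rank variables by their impact. A minimal sketch, assuming scikit-learn is installed; the breast-cancer dataset and the top-5 cut-off are illustrative assumptions:

<syntaxhighlight lang="python">
# Sketch of ranking variables by importance with a Random Forest,
# as described above; dataset and cut-off are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()                 # 569 samples x 30 features
forest = RandomForestClassifier(n_estimators=200, random_state=42)
forest.fit(data.data, data.target)

# Rank features from most to least important and show the top five;
# low-ranked variables are candidates for removal.
ranking = np.argsort(forest.feature_importances_)[::-1]
for idx in ranking[:5]:
    print(f"{data.feature_names[idx]}: "
          f"{forest.feature_importances_[idx]:.3f}")
</syntaxhighlight>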
  
* [[Principal Component Analysis (PCA)]]  ...linear projection onto the directions of greatest variance (see the sketch after this list)
 
* [[T-Distributed Stochastic Neighbor Embedding (t-SNE)]]  ...similar objects are modeled by nearby points
 
* [http://arxiv.org/pdf/1802.03426.pdf Uniform Manifold Approximation and Projection (UMAP) | L. McInnes, J. Healy, and J. Melville] ... a dimension reduction technique that can be used for visualisation similarly to [[T-Distributed Stochastic Neighbor Embedding (t-SNE) | t-SNE]], but also for general non-linear dimension reduction.
 
** [http://github.com/lmcinnes/umap UMAP]...[[Python]] version
 
** [http://github.com/pair-code/umap-js UMAP-JS] ...[[Javascript]] version
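A minimal sketch contrasting the linear projection of PCA with the neighborhood-preserving embedding of t-SNE from the list above, using scikit-learn; the dataset and parameter choices are illustrative assumptions:

<syntaxhighlight lang="python">
# PCA vs. t-SNE on the same data; with t-SNE, similar objects end up
# modeled by nearby points. Dataset and parameters are illustrative.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = load_digits().data                      # 1,797 samples x 64 features

# PCA: linear projection onto the directions of maximum variance.
X_pca = PCA(n_components=2).fit_transform(X)

# t-SNE: non-linear embedding that keeps similar samples close together.
X_tsne = TSNE(n_components=2, perplexity=30, init="pca",
              random_state=42).fit_transform(X)

print(X_pca.shape, X_tsne.shape)            # (1797, 2) (1797, 2)
</syntaxhighlight>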
 
  
 
<youtube>YPJQydzTLwQ</youtube>
 
