Dimensional Reduction
* [[Embedding]]: Search ... Clustering ... Recommendation ... Anomaly Detection ... Classification ... Dimensional Reduction ... ...find outliers
* [[Manifold Hypothesis]]
* [[Pooling / Sub-sampling: Max, Mean]]
* [[Kernel Trick]]
* [[Softmax]]
* [https://files.knime.com/sites/default/files/inline-images/knime_seventechniquesdatadimreduction.pdf Seven Techniques for Dimensionality Reduction | KNIME]
* [https://github.com/JonTupitza/Data-Science-Process/blob/master/06-Dimensionality-Reduction.ipynb Dimensionality Reduction Techniques Jupyter Notebook] | [https://github.com/jontupitza Jon Tupitza]
* [[(Deep) Convolutional Neural Network (DCNN/CNN)]]
* [https://en.wikipedia.org/wiki/Factor_analysis Factor analysis]
* [https://en.wikipedia.org/wiki/Feature_extraction Feature extraction]
* [https://en.wikipedia.org/wiki/Feature_selection Feature selection]
* [https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Locally-linear_embedding Nonlinear dimensionality reduction | Wikipedia]
* [[Local Linear Embedding (LLE) | Embedding functions]]
To identify the most important Features to address:
* reduce the amount of computing resources required
* 2D & 3D intuition often fails in higher dimensions
* distances tend to become relatively the 'same' as the number of dimensions increases (demonstrated in the first sketch after the algorithm list)
* Algorithms:
** [[Principal Component Analysis (PCA)]] is an unsupervised linear transformation technique that helps us identify patterns in data based on the correlations between features. PCA aims to find the directions of maximum variance in high-dimensional data and project it onto a lower-dimensional feature space (see the PCA sketch after this list).
** [https://en.wikipedia.org/wiki/Independent_component_analysis Independent Component Analysis (ICA)]
** [https://en.wikipedia.org/wiki/Canonical_correlation Canonical Correlation Analysis (CCA)]
** [https://en.wikipedia.org/wiki/Linear_discriminant_analysis Linear Discriminant Analysis (LDA)] is a supervised linear transformation technique that finds the feature subspace optimizing class separability (see the LDA sketch after this list).
** [https://en.wikipedia.org/wiki/Multidimensional_scaling Multidimensional Scaling (MDS)]
** [https://en.wikipedia.org/wiki/Non-negative_matrix_factorization Non-Negative Matrix Factorization (NMF)]
** [https://en.wikipedia.org/wiki/Partial_least_squares_regression Partial Least Squares Regression (PLSR)]
** [https://en.wikipedia.org/wiki/Principal_component_regression Principal Component Regression (PCR)]
** [https://en.wikipedia.org/wiki/Projection_pursuit Projection Pursuit]
** [https://en.wikipedia.org/wiki/Sammon_mapping Sammon Mapping/Projection]
** [[Local Linear Embedding (LLE)]] creates an embedding of the dataset and tries to preserve the relationships between neighborhoods in the dataset. LLE can be thought of as a series of local PCAs that are globally compared to find the best non-linear embedding.
** [[Isomap]] Embedding is a non-linear dimensionality reduction technique that creates an embedding of the dataset and tries to preserve the relationships in the dataset. Isomap looks for a lower-dimensional embedding which maintains geodesic distances between all points.
** [[T-Distributed Stochastic Neighbor Embedding (t-SNE)]] ...similar objects are modeled by nearby points. A sketch comparing LLE, Isomap, and t-SNE follows this list.
** [[Singular Value Decomposition (SVD)]] is a linear dimensionality reduction technique.
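The distance-concentration point above can be checked directly; a minimal numpy sketch, where the point count and dimension sizes are arbitrary illustration choices:

<syntaxhighlight lang="python">
# Sketch: in high dimensions the nearest and farthest neighbors end up at
# nearly the same distance, which is why distance-based intuition fails.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):                      # dimensions for illustration
    X = rng.random((500, d))                      # 500 random points in the unit cube
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from the first point
    print(f"d={d:4d}  nearest/farthest ratio: {dists.min() / dists.max():.3f}")
</syntaxhighlight>

As d grows the printed ratio climbs toward 1, i.e. all points become roughly equidistant.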
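A minimal scikit-learn sketch of the PCA description above; the Iris dataset and the choice of two components are illustration placeholders, not part of the original:

<syntaxhighlight lang="python">
# Sketch: project the 4-D Iris measurements onto the two directions of
# maximum variance found by PCA.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_std = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)             # (150, 2) projection
print(pca.explained_variance_ratio_)        # share of variance per component
</syntaxhighlight>

The SVD entry listed above is available in scikit-learn as TruncatedSVD, which exposes the same fit_transform interface and additionally accepts sparse input.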
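For contrast, a sketch of the supervised LDA description above, again on the (placeholder) Iris dataset:

<syntaxhighlight lang="python">
# Sketch: unlike PCA, LDA uses the class labels to choose the subspace
# that best separates the classes (at most n_classes - 1 components).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(n_components=2)  # 3 classes -> at most 2 axes
X_2d = lda.fit_transform(X, y)                    # labels are required here
print(X_2d.shape)                                 # (150, 2)
</syntaxhighlight>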
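The neighborhood-preserving embeddings above (LLE, Isomap, t-SNE) share the same fit_transform interface in scikit-learn; a minimal sketch on the synthetic S-curve dataset, with all parameter values chosen only for illustration:

<syntaxhighlight lang="python">
# Sketch: embed the 3-D S-curve into 2-D with three neighborhood-preserving methods.
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap, LocallyLinearEmbedding, TSNE

X, color = make_s_curve(n_samples=1000, random_state=0)

models = [
    ("LLE", LocallyLinearEmbedding(n_neighbors=12, n_components=2)),  # local PCAs, globally aligned
    ("Isomap", Isomap(n_neighbors=12, n_components=2)),               # preserves geodesic distances
    ("t-SNE", TSNE(n_components=2, random_state=0)),                  # nearby points model similar objects
]
for name, model in models:
    print(name, model.fit_transform(X).shape)     # each yields a (1000, 2) embedding
</syntaxhighlight>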
Dimensional Reduction techniques reduce the number of input variables in training data while capturing the "essence" of the data.

Some datasets contain so many variables that they become very hard to handle. Modern systems collect data at a very detailed level because resources are abundant, so a dataset may contain thousands of variables, many of them unnecessary. In such cases it is almost impossible to identify by inspection the variables that have the most impact on a prediction. Dimensional Reduction algorithms are used in these situations; they can utilize other algorithms, such as Random Forest or Decision Tree, to identify the most important variables (see the sketch below). [https://towardsdatascience.com/10-machine-learning-algorithms-you-need-to-know-77fb0055fe0 10 Machine Learning Algorithms You need to Know | Sidath Asir @ Medium]
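A minimal sketch of that idea, assuming scikit-learn's impurity-based Random Forest importances and a synthetic placeholder dataset:

<syntaxhighlight lang="python">
# Sketch: rank variables by Random Forest importance and keep the strongest ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic placeholder: 1000 rows, 50 variables, only 5 of them informative.
X, y = make_classification(n_samples=1000, n_features=50,
                           n_informative=5, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top5 = np.argsort(forest.feature_importances_)[::-1][:5]
print("most important variable indices:", top5)
</syntaxhighlight>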
= <span id="Projection"></span>Projection =
[https://www.youtube.com/results?search_query=Dimensional+Reduction+Projection+Algorithm Youtube search...]
[https://www.google.com/search?q=Dimensional+Reduction+Projection+Algorithm+Dimension+machine+learning+ML ...Google search]
* [[Autoencoder (AE) / Encoder-Decoder]]
* [[Unsupervised]]
* [[Privacy]]
* [[Manifold Hypothesis]]
** [https://arxiv.org/pdf/1802.03426.pdf Uniform Manifold Approximation and Projection (UMAP) | L. McInnes, J. Healy, and J. Melville] ... a dimension reduction technique that can be used for visualisation similarly to [[T-Distributed Stochastic Neighbor Embedding (t-SNE) | t-SNE]], but also for general non-linear dimension reduction (a usage sketch follows this list).
*** [https://github.com/lmcinnes/umap UMAP] ...[[Python]] version
*** [https://github.com/pair-code/umap-js UMAP-JS] ...[[Javascript]] version
* [https://www.sciencedirect.com/science/article/pii/S2215016120303137 Uncovering High-dimensional Structures of Projections from Dimensionality Reduction Methods | Michael Thrun & Alfred Ultsch - ScienceDirect]
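A minimal usage sketch of the [[Python]] package linked above; it assumes the umap-learn distribution is installed, and the digits dataset and parameter values are arbitrary illustration choices:

<syntaxhighlight lang="python">
# Sketch: embed the 64-D handwritten-digits dataset into 2-D with UMAP.
import umap                              # pip install umap-learn (assumed)
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
X_2d = reducer.fit_transform(X)          # (1797, 2), suitable for a scatter plot
print(X_2d.shape)
</syntaxhighlight>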
<youtube>6BPl81wGGP8</youtube>