Difference between revisions of "Clustering"
| Line 21: | Line 21: | ||
***[[Hierarchical Clustering; Agglomerative (HAC) & Divisive (HDC)]] | ***[[Hierarchical Clustering; Agglomerative (HAC) & Divisive (HDC)]] | ||
***[[Hierarchical Temporal Memory (HTM)]] | ***[[Hierarchical Temporal Memory (HTM)]] | ||
| + | |||
| + | Similarity Measures for Clusters: | ||
| + | * Compare the number of item pairs that are shared between, or unique to, two clustering results. | ||
| + | * This is done by counting the item pairs that are joined in both clustering results (a), as well as the pairs joined only in the first (b) or only in the second (c) result. | ||
| + | * With these counts, a similarity coefficient such as the Jaccard index can be computed. It is defined as the size of the intersection divided by the size of the union of two sample sets: a/(a+b+c). | ||
| + | * For partitioning results, the Jaccard index measures how frequently pairs of items are joined together in both clustering results, relative to all pairs that are joined in at least one of them. | ||
| + | * Related coefficients are the Rand Index and the Adjusted Rand Index. These indices also consider the number of pairs (d) that are not joined together in either clustering result. | ||
| + | [http://girke.bioinformatics.ucr.edu/GEN242/mydoc_Rclustering_3.html#example-2 Clustering Algorithms | Data Analysis in Genome Biology] | ||
<youtube>CtKeHnfK5uA</youtube> | <youtube>CtKeHnfK5uA</youtube> | ||
| Line 28: | Line 36: | ||
<youtube>ZueoXMgCd1c</youtube> | <youtube>ZueoXMgCd1c</youtube> | ||
<youtube>nk9K2AiFmjE</youtube> | <youtube>nk9K2AiFmjE</youtube> | ||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
| − | |||
Revision as of 20:29, 22 April 2019
- Clustering - Continuous - Dimensional Reduction
- Restricted Boltzmann Machine (RBM)
- Variational Autoencoder (VAE)
- Singular Value Decomposition (SVD)
- Principal Component Analysis (PCA)
- K-Means
- Mean-Shift Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Expectation–Maximization (EM) Clustering using Gaussian Mixture Models (GMM)
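- Hierarchical; to include clustering
  - Hierarchical Clustering; Agglomerative (HAC) & Divisive (HDC)
  - Hierarchical Temporal Memory (HTM)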
Similarity Measures for Clusters:
- Compare the number of item pairs that are shared between, or unique to, two clustering results.
- This is done by counting the item pairs that are joined in both clustering results (a), as well as the pairs joined only in the first (b) or only in the second (c) result.
- With these counts, a similarity coefficient such as the Jaccard index can be computed. It is defined as the size of the intersection divided by the size of the union of two sample sets: a/(a+b+c).
- For partitioning results, the Jaccard index measures how frequently pairs of items are joined together in both clustering results, relative to all pairs that are joined in at least one of them.
- Related coefficients are the Rand Index and the Adjusted Rand Index. These indices also consider the number of pairs (d) that are not joined together in either clustering result.
Clustering Algorithms | Data Analysis in Genome Biology
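The pair counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the linked course material: the function names (`pair_counts`, `jaccard_index`, `rand_index`) and the toy label vectors are assumptions made for this example, and each clustering is assumed to be given as a list with one cluster label per item. The Rand Index is computed here in its usual pair-count form, (a+d)/(a+b+c+d); the Adjusted Rand Index is left out.

```python
from itertools import combinations


def pair_counts(labels_1, labels_2):
    """Count item pairs by how the two clusterings treat them.

    a: pair joined in both clusterings
    b: pair joined only in the first clustering
    c: pair joined only in the second clustering
    d: pair joined in neither clustering
    """
    if len(labels_1) != len(labels_2):
        raise ValueError("both clusterings must label the same items")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_1)), 2):
        same_1 = labels_1[i] == labels_1[j]
        same_2 = labels_2[i] == labels_2[j]
        if same_1 and same_2:
            a += 1
        elif same_1:
            b += 1
        elif same_2:
            c += 1
        else:
            d += 1
    return a, b, c, d


def jaccard_index(a, b, c):
    """Jaccard index a/(a+b+c); ignores pairs joined in neither clustering."""
    total = a + b + c
    # Assumption: if no pair is joined anywhere, both clusterings consist
    # only of singletons and are treated as identical.
    return a / total if total else 1.0


def rand_index(a, b, c, d):
    """Rand index (a+d)/(a+b+c+d); also credits pairs separated in both."""
    total = a + b + c + d
    # Assumption: with fewer than two items there are no pairs to disagree on.
    return (a + d) / total if total else 1.0


# Hypothetical toy data: five items, two clusterings of the same items.
first = [0, 0, 0, 1, 1]
second = [0, 0, 1, 1, 1]

a, b, c, d = pair_counts(first, second)
print("a, b, c, d =", a, b, c, d)
print("Jaccard:", jaccard_index(a, b, c))
print("Rand:", rand_index(a, b, c, d))
```

For the toy data, two pairs are joined in both clusterings, two only in the first, two only in the second, and four in neither, so the Jaccard index is 2/6 and the Rand index is 6/10. The explicit loop over all pairs is quadratic in the number of items, which keeps the sketch close to the definitions but is only practical for small data sets.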