Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

DBSCAN is a density-based clustering algorithm similar to mean-shift, but with a few notable advantages. First, it does not require a pre-set number of clusters. Second, it identifies outliers as noise, unlike mean-shift, which simply assigns them to a cluster even when the data point is very different from the rest. Third, it finds arbitrarily sized and arbitrarily shaped clusters quite well.
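To make these properties concrete, here is a minimal sketch of the DBSCAN procedure in plain Python (standard library only, Python 3.8+ for `math.dist`). It is an illustration under stated assumptions, not a reference implementation: the function names `region_query` and `dbscan`, the toy data, and the values `eps=0.5` and `min_points=3` are all hypothetical choices made for this example, and the brute-force neighbor search is only suitable for small inputs.

```python
import math

NOISE = -1        # label for points that end up in no cluster
UNVISITED = None  # internal marker for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_points):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_points:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_points:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Toy data (hypothetical): two tight groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 5.2), (4.9, 5.1), (5.2, 4.9),
    (9.0, 0.0),
]
labels = dbscan(points, eps=0.5, min_points=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Note that the number of clusters is never passed in: two clusters emerge from the density of the data, and the isolated point is labeled as noise (`-1`) rather than being forced into either group.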

The main drawback of DBSCAN is that it does not perform as well as other algorithms when clusters vary in density. This is because the appropriate settings of the distance threshold ε and minPoints for identifying neighborhood points differ from cluster to cluster when density varies, yet DBSCAN applies a single pair of values to the whole dataset. The same drawback appears with very high-dimensional data, where the distance threshold ε again becomes challenging to estimate. Source: The 5 Clustering Algorithms Data Scientists Need to Know | Towards Data Science
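The varying-density problem can be illustrated by measuring, for each point, the distance to its k-th nearest neighbor, which gives a rough idea of what ε that point's neighborhood would need. The sketch below is only an illustration: the helper `kth_neighbor_distance`, the two synthetic grids, and the values `k = 3` and `eps = 0.2` are assumptions chosen for this example and do not come from the source article.

```python
import math


def kth_neighbor_distance(points, i, k):
    """Distance from points[i] to its k-th nearest other point (brute force)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


# Hypothetical data: a tight 4x4 grid (spacing 0.1) and a loose 4x4 grid (spacing 1.0).
dense = [(0.1 * x, 0.1 * y) for x in range(4) for y in range(4)]
sparse = [(10.0 + 1.0 * x, 1.0 * y) for x in range(4) for y in range(4)]
points = dense + sparse

k = 3
dense_kdist = [kth_neighbor_distance(points, i, k) for i in range(len(dense))]
sparse_kdist = [kth_neighbor_distance(points, i, k)
                for i in range(len(dense), len(points))]

print(f"dense  k-distance range: {min(dense_kdist):.2f} to {max(dense_kdist):.2f}")
print(f"sparse k-distance range: {min(sparse_kdist):.2f} to {max(sparse_kdist):.2f}")

# An eps tuned to the dense group leaves every sparse point with no neighbors,
# so DBSCAN would label the whole sparse group as noise.
eps = 0.2
sparse_neighbors = [
    sum(1 for q in points if q != p and math.dist(p, q) <= eps) for p in sparse
]
print("sparse points with at least one neighbor:",
      sum(1 for n in sparse_neighbors if n > 0))
```

In this toy setup the dense group's k-distances are about ten times smaller than the sparse group's. An ε small enough to suit the dense group finds no neighbors at all in the sparse group, while an ε large enough for the sparse group would be far too permissive for the dense one, for example merging dense clusters that lie closer together than that ε.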