DBSCAN revisited, revisited: why and how you should (still) use DBSCAN

E Schubert, J Sander, M Ester, HP Kriegel… - ACM Transactions on …, 2017 - dl.acm.org
ACM Transactions on Database Systems (TODS), 2017
At SIGMOD 2015, an article was presented with the title “DBSCAN Revisited: Mis-Claim, Un-Fixability, and Approximation” that won the conference’s best paper award. In this technical correspondence, we want to point out some inaccuracies in the way DBSCAN was represented, and why the criticism should have been directed at the assumption about the performance of spatial index structures such as R-trees and not at an algorithm that can use such indexes. We will also discuss the relationship of DBSCAN performance and the indexability of the dataset, and discuss some heuristics for choosing appropriate DBSCAN parameters. Some indicators of bad parameters will be proposed to help guide future users of this algorithm in choosing parameters such as to obtain both meaningful results and good performance. In new experiments, we show that the new SIGMOD 2015 methods do not appear to offer practical benefits if the DBSCAN parameters are well chosen and thus they are primarily of theoretical interest. In conclusion, the original DBSCAN algorithm with effective indexes and reasonably chosen parameter values performs competitively compared to the method proposed by Gan and Tao.
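
The abstract mentions heuristics for choosing DBSCAN parameters without spelling them out. As an illustration only, the sketch below computes a sorted k-distance list, a commonly used aid for picking the radius parameter: a sharp drop in the sorted values suggests a candidate radius. That this matches the heuristics discussed in the article is an assumption. The function name `k_distances`, the toy data, and the choice k = 2 are hypothetical, and the brute-force neighbor search stands in for the spatial index a real implementation would use.

```python
import math


def k_distances(points, k):
    """Return every point's distance to its k-th nearest neighbor
    (the point itself excluded), sorted in descending order."""
    result = []
    for i, p in enumerate(points):
        # Brute-force O(n^2) search; an index would replace this step.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result, reverse=True)


# Hypothetical toy data: two tight groups of four points and one isolated point.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (50.0, 50.0),
]

kd = k_distances(points, k=2)

# The first value belongs to the isolated point; the rest belong to the groups.
for rank, d in enumerate(kd):
    print(rank, round(d, 3))
```

In this toy case the sorted list falls from a large value for the isolated point to a flat run for the grouped points, so a radius slightly above the flat run would keep both groups dense while leaving the isolated point as noise.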