### Smoothed Analysis of the k-Means MethodSmoothed Analysis of the k-Means Method

Access Restriction
Subscribed

 Author Arthur, David ♦ Manthey, Bodo ♦ Rglin, Heiko Source ACM Digital Library Content type Text Publisher Association for Computing Machinery (ACM) File Format PDF Copyright Year ©2011 Language English
 Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science Subject Keyword Data clustering ♦ K-means method ♦ Smoothed analysis Abstract The $\textit{k}-means$ method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. In order to close the gap between practical performance and theoretical analysis, the $\textit{k}-means$ method has been studied in the model of smoothed analysis. But even the smoothed analyses so far are unsatisfactory as the bounds are still super-polynomial in the number $\textit{n}$ of data points. In this article, we settle the smoothed running time of the $\textit{k}-means$ method. We show that the smoothed number of iterations is bounded by a polynomial in $\textit{n}$ and $1/\textit{σ},$ where $\textit{σ}$ is the standard deviation of the Gaussian perturbations. This means that if an arbitrary input data set is randomly perturbed, then the $\textit{k}-means$ method will run in expected polynomial time on that input set. ISSN 00045411 Age Range 18 to 22 years ♦ above 22 year Educational Use Research Education Level UG and PG Learning Resource Type Article Publisher Date 2011-10-01 Publisher Place New York e-ISSN 1557735X Journal Journal of the ACM (JACM) Volume Number 58 Issue Number 5 Page Count 31 Starting Page 1 Ending Page 31

#### Open content in new tab

Source: ACM Digital Library