k-means++.pdf


2024-04-19

k-means++: The Advantages of Careful Seeding
David Arthur, Sergei Vassilvitskii
Abstract
The k-means method is a widely used clustering technique
that seeks to minimize the average squared distance between
points in the same cluster. Although it offers no accuracy
guarantees, its simplicity and speed are very appealing in
practice. By augmenting k-means with a very simple,
randomized seeding technique, we obtain an algorithm that is
Θ(log k)-competitive with the optimal clustering. Preliminary
experiments show that our augmentation improves
both the speed and the accuracy of k-means, often quite
dramatically.
1 Introduction
Clustering is one of the classic problems in machine
learning and computational geometry. In the popular
k-means formulation, one is given an integer k and a set
of n data points in R^d. The goal is to choose k centers
so as to minimize φ, the sum of the squared distances
between each point and its closest center.
Solving this problem exactl
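
To make the objective concrete, the potential φ defined above (the sum, over all data points, of the squared distance to the closest center) can be evaluated directly for any candidate set of centers. The following is a minimal sketch in plain Python, not code from the paper; the function name `kmeans_potential` and the sample points and centers are assumptions chosen purely for illustration.

```python
def kmeans_potential(points, centers):
    """Return phi: the sum over all points of the squared Euclidean
    distance from the point to its closest center.

    Hypothetical helper for illustration; points and centers are
    sequences of equal-length coordinate sequences.
    """
    total = 0.0
    for p in points:
        # Squared distance from p to each center; keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c))
            for c in centers
        )
    return total


# Illustrative data: four points in R^2 and k = 2 candidate centers.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

print(kmeans_potential(points, centers))  # 4.0
```

Each point here lies at distance 1 from its nearest center, so φ = 4. Replacing the two centers with the single center (5, 1) raises φ to 104, which illustrates why the choice of centers matters for the objective.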

