A type of clustering algorithms. They initialize each data point as individual clusters, iteratively merge the closest pair of clusters until only one cluster remains, and outputs the tree built where the leaves are the data points and the internal nodes correspond to the merges performed.
Single linkage
A hierarchical clustering algorithm, where the distance between two clusters is defined to be the shortest distance between any pair of points from the two clusters.
Complete linkage
A hierarchical clustering algorithm, where the distance between two clusters is defined to be the farthest distance between any pair of points from the two clusters.
Average linkage
A hierarchical clustering algorithm, where the distance between two clusters is defined to be the average distance between pairs of points from the two clusters.
K-means clustering
A clustering algorithm that takes as input a set of unlabeled data points from Euclidean space and the number of clusters k, iterates over assigning points to their closest centers and updating the centers to be the centroids of the clusters, and outputs the obtained centers and clusters when the centers don’t change.
Distortion of a data point
The squared Euclidean distance between the point and its center.
Distortion of the data set
The sum of the distortions of all the data points in the set.