{"id":769,"date":"2018-07-07T08:10:09","date_gmt":"2018-07-07T08:10:09","guid":{"rendered":"http:\/\/muthu.co\/?p=769"},"modified":"2021-05-24T03:35:55","modified_gmt":"2021-05-24T03:35:55","slug":"mathematics-behind-k-mean-clustering-algorithm","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/mathematics-behind-k-mean-clustering-algorithm\/","title":{"rendered":"Mathematics behind K-Mean Clustering algorithm"},"content":{"rendered":"\n
K-Means is one of the simplest unsupervised clustering algorithms. It partitions the data into K clusters by iteratively assigning each data point to the cluster whose centroid is nearest. The result of the K-Means algorithm is:<\/p>\n\n\n\n
K-Means can be used for any type of grouping where the data has not been explicitly labeled. Some real-world examples are given below:<\/p>\n\n\n\n
Assuming we have input data points If p<\/b> = (p<\/i>1<\/sub>, p<\/i>2<\/sub>) and q<\/strong> = (q<\/i>1<\/sub>, q<\/i>2<\/sub>) then the distance is given by d(p, q) = √((p₁ − q₁)² + (p₂ − q₂)²)<\/p>\n\n\n\n Implementation:<\/strong><\/p>\n\n\n\n If each cluster centroid is denoted by ci<\/sub>, <\/em>then each data point x is assigned to the cluster whose centroid is nearest, that is, the centroid c that minimizes dist(c, x)² over all centroids<\/p>\n\n\n\nEuclidean Distance between two points in space:<\/h6>\n\n\n\n
<\/a><\/figure><\/div>\n\n\n\n
import math\n\ndef euclidean_distance(point1, point2):\n    # Straight-line (Euclidean) distance between two 2-D points given as (x, y).\n    return math.sqrt((point1[0]-point2[0])**2 + (point1[1]-point2[1])**2)<\/code><\/pre>\n\n\n\n
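As a quick sanity check of the distance formula, the sketch below extends it to points with any number of coordinates and evaluates it on a 3-4-5 right triangle. The name euclidean_distance_nd and the sample points are assumptions made for illustration; they are not part of the original implementation.

```python
import math

def euclidean_distance_nd(point1, point2):
    # Hypothetical helper: the same square-root-of-squared-differences formula,
    # applied to points with any number of coordinates.
    if len(point1) != len(point2):
        raise ValueError('points must have the same number of coordinates')
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))

# Coordinate differences are (3, 4), so the distance is sqrt(9 + 16) = 5.0
print(euclidean_distance_nd((1, 2), (4, 6)))
```

For two-dimensional points this should agree with the two-coordinate function, since the sum then has exactly two squared terms.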
Assigning each point to the nearest cluster:<\/h6>\n\n\n\n