# K-Means on Iris Dataset

Read my previous post to understand how the K-Means algorithm works. In this post I will run K-Means on the Iris dataset to group the flowers into our 3 classes (Iris setosa, Iris versicolor and Iris virginica) using each flower's sepal-length, sepal-width, petal-length and petal-width (our features).

Getting data:

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# The imported dataset does not have the required column names, so let's add them
colnames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
# Read the CSV into a DataFrame, assigning the column names above
irisdata = pd.read_csv(url, names=colnames)
```

A quick look at the first few rows of the data:

`irisdata.head()`

Converting the class names into numerical categories for analysis.

```python
irisdata['Class'] = pd.Categorical(irisdata["Class"])
irisdata["Class"] = irisdata["Class"].cat.codes
```
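To see which number ends up standing for which flower, here is a small sketch on a hand-made sample. The four values in `sample` are illustrative and not read from the dataset, and the variable names are my own. For string data, pandas sorts the categories, so the codes follow alphabetical order of the class names.

```python
import pandas as pd

# Illustrative class names (not loaded from the Iris file)
sample = pd.Series(["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"])

cat = pd.Categorical(sample)
codes = list(cat.codes)                      # integer code for every row
code_to_name = dict(enumerate(cat.categories))  # lookup from code back to class name

print(codes)
print(code_to_name)
```

Keeping a lookup like `code_to_name` around makes it easy to translate the numeric codes back into flower names later on.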

Preparing our dataset:

```python
X = irisdata.values[:, 0:4]
y = irisdata.values[:, 4]
```

Running K-Means on it:

```python
from sklearn.cluster import KMeans

# Number of clusters
kmeans = KMeans(n_clusters=3)
# Fitting the input data
kmeans = kmeans.fit(X)
# Getting the cluster labels
labels = kmeans.predict(X)
# Centroid values
centroids = kmeans.cluster_centers_
```
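Once the centroids are known, getting a cluster label for a sample comes down to picking the nearest centroid. The sketch below illustrates that idea in plain NumPy. The helper name `assign_to_nearest` and the demo numbers are my own, made up for illustration, and this is not scikit-learn's actual implementation.

```python
import numpy as np

def assign_to_nearest(X, centroids):
    """Return, for each row of X, the index of the closest centroid."""
    # distances[i, j] = Euclidean distance from sample i to centroid j
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)

# Made-up samples and centroids with 4 features each, like our Iris features
X_demo = np.array([[5.0, 3.4, 1.5, 0.2],
                   [6.0, 2.8, 4.5, 1.4],
                   [6.7, 3.0, 5.6, 2.1]])
centroids_demo = np.array([[5.0, 3.4, 1.5, 0.2],
                           [5.9, 2.8, 4.4, 1.4],
                           [6.8, 3.1, 5.7, 2.1]])

print(assign_to_nearest(X_demo, centroids_demo))  # [0 1 2]
```

Note that the index returned is just the position of the centroid, so the numbering of the clusters carries no meaning of its own.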

Checking the clustering accuracy of our program:

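```python
from sklearn.metrics import classification_report

target_names = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']

# K-Means numbers its clusters arbitrarily, so this report is only meaningful
# when cluster i happens to correspond to class code i
print(classification_report(irisdata['Class'], kmeans.labels_, target_names=target_names))
```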

The output of classification looks like this:

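You can see in the classification report that 91% of our data was predicted accurately. That's pretty good for an unsupervised algorithm. Keep in mind that K-Means assigns its cluster numbers arbitrarily, so this figure depends on the cluster numbers lining up with the class codes and can change from run to run.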
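Because the cluster numbers are arbitrary, one way to make the comparison robust is to try every relabelling of the clusters and keep the one that agrees most with the true classes. Below is a minimal sketch of that idea using only the standard library. The function name `best_label_mapping` is hypothetical, the toy labels are made up, and the sketch assumes the true classes are coded `0` to `n_clusters - 1` as in our `cat.codes` step.

```python
from itertools import permutations

def best_label_mapping(y_true, labels, n_clusters=3):
    """Return (mapping, accuracy) for the cluster -> class relabelling
    that agrees with y_true on the largest number of samples."""
    y_true = [int(v) for v in y_true]
    labels = [int(v) for v in labels]
    best_mapping, best_hits = None, -1
    # With 3 clusters there are only 3! = 6 relabellings to try
    for perm in permutations(range(n_clusters)):
        hits = sum(perm[c] == t for c, t in zip(labels, y_true))
        if hits > best_hits:
            best_mapping, best_hits = dict(enumerate(perm)), hits
    return best_mapping, best_hits / len(y_true)

# Toy example: the clusters are right except for one sample, but numbered differently
y_true_demo = [0, 0, 1, 1, 2, 2]
labels_demo = [2, 2, 0, 0, 1, 0]
mapping, accuracy = best_label_mapping(y_true_demo, labels_demo)
print(mapping, accuracy)
```

On our data this could be called as `best_label_mapping(irisdata['Class'], kmeans.labels_)`, and the relabelled clusters `[mapping[c] for c in kmeans.labels_]` passed to `classification_report`. Trying all permutations is fine for 3 clusters; for many clusters something like `scipy.optimize.linear_sum_assignment` would scale better.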
