{"id":787,"date":"2018-07-07T09:26:54","date_gmt":"2018-07-07T09:26:54","guid":{"rendered":"http:\/\/muthu.co\/?p=787"},"modified":"2023-12-21T08:36:33","modified_gmt":"2023-12-21T08:36:33","slug":"understanding-the-classification-report-in-sklearn","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/understanding-the-classification-report-in-sklearn\/","title":{"rendered":"Understanding the Classification report through sklearn"},"content":{"rendered":"\n

A classification report is used to measure the quality of predictions from a classification algorithm. It details how many predictions are correct and how many are not. More specifically, true positives, false positives, true negatives, and false negatives are used to calculate the metrics of a classification report, as shown below.

[Figure: classification report for K-Means on the Iris dataset]

The report above is copied from our previous post on K-Means clustering of the Iris dataset.

The code to generate a report similar to the one above is:

```python
from sklearn.metrics import classification_report

# irisdata and kmeans come from the previous K-Means post:
# irisdata['Class'] holds the true labels, kmeans.labels_ the cluster assignments.
target_names = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
print(classification_report(irisdata['Class'], kmeans.labels_, target_names=target_names))
```
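The snippet above assumes `irisdata` and `kmeans` already exist from that earlier post. If you want a runnable starting point, here is a minimal self-contained sketch using scikit-learn's built-in Iris data; the majority-vote relabelling step is my own addition, needed because K-Means assigns arbitrary cluster ids:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report

iris = load_iris()
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(iris.data)

# Cluster ids are arbitrary, so relabel each cluster with the most
# common true class among its members before scoring.
mapped = np.zeros_like(kmeans.labels_)
for cluster in range(3):
    mask = kmeans.labels_ == cluster
    mapped[mask] = np.bincount(iris.target[mask]).argmax()

target_names = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
print(classification_report(iris.target, mapped, target_names=target_names))
```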

The report presents the main classification metrics for each class: precision, recall, and F1-score. These metrics are calculated using true and false positives, along with true and false negatives. In this context, "positive" and "negative" are simply generic terms used for the predicted classes.

To assess whether a prediction is correct or incorrect, we consider four possible outcomes:

1. **TN / True Negative:** the model correctly predicts a negative class for a negative case.
2. **TP / True Positive:** the model correctly predicts a positive class for a positive case.
3. **FN / False Negative:** the model incorrectly predicts a negative class for a positive case.
4. **FP / False Positive:** the model incorrectly predicts a positive class for a negative case.

For example, consider spam filtering: true positives are spam emails correctly routed to the spam folder (TP), true negatives are legitimate emails correctly delivered to the inbox (TN), false positives are legitimate emails mistakenly flagged as spam (FP), and false negatives are spam emails that slip through to the inbox (FN). The sketch below shows how to extract these four counts from a set of predictions.
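As a quick illustration (the toy labels here are mine, not from the post), scikit-learn's `confusion_matrix` returns all four counts at once for a binary problem:

```python
from sklearn.metrics import confusion_matrix

# Toy binary labels: 1 = spam, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]  # actual classes
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]  # model predictions

# For binary labels, the 2x2 matrix unpacks as tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=2, FP=2, FN=1, TP=3
```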

## Precision – What percent of your positive predictions were correct?

Precision is the ability of a classifier not to label as positive an instance that is actually negative. For each class, it is defined as the ratio of true positives to the sum of true and false positives.

**Precision: the accuracy of the positive predictions.**

\(\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}} \)
```python
from sklearn.metrics import precision_score

# y_true are the ground-truth labels, y_pred the model's predictions.
# For multiclass data, pass an averaging strategy, e.g. average='macro'.
print("Precision score: {}".format(precision_score(y_true, y_pred)))
```
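With the toy spam labels from the example above, TP = 3 and FP = 2, so precision should come out as 3 / (3 + 2) = 0.6:

```python
from sklearn.metrics import precision_score

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
print(precision_score(y_true, y_pred))  # 0.6 = TP / (TP + FP) = 3 / 5
```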

## Recall – What percent of the positive cases did you catch?

Recall is the ability of a classifier to find all positive instances. For each class, it is defined as the ratio of true positives to the sum of true positives and false negatives.

**Recall: the fraction of positives that were correctly identified.**

\(\text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}} \)

```python
from sklearn.metrics import recall_score

# As with precision, multiclass labels need an averaging strategy, e.g. average='macro'.
print("Recall score: {}".format(recall_score(y_true, y_pred)))
```
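On the same toy labels, TP = 3 and FN = 1, so recall is 3 / (3 + 1) = 0.75: the classifier caught three of the four actual spam emails.

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]
print(recall_score(y_true, y_pred))  # 0.75 = TP / (TP + FN) = 3 / 4
```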

## F1 score – How balanced are precision and recall?

The F1 score is the harmonic mean of precision and recall, such that the best score is 1.0 and the worst is 0.0. Generally speaking, F1 scores are lower than accuracy measures, as they embed precision and recall into their computation. As a rule of thumb, the weighted average of F1 scores should be used to compare classifier models, not global accuracy.

\(F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \)

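Plugging in the toy precision of 0.6 and recall of 0.75 from the earlier examples:

\(F_1 = \frac{2 \times 0.6 \times 0.75}{0.6 + 0.75} = \frac{0.9}{1.35} \approx 0.667 \)

Note that the harmonic mean sits closer to the lower of the two values, so a model cannot hide poor recall behind high precision, or vice versa.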

```python
from sklearn.metrics import f1_score

# As before, multiclass labels need an averaging strategy, e.g. average='weighted'.
print("F1 Score: {}".format(f1_score(y_true, y_pred)))
```
