{"id":727,"date":"2018-06-20T15:14:57","date_gmt":"2018-06-20T15:14:57","guid":{"rendered":"http:\/\/muthu.co\/?p=727"},"modified":"2021-05-24T03:36:50","modified_gmt":"2021-05-24T03:36:50","slug":"simple-example-of-polynomial-regression-using-python","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/simple-example-of-polynomial-regression-using-python\/","title":{"rendered":"Simple example of Polynomial regression using Python"},"content":{"rendered":"\n

Previously<\/a> I wrote an article explaining the underlying maths behind polynomial regression. In this post I will use Python libraries to fit a simple dataset and see polynomial regression in action. If you want to fully understand the internals, I recommend reading my previous post.<\/a><\/p>\n\n\n\n

Polynomial regression is a method of finding an nth<\/em> degree polynomial function that most closely approximates our data points. Simply put, if my simple line<\/a> doesn’t fit my data set, I will go on and try a quadratic, a cubic or a much higher degree function that might fit. Which degree to use depends entirely on the data at hand. A quick glance at a simple scatter plot can reveal a lot about the curvilinear relationship between the data points. Take a look at the graphs below of polynomials of different degrees. They matter because these are the shapes we are trying to fit our data points to.<\/p>\n\n\n\n
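As a minimal sketch of what fitting a higher-degree polynomial means, the snippet below uses NumPy's np.polyfit (ordinary least squares) on hypothetical toy points that lie exactly on y = 2x^2 - 3x + 1. The data and the helper name sum_squared_error are assumptions made for illustration only. A straight line leaves a clear error, while the degree 2 fit recovers the coefficients.

```python
import numpy as np

# Hypothetical toy data: five points that lie exactly on y = 2x^2 - 3x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2 * x**2 - 3 * x + 1

# Least-squares fits of degree 1 (a line) and degree 2 (a quadratic)
line_coeffs = np.polyfit(x, y, deg=1)
quad_coeffs = np.polyfit(x, y, deg=2)


def sum_squared_error(coeffs):
    # Sum of squared differences between the fitted curve and the data
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))


line_sse = sum_squared_error(line_coeffs)
quad_sse = sum_squared_error(quad_coeffs)

print('degree 1 SSE:', line_sse)
print('degree 2 SSE:', quad_sse)
print('degree 2 coefficients, highest power first:', np.round(quad_coeffs, 6))
```

np.polyfit returns coefficients with the highest power first, which is the order np.polyval expects.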

\"\"<\/a><\/figure><\/div>\n\n\n\n

Let’s try an example now. In any organization, salary grows as we go up the ladder, but the growth is not exactly linear. A CEO’s salary is many times higher than that of an entry-level engineer or even a mid-level manager, so a simple linear regression won’t help us predict the salary of a CEO even if we know the salaries of a few people above us in the hierarchy. I found a simple salary dataset, shown below, which I will run through different degrees of polynomial regression:<\/p>\n\n\n\n

\"\"<\/a><\/figure><\/div>\n\n\n\n

Python Code:<\/p>\n\n\n\n

import matplotlib.pyplot as plt\nimport numpy as np\nfrom sklearn.linear_model import Ridge\nfrom sklearn.preprocessing import PolynomialFeatures\nfrom sklearn.pipeline import make_pipeline\n\n# Salaries, ordered from the lowest level to the highest\ny_train = [[45000],[50000],[60000],[80000],[110000],[150000],[200000],[300000],[500000],[1000000]]\n# One x value per salary: 10 evenly spaced numbers from start=0 to stop=10\nx_plot = np.linspace(0, len(y_train), len(y_train))\nplt.scatter(x_plot, y_train, color='navy', s=30, marker='o', label=\"training points\")\ncolors = ['teal', 'yellowgreen', 'gold', 'red','green','violet','grey']\n# scikit-learn expects a 2D feature array of shape (n_samples, 1)\nx_plot = x_plot.reshape(-1,1)\n\n# Fit and plot one model per polynomial degree\nfor count, degree in enumerate([1, 2, 3]):\n  # Ridge is least squares with L2 regularization (alpha=1.0 by default)\n  model = make_pipeline(PolynomialFeatures(degree), Ridge())\n  model.fit(x_plot, y_train)\n  y_plot = model.predict(x_plot)\n  plt.plot(x_plot, y_plot, color=colors[count], linewidth=2,\n           label=\"degree %d\" % degree)\n\nplt.legend(loc='lower right')\nplt.show()<\/code><\/pre>\n\n\n\n

The above code produces a graph that compares the regression curves for each degree.<\/p>\n\n\n\n
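The comparison can also be made numerically. The sketch below is not the code from the post: it assumes plain least squares via np.polyfit in place of the Ridge pipeline, reuses the same salaries and x values, and reports the training R^2 for each degree along with whether the fitted curve decreases anywhere across the plotted range. The names salaries, levels, r_squared and has_decreasing_stretch are hypothetical.

```python
import numpy as np

# Same salaries and x values as the post; np.polyfit (unregularized least
# squares) is an assumption swapped in for the Ridge pipeline
salaries = np.array([45000, 50000, 60000, 80000, 110000,
                     150000, 200000, 300000, 500000, 1000000], dtype=float)
levels = np.linspace(0, len(salaries), len(salaries))

# Dense grid over the plotted range, used to look for falling stretches
dense = np.linspace(levels.min(), levels.max(), 200)
ss_tot = np.sum((salaries - salaries.mean()) ** 2)

r_squared = {}
has_decreasing_stretch = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(levels, salaries, deg=degree)
    residuals = salaries - np.polyval(coeffs, levels)
    # Training R^2: share of the variance explained by the fitted polynomial
    r_squared[degree] = 1.0 - np.sum(residuals ** 2) / ss_tot
    # True if the fitted curve goes down between any two neighbouring grid points
    curve = np.polyval(coeffs, dense)
    has_decreasing_stretch[degree] = bool(np.any(np.diff(curve) < 0))
    print('degree', degree,
          'R^2 =', round(r_squared[degree], 4),
          'decreases somewhere:', has_decreasing_stretch[degree])
```

Because each lower-degree polynomial is a special case of the next one, the training R^2 of a plain least-squares fit cannot go down as the degree rises, so the decreasing-stretch flag is the more useful column when choosing a degree.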

\"\"<\/a><\/figure><\/div>\n\n\n\n

Though my 3rd-degree polynomial is a much better fit for my dataset, I know that a quadratic function (degree 2) is the most logical choice, because at no point should the salary start coming down as we level up, which can be the case with a polynomial of degree 3 or more.<\/p>\n","protected":false},"excerpt":{"rendered":"

Previously I wrote an article explaining the underlying maths behind polynomial regression. In this post I will use Python libraries to regress a simple dataset to see polynomial regression in action. If you want to fully understand the internals I recommend you read my previous post. Polynomial regression is a method of finding an nth […]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[49],"_links":{"self":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/727"}],"collection":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/comments?post=727"}],"version-history":[{"count":2,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/727\/revisions"}],"predecessor-version":[{"id":1892,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/posts\/727\/revisions\/1892"}],"wp:attachment":[{"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/media?parent=727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/categories?post=727"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/write.muthu.co\/wp-json\/wp\/v2\/tags?post=727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}