{"id":630,"date":"2018-05-23T07:22:41","date_gmt":"2018-05-23T07:22:41","guid":{"rendered":"http:\/\/muthu.co\/?p=630"},"modified":"2021-05-24T03:48:20","modified_gmt":"2021-05-24T03:48:20","slug":"math-behind-linear-regression-and-python-code","status":"publish","type":"post","link":"http:\/\/write.muthu.co\/math-behind-linear-regression-and-python-code\/","title":{"rendered":"Math behind Linear Regression with Python code"},"content":{"rendered":"\n
Simple linear regression<\/strong> is a statistical method you can use to study the relationship between two continuous (quantitative) variables:<\/p>\n\n\n\n The goal of any regression model is to predict the value of y (the dependent variable) based on the value of x (the independent variable). In linear regression, we use past observations of x and y (which we call our training data) to find a linear equation of the form y = A + Bx and then use this equation to make predictions.<\/p>\n\n\n\n Let's get started with a simple example and build our linear regression model with it. Suppose you visit a pizza restaurant whose menu looks somewhat like this:<\/p>\n\n\n\n You would like to know the approximate price of a pizza that is 12 inches in diameter. It's not on the menu, but with a linear regression model you might be able to estimate it. The above table is called our training set, <\/strong>because this is what we use to model the relationship between the diameter and the price of a pizza. The training set, when plotted on a graph, looks like this:<\/p>\n\n\n\n Note that the observed (x<\/em>, y<\/em>) data points fall roughly along a line, but not exactly on one. We will use these data points to find the equation of the straight line that fits the observations as closely as possible; this line describes the linear relationship between diameter and price.<\/p>\n\n\n\n A linear regression equation takes the same form as the equation of a line and is often written in the following general form: y = A + Bx<\/strong><\/em><\/p>\n<\/div>\n<\/div>\n\n\n\n Here \u2018x\u2019 is the independent variable (the diameter) and \u2018y\u2019 is the dependent variable (the predicted price). The letters \u2018A\u2019 and \u2018B\u2019 represent constants: A is the y-axis intercept and B is the slope of the line. 
To find the equation of the line, we use the formula below to compute A and B.<\/p>\n <\/p>\n Let's do the math in an Excel sheet for now and find the values of A and B.<\/p>\n<\/div>\n<\/div>\n\n\n\n<\/a><\/figure><\/div>\n\n\n\n
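The same spreadsheet-style arithmetic can be sketched in Python. This is a minimal sketch, not the article's own worksheet: it assumes the standard least-squares closed form, B = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and A = (Σy − BΣx) / n. The function name `fit_line` and the diameter and price values are hypothetical placeholders, since the menu table is not reproduced here.

```python
def fit_line(xs, ys):
    """Return (A, B) for the least-squares line y = A + B*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")

    # The same column sums you would build in a spreadsheet.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denom  # slope B
    a = (sum_y - b * sum_x) / n               # intercept A
    return a, b


# Hypothetical training set (diameter in inches, price); not the article's menu.
diameters = [6, 8, 10, 14, 18]
prices = [7.0, 9.0, 13.0, 17.5, 18.0]

A, B = fit_line(diameters, prices)
predicted_price = A + B * 12  # estimated price of a 12-inch pizza
print(f"A = {A:.4f}, B = {B:.4f}, price at 12 inches = {predicted_price:.2f}")
```

Keeping the four sums as separate variables mirrors the columns of a spreadsheet, which makes it easy to check each intermediate value against the sheet.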
<\/a><\/figure><\/div>\n\n\n\n
What is a Linear Regression Equation?<\/h4>\n\n\n\n
<\/a><\/figure>