Data science

K-fold cross validation

The greatest headache for any machine learning engineer is the problem of overfitting. The model we trained works perfectly on the training dataset but when applied to other new dataset it fails miserably. This is because of overfitting where our classifier learns the provided dataset accurately but fails when applied on new data. One good …

Understanding Support vector Machines using Python

Support Vector machines (SVM) can be used for both classification as well as regression tasks but they are mostly used in classification applications. Some of the real world applications include Face detection, Handwriting detection, Document categorisation, SPAM Filtering, image classification and protein remote homology detection. For many researchers, SVM is the first best choice for …

Maths behind Polynomial regression

Polynomial regression is a process of finding a polynomial function that takes the form f( x ) = c0 + c1 x + c2 x2 ⋯ cn xn where n is the degree of the polynomial and c is a set of coefficients. Through polynomial regression we try to find an nth degree polynomial function which is the closest approximation of our data points. Below is a sample random dataset which has been regressed …