As I mentioned in my previous regression post, sometimes linear models don't fit the data well. Curves (power or exponential functions) are a great alternative. But there are also cases in which not even these fit well. This method is for data that behaves polynomially.
This method works similarly to the previous ones, except that instead of obtaining a simple formula, we end up with a system of linear equations. If we wish to obtain a polynomial of degree n, we end up with a system of n + 1 equations. Take, for example, a degree-2 polynomial. The matrix of the system is the following:
Solving this system yields the values a_0 … a_n, ordered from the lowest-order to the highest-order term (a_0 + a_1 * x + a_2 * x² + … + a_n * x^n).
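As a minimal sketch of the idea, here is how the normal-equation system can be built and solved with NumPy. The function name `poly_fit` and the example data are illustrative, not from the original post; entry (i, j) of the matrix is the sum of x^(i+j) over the data points, and the right-hand side collects the sums of y * x^i.

```python
# Sketch of polynomial least squares via the normal equations.
# Assumes NumPy; names and sample data are illustrative.
import numpy as np

def poly_fit(x, y, m):
    """Fit a degree-m polynomial; returns a_0 .. a_m, lowest order first."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Matrix entry (i, j) is sum(x^(i + j)); right-hand side is sum(y * x^i).
    A = np.array([[np.sum(x ** (i + j)) for j in range(m + 1)]
                  for i in range(m + 1)])
    b = np.array([np.sum(y * x ** i) for i in range(m + 1)])
    return np.linalg.solve(A, b)

# Example: points sampled exactly from 1 + 2x + 3x² are recovered.
coeffs = poly_fit([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], m=2)
print(coeffs)  # close to [1, 2, 3]
```

Since the example points lie exactly on a quadratic, the solver recovers the original coefficients up to floating-point error.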
In order to measure how effective the regression was, we calculate three values:
- Standard error: the typical size of the residuals, i.e. the average deviation between the data and the regression curve.
- Correlation coefficient: how strongly the fitted values are related to the original values.
- Determination coefficient: how much better the fitted function explains the data than the mean as a constant function.
Both coefficients are calculated in the same way as in linear regression. The standard error is calculated a little differently:

s = sqrt( Σ (y_i − f(x_i))² / (n − (m + 1)) )

Where n is the number of points and m is the degree of the polynomial.
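The three measures above can be sketched in a few lines. This assumes the coefficient convention a_0 + a_1 * x + … + a_m * x^m, the mean as the baseline constant model, and the degree-adjusted denominator n − (m + 1); the function name is illustrative.

```python
# Hedged sketch of the three error measures, assuming coefficients are
# ordered lowest power first and the mean is the baseline constant model.
import numpy as np

def regression_metrics(x, y, coeffs):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(coeffs) - 1                       # degree of the polynomial
    n = len(x)                                # number of points
    fx = sum(a * x ** i for i, a in enumerate(coeffs))
    s_r = np.sum((y - fx) ** 2)               # residuals vs. the fit
    s_t = np.sum((y - y.mean()) ** 2)         # residuals vs. the mean
    std_error = np.sqrt(s_r / (n - (m + 1)))
    determination = (s_t - s_r) / s_t
    correlation = np.sqrt(determination)
    return std_error, correlation, determination

# Example: a perfect quadratic fit gives determination coefficient 1.
se, r, r2 = regression_metrics([0, 1, 2, 3, 4], [1, 6, 17, 34, 57], [1, 2, 3])
```

Note that n must exceed m + 1, otherwise the standard error's denominator is zero.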
The resulting function is:
The plot shows both the point cloud and the fitted function.
Quantifying the error, we get the following values:
- Standard error: 1.11752
- Correlation coefficient: 0.999254
- Determination coefficient: 0.998509
Polynomial regression is very useful for data that behaves polynomially. In general, the error diminishes as the degree increases, but so does the complexity of the calculation, so a good approach is to strike a balance between the two. Also, when the data behaves linearly, as a power function, or exponentially, the other methods work better.
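The degree/error trade-off can be seen on synthetic data; the snippet below is illustrative only (the data is generated, not from the post) and uses NumPy's own `polyfit`/`polyval` rather than the normal-equation derivation above. The residual error shrinks as the degree grows toward the true degree of the data.

```python
# Illustrative sketch: residual error shrinks as the degree increases,
# but so does the cost (and the risk of overfitting). Synthetic data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 30)
y = 2 - x + 0.5 * x ** 3 + rng.normal(scale=0.5, size=x.size)  # cubic + noise

errors = {}
for m in (1, 2, 3, 4):
    coeffs = np.polyfit(x, y, m)              # highest power first
    s_r = np.sum((y - np.polyval(coeffs, x)) ** 2)
    errors[m] = s_r
    print(f"degree {m}: residual sum of squares = {s_r:.3f}")
```

Past the true degree (3 here), extra terms barely reduce the residual while adding complexity, which is the balance the paragraph above describes.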