Introduction

As I said in my previous regression post, linear models sometimes don’t fit the data well. Curves such as power or exponential functions are a good alternative, but there are cases in which not even these fit well. This method is for data that behaves polynomially.

Method

This method works similarly to the previous ones, except that instead of obtaining a simple formula, we end up with a system of linear equations. If we want a polynomial of degree n, we end up with a system of n + 1 equations. Take, for example, a polynomial of degree 2. Our matrix is the following:

[Image: matrix.gif]
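For reference, here is a sketch of what the degree-2 system looks like, assuming the usual least-squares normal equations. The sums run over all data points (x_i, y_i), and N is the number of points (written N here because n already denotes the degree in this section).

```latex
\begin{bmatrix}
N          & \sum x_i   & \sum x_i^2 \\
\sum x_i   & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}
```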

Solving this system gives the coefficients a_0 … a_n, ordered from the lowest-order term to the highest (a_0 + a_1 * x + a_2 * x² + … + a_n * x^n).
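The whole procedure can be sketched in a few lines of plain Python. This is only an illustration, not the code from the original post: the function names fit_polynomial and evaluate are made up for this example, and it assumes the system is the standard least-squares one, solved here by Gaussian elimination with partial pivoting.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns [a_0, ..., a_degree]."""
    size = degree + 1
    # Power sums: sum of x^k for k = 0 .. 2 * degree
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    # Augmented matrix [A | b] with A[i][j] = sum x^(i+j), b[i] = sum x^i * y
    rows = []
    for i in range(size):
        row = [power_sums[i + j] for j in range(size)]
        row.append(sum((x ** i) * y for x, y in zip(xs, ys)))
        rows.append(row)
    # Forward elimination with partial pivoting
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough distinct x values")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, size + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution
    coeffs = [0.0] * size
    for i in range(size - 1, -1, -1):
        known = sum(rows[i][j] * coeffs[j] for j in range(i + 1, size))
        coeffs[i] = (rows[i][size] - known) / rows[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1 * x + ... + a_n * x^n using Horner's rule."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result
```

Calling fit_polynomial(xs, ys, 2) returns the three coefficients in the order a_0, a_1, a_2, and evaluate(coeffs, x) gives the value of the fitted polynomial at x.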

Error quantification

In order to measure how effective the regression was, we calculate three values:

  • Standard error: the typical size of the error between the regression and the data points.
  • Correlation coefficient: how closely the values given by the function track the original values.
  • Determination coefficient: how much better the new function is than the mean used as a constant function.

Both coefficients are calculated the same way as in linear regression. The standard error is calculated slightly differently:

s_y/x = sqrt(S_r / (n - (m + 1)))

Where S_r is the sum of the squared residuals, n is the number of points and m is the degree of the polynomial.
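As a rough sketch, the three values could be computed as follows. The function name regression_errors is hypothetical, and the formulas are the common textbook ones, assumed here rather than taken from the post: S_t is the sum of squared deviations from the mean, S_r is the sum of squared residuals, the determination coefficient is (S_t - S_r) / S_t, and the correlation coefficient is its square root.

```python
import math


def regression_errors(xs, ys, coeffs):
    """Return (standard error, determination coeff., correlation coeff.).

    coeffs is [a_0, ..., a_m], lowest-order term first.
    Requires more data points than coefficients (n > m + 1).
    """
    n = len(xs)           # number of points
    m = len(coeffs) - 1   # degree of the polynomial
    mean_y = sum(ys) / n
    predicted = [sum(a * x ** k for k, a in enumerate(coeffs)) for x in xs]
    # Spread around the mean (constant model) and around the regression
    s_t = sum((y - mean_y) ** 2 for y in ys)
    s_r = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    std_error = math.sqrt(s_r / (n - (m + 1)))
    determination = (s_t - s_r) / s_t
    # Clamp at 0 so coefficients worse than the mean do not break sqrt
    correlation = math.sqrt(max(determination, 0.0))
    return std_error, determination, correlation
```

The function takes the coefficients as input, so it works with any fitting routine that returns them in the a_0 … a_m order.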

Example

X Y
0 2.1
1 7.7
2 13.6
3 27.2
4 40.9
5 61.1

The resulting function is:

y = 2.47857 + 2.35929 * x + 1.86071 * x²
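As a check, the sums for the six points in the table can be plugged into the degree-2 system. This assumes the standard least-squares normal equations; the coefficients are rounded to five decimals.

```latex
\begin{bmatrix}
6  & 15  & 55  \\
15 & 55  & 225 \\
55 & 225 & 979
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{bmatrix}
\quad\Longrightarrow\quad
a_0 \approx 2.47857,\; a_1 \approx 2.35929,\; a_2 \approx 1.86071
```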

The plot shows both the data points and the fitted function:

[Image: test]

Quantifying the error, we get the following values:

  • Standard error: 1.11752
  • Correlation coefficient: 0.999254
  • Determination coefficient: 0.998509

Conclusions

Polynomial regression is very useful for data that behaves polynomially. In general, the error decreases as the degree increases, but the complexity of the calculation grows too, so a good approach is to strike a balance between the two. Also, when the data behaves linearly, or like a power or exponential function, the other methods work better.
