### Introduction

As I said in my previous regression post, linear models sometimes don't fit the data well. Curves such as power or exponential functions are a good alternative, but there are cases where even these fit poorly. This method is for data that behaves polynomially.

### Method

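This method works similarly to the previous ones, except that instead of a simple formula we end up with a system of linear equations. For a polynomial of degree n, the system has n + 1 equations. Take, for example, a degree-2 polynomial. Its system, with N the number of data points and every sum taken over all of them, is the following:

$$
\begin{bmatrix}
N & \sum x_i & \sum x_i^2 \\
\sum x_i & \sum x_i^2 & \sum x_i^3 \\
\sum x_i^2 & \sum x_i^3 & \sum x_i^4
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}
$$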

Solving this system yields the coefficients a_0 … a_n, ordered from the lowest-degree term to the highest (a_0 + a_1 * x + a_2 * x² + … + a_n * x^n).
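As a rough sketch of the method above, the following builds the linear system for any degree and solves it with Gaussian elimination. The function name `fit_polynomial` and the choice of solver are assumptions made for illustration; any linear solver should work just as well.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit through the linear system of the method.

    Returns [a_0, a_1, ..., a_degree], lowest-degree term first.
    """
    size = degree + 1
    # Matrix entry (row, col) is sum(x^(row + col)); the right-hand side
    # entry of a row is sum(x^row * y).
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    aug = [
        [power_sums[row + col] for col in range(size)]
        + [sum((x ** row) * y for x, y in zip(xs, ys))]
        for row in range(size)
    ]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need more distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution, from the last unknown up to a_0.
    coeffs = [0.0] * size
    for row in range(size - 1, -1, -1):
        known = sum(aug[row][c] * coeffs[c] for c in range(row + 1, size))
        coeffs[row] = (aug[row][size] - known) / aug[row][row]
    return coeffs


# Points that lie exactly on y = 1 + 2x + 3x^2, so the fit should recover
# coefficients close to [1, 2, 3].
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
print(fit_polynomial(xs, ys, 2))
```

The sums of high powers of x grow quickly, so for large degrees this system tends to become ill-conditioned; the sketch is meant for small degrees like the ones discussed here.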

### Error quantification

In order to measure how effective the regression was, we calculate three values:

• Standard error: the typical size of the residuals, that is, how far the data points spread around the regression curve.
• Correlation coefficient: how closely the values predicted by the function track the original values.
• Determination coefficient: how much better the new function is than the mean used as a constant function.

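Both coefficients are calculated the same way as in linear regression. The standard error is calculated slightly differently:

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}}$$

Where S_r is the sum of the squared residuals, n is the number of points (not the polynomial degree, as in the Method section) and m is the degree of the polynomial.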
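The three values can be sketched in a few lines. This assumes the usual textbook definitions: S_r is the sum of squared residuals around the fitted polynomial, S_t is the sum of squared deviations from the mean, the determination coefficient is (S_t - S_r) / S_t, and the correlation coefficient is its square root. The names `evaluate` and `regression_errors` are made up for this example.

```python
import math


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_m*x^m using Horner's scheme."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def regression_errors(xs, ys, coeffs):
    """Return (standard_error, r, r_squared) for a fitted polynomial.

    Assumes coeffs come from a least-squares fit, that there are more
    points than coefficients, and that the y values are not all equal.
    """
    n = len(xs)            # number of points
    m = len(coeffs) - 1    # degree of the polynomial
    mean_y = sum(ys) / n
    s_t = sum((y - mean_y) ** 2 for y in ys)
    s_r = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    standard_error = math.sqrt(s_r / (n - (m + 1)))
    r_squared = (s_t - s_r) / s_t
    # Clamp at zero so rounding noise cannot break the square root.
    r = math.sqrt(max(r_squared, 0.0))
    return standard_error, r, r_squared


# A straight line (degree 1) fitted to points of y = x^2: the least-squares
# line for these four points is y = -1 + 3x.
print(regression_errors([0, 1, 2, 3], [0, 1, 4, 9], [-1.0, 3.0]))
```

Dividing by n - (m + 1) instead of n accounts for the m + 1 coefficients that were estimated from the data, which is why the formula differs from the degree-1 case only in that term.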

### Example

| X | Y |
|---|------|
| 0 | 2.1 |
| 1 | 7.7 |
| 2 | 13.6 |
| 3 | 27.2 |
| 4 | 40.9 |
| 5 | 61.1 |

The resulting function, fitting a degree-2 polynomial, is approximately:

$$y = 2.47857 + 2.35929x + 1.86071x^2$$

[Figure: plot of the data points together with the fitted function.]

Quantifying the error, we get the following values:

• Standard error: 1.11752
• Correlation coefficient: 0.999254
• Determination coefficient: 0.998509
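The numbers above can be checked with a short script. The coefficients below are an assumption: they are the solution of the degree-2 system for the table in this example, rounded to five decimals. The script checks that they satisfy the three equations up to rounding, lists the residuals (a rough stand-in for the plot), and recomputes the three error values.

```python
import math

xs = [0, 1, 2, 3, 4, 5]
ys = [2.1, 7.7, 13.6, 27.2, 40.9, 61.1]

# Assumed coefficients a_0, a_1, a_2 (rounded solution of the system).
a0, a1, a2 = 2.47857, 2.35929, 1.86071
fitted = [a0 + a1 * x + a2 * x * x for x in xs]

# Equation k of the system reads sum(x^k * p(x)) = sum(x^k * y), so the
# gap between both sides should be close to zero for k = 0, 1, 2.
gaps = [
    sum(x ** k * f for x, f in zip(xs, fitted))
    - sum(x ** k * y for x, y in zip(xs, ys))
    for k in range(3)
]
print("equation gaps:", [round(g, 4) for g in gaps])

# Residual of every point against the fitted polynomial.
for x, y, f in zip(xs, ys, fitted):
    print(f"x={x}  y={y:5.1f}  fitted={f:8.4f}  residual={y - f:+.4f}")

n, m = len(xs), 2
mean_y = sum(ys) / n
s_r = sum((y - f) ** 2 for y, f in zip(ys, fitted))
s_t = sum((y - mean_y) ** 2 for y in ys)

standard_error = math.sqrt(s_r / (n - (m + 1)))
r_squared = (s_t - s_r) / s_t
r = math.sqrt(r_squared)
print(f"standard error: {standard_error:.5f}")
print(f"correlation coefficient: {r:.6f}")
print(f"determination coefficient: {r_squared:.6f}")
```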

### Conclusions

Polynomial regression is very useful for data that behaves polynomially. In general, the error decreases as the degree increases, but so does the cost of the computation, so a good approach is to balance the two. Also, when the data behaves linearly, as a power function, or exponentially, the other methods work better.