The task here is to assess a regression model with test data, in particular for the use case where, at the beginning of a month, the data of the previous month become available.
The problem addressed here is to fit the model using only the model data, so that the test data stay independent of the fit. Further, the chart should clearly show which data belong to the model and which belong to the test.
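A minimal sketch of how such a split might look, assuming the data sit in a pandas DataFrame with a date column. The column names `Date`, `X`, `Y` and `Set`, the frame names `dfAll`, `dfModel` and `dfTest`, and the helper `split_model_test` are hypothetical and chosen only for illustration.

```python
import pandas as pd

def split_model_test(df, test_start):
    """Split rows into model data (before test_start) and test data (from test_start on)."""
    test_start = pd.Timestamp(test_start)
    is_test = df["Date"] >= test_start
    df_model = df.loc[~is_test].copy()
    df_test = df.loc[is_test].copy()
    # Label each row so that a chart can colour the two groups differently.
    df_model["Set"] = "model"
    df_test["Set"] = "test"
    return df_model, df_test

# Hypothetical data: daily values for January and February 2023.
dfAll = pd.DataFrame({"Date": pd.date_range("2023-01-01", "2023-02-28", freq="D")})
dfAll["X"] = range(len(dfAll))
dfAll["Y"] = 2.0 + 0.5 * dfAll["X"]

# January is used for the model, February (the "previous month") for the test.
dfModel, dfTest = split_model_test(dfAll, "2023-02-01")
```

Keeping the label in a separate column means the same frame can drive both the fit (rows labelled "model") and the chart legend.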
The solution here is essentially an extension of the previous example on Time Series Update. The chart was re-used adding
For the regression, the function ‘curve_fit’ from the ‘optimize’ module of SciPy was used. For details, please go to Scipy - Optimize Curve Fit.
The function has to be imported first. This is the only import needed here for the regression.
from scipy.optimize import curve_fit
The decisive call is to the function ‘curve_fit’:
popt, pcov2 = curve_fit(LinearFunction, dfTL.X, dfTL.Y)
In this approach, you can freely define a function for the fit. For the linear fit, the function is quite simple:
def LinearFunction(x, a0, a1):
    y = a0 + a1 * x
    return y
The fitting parameters can be obtained from the first return value of the optimization, i.e. ‘popt’, which holds the parameters in the order in which they appear in the function signature.
fittingPara = {}
fittingPara["a0"] = round(popt[0], 3)
fittingPara["a1"] = round(popt[1], 3)
The forecast is obtained by evaluating the fit function with these parameters.
fRL[column] = LinearFunction(dfRL.XF, *popt)
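The steps above can be put together into a small, self-contained sketch that fits on the model data only and then compares the forecast with the test data. The arrays `x_model`, `y_model`, `x_test` and `y_test` are hypothetical stand-ins for the DataFrame columns, and the root-mean-square error is just one possible measure of the test error, not necessarily the one used in the original chart.

```python
import numpy as np
from scipy.optimize import curve_fit

def LinearFunction(x, a0, a1):
    return a0 + a1 * x

# Hypothetical model data: an exact line y = 1 + 2x, so the fit is easy to check.
x_model = np.arange(10, dtype=float)
y_model = 1.0 + 2.0 * x_model

# Fit on the model data only; the test data are not passed to curve_fit.
popt, pcov = curve_fit(LinearFunction, x_model, y_model)

# Hypothetical test data: the same line with small, fixed deviations.
x_test = np.arange(10, 15, dtype=float)
y_test = 1.0 + 2.0 * x_test + np.array([0.1, -0.1, 0.1, -0.1, 0.0])

# Forecast for the test period and root-mean-square error against the test data.
y_forecast = LinearFunction(x_test, *popt)
rmse_test = np.sqrt(np.mean((y_test - y_forecast) ** 2))
```

Because the test points never enter the fit, `rmse_test` reflects how well the model extrapolates rather than how well it reproduces its own input.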
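Some statistical information is returned as well, namely the covariance matrix of the fitted parameters (the second return value, ‘pcov2’ in the call above). Unfortunately, the coefficient of determination ‘RSquared’ (R²) is not returned, so you have to compute it manually.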
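One common way to compute R² by hand is 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations from the mean. The sketch below assumes this definition. The helper `r_squared` and the sample data are hypothetical, and the standard errors taken from the diagonal of the covariance matrix are shown only as an example of what that matrix can be used for.

```python
import numpy as np
from scipy.optimize import curve_fit

def LinearFunction(x, a0, a1):
    return a0 + a1 * x

def r_squared(y_observed, y_fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_observed = np.asarray(y_observed, dtype=float)
    y_fitted = np.asarray(y_fitted, dtype=float)
    ss_res = np.sum((y_observed - y_fitted) ** 2)
    ss_tot = np.sum((y_observed - y_observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical data scattered around a straight line.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

popt, pcov = curve_fit(LinearFunction, x, y)
r2 = r_squared(y, LinearFunction(x, *popt))

# One-sigma standard errors of a0 and a1 from the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))
```

When the same helper is applied to the test data instead of the model data, R² can become negative, which indicates that the forecast is worse than simply predicting the mean of the test values.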