Cross Validation Score with Statsmodels

Recently was working with a limited set of data. Using statsmodels , employed a regression model on the data.

To test the confidence in the model needed to do cross validation. The solution that immediately sprang to mind was the cross_val_score function from sci-kit learn library.

However, that function is not applicable on statsmodels object.

Solution: wrap sklearn base estimators on statsmodels objects and then use the model.

Here is the code for wrapping a sklearn baseestimators over statsmodels objects.

from sklearn.base import BaseEstimator, RegressorMixin
import statsmodels.formula.api as smf
import statsmodels.api as sm

class statsmodel(BaseEstimator, RegressorMixin):
    def __init__(self, sm_class, formula):
        self.sm_class = sm_class
        self.formula = formula
        self.model = None
        self.result = None

    def fit(self,data,dummy):
        self.model = self.sm_class(self.formula,data)
        self.result = self.model.fit()

    def predict(self,X):
        return self.result.predict(X)

Notice the dummy in the fit function, the regression api for sklearn needs X and y and for this implementation I was relying on the formula API of statsmodels, so had to add a dummy variable.

Here’s a quick example on how to use it.


from sklearn import linear_model
from sklearn.model_selection import cross_val_score

# Get data
ccard = sm.datasets.ccard.load_pandas()
print (ccard.data.head())

# create a model
clf = statsmodel(smf.ols, "AVGEXP ~ AGE + INCOME")

# Print cross val score on this model
print (cross_val_score(clf, ccard.data, ccard.data['AVGEXP']))

# Same thing on sklearn's linear regression model
lm = linear_model.LinearRegression()

print (cross_val_score(lm , ccard.data.iloc[:,1:3].values, ccard.data.iloc[:,0].values))

I love sklearn. Convenient, efficient and consistent API can make things so much easier.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s