Integrating models
skpro can be extended using custom models.
Developing custom models
Users can implement their own models in skpro. Every probabilistic model has to implement the API of the abstract base class skpro.base.ProbabilisticEstimator. The example below illustrates a possible implementation of a random-guess model that predicts normal distributions with a random mean and a standard deviation proportional to that mean.
from random import randint

import numpy as np
from scipy.stats import norm
# Note: load_boston was removed in scikit-learn 1.2,
# so this example requires an older scikit-learn release
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

from skpro.base import ProbabilisticEstimator
from skpro.metrics import log_loss


class MyCustomModel(ProbabilisticEstimator):
    """ Estimator class that represents the probabilistic model"""

    class Distribution(ProbabilisticEstimator.Distribution):
        """ Distribution class returned by MyCustomModel.predict(X)

        self.estimator provides access to the parent
        ProbabilisticEstimator object, e.g. MyCustomModel
        self.X provides access to the test sample X
        """

        def point(self):
            """ Implements the point prediction """
            return self.estimator.random_mean_prediction_

        def std(self):
            """ Implements the standard deviation prediction """
            return self.estimator.random_std_prediction_

        def pdf(self, x):
            """ Implements the pdf function """
            return norm.pdf(x, loc=self.point()[self.index], scale=self.std()[self.index])

    def __init__(self):
        self.random_mean_prediction_ = None
        self.random_std_prediction_ = None

    def fit(self, X, y):
        # Generate random parameter estimates;
        # randint needs integer bounds, so the float targets are truncated
        self.random_mean_prediction_ = randint(int(np.min(y)), int(np.max(y)))
        self.random_std_prediction_ = 0.2 * self.random_mean_prediction_
        return self


# Use custom model
model = MyCustomModel()
X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
y_pred = model.fit(X_train, y_train).predict(X_test)
print('Loss: %f+-%f' % log_loss(y_test, y_pred, return_std=True))
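To make the printed loss easier to interpret, the following sketch computes a log loss for normal predictions by hand. It does not use skpro: sketch_log_loss and normal_pdf are hypothetical helpers, and it assumes that the loss is the negative log of the predicted density at the observed value, averaged over the test points, with the spread reported as a standard error. skpro's own log_loss may follow a different convention.

```python
import math
import statistics


def normal_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std**2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sketch_log_loss(y_true, means, stds):
    """Return (mean loss, standard error) of the negative log-density.

    Hypothetical helper for illustration; not skpro's log_loss.
    """
    losses = [
        -math.log(normal_pdf(y, m, s))
        for y, m, s in zip(y_true, means, stds)
    ]
    mean_loss = statistics.fmean(losses)
    # Standard error of the mean; an assumption about how the spread is reported
    std_error = statistics.stdev(losses) / math.sqrt(len(losses))
    return mean_loss, std_error


# Made-up observations and a constant normal prediction, as a random-guess model would give
y_true = [21.0, 24.5, 18.2]
loss, err = sketch_log_loss(y_true, means=[22.0] * 3, stds=[4.4] * 3)
print('Loss: %f+-%f' % (loss, err))
```

A lower loss means the predicted densities put more mass near the observed values, so a random-guess model like the one above should score worse than any model that actually uses X.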
Integrating vendor models
To integrate existing models into the framework, users can implement their own subclasses of the probabilistic estimator. The API also offers a simpler route through the derived VendorEstimator object, which takes a VendorInterface and a DensityAdapter. The vendor interface only needs to define the on_fit and on_predict events, which are invoked automatically. The results of the fit-predict procedure are exposed as public variables of the interface. The adapter, in turn, describes how distributional properties are generated from the interfaced vendor model, i.e. from the VendorInterface's public properties. Given a vendor interface and an appropriate adapter, a vendor estimator can be used like any other probabilistic estimator of the framework.
Bayesian integration
A notable example of the model integration API is the Bayesian case. To integrate a Bayesian model, one can implement the BayesianVendorInterface and its samples method, which should return a predictive posterior sample. Combined with a skpro.density.DensityAdapter such as the KernelDensityAdapter, which transforms the sample into estimated densities, the Bayesian model can then be used as a probabilistic estimator.
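The step from a posterior sample to an estimated density can be illustrated with a small Gaussian kernel density estimate. This is a conceptual sketch only: gaussian_kde_pdf is a hypothetical helper, the bandwidth is picked by hand, and the sample is simulated, so it should not be read as how skpro's KernelDensityAdapter is implemented.

```python
import math
import random


def gaussian_kde_pdf(sample, bandwidth):
    """Return a pdf function estimated from `sample` with Gaussian kernels."""
    n = len(sample)
    norm_const = n * bandwidth * math.sqrt(2.0 * math.pi)

    def pdf(x):
        # Average of Gaussian bumps centred on each sample value
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        ) / norm_const

    return pdf


# Simulated stand-in for a predictive posterior sample at one test point
rng = random.Random(0)
posterior_sample = [rng.gauss(10.0, 2.0) for _ in range(500)]

pdf = gaussian_kde_pdf(posterior_sample, bandwidth=0.5)
print(pdf(10.0), pdf(20.0))
```

Once such a pdf is available for each test point, density-based metrics like the log loss can be evaluated even though the Bayesian model itself only produces samples.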