Integrating models

skpro can be extended using custom models.

Developing custom models

skpro can be extended with your own models. All probabilistic models must implement the API of the abstract base class skpro.base.ProbabilisticEstimator. The example below illustrates a possible implementation of a random-guess model that predicts normal distributions with random mean and variance.

from random import randint
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

from skpro.base import ProbabilisticEstimator
from skpro.metrics import log_loss


class MyCustomModel(ProbabilisticEstimator):
    """ Estimator class that represents the probabilistic model"""

    class Distribution(ProbabilisticEstimator.Distribution):
        """ Distribution class returned by MyCustomModel.predict(X)

        self.estimator provides access to the parent
        ProbabilisticEstimator object, e.g. MyCustomModel
        self.X provides access to the test sample X
        """

        def point(self):
            """ Implements the point prediction (one value per test point) """
            return np.ones(len(self.X)) * self.estimator.random_mean_prediction_

        def std(self):
            """ Implements the standard deviation prediction """
            return np.ones(len(self.X)) * self.estimator.random_std_prediction_

        def pdf(self, x):
            """ Implements the pdf function """
            return norm.pdf(x, loc=self.point()[self.index], scale=self.std()[self.index])

    def __init__(self):
        self.random_mean_prediction_ = None
        self.random_std_prediction_ = None

    def fit(self, X, y):
        # Generate random parameter estimates; randint requires
        # integer bounds, so the label range is cast to int
        self.random_mean_prediction_ = randint(int(np.min(y)), int(np.max(y)))
        self.random_std_prediction_ = 0.2 * self.random_mean_prediction_

        return self


# Use custom model
model = MyCustomModel()

X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
y_pred = model.fit(X_train, y_train).predict(X_test)
print('Loss: %f+-%f' % log_loss(y_test, y_pred, return_std=True))

Integrating vendor models

To integrate existing models into the framework, you can implement your own subclass of the probabilistic estimator. The API, however, also offers simplified model integration through the derived VendorEstimator object, which takes a VendorInterface and a DensityAdapter. The vendor interface only needs to define on_fit and on_predict events, which are invoked automatically; the results of the fit-predict procedure are exposed as public variables of the interface. The adapter then describes how distributional properties are generated from the interfaced vendor model, i.e. from the VendorInterface's public properties. Given a vendor interface and an appropriate adapter, a vendor estimator can be used like any other probabilistic estimator in the framework.
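The event-based interface pattern described above can be sketched as follows. Note that this is a standalone illustration, not skpro's actual base classes: the class name MyVendorInterface and the wrapped "vendor model" (here just mean/std statistics) are assumptions for the sake of the example.

```python
import numpy as np


class MyVendorInterface:
    """Illustrative vendor interface: on_fit and on_predict are the
    hooks a VendorEstimator would invoke automatically, and the
    results are exposed as public attributes for a DensityAdapter.
    """

    def on_fit(self, X, y):
        # A real interface would train the wrapped vendor model here;
        # this sketch just records summary statistics of the labels
        self.mean_ = float(np.mean(y))
        self.std_ = float(np.std(y))

    def on_predict(self, X):
        # A real interface would run the vendor model's prediction;
        # here every test point gets the same constant prediction
        self.predictions_ = np.full(len(X), self.mean_)
```

A density adapter would then read the interface's public attributes (here predictions_ and std_) to construct the distributional predictions.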

Bayesian integration

A notable example of the model integration API is the Bayesian case. To integrate a Bayesian model, one can implement the BayesianVendorInterface and its samples method, which should return a sample from the predictive posterior. Combined with a skpro.density.DensityAdapter such as the KernelDensityAdapter, which transforms the sample into an estimated density, the Bayesian model can then be used as a probabilistic estimator.
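To illustrate the idea behind a kernel density adapter (this sketch uses scipy's gaussian_kde rather than skpro's KernelDensityAdapter, and the posterior sample is synthetic), a predictive posterior sample can be turned into an estimated density function like so:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Suppose a Bayesian model's samples() method returned this
# predictive posterior sample for one test point
posterior_sample = np.random.default_rng(42).normal(loc=20.0, scale=2.0, size=1000)

# A kernel density estimate turns the raw sample into a pdf,
# which is what a density adapter provides to the estimator
pdf = gaussian_kde(posterior_sample)

# The estimated density is high near the sample mean and
# low far away from it
print(pdf(20.0), pdf(30.0))
```

The resulting pdf plays the same role as the Distribution.pdf method in the custom model example above: it lets the framework evaluate probabilistic losses such as the log loss on the Bayesian model's predictions.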