simpleml.models.classifiers.sklearn.multioutput

Wrapper module around sklearn.multioutput

Module Contents

Classes

SklearnClassifierChain

No different from the base model. Present only to maintain the pattern.

SklearnMultiOutputClassifier

No different from the base model. Present only to maintain the pattern.

WrappedSklearnClassifierChain

A multi-label model that arranges binary classifiers into a chain.

WrappedSklearnMultiOutputClassifier

Multi target classification.

Attributes

__author__

simpleml.models.classifiers.sklearn.multioutput.__author__ = Elisha Yadgaran
class simpleml.models.classifiers.sklearn.multioutput.SklearnClassifierChain(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different from the base model. Present only to maintain the pattern Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs for the external model must be separated explicitly, since most external models do not accept arbitrary **kwargs in their constructors.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –
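The reason for separating passthrough kwargs can be illustrated with a minimal, self-contained sketch. StrictEstimator and ModelWrapper are hypothetical stand-ins and not SimpleML classes; the sketch assumes only that the external model's constructor rejects unknown keyword arguments.

```python
from typing import Any, Dict, Optional


class StrictEstimator:
    """Hypothetical external model: its constructor accepts no arbitrary **kwargs."""

    def __init__(self, order: Optional[list] = None, random_state: Optional[int] = None):
        self.order = order
        self.random_state = random_state


class ModelWrapper:
    """Hypothetical wrapper, written for illustration (not SimpleML's implementation)."""

    def __init__(
        self,
        external_model_kwargs: Optional[Dict[str, Any]] = None,
        params: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ):
        # Wrapper-level keyword arguments stay with the wrapper.
        self.metadata = kwargs
        self.params = params or {}
        # Only the explicitly separated kwargs reach the external constructor.
        self.external_model = self._create_external_model(**(external_model_kwargs or {}))

    def _create_external_model(self, **kwargs: Any) -> StrictEstimator:
        # Returns the desired model object, built from the passthrough kwargs.
        return StrictEstimator(**kwargs)


wrapper = ModelWrapper(
    external_model_kwargs={"random_state": 0},
    name="demo",  # wrapper metadata; StrictEstimator would reject this argument
)
print(wrapper.external_model.random_state)
print(wrapper.metadata)
```

Forwarding every keyword argument blindly would raise a TypeError in StrictEstimator, which is why the sketch keeps the two dictionaries apart.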

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement.

Should return the desired model object.

class simpleml.models.classifiers.sklearn.multioutput.SklearnMultiOutputClassifier(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different from the base model. Present only to maintain the pattern Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs for the external model must be separated explicitly, since most external models do not accept arbitrary **kwargs in their constructors.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement.

Should return the desired model object.

class simpleml.models.classifiers.sklearn.multioutput.WrappedSklearnClassifierChain(base_estimator, *, order=None, cv=None, random_state=None)

Bases: sklearn.multioutput.ClassifierChain, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

A multi-label model that arranges binary classifiers into a chain.

Each model makes a prediction in the order specified by the chain using all of the available features provided to the model plus the predictions of models that are earlier in the chain.
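The chaining idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not scikit-learn's implementation: NearestNeighbor, fit_chain and predict_chain are hypothetical helpers, training appends the true labels of earlier links as extra features, and prediction appends the earlier links' predictions instead.

```python
class NearestNeighbor:
    """Tiny 1-nearest-neighbour classifier, used only as a stand-in base estimator."""

    def fit(self, X, y):
        self.X_ = [list(row) for row in X]
        self.y_ = list(y)
        return self

    def predict(self, X):
        def closest(row):
            dists = [sum((a - b) ** 2 for a, b in zip(row, ref)) for ref in self.X_]
            return self.y_[dists.index(min(dists))]

        return [closest(row) for row in X]


def fit_chain(X, Y, order):
    """Fit one model per label; each link sees X plus the labels of earlier links."""
    models = []
    augmented = [list(row) for row in X]
    for col in order:
        target = [labels[col] for labels in Y]
        models.append(NearestNeighbor().fit(augmented, target))
        # Later links receive this label as an additional feature.
        augmented = [row + [t] for row, t in zip(augmented, target)]
    return models


def predict_chain(models, X, order):
    """Predict link by link, feeding each prediction to the next model."""
    augmented = [list(row) for row in X]
    out = [[0] * len(order) for _ in X]
    for model, col in zip(models, order):
        for i, pred in enumerate(model.predict(augmented)):
            out[i][col] = pred        # store in the original column position
            augmented[i].append(pred)  # extra feature for the next link
    return out


X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
order = [0, 1]
models = fit_chain(X, Y, order)
print(predict_chain(models, [[0.1]], order))
```

Because each link depends on the outputs of the links before it, the chain can capture correlations between labels that independent per-label models would miss.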

Read more in the User Guide.

New in version 0.19.

base_estimator : estimator

The base estimator from which the classifier chain is built.

order : array-like of shape (n_outputs,) or 'random', default=None

If None, the order will be determined by the order of columns in the label matrix Y:

order = [0, 1, 2, ..., Y.shape[1] - 1]

The order of the chain can be explicitly set by providing a list of integers. For example, for a chain of length 5:

order = [1, 3, 2, 4, 0]

means that the first model in the chain will make predictions for column 1 in the Y matrix, the second model will make predictions for column 3, etc.

If order is 'random', a random ordering will be used.
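The three accepted forms of order can be summarised in a short sketch. resolve_order is a hypothetical helper written for illustration; it uses Python's random module rather than NumPy, so the permutation it produces for a given seed will not match scikit-learn's. The permutation check is a choice made for this sketch.

```python
import random
from typing import List, Optional, Sequence, Union


def resolve_order(
    order: Union[None, str, Sequence[int]],
    n_outputs: int,
    seed: Optional[int] = None,
) -> List[int]:
    """Return the chain order as a list of label-column indices."""
    if order is None:
        # Default: follow the column order of the label matrix Y.
        return list(range(n_outputs))
    if isinstance(order, str):
        if order != "random":
            raise ValueError("order must be None, 'random', or a sequence of indices")
        permutation = list(range(n_outputs))
        random.Random(seed).shuffle(permutation)  # seeded for reproducibility
        return permutation
    resolved = list(order)
    if sorted(resolved) != list(range(n_outputs)):
        raise ValueError("order must be a permutation of the column indices")
    return resolved


print(resolve_order(None, 3))
print(resolve_order([1, 3, 2, 4, 0], 5))
print(resolve_order("random", 5, seed=0))
```

With order = [1, 3, 2, 4, 0], position 0 of the returned list is 1, so the first model in the chain predicts column 1 of Y, the second predicts column 3, and so on.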

cv : int, cross-validation generator or an iterable, default=None

Determines whether to use cross validated predictions or true labels for the results of previous estimators in the chain. Possible inputs for cv are:

  • None, to use true labels when fitting,

  • integer, to specify the number of folds in a (Stratified)KFold,

  • CV splitter,

  • An iterable yielding (train, test) splits as arrays of indices.

random_state : int, RandomState instance or None, optional (default=None)

If order='random', determines random number generation for the chain order. In addition, it controls the random seed given to each base_estimator at each chaining iteration. Thus, it is only used when base_estimator exposes a random_state. Pass an int for reproducible output across multiple function calls. See Glossary.

classes_ : list

A list of arrays of length len(estimators_) containing the class labels for each estimator in the chain.

estimators_ : list

A list of clones of base_estimator.

order_ : list

The order of labels in the classifier chain.

n_features_in_ : int

Number of features seen during fit. Only defined if the underlying base_estimator exposes such an attribute when fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

RegressorChain : Equivalent for regression.

MultiOutputClassifier : Classifies each output independently rather than chaining.

Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank, “Classifier Chains for Multi-label Classification”, 2009.

>>> from sklearn.datasets import make_multilabel_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.multioutput import ClassifierChain
>>> X, Y = make_multilabel_classification(
...    n_samples=12, n_classes=3, random_state=0
... )
>>> X_train, X_test, Y_train, Y_test = train_test_split(
...    X, Y, random_state=0
... )
>>> base_lr = LogisticRegression(solver='lbfgs', random_state=0)
>>> chain = ClassifierChain(base_lr, order='random', random_state=0)
>>> chain.fit(X_train, Y_train).predict(X_test)
array([[1., 1., 0.],
       [1., 0., 0.],
       [0., 1., 0.]])
>>> chain.predict_proba(X_test)
array([[0.8387..., 0.9431..., 0.4576...],
       [0.8878..., 0.3684..., 0.2640...],
       [0.0321..., 0.9935..., 0.0625...]])
class simpleml.models.classifiers.sklearn.multioutput.WrappedSklearnMultiOutputClassifier(estimator, *, n_jobs=None)

Bases: sklearn.multioutput.MultiOutputClassifier, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

Multi target classification.

This strategy consists of fitting one classifier per target. This is a simple strategy for extending classifiers that do not natively support multi-target classification.
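The one-classifier-per-target strategy can be sketched without any library. MajorityClass, fit_per_target and predict_per_target are hypothetical names used only for illustration; the stand-in estimator ignores the features entirely, which keeps the focus on how the targets are split and recombined.

```python
from collections import Counter


class MajorityClass:
    """Stand-in estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def fit_per_target(make_estimator, X, Y):
    """Fit one independent estimator for each column of Y."""
    n_targets = len(Y[0])
    return [make_estimator().fit(X, [row[t] for row in Y]) for t in range(n_targets)]


def predict_per_target(estimators, X):
    """Collect each estimator's predictions and stack them column-wise."""
    columns = [est.predict(X) for est in estimators]
    return [list(row) for row in zip(*columns)]


X = [[0.0], [1.0], [2.0]]
Y = [[1, 0], [1, 1], [0, 1]]
estimators = fit_per_target(MajorityClass, X, Y)
print(predict_per_target(estimators, [[5.0], [6.0]]))
```

Unlike a classifier chain, every estimator here sees the same feature matrix X and nothing else, so the per-target fits are independent of one another and could run in parallel.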

estimator : estimator object

An estimator object implementing fit, score and predict_proba.

n_jobs : int or None, optional (default=None)

The number of jobs to run in parallel. fit(), predict() and partial_fit() (if supported by the passed estimator) will be parallelized for each target.

When individual estimators are fast to train or predict, using n_jobs > 1 can result in slower performance due to the parallelism overhead.

None means 1 unless in a joblib.parallel_backend context. -1 means using all available processes / threads. See Glossary for more details.

Changed in version 0.20: n_jobs default changed from 1 to None.

classes_ : ndarray of shape (n_classes,)

Class labels.

estimators_ : list of n_output estimators

Estimators used for predictions.

n_features_in_ : int

Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Only defined if the underlying estimators expose such an attribute when fit.

New in version 1.0.

ClassifierChain : A multi-label model that arranges binary classifiers into a chain.

MultiOutputRegressor : Fits one regressor per target variable.

>>> import numpy as np
>>> from sklearn.datasets import make_multilabel_classification
>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> X, y = make_multilabel_classification(n_classes=3, random_state=0)
>>> clf = MultiOutputClassifier(KNeighborsClassifier()).fit(X, y)
>>> clf.predict(X[-2:])
array([[1, 1, 0], [1, 1, 1]])