simpleml.models.classifiers.sklearn.multioutput

Wrapper module around sklearn.multioutput

Module Contents

Classes

SklearnClassifierChain

No different than base model. Here just to maintain the pattern

SklearnMultiOutputClassifier

No different than base model. Here just to maintain the pattern

WrappedSklearnClassifierChain

A multi-label model that arranges binary classifiers into a chain.

WrappedSklearnMultiOutputClassifier

Multi target classification

simpleml.models.classifiers.sklearn.multioutput.__author__ = Elisha Yadgaran
class simpleml.models.classifiers.sklearn.multioutput.SklearnClassifierChain(has_external_files=True, external_model_kwargs=None, params=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different from the base model. It exists only to maintain the pattern: Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs for external models must be separated explicitly, since most external models do not support arbitrary **kwargs in their constructors.

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement

should return the desired model object
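The kwargs separation and the _create_external_model hook described above can be sketched roughly as follows. This is a minimal illustration only, not SimpleML's actual implementation: HypotheticalExternalModel, HypotheticalWrapper, and the metadata attribute are invented names, and the params argument is left out for brevity.

```python
class HypotheticalExternalModel:
    """Stand-in for a third-party estimator whose constructor rejects unknown kwargs."""

    def __init__(self, alpha=1.0, max_iter=100):
        self.alpha = alpha
        self.max_iter = max_iter


class HypotheticalWrapper:
    """Illustrative wrapper only; not SimpleML's actual implementation."""

    def __init__(self, external_model_kwargs=None, **kwargs):
        # Kwargs meant for the external model arrive in their own dict ...
        self.external_model_kwargs = dict(external_model_kwargs or {})
        # ... while any remaining kwargs stay with the wrapper (assumed: metadata).
        self.metadata = dict(kwargs)
        self.external_model = self._create_external_model(**self.external_model_kwargs)

    def _create_external_model(self, **kwargs):
        # Subclass hook: build and return the desired model object.
        return HypotheticalExternalModel(**kwargs)


wrapper = HypotheticalWrapper(external_model_kwargs={"alpha": 0.5}, name="demo")

# Mixing wrapper kwargs into the strict external constructor raises TypeError.
try:
    HypotheticalExternalModel(alpha=0.5, name="demo")
    mixed_kwargs_accepted = True
except TypeError:
    mixed_kwargs_accepted = False
```

Keeping the two groups of keyword arguments apart means the external constructor only ever sees names it understands.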

class simpleml.models.classifiers.sklearn.multioutput.SklearnMultiOutputClassifier(has_external_files=True, external_model_kwargs=None, params=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different from the base model. It exists only to maintain the pattern: Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs for external models must be separated explicitly, since most external models do not support arbitrary **kwargs in their constructors.

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement

should return the desired model object

class simpleml.models.classifiers.sklearn.multioutput.WrappedSklearnClassifierChain(base_estimator, *, order=None, cv=None, random_state=None)

Bases: sklearn.multioutput.ClassifierChain, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

A multi-label model that arranges binary classifiers into a chain.

Each model makes a prediction in the order specified by the chain using all of the available features provided to the model plus the predictions of models that are earlier in the chain.

Read more in the User Guide.

New in version 0.19.
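To make the chaining idea concrete, the sketch below builds the training input each chain position would see when true labels are used for earlier positions (the cv=None case). It is a plain-Python illustration, not scikit-learn's code; chain_training_inputs is a hypothetical helper name.

```python
def chain_training_inputs(X, Y, order):
    """For each chain position, return (label_column, augmented_rows).

    Each row is the original features followed by the labels of the
    columns handled earlier in the chain.
    """
    inputs = []
    for position, label_col in enumerate(order):
        earlier = order[:position]  # label columns already handled by the chain
        rows = [list(x) + [y[c] for c in earlier] for x, y in zip(X, Y)]
        inputs.append((label_col, rows))
    return inputs


X = [[0.2, 1.5], [0.9, 0.3]]   # two samples, two features
Y = [[1, 0, 1], [0, 1, 1]]     # two samples, three labels
inputs = chain_training_inputs(X, Y, order=[2, 0, 1])

for label_col, rows in inputs:
    print(label_col, rows)
# 2 [[0.2, 1.5], [0.9, 0.3]]
# 0 [[0.2, 1.5, 1], [0.9, 0.3, 1]]
# 1 [[0.2, 1.5, 1, 1], [0.9, 0.3, 1, 0]]
```

The first model sees only the features; every later model sees one extra column per earlier link in the chain.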

base_estimator : estimator

The base estimator from which the classifier chain is built.

order : array-like of shape (n_outputs,) or 'random', default=None

If None, the order will be determined by the order of columns in the label matrix Y:

order = [0, 1, 2, ..., Y.shape[1] - 1]

The order of the chain can be explicitly set by providing a list of integers. For example, for a chain of length 5:

order = [1, 3, 2, 4, 0]

means that the first model in the chain will make predictions for column 1 in the Y matrix, the second model will make predictions for column 3, etc.

If order is ‘random’ a random ordering will be used.
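The example ordering above can be read as a mapping from chain position to label column. The sketch below prints that mapping and shows how per-position outputs can be put back into the original column order with an inverse permutation. It is a plain-Python illustration with placeholder strings standing in for predictions; the variable names are assumptions, not library attributes.

```python
order = [1, 3, 2, 4, 0]

# Chain position -> column of Y that the model at that position predicts.
for position, column in enumerate(order):
    print(f"model {position} predicts column {column}")

# Inverse permutation: for each column of Y, which chain position produced it.
inverse = [0] * len(order)
for position, column in enumerate(order):
    inverse[column] = position

# Placeholder outputs, listed in chain order.
chain_outputs = ["pred_col1", "pred_col3", "pred_col2", "pred_col4", "pred_col0"]

# Reorder so that index i holds the prediction for column i of Y.
restored = [chain_outputs[inverse[column]] for column in range(len(order))]
print(inverse)   # [4, 0, 2, 1, 3]
print(restored)  # ['pred_col0', 'pred_col1', 'pred_col2', 'pred_col3', 'pred_col4']
```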

cv : int, cross-validation generator or an iterable, default=None

Determines whether to use cross validated predictions or true labels for the results of previous estimators in the chain. Possible inputs for cv are:

  • None, to use true labels when fitting,

  • integer, to specify the number of folds in a (Stratified)KFold,

  • CV splitter,

  • An iterable yielding (train, test) splits as arrays of indices.

random_state : int, RandomState instance or None, optional (default=None)

If order='random', determines random number generation for the chain order. In addition, it controls the random seed given to each base_estimator at each chaining iteration. Thus, it is only used when base_estimator exposes a random_state. Pass an int for reproducible output across multiple function calls. See Glossary.

classes_ : list

A list of arrays of length len(estimators_) containing the class labels for each estimator in the chain.

estimators_ : list

A list of clones of base_estimator.

order_ : list

The order of labels in the classifier chain.

>>> from sklearn.datasets import make_multilabel_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.multioutput import ClassifierChain
>>> X, Y = make_multilabel_classification(
...    n_samples=12, n_classes=3, random_state=0
... )
>>> X_train, X_test, Y_train, Y_test = train_test_split(
...    X, Y, random_state=0
... )
>>> base_lr = LogisticRegression(solver='lbfgs', random_state=0)
>>> chain = ClassifierChain(base_lr, order='random', random_state=0)
>>> chain.fit(X_train, Y_train).predict(X_test)
array([[1., 1., 0.],
       [1., 0., 0.],
       [0., 1., 0.]])
>>> chain.predict_proba(X_test)
array([[0.8387..., 0.9431..., 0.4576...],
       [0.8878..., 0.3684..., 0.2640...],
       [0.0321..., 0.9935..., 0.0625...]])

RegressorChain : Equivalent for regression.

MultiOutputClassifier : Classifies each output independently rather than chaining.

Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank, “Classifier Chains for Multi-label Classification”, 2009.

class simpleml.models.classifiers.sklearn.multioutput.WrappedSklearnMultiOutputClassifier(estimator, *, n_jobs=None)

Bases: sklearn.multioutput.MultiOutputClassifier, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

Multi target classification

This strategy consists of fitting one classifier per target. This is a simple strategy for extending classifiers that do not natively support multi-target classification.
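The one-classifier-per-target strategy can be sketched in plain Python as below. This is an illustration of the idea only, not scikit-learn's implementation; MajorityClassifier, fit_one_per_target, and predict_all are hypothetical names, and the toy classifier simply predicts the most frequent training label.

```python
from collections import Counter


class MajorityClassifier:
    """Toy single-target classifier: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def fit_one_per_target(make_estimator, X, Y):
    """Fit an independent estimator on each column of Y."""
    n_targets = len(Y[0])
    return [
        make_estimator().fit(X, [row[t] for row in Y])
        for t in range(n_targets)
    ]


def predict_all(estimators, X):
    """Collect per-target predictions into one row per sample."""
    per_target = [est.predict(X) for est in estimators]
    return [list(row) for row in zip(*per_target)]


X = [[0], [1], [2]]
Y = [[1, 0], [1, 0], [0, 1]]   # two targets per sample
estimators = fit_one_per_target(MajorityClassifier, X, Y)
predictions = predict_all(estimators, [[5], [6]])
print(predictions)  # [[1, 0], [1, 0]]
```

Unlike a classifier chain, no estimator here sees the labels or predictions of any other target.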

estimator : estimator object

An estimator object implementing fit, score and predict_proba.

n_jobs : int or None, optional (default=None)

The number of jobs to run in parallel. fit(), predict() and partial_fit() (if supported by the passed estimator) will be parallelized for each target.

When individual estimators are fast to train or predict, using n_jobs > 1 can result in slower performance due to the parallelism overhead.

None means 1 unless in a joblib.parallel_backend context. -1 means using all available processes / threads. See Glossary for more details.

Changed in version 0.20: n_jobs default changed from 1 to None

classes_ : ndarray of shape (n_classes,)

Class labels.

estimators_ : list of n_output estimators

Estimators used for predictions.

>>> import numpy as np
>>> from sklearn.datasets import make_multilabel_classification
>>> from sklearn.multioutput import MultiOutputClassifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> X, y = make_multilabel_classification(n_classes=3, random_state=0)
>>> clf = MultiOutputClassifier(KNeighborsClassifier()).fit(X, y)
>>> clf.predict(X[-2:])
array([[1, 1, 0], [1, 1, 1]])