simpleml.models.classifiers.sklearn.multiclass

Wrapper module around sklearn.multiclass

Module Contents

Classes

SklearnOneVsOneClassifier

No different than the base model; included here just to maintain the pattern.

SklearnOneVsRestClassifier

No different than the base model; included here just to maintain the pattern.

SklearnOutputCodeClassifier

No different than the base model; included here just to maintain the pattern.

WrappedSklearnOneVsOneClassifier

One-vs-one multiclass strategy.

WrappedSklearnOneVsRestClassifier

One-vs-the-rest (OvR) multiclass strategy.

WrappedSklearnOutputCodeClassifier

(Error-Correcting) Output-Code multiclass strategy.

Attributes

__author__

simpleml.models.classifiers.sklearn.multiclass.__author__ = Elisha Yadgaran
class simpleml.models.classifiers.sklearn.multiclass.SklearnOneVsOneClassifier(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different than the base model; included here just to maintain the pattern: Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs to external models need to be separated explicitly, since most external constructors do not support arbitrary **kwargs.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement; should return the desired model object.
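
To make the two supported patterns concrete, here is a minimal usage sketch. This is assumed usage, not taken from this page: the LinearSVC estimator and the stepwise calls are illustrative.

from sklearn.svm import LinearSVC

from simpleml.models.classifiers.sklearn.multiclass import SklearnOneVsOneClassifier

# Pattern 1: full initialization in the constructor. external_model_kwargs
# are passed straight through to the wrapped
# sklearn.multiclass.OneVsOneClassifier constructor.
model = SklearnOneVsOneClassifier(
    external_model_kwargs={'estimator': LinearSVC(random_state=0)},
)

# Pattern 2: construct first, configure stepwise, then fit and save.
model = SklearnOneVsOneClassifier()
# ... associate a persisted pipeline here (e.g. via the documented
# pipeline_id parameter); then:
model.fit()
model.save()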

class simpleml.models.classifiers.sklearn.multiclass.SklearnOneVsRestClassifier(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different than the base model; included here just to maintain the pattern: Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs to external models need to be separated explicitly, since most external constructors do not support arbitrary **kwargs.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement; should return the desired model object.

class simpleml.models.classifiers.sklearn.multiclass.SklearnOutputCodeClassifier(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)

Bases: simpleml.models.classifiers.sklearn.base_sklearn_classifier.SklearnClassifier

No different than the base model; included here just to maintain the pattern: Generic Base -> Library Base -> Domain Base -> Individual Models (e.g. [Library]Model -> SklearnModel -> SklearnClassifier -> SklearnLogisticRegression).

Passthrough kwargs to external models need to be separated explicitly, since most external constructors do not support arbitrary **kwargs.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

_create_external_model(self, **kwargs)

Abstract method for each subclass to implement; should return the desired model object.
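
For reference, a plausible sketch of how each subclass satisfies this hook (assumed; the actual simpleml source may differ): it instantiates its corresponding wrapped class, forwarding the passthrough kwargs.

def _create_external_model(self, **kwargs):
    # kwargs arrive via external_model_kwargs from the constructor; the
    # wrapped class is one of the Wrapped* classes documented below.
    return WrappedSklearnOutputCodeClassifier(**kwargs)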

class simpleml.models.classifiers.sklearn.multiclass.WrappedSklearnOneVsOneClassifier(estimator, *, n_jobs=None)

Bases: sklearn.multiclass.OneVsOneClassifier, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

One-vs-one multiclass strategy.

This strategy consists in fitting one classifier per class pair. At prediction time, the class which received the most votes is selected. Since it requires fitting n_classes * (n_classes - 1) / 2 classifiers, this method is usually slower than one-vs-the-rest, due to its O(n_classes^2) complexity. However, this method may be advantageous for algorithms such as kernel algorithms which don't scale well with n_samples. This is because each individual learning problem only involves a small subset of the data whereas, with one-vs-the-rest, the complete dataset is used n_classes times.

Read more in the User Guide.
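
As a quick illustration of the quadratic growth in classifier count (illustrative arithmetic, not part of the sklearn docstring):

# OvO fits n_classes * (n_classes - 1) / 2 binary classifiers,
# versus n_classes for one-vs-the-rest:
for n_classes in (3, 10, 100):
    print(n_classes, n_classes * (n_classes - 1) // 2)  # 3, 45, 4950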

Parameters

estimator : estimator object

An estimator object implementing fit and one of decision_function or predict_proba.

n_jobs : int, default=None

The number of jobs to use for the computation: the n_classes * (n_classes - 1) / 2 OVO problems are computed in parallel.

None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

Attributes

estimators_ : list of n_classes * (n_classes - 1) / 2 estimators

Estimators used for predictions.

classes_ : numpy array of shape [n_classes]

Array containing labels.

n_classes_ : int

Number of classes.

pairwise_indices_ : list, length = len(estimators_), or None

Indices of samples used when training the estimators. None when estimator’s pairwise tag is False.

Deprecated since version 0.24: The _pairwise attribute is deprecated in 0.24. From 1.1 (renaming of 0.25) and onward, pairwise_indices_ will use the pairwise estimator tag instead.

n_features_in_ : int

Number of features seen during fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

New in version 1.0.

See Also

OneVsRestClassifier : One-vs-all multiclass strategy.

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.multiclass import OneVsOneClassifier
>>> from sklearn.svm import LinearSVC
>>> X, y = load_iris(return_X_y=True)
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.33, shuffle=True, random_state=0)
>>> clf = OneVsOneClassifier(
...     LinearSVC(random_state=0)).fit(X_train, y_train)
>>> clf.predict(X_test[:10])
array([2, 1, 0, 2, 0, 2, 0, 1, 1, 1])
class simpleml.models.classifiers.sklearn.multiclass.WrappedSklearnOneVsRestClassifier(estimator, *, n_jobs=None)

Bases: sklearn.multiclass.OneVsRestClassifier, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

One-vs-the-rest (OvR) multiclass strategy.

Also known as one-vs-all, this strategy consists in fitting one classifier per class. For each classifier, the class is fitted against all the other classes. In addition to its computational efficiency (only n_classes classifiers are needed), one advantage of this approach is its interpretability. Since each class is represented by one and only one classifier, it is possible to gain knowledge about the class by inspecting its corresponding classifier. This is the most commonly used strategy for multiclass classification and is a fair default choice.

OneVsRestClassifier can also be used for multilabel classification. To use this feature, provide an indicator matrix for the target y when calling .fit. In other words, the target labels should be formatted as a 2D binary (0/1) matrix, where [i, j] == 1 indicates the presence of label j in sample i. This estimator uses the binary relevance method to perform multilabel classification, which involves training one binary classifier independently for each label.

Read more in the User Guide.

Parameters

estimator : estimator object

An estimator object implementing fit and one of decision_function or predict_proba.

n_jobs : int, default=None

The number of jobs to use for the computation: the n_classes one-vs-rest problems are computed in parallel.

None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

Changed in version 0.20: n_jobs default changed from 1 to None.

Attributes

estimators_ : list of n_classes estimators

Estimators used for predictions.

coef_ : ndarray of shape (1, n_features) or (n_classes, n_features)

Coefficient of the features in the decision function. This attribute exists only if the estimators_ defines coef_.

Deprecated since version 0.24: This attribute is deprecated in 0.24 and will be removed in 1.1 (renaming of 0.26). If you use this attribute in RFE or SelectFromModel, you may pass a callable to the importance_getter parameter that extracts the feature importances from estimators_.

intercept_ : ndarray of shape (1, 1) or (n_classes, 1)

If y is binary, the shape is (1, 1); otherwise (n_classes, 1). This attribute exists only if the estimators_ defines intercept_.

Deprecated since version 0.24: This attribute is deprecated in 0.24 and will be removed in 1.1 (renaming of 0.26). If you use this attribute in RFE or SelectFromModel, you may pass a callable to the importance_getter parameter that extracts the feature importances from estimators_.

classes_ : array, shape = [n_classes]

Class labels.

n_classes_ : int

Number of classes.

label_binarizer_ : LabelBinarizer object

Object used to transform multiclass labels to binary labels and vice-versa.

multilabel_ : boolean

Whether a OneVsRestClassifier is a multilabel classifier.

n_features_in_ : int

Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

New in version 1.0.

See Also

MultiOutputClassifier : Alternate way of extending an estimator for multilabel classification.

sklearn.preprocessing.MultiLabelBinarizer : Transform iterable of iterables to binary indicator matrix.

Examples

>>> import numpy as np
>>> from sklearn.multiclass import OneVsRestClassifier
>>> from sklearn.svm import SVC
>>> X = np.array([
...     [10, 10],
...     [8, 10],
...     [-5, 5.5],
...     [-5.4, 5.5],
...     [-20, -20],
...     [-15, -20]
... ])
>>> y = np.array([0, 0, 1, 1, 2, 2])
>>> clf = OneVsRestClassifier(SVC()).fit(X, y)
>>> clf.predict([[-19, -20], [9, 9], [-5, 5]])
array([2, 0, 1])
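
The example above is plain multiclass; for the multilabel usage described earlier, y is passed as a binary indicator matrix instead. A minimal sketch with made-up data:

import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
# One column per label; y[i, j] == 1 marks the presence of label j in sample i.
y = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])
clf = OneVsRestClassifier(SVC()).fit(X, y)
print(clf.multilabel_)           # True
print(clf.predict(X[:2]).shape)  # (2, 3): one indicator column per label
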
class simpleml.models.classifiers.sklearn.multiclass.WrappedSklearnOutputCodeClassifier(estimator, *, code_size=1.5, random_state=None, n_jobs=None)

Bases: sklearn.multiclass.OutputCodeClassifier, simpleml.models.classifiers.external_models.ClassificationExternalModelMixin

(Error-Correcting) Output-Code multiclass strategy.

Output-code based strategies consist in representing each class with a binary code (an array of 0s and 1s). At fitting time, one binary classifier per bit in the code book is fitted. At prediction time, the classifiers are used to project new points in the class space and the class closest to the points is chosen. The main advantage of these strategies is that the number of classifiers used can be controlled by the user, either for compressing the model (0 < code_size < 1) or for making the model more robust to errors (code_size > 1). See the documentation for more details.

Read more in the User Guide.
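
To make the code_size trade-off concrete, some illustrative arithmetic (the int(n_classes * code_size) count is stated under estimators_ below):

# Binary classifiers fitted for a 10-class problem at various code sizes:
n_classes = 10
for code_size in (0.5, 1.0, 1.5, 2.0):
    print(code_size, int(n_classes * code_size))
# 0.5 -> 5  (compression: fewer than the 10 one-vs-rest classifiers)
# 2.0 -> 20 (redundancy, for error-correcting headroom)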

Parameters

estimator : estimator object

An estimator object implementing fit and one of decision_function or predict_proba.

code_size : float, default=1.5

Percentage of the number of classes to be used to create the code book. A number between 0 and 1 will require fewer classifiers than one-vs-the-rest. A number greater than 1 will require more classifiers than one-vs-the-rest.

random_state : int, RandomState instance, default=None

The generator used to initialize the codebook. Pass an int for reproducible output across multiple function calls. See Glossary.

n_jobs : int, default=None

The number of jobs to use for the computation: the multiclass problems are computed in parallel.

None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

Attributes

estimators_ : list of int(n_classes * code_size) estimators

Estimators used for predictions.

classes_ : ndarray of shape (n_classes,)

Array containing labels.

code_book_ : ndarray of shape (n_classes, code_size)

Binary array containing the code of each class.

n_features_in_ : int

Number of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

New in version 0.24.

feature_names_in_ : ndarray of shape (n_features_in_,)

Names of features seen during fit. Only defined if the underlying estimator exposes such an attribute when fit.

New in version 1.0.

See Also

OneVsRestClassifier : One-vs-all multiclass strategy.

OneVsOneClassifier : One-vs-one multiclass strategy.

References

[1] "Solving multiclass learning problems via error-correcting output codes", Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.

[2] "The error coding method and PICTs", James G., Hastie T., Journal of Computational and Graphical Statistics 7, 1998.

[3] "The Elements of Statistical Learning", Hastie T., Tibshirani R., Friedman J., page 606 (second edition), 2008.

Examples

>>> from sklearn.multiclass import OutputCodeClassifier
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=100, n_features=4,
...                            n_informative=2, n_redundant=0,
...                            random_state=0, shuffle=False)
>>> clf = OutputCodeClassifier(
...     estimator=RandomForestClassifier(random_state=0),
...     random_state=0).fit(X, y)
>>> clf.predict([[0, 0, 0, 0]])
array([1])
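
As a follow-up, the learned codebook can be inspected. With the default code_size=1.5 and the two classes this make_classification call produces, int(2 * 1.5) = 3 binary columns would be expected (assumed output):

>>> clf.code_book_.shape
(2, 3)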