simpleml.models.base_model

Base Model module

Module Contents

Classes

LibraryModel

Main model class needs to be initializable in order to work with database persistence and loading.

Model

Base class for all Model objects. Defines the parameters required for versioning.

Attributes

LOGGER

__author__

simpleml.models.base_model.LOGGER[source]
simpleml.models.base_model.__author__ = Elisha Yadgaran[source]
class simpleml.models.base_model.LibraryModel(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)[source]

Bases: Model

Main model class needs to be initializable in order to work with database persistence and loading. This class is the intermediate layer that defines the expected methods for each extended library.

Examples:
  • Scikit-learn estimators –> SklearnModel(LibraryModel): …

  • Keras estimators –> KerasModel(LibraryModel): …

  • PyTorch … …

Keyword arguments intended for the external model must be passed separately (via external_model_kwargs), since most external libraries do not accept arbitrary **kwargs in their constructors.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.
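The kwarg separation and the two configuration patterns described above can be illustrated with a small stand-in. This is a sketch under stated assumptions: `ToyEstimator` and `ToyWrapper` are hypothetical classes written for this example, not SimpleML code, and the real constructor handles persistence concerns that are omitted here.

```python
from typing import Any, Dict, Optional


class ToyEstimator:
    """Hypothetical external estimator whose constructor rejects unknown kwargs."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha


class ToyWrapper:
    """Hypothetical wrapper that mirrors a subset of the documented arguments."""

    def __init__(
        self,
        external_model_kwargs: Optional[Dict[str, Any]] = None,
        params: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> None:
        # Only external_model_kwargs reach the estimator constructor;
        # the remaining **kwargs stay on the wrapper as metadata.
        self.external_model = ToyEstimator(**(external_model_kwargs or {}))
        self.params = dict(params or {})
        self.metadata = dict(kwargs)

    def set_params(self, **params: Any) -> None:
        # Supports stepwise configuration before fit and save.
        self.params.update(params)


# Pattern 1: full initialization in the constructor.
full = ToyWrapper(
    external_model_kwargs={"alpha": 0.5},
    params={"max_iter": 10},
    name="demo",
)

# Pattern 2: stepwise configuration after construction.
stepwise = ToyWrapper(name="demo")
stepwise.set_params(max_iter=10)
```

Passing `name="demo"` directly to `ToyEstimator` would raise a `TypeError`, which is the situation the separate `external_model_kwargs` argument is meant to avoid.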

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

abstract _fit(self)[source]

Abstract placeholder method. Inheriting classes MUST implement this method to manage the fit operation. The fit logic is intentionally not unified here, because each library configures fitting a little differently internally.
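A minimal sketch of how a library-specific subclass might supply `_fit`. `BaseStandIn`, `MeanEstimator` and `MeanModel` are hypothetical names for this illustration, and the `training_data` attribute is an assumption that stands in for data the real class would obtain through its pipeline.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseStandIn(ABC):
    """Hypothetical stand-in for the abstract base; not the SimpleML class."""

    def __init__(self, training_data: List[float]) -> None:
        self.training_data = training_data  # assumption: replaces the pipeline
        self.fitted = False

    @abstractmethod
    def _fit(self) -> None:
        """Library-specific fit logic goes here."""

    def fit(self) -> "BaseStandIn":
        # Public entry point delegates to the subclass implementation.
        self._fit()
        self.fitted = True
        return self


class MeanEstimator:
    """Hypothetical external estimator that learns the mean of its input."""

    def fit(self, X: List[float]) -> None:
        self.mean_ = sum(X) / len(X)


class MeanModel(BaseStandIn):
    def __init__(self, training_data: List[float]) -> None:
        super().__init__(training_data)
        self.external_model = MeanEstimator()

    def _fit(self) -> None:
        # Each library calls its estimator differently; this one takes X only.
        self.external_model.fit(self.training_data)
```

Because `_fit` is declared abstract, instantiating `BaseStandIn` directly raises a `TypeError`; only subclasses that implement `_fit` can be created.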

class simpleml.models.base_model.Model(has_external_files=True, external_model_kwargs=None, params=None, fitted=False, pipeline_id=None, **kwargs)[source]

Bases: simpleml.persistables.base_persistable.Persistable

Base class for all Model objects. Defines the parameters required for versioning; all other metadata can be stored in the arbitrary metadata field.

Also outlines the expected subclass methods (which raise NotImplementedError until overridden). By design, no unified API is abstracted across all libraries, since each has a different internal mechanism.

Keyword arguments intended for the external model must be passed separately (via external_model_kwargs), since most external libraries do not accept arbitrary **kwargs in their constructors.

Two patterns are supported: full initialization in the constructor, or stepwise configuration before fit and save.

Parameters
  • has_external_files (bool) –

  • external_model_kwargs (Optional[Dict[str, Any]]) –

  • params (Optional[Dict[str, Any]]) –

  • fitted (bool) –

  • pipeline_id (Optional[Union[str, uuid.uuid4]]) –

object_type = MODEL[source]
abstract _create_external_model(self, **kwargs)[source]

Abstract method for each subclass to implement. Should return the desired model object.

abstract _fit(self)[source]

Abstract placeholder method. Inheriting classes MUST implement this method to manage the fit operation. The fit logic is intentionally not unified here, because each library configures fitting a little differently internally.

_hash(self)[source]
Hash is the combination of the:
  1. Pipeline

  2. Model

  3. Params

  4. Config

The hash may only include attributes that exist at instantiation. Including any attribute that is calculated later creates a race condition: the function may return a different hash depending on when it is called.
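The idea of hashing only instantiation-time inputs can be sketched as follows. This is not SimpleML's hashing routine: `toy_model_hash` is a hypothetical helper that substitutes SHA-256 over a sorted JSON payload, and it assumes all four inputs are JSON-serializable.

```python
import hashlib
import json
from typing import Any, Dict


def toy_model_hash(
    pipeline_hash: str,
    model_class: str,
    params: Dict[str, Any],
    config: Dict[str, Any],
) -> str:
    """Hypothetical helper: combine four instantiation-time inputs into a digest."""
    payload = json.dumps(
        {
            "pipeline": pipeline_hash,
            "model": model_class,
            "params": params,
            "config": config,
        },
        sort_keys=True,  # stable key order so equal inputs give equal digests
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = toy_model_hash("abc", "SklearnModel", {"x": 1, "y": 2}, {"seed": 0})
reordered = toy_model_hash("abc", "SklearnModel", {"y": 2, "x": 1}, {"seed": 0})
changed = toy_model_hash("abc", "SklearnModel", {"x": 1, "y": 3}, {"seed": 0})
```

Since every input is known at construction time, `first` and `reordered` are identical no matter when the function is called, while `changed` differs. An attribute set only after fitting (for example learned coefficients) would break that guarantee.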

_load_pipeline(self)[source]

Helper to fetch the pipeline

_predict(self, X, **kwargs)[source]

Separate out actual predict call for optional overwrite in subclasses

add_pipeline(self, pipeline)[source]

Setter method for the pipeline used by the model

Parameters

pipeline (simpleml.pipelines.Pipeline) –

Return type

None

assert_fitted(self, msg='')[source]

Helper method to raise an error if the model has not been fit

assert_pipeline(self, msg='')[source]

Helper method to raise an error if pipeline isn’t present and configured
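A guard helper of this kind might look like the sketch below. The exception type is not stated in this reference, so `ToyModelError` is a hypothetical placeholder, as is the `ToyGuardedModel` class and its default messages.

```python
from typing import Any, List, Optional


class ToyModelError(RuntimeError):
    """Hypothetical error type; the real helpers may raise something else."""


class ToyGuardedModel:
    def __init__(self) -> None:
        self.fitted = False
        self.pipeline: Optional[Any] = None

    def assert_fitted(self, msg: str = "") -> None:
        # Fail early with the caller-supplied message when the model is not fit.
        if not self.fitted:
            raise ToyModelError(msg or "Model has not been fit")

    def assert_pipeline(self, msg: str = "") -> None:
        # Fail early when no pipeline has been attached.
        if self.pipeline is None:
            raise ToyModelError(msg or "Pipeline is missing")

    def predict(self, X: List[float]) -> List[float]:
        self.assert_fitted("Call fit() before predict()")
        return list(X)  # placeholder for a real prediction
```

Calling `predict` on a fresh `ToyGuardedModel` raises `ToyModelError` with the supplied message, which keeps the failure close to its cause instead of surfacing later inside the external library.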

property external_model(self)[source]

All model objects require some file-based persisted object

Wrapper around whatever underlying class is desired (eg sklearn or keras)

fit(self, **kwargs)[source]

Pass through method to external model after running through pipeline

fit_predict(self, **kwargs)[source]

Wrapper for fit and predict methods

property fitted(self)[source]
get_feature_metadata(self, **kwargs)[source]

Abstract method for each model to define

Should return a dict of feature information (importance, coefficients…)
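As an illustration of the kind of dict such a method could return, here is a sketch for a linear-style model. The function name `toy_feature_metadata` and the `"coefficients"` key are assumptions made for this example; the reference does not specify the dict layout.

```python
from typing import Dict, Sequence


def toy_feature_metadata(
    feature_names: Sequence[str],
    coefficients: Sequence[float],
) -> Dict[str, Dict[str, float]]:
    """Hypothetical helper: pair each feature name with its coefficient."""
    if len(feature_names) != len(coefficients):
        raise ValueError("feature_names and coefficients must have the same length")
    return {"coefficients": dict(zip(feature_names, coefficients))}


metadata = toy_feature_metadata(["age", "income"], [0.5, -1.2])
```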

get_labels(self, dataset_split=None)[source]

Wrapper method to return labels from dataset

get_params(self, **kwargs)[source]

Pass through method to external model

property pipeline(self)[source]

Uses a weakref to bind the linked pipeline so that it does not bloat memory usage. Returns the pipeline if it is still available, or tries to fetch it otherwise.
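The weakref-with-fallback behavior can be sketched with the standard library `weakref` module. `ToyPipeline`, `ToyLinkedModel` and the `loader` callable are hypothetical; in particular, `loader` stands in for whatever `_load_pipeline` does to fetch the pipeline again.

```python
import weakref
from typing import Callable, Optional


class ToyPipeline:
    """Hypothetical pipeline object."""

    def __init__(self, name: str) -> None:
        self.name = name


class ToyLinkedModel:
    def __init__(self, loader: Callable[[], ToyPipeline]) -> None:
        self._pipeline_ref: Optional["weakref.ref[ToyPipeline]"] = None
        self._loader = loader  # assumption: stands in for _load_pipeline

    def add_pipeline(self, pipeline: ToyPipeline) -> None:
        # Hold only a weak reference so the model does not keep the pipeline alive.
        self._pipeline_ref = weakref.ref(pipeline)

    @property
    def pipeline(self) -> ToyPipeline:
        pipeline = self._pipeline_ref() if self._pipeline_ref is not None else None
        if pipeline is None:
            # The referent was garbage collected (or never set): fetch it again.
            pipeline = self._loader()
            self._pipeline_ref = weakref.ref(pipeline)
        return pipeline
```

One consequence of this design is that a reloaded pipeline stays alive only as long as the caller holds the returned object; the model itself never owns a strong reference.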

predict(self, X, transform=True, **kwargs)[source]

Pass through method to external model after running through pipeline

Parameters

transform (bool) – whether to transform input via pipeline before predicting, default True
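The interaction between `predict`, `_predict` and `transform` can be sketched as below. All three classes are hypothetical stand-ins, and the flow shown is an assumption based on the descriptions in this reference, not a copy of the SimpleML implementation.

```python
from typing import Any, List


class DoublingPipeline:
    """Hypothetical pipeline that doubles every input value."""

    def transform(self, X: List[float]) -> List[float]:
        return [2 * x for x in X]


class PlusOneEstimator:
    """Hypothetical external model that adds one to every input value."""

    def predict(self, X: List[float]) -> List[float]:
        return [x + 1 for x in X]


class ToyPredictor:
    def __init__(self, pipeline: DoublingPipeline, external_model: PlusOneEstimator) -> None:
        self.pipeline = pipeline
        self.external_model = external_model

    def transform(self, X: List[float]) -> List[float]:
        # Single place that touches the pipeline relationship.
        return self.pipeline.transform(X)

    def _predict(self, X: List[float], **kwargs: Any) -> List[float]:
        # Isolated so subclasses can override just the external call.
        return self.external_model.predict(X, **kwargs)

    def predict(self, X: List[float], transform: bool = True, **kwargs: Any) -> List[float]:
        if transform:
            X = self.transform(X)
        return self._predict(X, **kwargs)


predictor = ToyPredictor(DoublingPipeline(), PlusOneEstimator())
with_transform = predictor.predict([1, 2])
without_transform = predictor.predict([1, 2], transform=False)
```

Setting `transform=False` is useful when the input has already been passed through the pipeline, so the transformation is not applied twice.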

save(self, **kwargs)[source]

Extend parent function with a few additional save routines

  1. save params

  2. save feature metadata

score(self, X, y=None, **kwargs)[source]

Pass through method to external model

set_params(self, **params)[source]

Pass through method to external model

transform(self, *args, **kwargs)[source]

Run input through pipeline – only method that should reference the pipeline relationship directly (gates the connection point for easy extension in the future)