simpleml.pipelines.base_pipeline module
-
class simpleml.pipelines.base_pipeline.BasePipeline(has_external_files=True, transformers=[], **kwargs)
Bases: simpleml.persistables.base_persistable.BasePersistable, simpleml.persistables.saving.AllSaveMixin
Base class for all Pipeline objects.
params: pipeline parameter metadata for easy insight into hyperparameters across trainings
-
external_pipeline
All pipeline objects require some file-based persisted object.
This is a wrapper around whatever underlying class is desired (e.g. sklearn or native).
-
fit_transform(return_y=False, **kwargs)
Wrapper for the fit and transform methods. Assumes it is applied only to the train split.
Parameters: return_y – whether to return y with the output; necessary for fitting a supervised model afterwards
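The return_y behaviour described above can be illustrated with a minimal sketch. ToyPipeline, its dict-based dataset, and the mean-centering step are all hypothetical stand-ins, not simpleml code; only the idea of fitting on the train split and optionally returning y comes from the description.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not simpleml's BasePipeline."""

    def __init__(self, dataset):
        # Assumption: the dataset is a dict mapping split name -> (X, y)
        self.dataset = dataset
        self.mean_ = None

    def fit(self, X):
        # Learn a single statistic (the global mean) from the training rows
        flat = [value for row in X for value in row]
        self.mean_ = sum(flat) / len(flat)
        return self

    def _apply(self, X):
        # Center every value using the statistic learned in fit()
        return [[value - self.mean_ for value in row] for row in X]

    def fit_transform(self, return_y=False):
        # Always works on the train split, mirroring the documented assumption
        X, y = self.dataset["TRAIN"]
        transformed = self.fit(X)._apply(X)
        return (transformed, y) if return_y else transformed


pipeline = ToyPipeline({"TRAIN": ([[1.0, 3.0], [5.0, 7.0]], [0, 1])})
X_only = pipeline.fit_transform()
X_out, y_out = pipeline.fit_transform(return_y=True)
```

With return_y=True the caller gets both pieces needed to fit a supervised model in one call; with the default, only the transformed features come back.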
-
get_dataset_split(split=None)
Get a specific dataset split. By default no constraint is imposed, but by convention the return value should be a tuple of (X, y).
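As an illustration of the (X, y) convention, here is a small sketch. ToyDataset, its label_key argument, and the choice to return every row when split is None are assumptions made for the example; they are not taken from simpleml.

```python
class ToyDataset:
    """Hypothetical in-memory dataset; not part of simpleml."""

    def __init__(self, rows, label_key, splits):
        self.rows = rows            # list of dicts, one per record
        self.label_key = label_key  # name of the target column
        self.splits = splits        # split name -> list of row indices

    def get_dataset_split(self, split=None):
        # Assumption: split=None means "all rows"
        if split is None:
            indices = range(len(self.rows))
        else:
            indices = self.splits[split]
        X = [
            {key: value for key, value in self.rows[i].items() if key != self.label_key}
            for i in indices
        ]
        y = [self.rows[i][self.label_key] for i in indices]
        return X, y  # the conventional (X, y) tuple


dataset = ToyDataset(
    rows=[
        {"age": 31, "label": 0},
        {"age": 45, "label": 1},
        {"age": 27, "label": 0},
    ],
    label_key="label",
    splits={"TRAIN": [0, 1], "TEST": [2]},
)
X_train, y_train = dataset.get_dataset_split("TRAIN")
X_all, y_all = dataset.get_dataset_split()
```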
-
get_feature_names()
Pass-through method to the external pipeline. Should return a list of the final features generated by this pipeline.
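A pass-through of this kind might look like the sketch below. Both classes are hypothetical: ToyExternalPipeline plays the role of the wrapped object (for example an sklearn-style pipeline), and its one-hot naming scheme is invented purely for the example.

```python
class ToyExternalPipeline:
    """Hypothetical wrapped object that knows its output feature names."""

    def __init__(self):
        self.feature_names_ = []

    def fit(self, rows):
        # Invented naming scheme: one output feature per distinct key=value pair
        self.feature_names_ = sorted(
            {f"{key}={value}" for row in rows for key, value in row.items()}
        )
        return self

    def get_feature_names(self):
        return list(self.feature_names_)


class ToyPipeline:
    """Hypothetical wrapper that only delegates to the external pipeline."""

    def __init__(self, external_pipeline):
        self.external_pipeline = external_pipeline

    def get_feature_names(self):
        # Pass-through: no logic of its own
        return self.external_pipeline.get_feature_names()


external = ToyExternalPipeline().fit([{"color": "red"}, {"color": "blue"}])
feature_names = ToyPipeline(external).get_feature_names()
```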
-
params = Column(None, JSONB(astext_type=Text()), table=None, default=ColumnDefault({}))
-
save(**kwargs)
Extend the parent function with a few additional save routines:
- save params
- save transformer metadata
- features
-
transform(X, dataset_split=None, return_y=False, **kwargs)
Pass-through method to the external pipeline.
Parameters: - X – dataframe/matrix to transform; if None, use the internal dataset
- return_y – whether to return y with the output (only used if X is None); necessary for fitting a supervised model afterwards
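The interplay between X, dataset_split and return_y can be sketched as follows. ToyPipeline and its fixed scaling step are hypothetical; only the branching on X is None and the rule that return_y applies solely in that case follow the parameter descriptions above.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not simpleml's BasePipeline."""

    def __init__(self, dataset, scale=2.0):
        # Assumption: the dataset is a dict mapping split name -> (X, y)
        self.dataset = dataset
        self.scale = scale

    def get_dataset_split(self, split=None):
        return self.dataset[split]

    def _apply(self, X):
        # Stand-in for the external pipeline's transform
        return [[value * self.scale for value in row] for row in X]

    def transform(self, X, dataset_split=None, return_y=False):
        if X is None:
            # No data passed in: fall back to the internal dataset
            X, y = self.get_dataset_split(dataset_split)
            output = self._apply(X)
            return (output, y) if return_y else output
        # Data passed in: return_y is ignored because no y is known
        return self._apply(X)


pipeline = ToyPipeline({"TEST": ([[1.0, 2.0]], [1])})
external_result = pipeline.transform([[3.0]], return_y=True)
internal_result = pipeline.transform(None, dataset_split="TEST", return_y=True)
```

Here external_result is just the transformed matrix, while internal_result is an (X, y) pair drawn from the stored split.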
-