simpleml.pipelines.base_pipeline module

class simpleml.pipelines.base_pipeline.BasePipeline(has_external_files=True, transformers=[], **kwargs)

Base class for all Pipeline objects.

params: pipeline parameter metadata, stored for easy insight into hyperparameters across training runs

add_transformer(name, transformer)

Setter method for adding a new transformer step
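The setter pattern described here can be illustrated with a minimal sketch. The `StepListSketch` class and its internal list of `(name, transformer)` tuples are hypothetical stand-ins chosen for illustration; how BasePipeline actually stores its steps is not documented in this section.

```python
class StepListSketch:
    """Hypothetical stand-in for a pipeline; not the SimpleML implementation."""

    def __init__(self, transformers=None):
        # Copy into a fresh list so instances never share step state
        self.transformers = list(transformers or [])

    def add_transformer(self, name, transformer):
        # Append a new named step to the end of the ordered step list
        self.transformers.append((name, transformer))


pipeline = StepListSketch()
pipeline.add_transformer("double", lambda rows: [r * 2 for r in rows])
step_names = [name for name, _ in pipeline.transformers]
```

Keeping the steps as ordered name/transformer pairs mirrors the convention used by sklearn-style pipelines, which is one of the underlying classes this module mentions wrapping.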

external_pipeline

All pipeline objects require some file-based persisted object

Wrapper around whatever underlying class is desired (e.g. sklearn or native)

fit_transform(return_y=False, **kwargs)

Wrapper for the fit and transform methods. ASSUMES it only applies to the train split

Parameters:	return_y – whether to return y with the output; necessary for fitting a supervised model afterwards
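As a rough illustration of the `return_y` behaviour, the sketch below fits a toy transformer on a train split and optionally returns the labels alongside the transformed features. `MeanCenterer`, the standalone `fit_transform` function, and the `(X, y)` tuple layout of the split are all assumptions made for the example; they are not SimpleML classes or signatures.

```python
class MeanCenterer:
    """Hypothetical transformer used only for illustration."""

    def fit(self, X):
        # Learn the mean of the training features
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        # Subtract the learned mean from every value
        return [x - self.mean_ for x in X]


def fit_transform(transformer, train_split, return_y=False):
    # Assumed convention: the train split is an (X, y) tuple
    X, y = train_split
    transformed = transformer.fit(X).transform(X)
    if return_y:
        # Hand back y as well so a supervised model can be fit next
        return transformed, y
    return transformed


train = ([1.0, 2.0, 3.0], [0, 1, 0])
X_only = fit_transform(MeanCenterer(), train)
X_out, y_out = fit_transform(MeanCenterer(), train, return_y=True)
```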

get_dataset_split(split=None)

Get a specific dataset split. By default no constraint is imposed, but by convention the return value should be a tuple of (X, y)

get_feature_names()

Pass-through method to the external pipeline. Should return a list of the final features generated by this pipeline
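A pass-through method of this kind amounts to simple delegation. The sketch below uses two hypothetical classes, `ExternalSketch` and `WrapperSketch`, to show the shape of that delegation; the real external pipeline class and its attributes may differ.

```python
class ExternalSketch:
    """Hypothetical underlying pipeline that knows its output features."""

    def __init__(self, feature_names):
        self._feature_names = list(feature_names)

    def get_feature_names(self):
        # Return a copy so callers cannot mutate internal state
        return list(self._feature_names)


class WrapperSketch:
    """Hypothetical wrapper that forwards calls to the external pipeline."""

    def __init__(self, external_pipeline):
        self.external_pipeline = external_pipeline

    def get_feature_names(self):
        # Pass through: no logic of its own, just delegate
        return self.external_pipeline.get_feature_names()


wrapper = WrapperSketch(ExternalSketch(["age", "age_squared"]))
features = wrapper.get_feature_names()
```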

params = Column(None, JSONB(astext_type=Text()), table=None, default=ColumnDefault({}))

save(**kwargs)

Extend parent function with a few additional save routines

transform(X, dataset_split=None, return_y=False, **kwargs)

Pass-through method to the external pipeline

Parameters:	X – dataframe/matrix to transform; if None, use the internal dataset
	return_y – whether to return y with the output (only used if X is None); necessary for fitting a supervised model afterwards
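The interaction between `X`, `dataset_split`, and `return_y` can be sketched as follows. `TransformSketch`, its `splits` dictionary, the `"TEST"` split name, and the multiply-by-ten `_apply` step are hypothetical placeholders; only the parameter names and the "fall back to the internal dataset when X is None" rule come from the documentation above.

```python
class TransformSketch:
    """Hypothetical pipeline illustrating the X=None fallback."""

    def __init__(self, splits):
        # Assumed layout: split name -> (X, y) tuple
        self.splits = splits

    def get_dataset_split(self, split=None):
        return self.splits[split]

    def _apply(self, X):
        # Placeholder for the external pipeline's transform
        return [x * 10 for x in X]

    def transform(self, X, dataset_split=None, return_y=False):
        if X is None:
            # No data supplied: use the internal dataset split
            X, y = self.get_dataset_split(dataset_split)
            output = self._apply(X)
            return (output, y) if return_y else output
        # Data supplied by the caller: return_y is ignored
        return self._apply(X)


sketch = TransformSketch({"TEST": ([1, 2], [0, 1])})
external = sketch.transform([3], return_y=True)
internal = sketch.transform(None, dataset_split="TEST", return_y=True)
```

In this sketch `return_y` has no effect when the caller passes its own `X`, since the pipeline has no labels to pair with externally supplied data.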