simpleml.datasets.dask.pipeline

Pipeline derived datasets

Module Contents

Classes

DaskPipelineDataset

Dask dataset class that generates the dataframe as the output of the linked pipeline

Attributes

__author__

simpleml.datasets.dask.pipeline.__author__ = Elisha Yadgaran
class simpleml.datasets.dask.pipeline.DaskPipelineDataset(squeeze_return=False, **kwargs)

Bases: simpleml.datasets.dask.base.BaseDaskDataset

Dask dataset class that generates the dataframe as the output of the linked pipeline

Parameters

squeeze_return (bool) – Whether to call dataframe.squeeze() on the result of self.get() calls. Useful for matching the input types other libraries expect (e.g. a one-dimensional y for sklearn when there is a single label column). Defaults to False.
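To illustrate what squeezing does to a returned dataframe, here is a minimal sketch. It uses plain pandas rather than Dask and does not call simpleml at all; the column names and values are made up for the example. It only shows the general behavior of DataFrame.squeeze(), which is the operation the flag is described as applying.

```python
import pandas as pd

# Hypothetical single-label target, as a one-column dataframe
y = pd.DataFrame({"label": [0, 1, 1]})

# squeeze() collapses the length-1 column axis, leaving a Series
y_squeezed = y.squeeze()

# A dataframe with more than one row and column has no length-1 axis,
# so squeeze() leaves it as a DataFrame
X = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
X_squeezed = X.squeeze()

print(type(y_squeezed).__name__, type(X_squeezed).__name__)
```

Note that in pandas a dataframe with exactly one row and one column squeezes all the way down to a scalar, which may be worth keeping in mind for very small datasets.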

build_dataframe(self)

Transform the raw dataset via the dataset pipeline to produce a production-ready dataset

Return type

None