simpleml.datasets.numpy.pipeline
Pipeline derived datasets
Module Contents
Classes
NumpyPipelineDataset | Dataset class with a predefined build routine, assuming dataset pipeline existence.
class simpleml.datasets.numpy.pipeline.NumpyPipelineDataset(*args, **kwargs)
Bases: simpleml.datasets.numpy.base.BaseNumpyDataset
Dataset class with a predefined build routine, assuming dataset pipeline existence.
WARNING: this class will fail if build_dataframe is not overridden and no pipeline is provided!
param label_columns: Optional list of column names to register as the "y" split section.
param other_named_split_sections: Optional map of section names to lists of column names for other arbitrary split columns. Section names must match the expected consumer signatures (e.g. sample_weights) because they are passed through untouched downstream (e.g. sklearn.fit(**split)).
All other columns in the dataframe will automatically be referenced as "X".
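The column routing described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not simpleml's implementation: split_columns and fit are hypothetical stand-ins, and the column names are invented for the example.

```python
from typing import Dict, List, Optional


def split_columns(
    all_columns: List[str],
    label_columns: Optional[List[str]] = None,
    other_named_split_sections: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Hypothetical helper: map each split section name to its column names."""
    label_columns = label_columns or []
    other_named_split_sections = other_named_split_sections or {}

    split: Dict[str, List[str]] = {}
    if label_columns:
        split["y"] = list(label_columns)
    for section, columns in other_named_split_sections.items():
        split[section] = list(columns)

    # Every column not claimed by a named section is referenced as "X".
    claimed = {column for columns in split.values() for column in columns}
    split["X"] = [column for column in all_columns if column not in claimed]
    return split


def fit(X, y=None, sample_weight=None):
    """Stand-in consumer: section names must match these parameter names."""
    return {"X": X, "y": y, "sample_weight": sample_weight}


split = split_columns(
    ["age", "income", "weight", "target"],  # invented column names
    label_columns=["target"],
    other_named_split_sections={"sample_weight": ["weight"]},
)

# Sections are unpacked untouched, so an unknown section name would raise TypeError.
result = fit(**split)
```

Because the sections are unpacked as keyword arguments, a section name that the consumer does not accept fails at call time, which is why the names must match the consumer's signature.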