simpleml.datasets

Import modules to register class names in the global registry.

Package Contents

Classes

Dataset

Base class for all Dataset objects.

Attributes

__author__

simpleml.datasets.__author__ = Elisha Yadgaran
exception simpleml.datasets.DatasetError(*args, **kwargs)

Bases: SimpleMLError

Common base class for all non-exit exceptions.

class simpleml.datasets.Dataset(has_external_files=True, label_columns=None, other_named_split_sections=None, **kwargs)

Bases: AbstractDataset

Base class for all Dataset objects.

pipeline_id: foreign key relation to the dataset pipeline used as input.

label_columns: optional list of column names to register as the “y” split section.

other_named_split_sections: optional map of section names to lists of column names for other arbitrary split columns. These must match the expected consumer signatures (e.g. sample_weights) because they are passed through untouched downstream (e.g. sklearn.fit(**split)).

All other columns in the dataframe will automatically be referenced as “X”.

Parameters
  • has_external_files (bool)

  • label_columns (Optional[List[str]])

  • other_named_split_sections (Optional[Dict[str, List[str]]])
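
As a rough illustration of the column assignment rule described above, the pure-Python sketch below maps column names to split sections. It is not simpleml's implementation: assign_split_sections is a hypothetical helper written for this example, and the column names are made up. It only assumes what the docstring states, namely that label_columns become “y”, each entry of other_named_split_sections keeps its own name, and every remaining column is referenced as “X”.

```python
from typing import Dict, List, Optional


def assign_split_sections(
    columns: List[str],
    label_columns: Optional[List[str]] = None,
    other_named_split_sections: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Hypothetical helper (not part of simpleml): map section names to columns."""
    label_columns = label_columns or []
    other_named_split_sections = other_named_split_sections or {}

    sections: Dict[str, List[str]] = {}

    # Label columns are registered as the "y" section.
    if label_columns:
        sections["y"] = list(label_columns)

    # Other named sections keep the names supplied by the caller,
    # since they are meant to match downstream keyword arguments.
    for name, cols in other_named_split_sections.items():
        sections[name] = list(cols)

    # Every column not claimed by a named section falls into "X".
    claimed = {col for cols in sections.values() for col in cols}
    sections["X"] = [col for col in columns if col not in claimed]
    return sections


# Illustrative column names only.
sections = assign_split_sections(
    columns=["age", "income", "weight", "churned"],
    label_columns=["churned"],
    other_named_split_sections={"sample_weights": ["weight"]},
)
print(sections)
```

Because the section names are passed through untouched, a name such as sample_weights would need to match a keyword the downstream consumer actually accepts.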

__table_args__
__tablename__ = datasets
pipeline
pipeline_id