simpleml.pipelines.projected_splits

Module for projecting datasets into pipelines. Defines the transfer objects returned from pipelines.

Module Contents

Classes

IdentityProjectedDatasetSplit

Straight passthrough variety of projection (i.e. projected split == dataset split).

IndexBasedProjectedDatasetSplit

Index-based subset. Compatible with dataset splits that support indexing.

ProjectedDatasetSplit

Transfer object to pass dataset splits through pipelines

Attributes

__author__

simpleml.pipelines.projected_splits.__author__ = Elisha Yadgaran

class simpleml.pipelines.projected_splits.IdentityProjectedDatasetSplit(dataset, split)

Bases: ProjectedDatasetSplit

Straight passthrough variety of projection (i.e. projected split == dataset split).

apply_projection(self, dataset_split)

Identity return: the dataset split is returned unchanged.

Parameters

dataset_split (simpleml.datasets.dataset_splits.Split) –

Return type

simpleml.datasets.dataset_splits.Split

class simpleml.pipelines.projected_splits.IndexBasedProjectedDatasetSplit(indices, **kwargs)

Bases: ProjectedDatasetSplit

Index-based subset. Compatible with dataset splits that support indexing.

apply_projection(self, dataset_split)

Index subset return: the dataset split is subset by the given indices.

Parameters

dataset_split (simpleml.datasets.dataset_splits.Split) –

Return type

simpleml.datasets.dataset_splits.Split
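As a rough illustration of an index-based projection, the sketch below subsets every section of a split by the same list of row indices. ToySplit and apply_index_projection are hypothetical stand-ins written for this example; they are not part of simpleml, and the real Split object may store its data differently.

```python
from typing import Sequence


class ToySplit(dict):
    """Hypothetical stand-in for a dataset split: section name -> list of rows."""


def apply_index_projection(split: ToySplit, indices: Sequence[int]) -> ToySplit:
    """Return a new split holding only the rows at `indices`, in that order."""
    return ToySplit(
        {name: [rows[i] for i in indices] for name, rows in split.items()}
    )


full = ToySplit(X=[[1, 2], [3, 4], [5, 6], [7, 8]], y=[0, 1, 0, 1])
subset = apply_index_projection(full, [0, 2])
```

The original split is left untouched; only the returned object contains the subset.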

static dask_indexing(df, indices)

classmethod indexing_method(cls, df, *args, **kwargs)

Infers which indexing method to use based on the type of df.
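One way to infer an indexing method from the input type is a small registry checked with isinstance. The sketch below is only an assumption about the general pattern: it uses built-in list and tuple containers as stand-ins so that it runs without pandas, numpy, or dask, and the names INDEXERS, list_indexing, and tuple_indexing are hypothetical.

```python
from typing import Any, Sequence


def list_indexing(data: list, indices: Sequence[int]) -> list:
    """Select the given positions from a list."""
    return [data[i] for i in indices]


def tuple_indexing(data: tuple, indices: Sequence[int]) -> tuple:
    """Select the given positions from a tuple."""
    return tuple(data[i] for i in indices)


# Hypothetical registry: (container type, function that indexes it)
INDEXERS = [(list, list_indexing), (tuple, tuple_indexing)]


def indexing_method(data: Any, *args, **kwargs):
    """Pick the indexer matching the container type and apply it."""
    for container_type, indexer in INDEXERS:
        if isinstance(data, container_type):
            return indexer(data, *args, **kwargs)
    raise NotImplementedError(f"No indexing method for {type(data).__name__}")
```

Unsupported container types fail loudly here rather than falling back to a guess.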

static numpy_indexing(df, indices)

static pandas_indexing(df, indices)

class simpleml.pipelines.projected_splits.ProjectedDatasetSplit(dataset, split)

Transfer object to pass dataset splits through pipelines

Contains a reference to the dataset and the internal logic to project the split. The dataset is referenced on each call to avoid mutability issues.

Wraps the normal Split object but delegates behavior to it, so the two can be used interchangeably.

__getattr__(self, attr)

Passthrough to treat a projected split like a normal split
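The passthrough idea can be sketched with Python's __getattr__ hook, which is only called when normal attribute lookup fails. ToySplit and DelegatingWrapper below are hypothetical classes written for illustration, not the library's implementation.

```python
class ToySplit:
    """Hypothetical stand-in for a normal split with X and y sections."""

    def __init__(self, X, y):
        self.X = X
        self.y = y


class DelegatingWrapper:
    """Forwards unknown attribute lookups to the wrapped object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, attr):
        # Guard against infinite recursion if _wrapped is not set yet
        # (for example while the object is being copied or unpickled).
        if attr == "_wrapped":
            raise AttributeError(attr)
        return getattr(self._wrapped, attr)


wrapper = DelegatingWrapper(ToySplit(X=[[1], [2]], y=[0, 1]))
features = wrapper.X  # resolved on the wrapped ToySplit
```

Attributes missing on both objects still raise AttributeError, so hasattr checks behave as they would on the wrapped split.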

__getitem__(self, item)

abstract apply_projection(self, dataset_split)

Main method to apply the projection logic to the dataset split. Returns a new Split with the data subset.

Parameters

dataset_split (simpleml.datasets.dataset_splits.Split) –

Return type

simpleml.datasets.dataset_splits.Split
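The abstract method pattern can be sketched with Python's abc module: the base class declares apply_projection and each subclass supplies its own projection. All class names below (ToyProjection, ToyIdentityProjection, ToyHeadProjection) are hypothetical, and a plain dict stands in for the Split type.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Split = Dict[str, List]  # hypothetical stand-in for a dataset split


class ToyProjection(ABC):
    @abstractmethod
    def apply_projection(self, dataset_split: Split) -> Split:
        """Return a split containing the projected data."""


class ToyIdentityProjection(ToyProjection):
    def apply_projection(self, dataset_split: Split) -> Split:
        # Passthrough: the projected split is the dataset split itself
        return dataset_split


class ToyHeadProjection(ToyProjection):
    def __init__(self, n: int) -> None:
        self.n = n

    def apply_projection(self, dataset_split: Split) -> Split:
        # Build a new split with only the first n rows of each section
        return {name: rows[: self.n] for name, rows in dataset_split.items()}


split = {"X": [[1], [2], [3]], "y": [0, 1, 0]}
same = ToyIdentityProjection().apply_projection(split)
head = ToyHeadProjection(2).apply_projection(split)
```

Instantiating ToyProjection directly raises TypeError, which is how the abstract declaration forces subclasses to define their projection.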

property dataset_split(self)

Passthrough method to retrieve the raw split

Return type

simpleml.datasets.dataset_splits.Split

property projected_split(self)

Wrapper property that retrieves the dataset split and transforms it into a projected split. Returns a split object that has already been parsed.

Return type

simpleml.datasets.dataset_splits.Split
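A minimal sketch of how the two properties might fit together: the raw split is fetched from the dataset on every access and the projection is applied on top, so later changes to the dataset are always reflected. ToyDataset, its get_split method, and LazyProjectedSplit are hypothetical names assumed for this example, and the projection shown is a simple index subset.

```python
class ToyDataset:
    """Hypothetical dataset exposing named splits."""

    def __init__(self):
        self._splits = {"TRAIN": {"X": [1, 2, 3], "y": [0, 1, 0]}}

    def get_split(self, split):
        return self._splits[split]


class LazyProjectedSplit:
    """Holds a dataset reference and re-reads the split on each access."""

    def __init__(self, dataset, split, indices):
        self.dataset = dataset
        self.split = split
        self.indices = indices

    @property
    def dataset_split(self):
        # Raw split, fetched from the dataset every time
        return self.dataset.get_split(self.split)

    @property
    def projected_split(self):
        # Apply the index subset to the freshly fetched raw split
        raw = self.dataset_split
        return {name: [rows[i] for i in self.indices] for name, rows in raw.items()}


dataset = ToyDataset()
projected = LazyProjectedSplit(dataset, "TRAIN", [0, 2])
first = projected.projected_split
```

Because nothing is cached, a second access after the dataset changes returns the updated rows.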