simpleml.models.split_iterators
Helper classes to iterate splits
Module Contents
Classes
- DatasetSequence: Sequence wrapper for internal datasets. Only used for raw data mapping so return type is internal Split object.
- PipelineTransformIterator: Wrapper utility to convert a pipeline transform operation into an iterator.
- PipelineTransformSequence: Nested sequence class to apply transforms on batches in real-time and forward through as the next batch.
- PythonIterator: Pure python iterator. Converts a split object into a generator with defined batch sizes.
Functions
- split_to_ordered_tuple: Helper to convert a split object into an ordered tuple of X, y, other.
Attributes
- class simpleml.models.split_iterators.DatasetSequence(split, batch_size=32, shuffle=True, return_tuple=True, **kwargs)[source]
Bases:
simpleml.imports.Sequence
Sequence wrapper for internal datasets. Only used for raw data mapping, so the return type is the internal Split object. Transformed sequences are used to conform with external input types (Keras tuples).
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
batch_size (int) –
shuffle (bool) –
return_tuple (bool) –
- __getitem__(self, index)[source]
Gets the batch at position index.
- Parameters
index – position of the batch in the Sequence.
- Returns
A batch
- Return type
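The index-based batching described above can be illustrated with a minimal sketch. It does not import simpleml or Keras; `ToyBatchSequence`, its `rows` argument, and the `seed` parameter are hypothetical names chosen for illustration, and the real class may slice and shuffle differently.

```python
import math
import random


class ToyBatchSequence:
    """Hypothetical stand-in for a Sequence-style wrapper (not the simpleml class)."""

    def __init__(self, rows, batch_size=32, shuffle=True, seed=0):
        self.rows = list(rows)
        self.batch_size = batch_size
        self.indices = list(range(len(self.rows)))
        if shuffle:
            # Assumption: shuffle once up front with a fixed seed for repeatability
            random.Random(seed).shuffle(self.indices)

    def __len__(self):
        # Number of batches; the final batch may hold fewer rows
        return math.ceil(len(self.rows) / self.batch_size)

    def __getitem__(self, index):
        # Return the batch at position `index`
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        batch_indices = self.indices[start:start + self.batch_size]
        return [self.rows[i] for i in batch_indices]


sequence = ToyBatchSequence(range(10), batch_size=4, shuffle=False)
first_batch = sequence[0]
last_batch = sequence[len(sequence) - 1]
```

Random access by batch index is what lets a consumer request batches in any order, which an ordinary generator cannot offer.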
- class simpleml.models.split_iterators.PipelineTransformIterator(pipeline, data_iterator)[source]
Bases:
DataIterator
Wrapper utility to convert a pipeline transform operation into an iterator. Transforms each batch on iteration with the provided pipeline.
- Parameters
pipeline (simpleml.pipelines.Pipeline) –
data_iterator (DataIterator) –
- __next__(self)[source]
NOTE: Some downstream objects expect to consume a generator that yields a tuple of X, y, other… rather than a Split object, so an ordered tuple is returned if the dataset iterator returns a tuple.
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
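As a rough illustration of the wrapping pattern, the sketch below applies a transform to each batch as it is pulled from an underlying iterator and mirrors the input type (tuple in, tuple out). `ToyTransformIterator` and `double` are hypothetical, a plain callable stands in for the pipeline, and applying the transform only to the first tuple element is an assumption made for the example.

```python
class ToyTransformIterator:
    """Hypothetical wrapper: transforms each batch lazily on iteration."""

    def __init__(self, transform, data_iterator):
        self.transform = transform  # plain callable standing in for a pipeline
        self.data_iterator = iter(data_iterator)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the wrapped iterator propagates unchanged
        batch = next(self.data_iterator)
        if isinstance(batch, tuple):
            # Assumption: only the features (first element) are transformed
            features, *rest = batch
            return (self.transform(features), *rest)
        return self.transform(batch)


def double(values):
    return [value * 2 for value in values]


batches = [([1, 2], [0, 1]), ([3], [1])]
transformed = list(ToyTransformIterator(double, batches))
```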
- class simpleml.models.split_iterators.PipelineTransformSequence(pipeline, dataset_sequence)[source]
Bases:
simpleml.imports.Sequence
Nested sequence class that applies transforms to batches in real time and forwards each transformed batch as the next batch.
- Parameters
pipeline (simpleml.pipelines.Pipeline) –
dataset_sequence (DatasetSequence) –
- __getitem__(self, *args, **kwargs)[source]
Pass-through to the dataset sequence: applies the transform to the data and returns the transformed batch.
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
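The nesting idea can be sketched as one indexable object delegating to another and transforming whatever comes back. `ToyTransformSequence` is a hypothetical name, a callable stands in for the pipeline, and a list of lists stands in for the dataset sequence.

```python
class ToyTransformSequence:
    """Hypothetical nested sequence: delegates indexing, then transforms the batch."""

    def __init__(self, transform, inner_sequence):
        self.transform = transform  # plain callable standing in for a pipeline
        self.inner_sequence = inner_sequence

    def __len__(self):
        # Same number of batches as the wrapped sequence
        return len(self.inner_sequence)

    def __getitem__(self, *args, **kwargs):
        # Fetch the raw batch from the wrapped sequence, then transform it
        raw_batch = self.inner_sequence.__getitem__(*args, **kwargs)
        return self.transform(raw_batch)


raw_batches = [[1, 2], [3, 4], [5]]
shifted = ToyTransformSequence(lambda batch: [v + 1 for v in batch], raw_batches)
second_batch = shifted[1]
```

Because the transform runs inside `__getitem__`, only the requested batch is transformed, so the full dataset never has to be transformed up front.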
- class simpleml.models.split_iterators.PythonIterator(split, infinite_loop=False, batch_size=32, shuffle=True, return_tuple=False, **kwargs)[source]
Bases:
DataIterator
Pure Python iterator. Converts a split object into a generator that yields batches of a defined size.
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
infinite_loop (bool) –
batch_size (int) –
shuffle (bool) –
return_tuple (bool) –
- __next__(self)[source]
Turn a dataset split into a generator
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
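A minimal sketch of batch iteration with an optional infinite loop follows. `ToyBatchIterator`, its `rows` argument, and `seed` are hypothetical; restarting (and reshuffling) at the end of each pass when `infinite_loop` is set is an assumption about how such a flag could behave, not a statement of the library's behavior.

```python
import random


class ToyBatchIterator:
    """Hypothetical pure-Python batch iterator (not the simpleml class)."""

    def __init__(self, rows, infinite_loop=False, batch_size=32, shuffle=True, seed=0):
        self.rows = list(rows)
        self.infinite_loop = infinite_loop
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        self._start_pass()

    def _start_pass(self):
        # Build the visiting order for one pass over the data
        self.order = list(range(len(self.rows)))
        if self.shuffle:
            self.rng.shuffle(self.order)
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.order):
            # Stop after one pass unless looping forever (empty data always stops)
            if not self.infinite_loop or not self.rows:
                raise StopIteration
            self._start_pass()
        chunk = self.order[self.position:self.position + self.batch_size]
        self.position += self.batch_size
        return [self.rows[i] for i in chunk]


finite_batches = list(ToyBatchIterator(range(5), batch_size=2, shuffle=False))
looping = ToyBatchIterator(range(3), infinite_loop=True, batch_size=2, shuffle=False)
first_four = [next(looping) for _ in range(4)]
```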
- simpleml.models.split_iterators.split_to_ordered_tuple(split)[source]
Helper to convert a split object into an ordered tuple of X, y, other
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
- Return type
Tuple
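The conversion can be pictured with a small stand-in. `ToySplit` and `toy_split_to_ordered_tuple` are hypothetical names, not the simpleml objects, and dropping sections that are `None` is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class ToySplit:
    """Hypothetical stand-in for a split with X, y, and other sections."""
    X: Any = None
    y: Any = None
    other: Any = None


def toy_split_to_ordered_tuple(split: ToySplit) -> Tuple:
    # Keep the fixed X, y, other order; assumption: unset sections are skipped
    sections = (split.X, split.y, split.other)
    return tuple(section for section in sections if section is not None)


features_and_labels = toy_split_to_ordered_tuple(ToySplit(X=[[1.0], [2.0]], y=[0, 1]))
features_only = toy_split_to_ordered_tuple(ToySplit(X=[[1.0]]))
```

A fixed ordering matters because downstream consumers that unpack positionally expect the features first and the labels second.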