simpleml.models.split_iterators
Helper classes to iterate splits
Module Contents
Classes
DatasetSequence | Sequence wrapper for internal datasets.
PipelineTransformIterator | Wrapper utility to convert a pipeline transform operation into an iterator.
PipelineTransformSequence | Nested sequence class to apply transforms on batches in real time and forward them as the next batch.
PythonIterator | Pure Python iterator. Converts a split object into a generator with defined batch sizes.
Functions
split_to_ordered_tuple(split) | Helper to convert a split object into an ordered tuple of X, y, other.
- class simpleml.models.split_iterators.DatasetSequence(split, batch_size=32, shuffle=True, return_tuple=True, **kwargs)
Bases: simpleml.imports.Sequence
Sequence wrapper for internal datasets. Only used for raw data mapping, so the return type is the internal Split object. Transformed sequences are used to conform with external input types (Keras tuples).
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
batch_size (int) –
shuffle (bool) –
return_tuple (bool) –
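The batch indexing that a sequence of this kind performs can be illustrated with a minimal, self-contained sketch. `ToyBatchSequence` and its `seed` argument are hypothetical names invented for this example; it does not use simpleml or Keras, and the rounding of the final partial batch is an assumption rather than documented behavior.

```python
import math
import random


class ToyBatchSequence:
    """Hypothetical stand-in for an index-addressable batch sequence."""

    def __init__(self, rows, batch_size=32, shuffle=True, seed=None):
        self.rows = list(rows)
        self.batch_size = batch_size
        self.indices = list(range(len(self.rows)))
        if shuffle:
            # Shuffle the row order once up front (assumption for this sketch)
            random.Random(seed).shuffle(self.indices)

    def __len__(self):
        # Number of batches, counting a final partial batch
        return math.ceil(len(self.rows) / self.batch_size)

    def __getitem__(self, index):
        # Return the batch at position `index`
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        batch_indices = self.indices[start:start + self.batch_size]
        return [self.rows[i] for i in batch_indices]


sequence = ToyBatchSequence(range(10), batch_size=4, shuffle=False)
batches = [sequence[i] for i in range(len(sequence))]
```

Because batches are addressed by position, a consumer can request them in any order, which is the main difference from a plain generator.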
- __getitem__(self, index)
Gets the batch at position index.
- Parameters
index – position of the batch in the Sequence.
- Returns
A batch.
- Return type
- class simpleml.models.split_iterators.PipelineTransformIterator(pipeline, data_iterator)
Bases: DataIterator
Wrapper utility to convert a pipeline transform operation into an iterator. Transforms each batch on iteration with the provided pipeline.
- Parameters
pipeline (simpleml.pipelines.Pipeline) –
data_iterator (DataIterator) –
- __next__(self)
NOTE: Some downstream objects expect to consume a generator that yields a tuple of X, y, other… rather than a Split object, so an ordered tuple is returned if the dataset iterator returns a tuple.
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
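A rough sketch of the wrapping pattern, written without simpleml: `ToyTransformIterator` is a hypothetical class, and the `transform` callable stands in for a pipeline. Applying the transform only to the first tuple element (the features) is an assumption made for this example, not something stated above.

```python
class ToyTransformIterator:
    """Hypothetical iterator that transforms each batch as it is consumed."""

    def __init__(self, transform, data_iterator):
        self.transform = transform
        self.data_iterator = iter(data_iterator)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the wrapped iterator propagates unchanged
        batch = next(self.data_iterator)
        if isinstance(batch, tuple):
            # Tuple in, tuple out: transform the features, pass the rest through
            X, *rest = batch
            return (self.transform(X), *rest)
        return self.transform(batch)


raw_batches = [([1, 2], [0, 1]), ([3], [1])]
doubled = list(ToyTransformIterator(lambda X: [x * 2 for x in X], raw_batches))
```

The wrapper holds no data of its own; each batch is transformed only when the consumer asks for it.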
- class simpleml.models.split_iterators.PipelineTransformSequence(pipeline, dataset_sequence)
Bases: simpleml.imports.Sequence
Nested sequence class to apply transforms on batches in real time and forward them as the next batch.
- Parameters
pipeline (simpleml.pipelines.Pipeline) –
dataset_sequence (DatasetSequence) –
- __getitem__(self, *args, **kwargs)
Passes through to the dataset sequence: applies the transform to the data and returns the transformed batch.
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
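The pass-through idea can be sketched with plain Python objects. `ToyTransformSequence` is a hypothetical name, a list of lists stands in for the inner dataset sequence, and a lambda stands in for the pipeline; none of this is the library's own code.

```python
class ToyTransformSequence:
    """Hypothetical nested sequence that transforms batches on access."""

    def __init__(self, transform, inner_sequence):
        self.transform = transform
        self.inner_sequence = inner_sequence

    def __len__(self):
        # Same number of batches as the wrapped sequence
        return len(self.inner_sequence)

    def __getitem__(self, *args, **kwargs):
        # Fetch the raw batch from the inner sequence, then transform it
        batch = self.inner_sequence.__getitem__(*args, **kwargs)
        return self.transform(batch)


inner = [[1, 2], [3, 4], [5]]
scaled = ToyTransformSequence(lambda batch: [x * 10 for x in batch], inner)
second_batch = scaled[1]
```

Nothing is precomputed: a batch is transformed only when it is indexed, so batches can still be requested in any order.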
- class simpleml.models.split_iterators.PythonIterator(split, infinite_loop=False, batch_size=32, shuffle=True, return_tuple=False, **kwargs)
Bases: DataIterator
Pure Python iterator. Converts a split object into a generator with defined batch sizes.
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
infinite_loop (bool) –
batch_size (int) –
shuffle (bool) –
return_tuple (bool) –
- __next__(self)
Turn a dataset split into a generator
- Return type
Union[simpleml.datasets.dataset_splits.Split, Tuple]
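As an illustration of what batch_size, shuffle, and infinite_loop might control, here is a minimal generator sketch. The function `iterate_batches` and its `seed` argument are hypothetical, and reshuffling at the start of every pass is an assumption of this example, not documented behavior.

```python
import itertools
import random


def iterate_batches(rows, batch_size=32, shuffle=True, infinite_loop=False, seed=None):
    """Hypothetical generator yielding fixed-size batches from `rows`."""
    rows = list(rows)
    if not rows:
        return  # nothing to yield; also avoids spinning forever when looping
    rng = random.Random(seed)
    while True:
        order = list(range(len(rows)))
        if shuffle:
            rng.shuffle(order)  # new order on each pass (assumption)
        for start in range(0, len(order), batch_size):
            yield [rows[i] for i in order[start:start + batch_size]]
        if not infinite_loop:
            return


finite = list(iterate_batches(range(5), batch_size=2, shuffle=False))
looped = list(itertools.islice(
    iterate_batches(range(5), batch_size=2, shuffle=False, infinite_loop=True), 4))
```

With infinite_loop enabled the generator never raises StopIteration, so the consumer has to decide how many batches make up an epoch.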
- simpleml.models.split_iterators.split_to_ordered_tuple(split)
Helper to convert a split object into an ordered tuple of X, y, other
- Parameters
split (simpleml.datasets.dataset_splits.Split) –
- Return type
Tuple
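A minimal sketch of the conversion, assuming a split exposes X, y, and other as attributes. `ToySplit` and `toy_split_to_ordered_tuple` are hypothetical stand-ins, and how the real helper treats missing parts is not covered here; this version simply keeps them as None.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class ToySplit:
    """Hypothetical container with the three parts named in the description."""
    X: Any = None
    y: Any = None
    other: Any = None


def toy_split_to_ordered_tuple(split: ToySplit) -> Tuple:
    # Fixed positional order: features, labels, anything else
    # Assumption: missing parts stay in place as None
    return (split.X, split.y, split.other)


ordered = toy_split_to_ordered_tuple(ToySplit(X=[[1.0], [2.0]], y=[0, 1]))
```

A fixed order lets downstream consumers unpack the result positionally instead of looking parts up by name.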