simpleml.save_patterns.serializers.dask
Module for Dask save patterns
Module Contents
Classes
DaskPersistenceMethods | Base class for internal dask serialization/deserialization options
Attributes
- class simpleml.save_patterns.serializers.dask.DaskCSVSerializer
- class simpleml.save_patterns.serializers.dask.DaskHDFSerializer
- class simpleml.save_patterns.serializers.dask.DaskJSONSerializer
- class simpleml.save_patterns.serializers.dask.DaskORCSerializer
- class simpleml.save_patterns.serializers.dask.DaskParquetSerializer
- class simpleml.save_patterns.serializers.dask.DaskPersistenceMethods
Bases: object
Base class for internal dask serialization/deserialization options
Wraps dd.DataFrame methods with sensible defaults. Uses dask bag alternatives for optimizations (notably for read parallelization and memory handling).
- static read_hdf(filepath, **kwargs)
- Parameters
filepath (str) –
- Return type
simpleml.imports.ddDataFrame
- classmethod read_json(cls, filepaths, persist=False, **kwargs)
Uses the dask bag implementation to optimize the read.
- Parameters
filepaths (List[str]) –
persist (bool) – flag to return a processing future instead of a lazy object that is computed later
- Return type
simpleml.imports.ddDataFrame
- static read_orc(filepath, **kwargs)
- Parameters
filepath (str) –
- Return type
simpleml.imports.ddDataFrame
- static read_parquet(filepath, **kwargs)
- Parameters
filepath (str) –
- Return type
simpleml.imports.ddDataFrame
- static read_text(*args, **kwargs)
Dask Bag wrapper to read text and return a bag
- Return type
simpleml.imports.dbBag
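As a rough illustration of what working with the returned bag looks like, the sketch below calls dask.bag.read_text directly rather than the SimpleML wrapper. The helper name count_nonblank_lines is hypothetical and is not part of this module.

```python
def count_nonblank_lines(filepaths):
    """Hypothetical sketch: count non-blank lines across text files with a dask bag."""
    # Imported lazily so this sketch can be loaded where dask is not installed.
    import dask.bag as db

    # read_text is lazy: it builds a bag whose elements are the lines of the files.
    bag = db.read_text(filepaths)
    # Strip whitespace, drop empty lines, then count; compute() runs the graph.
    return bag.map(str.strip).filter(bool).count().compute()
```

Nothing is read from disk until compute() is called, which is what lets the bag parallelize the read across files.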