simpleml.datasets.dask.file_based

File-based datasets

Module Contents

Classes

DaskFileBasedDataset

Dask dataset class that generates the dataframe by reading in a file

Attributes

DASK_READER_MAP

__author__

simpleml.datasets.dask.file_based.DASK_READER_MAP
simpleml.datasets.dask.file_based.__author__ = Elisha Yadgaran
class simpleml.datasets.dask.file_based.DaskFileBasedDataset(filepath, format, reader_params=None, **kwargs)

Bases: simpleml.datasets.dask.base.BaseDaskDataset

Dask dataset class that generates the dataframe by reading in a file

Parameters
  • squeeze_return – Boolean flag controlling whether dataframe.squeeze() is run on the value returned by self.get() calls. This is mainly needed to align return types with what other libraries expect (e.g. sklearn expects a one-dimensional y for a single label).

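  • filepath (str) – Path of the file to read.

  • format (str) – Format of the file; presumably used to select the matching reader from DASK_READER_MAP.

  • reader_params (Optional[Dict]) – Optional parameters for the reader; defaults to None.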
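
The combination of filepath, format, and reader_params suggests a format-keyed reader lookup. The following is a minimal sketch of that dispatch pattern, not the module's actual implementation. READER_MAP, read_file, read_csv_rows, and read_json_rows are hypothetical names, and the stand-in readers use only the standard library so the sketch runs without Dask installed. In the real module, DASK_READER_MAP presumably points at dask.dataframe readers such as dd.read_csv instead.

```python
import csv
import json
from typing import Any, Callable, Dict, List, Optional


# Hypothetical stand-in readers (standard library only). With Dask, the
# equivalent entries would be functions such as dask.dataframe.read_csv.
def read_csv_rows(filepath: str, **params: Any) -> List[Dict[str, str]]:
    """Read a delimited text file into a list of row dictionaries."""
    with open(filepath, newline="") as handle:
        return list(csv.DictReader(handle, **params))


def read_json_rows(filepath: str, **params: Any) -> Any:
    """Read a JSON document from disk."""
    with open(filepath) as handle:
        return json.load(handle, **params)


# Assumed shape of a reader map: format name -> reader callable.
READER_MAP: Dict[str, Callable[..., Any]] = {
    "csv": read_csv_rows,
    "json": read_json_rows,
}


def read_file(filepath: str, format: str,
              reader_params: Optional[Dict[str, Any]] = None) -> Any:
    """Look up the reader for `format` and call it on `filepath`."""
    try:
        reader = READER_MAP[format]
    except KeyError:
        raise ValueError(
            f"Unsupported format {format!r}; expected one of {sorted(READER_MAP)}"
        ) from None
    # Treat a missing reader_params as "no extra keyword arguments".
    return reader(filepath, **(reader_params or {}))
```

Under these assumptions, a call such as read_file("data.csv", "csv", {"delimiter": ";"}) forwards the delimiter to the CSV reader, which is the role reader_params appears to play for the class above.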

build_dataframe(self)[source]

Must set self._external_file. Can't be declared as an abstractmethod because of the database lookup dependency.

Return type

None
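
The note on build_dataframe can be illustrated with a small sketch. The source only says that a database lookup dependency prevents using abstractmethod; one plausible reading, assumed here, is that the base class must stay instantiable, so the contract is enforced at call time with NotImplementedError instead. BaseDataset and InMemoryDataset are hypothetical classes, not part of simpleml.

```python
from typing import Any, List, Optional


class BaseDataset:
    """Hypothetical base class that stays instantiable (no ABC machinery)."""

    def __init__(self) -> None:
        self._external_file: Optional[Any] = None

    def build_dataframe(self) -> None:
        # Enforced when called rather than when the class is instantiated,
        # unlike abc.abstractmethod.
        raise NotImplementedError("Subclasses must set self._external_file")


class InMemoryDataset(BaseDataset):
    """Hypothetical subclass that fulfils the contract."""

    def __init__(self, rows: List[Any]) -> None:
        super().__init__()
        self.rows = rows

    def build_dataframe(self) -> None:
        # The documented requirement: populate self._external_file.
        self._external_file = list(self.rows)
```

With this layout, BaseDataset() can still be constructed, and only a call to build_dataframe on a class that has not overridden it fails.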