simpleml.persistables.base_persistable module

class simpleml.persistables.base_persistable.Persistable(name=None, has_external_files=False, author=None, project=None, version_description=None, save_method='disk_pickled', **kwargs)

Bases: simpleml.persistables.base_sqlalchemy.BaseSQLAlchemy, simpleml.persistables.saving.AllSaveMixin, simpleml.persistables.hashing.CustomHasherMixin

Base class for all SimpleML database objects. Defaults to PostgreSQL but can be swapped out for any supported SQLAlchemy backend.

Uses sqlalchemy-mixins to enable active-record operations (TableModel.save(), create(), where(), destroy()).

id: Random UUID (version 4). Used instead of an auto-incrementing id to minimize collision probability with distributed trainings and authors (especially if using a central server to combine results across different instantiations of SimpleML).

hash_id: Hash of the object, used to uniquely identify its contents at train time.

registered_name: Class name of the object, defined when importing. Can be used for the drag-and-drop GUI and for prescribing training config.

author: Creator.

project: Project that objects are associated with. Useful if multiple persistables relate to the same project and should be grouped (but have different names). Also good for implementing row-based security across teams.

name: Friendly name. The primary way of tracking the evolution of the “same” object over time.

version: Auto-incrementing id of the “friendly name”.

version_description: Description that explains what is new or different about this version.

# Persistence of fitted states

has_external_files = Boolean field to signify the presence of saved files not in the (main) db.

filepaths = JSON object with external file details. Structure:

    {
        "disk": [
            path to file, relative to base simpleml folder (default ~/.simpleml),
            …
        ],
        "database": [
        ],
        "pickled": [
            guid, (for files in binary blobs)
            …
        ]
    }

metadata: Generic JSON store for random attributes
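
As a rough illustration of the id and filepaths fields described above, the sketch below builds a version-4 UUID and a filepaths object in the documented shape, then round-trips it through JSON. The file path and the pickled identifier are hypothetical placeholder values, not names used by SimpleML, and the real library may populate these fields differently.

```python
import json
import uuid

# Hypothetical values for illustration only; the file name and the
# "pickled" identifier below are not taken from SimpleML.
record_id = uuid.uuid4()  # random version-4 UUID, as described for `id`

filepaths = {
    "disk": ["filestore/example_model.pkl"],  # relative to the base SimpleML folder
    "database": [],                           # left empty, as in the documented structure
    "pickled": [str(uuid.uuid4())],           # guid for a file stored as a binary blob
}

# The structure has to be JSON-serializable to live in a JSON(B) column.
serialized = json.dumps(filepaths, sort_keys=True)
restored = json.loads(serialized)

print(record_id.version)  # 4
print(sorted(restored))   # ['database', 'disk', 'pickled']
```

Because ids are generated independently on each machine, no coordination with a central counter is needed, which is the motivation given for preferring UUIDs over auto-incrementing ids.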

author = Column(None, String(), table=None, nullable=False, default=ColumnDefault('default'))
config
filepaths = Column(None, JSONB(astext_type=Text()), table=None, default=ColumnDefault({}))
has_external_files = Column(None, Boolean(), table=None, default=ColumnDefault(False))
hash_ = Column('hash', String(), table=None, nullable=False)
id = Column(None, GUID(), table=None, primary_key=True, nullable=False, default=ColumnDefault(<function uuid4>))
load(load_externals=True)

Counter operation for save. Needs to load any file and db objects.

Class definition is stored by the registered_name param, and pickled objects are stored in the external_filename param.

Parameters: load_externals – Boolean flag for whether to load the external files. Skipping the external files is useful for relationships that only need class definitions and not data.
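
The effect of the load_externals flag can be sketched with a stand-in class. SketchPersistable and its attributes are hypothetical and exist only for this example; it does not touch a database or the file system, and the actual SimpleML logic (including whether load returns the object) may differ.

```python
class SketchPersistable:
    """Hypothetical stand-in; NOT the SimpleML implementation."""

    def __init__(self, has_external_files):
        self.has_external_files = has_external_files
        self.class_loaded = False
        self.externals_loaded = False

    def load(self, load_externals=True):
        # The class definition is always resolved (SimpleML keys this
        # on registered_name; here it is just a flag).
        self.class_loaded = True
        # External files are read only when requested and present.
        if load_externals and self.has_external_files:
            self.externals_loaded = True
        return self  # returning self is an assumption made for chaining


full = SketchPersistable(has_external_files=True).load()
light = SketchPersistable(has_external_files=True).load(load_externals=False)

print(full.externals_loaded)   # True
print(light.externals_loaded)  # False
```

The second call mirrors the case mentioned above: a related object whose class definition is needed but whose data is not.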

metadata = MetaData(bind=None)
metadata_ = Column('metadata', JSONB(astext_type=Text()), table=None, default=ColumnDefault({}))
name = Column(None, String(), table=None, nullable=False, default=ColumnDefault('default'))
project = Column(None, String(), table=None, nullable=False, default=ColumnDefault('default'))
registered_name = Column(None, String(), table=None, nullable=False)
save()

Each subclass needs to implement a save routine to persist to the database and any other required filestore.

sqlalchemy_mixins supports active-record style TableModel.save(), so subclasses can still call super(Persistable, self).save().
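
A minimal sketch of the override pattern described here, assuming Python 3 (so the zero-argument super() replaces the super(Persistable, self) form). ActiveRecordStandIn, SketchModel, and _save_external_files are hypothetical names introduced for this example; the stand-in only records that a save happened and does not use sqlalchemy_mixins or a real database.

```python
class ActiveRecordStandIn:
    """Hypothetical stand-in for the active-record base class."""

    def save(self):
        # A real active-record save would write the row to the database.
        self.saved_to_db = True
        return self


class SketchModel(ActiveRecordStandIn):
    def __init__(self):
        self.has_external_files = False
        self.filepaths = {}
        self.saved_to_db = False

    def _save_external_files(self):
        # Hypothetical helper: pretend a fitted state was written to disk
        # and record where it went.
        self.filepaths = {"disk": ["example_model.pkl"]}
        self.has_external_files = True

    def save(self):
        # Persist external artifacts first, then delegate the database
        # write to the active-record save.
        self._save_external_files()
        return super().save()


model = SketchModel().save()
print(model.saved_to_db)         # True
print(model.has_external_files)  # True
```

Writing the external files before the database row is a design choice of this sketch, so that the row never points at files that do not exist yet. The source does not state which order SimpleML uses.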

state
version = Column(None, Integer(), table=None, nullable=False)
version_description = Column(None, String(), table=None, default=ColumnDefault(''))