simpleml.metrics.base_metric

Module Contents

Classes

AbstractMetric

Abstract Base class for all Metric objects

Metric

Base class for all Metric objects

Attributes

LOGGER

__author__

simpleml.metrics.base_metric.LOGGER[source]
simpleml.metrics.base_metric.__author__ = Elisha Yadgaran[source]
class simpleml.metrics.base_metric.AbstractMetric(name=None, has_external_files=False, author=None, project=None, version_description=None, save_patterns=None, **kwargs)[source]

Bases: with_metaclass(MetricRegistry, Persistable)

Abstract Base class for all Metric objects

name: the metric name

values: JSON object with key: value pairs describing performance on the test dataset (ex: FPR: TPR pairs to trace an ROC curve). Singular value metrics take the form {‘agg’: value}
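As an illustration of those two shapes, the dictionaries below are hypothetical examples rather than values produced by the library:

# Singular value metric: a single aggregate number keyed by 'agg'
values = {'agg': 0.87}

# Multi-value metric: e.g. FPR -> TPR pairs that trace an ROC curve
values = {0.0: 0.0, 0.1: 0.46, 0.25: 0.71, 0.5: 0.90, 1.0: 1.0}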

Parameters
  • name (Optional[str]) –

  • has_external_files (bool) –

  • author (Optional[str]) –

  • project (Optional[str]) –

  • version_description (Optional[str]) –

  • save_patterns (Optional[Dict[str, List[str]]]) –

__abstract__ = True[source]
object_type: str = METRIC[source]
values[source]
_get_dataset_split(self, **kwargs)[source]

Default accessor for dataset data. Refers to the raw dataset splits, not the splits superimposed by a pipeline. That means datasets that do not define explicit splits have no notion of downstream splits (e.g. those created by a RandomSplitPipeline).

Return type

Any

_get_latest_version(self)[source]

Versions auto-increment for each object (constrained over the friendly name and model). Executes a database lookup for the latest version and increments it.

Return type

int
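A minimal sketch of the auto-increment pattern described above, assuming a SQLAlchemy session and name/model_id/version columns; this is illustrative only and not SimpleML's actual implementation:

from typing import Optional

from sqlalchemy import func
from sqlalchemy.orm import Session


def next_metric_version(session: Session, metric_cls, name: str, model_id) -> int:
    # Look up the highest existing version for this (name, model) pair
    # and increment it; start at 1 if no prior row exists
    last: Optional[int] = (
        session.query(func.max(metric_cls.version))
        .filter(metric_cls.name == name, metric_cls.model_id == model_id)
        .scalar()
    )
    return (last or 0) + 1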

_get_pipeline_split(self, column, split, **kwargs)[source]

For the special case where the dataset is the same as the model’s dataset, the dataset splits can refer to the pipeline-imposed splits rather than the dataset’s inherent splits; in that case the pipeline split is used (ex: a RandomSplitPipeline over a NoSplitDataset when evaluating “in_sample” performance). A usage sketch contrasting the two accessors follows below.

Parameters
  • column (str) –

  • split (str) –

Return type

Any
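A hedged illustration of when each accessor applies, written as a fragment of a hypothetical score() implementation; the column and split keyword values ('y', 'TEST') are assumptions rather than guaranteed API values, and passing them to _get_dataset_split in particular is inferred from the signature above:

def score(self, **kwargs):
    # Raw dataset split: only meaningful when the dataset itself defines splits
    y_true_raw = self._get_dataset_split(column='y', split='TEST')

    # Pipeline-imposed split: used when the metric's dataset is the model's own
    # dataset and the pipeline (e.g. RandomSplitPipeline) defines the partition
    y_true_pipeline = self._get_pipeline_split(column='y', split='TEST')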

_hash(self)[source]
The hash is the combination of:
  1. Model

  2. Dataset (optional)

  3. Metric

  4. Config

Return type

str
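A minimal, self-contained sketch of the kind of composite hash the list above describes; the helper name, hashing algorithm, and component serialization are assumptions, not SimpleML's implementation:

import hashlib
import json


def composite_metric_hash(model_hash: str, dataset_hash, metric_name: str, config: dict) -> str:
    # Combine model, (optional) dataset, metric identity, and config into one hash
    payload = json.dumps(
        {
            'model': model_hash,
            'dataset': dataset_hash,  # may be None when no dataset is attached
            'metric': metric_name,
            'config': config,
        },
        sort_keys=True,
        default=str,
    )
    return hashlib.md5(payload.encode('utf-8')).hexdigest()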

add_dataset(self, dataset)[source]

Setter method for dataset used

Parameters

dataset (simpleml.datasets.base_dataset.Dataset) –

Return type

None

add_model(self, model)[source]

Setter method for model used

Parameters

model (simpleml.models.base_model.Model) –

Return type

None

load(self, **kwargs)[source]

Extend main load routine to load relationship class

Return type

None

save(self, **kwargs)[source]

Extend parent function with a few additional save routines

Return type

None

abstract score(self, **kwargs)[source]

Abstract method for each metric to define

Should set self.values
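For example, a concrete subclass fulfilling this contract might look like the hedged sketch below; the accessor keyword values and the model's predict interface are assumptions:

from simpleml.metrics.base_metric import Metric
from sklearn.metrics import accuracy_score


class AccuracyMetric(Metric):
    """Hypothetical metric producing a single aggregate accuracy value."""

    def score(self, **kwargs):
        # Assumed accessor usage; actual column/split names may differ
        X_test = self._get_pipeline_split(column='X', split='TEST')
        y_true = self._get_pipeline_split(column='y', split='TEST')
        y_pred = self.model.predict(X_test)
        # Singular value metrics take the form {'agg': value}
        self.values = {'agg': accuracy_score(y_true, y_pred)}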

class simpleml.metrics.base_metric.Metric(name=None, has_external_files=False, author=None, project=None, version_description=None, save_patterns=None, **kwargs)[source]

Bases: AbstractMetric

Base class for all Metric objects

model_id: foreign key to the model that was used to generate predictions

TODO: Should the join criteria be a composite of model and dataset, to support duplicate metric objects computed over different test datasets?

Parameters
  • name (Optional[str]) –

  • has_external_files (bool) –

  • author (Optional[str]) –

  • project (Optional[str]) –

  • version_description (Optional[str]) –

  • save_patterns (Optional[Dict[str, List[str]]]) –

__table_args__[source]
__tablename__ = metrics[source]
dataset[source]
dataset_id[source]
model[source]
model_id[source]
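Finally, a hedged end-to-end sketch of how a metric is typically wired up with the setters and routines documented above; AccuracyMetric, model, and dataset are hypothetical, already-constructed persistables:

metric = AccuracyMetric(name='test_accuracy', project='demo')
metric.add_model(model)      # setter for the model used to generate predictions
metric.add_dataset(dataset)  # setter for the evaluation dataset
metric.score()               # concrete subclasses populate metric.values here
metric.save()                # extends the parent save with the additional routines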