`simpleml.metrics`

Importing this package imports its submodules, which registers their class names in the global registry.

It also exposes the metric classes through a single import module.

Package Contents

Classes

`AccuracyMetric`	TODO: Figure out multiclass generalizations
`BinaryClassificationMetric`	TODO: Figure out multiclass generalizations
`ClassificationMetric`	TODO: Figure out multiclass generalizations
`F1ScoreMetric`	TODO: Figure out multiclass generalizations
`FprMetric`	TODO: Figure out multiclass generalizations
`FprTprMetric`	TODO: Figure out multiclass generalizations
`Metric`	Base class for all Metric objects
`RocAucMetric`	TODO: Figure out multiclass generalizations
`ThresholdAccuracyMetric`	TODO: Figure out multiclass generalizations
`ThresholdF1ScoreMetric`	TODO: Figure out multiclass generalizations
`ThresholdFdrMetric`	TODO: Figure out multiclass generalizations
`ThresholdFnrMetric`	TODO: Figure out multiclass generalizations
`ThresholdForMetric`	TODO: Figure out multiclass generalizations
`ThresholdFprMetric`	TODO: Figure out multiclass generalizations
`ThresholdInformednessMetric`	TODO: Figure out multiclass generalizations
`ThresholdMarkednessMetric`	TODO: Figure out multiclass generalizations
`ThresholdMccMetric`	TODO: Figure out multiclass generalizations
`ThresholdNpvMetric`	TODO: Figure out multiclass generalizations
`ThresholdPpvMetric`	TODO: Figure out multiclass generalizations
`ThresholdTnrMetric`	TODO: Figure out multiclass generalizations
`ThresholdTprMetric`	TODO: Figure out multiclass generalizations
`TprMetric`	TODO: Figure out multiclass generalizations

simpleml.metrics.__author__ = Elisha Yadgaran

class simpleml.metrics.AccuracyMetric(**kwargs)

Bases: simpleml.metrics.classification.AggregateBinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _score(predictions, labels)

Each aggregate needs to define a separate private method to actually calculate the aggregate.

Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties).
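The body of `_score` is not shown on this page. As a rough illustration of what an accuracy aggregate over `predictions` and `labels` computes, here is a minimal sketch. The function name `accuracy_sketch` and its input validation are assumptions for illustration, not part of simpleml.

```python
from typing import Sequence


def accuracy_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions equal to their label (hypothetical helper)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    # Count positions where the predicted class matches the true class.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Three of the four predictions match their label.
print(accuracy_sketch([1, 0, 1, 1], [1, 0, 0, 1]))
```

Because the function takes plain sequences, it can be tested without a model or dataset, which is the motivation the docstring gives for keeping `_score` static.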

class simpleml.metrics.BinaryClassificationMetric(dataset_split=None, **kwargs)

Bases: simpleml.metrics.classification.ClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _create_confusion_matrix(thresholds, probabilities, labels): Independent computation method (easier testing)

property accuracy(self): Convenience property for the Accuracy Rate, (TP+TN)/(TP+FP+TN+FN)

property confusion_matrix(self): Property method to return (or generate) dataframe of confusion matrix at each threshold

create_confusion_matrix(self): Iterate through each threshold and compute confusion matrix
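The per-threshold computation can be sketched as follows. This is an illustration only: the helper names are hypothetical, the result is a list of dictionaries rather than the dataframe the library returns, and treating a probability equal to the threshold as a positive prediction is an assumption.

```python
from typing import Dict, List, Sequence


def confusion_counts(threshold: float,
                     probabilities: Sequence[float],
                     labels: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN and FN at a single probability threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for probability, label in zip(probabilities, labels):
        # Assumption: a probability equal to the threshold is predicted positive.
        predicted_positive = probability >= threshold
        if predicted_positive and label == 1:
            counts["tp"] += 1
        elif predicted_positive:
            counts["fp"] += 1
        elif label == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def confusion_matrix_sketch(thresholds: Sequence[float],
                            probabilities: Sequence[float],
                            labels: Sequence[int]) -> List[Dict[str, float]]:
    """One row of counts per threshold."""
    return [
        {"threshold": t, **confusion_counts(t, probabilities, labels)}
        for t in thresholds
    ]


rows = confusion_matrix_sketch([0.0, 0.5, 1.0], [0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
for row in rows:
    print(row)
```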

static dedupe_curve(keys, values, maximize=True, round_places=3)

Method to deduplicate multiple values for the same key on a curve (e.g. multiple thresholds with the same FPR and different TPR for ROC).

Parameters: maximize – Boolean, whether to choose the maximum value for each unique key or the minimum
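A minimal sketch of this kind of deduplication is shown below. The function `dedupe_curve_sketch` is hypothetical. Rounding the keys to `round_places` before grouping and returning a plain dictionary are both assumptions, since this page does not describe how `round_places` is applied or what the method returns.

```python
from typing import Dict, Sequence


def dedupe_curve_sketch(keys: Sequence[float],
                        values: Sequence[float],
                        maximize: bool = True,
                        round_places: int = 3) -> Dict[float, float]:
    """Keep a single value per (rounded) key on a curve."""
    pick = max if maximize else min
    curve: Dict[float, float] = {}
    for key, value in zip(keys, values):
        # Assumption: keys are rounded first so near-identical keys collapse.
        rounded_key = round(key, round_places)
        if rounded_key in curve:
            curve[rounded_key] = pick(curve[rounded_key], value)
        else:
            curve[rounded_key] = value
    return curve


fpr = [0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.6, 0.8, 1.0]
print(dedupe_curve_sketch(fpr, tpr, maximize=True))
print(dedupe_curve_sketch(fpr, tpr, maximize=False))
```

For a ROC curve, `maximize=True` keeps the best TPR reachable at each FPR, which is why the choice between maximum and minimum is exposed as a parameter.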

property f1(self): Convenience property for the F1 Score, 2*TP/(2*TP+FP+FN)

property false_discovery_rate(self): Convenience property for the False Discovery Rate, FP/(FP+TP)

property false_negative_rate(self): Convenience property for the False Negative Rate, FN/(TP+FN)

property false_omission_rate(self): Convenience property for the False Omission Rate, FN/(TN+FN)

property false_positive_rate(self): Convenience property for the False Positive Rate, FP/(FP+TN)

property informedness(self): Convenience property for the Informedness, TPR+TNR-1

property labels(self)

property markedness(self): Convenience property for the Markedness, PPV+NPV-1

property matthews_correlation_coefficient(self): Convenience property for the Matthews Correlation Coefficient, (TP*TN-FP*FN)/((FP+TP)*(TP+FN)*(TN+FP)*(TN+FN))^0.5

property negative_predictive_value(self): Convenience property for the Negative Predictive Value, TN/(TN+FN)

property positive_predictive_value(self): Convenience property for the Positive Predictive Value, TP/(FP+TP)

property predicted_negative_rate(self): Convenience property for the Predicted Negative Rate, (TN+FN)/(TP+FP+TN+FN)

property predicted_positive_rate(self): Convenience property for the Predicted Positive Rate, (TP+FP)/(TP+FP+TN+FN)

property predictions(self)

property probabilities(self)

property thresholds(self): Convenience property for the probability thresholds

property true_negative_rate(self): Convenience property for the True Negative Rate, TN/(FP+TN)

property true_positive_rate(self): Convenience property for the True Positive Rate, TP/(TP+FN)
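The formulas for these convenience properties can be checked against a single set of confusion matrix counts. The sketch below is not the library's code: `derived_rates` and `safe_divide` are hypothetical helpers, and returning NaN for a zero denominator is an assumption.

```python
import math


def safe_divide(numerator: float, denominator: float) -> float:
    """Return NaN instead of raising when the denominator is zero (assumption)."""
    return numerator / denominator if denominator else float("nan")


def derived_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the rates listed above from one set of counts."""
    total = tp + fp + tn + fn
    tpr = safe_divide(tp, tp + fn)
    tnr = safe_divide(tn, fp + tn)
    ppv = safe_divide(tp, fp + tp)
    npv = safe_divide(tn, tn + fn)
    mcc_denominator = math.sqrt((fp + tp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_divide(tp + tn, total),
        "f1": safe_divide(2 * tp, 2 * tp + fp + fn),
        "true_positive_rate": tpr,
        "true_negative_rate": tnr,
        "false_positive_rate": safe_divide(fp, fp + tn),
        "false_negative_rate": safe_divide(fn, tp + fn),
        "positive_predictive_value": ppv,
        "negative_predictive_value": npv,
        "false_discovery_rate": safe_divide(fp, fp + tp),
        "false_omission_rate": safe_divide(fn, tn + fn),
        "predicted_positive_rate": safe_divide(tp + fp, total),
        "predicted_negative_rate": safe_divide(tn + fn, total),
        "informedness": tpr + tnr - 1,
        "markedness": ppv + npv - 1,
        "matthews_correlation_coefficient": safe_divide(
            tp * tn - fp * fn, mcc_denominator
        ),
    }


rates = derived_rates(tp=40, fp=10, tn=45, fn=5)
print(round(rates["accuracy"], 3), round(rates["f1"], 3))
```

A useful sanity check on the formulas is that the squared Matthews Correlation Coefficient equals informedness times markedness.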

static validate_labels(labels)

class simpleml.metrics.ClassificationMetric(dataset_split=None, **kwargs)

Bases: simpleml.metrics.base_metric.Metric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

_get_split(self, column)

property labels(self)

property predictions(self)

property probabilities(self)

static validate_predictions(predictions)

class simpleml.metrics.F1ScoreMetric(**kwargs)

Bases: simpleml.metrics.classification.AggregateBinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _score(predictions, labels)

Each aggregate needs to define a separate private method to actually calculate the aggregate.

Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties).
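For reference, an F1 aggregate over hard class predictions can be sketched as below. The name `f1_sketch` is hypothetical, and returning 0.0 when there are no positive predictions or labels is an assumption. The library's own `_score` may be implemented differently.

```python
from typing import Sequence


def f1_sketch(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """F1 score, 2*TP/(2*TP+FP+FN), for binary predictions (hypothetical helper)."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    denominator = 2 * tp + fp + fn
    # Assumption: define F1 as 0.0 when nothing is positive in either input.
    return 2 * tp / denominator if denominator else 0.0


# Two true positives, one false positive, one false negative.
print(f1_sketch([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))
```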

class simpleml.metrics.FprMetric(**kwargs)

Bases: simpleml.metrics.classification.AggregateBinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _score(predictions, labels)

Each aggregate needs to define a separate private method to actually calculate the aggregate.

Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties).

class simpleml.metrics.FprTprMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values
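The contract described here, an abstract `score` that each metric overrides and that stores its result on `self.values`, can be sketched with a small stand-in hierarchy. `MetricSketch` and `CurveMetricSketch` are hypothetical classes, not simpleml classes, and the shape chosen for `values` (a mapping from FPR to TPR) is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Sequence


class MetricSketch(ABC):
    """Hypothetical stand-in for a metric base class."""

    def __init__(self) -> None:
        self.values: Optional[Dict[float, float]] = None

    @abstractmethod
    def score(self) -> None:
        """Compute the metric and store the result on self.values."""


class CurveMetricSketch(MetricSketch):
    """Hypothetical metric that stores a curve as a key-to-value mapping."""

    def __init__(self, fpr: Sequence[float], tpr: Sequence[float]) -> None:
        super().__init__()
        self._fpr = list(fpr)
        self._tpr = list(tpr)

    def score(self) -> None:
        # Assumption: the curve is stored as {fpr: tpr}.
        self.values = dict(zip(self._fpr, self._tpr))


metric = CurveMetricSketch([0.0, 0.5, 1.0], [0.0, 0.8, 1.0])
metric.score()
print(metric.values)
```

Note that `score` returns nothing in this sketch. Callers read the result from `values` after calling it.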

class simpleml.metrics.Metric(name=None, has_external_files=False, author=None, project=None, version_description=None, save_patterns=None, **kwargs)

Bases: simpleml.metrics.base_metric.AbstractMetric

Base class for all Metric objects

model_id: foreign key to the model that was used to generate predictions

TODO: Should join criteria be a composite of model and dataset, to allow multiple duplicate metric objects computed over different test datasets?

__table_args__

__tablename__ = 'metrics'

dataset

dataset_id

model

model_id

class simpleml.metrics.RocAucMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _score(probabilities, labels)

score(self)

Abstract method for each metric to define

Should set self.values
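This page does not say how the area under the ROC curve is computed. One common approach, shown here only as an illustration, is trapezoidal integration over (FPR, TPR) points. The function `roc_auc_trapezoid` is hypothetical and takes curve points rather than the `probabilities` and `labels` that `_score` accepts.

```python
from typing import Sequence


def roc_auc_trapezoid(fpr: Sequence[float], tpr: Sequence[float]) -> float:
    """Area under a ROC curve given as (FPR, TPR) points, by the trapezoid rule."""
    # Sort by FPR (then TPR) so the curve is traversed left to right.
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each segment contributes its width times its average height.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


print(roc_auc_trapezoid([0.0, 0.0, 0.5, 1.0], [0.0, 0.5, 1.0, 1.0]))
print(roc_auc_trapezoid([0.0, 1.0], [0.0, 1.0]))
```

The second call integrates the diagonal, which corresponds to a classifier with no discriminative power.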

class simpleml.metrics.ThresholdAccuracyMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdF1ScoreMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdFdrMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdFnrMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdForMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdFprMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdInformednessMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdMarkednessMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdMccMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdNpvMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdPpvMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdTnrMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.ThresholdTprMetric(**kwargs)

Bases: simpleml.metrics.classification.BinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

score(self)

Abstract method for each metric to define

Should set self.values

class simpleml.metrics.TprMetric(**kwargs)

Bases: simpleml.metrics.classification.AggregateBinaryClassificationMetric

TODO: Figure out multiclass generalizations

Parameters: dataset_split – string denoting which dataset split to use. Can be one of: TRAIN, VALIDATION, Other. Other gets no prefix. Default is the train split, to stay consistent with no split mapping to Train in Pipeline.

static _score(predictions, labels)

Each aggregate needs to define a separate private method to actually calculate the aggregate.

Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties).
