simpleml.metrics
Import modules to register class names in global registry
Expose classes in one import module
Submodules
Package Contents
Classes
AccuracyMetric – TODO: Figure out multiclass generalizations
BinaryClassificationMetric – TODO: Figure out multiclass generalizations
ClassificationMetric – TODO: Figure out multiclass generalizations
F1ScoreMetric – TODO: Figure out multiclass generalizations
FprMetric – TODO: Figure out multiclass generalizations
FprTprMetric – TODO: Figure out multiclass generalizations
Metric – Base class for all Metric objects
RocAucMetric – TODO: Figure out multiclass generalizations
ThresholdAccuracyMetric – TODO: Figure out multiclass generalizations
ThresholdF1ScoreMetric – TODO: Figure out multiclass generalizations
ThresholdFdrMetric – TODO: Figure out multiclass generalizations
ThresholdFnrMetric – TODO: Figure out multiclass generalizations
ThresholdForMetric – TODO: Figure out multiclass generalizations
ThresholdFprMetric – TODO: Figure out multiclass generalizations
ThresholdInformednessMetric – TODO: Figure out multiclass generalizations
ThresholdMarkednessMetric – TODO: Figure out multiclass generalizations
ThresholdMccMetric – TODO: Figure out multiclass generalizations
ThresholdNpvMetric – TODO: Figure out multiclass generalizations
ThresholdPpvMetric – TODO: Figure out multiclass generalizations
ThresholdTnrMetric – TODO: Figure out multiclass generalizations
ThresholdTprMetric – TODO: Figure out multiclass generalizations
TprMetric – TODO: Figure out multiclass generalizations
Attributes
- class simpleml.metrics.AccuracyMetric(**kwargs)[source]
Bases:
AggregateBinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _score(predictions, labels)
Each aggregate needs to define a separate private method to actually calculate the aggregate
Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties)
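A minimal plain-Python sketch of what an aggregate accuracy `_score` computes (the function name and signature here are illustrative, not the library's internals):

```python
def accuracy_score(predictions, labels):
    """Fraction of predictions matching the labels: (TP+TN)/(TP+FP+TN+FN)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must be the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```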
- class simpleml.metrics.BinaryClassificationMetric(dataset_split=None, **kwargs)[source]
Bases:
ClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split (Optional[str]) – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _create_confusion_matrix(thresholds, probabilities, labels)
Independent computation method (easier testing)
- property accuracy(self)
Convenience property for the Accuracy Rate ((TP+TN)/(TP+FP+TN+FN))
- property confusion_matrix(self)
Property method to return (or generate) dataframe of confusion matrix at each threshold
- create_confusion_matrix(self)
Iterate through each threshold and compute confusion matrix
- static dedupe_curve(keys, values, maximize=True, round_places=3)
Method to deduplicate multiple values for the same key on a curve (e.g. multiple thresholds with the same FPR and different TPR on a ROC curve)
- Parameters
maximize – Boolean, whether to choose the maximum value for each unique key or the minimum
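A sketch of the deduplication this method describes, assuming keys are rounded before grouping (illustrative, not the library's implementation):

```python
def dedupe_curve(keys, values, maximize=True, round_places=3):
    """Keep one value per rounded key: the max (or min) of its duplicates."""
    pick = max if maximize else min
    best = {}
    for k, v in zip(keys, values):
        k = round(k, round_places)
        best[k] = pick(best[k], v) if k in best else v
    return best
```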
- property f1(self)
Convenience property for the F1 Score (2*TP/(2*TP+FP+FN))
- property false_discovery_rate(self)
Convenience property for the False Discovery Rate (FP/(FP+TP))
- property false_negative_rate(self)
Convenience property for the False Negative Rate (FN/(TP+FN))
- property false_omission_rate(self)
Convenience property for the False Omission Rate (FN/(TN+FN))
- property false_positive_rate(self)
Convenience property for the False Positive Rate (FP/(FP+TN))
- property informedness(self)
Convenience property for the Informedness (TPR+TNR-1)
- property labels(self)
- property markedness(self)
Convenience property for the Markedness (PPV+NPV-1)
- property matthews_correlation_coefficient(self)
Convenience property for the Matthews Correlation Coefficient ((TP*TN-FP*FN)/((FP+TP)*(TP+FN)*(TN+FP)*(TN+FN))^0.5)
- property negative_predictive_value(self)
Convenience property for the Negative Predictive Value (TN/(TN+FN))
- property positive_predictive_value(self)
Convenience property for the Positive Predictive Value (TP/(FP+TP))
- property predicted_negative_rate(self)
Convenience property for the Predicted Negative Rate ((TN+FN)/(TP+FP+TN+FN))
- property predicted_positive_rate(self)
Convenience property for the Predicted Positive Rate ((TP+FP)/(TP+FP+TN+FN))
- property predictions(self)
- property probabilities(self)
- property thresholds(self)
Convenience property for the probability thresholds
- property true_negative_rate(self)
Convenience property for the True Negative Rate (TN/(FP+TN))
- property true_positive_rate(self)
Convenience property for the True Positive Rate (TP/(TP+FN))
- static validate_labels(labels)
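The per-threshold confusion matrix computation that these convenience properties consume can be sketched as follows (plain Python, returning a list of dicts rather than the dataframe the class builds; illustrative only):

```python
def confusion_matrix_at_thresholds(thresholds, probabilities, labels):
    """For each threshold, count TP/FP/TN/FN when predicting 1 iff p >= threshold."""
    rows = []
    for t in thresholds:
        preds = [1 if p >= t else 0 for p in probabilities]
        tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
        tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
        fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
        rows.append({"threshold": t, "tp": tp, "fp": fp, "tn": tn, "fn": fn})
    return rows
```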
- class simpleml.metrics.ClassificationMetric(dataset_split=None, **kwargs)[source]
Bases:
simpleml.metrics.base_metric.Metric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split (Optional[str]) – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- property labels(self)
- Return type
Any
- property predictions(self)
- Return type
Any
- property probabilities(self)
- Return type
Any
- static validate_predictions(predictions)
- Parameters
predictions (Any) –
- Return type
None
- class simpleml.metrics.F1ScoreMetric(**kwargs)[source]
Bases:
AggregateBinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _score(predictions, labels)
Each aggregate needs to define a separate private method to actually calculate the aggregate
Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties)
- class simpleml.metrics.FprMetric(**kwargs)[source]
Bases:
AggregateBinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _score(predictions, labels)
Each aggregate needs to define a separate private method to actually calculate the aggregate
Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties)
- class simpleml.metrics.FprTprMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
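A sketch of the (FPR, TPR) curve points such a metric collects across thresholds, using the rate definitions above (illustrative helper, not the library's code):

```python
def fpr_tpr_points(thresholds, probabilities, labels):
    """(FPR, TPR) at each threshold: FPR = FP/(FP+TN), TPR = TP/(TP+FN)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for p, y in zip(probabilities, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probabilities, labels) if p >= t and y == 0)
        points.append((fp / neg if neg else 0.0, tp / pos if pos else 0.0))
    return points
```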
- class simpleml.metrics.Metric(name=None, has_external_files=False, author=None, project=None, version_description=None, save_patterns=None, **kwargs)[source]
Bases:
AbstractMetric
Base class for all Metric objects
model_id: foreign key to the model that was used to generate predictions
- TODO: Should the join criteria be a composite of model and dataset, to support multiple duplicate metric objects computed over different test datasets?
- __table_args__
- __tablename__ = metrics
- dataset
- dataset_id
- model
- model_id
- class simpleml.metrics.RocAucMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _score(probabilities, labels)
- score(self)
Abstract method for each metric to define
Should set self.values
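Given sorted (FPR, TPR) points, the area under the ROC curve can be computed with the trapezoidal rule; a minimal, illustrative sketch:

```python
def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule over sorted points."""
    pts = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```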
- class simpleml.metrics.ThresholdAccuracyMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
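A sketch of the threshold-to-accuracy mapping such a threshold metric might produce as its values (illustrative only; the actual storage format is not specified here):

```python
def accuracy_by_threshold(thresholds, probabilities, labels):
    """Map each threshold to the accuracy of the thresholded predictions."""
    values = {}
    for t in thresholds:
        preds = [1 if p >= t else 0 for p in probabilities]
        values[t] = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    return values
```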
- class simpleml.metrics.ThresholdF1ScoreMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdFdrMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdFnrMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdForMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdFprMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdInformednessMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdMarkednessMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdMccMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdNpvMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdPpvMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdTnrMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.ThresholdTprMetric(**kwargs)[source]
Bases:
BinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- score(self)
Abstract method for each metric to define
Should set self.values
- class simpleml.metrics.TprMetric(**kwargs)[source]
Bases:
AggregateBinaryClassificationMetric
TODO: Figure out multiclass generalizations
- Parameters
dataset_split – string denoting which dataset split to use; can be one of TRAIN, VALIDATION, or Other (Other gets no prefix). Defaults to the train split, to stay consistent with no split mapping to TRAIN in Pipeline
- static _score(predictions, labels)
Each aggregate needs to define a separate private method to actually calculate the aggregate
Separated from the public score method to enable easier testing and extension (values can be passed from non-internal properties)
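A plain-Python sketch of a TPR aggregate over hard 0/1 predictions (illustrative, not the library's internals):

```python
def tpr_score(predictions, labels):
    """True Positive Rate TP/(TP+FN) over hard 0/1 predictions."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    return tp / (tp + fn) if (tp + fn) else 0.0
```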