Distribution metrics

Statistical measures for any data entity throughout your ML process (inputs, metadata, outputs, and labels).

Statistical property metrics are measured at the entity level and grouped by the entity's data type:

| Metric | Description | Categorical | Numeric | Boolean |
|---|---|---|---|---|
| Distribution shift | | ✔ | ✔ | ✔ |
| Top frequent percents | Frequency of the mode value. | ✔ | - | - |
| Unique values | The number of unique values. | ✔ | - | - |
| Entropy | Calculation of the entropy on the distribution of categorical entities. | ✔ | - | - |
| Min value | The lowest value. | - | ✔ | - |
| Max value | The highest value. | - | ✔ | - |
| Sum value | The sum of values. | - | ✔ | - |
| Mean value | The average value. | - | ✔ | - |
| Standard Deviation | The measure of the amount of variation in a numeric entity. | - | ✔ | - |
| Proportion | Percent of positive value. | - | - | ✔ |
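
The per-type metrics in the table above can be sketched in a few lines of Python. This is an illustration only, not Superwise's implementation: the function names are hypothetical, and several details the page does not specify are assumptions (entropy in bits, population rather than sample standard deviation, and percentages on a 0-100 scale).

```python
import math
from collections import Counter
from statistics import mean, pstdev


def categorical_metrics(values):
    """Metrics listed for categorical entities (hypothetical helper)."""
    counts = Counter(values)
    total = len(values)
    probs = [count / total for count in counts.values()]
    return {
        # Share of rows holding the most frequent (mode) value, in percent.
        "top_frequent_percent": 100 * max(counts.values()) / total,
        "unique_values": len(counts),
        # Shannon entropy; the log base (2, i.e. bits) is an assumption.
        "entropy": -sum(p * math.log2(p) for p in probs),
    }


def numeric_metrics(values):
    """Metrics listed for numeric entities (hypothetical helper)."""
    return {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "mean": mean(values),
        # Population standard deviation; the sample version is equally plausible.
        "std": pstdev(values),
    }


def boolean_metrics(values):
    """Metric listed for boolean entities (hypothetical helper)."""
    positives = sum(1 for v in values if v)
    return {"proportion": 100 * positives / len(values)}
```

For example, `categorical_metrics(["a", "a", "b", "c"])` reports a mode frequency of 50 percent, three unique values, and an entropy of 1.5 bits.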

Feature Importance

Feature importance measures the effect a feature has on a model. Superwise calculates it from SHAP values during version creation, using a reference dataset. Alternatively, you can set it manually while configuring the schema.
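
To make the idea of SHAP-based importance concrete, here is a minimal sketch that computes exact Shapley values for a tiny model and averages their absolute values over a reference dataset. It is not Superwise's procedure, which this page does not describe. The function names, the use of the reference mean as a single baseline row, and the mean-absolute-value aggregation are all assumptions made for illustration. The exact enumeration is exponential in the number of features, so it only suits toy examples.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at row x, relative to a baseline row."""
    n = len(x)

    def value(subset):
        # Features in `subset` come from x; all others come from the baseline.
        row = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(row)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def feature_importance(predict, reference_rows):
    """Mean absolute Shapley value per feature over the reference rows."""
    n = len(reference_rows[0])
    count = len(reference_rows)
    # Assumption: the column-wise mean of the reference data is the baseline.
    baseline = [sum(row[i] for row in reference_rows) / count for i in range(n)]
    totals = [0.0] * n
    for row in reference_rows:
        for i, phi in enumerate(shapley_values(predict, row, baseline)):
            totals[i] += abs(phi)
    return [total / count for total in totals]
```

With a linear model such as `lambda r: 2 * r[0] - r[1]`, each Shapley value reduces to the feature's weight times its distance from the baseline, so a feature the model ignores receives an importance of zero.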

Distribution shift & drift metrics:

  • Distribution shift
    How much the distribution of the selected data differs from that of the reference dataset over time. The scale ranges from 0 to 100, where 0 indicates identical distributions and 100 indicates orthogonal distributions.

  • Input drift
    The average of the distribution shift values across all features.
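
The two drift metrics above can be illustrated with a short sketch. The page does not name the distance measure behind the 0-100 scale, so this example substitutes total variation distance scaled by 100, which is 0 for identical distributions and 100 for distributions with no values in common. The function names are hypothetical, and the sketch handles categorical values only; numeric features would need to be binned first.

```python
from collections import Counter


def distribution_shift(reference, current):
    """Total variation distance between two samples, scaled to 0-100.

    Stand-in measure: the actual Superwise metric is not specified here.
    """
    ref_counts, cur_counts = Counter(reference), Counter(current)
    n_ref, n_cur = len(reference), len(current)
    categories = set(ref_counts) | set(cur_counts)
    # Counter returns 0 for categories missing from one of the samples.
    tvd = 0.5 * sum(
        abs(ref_counts[c] / n_ref - cur_counts[c] / n_cur) for c in categories
    )
    return 100 * tvd


def input_drift(reference_features, current_features):
    """Unweighted average of the per-feature distribution shifts."""
    shifts = [
        distribution_shift(reference_features[name], current_features[name])
        for name in reference_features
    ]
    return sum(shifts) / len(shifts)
```

For instance, if one of two features is unchanged (shift 0) and the other has moved to entirely new values (shift 100), `input_drift` returns 50.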