The Drift Analysis is a series of tests that checks whether your model's input data is drifting, that is, whether the distribution of the in-production input data (the updated dataset) has changed compared to the data your model was trained on (the training dataset). The Drift Analysis outputs a Drift Score based on the results of these tests. The scores for the individual tests are also available in the Drift Report if you wish to dig into the finer details of the Analysis.
Snitch performs a series of tests, each scored between 0 (data drift) and 100 (no data drift), and aggregates them into three intermediate scores: the Formatting Consistency Score, the Data Sanity Score, and the Data Distribution Score. The Drift Score is a weighted average of these three intermediate scores.
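The aggregation can be sketched as a plain weighted average; the function name below is illustrative, and the weights are the percentages stated in the three sections that follow.

```python
def drift_score(formatting, sanity, distribution):
    """Weighted average of the three intermediate scores (each 0-100).
    Weights: 15% formatting, 15% sanity, 70% distribution."""
    return 0.15 * formatting + 0.15 * sanity + 0.70 * distribution

# Perfect formatting and sanity, but heavy distribution drift,
# still pulls the overall Drift Score down.
print(drift_score(100, 100, 50))  # -> 65.0
```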
Formatting Consistency Validation
The Drift Analysis produces the Formatting Consistency Score to check if your updated dataset has the same data formats as your training dataset. The Score is based on whether the updated dataset has the same number of columns, column names, and data types as your training dataset. The Formatting Consistency Score accounts for 15% of the Drift Score.
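The three formatting comparisons can be illustrated with pandas; the helper below is a sketch for the example, not Snitch's actual implementation.

```python
import pandas as pd

def formatting_checks(train: pd.DataFrame, updated: pd.DataFrame) -> dict:
    """Compare column count, column names, and per-column data types."""
    return {
        "same_column_count": train.shape[1] == updated.shape[1],
        "same_column_names": list(train.columns) == list(updated.columns),
        "same_dtypes": train.dtypes.to_dict() == updated.dtypes.to_dict(),
    }

train = pd.DataFrame({"age": [25, 32], "income": [40.0, 52.5]})
# Same columns, but income now arrives as strings: a dtype mismatch.
updated = pd.DataFrame({"age": [41, 29], "income": ["55k", "38k"]})
print(formatting_checks(train, updated))
# -> {'same_column_count': True, 'same_column_names': True, 'same_dtypes': False}
```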
Data Sanity Validation
The Drift Analysis produces the Data Sanity Score to check if your training and updated datasets contain any obvious flaws that would either limit the quality of a machine learning model or suggest a programming mistake. The Score is based on whether the datasets contain string variables, null values, constant columns, and/or columns that share the same name. The Data Sanity Score accounts for 15% of the Drift Score.
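The four sanity conditions can be sketched as simple pandas checks; the helper and flag names below are assumptions for the example, not Snitch's actual implementation.

```python
import pandas as pd

def sanity_checks(df: pd.DataFrame) -> dict:
    """Flag the four sanity conditions; True means a problem was found."""
    return {
        "has_string_columns": any(dt == object for dt in df.dtypes),
        "has_null_values": bool(df.isnull().values.any()),
        "has_constant_columns": any(
            df[c].nunique(dropna=False) <= 1 for c in df.columns
        ),
        "has_duplicate_column_names": bool(df.columns.duplicated().any()),
    }

# Column "a" contains a null value; column "b" is constant.
df = pd.DataFrame({"a": [1, 2, None], "b": [7, 7, 7]})
print(sanity_checks(df))
```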
Data Distribution Validation
The Drift Analysis produces the Data Distribution Score to check if your training and updated datasets have features that are drifting, that is, whether the distribution of each feature in the updated dataset differs from its distribution in the training dataset. The Data Distribution Score accounts for 70% of the Drift Score.
The Data Distribution Validation performs a series of tests aiming to determine whether the training and updated datasets are samples of the same distribution.
- A Student's t-test checks for a statistically significant difference between the means of the two datasets.
- A Kolmogorov-Smirnov test checks the similarity of the two empirical distributions.
- A test checks whether positive or negative values are present in the updated dataset while absent from the training dataset.
- A test checks whether the maximum of the training dataset is smaller than the minimum of the updated dataset, and vice versa.
- An outlier detection test checks whether the extreme percentiles of the updated dataset are outliers in the training dataset.
- The population stability index (PSI) is computed to assess whether both training and updated datasets can be considered as samples from the same distribution (see Lin 2017).
- Snitch trains a simple random forest classifier to determine whether the training dataset observations can be distinguished from the updated dataset observations.
Snitch computes a score ranging from 0 to 100 for each of these tests. The Data Distribution Score is a weighted average of the test scores.
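As an illustration, three of these tests can be sketched with numpy and scipy. The `psi` helper, its bin count, and the decision thresholds below are assumptions made for this example, not Snitch's actual implementation or scoring.

```python
import numpy as np
from scipy import stats

def psi(train, updated, bins=10):
    """Population stability index between two samples of one feature.
    Bin edges are taken from the training data's quantiles."""
    edges = np.quantile(train, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch values outside the training range
    p = np.histogram(train, bins=edges)[0] / len(train)
    q = np.histogram(updated, bins=edges)[0] / len(updated)
    p, q = np.clip(p, 1e-6, None), np.clip(q, 1e-6, None)  # avoid log(0)
    return float(np.sum((p - q) * np.log(p / q)))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 2000)
same = rng.normal(0.0, 1.0, 2000)     # drawn from the training distribution
shifted = rng.normal(1.5, 1.0, 2000)  # mean has drifted

# Student's t-test on the means and Kolmogorov-Smirnov test on the
# empirical distributions: small p-values indicate drift.
print(stats.ttest_ind(train, shifted).pvalue < 0.01)   # True
print(stats.ks_2samp(train, shifted).pvalue < 0.01)    # True

# A common rule of thumb flags PSI values above 0.2 as drift.
print(psi(train, same) < 0.1)     # True
print(psi(train, shifted) > 0.2)  # True
```

In this sketch each raw statistic would still need to be mapped onto the 0-to-100 scale before entering the weighted average.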