A Simple Method for Classifier Accuracy Prediction Under Prior Probability Shift

AUTHORS: Lorenzo Volpi, Alejandro Moreo, Fabrizio Sebastiani

WORK PACKAGE: WP 8 – UbiQuity


Keywords: Classifier accuracy prediction, Prior probability shift, Label shift, Quantification

Abstract
The standard technique for predicting the accuracy that a classifier will have on unseen data (classifier accuracy prediction – CAP) is cross-validation (CV). However, CV relies on the assumption that the training data and the test data are sampled from the same distribution, an assumption that is violated in many real-world scenarios. When such violations occur (i.e., in the presence of dataset shift), the estimates returned by CV are unreliable. In this paper we propose a CAP method specifically designed to address prior probability shift (PPS), an instance of dataset shift in which the training and test distributions are characterized by different class priors. By solving a system of n² independent linear equations, with n the number of classes, our method estimates the n² entries of the contingency table of the test data, and thus allows estimating any specific evaluation measure. Since a key step in this method involves predicting the class priors of the test data, we further observe a connection between our method and the field of “learning to quantify”. Our experiments show that, when combined with state-of-the-art quantification techniques, under PPS our method tends to outperform existing CAP methods.
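The intuition behind the approach can be illustrated with a minimal sketch (assumptions, not the paper's exact algorithm): under PPS the class-conditional distribution of classifier predictions, P(ŷ | y), can be taken as invariant between validation and test data. Estimating it on held-out validation data and combining it with quantification-based estimates of the test priors yields the n² contingency-table entries, one linear combination per entry, from which any evaluation measure follows. The prior values and error rates below are illustrative.

```python
import numpy as np

def conditional_confusion(y_true, y_pred, n_classes):
    """M[i, j] = P(y_hat = j | y = i), estimated on validation data.
    Under prior probability shift this matrix is assumed invariant."""
    M = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        mask = y_true == i
        for j in range(n_classes):
            M[i, j] = np.mean(y_pred[mask] == j)
    return M

def estimate_contingency(M, test_priors):
    """C[i, j] = P(y = i, y_hat = j) on the test data: n^2 entries,
    each a product of an (estimated) test prior and a conditional rate."""
    return test_priors[:, None] * M

# Toy example with a deliberately imperfect binary classifier.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=20000)
# The classifier errs 10% of the time on class 0 and 20% on class 1.
flip = rng.random(20000) < np.where(y_val == 0, 0.1, 0.2)
y_hat_val = np.where(flip, 1 - y_val, y_val)

M = conditional_confusion(y_val, y_hat_val, n_classes=2)
# Suppose a quantifier predicts shifted test priors of (0.3, 0.7);
# in practice these would come from a learning-to-quantify method.
test_priors = np.array([0.3, 0.7])
C = estimate_contingency(M, test_priors)
accuracy = np.trace(C)  # expected test accuracy under the shifted priors
print(C)
print(accuracy)
```

With these error rates the expected test accuracy is roughly 0.3·0.9 + 0.7·0.8 ≈ 0.83, and any other measure (F1, precision, recall) can be read off the same estimated table C.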
