By Michael Wick, Principal Member Technical Staff - Oracle Labs EAST
Increasingly, from lending to hiring, machine learning is involved in decision-making processes that affect people's lives. Machine learning is often desirable because algorithms such as classifiers can be trained to high accuracy from large quantities of labeled data.
Unfortunately, it is by now well established that this data can be biased, and thus the very training algorithm that drives the classifier to high levels of accuracy also instills unfair tendencies in it. The classifier might, for example, tend to deny people certain credit limits based on their age, race, or gender.
Fortunately, fairness can be quantified precisely, which makes it possible to modify a classifier's decisions so that they satisfy a mathematical formalization of a chosen definition of fairness. A key question is: what happens to the accuracy of the classifier when we make it fairer?
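To make "quantifying fairness" concrete, here is a minimal sketch of one common definition, demographic parity, which compares the rate of positive decisions across two groups. This post does not say which definition the paper uses, so the choice of metric, the function name demographic_parity_gap, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def demographic_parity_gap(predictions, group):
    """Absolute difference in positive-decision rates between group 0 and group 1.

    A gap of 0 means both groups receive positive decisions at the same rate.
    Note that no ground-truth labels are needed to compute it.
    """
    predictions = np.asarray(predictions)
    group = np.asarray(group)
    rate_0 = predictions[group == 0].mean()
    rate_1 = predictions[group == 1].mean()
    return abs(rate_0 - rate_1)

# Hypothetical decisions: 1 = approved, 0 = denied.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

gap = demographic_parity_gap(preds, group)
print(gap)  # 0.5 (group 0 is approved 75% of the time, group 1 only 25%)
```

A classifier can then be adjusted, for example by changing per-group decision thresholds, until a measure like this falls below a chosen tolerance.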
The prevailing wisdom is that a classifier's fairness and its accuracy are in tension with one another. Under this trade-off theory, forcing the classifier to satisfy some mathematical definition of fairness will invariably decrease the accuracy. Indeed, in Figure 1, the accuracy of the classifier drops (higher is more accurate) as fairness increases (left is more fair).
From a constrained optimization perspective, this is rather intuitive: fairness imposes constraints on the set of allowed outcomes, from which it immediately follows that the best achievable accuracy can only stay the same or get worse.
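The argument can be written out in symbols. The notation is assumed here rather than taken from the paper: \mathcal{F} is the set of candidate classifiers, \mathrm{Acc}(f) is the accuracy of classifier f measured on the available labels, and U(f) \le \epsilon is a fairness constraint that caps some measure of unfairness U at a tolerance \epsilon.

```latex
\[
\{\, f \in \mathcal{F} : U(f) \le \epsilon \,\} \subseteq \mathcal{F}
\quad\Longrightarrow\quad
\max_{f \in \mathcal{F},\; U(f) \le \epsilon} \mathrm{Acc}(f)
\;\le\;
\max_{f \in \mathcal{F}} \mathrm{Acc}(f)
\]
```

Maximizing over a subset can never yield a larger value than maximizing over the whole set, so the inequality holds for whatever labels \mathrm{Acc} is computed against.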
Both theoretical and empirical studies confirm the trade-off theory. As the authors of one paper put it, "demanding fairness of models always comes at a cost of reduced accuracy."
But is this theory really true? As noted above, the reason a classifier is unfair in the first place is that the labels in a dataset can be biased. This means that a classifier trained on these labels will be biased. But crucially, it also means that accuracy evaluated on these labels will be biased, and therefore unfaithful to the true accuracy, whatever it actually is. The trade-off theory about fairness and accuracy tacitly assumes that the labels in the data are unbiased, and often they are not.
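A small simulation illustrates why accuracy measured on biased labels can mislead. Everything in it is a hypothetical setup and not the paper's experiment: the bias model (30% of true positives in one group are recorded as negatives), the variable names, and the two idealized classifiers are assumptions chosen to make the effect easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, size=n)    # hypothetical protected attribute
y_true = rng.integers(0, 2, size=n)   # unbiased labels, unobservable in practice

# Assumed bias model: 30% of the true positives in group 1 are recorded as negatives.
flip = (group == 1) & (y_true == 1) & (rng.random(n) < 0.3)
y_observed = np.where(flip, 0, y_true)

def accuracy(pred, labels):
    return float((pred == labels).mean())

# Two idealized classifiers: one reproduces the biased labels perfectly,
# the other recovers the unbiased labels perfectly.
biased_pred = y_observed.copy()
unbiased_pred = y_true.copy()

acc_biased_on_observed = accuracy(biased_pred, y_observed)
acc_unbiased_on_observed = accuracy(unbiased_pred, y_observed)
acc_biased_on_true = accuracy(biased_pred, y_true)
acc_unbiased_on_true = accuracy(unbiased_pred, y_true)

print("evaluated on biased labels:  ", acc_biased_on_observed, acc_unbiased_on_observed)
print("evaluated on unbiased labels:", acc_biased_on_true, acc_unbiased_on_true)
```

Scored against the recorded labels, the classifier that copies the bias looks perfect and the unbiased one looks worse. Scored against the unbiased labels, the ranking reverses.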
Our paper re-examines the trade-off between fairness and accuracy, this time taking into account assumptions about the bias in the data when evaluating the classifier. When we account for the bias in the evaluation data, we find that in many cases fairness and accuracy are not in tension but in agreement: increasing fairness also increases accuracy. We can see this relationship in Figure 2, which shows the opposite trend to Figure 1.
This raises the question: can we use fairness as a training constraint to improve the accuracy of the classifier? Since many definitions of fairness do not require labeled data, we can impose fairness as a constraint on large quantities of unlabeled data, in a semi-supervised manner, to improve the quality of a classifier, increasing both its fairness and its accuracy.
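As a rough sketch of the mechanism, and not the paper's actual method, the following trains a logistic regression on a small labeled set while adding a penalty on the demographic parity gap computed over a larger unlabeled set. The penalty form (lam times the squared gap), the synthetic data generator, and all names are assumptions made for illustration. The sketch only shows that the penalty shrinks the gap without using any extra labels; it does not attempt to reproduce the accuracy result.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def parity_gap(w, X, group):
    """Difference in mean predicted probability between group 0 and group 1."""
    p = sigmoid(X @ w)
    return p[group == 0].mean() - p[group == 1].mean()

def train(X_lab, y_lab, X_unl, g_unl, lam, lr=0.1, steps=500):
    """Gradient descent on cross-entropy (labeled data) plus
    lam * gap**2, where the gap is computed on unlabeled data only."""
    w = np.zeros(X_lab.shape[1])
    a, b = g_unl == 0, g_unl == 1
    for _ in range(steps):
        # Supervised gradient: needs labels.
        p_lab = sigmoid(X_lab @ w)
        grad_sup = X_lab.T @ (p_lab - y_lab) / len(y_lab)

        # Fairness gradient: needs only features and group membership.
        p_unl = sigmoid(X_unl @ w)
        s = p_unl * (1.0 - p_unl)  # derivative of the sigmoid
        gap = p_unl[a].mean() - p_unl[b].mean()
        grad_gap = ((X_unl[a] * s[a][:, None]).mean(axis=0)
                    - (X_unl[b] * s[b][:, None]).mean(axis=0))

        w -= lr * (grad_sup + lam * 2.0 * gap * grad_gap)
    return w

def make_data(rng, n):
    """Hypothetical data: x1 is a proxy for the group, and the recorded
    labels depend on group membership (the assumed label bias)."""
    g = rng.integers(0, 2, size=n)
    x1 = g + 0.5 * rng.normal(size=n)
    x2 = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    y = (x2 + g + 0.5 * rng.normal(size=n) > 0.5).astype(float)
    return X, y, g

rng = np.random.default_rng(0)
X_lab, y_lab, _ = make_data(rng, 200)     # small labeled set
X_unl, _, g_unl = make_data(rng, 5000)    # large unlabeled set (labels discarded)

w_plain = train(X_lab, y_lab, X_unl, g_unl, lam=0.0)
w_fair = train(X_lab, y_lab, X_unl, g_unl, lam=5.0)

gap_plain = abs(parity_gap(w_plain, X_unl, g_unl))
gap_fair = abs(parity_gap(w_fair, X_unl, g_unl))
print("parity gap without penalty:", gap_plain)
print("parity gap with penalty:   ", gap_fair)
```

The weight lam controls how strongly the unlabeled data pulls the classifier toward parity; setting it to zero recovers ordinary supervised training.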
You can read the full paper here.