The basic challenge in debiasing ML models is that preventing a model from discriminating on the basis of sensitive characteristics normally requires information about those characteristics, and in practice this information is usually not available. Fortunately, there is a recent approach, Adversarial Reweighted Learning (ARL), which debiases models without access to sensitive attribute information. However, ARL redefines fairness as the Rawlsian max-min principle, which differs markedly from the parity-based fairness definitions that have been used so far. The goal of this project is to examine the implications of using the Rawlsian fairness principle to debias models by investigating three questions (a minimal code sketch of ARL and of a parity-based score follows the list below):

1. Are there sensitive attributes for which Rawlsian fairness is unsuitable?

2. What parity-based fairness scores result when the Rawlsian fairness definition is used to debias a model?

3. Is it possible to apply Rawlsian fairness (and ARL) as a post-processing method to existing models?
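
To make the approach concrete, the sketch below shows the adversarially reweighted min-max game that ARL uses to approximate Rawlsian max-min fairness without sensitive attributes, together with a simple parity-based score (demographic parity gap) of the kind question 2 would compare against. It is a minimal illustration in PyTorch: the two-layer learner, linear adversary, weight normalisation, hyperparameters, and synthetic data are assumptions for exposition, not the project's actual configuration.

```python
import torch
import torch.nn as nn

class Learner(nn.Module):
    """Main classifier whose predictions we want to debias."""
    def __init__(self, d):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)  # logits

class Adversary(nn.Module):
    """Assigns a weight to each example from (features, label) only -- no sensitive attribute."""
    def __init__(self, d):
        super().__init__()
        self.net = nn.Linear(d + 1, 1)

    def forward(self, x, y):
        s = torch.sigmoid(self.net(torch.cat([x, y.unsqueeze(-1)], dim=-1))).squeeze(-1)
        return 1.0 + x.shape[0] * s / s.sum()  # positive, normalised weights (ARL-style)

def arl_step(x, y, learner, adversary, opt_l, opt_a):
    """One round of the min-max game: learner minimises, adversary maximises, the reweighted loss."""
    bce = nn.BCEWithLogitsLoss(reduction="none")

    # Learner step: minimise the example-weighted loss (weights treated as constants).
    weights = adversary(x, y).detach()
    learner_loss = (weights * bce(learner(x), y)).mean()
    opt_l.zero_grad(); learner_loss.backward(); opt_l.step()

    # Adversary step: push weight towards examples the learner gets wrong
    # (gradient ascent on the reweighted loss = descent on its negation).
    weights = adversary(x, y)
    adv_loss = -(weights * bce(learner(x).detach(), y)).mean()
    opt_a.zero_grad(); adv_loss.backward(); opt_a.step()
    return learner_loss.item()

def demographic_parity_gap(logits, groups):
    """Parity-based score (question 2): largest gap in positive-prediction rate across groups."""
    rates = [(logits[groups == g] > 0).float().mean().item() for g in groups.unique()]
    return max(rates) - min(rates)

# Toy usage with synthetic data; the 'groups' attribute is hypothetical and is
# used only for evaluation, never during ARL training.
n, d = 512, 10
x = torch.randn(n, d)
y = (x[:, 0] + 0.5 * torch.randn(n) > 0).float()
groups = (x[:, 1] > 0).long()
learner, adversary = Learner(d), Adversary(d)
opt_l = torch.optim.Adam(learner.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
for _ in range(200):
    arl_step(x, y, learner, adversary, opt_l, opt_a)
print("demographic parity gap:", demographic_parity_gap(learner(x).detach(), groups))
```

Because the adversary only sees features and labels, it can up-weight computationally identifiable regions where the learner underperforms, which is how the min-max game pursues the Rawlsian goal of improving the worst-off group; the sensitive attribute appears only in the evaluation metric.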

Output

Publication identifying the effect of Rawlsian fairness on parity-based fairness definitions

Publication on how ARL can be used as a post-processing tool for models that are already in use

Project Partners:

  • ING Groep NV, Dilhan Thilakarathne

Primary Contact: Dilhan Thilakarathne, ING