Contact person: Dino Pedreschi (dino.pedreschi@unipi.it)

Internal Partners:

  1. University of Pisa – Department of CS, Dino Pedreschi (dino.pedreschi@unipi.it)

External Partners:

  1. University of Antwerp – Department of CS, Daphne Lenders (daphne.lenders@uantwerpen.be)
  2. Scuola Normale Superiore, Roberto Pellungrini (roberto.pellungrini@sns.it)   


Our project revolves around fair Artificial Intelligence (AI), a field that explores how decision-making algorithms used in high-stakes domains, such as hiring or loan allocation, can perpetuate discriminatory patterns present in the data they are based on, unfairly affecting people of certain races, genders, or other demographics. Early attempts to address bias in AI systems focused on automated solutions, attempting to eliminate discrimination by establishing mathematical definitions of “fairness” and optimizing algorithms accordingly. However, these approaches have faced justified criticism for disregarding the contextual nuances in which algorithms operate and for neglecting the input of domain experts who understand discriminatory patterns and can tackle them effectively. Consequently, policymakers have recognized the pitfalls of relying solely on these approaches and are now designing legal regulations mandating that high-risk AI systems may only be deployed when they allow for oversight and intervention by human experts.

With our project, we investigate how to achieve this human control effectively by exploring the intersection between fair and explainable AI (xAI), where the latter is concerned with explaining the decision processes of otherwise opaque black-box algorithms. We developed a tool that provides humans with explanations about an algorithmic decision-making system. Based on these explanations, users can give feedback about the system’s fairness and choose between different strategies to mitigate its discriminatory patterns. By immediately seeing the effects of their chosen strategy, users can engage in an iterative process, further refining and improving the algorithm.

Since little prior work exists on human-AI collaboration in the context of bias mitigation, we took an exploratory approach to evaluating this system. We set up a think-aloud study in which potential end-users interact with the system and try out different mitigation strategies. We analysed their responses and thoughts to identify the tool’s strengths and weaknesses as well as users’ mental model of the tool. Additionally, we compared the system’s biases before and after human intervention, to see how biases were mitigated and how successful this mitigation was.
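The interaction cycle at the heart of the tool can be summarised as the short Python sketch below. All names used here (review_loop, explain, audit, get_feedback, mitigations) are illustrative placeholders rather than the tool’s actual API; the sketch only outlines the explain → feedback → mitigate → re-audit loop under those assumptions.

```python
# Minimal sketch of the human-in-the-loop mitigation cycle described above.
# All names are illustrative placeholders, not the project's actual tool or API.
from typing import Callable, Dict, Tuple


def review_loop(
    model,
    data,
    explain: Callable,                 # produces explanations of the model's decisions
    audit: Callable,                   # computes group-level fairness metrics
    get_feedback: Callable,            # the human expert's judgement (UI / think-aloud study)
    mitigations: Dict[str, Callable],  # available bias-mitigation strategies
    max_rounds: int = 5,
) -> Tuple[object, object]:
    """Show explanations and fairness metrics, apply the mitigation the expert
    selects, and stop once the expert accepts the current model."""
    for _ in range(max_rounds):
        report = audit(model, data)
        accepted, strategy = get_feedback(explain(model, data), report)
        if accepted:
            break
        model = mitigations[strategy](model, data)
    return model, audit(model, data)
```

Keeping the explanation, feedback, and mitigation steps as interchangeable components reflects the idea that the expert, not the system, decides which strategy to apply at each round.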

Results Summary

We developed an algorithm that can reject predictions based on both their uncertainty and their unfairness. By rejecting potentially unfair predictions, our method reduces differences in error rates and positive decision rates across demographic groups on the non-rejected data. Since the unfairness-based rejections rely on an interpretable-by-design method, i.e., rule-based fairness checks and situation testing, we create a transparent process that can empower human decision-makers to review the unfair predictions and make more just decisions for them. This explainable aspect is especially important in light of recent AI regulations mandating that any high-risk decision task be overseen by human experts to reduce discrimination risks. This methodology essentially bridges the gap between classifiers with a reject option and interpretable-by-design methods, encouraging human intervention and comprehension. We produced functioning software, which is publicly available (see Tangible Outcomes below), and we are working on a full publication with experiments on multiple datasets and multiple rejection strategies.
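To illustrate how uncertainty-based and fairness-based rejection can be combined, the sketch below uses scikit-learn on synthetic data. It is not the IFAC implementation: the dataset, the synthetic sensitive attribute, the confidence margin, and the situation-testing neighbourhood size and threshold are all assumptions made purely for this example.

```python
# Illustrative sketch (not IFAC): reject predictions that are either uncertain or
# flagged by a simple situation-testing style fairness check.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
rng = np.random.default_rng(0)
group = np.where(rng.random(len(y)) < 0.7, y, 1 - y)  # synthetic sensitive attribute
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]
pred = (proba >= 0.5).astype(int)

# 1) Uncertainty-based rejection: abstain when the classifier is not confident.
uncertain = np.abs(proba - 0.5) < 0.15  # assumed confidence margin

# 2) Fairness-based rejection via a situation-testing style check: for each negative
#    decision, compare outcomes among the k most similar individuals from each group;
#    a large gap suggests the decision may depend on group membership.
def situation_test_flags(X_eval, pred_eval, X_ref, y_ref, g_ref, k=10, gap=0.3):
    flags = np.zeros(len(X_eval), dtype=bool)
    nn = {g: NearestNeighbors(n_neighbors=k).fit(X_ref[g_ref == g]) for g in (0, 1)}
    outcomes = {g: y_ref[g_ref == g] for g in (0, 1)}
    for i in np.where(pred_eval == 0)[0]:  # audit only negative decisions
        rates = []
        for g in (0, 1):
            _, idx = nn[g].kneighbors(X_eval[i:i + 1])
            rates.append(outcomes[g][idx[0]].mean())
        flags[i] = abs(rates[0] - rates[1]) > gap
    return flags

flagged = situation_test_flags(X_te, pred, X_tr, y_tr, g_tr)
reject = uncertain | flagged  # these cases are handed to a human expert for review
keep = ~reject

def positive_rate_gap(p, g):
    return abs(p[g == 0].mean() - p[g == 1].mean())

print(f"coverage after rejection: {keep.mean():.2f}")
print(f"positive-rate gap, all predictions : {positive_rate_gap(pred, g_te):.3f}")
print(f"positive-rate gap, kept predictions: {positive_rate_gap(pred[keep], g_te[keep]):.3f}")
```

The rejected cases are the ones a human decision-maker would review; comparing the positive-rate gap before and after rejection mirrors the kind of group-level comparison described above.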

Tangible Outcomes

  1. The full software: https://github.com/calathea21/IFAC