Presenting users with explanations about AI systems to let them detect and mitigate discriminatory patterns

Our project revolves around the topic of fair Artificial Intelligence (AI), a field that explores how decision-making algorithms used in high-stakes domains, such as hiring or loan allocation, can perpetuate discriminatory patterns in the data they are based on, unfairly affecting people of certain races, genders, or other demographics.
Early attempts to address bias in AI systems focused on automated solutions, attempting to eliminate discrimination by establishing mathematical definitions of "fairness" and optimizing algorithms accordingly. However, these approaches have faced justified criticism for disregarding the context in which algorithms operate and for neglecting the input of domain experts who understand discriminatory patterns and can tackle them effectively. Consequently, policymakers have recognized the pitfalls of relying solely on these approaches and are now designing legal regulations mandating that high-risk AI systems can only be deployed when they allow for oversight and intervention by human experts.

With our project, we investigate how to achieve this human control effectively by exploring the intersection between fair and explainable AI (xAI), where the latter is concerned with explaining the decision processes of otherwise opaque black-box algorithms. We will develop a tool that provides humans with explanations about an algorithmic decision-making system. Based on the explanations, users can give feedback about the system’s fairness and choose between different strategies to mitigate its discriminatory patterns. By immediately getting feedback about the effects of their chosen strategy, users can engage in an iterative process, further refining and improving the algorithm. Since little prior work has been done on human-AI collaboration in the context of bias mitigation, we will take an exploratory approach to evaluate this system. We will set up a think-aloud study where potential end-users can interact with the system and try out different mitigation strategies. We will analyse their responses and thoughts to identify the tool’s strengths and weaknesses as well as users’ mental model of the tool. Additionally, we will compare the system’s biases before and after human intervention to see how biases were mitigated and how successful this mitigation was.
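To make the before-and-after comparison concrete, the following is a minimal sketch of one possible measurement. It assumes demographic parity difference (the gap in positive-decision rates between groups) as the bias measure; the project description does not state which fairness metric will be used, so the metric, the function names, and the toy data here are all hypothetical illustrations.

```python
from collections import defaultdict


def selection_rates(decisions, groups):
    """Return the share of positive decisions (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(decisions, groups):
    """Gap between the highest and lowest group selection rate (0 = parity)."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: group membership and binary decisions (1 = accepted).
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
before = [1, 1, 1, 0, 1, 0, 0, 0]  # decisions before human intervention
after = [1, 1, 0, 0, 1, 1, 0, 0]   # decisions after a chosen mitigation strategy

gap_before = demographic_parity_difference(before, groups)
gap_after = demographic_parity_difference(after, groups)
print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
```

A single number like this captures only one notion of fairness, which is one reason the study pairs such measurements with qualitative think-aloud data.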

Our work aligns with the goal of the topic “Establishing Common Ground for Collaboration with AI Systems” (motivated by Workpackage 1 and 2). This topic is focused on developing AI systems that work in harmony with human users, empowering them to bring their expertise and domain knowledge to the table. In particular, our work recognizes humans’ ability to make ethical judgements and aims to leverage this capability to make fairer AI systems. By conducting a user study, we align with the topic’s goal of making this human-AI collaboration desirable from the users’ side, ensuring that they understand the inner workings of the AI system and that they have full control in adapting it.


– A tool that presents users with explanations about a decision-making system and that can interactively adjust its decision process, based on human feedback about the fairness of its explanations
– A user-centric evaluation of the tool, investigating whether users can effectively detect biases through the tool and how they use the different bias mitigation strategies offered by it
– We aim to present a demo of the tool at a workshop or a conference
– Additionally, we commit to publishing one paper, describing the tool itself and the results of the usability study

Project Partners

Primary Contact

Dino Pedreschi, University of Antwerp – Department of CS