Formulating common grounds for studying explainability in the context of agent behaviour: a survey on the topic, analysis of evaluation metrics and the definition of an ontology

As Artificial Intelligence (AI) systems become further integrated into our daily lives, there is growing discussion in both academic and policy settings regarding the need for explanations. If we cannot explain the algorithms, we cannot effectively predict their outcomes, dispute their decisions, verify them, improve them, or maximise any learning from them, which negatively impacts trustworthiness and raises ethical concerns. These issues led to the emergence of the field of eXplainable Artificial Intelligence (XAI) and of multiple approaches and methodologies for producing explanations.

However, there are many elements to take into account in order to decide what explanations to produce and how to produce them:
* Who is the explainee and what is their perspective, i.e. what knowledge do they have of the system and what questions do they want addressed?
* How should the reliability of an explanation be defined? How can we assess whether explanations produced are in line with agent behaviour or just plausible falsehoods? Should explanations refer to specific situations or just to general cases? What metrics can be defined and what is needed for reliable explanations to be feasible?
* What explanations are actually demanded by each use case? Not all aspects of the agent or its behaviour are equally necessary.
* What demands on explainability pertain to continually interactive agents in particular, as opposed to other types of systems?
* In case more than one agent is present in the environment, should the explanations be given in terms of a single agent or of the system as a whole? When can these perspectives be meaningfully separated, and when not?
* Can technical demands for XAI be translated into implementation-agnostic architectural, structural or behavioural constraints?
* What information about the agent’s context/the socio-technical system is necessary in the explanation? How do privacy and other ethical values impact the demands for explainability?

We focus on agents due to the unique complexities involved in their emergent behaviour, particularly in multi-agent systems.
From a bottom-up perspective, taking into account the potentially complex behaviour of an interactive agent, the problem of explainability becomes hard to manage and seemingly only solvable in an implementation-specific manner. Many such implementation-specific approaches exist, often building on particular agent architectures, e.g. Belief-Desire-Intention (BDI).

We propose to adopt a top-down perspective on the topic, by 1) undertaking a comprehensive analysis of the state of the art on explainability of agent behaviour, 2) elaborating an exhaustive definition of relevant terms and their interpretation, 3) studying relevant evaluation metrics and proposing new ones, and 4) producing a comprehensive taxonomy of behaviour explainability. In this context, we aim to integrate diverse perspectives regarding where and how artificial agents are used in socio-technical systems through representative real-world examples.

Top-down views on the topic of explainable AI are not widely represented in the literature, so our proposal should constitute a strong contribution to the state of the art. The outcome of this microproject should allow explainable AI systems to be defined on common ground, cutting down on complexity and driving towards generalization while always taking into account the needs of the audience of the explanations.

This project strongly supports at least WP1-2, WP3 and WP5, and may assist the goals of WP4.
Its main focus, however, is a framework for understanding and evaluating agent explainability, which fits within WP5.



1. (At least) two publications targeting venues such as the IJCAI, AAMAS, ECAI, AAAI conferences, or topic-related workshops:
* The first publication will be a survey of literature in agent explainability
* The second publication will be a definition of a conceptual framework for explainability in agent behaviour, and its grounding as an ontology or data model

2. A grounding of the conceptual framework in the form of an ontology and/or a data model for its use in socio-technical systems.

Activities that will be funded include, but are not limited to:
* Doing a comprehensive survey of the literature, classifying different types of explainability and approaches to it,
* Producing a taxonomy of terms related to the topic and providing a definition for each of them (this might be hard, but it is necessary because words are sometimes used in conflicting ways),
* Defining what the “explainability box” should account for, probably as an architecture (component view),
* Analysing possible metrics that can be used for evaluating explainable systems, proposing new ones if necessary,
* Mapping different already existing approaches for explainability to our proposed architecture, in order to validate its expressiveness,
* Defining a methodology for grounding the conceptual framework in particular scenarios, e.g. Overcooked-AI, COVID Social Simulator, privacy enforcement in a home network (these are just examples; the methodology will be general).
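
As a purely illustrative sketch of the kind of evaluation metric the activities above refer to, the following Python snippet computes a simple fidelity score: the fraction of observed agent decisions that an explanation, read as a predictive rule set, reproduces. This is not a metric proposed by the project. The function name `explanation_fidelity`, the toy trace and the rule table are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Hashable, Iterable, Tuple

State = Hashable
Action = Hashable


def explanation_fidelity(
    trace: Iterable[Tuple[State, Action]],
    explained_policy: Callable[[State], Action],
) -> float:
    """Return the fraction of observed (state, action) pairs that the
    explanation, read as a policy, predicts correctly.

    Hypothetical metric for illustration only.
    """
    total = 0
    matches = 0
    for state, action in trace:
        total += 1
        if explained_policy(state) == action:
            matches += 1
    if total == 0:
        raise ValueError("trace must contain at least one observation")
    return matches / total


# Hypothetical behaviour trace of a cooking agent (made-up states and actions).
trace = [
    ("onion_on_counter", "pick_up"),
    ("holding_onion", "go_to_pot"),
    ("pot_full", "wait"),
    ("holding_onion", "drop"),
]

# Hypothetical explanation expressed as simple rules; unknown states map to "wait".
rules = {"onion_on_counter": "pick_up", "holding_onion": "go_to_pot"}


def explained_policy(state: State) -> Action:
    return rules.get(state, "wait")


fidelity = explanation_fidelity(trace, explained_policy)
print(f"fidelity = {fidelity:.2f}")
```

A score close to 1 would suggest that the explanation is in line with the observed behaviour, while a low score would point to a plausible but unfaithful explanation. A real metric would also need to account for the explainee and the use case, which this sketch deliberately ignores.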

Project Partners

  • Umeå University (UMU), Mattias Brännström
  • UiB, John Lindqvist
  • Universitat Politècnica de Catalunya (UPC), Sergio Alvarez-Napagao

Primary Contact

Mattias Brännström, Umeå University (UMU)