Contact person: Mattias Valentin Brännström (mattias.brannstrom@umu.se)
Internal Partners:
- Umeå – UMU, Mattias Valentin Brännström, mattias.brannstrom@umu.se
- BSC, Victor Gimenez-Abalos, victor.gimenez@bsc.es
External Partners:
- UiB, John Lindqvist, john.lindqvist@uib.no
- UPC, Sergio Alvarez-Napagao, salvarez@cs.upc.ed
As Artificial Intelligence (AI) systems integrate further into our daily lives, there is growing discussion in both academic and policy settings regarding the need for explanations. If we cannot explain the algorithms, we cannot effectively predict their outcomes, dispute their decisions, verify them, improve them, or learn from them, which undermines trustworthiness and raises ethical concerns. These issues led to the emergence of the field of eXplainable Artificial Intelligence (XAI) and of multiple approaches and methodologies for producing explanations.
However, there are many elements to take into account in order to decide what explanations to produce and how to produce them:
* Who is the explainee and what is their perspective, i.e. what knowledge do they have of the system and what questions do they want addressed?
* How should the reliability of an explanation be defined? How can we assess whether the explanations produced are in line with agent behaviour or are merely plausible falsehoods? Should explanations refer to specific situations or only to general cases? What metrics can be defined, and what is needed for reliable explanations to be feasible?
* What explanations does each use case actually demand? Not all aspects of the agent or its behaviour are equally necessary to explain.
* What demands on explainability pertain to continually interactive agents in particular, as opposed to other types of systems?
* In case more than one agent is present in the environment, should the explanations be given in terms of a single agent or of the system as a whole? When can these perspectives be meaningfully separated, and when not?
* Can technical demands for XAI be translated into implementation-agnostic architectural, structural, or behavioural constraints?
* What information about the agent’s context and the surrounding socio-technical system is necessary in the explanation? How do privacy and other ethical values impact the demands for explainability?
We focus on agents because of the unique complexities involved in their emergent behaviour, particularly in multi-agent systems.
From a bottom-up perspective, taking into account the potentially complex behaviour of an interactive agent, the problem of explainability becomes hard to manage and seemingly solvable only in an implementation-specific manner. Many such implementation-specific approaches exist, often building on particular agent architectures, e.g. belief-desire-intention (BDI).
We propose to adopt a top-down perspective on the topic by 1) conducting a comprehensive analysis of the state of the art on explainability of agent behaviour, 2) elaborating an exhaustive definition of the relevant terms and their interpretation, 3) studying relevant evaluation metrics and proposing new ones, and 4) producing a comprehensive taxonomy of behaviour explainability. In this context, we aim to integrate diverse perspectives on where and how artificial agents are used in socio-technical systems through representative real-world examples.
Top-down views on explainable AI are not widely represented in the literature, so our proposal should constitute a strong contribution to the state of the art. The outcome of this microproject should allow explainable AI systems to be defined on common grounds, cutting down on complexity and driving towards generalisation while always taking into account the needs of the audience of the explanations.
Results Summary
The project uncovered that a layered causal structure can tie intention-based explanation to artificial agents in an otherwise implementation-agnostic way. By observing the causal connections between sensory information, knowledge, and selected actions, agent behaviour can be structured along the lines of folk-psychological concepts that translate directly into explanations. Since the elements of this structure (beliefs, desires, and intentions) are defined only by their causal role, they may be implemented in any fashion.
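To make this concrete, the sketch below shows how elements identified purely by their causal role might be rendered as an intention-based explanation of an observed action, independently of how the agent is implemented. The names (CausalElement, explain) and the charging-station scenario are illustrative assumptions, not part of the project's framework.

```python
# A minimal sketch: folk-psychological elements are identified only by the causal
# role they play, and an action is explained by walking its causal chain.
from dataclasses import dataclass, field


@dataclass
class CausalElement:
    """An observed element, characterised purely by its causal role."""
    role: str                                    # "belief" | "desire" | "intention" | "action"
    content: str                                 # human-readable description
    causes: list = field(default_factory=list)   # upstream elements that caused this one


def explain(action: CausalElement) -> str:
    """Phrase the causal chain behind an action in folk-psychological terms."""
    sentences = []
    for intention in (c for c in action.causes if c.role == "intention"):
        desires = [c.content for c in intention.causes if c.role == "desire"]
        beliefs = [c.content for c in intention.causes if c.role == "belief"]
        sentences.append(
            f"The agent performed '{action.content}' because it intended to {intention.content}, "
            f"desiring {' and '.join(desires)} while believing {' and '.join(beliefs)}."
        )
    return " ".join(sentences)


# Illustrative scenario: a robot moving to its charging station.
low_battery = CausalElement("belief", "that its battery is low")
stay_alive = CausalElement("desire", "to remain operational")
recharge = CausalElement("intention", "recharge", causes=[low_battery, stay_alive])
move = CausalElement("action", "move to the charging station", causes=[recharge])

print(explain(move))
# -> The agent performed 'move to the charging station' because it intended to recharge,
#    desiring to remain operational while believing that its battery is low.
```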
This causal structure repeats hierarchically: higher-level intentional behaviour encapsulates and makes use of lower levels. This naturally coincides with nested descriptions of behaviour, in which low-level behaviours are similarly encapsulated.
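The sketch below illustrates how such nesting allows the same behaviour to be described at different levels of abstraction; the Intention class and the delivery example are assumed for illustration only.

```python
# A sketch of the hierarchical structure: a high-level intention encapsulates
# lower-level intentions, so explanations can be given at a chosen depth.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intention:
    goal: str
    sub_intentions: List["Intention"] = field(default_factory=list)


def describe(intention: Intention, depth: int) -> List[str]:
    """Return goal descriptions down to the requested depth of the hierarchy."""
    if depth == 0 or not intention.sub_intentions:
        return [intention.goal]
    descriptions = []
    for sub in intention.sub_intentions:
        descriptions.extend(describe(sub, depth - 1))
    return descriptions


deliver = Intention("deliver the package", [
    Intention("navigate to the recipient", [
        Intention("plan a route"),
        Intention("avoid obstacles"),
    ]),
    Intention("hand over the package"),
])

print(describe(deliver, 0))  # ['deliver the package']
print(describe(deliver, 1))  # ['navigate to the recipient', 'hand over the package']
print(describe(deliver, 2))  # ['plan a route', 'avoid obstacles', 'hand over the package']
```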
The framework we explored during this project highlights this link and structure. The structure can be employed to create artificial agents that are explainable by design, but also to assess and understand the limits of intentional behaviour in existing agents.
The project has opened up several directions for further research to fully realise the potential of the uncovered relationships.