WP5 AI Ethics and Responsible AI | Project Types

/ Projects

Contact person: Matthias Valentin Brannström (mattias.brannstrom@umu.se)

Internal Partners:

Umea – UMU, Mattias Valentin Brännström and Filip Edström, mattias.brannstrom@umu.se , filip.edstrom@umu.se
BSC, Victor Gimenez-Abalos, victor.gimenez@bsc.es

External Partners:

UiB, John Lindqvist, john.lindqvist@uib.no
UPC, Sergio Alvarez-Napagao, salvarez@cs.upc.ed

As Artificial Intelligence (AI) systems further integrate in our daily lives, there are growing discussions in both academic and policy settings regarding the need for explanations. If we cannot explain the algorithms, we cannot effectively predict their outcomes, dispute their decisions, verify them, improve them, or maximise any learning from them, negatively impacting trustworthiness and raising ethical concerns. These issues led to the emergence of the field of eXplainable Artificial Intelligence (XAI) and of multiple approaches and methodologies for producing explanations.

However, there are many elements to take into account in order to decide what explanations to produce and how to produce them:

* Who is the explainee and what is their perspective, i.e. what knowledge do they have from the system and what are the questions they want addressed?

* How should the reliability of an explanation be defined? How can we assess whether explanations produced are in line with agent behaviour or just plausible falsehoods? Should explanations refer to specific situations or just to general cases? What metrics can be defined and what is needed for reliable explanations to be feasible?

* What explanations are actually demanded by each use case? Not all aspects of the agent or its behaviour are equally necessary.

* What demands on explainability pertains to continually interactive agents in particular, over other types of systems?

* In case more than one agent is present in the environment, should the explanations be given in terms of a single agent or of the system as a whole? When can these perspectives be meaningfully separated, and when not?

* Can technical demands for XAI be translated into implementation agnostic architectural, structural or behavioural constraints?

* What information about the agent’s context/the socio-technical system is necessary in the explanation? How does privacy and other ethical values impact the demands for explainability?

We focus on agents due to the unique complexities involved in their emerging behaviour; particularly, in multi-agent systems.

From a bottom-up perspective, taking into account the potentially complex behaviour of an interactive agent, the problem of explainability becomes hard to manage and seemingly only solvable in an implementation-specific manner. Many such implementation-specific approaches exist, often building from particular agent architectures, e.g. BDI.

We propose to adopt a top-down perspective on the topic, by 1) undergoing a comprehensive analysis of the State-of-the-Art on explainability of agent behaviour, 2) elaborating an exhaustive definition of relevant terms and their interpretation, 3) studying relevant evaluation metrics and proposing new ones, and 4) producing a comprehensive taxonomy of behaviour explainability. In this context, we aim at integrating diverse perspectives regarding where and how artificial agents are used in socio-technical systems through real-world representative examples.

Top-down views on the topic of explainable AI are not widely represented in the literature so our proposal should entail a strong contribution to the state of the art. The outcome of this microproject should allow the definition of explainable AI systems under common grounds, cutting down on complexity and driving towards generalization while always taking into account the needs of the audience of the explanations.

Results Summary

The project uncovered that a layered causal structure can tie intention based explanation to artificial agents in an otherwise implementation agnostic way. By observing causal connections between sensory information, knowledge and selected actions agent behavior can be structured along the lines of folk-psychological concepts which can be directly translated to explanations. Since the elements of this structure; beliefs, desire and intentions are all defined only on their causal role, they may be implemented in any fashion.

This causal structure repeats in a hierarchical manner such that higher-level intentional behavior encapsulates and makes use of lower levels. This naturally coincides with nested descriptions of behavior where low level behavior are similarly encapsulated.

The framework we explored during this project highlights this link and structure. The structure can be employed to create artificial agents explainable-by-design, but also to assess and understand the limits of intentional behavior in existing agents.

The project has led to several directions of further research to fully realize the potential of the uncovered relationships.

Contact person: Eric Blaudez, (eric.blaudez@thalesgroup.com)

Internal Partners:

Thales, Eric Blaudez, eric.blaudez@thalesgroup.com
Unibo, Paolo Torrini, p.torroni@unibo.it
CNRS

External Partners:

LISN, Christophe Servan c.servan@qwant.com

The micro-project provides a demonstration of the hierarchical framework for collaboration described in the Humane-AI Net revised strategic work plan, by constructing a multimodal and multilingual conversational agents focused on search. The framework is based on hierarchical levels of abilities:

Reactive (sensori-motor) Interaction: Interaction is tightly-coupled perception-action where actions of one agent are immediately sensed and interpreted as actions of the other. Examples include greetings, polite conversation and emotional mirroring
Situated (Spatio-temporal) Interaction Interactions are mediated by a shared model of objects and relations (states) and shared models for roles and interaction protocols.

In this micro-project, we focused on the 2 first levels (Reactive and Situational) and designed the global framework architecture to show a Proof of Concept (PoC).

Results Summary

We show that the proposed approach provides high-quality semantic segmentation from the robot’s perspective, with accuracy comparable to the original one. In addition, we exploited the gained information and improved the recognition performance of the deep network for the lower viewpoints and showed that the small robot alone is capable of generating high-quality semantic maps for the human partner. The computations are close to real time, so the approach enables interactive applications.

Tangible Outcomes

T-KEIR: https://github.com/ThalesGroup/t-keir
erc-unibo-module: https://github.com/helemanc/erc-unibo-module

Contact person: Frank Dignum (dignum@cs.umu.se)

Internal Partners:

Umeå University (UMU), Frank Dignum
Consiglio Nazionale delle Ricerche (CNR), Eugenia Polizzi, Giulia Andrighetto and Mario Paolucci
Leiden university, Mark Dechesne

External Partners:

UU and National Police in Netherlands, Mijke van den Hurk

In this project we investigate whether normative behavior can be detected in facebook groups. In a first step we hypothesize about possible norms that could lead to a group becoming more extreme on social media, or whether groups that become more extreme will develop certain norms that distinguish them from other groups and that could be detected. An example of such a norm could be that a (self-proclaimed) leader of a group is massively supported by retweets, likes or affirmative messages, along with evidence of verbal sanctioning toward counter-normative replies. Simulations and analyses of historical facebook data (using manual detection in specific case studies and more broadly through NLP) will help reveal the existence of normative behavior and its potential change over time.

Results Summary

The project delivered detailed analyses of the tweets around the USA elections and subsequent riots. Where we thought we might discover some patterns in the tweets indicating more extreme behavior, it appears that extremist expressions are quickly banned from Twitter and find a home in more niche social platforms (in this case Parler). Thus the main conclusion of this project is that we need to find the connections between users in different social media platforms in order to track any extreme behavior.

In order to see how individuals might contribute to behavior that is not in the interest of society we cannot analyze one social media platform. Especially more extremist expressions quickly disappear from main stream social media to niche platforms that can quickly change over time. Thus the connection between individual and societal goals is difficult to observe by just analyzing data from a single social media platform. On the other hand it is very difficult to link users between platforms. Our core contribution could be summarized in 2 points:

Identification of radical behavior in Parler groups
Characterizing the language use of radicalized communities detected on Parler

Tangible Outcomes

Video presentation summarizing the project

Contact person: Elisabetta Biondi, ( elisabetta.biondi@iit.cnr.it )

Internal Partners:

Consiglio Nazionale delle Ricerche (CNR),Elisabetta Biondi, elisabetta.biondi@iit.cnr.it
Central European University (CEU), Janos Kertesz, kerteszj@ceu.edu, Gerardo Iniguez, IniguezG@ceu.edu

The Friedkin-Johnsen model is a very popular model in opinion dynamics, validated on real groups, and well-investigated from the opinion polarization standpoint. Previous research has focused almost exclusively on static networks, where links between nodes do not evolve over time. In this micro-project, we want to fill this gap by designing a variant of the Friedkin-Johnsen model that embeds the dynamicity of social networks. Furthermore, we designed a novel definition of global polarization that combines network features and opinion distribution, to capture the existence of clustered opinions. We have analyzed the polarization effect of the new dynamic model, and identified the impact of the network structure.

Results Summary

Human social networks are very complex systems and their structure has an essential impact on opinion dynamics. However, since my main goal is to study the impact of the opinion dynamics model per se, we decided to deal with two different social network typologies: a Erdős–Rényi (ER) and a stochastic block model (SBM).

— Design of the Friedkin-Johnsen (FJ) dynamic model. We have implemented a rewiring policy that has been extensively studied in discrete opinion diffusion models. This involves substituting edges that connect nodes with different opinions with other edges. We have adapted this scheme to work with the FJ model’s opinions, which are within the range of [-1,1], in both the asynchronous and synchronous versions. According to two parameters θ (the disagreement threshold) and p_rew (the rewiring probability): • With probability 1-p_rew the FJ is applied • With probability p_rew, if i and j disagree, i.e. |x_i-x_j |> θ, the edge (i,j) is replaced with an edge (i,k) where k agrees with i, i.e. |x_i-x_j |<= θ. The above algorithm was specifically designed and implemented for the ER graph. However, in the case of the SBM, I have limited the potential candidates for rewiring to nodes within a maximum of two hops distance. This decision was made to prevent the block structure from becoming entirely irrelevant. The rationale behind this choice is based 38 on the triadic closure mechanism, which suggests that individuals are more inclined to choose new acquaintances among the friends of their friends.

–Design of the polarization metric. The design of the polarization metric involved developing a definition for identifying highly polarized networks. We defined a highly polarized network as one in which there are two distinct opinions that are clustered into two tightly connected communities. To achieve this, we needed to consider both the network structure and the distribution of opinions. Therefore, we decided to use two different metrics to measure these aspects: bimodality for the opinion distribution and homogeneity for its correspondence with the network structure.

— Bimodality. The bimodality coefficient was used to measure the extent to which a distribution is bimodal. It is calculated using the skewness and kurtosis values and represents how similar the distribution is to one with two modes.

— Homogeneity To measure the homogeneity of the opinion distribution with the network structure, we examined the local distribution of nodes’ opinions. We looked at whether each node’s opinion was similar to those of its neighbors, which would suggest that it was in line with the overall opinion distribution over the network. The final homogeneity value was close to zero if the distribution of opinions was close to linear.

–Experimental evaluation. We have developed a Python simulator that can compute the dynamic FJ (rewiring included), and polarization metrics over time based on the given network and initial opinions. To test the model, we ran simulations on a small network comprising 20 nodes and compared the outcomes of the FJ with rewiring to those without rewiring. For the ER network, we used a vector of uniformly distributed opinions over [-1,1] as the initial opinions. However, for the SBM networks, we employed a different configuration, where the initial opinions were uniformly extracted over the intervals [-0.5,0-0.1] and [0.1,0.5], depending on whether the nodes belonged to one or the other block. In conclusion, this Micro-Project involves the design of a dynamic version of the FJ model for synchronous and asynchronous cases. Additionally, we have developed a new definition of polarization that considers both the distribution of opinions and the network topology. To assess the model’s effectiveness, we conducted simulations on two different network types: an ER network and an SBM network. Our findings indicate that the rewiring process has significant effects on polarization, but these effects are dependent on the initial network.

Tangible Outcomes

Github link of the code of the simulator for the new dynamic model: https://github.com/elisabettabiondi/FJ_rewiring_basic.git

Contact person: Carmela Comito, CNR (carmela.comito@icar.cnr.it)

Internal Partners:

Consiglio Nazionale delle Ricerche (CNR), Carmela Comito, carmela.comito@icar.cnr.it
Umeå University (UMU), Nina Khairova, nina.khairova@umu.se
Università di Bologna (UNIBO), Andrea Galassi, p.torroni@unibo.it
TILDE

In this project, we work with a Ukrainian academic refugee, to combine methods for semantic text similarity with expert human knowledge in a participatory way to develop a training corpus that includes news articles containing information on extremism and terrorism.

Results Summary

1) Collection and curation of two event-based datasets of news about Russian-Ukrainianwar.

The datasets support analysis of information alteration among news outlets (agency and media) with a particular focus on Russian, Ukrainian, Western (EU and USA), and international news sources, over the period from February to September 2022. We manually selected some critical events of the Russian-Ukrainian war. Then, for each event, we created a short list of language-specific keywords. The chosen languages for the keywords are Ukrainian, Russian, and English. Finally, besides the scraping operation over the selected sources, we also gather articles using an external news intelligence platform, named Event Registry which keeps track of world events and analyzes media in real-time. Using this platform we were able to collect more articles from a larger number of news outlets and expand the dataset with two distinct article sets. The final version of the RUWA Dataset is thus composed of two distinct partitions.

2) Development of an unsupervised methodology to establish whether news from the various parties are similar enough to say they reflect each other or, instead, they are completely divergent and therefore one is likely not trustworthy. We focused on textual and semantic similarity (sentence embeddings methods such as Sentence-BERT), comparing the news and assess if they have a similar meaning. Another contribution of the proposed methodology is a comparative analysis of the different media sources in terms of sentiments and emotions, extracting subjective points of view as they are reported in texts,

combining a variety of NLP-based AI techniques and sentence embeddings techniques. Finally, we applied NLP techniques to detect propaganda in news article, relying on self supervised NLP systems such as RoBERTa and existing adequate propaganda datasets.

3) Preliminary Qualitative results:

When the events concern civilians all sources are very dissimilar. But Ukraine and Western are more similar. When the event is military targets, Russian and Ukraine sources are very dissimilar from other sources, there is more propaganda in Ukraine and Russian Ones.

Tangible Outcomes

Github repository of datasets and software: https://github.com/fablos/ruwa-dataset

Contact person: Andrea Galassi (a.galassi@unibo.it)

Internal Partners:

University of Bologna, Andrea Galassi, a.galasi@unibo.it

External Partners:

Uppsala University, Ana Tanevska, ana.tanevska@it.uu.s

As AI-powered devices, software solutions, and other products become prevalent in everyday life, there is an urgent need to prevent the creation or perpetuation of stereotypes and biases around gender, age, race, as well as other social characteristics at risk of discrimination.

There are well-documented limitations in our practices for collecting, maintaining, and distributing the datasets used in current ML models. Moreover, these AI/ML systems, their underlying datasets, as well as the stakeholders involved in their creation, often do not reflect the diversity in human societies, thus further exacerbating structural and systemic biases. Thus, it is critical for the AI community to address this lack of diversity, acknowledge its impact on technology development, and seek solutions to ensure diversity and inclusion.

Audio is a natural way of communicating for humans and allows the expression of a wide range of information. Its analysis through AI applications can provide insights regarding the emotions and inner state of the speaker, information that cannot be captured by simply analyzing text. The analysis of the speech component is valuable in any AI application designed for tasks requiring an understanding of human users behind their explicit textual expressions, such as the research area of affective computing.

Affective computing refers to the study and development of systems and devices that can recognize, interpret, and simulate human emotions and related affective phenomena. Most of the currently available speech datasets face significant limitations, such as a lack of diversity in the speaker population, which can affect the accuracy and inclusivity of speech recognition systems for speakers with different accents, dialects, or speech patterns.

Other limitations include narrow context and small scale of recordings, data quality issues, limited representation, and limited availability of data. These issues must be carefully addressed when selecting and using speech datasets in an effective computing context, to ensure that speech recognition systems can effectively contribute to applications such as intelligent virtual assistants, mental health diagnosis, and emotion recognition in diverse populations.

In this MP, we aim to contribute towards the creation of future datasets and to facilitate a more aware use of existing ones. We propose to perform an extensive review of the literature on the topic, in particular existing speech datasets, with two main objectives.

First, we identify the key characteristics required in the creation of unbiased and inclusive speech datasets and how such characteristics should be pursued and validated.

Second, we perform a meta-analysis of the domain, focusing on the underlying limitations in the existing datasets. We provide a critical evaluation of the datasets themselves, but also of the scientific articles in which they were presented. Such a fine-grained analysis will allow us to elaborate on a more general and coarse-grained evaluation of the domain.

Results Summary

In this micro-project, we addressed the domain of speech datasets for mental health and neurological disorders. We created a set of 7 desiderata for building these datasets, distilled it into a checklist of 20 elements that can be used as a tool for analysis of existing works and as guidance for future works, and finally surveyed existing literature to analyze and discuss current practices. Our set of desiderata is the first to specifically address this domain and considers both aspects that are relevant in terms of ethics and societal impact, such as “Fairness, Bias, and Diversity”, but also aspects that are more technical and domain-specific, such as the details of the recording and the involvement of medical experts in the study.

In our survey of existing literature, we identified key areas for improvement in resource creation and use. For example, several of the examined papers do not report on informed consent and accountability. Our findings highlighted the importance of involving experts from several different disciplines (e.g., computer science, medicine, social science, and law) when conducting studies in such a critical domain. These results also confirm the importance of the dissemination of principles and best practices across different disciplines.

Tangible Outcomes

[arxiv] [under review] 1.A pre-print currently under review: Mancini, E., Tanevska, A., Galassi, A., Galatolo, A., Ruggeri, F., & Torroni, P. (2024). Promoting Fairness and Diversity in Speech Datasets for Mental Health and Neurological Disorders Research. arXiv preprint arXiv:2406.04116 (under review in JAIR (journal of artificial intelligence research))
https://arxiv.org/abs/2406.04116
A GitHub repository with detailed analysis of literature Detailed analysis of containing 36 existing datasets and papers according to our desiderata and checklist
https://github.com/nlp-unibo/ethical-survey-speech
Invited talk “Towards an Ethical and Human-centric Artificial Intelligence: two case studies on fairness in Dialogue Systems and Speech Datasets”, at “2nd Workshop on Inside the Ethics of AI Awareness”, November 11th 2024, Uppsala, organized as part of the Horizon Europe project SymAware

Contact person: Eugenia Polizzi

Internal Partners:

Consiglio Nazionale delle Ricerche (CNR), ISTC: Eugenia Polizzi)
Fondazione Bruno Kessler (FBK), Marco Pistore

The goal of the project is to investigate the role of social norms on misinformation in online communities. This knowledge can help identify new interventions in online communities that help prevent the spread of misinformation. To accomplish the task, the role of norms was explored by analyzing Twitter data gathered through the Covid19 Infodemics Observatory, an online platform developed to study the relationship between the evolution of the COVID-19 epidemic and the information dynamics on social media. This study can inform a further set of microprojects addressing norms in AI systems through theoretical modelling and social simulations.

Results Summary

In this MP, we diagnosed and visualized a map of existing social norms underlying fake news related to COVID19. Through the analysis of millions of geolocated tweets collected during the Covid-19 pandemic we were able to identify the existence of structural and functional network features supporting an “illusion of the majority” on Twitter. Our results suggest that the majority of fake (and other) contents related to the pandemic are produced by a minority of users and that there is a structural segmentation in a small “core” of very active users responsible for large amount of fake news and a larger “periphery” that mainly retweets the contents produced by the core. This discrepancy between the size and identity of users involved in the production and diffusion of fake news suggests that a distorted perception of what users believe is the majority opinion may pressure users (especially those in the periphery) to comply with the group norm and further contribute to the spread of misinformation in the network.

Tangible Outcomes

The voice of few, the opinions of many: evidence of social biases in Twitter COVID-19 fake news sharing – Piergiorgio Castioni, Giulia Andrighetto, Riccardo Gallotti, Eugenia Polizzi, Manlio De Domenico https://arxiv.org/abs/2112.01304
Video presentation summarizing the project

Contact person: Jonne Maas (j.j.c.maas@tudelft.nl)

Internal Partners:

TU Delft, Jonne Maas, Juan M. Duran, and Jeroen van den Hoven
Umeå University (UMU), Virginia Dignum, and Luis Gustavo Ludescher

The purpose of this micro-project is to critically reflect on the design of an AI system by investigating the role of the designer. Designers make choices during the design of the system. Analysing these choices and their effective consequences contributes to an overall understanding of the situated knowledge embedded in a system. The reflection is concerned with questions like what does it mean for the output of the system what the designer’s interpretations are? In what way do they then exercise power on this system? In particular, this micro-project will examine a concrete case. It will follow the design of an agent-based social simulation that aims at modelling how inequality affects democracy.

Tangible Outcomes

Video presentation summarizing the project

Contact person: Karen Joisten, Ettore Barbagallo (karen.joisten@rptu.de; ettore.barbagallo@rptu.de)

Internal Partners:

RPTU kaiserslautern

This micro-project started from the consideration that AI systems not only are improving in doing what they are expected to do but are also developing a characteristic that no other technological artifact displays, that is, the resemblance with biological systems. This resemblance explains the tendency present not only in the media and social media but also within academic milieus to use anthropomorphic language when talking about what AI systems do or are (“intelligence,” “agency,” “autonomy,” “life cycle,” “learning,” “knowing,” “discriminating,” etc.). Joisten and Barbagallo attempted to discern cases in which anthropomorphic language is inevitable and can scarcely be replaced by better linguistic alternatives, and cases in which philosophers, scientists, and engineers should work together to find more suitable language solutions. The goal of the micro-project was not to suggest that anthropomorphic language use is in all cases incorrect and should always be replaced by more correct language use. The project leader instead adopted an ethical perspective, arguing that the real risk of unconsciously using anthropomorphic language when speaking of AI systems is not the humanization of AI but the mechanization of human life. The expression “mechanization of human life” refers here to the possible shift in the way human beings intellectually comprehend and emotionally perceive their humanity. Mechanization of human life, therefore, takes place when AI becomes the model and framework of human self-comprehension. An important takeaway of the micro-project is that the issue of anthropomorphic AI language and mechanization of human life cannot be addressed and solved by a single isolated discipline but only in an interdisciplinary effort in which philosophers, ethicists, computer scientists, engineers, social scientists, linguists, and jurists share their expertise.

Results Summary

The main focus of the micro-project was on the ethical consequences of language use when discussing AI systems and human-AI interaction. The research project was conducted in three phases.

1) In the first phase (May to June 2023), the project examined various AI guidelines, such as the Ethics Guidelines for Trustworthy AI (AI HLEG 2019) and others, with the aim of analyzing the language used to describe AI’s functionality and the interrelation between humans and machines. The project’s philosophical emphasis on problems related to language use was based on the phenomenological observation that language shapes our cognitive and emotional relationships to ourselves, the world, and also to our technological artifacts, including AI.

2) In the second phase (July to August 2023), the study addressed more general issues that emerged from the analysis and comparison of the examined AI guidelines. Despite the evident efforts of the guidelines’ authors to use technical, scientific, neutral, and objective language, Joisten and Barbagallo identified several terms—such as “agency,” “learning,” “decision making,” and “autonomy”—that indicate a tendency toward anthropomorphic language. The questions posed by the project implementers were: when is it philosophically and ethically acceptable to employ humanizing language when speaking of AI’s functioning? And when is it more appropriate to replace the terms mentioned above with more suitable concepts?

3) The third phase of the project (October 2023 to February 2024) involved the Research Seminar “Ethics and AI,” which was directed at PhD students, postdoctoral scholars, and research fellows, and took place at the University of Kaiserslautern-Landau (Campus Kaiserslautern) in the winter term of 2023-2024. The seminar gave Joisten and Barbagallo the opportunity to present and discuss their findings with other researchers and colleagues. In the seminar, the project leaders aimed to show that the main ethical risk of using anthropomorphic language in relation to AI is not the humanization of AI, but rather the mechanization of human life.

Tangible Outcomes

Research seminar: Ethics and AI for PhD students, postdoctoral scholars, and research fellows in University of Kaiserslautern-Landau (Winter 2023-2024) https://www.kis.uni-kl.de/campus/all/event.asp?gguid=0x36775C8D0E68413D87103D67948EF327&tguid=0x3E97C1E01A714B9F9C0BEE5AB4FFE5FC

Contact person: Bettina Fazzinga, Andrea Galassi (bettina.fazzinga@unical.it ; a.galassi@unibo.it)

Internal Partners:

Consiglio Nazionale delle Ricerche (CNR), Bettina Fazzinga
Università di Bologna (UNIBO), Paolo Torroni

Building AI machines capable of making decisions compliant with ethical principles is a challenge that needs to be faced in the direction of improving reliability and fairness in AI.

This micro-project aims at combining argument mining and argumentation-based reasoning to ensure ethical behaviors in the context of chatbot systems. Argumentation is a powerful tool for modeling conversations and disputes. Argument mining is the automatic extraction of arguments from natural language inputs, which could be applied both in the analysis of user input and in the retrieval of suitable feedbacks to the user. We aim to augment classical argumentation frameworks with ethical and/or moral constraints and with natural language interaction capabilities, in order to guide the conversation between chatbots and humans in accordance with the ethical constraints.

Results Summary

We propose a general-purpose dialogue system architecture that leverages computational argumentation and state-of-the-art language technologies to implement ethics by design.

In particular, we propose a chatbot architecture that relies on transparent and verifiable methods and is conceived so as to respect relevant data protection regulations. Importantly, the chatbot is able to explain its outputs or recommendations in a manner adapted to the intended (human) user.

We evaluate our proposal against a covid-19 vaccine case study.

Tangible Outcomes

Fazzinga, Bettina, Andrea Galassi, and Paolo Torroni. “An argumentative dialogue system for COVID-19 vaccine information.” In International Conference on Logic and Argumentation, pp. 477-485. Cham: Springer International Publishing, 2021. https://link.springer.com/chapter/10.1007/978-3-030-89391-0_27 https://arxiv.org/abs/2107.12079
Video presentation summarizing the project

Contact person: Juan Carlos Nieves (jcnieves@cs.umu.se)

Internal Partners:

Umeå University – Computing Science Department, Juan Carlos Nieves, jcnieves@cs.umu.se

External Partners:

Umeå University – Police Education Unit, Jonas Hansson, jonas.hansson@umu.se
Comet Global Innovation-COMET, Rocio Salguero Martinez, r.salguero@comet.technology
Institut de Seguretat Pública de Catalunya -ISPC, Lola Vales Port, lvalles@gencat.ca

We will address the ethical and societal elements of the ELS theme of HumanE-AI-Net, focusing on how to construct Artificial Intelligent Systems (AIS) whose functions aim to support citizen security and safety units. Citizen Security and Safety Units are those closest to the citizens, having the largest number of officers. Their tasks include to help a disoriented old person, deal with traffic and face dangerous situations, gang fights or shootings. The units training is generalist, lacking training and supporting tools to deal with certain situations, in contrast to specialized units. Units need to train situational awareness (ability to maintain a constant, clear mental picture of relevant information and tactical situation in all types of situations). We aim to provide AI tools to facilitate their work and improve their own safety, efficiency, protecting citizens’ rights and enhancing their trust. Transparency and trustworthiness are the most limiting factors regarding the development of AI solutions for public safety and security forces. Any development should be useful to citizens and officers alike. We will carry out tests using the mixed reality paradigm, i.e. HoloLens, with police officers to collect data, making and assessment of the implementation of Trustworthy AI requirements in the following police intervention scenario:

Vehicle stop. Police officers usually patrol a city in police cars facing all types of situations, that in any moment could scale from going to low risk (traffic preventive tasks) to high risk (tracking a possible suspect of a crime that is travelling at high speed). This scenario offers a pretty common activity for police officers. The project’s interest will be to address the perception and impact of the use of technology that could support the security forces to face these daily tasks, making their work safer (i.e., using drones to track suspects in case they are armed). We select this scenario due to its multiple implications, to assess the relationship of public security officers with AI, and to address possible several societal and legal challenges: -Societal: Use of AI to detect potentially life-threatening situations or vehicles related to crimes. Ensure that fundamental rights like privacy and non-discrimination are preserved, while at the same time guarantee public safety. -Legal: Personal Data Protection issues, possible fundamental rights violations related to the use of cameras, legal barriers on Aerial Robot Regulation and the use of UAVS in public spaces.

Results Summary

The project was a multidisciplinary collaboration bridging police academies, authorities, and computing science, in Sweden and Catalonia (Spain). The main result of the project was a Trustworthy AI Model and its integration with mixed reality interfaces, our collaborative effort seeks to empower officers with enhanced situational awareness and decision-making support. By leveraging context information, user insights, and ethical considerations, we strive to ensure that AI-empowered police interventions are not only effective but also ethical, legal, and socially responsible.

The Trustworthy AI Model is a result of two user studies, one with the Mossos d’Esquadra, the police authority in Barcelona, and one the police education unit at Umeå Sweden. The study included the participation of 39 senior police officers, 20 from Barcelona and 19 from Umeå. The formal publication of the Trustworthy AI Model is in process.

Tangible Outcomes

David Martín-Moncunill, Eduardo García Laredo, Juan Carlos Nieves: POTDAI: “”A Tool to Evaluate the Perceived Operational Trust Degree in Artificial Intelligence Systems””. IEEE Access 12: 133097-133109 (2024) https://ieeexplore.ieee.org/document/10663721
[under review] Andreas Brännström, Eduardo Garcia Laredo, Bernat Vivolas Jorda, Lola Valles, Jonas Hansson, Emili Martinez Cañaveras, Anders Schogster, David Martin-Moncunill, Juan Carlos Nieves: “Trustworthy AI and Mixed Reality in Police Interventions: Challenges and Opportunities”.
Two trustworthy seminars were given one with:
1. the Mossos d’Esquadra, the police authority in Barcelona.
2. the police education unit at Umeå Sweden.
video demonstrating the project: https://www.svt.se/nyheter/lokalt/vasterbotten/ai-for-poliser-utvecklas-i-umea

Contact person: Elisabeth Stockinger (elisabeth.stockinger@gess.ethz.ch )

Internal Partners:

ETHZ, Elisabeth Stockinger, elisabeth.stockinger@gess.ethz.ch
UMU, Virginia Dignum, virginia@cs.umu.se
TU Delft, Jonne Maas, J.J.C.Maas@tudelft.nl

External Partners:

University of Amsterdam Christopher Talvitie, christalvitie@gmail.com

Voting Advice Applications (VAAs) are increasingly popular throughout Europe. While commonly portrayed as impartial tools to measure issue agreement, their developers must take several design decisions at each step of the design process. Such decisions may include the selection of issues to incorporate into a questionnaire, the placement of candidates or parties on a political spectrum, or the algorithm measuring the distance between user and candidate. These decisions have to be made with great care, as they can cause substantial differences in the resulting list of recommendations.

As there is no known ground truth by which to measure different VAA designs, it is imperative that their design follows guidelines and best practices of pro-ethical design. Similarly, as VAAs aim to directly inform voter decisions in a democratic election, users must be able to trust the fulfillment of these guidelines based on the information available to them.

Results Summary

Firstly, we conduct an ethics assessment of several VAAs used in European countries, representing different design strategies. This assessment focuses on trustworthiness in the eyes of the electorate, focusing on user-centric documentation. By using the Ethics Guidelines for Trustworthy AI, we refer to a framework that is acknowledged by the democratic institutions of the countries hosting the VAAs and the respective elections, contributing to the democratic validity of a normative analysis of tools embedded in electoral processes.

Secondly, we identify the abstract criteria that a trustworthy VAA must fulfil according to the EGTAI, and accordingly evaluate a representative set of VAAs within Europe (StemWijzer, Kieskompas What2Vote, Smartvote, Wahl-O-Mat, Aftonbladets valkompass, HS Vaalikone and SVT Nyheters valkompass). None of the VAAs under investigation scored highly on the adapted EGTAI assessment list. For several requirements, many sub-requirements are not fulfilled by any VAA in this study. In particular, scores on societal and environmental well-being (R6) or accountability (R7) are low without significant differences between VAAs.

Thirdly, we present a list of recommendations based on these issues to contribute to future VAA development efforts. Across VAAs, we identify the need for improvement in (i) transparency regarding the subjectivity of recommendations, (ii) diversity of stakeholder participation, (iii) user-centric documentation of algorithm, and (iv) disclosure of the underlying values and assumptions.

Tangible Outcomes

Stockinger, E., Maas, J., Talvitie, C. et al. Trustworthiness of voting advice applications in Europe. Ethics Inf Technol 26, 55 (2024). https://doi.org/10.1007/s10676-024-09790-6
Dataset showing the evaluated VAAs and the frameworks used to evaluate them https://static-content.springer.com/esm/art%3A10.1007%2Fs10676-024-09790-6/MediaObjects/10676_2024_9790_MOESM1_ESM.pdf
The code used for the analysis: https://github.com/ethz-coss/vaa-egtai-compliance
A video explaining Robust and Value-Based Political Guidance to the general public https://www.youtube.com/watch?v=5riTfDuRTlk&ab_channel=ComputationalSocialScienceETH
The project was presented at:

1. the Digital Democracy Workshop (2023, http://digdemlab.io/event/wk2023 )
2. the Workshop on Co-Creating the Future: Participatory Cities and Digital Governance (2023, http://www.participatorycities.net )

1. the 1st Twin Workshop on Ethics of Smart Cities and Smart Societies (2023, http://coss.ethz.ch/research/CoCi/TwinWorkshop )
2. the HumanE AI Conference (2022, http://www.humane-ai.eu/event/humane-ai-conference ).

Knowledge 4 All Foundation Ltd.
Betchworth House
57-65 Station Road
Redhill, Surrey, RH1 1DL

Humane AI on Social Media

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 952026.

[TMP-079] Metrics for Explainable Agent Behaviour

Results Summary

[TMP-083] Multilingual & Multimodal conversational agent

Results Summary

Tangible Outcomes

[TMP-093] Normative behavior and extremism in facebook groups

Results Summary

Tangible Outcomes

[TMP-097] Polarization with the Friedkin-Johnsen model over a dynamic social network

Results Summary

Tangible Outcomes

[TMP-062] Human-machine collaboration for content analysis in context of Ukranian war

Results Summary

Tangible Outcomes

[TMP-102] Promoting Fairness and Diversity in Speech Datasets for Affective Computing

Results Summary

Tangible Outcomes

[TMP-112] Social Norms to counteract misinformation in human-AI hybrid systems

Results Summary

Tangible Outcomes

[TMP-119] The role of designers regarding AI design: a case study

Tangible Outcomes

[TMP-042] Ethical implications of language use with special consideration of the Ethics Guidelines for Trustworthy AI

Results Summary

Tangible Outcomes

[TMP-041] Ethical chatbots

Results Summary

Tangible Outcomes

[TMP-121] To develop a trustworthy AI model for situation awareness by using mixed reality in Police interventions.

Results Summary

Tangible Outcomes

[TMP-124] Trustworthy Voting Advice Applications

Results Summary

Tangible Outcomes

Knowledge 4 All Foundation Ltd.

Humane AI on Social Media