Using Dynamic Epistemic Logic (DEL) so an AI system can proactively make announcements to avoid undesirable future states arising from the human's false beliefs

Previously we have investigated how an AI system can be proactive, that is, act anticipatorily and on its own initiative, by reasoning on current and future states, mentally simulating actions and their effects, and assessing what is desirable. In this micro-project we want to extend our earlier work with epistemic reasoning. That is, we want to reason about the knowledge and beliefs of the human and thereby inform the AI system about what kind of proactive announcement to make. As in our previous work, we will consider which states are desirable and which are not, and we will also take into account how the state will evolve into the future if the AI system does not act. In addition, we now want to consider the human's false beliefs. It is not necessary, and in fact not desirable, to make announcements correcting every false belief the human may have. For example, if the human is watching TV, she need not be informed that the salt is in the red container and the sugar in the blue one while she believes it is the other way around. On the other hand, when the human starts cooking and is about to use the content of the blue container believing it is salt, then it is relevant for the AI system to announce what is actually the case, to avoid an undesirable outcome. The example shows that we need to investigate not only what to announce but also when to announce it.

The methods we will use in this micro-project are knowledge-based; more precisely, we will employ Dynamic Epistemic Logic (DEL). DEL is a modal logic extending Epistemic Logic that allows modeling changes in the knowledge and beliefs of an agent herself and of other agents.
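To make the intended modelling style concrete, the following toy Python sketch (our own illustration under strong simplifications, not the formal DEL machinery the project will develop; all world and proposition names are invented) encodes the salt/sugar example as a single-agent plausibility model in which a truthful public announcement removes the worlds it excludes:

```python
# Toy single-agent plausibility model with a truthful public announcement.
# This is an illustrative sketch only, not the project's formal DEL model.

class PlausibilityModel:
    def __init__(self, worlds, plausibility, actual):
        self.worlds = worlds              # world name -> valuation (dict of propositions)
        self.plausibility = plausibility  # world name -> rank (lower = more plausible to the human)
        self.actual = actual              # the actual world

    def believes(self, prop):
        """The human believes `prop` iff it holds in all most-plausible remaining worlds."""
        best = min(self.plausibility[w] for w in self.worlds)
        return all(self.worlds[w][prop]
                   for w in self.worlds if self.plausibility[w] == best)

    def announce(self, prop):
        """Truthful public announcement: restrict the model to worlds where `prop` holds."""
        assert self.worlds[self.actual][prop], "only truthful announcements"
        keep = {w for w, v in self.worlds.items() if v[prop]}
        self.worlds = {w: self.worlds[w] for w in keep}
        self.plausibility = {w: self.plausibility[w] for w in keep}

# Actual state: the salt is in the blue container, but the human finds the
# opposite more plausible (a false belief).
worlds = {"w_blue": {"salt_in_blue": True}, "w_red": {"salt_in_blue": False}}
m = PlausibilityModel(worlds, plausibility={"w_blue": 1, "w_red": 0}, actual="w_blue")

print(m.believes("salt_in_blue"))   # False: the human's belief is mistaken
m.announce("salt_in_blue")          # proactive announcement by the AI system
print(m.believes("salt_in_blue"))   # True: the undesirable outcome can be avoided
```

In the full DEL setting the same update is expressed with event models and multi-agent accessibility relations; the sketch only shows why a well-timed announcement can repair a decision-relevant false belief.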

Output

– Formal models for proactive announcement based on DEL
– Publication in an international peer-reviewed high-quality conference

Project Partners

  • Örebro University, ORU, Jasmin Grosinger
  • Technical University of Denmark (DTU), Thomas Bolander

Primary Contact

Jasmin Grosinger, Örebro University, ORU

This project targets the design of grounded dialogue models from observation of human-to-human conversation, typically from a set of recordings. It will deliver trustworthy conversation models as well as a tool for the analysis of dialogue behavior.

This microproject aims to design grounded dialogue models based on observation of human-to-human dialogue examples, i.e., distilling dialogue patterns automatically and aligning them to external knowledge bases. Current state-of-the-art conversation models based on fine-tuned large language models are either not grounded and merely mimic their training data, or their grounding is external and needs to be hand-designed. Furthermore, most commercially deployed dialogue models are entirely handcrafted.
Our goal is to produce grounding for these models (semi-)automatically, using dialogue context embedded in vector spaces via large language models trained specifically on conversational data. If we represent dialogue states as vectors, the whole conversation can be seen as a trajectory in the vector space. By merging, pruning, and modeling these trajectories, we can obtain dialogue skeleton models in the form of finite-state graphs or similar structures. These models could be used for data exploration and analysis, content visualization, topic detection, or clustering. This can bring faster and cheaper design of fully trustworthy conversation models. The approach will serve both to provide external model grounding and to analyze the progress of human-to-human dialogues, including negotiation around the participants’ common ground.
The microproject will investigate the optimal format of the dialogue context embeddings (such as temporal resolution) as well as the optimal ways of merging dialogue trajectories and distilling models. Here, Variational Recurrent Neural Networks with discrete embeddings (Shi et al., NAACL 2019) are a promising architecture, but alternatives will also be considered.
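As a rough baseline of the trajectory-to-skeleton idea, the following sketch (our own simplification, assuming numpy and scikit-learn; the project itself targets LLM-based conversational embeddings and VRNN-style discrete latents rather than the toy hashing encoder and k-means used here) clusters turn embeddings into discrete states and merges the resulting trajectories into a finite-state transition graph:

```python
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def embed_turn(text, dim=64):
    """Placeholder turn encoder: hashed bag-of-words (stand-in for an LLM-based encoder)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def skeleton(dialogues, n_states=3):
    """Cluster turn embeddings into discrete states and merge trajectories into transition counts."""
    turns = [t for d in dialogues for t in d]
    X = np.stack([embed_turn(t) for t in turns])
    states = KMeans(n_clusters=n_states, n_init=10, random_state=0).fit_predict(X)
    edges, i = Counter(), 0
    for d in dialogues:                              # map each dialogue back to its state trajectory
        traj = [int(s) for s in states[i:i + len(d)]]
        i += len(d)
        edges.update(zip(traj[:-1], traj[1:]))       # count state-to-state transitions
    return edges

dialogues = [
    ["i need a cheap hotel in the centre", "what price range do you want", "cheap please", "booked"],
    ["find me a restaurant", "which area do you prefer", "the centre", "booked"],
]
print(skeleton(dialogues))   # transition counts = edges of the dialogue skeleton graph
```

In practice the turn encoder would be replaced by a conversationally trained language model, and the transition counts could be pruned or smoothed before the graph is used for analysis or grounding.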
We plan to experiment with both textual and voice-based dialogues. We will use the MultiWOZ corpus (Budzianowski et al., EMNLP 2018) as well as the DIASER extension developed in a Humane AI microproject by CUNI+LIMSI (Hudecek et al., LREC 2022) for text-based experiments. For voice-based experiments, we will use MultiWOZ spoken data released for the DSTC11 Challenge and dialogue data currently developed in a Humane AI microproject by BUT+CUNI.
The work will be done as part of the JSALT workshop hosted by the University of Le Mans, France, and co-organized by Johns Hopkins University (JHU) and Brno University of Technology.
https://jsalt2023.univ-lemans.fr/en/index.html

The JSALT workshop topic leader is Petr Schwarz from BUT (MP partner). The topic passed a scientific review by about 40 researchers in Baltimore, USA, in December 2022 and was selected as one of the four workshop topics.
https://jsalt2023.univ-lemans.fr/en/automatic-design-of-conversational-models-from-observation-of-human-to-human-conversation.html

Workshop Topic Proposal: https://docs.google.com/document/d/19PAOkquQY6wnPx_wUXIx2EaInYchoCRn/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true
Workshop Topic Presentation: https://docs.google.com/presentation/d/1rt7OFvIu34c3OCXtAkoGjMhVXrcxCSXz/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true

Workshop Team:
https://docs.google.com/spreadsheets/d/1EsHZ-_OREkvf8ODiN7759OYHqSb6MBAX/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true

The workshop and attendees will be supported by several sources – JHU sponsors, the European Esperanto project, private companies, and the HumanE AI project. Ondrej Dusek from CUNI is responsible for the HumanE AI participants (as MP PI). The aim is to cover mainly travel, accommodation, and per diem for participants traveling to Le Mans, as well as some preparation.

Co-locating the four workshop topics, with teams including world-leading researchers and an initial summer school, gives participants an excellent opportunity for networking and personal growth, with high visibility and high impact of the work results. We expect that this effort can start new long-term collaborations among the participating institutions.

Output

– Software – code for dialogue embeddings & trajectory merging
– Trained embedding models
– Paper describing the dialogue skeleton models

Project Partners

  • Brno U, Petr Schwarz
  • CUNI, Ondrej Dusek
  • Eötvös Loránd University (ELTE), Andras Lorincz
  • IDIAP, Petr Motlicek

Primary Contact

Ondrej Dusek, CUNI (Charles University)

The project aims to explore multi-modal interaction concepts for collaborative creation of 3D objects in virtual reality with generative AI assistance.

# Motivation
The use of generative AI in the creation of 3D objects has the potential to greatly reduce the time and effort required from designers and developers, resulting in more efficient and effective creation of virtual 3D objects. Yet, research still lacks an understanding of suitable interaction modalities and common grounding in this field.

# Objective
The objective of this research project is to explore and compare interaction modalities that are suited to collaboratively create virtual 3D objects together with a generative AI. To this end, the project aims to investigate how different input modalities, such as voice, touch and gesture recognition, can be used to generate and alter a virtual 3D object and how we can create methods for establishing common ground between the AI and the users.

# Methodology
The project is split into two work packages. (1) We investigate and evaluate the use of multi-modal input modalities to alter the shape and appearance of 3D objects in virtual reality (VR). (2) Based on our insights into promising multi-modal interaction concepts, we then develop a prototypical multi-modal VR interface that allows users to collaborate with a generative AI on the creation of 3D objects. This might include, but is not limited to, the AI assistant generating 3D models (e.g., using https://threedle.github.io/text2mesh or Shap-E) or providing suggestions based on the users' queries.
The project will use a combination of experimental and observational methods to evaluate the effectiveness and efficiency of the concepts. This will involve conducting controlled experiments to test the effects of different modalities and AI assistance on the collaborative creation process, as well as observing and analyzing the users’ behavior.
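As a purely hypothetical sketch of the WP2 interaction loop (all function names and data structures below are our own placeholders, not the API of text2mesh, Shap-E, or any VR toolkit), voice input could supply the textual edit intent while gestures supply the target object and magnitude, with the generative backend injected as a pluggable callable:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class EditRequest:
    prompt: str              # natural-language description from speech
    target: Optional[str]    # object currently selected by pointing
    scale: float             # magnitude derived from a two-handed gesture

def fuse(voice_text: str, pointed_object: Optional[str], gesture_scale: float) -> EditRequest:
    """Combine modalities: speech gives the 'what', gestures give the 'where' and 'how much'."""
    return EditRequest(prompt=voice_text, target=pointed_object, scale=gesture_scale)

def apply_edit(request: EditRequest, generate_mesh: Callable[[str], str]) -> dict:
    """Send the fused request to a pluggable text-to-3D backend (injected as a callable)."""
    mesh = generate_mesh(request.prompt)          # placeholder for a real text-to-3D model
    return {"mesh": mesh, "attach_to": request.target, "scale": request.scale}

# Usage with a dummy backend standing in for a generative model.
result = apply_edit(
    fuse("make the roof more curved", pointed_object="house_01", gesture_scale=1.3),
    generate_mesh=lambda prompt: f"<mesh for '{prompt}'>",
)
print(result)
```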

# Expected outcomes
The research project is expected to produce several outcomes, including a software package to prototype multi-modal VR interfaces that enables collaborative creation of 3D objects, insights into the effectiveness and efficiency of different modalities and AI assistance in enhancing the collaborative process, and guidelines for the design of multi-modal interfaces and AI assistance for collaborative creation of 3D objects. The project's outcomes may have potential applications in fields such as architecture, engineering, and entertainment.

# Relation to call
This research project is directly related to the call for proposals as it addresses the challenge of coordination and collaboration between AI and human partners in the context of creating 3D objects. The project involves the use of multi-modal interfaces and AI assistance to enhance the collaborative process, which aligns with the call's focus on speech-based and multimodal interaction with AI. Additionally, the project's investigation of co-adaptive processes in collaborative creation aligns with the call's focus on co-adaptive processes in grounding. The project's outcomes, such as the development of guidelines for the design of multi-modal interfaces and AI assistance for collaborative creation, may also contribute to the broader theme of interactive grounding. Finally, the project's potential applications in architecture, engineering, and entertainment also align with the call's focus on special application areas.

Output

1. VR co-creation software package: The project aims to develop a publicly available open-source software package to quickly prototype multi-modal VR interfaces for co-creating virtual 3D objects. It enables practitioners and VR application developers to more easily create virtual 3D objects without requiring expert knowledge in computer-aided design.
2. Recorded dataset and derived guidelines for the design of multi-modal interfaces with AI assistance: The project aims to publish all recorded datasets and will further provide a set of guidelines for the design of efficient and effective multi-modal interfaces for generating and altering 3D objects with an AI assistant.
3. We aim to publish the results of this research as a paper in a leading XR or HCI venue, such as CHI, UIST, or ISMAR.

Project Partners

  • Københavns Universitet (UCPH), Teresa Hirzle
  • Ludwig-Maximilians-Universität München (LMU), Florian Müller/Julian Rasch
  • Saarland University, Martin Schmitz

Primary Contact

Teresa Hirzle, Københavns Universitet (UCPH)

We are going to build and evaluate a novel AI aviation assistant that supports (general) aviation pilots with key flight information facilitating decision making, placing particular emphasis on its efficient and effective visualization in 3D space.

Pilots frequently need to react to unforeseen in-flight events. Taking adequate decisions in such situations requires considering all available information and demands strong situational awareness. Modern on-board computers and technologies like GPS have radically improved pilots’ ability to take appropriate actions and lowered their workload in recent years. Yet, current technologies used in aviation cockpits generally still fail to adequately map and represent 3D airspace. In response, we aim to create an AI aviation assistant that considers all relevant aircraft operation data, focuses on providing tangible action recommendations, and visualizes them for efficient and effective interpretation in 3D space. In particular, we note that extended reality (XR) applications provide an opportunity to augment pilots’ perception through live 3D visualizations of key flight information, including airspace structure, traffic information, airport highlighting, and traffic patterns. While XR applications have been tested in aviation in the past, applications are mostly limited to military aviation and the latest commercial aircraft. This ignores the majority of pilots, in particular in general aviation, where such support could drastically increase situational awareness and lower pilot workload. General aviation is characterized as the non-commercial branch of aviation, often relating to single-engine and single-pilot operations.
To develop applications usable across aviation domains, we plan to create a Unity project for XR glasses. Based on this, we plan, in a first step, to systematically and iteratively explore suitable AI-based support, informed by pilot feedback, in a virtual reality study in a flight simulator. Based on our findings, we will refine the Unity application and investigate opportunities to conduct a real test flight with our external partner ENAC, the French National School of Civil Aviation, which owns an aircraft. Such a test flight would most likely use the latest augmented reality headsets such as the HoloLens 2. Considering the immense safety requirements for such a real test flight, this part of the project is considered optional at this stage and depends on the findings from the preceding virtual reality evaluation.
The system development will particularly focus on the use of XR techniques to create more effective AI-supported traffic advisories and visualizations. With this, we want to advance the coordination and collaboration of AI with human partners, establishing a common ground as a basis for multimodal interaction with AI (WP3 motivated). Further, the MP relates closely to “Innovation projects (WP6&7 motivated)”, calling for solutions that address “real-world challenges and opportunities in various domains such as (…) transportation […]”.
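For illustration only, a decision rule of the kind the assistant could use might weigh estimated pilot workload against traffic proximity before rendering an advisory in the XR view; all thresholds, inputs, and names in this sketch are invented:

```python
# Illustrative sketch of an advisory-triggering rule; not part of the planned Unity system.
from dataclasses import dataclass

@dataclass
class Traffic:
    distance_nm: float      # horizontal distance to own aircraft, nautical miles
    vertical_ft: float      # vertical separation, feet

def should_display_advisory(workload, traffic, max_workload=0.8,
                            near_nm=5.0, near_ft=1000.0):
    """Suppress low-priority advisories under high workload; always show urgent ones."""
    urgent = traffic.distance_nm < 2.0 and abs(traffic.vertical_ft) < 500.0
    relevant = traffic.distance_nm < near_nm and abs(traffic.vertical_ft) < near_ft
    return urgent or (relevant and workload < max_workload)

print(should_display_advisory(0.9, Traffic(4.0, 800.0)))   # False: relevant, but pilot overloaded
print(should_display_advisory(0.9, Traffic(1.5, 300.0)))   # True: urgent, shown regardless
```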

Output

– Requirements and a prototype implementation for an AI-based assistant that provides recommendations and shows selected flight information based on pilot workload and current flight parameters
– A Unity project that implements an extended reality support tool for (general) aviation and that is used for evaluation in simulators (Virtual Reality) and possibly for a real test flight at ENAC (Augmented Reality)
– Findings from the simulator study and design recommendations
– (Optional) Impressions from a real test flight at ENAC
– A research paper detailing the system and the findings

Project Partners

  • Ludwig-Maximilians-Universität München (LMU), Florian Müller
  • Ecole Nationale de l'Aviation Civile (ENAC), Anke Brock

Primary Contact

Florian Müller, Ludwig-Maximilians-Universität München (LMU)

The project aims to develop a framework for multimodal and multilingual conversational agents. The framework is based on hierarchical levels of abilities:

– Reactive (sensori-motor) Interaction: Interaction is tightly-coupled perception-action, where the actions of one agent are immediately sensed and interpreted by the other. Examples include greetings, polite conversation, and emotional mirroring.

– Situated (Spatio-temporal) Interaction: Interactions are mediated by a shared model of objects and relations (states) and shared models of roles and interaction protocols.

– Operational Interaction: Collective performance of tasks.

– Praxical Interaction: Sharing of knowledge about entities, relations, actions, and tasks.

– Creative Interaction: Collective construction of theories and models that predict and explain phenomena.

In this microproject we focus on the first two levels (Reactive and Situated) and design the global framework architecture. The work performed in this project will be demonstrated in a proof of concept (PoC).
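A very rough architectural sketch of the two levels targeted by this microproject is given below (our own illustration; class names and interfaces are invented and do not reflect the framework's actual design): a reactive layer that maps percepts directly to responses, and a situated layer that maintains a shared spatio-temporal state.

```python
# Illustrative sketch of a two-level hierarchy; names and interfaces are invented.

class ReactiveLevel:
    """Level 1: tightly-coupled perception-action (e.g. return a greeting)."""
    REFLEXES = {"greeting": "greet_back", "smile": "mirror_smile"}

    def react(self, percept):
        return self.REFLEXES.get(percept)

class SituatedLevel:
    """Level 2: interaction mediated by a shared model of objects, relations and roles."""
    def __init__(self):
        self.shared_state = {}   # e.g. {"cup": "on_table", "user_role": "customer"}

    def update(self, entity, relation):
        self.shared_state[entity] = relation

    def act(self, goal):
        # Placeholder: choose an action consistent with the shared state.
        return f"act_towards({goal}) given {self.shared_state}"

class Agent:
    """Hierarchy: reflexes take priority, otherwise defer to the situated level."""
    def __init__(self):
        self.reactive, self.situated = ReactiveLevel(), SituatedLevel()

    def step(self, percept, goal=None):
        return self.reactive.react(percept) or self.situated.act(goal)

agent = Agent()
agent.situated.update("user_role", "customer")
print(agent.step("greeting"))                     # 'greet_back'
print(agent.step("silence", goal="take_order"))   # situated action using the shared state
```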

Output

OSS Framework (Level 1 and 2)

Project Partners:

  • Università di Bologna (UNIBO), Paolo Torroni

Primary Contact: Eric Blaudez, THALES

Research at the intersection of artificial intelligence (AI) and extended reality (XR) has produced a substantial body of literature over the past 20 years. Applications cover a broad spectrum, for example, visualising neural networks in virtual reality or interacting with conversational agents. However, a systematic overview is currently missing.

This micro-project addresses this gap with a scoping review covering two main objectives: First, it aims to give an overview of the research conducted at the intersection of AI and XR. Second, we are particularly interested in revealing how XR can be used to improve interactive grounding in human-AI interaction. In summary, the review focuses on the following guiding questions: What are the typical AI methods used in XR research? What are the main use cases at the intersection of AI and XR? How can XR serve as a tool to enhance interactive grounding in human-AI interaction?

Output

Conference or journal paper co-authored by the proposers (possibly with other partners)

Dataset of the reviewed papers, including codes

Project Partners:

  • Københavns Universitet (UCPH), Teresa Hirzle
  • Københavns Universitet (UCPH), Kasper Hornbæk
  • Ludwig-Maximilians-Universität München (LMU), Florian Müller

Primary Contact: Teresa Hirzle, University of Copenhagen, Department of Computer Science

This MP studies the problem of how to alert a human user to a potentially dangerous situation, for example for handovers in automated vehicles. The goal is to develop a trustworthy alerting technique that has high accuracy and a minimum of false alerts. The challenge is to decide when to interrupt, because false positives and false negatives will both lower trust. Knowing when to interrupt is hard, because one must take into account both the driving situation and the driver's ability to react to the alert; moreover, this inference must be made from impoverished sensor data. The key idea of this MP is to model this as a partially observable stochastic game (POSG), which allows approximate solutions to a problem involving two adaptive agents (human and AI). The main outcome will be an open library for Python called COOPIHC, which allows modeling different variants of this problem.
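The core trade-off can be illustrated with a stripped-down sketch (our own simplification, not COOPIHC code; all probabilities and costs are invented): the system maintains a belief about the driver's attentiveness from noisy observations and alerts only when the expected benefit outweighs the expected cost of a false alarm.

```python
# Illustrative belief update and expected-utility alerting rule; not COOPIHC code.

def posterior_attentive(prior, sensor_says_attentive, hit_rate=0.8, false_rate=0.3):
    """Bayes update of P(driver attentive) from one noisy gaze/posture observation."""
    like_att = hit_rate if sensor_says_attentive else 1 - hit_rate
    like_not = false_rate if sensor_says_attentive else 1 - false_rate
    p = like_att * prior
    return p / (p + like_not * (1 - prior))

def should_alert(p_danger, p_attentive, benefit=10.0, false_alarm_cost=2.0):
    """Expected-utility rule: an alert only helps if the driver is not already attending."""
    gain = p_danger * (1 - p_attentive) * benefit
    cost = (1 - p_danger) * false_alarm_cost
    return gain > cost

p_att = posterior_attentive(prior=0.6, sensor_says_attentive=False)
print(p_att, should_alert(p_danger=0.4, p_attentive=p_att))
```

The POSG formulation goes further by treating the driver as a second adaptive agent whose reactions and trust evolve with the alerting policy, which is exactly what the library is meant to model.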

Output

COOPIHC library (Python)

Paper (e.g., IUI’23 or CHI’23)

Project Partners:

  • Aalto University, Antti Oulasvirta
  • Centre national de la recherche scientifique (CNRS), Julien Gori

Primary Contact: Antti Oulasvirta, Aalto University

The broad availability of 3D printing enables end-users to rapidly fabricate personalized objects. While the actual manufacturing process is largely automated, users still need knowledge of complex design applications not only to produce ready-designed objects, but also to adapt them to their needs or even design new objects from scratch.

In this project, we explore an AI-powered system that assists users in creating 3D objects for digital fabrication. For this, we propose to use natural language processing (NLP) to enable users to describe objects in natural language (e.g., "A green rectangular box."). In this micro-project, we conduct a Wizard-of-Oz study to elicit the requirements for such a system. The task of the participants is to recreate a given object using a spoken description with iterative refinements. We expect that this work will support the goal of making personal digital fabrication accessible to everyone.
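For illustration, a later NLP component might map an utterance such as "A green rectangular box" to a rough parametric shape specification; the toy rule-based sketch below is our own invention, and the actual model will be specified based on the Wizard-of-Oz data:

```python
# Toy rule-based parser for spoken shape descriptions; vocabulary and schema are invented.
import re

COLORS = {"red", "green", "blue", "yellow", "white", "black"}
SHAPES = {"box": "cube", "cube": "cube", "sphere": "sphere", "ball": "sphere",
          "cylinder": "cylinder"}

def parse_description(utterance):
    """Map a spoken description to a rough parametric shape specification."""
    tokens = re.findall(r"[a-z]+", utterance.lower())
    spec = {"shape": None, "color": None, "modifiers": []}
    for t in tokens:
        if t in SHAPES:
            spec["shape"] = SHAPES[t]
        elif t in COLORS:
            spec["color"] = t
        elif t in {"rectangular", "long", "flat", "small", "large"}:
            spec["modifiers"].append(t)
    return spec

print(parse_description("A green rectangular box."))
# {'shape': 'cube', 'color': 'green', 'modifiers': ['rectangular']}
```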

Output

Requirements for voice-based 3D design

Dataset

Design specification for an NLP model to support voice-based 3D design

Project Partners:

  • Ludwig-Maximilians-Universität München (LMU), Florian Müller/Albrecht Schmidt

Primary Contact: Florian Müller, LMU Munich

The communication between patients and healthcare institutions is increasingly moving to digital applications. While information about the patient’s wellbeing is typically collected by means of a questionnaire, filling it in is a tedious task for many patients, especially when it has to be done periodically, and may result in incomplete or imprecise input. Much can be gained by making the process of filling in such questionnaires more interactive, by deploying a conversational agent that can not only ask the questions, but also ask follow-up questions and respond to clarification questions from the user. We propose to deploy and test such a system.

Our proposed research aligns well with the WP3 focus on human-AI communication and will lead to re-usable conversation patterns for conducting questionnaires in healthcare. The work will benefit from existing experience with patient-provider communication within Philips and will build on the SUPPLE framework for dialog management and sequence expansion.
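A simplified sketch of the targeted interaction pattern is given below (our own illustration; it does not use the SUPPLE framework's actual API): a questionnaire item is asked, clarification requests are answered before re-asking, and a follow-up question is triggered by the answer.

```python
# Illustrative conversational-questionnaire pattern; question content and logic are invented.

QUESTIONS = [
    {"id": "pain", "ask": "How would you rate your pain today, from 0 to 10?",
     "clarify": "0 means no pain at all, 10 means the worst pain imaginable.",
     "follow_up": "Is the pain constant, or does it come and go?"},
]

def run_item(item, get_answer):
    """Ask one questionnaire item, handling clarification requests and follow-ups."""
    answer = get_answer(item["ask"])
    if "?" in answer or "what do you mean" in answer.lower():
        answer = get_answer(item["clarify"] + " " + item["ask"])   # respond to clarification
    record = {"id": item["id"], "answer": answer}
    if answer.strip().isdigit() and int(answer) >= 5:
        record["follow_up"] = get_answer(item["follow_up"])        # ask a follow-up question
    return record

# Scripted patient turns stand in for real user input.
scripted = iter(["what do you mean?", "7", "it comes and goes"])
print(run_item(QUESTIONS[0], lambda prompt: next(scripted)))
```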

Output

A dataset on conversation(s) between a patient and a conversational AI

A dialog model derived from the dataset

Scientific publication

Project Partners:

  • Philips Electronics Nederland B.V., Aart van Halteren
  • Stichting VU, Koen Hindriks

Primary Contact: Aart van Halteren, Philips Research

When we go for a walk with friends, we can observe that our movements unconsciously synchronize. This is a crucial aspect of human relations that is known to build trust, liking, and the feeling of connectedness and rapport. In this project, we explore if and how this effect can enhance the relationship between humans and AI systems by increasing the sense of connectedness in the formation of techno-social teams working together on a task.

To evaluate the feasibility of this approach, we plan to build a physical object representing an AI system that can bend in two dimensions to synchronize with the movements of humans. Then, we plan to conduct an initial evaluation in which we will track the upper body motion of the participants and use this data to compute the prototype movement using different transfer functions (e.g., varying the delay and amplitude of the movement).
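For illustration, the transfer functions mentioned above could be as simple as a delayed, attenuated copy of the tracked sway angle; the sketch below (our own, with invented parameters) shows one such mapping for a single bending dimension.

```python
# Illustrative delay-and-attenuate transfer function for one bending dimension.
from collections import deque

def make_transfer(delay_frames=15, amplitude=0.7):
    """Return a function mapping the human's sway angle to the prototype's bend angle."""
    history = deque([0.0] * delay_frames, maxlen=delay_frames)
    def transfer(human_angle_deg):
        delayed = history[0]              # angle observed `delay_frames` frames ago
        history.append(human_angle_deg)
        return amplitude * delayed        # attenuated, delayed mirroring
    return transfer

transfer = make_transfer(delay_frames=3, amplitude=0.5)
for frame, angle in enumerate([0.0, 10.0, 20.0, 10.0, 0.0, -10.0]):
    print(frame, transfer(angle))
```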

Output

Physical prototype that employs bending for synchronization

Study results on the feasibility of establishing trust with embodied AI systems through motion synchronization.

Publication of the results

 

Primary Contact: Florian Müller, LMU Munich

We study proactive communicative behavior, where robots provide information to humans which may help them to achieve desired outcomes or to prevent possible undesired ones. Proactive behavior is an under-addressed area in AI and robotics, and proactive human-robot communication even more so. We will combine the past expertise of Sorbonne Univ. (intention recognition) and Örebro Univ. (proactive behavior) to define proactive behavior based on the understanding of the user’s intentions, and then extend it to consider communicative actions based on second-order perspective awareness.

We propose an architecture able to (1) estimate the human's intended goal, (2) infer the robot’s and the human’s knowledge about foreseen possible outcomes of the intended goal, (3) detect opportunities for the robot to be proactive based on the desirability of those outcomes, and (4) select an action from the listed opportunities. The theoretical underpinning of this work will contribute to the study of theory of mind in HRI.
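The following minimal sketch (our own simplification; component names and the data they exchange are invented) illustrates how the four steps could be chained in code:

```python
# Illustrative pipeline of the four proposed steps; heuristics and data are placeholders.

def estimate_intention(observed_actions, goal_priors):
    """(1) Pick the most likely human goal given observed actions (placeholder heuristic)."""
    scores = {g: p + sum(1 for a in observed_actions if a in g) for g, p in goal_priors.items()}
    return max(scores, key=scores.get)

def predict_outcomes(goal, world_model):
    """(2) Foresee the outcome of the intended goal given the current world model."""
    return world_model.get(goal, "unknown_outcome")

def detect_opportunities(outcome, desirability):
    """(3) An opportunity for proactivity exists if the foreseen outcome is undesirable."""
    return [] if desirability.get(outcome, 0) >= 0 else [f"warn_about({outcome})"]

def select_action(opportunities):
    """(4) Select one of the listed opportunities (here simply the first, if any)."""
    return opportunities[0] if opportunities else None

goal_priors = {"make tea": 0.2, "make coffee": 0.1}
world_model = {"make tea": "kettle_empty"}
desirability = {"kettle_empty": -1}

goal = estimate_intention(["take cup", "make tea"], goal_priors)
action = select_action(detect_opportunities(predict_outcomes(goal, world_model), desirability))
print(goal, action)   # e.g. 'make tea' -> 'warn_about(kettle_empty)'
```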

Output

A Jupyter/Google Colab notebook that presents the code of the proposed architecture and provides plug-and-play interaction.

A manuscript describing the proposed architecture and initial findings of the experiment

Presentations

Project Partners:

  • Sorbonne Université, Mohamed CHETOUANI
  • Örebro University (ORU), Alessandro Saffiotti and Jasmin Grosinger

Primary Contact: Mohamed CHETOUANI, Sorbonne University

Main results of micro project:

The goal of this micro-project is to develop a cognitive architecture able to generate proactive communicative behaviors during human-robot interactions. The general idea is to provide information to humans which may help them to achieve desired outcomes or to prevent possible undesired ones. Our work proposes a framework that generates and selects among opportunities for acting based on recognizing human intentions, predicting environment changes, and reasoning about what is desirable in general. Our framework has two main modules to initiate proactive behavior: intention recognition and equilibrium maintenance.
The main achievements are:
– Integration of two systems, user intention recognition and equilibrium maintenance, into a generic architecture
– Demonstration of the stability of the architecture with many users
– A reasoning mechanism with second-order perspective awareness
The next steps will aim to demonstrate knowledge repair, prevent outcomes caused by a lack of knowledge, and improve trustworthiness, transparency, and legibility (user study).

Contribution to the objectives of HumaneAI-net WPs

– A playground system in which HumaneAI-net partners can define their own interactive scenarios to experiment with the robot’s proactivity.

– T3.3: A study of how to model human rationality in order to detect and use computationally defined human beliefs, goals, and intentions, and to use that model to make robots proactive; a human-in-the-loop system that supports cooperative robot behavior in a shared environment by generating proactive communication.

– T3.1: A study relating robots that generate proactive communication to possible effects on human cognition and interaction strategies.

Tangible outputs

Many industrial NLP applications emphasise the processing and detection of nouns, especially proper nouns (Named Entity Recognition, NER). However, processing of verbs has been neglected in recent years, even though it is crucial for the development of full NLU systems, e.g., for the detection of intents in spoken language utterances or events in written news articles. The META-O-NLU microproject focuses on proving the feasibility of a multilingual event-type ontology based on classes of synonymous verb senses, complemented with semantic roles and links to existing semantic lexicons. Such an ontology shall be usable for content- and knowledge-based annotation, which in turn shall allow for developing NLU parsers/analyzers. The concrete goal is to extend the existing Czech-English SynSemClass lexicon (which displays all the necessary features, but only for two languages) with German and Polish, as a first step towards showing it can be extended to other languages as well.
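Schematically, one multilingual event-type class could be represented as below (our own simplified illustration; field names and link values are invented and do not reflect the actual SynSemClass format):

```python
# Schematic data structure for one multilingual event-type class; fields are illustrative only.

event_class = {
    "class_id": "vec00123",
    "roles": ["Agent", "Recipient", "Theme"],
    "members": {
        "en": [{"lemma": "give",  "links": {"FrameNet": "Giving"}}],
        "cs": [{"lemma": "dát",   "links": {"Vallex": "dát-1"}}],
        "de": [{"lemma": "geben", "links": {"GermaNet": "geben-2", "E-VALBU": "geben"}}],
        "pl": [{"lemma": "dać",   "links": {}}],   # extension target of the microproject
    },
}

def languages_covered(cls):
    """Languages that already have at least one verb sense in the class."""
    return [lang for lang, verbs in cls["members"].items() if verbs]

print(languages_covered(event_class))   # ['en', 'cs', 'de', 'pl']
```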

Output

Common paper co-authored by the proposers (possibly with other partners)

Extended version of SynSemClass (entries in additional languages)

Presentations

Project Partners:

  • Charles University Prague, Jan Hajič
  • German Research Centre for Artificial Intelligence (DFKI), Georg Rehm

Primary Contact: Jan Hajič, Univerzita Karlova (Charles University, CUNI)

Main results of micro project:

The main result of the META-O-NLU microproject is the extension of the original SynSemClass dataset with German classes, or more precisely, the inclusion of German verbs and event descriptors in the existing SynSemClass classes. Together with the individual verbs, existing German lexical resources have been linked (GermaNet, E-VALBU, and GUP). Adding a third language demonstrated that future extension to other languages is feasible, in terms of the annotation rules, the dataset itself, and a new web browser that can show all language entries alongside each other with all the external links. The data is freely available in the LINDAT/CLARIAH-CZ repository (and soon also through the European Language Grid), and a web browser for the resource is now also available.

Contribution to the objectives of HumaneAI-net WPs

Task 3.6 focuses on both spoken and written language-based interactions (dialogues, chats), in particular questions of multilinguality that are essential to the European vision of human-centric AI. The results of this microproject contribute especially to the multilingual issue and are directed towards full NLU (Natural Language Understanding) by describing event types, for which no general ontology exists yet. The resulting resource will be used for both text and dialog annotation, to allow for evaluation and possibly also for training of NLU systems.

Tangible outputs