SciNoBo: An AI system collaborating with Journalists in Science Communication (resubmission)

Science communication conveys scientific findings and informs the general public, policymakers and other non-expert groups about research developments, raising interest and trust in science and engagement with societal problems (e.g., the United Nations Sustainable Development Goals). In this context, evidence-based science communication isolates topics of interest from the scientific literature, frames the relevant evidence and disseminates the relevant information to targeted non-scholarly audiences through a wide range of communication channels and strategies.

The proposed microproject (MP) focusses on science journalism and public outreach on scientific topics in Health and Climate Change. The MP will bring together science communicators (e.g., science journalists, policy analysts, science advisors for policymakers, other actors) and enable them to interact with an AI system capable of identifying statements about Health and Climate in mass media, grounding them on scientific evidence and simplifying the language of the scientific discourse by reducing the complexity of the text while preserving its meaning and information.

Technologically, we plan to build on our previous MP work on neuro-symbolic Q&A (*) and further exploit and advance recent developments in instruction fine-tuning of large language models, retrieval augmentation and natural language understanding, specifically the NLP areas of argumentation mining, claim verification and text (i.e., lexical and syntactic) simplification.

The proposed MP addresses the topic of “Collaborative AI” by developing an AI system equipped with innovative NLP tools that can collaborate with humans (i.e., science communicators, SCs) who communicate statements on Health and Climate Change topics, grounding these statements on scientific evidence (interactive grounding) and providing explanations in simplified language, thus supporting SCs in science communication. The AI solution will be tested in a real-world scenario in collaboration with OpenAIRE by employing OpenAIRE Research Graph (ORG) services on Open Science publications.

Workplan
The proposed work is divided into two phases running in parallel. The main focus of Phase I is the construction of the data collections and the adaptations and improvements needed in PDF processing tools. Phase II deals with the development of the two subsystems, claim analysis and text simplification, as well as their evaluation.

Phase I
Two collections, one of News and one of scientific publications, will be compiled in the areas of Health and Climate. The News collection will be built from an existing dataset of News stories using ARC's automated classification system for the areas of interest. The second collection, of publications, will be provided by the OpenAIRE ORG service and further processed, managed and properly indexed by the ARC SciNoBo toolkit. A small-scale annotation is foreseen by DFKI in support of the simplification subsystem.

Phase II
In Phase II, we will develop, advance, fine-tune and evaluate the two subsystems. Concretely, the “claim analysis” subsystem encompasses (i) ARC's previous work on “claim identification”, (ii) a retrieval engine fetching relevant scientific publications (based on our previous microproject), and (iii) an evidence-synthesis module indicating whether the fetched publications, and the scientists’ claims therein, support or refute the News claim under examination.
For the “text simplification” subsystem, DFKI will examine both lexical and syntax-based representations, explore their contribution to text simplification and evaluate (neural) simplification models on the Eval dataset. Phase II work will be led by ARC in collaboration with DFKI and OpenAIRE.
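As an illustration of step (iii), the following minimal Python sketch shows one way per-publication stances could be aggregated into a verdict for a News claim. It is only a sketch under stated assumptions: the `Evidence` type, the stance labels, the `synthesize_verdict` function and the simple vote margin are hypothetical and are not taken from the SciNoBo toolkit.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Evidence:
    publication_id: str  # hypothetical identifier of a retrieved publication
    stance: str          # assumed labels: "support", "refute" or "neutral"


def synthesize_verdict(evidence, min_margin=1):
    """Aggregate per-publication stances into one verdict for a news claim.

    Assumption: a plain vote margin decides the verdict; a real
    evidence-synthesis module would likely weigh evidence quality as well.
    """
    counts = Counter(item.stance for item in evidence)
    support, refute = counts["support"], counts["refute"]
    if support - refute >= min_margin:
        return "supported"
    if refute - support >= min_margin:
        return "refuted"
    return "inconclusive"


# Toy input: three retrieved publications with already-assigned stances.
evidence = [
    Evidence("pub-1", "support"),
    Evidence("pub-2", "support"),
    Evidence("pub-3", "refute"),
]
verdict = synthesize_verdict(evidence)
print(verdict)
```

With no evidence, or with equal numbers of supporting and refuting publications, the sketch returns "inconclusive" rather than forcing a decision.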

Ethics: AI is used but without raising ethical concerns related to human rights and values.

(*): Combining symbolic and sub-symbolic approaches – Improving neural QA-Systems through Document Analysis for enhanced accuracy and efficiency in Human-AI interaction.

Output

Paper(s) in Conferences:
We plan to submit at least two papers about the “claim analysis” and the “text simplification” subsystems.

Practical demonstrations, tools:
A full-fledged demonstrator showing the supported functionality will be available (expected in the last month of the project).

Project Partners

  • ILSP/ATHENA RC, Haris Papageorgiou
  • German Research Centre for Artificial Intelligence (DFKI), Julián Moreno Schneider
  • OpenAIRE, Natalia Manola

Primary Contact

Haris Papageorgiou, ILSP/ATHENA RC

Enhancing Non-Anthropomorphic Robots: Exploring Social Cues for Seamless Human-Robot Interaction

Various types of robots are entering many contexts of life, such as homes, public spaces and factories. Social robots interact with people using conventions and interaction modalities that are prevalent in human-human interaction. While many social robots are anthropomorphic, non-anthropomorphic robots, such as vacuum cleaners, lawn mowers, and barista robots, are becoming more common for everyday tasks. The purpose of this research is to explore how people’s interaction with non-anthropomorphic robots can benefit from human-like social cues.

The main research question of this micro-project is: What kind of social cues can be used for human-robot interaction (HRI) with non-anthropomorphic robots? Social interactions will be supported by robots’ gestures, sounds and visual cues. Different robotic parts such as arms or antennas can be designed to extend expressivity. The designs will be tested with a specified scenario and user group, such as factory workers or elderly people. The exact scenario will be defined at the start of the project. Two parallel studies will be conducted, one at LMU and one at IST. At LMU, a prototype or mock-up will be designed and built. At IST, one of their robots will be used to test the same scenario. The evaluations will explore different social cues and will be based primarily on qualitative user evaluation.

The proposed research aligns well with the WP3 focus on human-AI collaboration, more specifically by investigating human-centered techniques for common grounding between people and robots. The research will lead to novel understanding of how robots can give social cues to improve multimodal human-robot interaction in various usage contexts.

Output

– Prototype/mock-up of a socially supportive non-anthropomorphic robot
– User study report, with insights into the types of robotic parts and social cues that improve human-robot interaction in the selected scenarios
– Paper submitted to a conference or journal of HRI

Project Partners

  • Ludwig-Maximilians-Universität München (LMU), Albrecht Schmidt
  • Instituto Superior Técnico (IST), Ana Paiva
  • Tampere University, Finland, Kaisa Väänänen

Primary Contact

Albrecht Schmidt, Ludwig-Maximilians-Universität München (LMU)

This project aims to make modern cognitive user models and collaborative AI tools more applicable by developing generalizable amortization techniques for them.

In human-AI collaboration, one of the key difficulties is establishing a common ground for the interaction, especially in terms of goals and beliefs. In practice, the AI might not have access to this necessary information directly and must infer it during the interaction with the human. However, training a model to support this kind of inference would require massive collections of interaction data and is not feasible in most applications.
Modern cognitive models, on the other hand, can equip AI tools with the necessary prior knowledge to readily support inference, and hence, to quickly establish a common ground for collaboration with humans. However, utilizing these models in realistic applications is currently impractical due to their computational complexity and non-differentiable structure.
This micro-project contributes directly to the development of collaborative AI by making cognitive models practical and computationally feasible to use, thus enabling efficient online grounding during interaction. The project approaches this problem by developing amortization techniques for modern cognitive models and for integrating them into collaborative AI systems.

Output

A conference paper draft that introduces the problem, a method, and initial findings.

Project Partners

  • Delft University of Technology, Frans Oliehoek

Primary Contact

Samuel Kaski, Delft University of Technology

Develop AI interactive grounding capabilities in collaborative tasks using a game-based mixed reality scenario that requires physical actions.

The project addresses research on interactive grounding. It consists of the development of an Augmented Reality (AR) game, using HoloLens, that supports the interaction of a human player with an AI character in a mixed reality setting using gestures as the main communicative act. The game will integrate technology to perceive human gestures and poses. The game will involve collaborative tasks that require coordination at the level of mutual understanding of the elements of the task. Players (human and AI) will have different information about the tasks needed to advance in the game and will need to communicate that information to their partners through gestures. The main grounding challenge will be learning the mapping from gestures to the meaning of the actions to perform in the game. There will be two levels of gestures to ground: some are task-independent while others are task-dependent. In other words, besides the gestures that communicate explicit information about the game task, the players need to agree on the gestures used to coordinate the communication itself, for example, to signal agreement or doubt, to ask for more information, or to close the communication. These latter gesture types can be transferred from task to task within the game, and probably to other contexts as well.
It will be possible to play the game with two humans and study their gesture communication in order to gather the gestures that emerge: a human-inspired gesture set will be collected and serve the creation of a gesture dictionary in the AI repertoire.
The game will provide different tasks of increasing difficulty. The first ones will ask the players to perform gestures or poses as mechanisms to open a door to progress to the next level. But later, in a more advanced version of the game, specific and constrained body poses, interaction with objects, and the need to communicate more abstract concepts (e.g., next to, under, to the right, the biggest one, …) will be introduced.
The game will be built as a platform for performing studies. It will support studying diverse questions about the interactive grounding of gestures. For example, we can study the way people adapt to and ascribe meaning to the gestures performed by the AI agent; we can study how different gesture profiles influence people’s interpretation, facilitate grounding, and affect task performance; or we can study different mechanisms for the AI to learn its gesture repertoire from humans (e.g., by imitation grounded in the context).
We see this project as a relevant contribution to the upcoming Macro Project on Interactive Grounding, and we would like the opportunity to join the MP later. Our focus is on gesture-based grounding, which is critical in certain scenarios. The setting can include language if vocalization is allowed and can be heard. Our game scenarios are simple and abstract and can be the basis for realistic ones.

Output

A game that serves as a platform for studying grounding in the context of collaborative tasks using gestures.
A repertoire of gestures to be used in the communication between humans and AI in a collaborative task that relies on the execution of physical actions. We will emphasize the gestures that can be task-independent.
The basis for an AI algorithm to ground gestures to meaning adapted to a particular user.
One or two papers, describing the platform and a study with people.

Project Partners

  • Instituto Superior Técnico (IST), Rui Prada
  • Eötvös Loránd University, András Lőrincz
  • DFKI Lower Saxony, Daniel Sonntag
  • CMU, László Jeni

Primary Contact

Rui Prada, Instituto Superior Técnico (IST)

Exploring the balance between ownership and AI assistance in creative collaboration through an interactive exhibition.

Novel AI systems enable individuals to maximize their creative potential by rapidly prototyping ideas based on initial sketches or idea descriptions. A generative AI system is the bridge between an individual’s thought and its physical manifestation. Traditional approaches, on the other hand, require a greater investment of effort, involvement, and time, which was historically associated with a sense of ownership over the creation and agency (or control) over the creation process. As the paradigm shift caused by AI significantly reduces the amount of work required to achieve a desired result, individuals consistently report low agency and ownership over their creations, and such boundaries are unclear even in the legal sphere. Therefore, it is essential to understand how these variables can be balanced to foster a strong sense of ownership while allowing users to fully exploit the potential of AI systems.

In this project, we seek to achieve this understanding by creating an interactive exhibition where visitors to a science museum will interact with a generative AI system to create illustrations for a children’s book based on rough sketches and prompts. The participants will be instructed to collaborate with an image-generating AI system to illustrate a children’s storybook with a simple plot. Participants will start with their own sketch or by selecting one from a set. When an illustration is finished, participants will be asked if they want to sign the illustration with their name, the name of the AI model, or both. Participants will have the option to display their illustrations on the exhibition’s billboard. We will conclude by asking them three brief questions about self-efficacy.

The interaction will be logged to record the degree of intervention (iterating over the illustrations, using a starting sketch instead of drawing their own sketch, signature ownership). We plan to carry out a limited quantitative study with observations of visitors’ behavior paired with interviews. With the collected data, we will be able to analyze the correlations between time, effort, ownership, and self-efficacy in the AI-assisted creative process, and ultimately gain insights into how to design such systems to promote a sense of ownership in the user.

This project falls under WP3. It examines the pragmatic aspects of communication and collaboration between humans and AI by exploring how participants collaborate with an AI system to translate their initial sketches or prompts into meaningful illustrations for visual storytelling. This is studied through the lens of the participants’ sense of ownership and agency and how it affects the outcome and the design process.

Output

A manuscript reporting the intervention and the results of the field-study.

A video explanation of the intervention and the insights gained from it.

An open-source repository of the materials used in the intervention.

Project Partners

  • Ludwig-Maximilians-Universität München (LMU), Sebastian Feger
  • Sheffield Hallam University Enterprises Limited, Daniela Petrelli

Primary Contact

Steeven Villa, Ludwig-Maximilians-Universität München (LMU)

Ethical implications of language use with special consideration of the Ethics Guidelines for Trustworthy AI

Due to the ongoing advancement of AI technologies, we will have to face a new ethical problem that has not occurred with other technologies before: the increasing resemblance between AI systems and biological systems, especially human beings and animals. This resemblance will gradually make it more natural for us to attribute human or animal qualities to AI systems, even if we know that they are not self-conscious or alive. We are not able to predict the social, psychological, educational, political, and economic consequences of the spread of such AI systems. In our meta-project, we want to address this problem from an ethical point of view.

In the first two months, we will base our analysis on the Ethics Guidelines for Trustworthy AI (2019) written by the High-Level Expert Group on AI (AI HLEG) set up by the European Commission. We will focus in particular on the language used by the AI HLEG to describe AI systems’ activity and human-machine interaction. The focus on language is philosophically motivated by the close correlation between language, habits (see Aristotle), and our practical as well as emotional relationship with the world.

Over the following two months we will try to generalize the results of our analysis. We will propose some examples of how an adequate linguistic practice can help us to make sharp terminological and conceptual distinctions and so describe and understand the human-AI interaction correctly. The outcome of our work will be a research seminar in which we will present and discuss the results of our research.

Connection of Results to Work Package Objectives:

  • WP5 is concerned with AI ethics and responsible AI. Our project wants to address the responsibility of our linguistic practices with regard to AI. The way in which we speak about AI and the human-AI interaction creates habits, shapes our practical and emotional relationship with the machines and therefore has ethical consequences.
  • WP3 deals with the human-AI collaboration and interaction. Our project will address the language we use to talk about AI and to describe the interaction between us and AI systems.

Output

  • Research seminar: Ethics and AI
  • Seminar for PhD students, postdoctoral scholars, and research fellows
  • University of Kaiserslautern-Landau
  • Winter term 2023-2024

Project Partners

  • RPTU-Kaiserslautern

Primary Contact

Karen Joisten, RPTU-Kaiserslautern

Project Leads

  • Prof. Dr. Karen Joisten
  • Dr. Ettore Barbagallo

A graduate level educational module (12 lectures + 5 assignments) covering basic principles and techniques of Human-Interactive Robot Learning.

Human-Interactive Robot Learning (HIRL) is an area of robotics that focuses on developing robots that can learn from and interact with humans. This educational module aims to cover the basic principles and techniques of Human-Interactive Robot Learning. This interdisciplinary module will encourage graduate students (Master/PhD level) to connect different bodies of knowledge within the broad field of Artificial Intelligence, with insights from Robotics, Machine Learning, Human Modelling, and Design and Ethics. The module is meant for Master’s and PhD students in STEM, such as Computer Science, Artificial Intelligence, and Cognitive Science.
This work will extend the tutorial presented in the context of the International Conference on Algorithms, Computing, and Artificial Intelligence (ACAI 2021) and will be shared with the Artificial Intelligence Doctoral Academy (AIDA). Moreover, the proposed lectures and assignments will be used as teaching material at Sorbonne University, and Vrije Universiteit Amsterdam.
We plan to design a collection of approximately 12 lectures of 1.5 hours each, 5 assignments, and a list of recommended readings, organized along relevant topics surrounding HIRL. Each lecture will include an algorithmic part and a practical example of how to integrate such an algorithm into an interactive system.
The assignments will encompass the replication of existing algorithms with the possibility for the student to develop their own alternative solutions.

Proposed module contents (each lecture approx. 1.5 hours):
(1) Interactive Machine Learning vs Machine Learning – 1 lecture
(2) Interactive Machine Learning vs Interactive Robot Learning (Embodied vs non-embodied agents) – 1 lecture
(3) Fundamentals of Reinforcement Learning – 2 lectures
(4) Learning strategies: observation, demonstration, instruction, or feedback
– Imitation Learning, Learning from Demonstration – 2 lectures
– Learning from Human Feedback: evaluative, descriptive, imperative, contrastive examples – 3 lectures
(5) Evaluation metrics and benchmarks – 1 lecture
(6) Application scenarios: hands-on session – 1 lecture
(7) Design and ethical considerations – 1 lecture

Output

Learning objectives along Dublin descriptors:
(1) Knowledge and understanding;
– Be aware of the human interventions in standard machine learning and interactive machine learning.
– Understand human teaching strategies
– Gain knowledge about learning from feedback, demonstrations, and instructions.
– Explore ongoing works on how human teaching biases could be modeled.
– Discover applications of interactive robot learning.
(2) Applying knowledge and understanding;
– Implement HIRL techniques that integrate different types of human input
(3) Making judgments;
– Make informed design choices when building HIRL systems
(4) Communication skills;
– Effectively communicate about own work both verbally and in a written manner
(5) Learning skills;
– Integrate insights from theoretical material presented in the lecture and research papers showcasing state-of-the-art HIRL techniques.

Project Partners

  • ISIR, Sorbonne University, Mohamed Chetouani
  • ISIR, Sorbonne University, Silvia Tulli
  • Vrije Universiteit Amsterdam, Kim Baraka

Primary Contact

Mohamed Chetouani, ISIR, Sorbonne University

Using Dynamic Epistemic Logic (DEL) so that an AI system can proactively make announcements to avoid undesirable future states based on the human's false beliefs

Previously, we investigated how an AI system can be proactive, that is, act in anticipation and on its own initiative, by reasoning about current and future states, mentally simulating actions and their effects, and assessing what is desirable. In this micro-project we want to extend our earlier work with epistemic reasoning. That is, we want to reason about the knowledge and beliefs of the human and use this to determine what kind of proactive announcement the AI system should make to the human. As in our previous work, we will consider which states are desirable and which are not, and we will also take into account how the state will evolve in the future if the AI system does not act. Now we also want to consider the human's false beliefs. It is not necessary and, in fact, not desirable to make announcements to correct every false belief that the human may have. For example, if the human is watching TV, she need not be informed that the salt is in the red container and the sugar is in the blue container, even though she believes it is the other way around. On the other hand, when the human starts cooking and is about to use the content of the blue container believing it is salt, then it is relevant for the AI system to announce what is actually the case, to avoid undesirable outcomes. The example shows that we need to investigate not only what to announce but also when to make the announcement.

The methods we will use in this micro-project are knowledge-based; to be precise, we will employ Dynamic Epistemic Logic (DEL). DEL is a modal logic that extends Epistemic Logic and makes it possible to model changes in the knowledge and beliefs of an agent herself and of other agents.
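The salt and sugar example can be made concrete with a small Python sketch. This is not DEL and not the formal model the project will develop: it is a plain dictionary-based stand-in, and the names `facts`, `beliefs`, `used_objects` and `proactive_announcements` are hypothetical. It assumes that acting on a false belief about an object is undesirable, and it announces a correction only when the human's current activity actually depends on that object.

```python
def proactive_announcements(facts, beliefs, used_objects):
    """Return corrections only for false beliefs that matter right now.

    facts: what is actually the case, e.g. {"blue container": "sugar"}
    beliefs: what the human believes about the same objects
    used_objects: objects the human's current activity depends on
    Assumption: acting on a false belief about a used object is undesirable.
    """
    announcements = []
    for obj in sorted(used_objects):
        if obj in facts and obj in beliefs and beliefs[obj] != facts[obj]:
            announcements.append(
                f"The {obj} contains {facts[obj]}, not {beliefs[obj]}."
            )
    return announcements


facts = {"blue container": "sugar", "red container": "salt"}
beliefs = {"blue container": "salt", "red container": "sugar"}

# Watching TV involves neither container, so nothing is announced.
while_watching_tv = proactive_announcements(facts, beliefs, set())

# Cooking is about to use the blue container, so the belief is corrected.
while_cooking = proactive_announcements(facts, beliefs, {"blue container"})
print(while_watching_tv, while_cooking)
```

The same false beliefs are present in both situations; only the activity changes, which is what decides when an announcement is made.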

A one-week visit is planned. In total, 7.5 PMs are planned for the MP: we will work in the same place for one week and collaborate online for the remainder.

Output

– Formal model
We expect to develop a formal model based on DEL and on the findings of J. Grosinger's previous work on proactivity. The model enables an artificial agent to proactively make announcements to the human that correct the human's false beliefs, including false beliefs about the desirability of future states. Because the model is formal, we can state general definitions and propositions in it and provide proofs about its properties, for example, about which proactive announcements are relevant and/or well-timed.

– Conference
We aim to publish our work at a high-quality, international, peer-reviewed conference. Candidate conferences are AAMAS (International Conference on Autonomous Agents and Multiagent Systems) or, if the timing is infeasible, IJCAI (International Joint Conference on Artificial Intelligence).

– Further collaboration
The MP can lead to further fruitful collaborations between the applicants (and possibly some of their colleagues), as the MP's topic is new and under-explored and cannot be fully investigated within one MP.

Project Partners

  • Örebro University, ORU, Jasmin Grosinger
  • Technical University of Denmark (DTU), Thomas Bolander

Primary Contact

Jasmin Grosinger, Örebro University, ORU

This project targets the design of grounded dialogue models from observation of human-to-human conversation, typically from a set of recordings. It will bring trustable conversation models as well as a tool for the analysis of dialogue behavior.

This microproject aims to design grounded dialogue models based on observation of human-to-human dialogue examples, i.e., distilling dialogue patterns automatically and aligning them to external knowledge bases. The current state-of-the-art conversation models based on finetuned large language models are not grounded and mimic their training data, or their grounding is external and needs to be hand-designed. Furthermore, most commercially deployed dialogue models are entirely handcrafted.
Our goal is to produce grounding for these models (semi-)automatically, using dialogue context embedded in vector spaces via large language models trained specifically on conversational data. If we represent dialogue states as vectors, the whole conversation can be seen as a trajectory in the vector space. By merging, pruning, and modeling the trajectories, we can get dialog skeleton models in the form of finite-state graphs or similar structures. These models could be used for data exploration and analysis, content visualization, topic detection, or clustering. This can bring faster and cheaper design of fully trustable conversation models. The approach will serve both to provide external model grounding and to analyze the progress in human-to-human dialogues, including negotiation around the participants’ common ground.
The microproject will investigate the optimal format of the dialogue context embeddings (such as temporal resolution) as well as the optimal ways of merging dialogue trajectories and distilling models. Here, Variational Recurrent Neural Networks with discrete embeddings (Shi et al., NAACL 2019) are a promising architecture, but alternatives will also be considered.
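The trajectory idea can be illustrated with a minimal Python sketch: each dialogue turn is a vector, each vector is mapped to its nearest discrete state, and transitions between states are counted and pruned to obtain a small finite-state skeleton. This is a sketch under stated assumptions, not the project's method: nearest-centroid assignment stands in for the learned discrete embeddings, and the centroids, toy 2-D vectors and the names `nearest_state` and `build_skeleton` are hypothetical.

```python
import math
from collections import Counter


def nearest_state(vec, centroids):
    """Index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))


def build_skeleton(dialogues, centroids, min_count=2):
    """Merge dialogue trajectories into a pruned state-transition graph.

    dialogues: list of dialogues, each a list of turn embedding vectors
    centroids: assumed, pre-computed discrete dialogue states
    Returns {(from_state, to_state): count} for edges seen >= min_count times.
    """
    transitions = Counter()
    for turns in dialogues:
        states = [nearest_state(v, centroids) for v in turns]
        # Collapse consecutive repeats so that only state changes form edges.
        path = [s for i, s in enumerate(states) if i == 0 or s != states[i - 1]]
        transitions.update(zip(path, path[1:]))
    return {edge: n for edge, n in transitions.items() if n >= min_count}


# Toy 2-D "embeddings"; real dialogue context vectors would be much larger.
centroids = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
dialogues = [
    [(0.1, 0.0), (0.2, 0.1), (0.9, 0.1), (1.0, 0.9)],
    [(0.0, 0.1), (0.8, 0.0), (0.9, 1.1)],
    [(0.1, 0.1), (1.1, 0.9)],
]
skeleton = build_skeleton(dialogues, centroids)
print(skeleton)
```

In this toy data, two dialogues follow the path 0, 1, 2 and one jumps directly from 0 to 2; the rare direct edge is pruned, leaving a two-edge skeleton.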
We plan to experiment with both textual and voice-based dialogues. We will use the MultiWOZ corpus (Budzianowski et al., EMNLP 2018) as well as the DIASER extension developed in a Humane AI microproject by CUNI+LIMSI (Hudecek et al., LREC 2022) for text-based experiments. For voice-based experiments, we will use MultiWOZ spoken data released for the DSTC11 Challenge and dialogue data currently developed in a Humane AI microproject by BUT+CUNI.
The work will be done as a part of the JSALT workshop hosted by the University of Le Mans, France, and co-organized by Johns Hopkins University (JHU) and the Brno University of Technology.
https://jsalt2023.univ-lemans.fr/en/index.html

The JSALT workshop topic leader is Petr Schwarz from BUT (MP partner). The topic passed a scientific review by about 40 researchers in Baltimore, USA, in December 2022 and was selected among four workshop topics.
https://jsalt2023.univ-lemans.fr/en/automatic-design-of-conversational-models-from-observation-of-human-to-human-conversation.html

Workshop Topic Proposal: https://docs.google.com/document/d/19PAOkquQY6wnPx_wUXIx2EaInYchoCRn/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true
Workshop Topic Presentation: https://docs.google.com/presentation/d/1rt7OFvIu34c3OCXtAkoGjMhVXrcxCSXz/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true

Workshop Team:
https://docs.google.com/spreadsheets/d/1EsHZ-_OREkvf8ODiN7759OYHqSb6MBAX/edit?usp=sharing&ouid=105764332572733066001&rtpof=true&sd=true

The workshop and its attendees will be supported by several sources: JHU sponsors, the European Esperanto project, private companies, and the HumanE AI project. Ondrej Dusek from CUNI is responsible for the HumanE AI participants (as MP PI). The funding is intended mainly to cover travel, accommodation, and per diem for bringing participants to Le Mans, as well as some preparation work.

Hosting four workshop topics in one place, with teams of world-leading researchers and an initial summer school, gives participants an excellent opportunity for networking and personal growth, and gives the work results high visibility and impact. We expect that this effort can start new long-term collaborations among the participating institutions.

Output

– Software – code for dialogue embeddings & trajectory merging
– Trained embedding models
– Paper describing the dialogue skeleton models

Project Partners

  • Brno U, Petr Schwarz
  • CUNI, Ondrej Dusek
  • Eötvös Loránd University (ELTE), Andras Lorincz
  • IDIAP, Petr Motlicek

Primary Contact

Ondrej Dusek, CUNI

The project aims to explore multi-modal interaction concepts for collaborative creation of 3D objects in virtual reality with generative AI assistance.

# Motivation
The use of generative AI in the creation of 3D objects has the potential to greatly reduce the time and effort required for designers and developers, resulting in a more efficient and effective creation of virtual 3D objects. Yet, research still lacks an understanding of suitable interaction modalities and common grounding in this field.

# Objective
The objective of this research project is to explore and compare interaction modalities that are suited to collaboratively create virtual 3D objects together with a generative AI. To this end, the project aims to investigate how different input modalities, such as voice, touch and gesture recognition, can be used to generate and alter a virtual 3D object and how we can create methods for establishing common ground between the AI and the users.

# Methodology
The project is split into two work packages. (1) We investigate and evaluate the use of multiple input modalities to alter the shape and appearance of 3D objects in virtual reality (VR). (2) Based on our insights into promising multi-modal interaction concepts, we then develop a prototypical multi-modal VR interface that allows users to collaborate on the creation of 3D objects with a generative AI. This might include, but is not limited to, the AI assistant generating 3D models (e.g., using Text2Mesh, https://threedle.github.io/text2mesh, or Shap-E) or providing suggestions based on the users' queries.
The project will use a combination of experimental and observational methods to evaluate the effectiveness and efficiency of the concepts. This will involve conducting controlled experiments to test the effects of different modalities and AI assistance on the collaborative creation process, as well as observing and analyzing the users’ behavior.

# Expected outcomes
The research project is expected to produce several outcomes, including a software package to prototype multi-modal VR interfaces that enables collaborative creation of 3D objects, insights into the effectiveness and efficiency of different modalities and AI assistance in enhancing the collaborative process, and guidelines for the design of multi-modal interfaces and AI assistance for collaborative creation of 3D objects. The project's outcomes may have potential applications in fields such as architecture, engineering, and entertainment.

# Relation to call
This research project is directly related to the call for proposals as it addresses the challenge of coordination and collaboration between AI and human partners in the context of creating 3D objects. The project involves the use of multi-modal interfaces and AI assistance to enhance the collaborative process, which aligns with the call's focus on speech-based and multimodal interaction with AI. Additionally, the project's investigation of co-adaptive processes in collaborative creation aligns with the call's focus on co-adaptive processes in grounding. The project's outcomes, such as the development of guidelines for the design of multi-modal interfaces and AI assistance for collaborative creation, may also contribute to the broader theme of interactive grounding. Finally, the project's potential applications in architecture, engineering, and entertainment also align with the call's focus on special application areas.

Output

1. VR co-creation software package: The project aims to develop a publicly available open-source software package to quickly prototype multi-modal VR interfaces for co-creating virtual 3D objects. It enables practitioners and VR application developers to more easily create virtual 3D objects without requiring expert knowledge in computer-aided design.
2. Recorded dataset and derived guidelines for the design of multi-modal interfaces with AI assistance: The project aims to publish all recorded datasets and to provide a set of guidelines for the design of efficient and effective multi-modal interfaces for generating and altering 3D objects with an AI assistant.
3. Research paper: We aim to publish the results of this research as a paper in a leading XR or HCI venue, such as CHI, UIST, or ISMAR.

Project Partners

  • Københavns Universitet (UCPH), Teresa Hirzle
  • Ludwig-Maximilians-Universität München (LMU), Florian Müller/Julian Rasch
  • Saarland University, Martin Schmitz

Primary Contact

Teresa Hirzle, Københavns Universitet (UCPH)

We are going to build and evaluate a novel AI aviation assistant that supports (general) aviation pilots with key flight information that facilitates decision making, placing particular emphasis on the efficient and effective visualization of this information in 3D space.

Pilots frequently need to react to unforeseen in-flight events. Making adequate decisions in such situations requires considering all available information and demands strong situational awareness. Modern on-board computers and technologies such as GPS have radically improved pilots’ ability to take appropriate actions and have lowered their workload in recent years. Yet the technologies currently used in aviation cockpits generally still fail to adequately map and represent 3D airspace. In response, we aim to create an AI aviation assistant that considers all relevant aircraft operation data, provides tangible action recommendations, and visualizes them for efficient and effective interpretation in 3D space. In particular, we note that extended reality (XR) applications provide an opportunity to augment pilots’ perception through live 3D visualizations of key flight information, including airspace structure, traffic information, airport highlighting, and traffic patterns. While XR applications have been tested in aviation in the past, they are mostly limited to military aviation and the latest commercial aircraft. This overlooks the majority of pilots, particularly in general aviation, where such support could drastically increase situational awareness and lower pilot workload. General aviation is characterized as the non-commercial branch of aviation, often involving single-engine and single-pilot operations.
To develop applications usable across aviation domains, we plan to create a Unity project for XR glasses. As a first step, we plan to systematically and iteratively explore suitable AI-based support, based on pilot feedback, in a virtual reality study in a flight simulator. Based on our findings, we will refine the Unity application and investigate opportunities to conduct a real test flight with our external partner ENAC, the French National School of Civil Aviation, which owns a plane. Such a test flight would most likely use a recent augmented reality headset such as the HoloLens 2. Given the immense safety requirements for a real test flight, this part of the project is considered optional at this stage and depends on the findings of the preceding virtual reality evaluation.
The system development will focus in particular on the use of XR techniques to create more effective AI-supported traffic advisories and visualizations. With this, we want to advance the coordination and collaboration of AI with human partners, establishing a common ground as a basis for multimodal interaction with AI (WP3 motivated). Further, the MP relates closely to “Innovation projects (WP6&7 motivated)”, calling for solutions that address “real-world challenges and opportunities in various domains such as (…) transportation […]”.

Output

– Requirements and a prototype implementation for an AI-based assistant that provides recommendations and shows selected flight information based on pilot workload and current flight parameters
– A Unity project that implements an extended reality support tool for (general) aviation and that is used for evaluation in simulators (Virtual Reality) and possibly for a real test flight at ENAC (Augmented Reality)
– Findings from the simulator study and design recommendations
– (Optional) Impressions from a real test flight at ENAC
– A research paper detailing the system and the findings

Project Partners

  • Ludwig-Maximilians-Universität München (LMU), Florian Müller
  • Ecole Nationale de l'Aviation Civile (ENAC), Anke Brock

Primary Contact

Florian Müller, Ludwig-Maximilians-Universität München (LMU)