The project aims to develop a framework for multimodal and multilingual conversational agents. The framework is based on hierarchical levels of abilities:

– Reactive (sensori-motor) Interaction: Interaction is tightly coupled perception-action, where the actions of one agent are immediately sensed and interpreted by the other. Examples include greetings, polite conversation and emotional mirroring.

– Situated (Spatio-temporal) Interaction: Interactions are mediated by a shared model of objects and relations (states) and by shared models of roles and interaction protocols.

– Operational Interaction: Collective performance of tasks.

– Praxical Interaction: Sharing of knowledge about entities, relations, actions and tasks.

– Creative Interaction: Collective construction of theories and models that predict and explain phenomena.

In this microproject we focus on the first two levels (Reactive and Situated) and design the global framework architecture. The work performed in this project will be demonstrated in a proof of concept (PoC).

Output

OSS Framework (Levels 1 and 2)

Project Partners:

  • Università di Bologna (UNIBO), Paolo Torroni

 

Primary Contact: Eric Blaudez, THALES

Results Description

Our work considers the applicability of the MR-CKR framework to the task of generating challenging inputs for a machine learning model.

Here, MR-CKR is a symbolic reasoning framework for Multi-Relational Contextual Knowledge Repositories that we previously developed. Contextual means that we can (defeasibly) derive different conclusions in different contexts given the same data. This means that conclusions in one context can be invalidated in a more specific context. Multi-Relational means that a context can be "more specific" with respect to different independent aspects, such as regionality or time.
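To illustrate the contextual and multi-relational aspects in a simplified form, the following Python sketch (illustrative only, not the MR-CKR implementation, which is logic-based) shows how a conclusion derived in a general context can be defeasibly overridden in a more specific context along two independent relations:

from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    region: str   # e.g. "world" < "europe" < "germany"
    time: str     # e.g. "any" < "winter"

# Specificity orders for the two independent relations.
REGION_ORDER = {"world": 0, "europe": 1, "germany": 2}
TIME_ORDER = {"any": 0, "winter": 1}

def more_specific(c1: Context, c2: Context) -> bool:
    """c1 is at least as specific as c2 along both relations."""
    return (REGION_ORDER[c1.region] >= REGION_ORDER[c2.region]
            and TIME_ORDER[c1.time] >= TIME_ORDER[c2.time])

# Defeasible conclusions attached to contexts: same data, different conclusions.
RULES = {
    Context("world", "any"): {"roads_icy": False},
    Context("europe", "winter"): {"roads_icy": True},
}

def conclude(query_ctx: Context, atom: str):
    """Return the conclusion of the most specific applicable context."""
    applicable = [(c, concl[atom]) for c, concl in RULES.items()
                  if atom in concl and more_specific(query_ctx, c)]
    # More specific contexts override less specific ones.
    applicable.sort(key=lambda cv: (REGION_ORDER[cv[0].region], TIME_ORDER[cv[0].time]))
    return applicable[-1][1] if applicable else None

print(conclude(Context("germany", "winter"), "roads_icy"))  # True  (overridden)
print(conclude(Context("germany", "any"), "roads_icy"))     # False (global default)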

The general idea of generating challenging inputs for a machine learning model is the following: we have limited data on which we can train our model; thus, it is likely that the model does not cover all eventualities or does not have enough data in specific contexts to lead to the correct result. However, obtaining more data is often very difficult or even infeasible.

We introduce a new approach to solving this problem. Namely, given a set of diagnoses describing contexts in which the model performs poorly, we generate new inputs that are (i) in the described contexts and (ii) as similar as possible to a given starting input. (i) allows us to train the network in a targeted manner by feeding it exactly those cases that it struggles with. (ii) ensures that the new input only differs from the old one in those aspects that make the new input problematic for the model, thus allowing us to teach the model to recognize the aspects relevant to the answer.

This fits very well with the capabilities of MR-CKR: on the one hand, we have different contexts in which the inputs need to be modified to suit a different diagnosis of failure of the model. On the other hand, we can exploit the different relations by having one relation that specifies that inputs are more modifiable in one context than another and another relation that describes whether one diagnosis is a special case of another. Additionally, it allows us to incorporate global knowledge such that we can only modify inputs in such a manner that the result is still "realistic", i.e., satisfies the axioms in the global knowledge.
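A minimal sketch of this generation idea, with hypothetical scene facts and predicate names (the actual prototype encodes scenes, diagnoses and global axioms logically), could look as follows: we search for the smallest set of changes to a starting scene that places it inside a diagnosed failure context while still satisfying the global realism axioms.

from itertools import combinations

def realistic(scene):
    # Global knowledge: a scene cannot be daylight and night at the same time.
    return not ({"daylight", "night"} <= scene)

def diagnosis_night_cyclist(scene):
    # Context (diagnosis) in which the model is known to perform poorly.
    return {"night", "cyclist"} <= scene

CANDIDATE_TOGGLES = ["night", "daylight", "cyclist", "rain"]

def generate_similar(scene, diagnosis, max_edits=4):
    # Try ever larger edit sets; return the first (smallest) one that works.
    for k in range(max_edits + 1):
        for edits in combinations(CANDIDATE_TOGGLES, k):
            candidate = scene ^ set(edits)  # toggle each edited fact
            if diagnosis(candidate) and realistic(candidate):
                return candidate, k
    return None, None

start = {"daylight", "car_ahead"}
print(generate_similar(start, diagnosis_night_cyclist))
# ({'car_ahead', 'cyclist', 'night'}, 3): the minimal change that places the scene
# in the failure context while respecting the global axiom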

In this work, we provide a prototype specialized to generating similar and problematic scenes in the domain of Autonomous Driving.

This work fits well into Task 1.1 of WP1: "Linking symbolic and subsymbolic learning", since we use a symbolic approach to enable the use of domain knowledge in order to advance the performance of a subsymbolic model.

Furthermore, it also loosely fits into Task 1.5 of WP1: "Quantifying model uncertainty", since we can quantify how similar the generated new inputs are to the original ones.

Publications

ArXiv Technical Report:
Loris Bozzato, Thomas Eiter, Rafael Kiesel and Daria Stepanova (2023).
Contextual Reasoning for Scene Generation (Technical Report).
https://arxiv.org/abs/2305.02255

Links to Tangible results

Technical Report on formalization: https://arxiv.org/abs/2305.02255
Prototype implementation: https://github.com/raki123/MR-CKR
(The prototype has also been submitted as an AI Asset to AI4Europe)

Research at the intersection of artificial intelligence (AI) and extended reality (XR) has produced a substantial body of literature over the past 20 years. Applications cover a broad spectrum, for example, visualising neural networks in virtual reality or interacting with conversational agents. However, a systematic overview is currently missing.

This micro-project addresses this gap with a scoping review covering two main objectives: First, it aims to give an overview of the research conducted at the intersection of AI and XR. Second, we are particularly interested in revealing how XR can be used to improve interactive grounding in human-AI interaction. In summary, the review focuses on the following guiding questions: What are the typical AI methods used in XR research? What are the main use cases at the intersection of AI and XR? How can XR serve as a tool to enhance interactive grounding in human-AI interaction?

Output

Conference or journal paper co-authored by the proposers (possibly with other partners)

Dataset of the reviewed papers, including codes

Project Partners:

  • Københavns Universitet (UCPH), Teresa Hirzle
  • Københavns Universitet (UCPH), Kasper Hornbæk
  • Ludwig-Maximilians-Universität München (LMU), Florian Müller

 

Primary Contact: Teresa Hirzle, University of Copenhagen, Department of Computer Science

Results Description

We conducted a scoping review covering 311 papers published between 2017 and 2021. First, we screened 2619 publications from 203 venues to cover the broad spectrum of XR and AI research. For the search, we inductively built a set of XR and AI terms. The venues include research from XR, AI, Human-Computer Interaction, Computer Graphics, Computer Vision, and others. After a two-phase screening process, we reviewed and extracted data from 311 full papers based on a codebook with 26 codes about the research direction, contribution, and topics of the papers, as well as the algorithms, tools, datasets, models, and data types the researchers used to address research questions on XR and AI. The extracted data for these codes form the basis for our predominantly narrative synthesis. As a result, we found five main topics at the intersection of XR and AI: (1) Using AI to create XR worlds (28.6%), (2) Using AI to understand users (19.3%), (3) Using AI to support interaction (15.4%), (4) Investigating interaction with intelligent virtual agents (IVAs) (8.0%), and (5) Using XR to support AI research (2.3%). The remaining 23.8% of the papers apply XR and AI to an external problem, such as medical training applications (3.5%) or simulation purposes (3.0%). Finally, we summarise our findings in 13 research opportunities and present ideas and recommendations for how to address them in future work. Some of the most pressing issues are a lack of generative use of AI to create worlds, understand users, and enhance interaction, a lack of generalisability and robustness, and a lack of discussion about ethical and societal implications.
In terms of the call topics, we analysed whether XR can serve as a tool to establish and enhance interactive grounding in human-AI interaction. Here, we found that there is a lack of understanding of user experience during human-AI interaction using XR technology. Typically, AI is used for content creation and to enhance interaction techniques. However, we did not find many papers that use XR to support human-AI interaction. There are some works that look into artificial agents and how an interaction with them can be realised through XR. However, most of these works do not yet work in real time and are mostly based on mock-up scenes.

Publications

Teresa Hirzle, Florian Müller, Fiona Draxler, Martin Schmitz, Pascal Knierim, and Kasper Hornbæk. 2023. When XR and AI Meet – A Scoping Review on Extended Reality and Artificial Intelligence. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23), April 23–28, 2023, Hamburg, Germany. ACM, New York, NY, USA, 45 pages. https://doi.org/10.1145/3544548.3581072

Links to Tangible results

Paper: https://thirzle.com/pdf/chi23_xrai_scoping_review_hirzle.pdf
Reviewed Papers and Coding Spreadsheet: https://thirzle.com/supplement/chi23_xrai_scoping_review_hirzle.zip
Videos: There will be talk videos at a later stage.

This MP studies the problem of how to alert a human user to a potentially dangerous situation, for example for handovers in automated vehicles. The goal is to develop a trustworthy alerting technique that has high accuracy and minimal false alerts. The challenge is to decide when to interrupt, because false positives and false negatives will lower trust. However, knowing when to interrupt is hard: one must take into account both the driving situation and the driver's ability to react given the alert, and this inference must be made from impoverished sensor data. The key idea of this MP is to model this as a partially observable stochastic game (POSG), which allows approximate solutions to a problem in which we have two adaptive agents (human and AI). The main outcome will be an open library called COOPIHC for Python, which allows modeling different variants of this problem.

Output

COOPIHC library (Python)

Paper (e.g., IUI’23 or CHI’23)

Project Partners:

  • Aalto University, Antti Oulasvirta
  • Centre national de la recherche scientifique (CNRS), Julien Gori

Primary Contact: Antti Oulasvirta, Aalto University

Results Description

When is an opportune moment to alert a human partner? This question is hard, because the beliefs and cognitive state of the human should be taken into account when choosing if and when to alert. Every alert is interruptive and bears a cost to the human. However, especially in safety-critical domains, the consequences of not alerting can be infinitely negative. In this work, we formulate the optimal alerting problem based on the theory of partially observable stochastic games. The problems of the assistant and of the user are formulated and solved jointly as a single POSG. We presented first results using a gridworld environment, comparing different types of alerting agents, and outlined a roadmap for future work using realistic driving simulators. These models can inform handover/takeover decisions in semi-automated vehicles. The results were integrated into COOPIHC, a multiagent solver for interactive AI.
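The core trade-off can be illustrated with a hedged toy decision rule (hypothetical numbers and function names, not the COOPIHC API): the assistant maintains a belief about the driver's attentiveness and alerts only when the expected cost of staying silent exceeds the interruption cost.

def expected_cost_of_silence(belief_attentive, hazard_prob,
                             cost_missed_hazard=100.0):
    # If the driver is attentive they handle the hazard themselves;
    # if not, an unalerted hazard incurs a large cost.
    p_unhandled = hazard_prob * (1.0 - belief_attentive)
    return p_unhandled * cost_missed_hazard

def should_alert(belief_attentive, hazard_prob, interruption_cost=2.0):
    return expected_cost_of_silence(belief_attentive, hazard_prob) > interruption_cost

# The belief would be inferred from impoverished sensors (e.g., gaze-off-road time).
print(should_alert(belief_attentive=0.9, hazard_prob=0.05))  # False: likely a false alarm
print(should_alert(belief_attentive=0.2, hazard_prob=0.30))  # True: driver likely unaware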

Publications

Hossein, Firooz (2022). AI-Assisted for Modeling Multitasking Driver. Thesis submitted for examination for the degree of Master of Science in Technology at Aalto University.

Links to Tangible results

The MP contributed to CoopIHC, a multi-agent framework for human-AI interaction based on partially observable stochastic games (POSGs), developed between CNRS and Aalto University:
https://jgori-ouistiti.github.io/CoopIHC/

The broad availability of 3D-printing enables end-users to rapidly fabricate personalized objects. While the actual manufacturing process is largely automated, users still need knowledge of complex design applications to not only produce ready-designed objects, but also adapt them to their needs or even design new objects from scratch.

In this project, we explore an AI-powered system that assists users in creating 3D objects for digital fabrication. For this, we propose to use natural language processing (NLP) to enable users to describe objects using their natural language (e.g., "A green rectangular box."). In this micro project, we conduct a Wizard-of-Oz study to elicit the requirements for such a system. The task of the participants is to recreate a given object using a spoken description with iterative refinements. We expect that this work will support the goal to make personal digital fabrication accessible for everyone.

Output

Requirements for voice-based 3D design

dataset

Design specification for an NLP model to support voice-based 3D design

Project Partners:

  • Ludwig-Maximilians-Universität München (LMU), Florian Müller/Albrecht Schmidt

Primary Contact: Florian Müller, LMU Munich

Results Description

Manufacturing tools like 3D printers have become accessible to the wider society, making the promise of digital fabrication for everyone seemingly reachable. While the actual manufacturing process is largely automated today, users still require knowledge of complex design applications to not only produce ready-designed objects, but also adapt them to their needs or design new objects from scratch. To lower the barrier for the design and customization of personalized 3D models, we imagine an AI-powered system that assists users in creating 3D objects for digital fabrication. Reaching this vision requires a common understanding – a common ground – between the users and the AI system.

As a first step, in this micro project, we explored novices’ mental models in voice-based 3D design by conducting a high-fidelity Wizard of Oz study with 22 participants without skills in 3D design. We asked the participants to perform 14 tasks revolving around some basic concepts of 3D design for digital modeling, like the creation of objects, the manipulation of objects (e.g., scaling, rotating, and/or moving objects), and the creation of composite objects. We performed a thematic analysis of the collected data assessing how the mental model of novices translates into voice-based 3D design.

We found that future AI assistants to support novice users in voice-based digital modeling must (a small illustrative sketch of one of these requirements follows the list):
– manage the corrections users make during and after commands to fix certain errors;
– deal with vague and incomplete commands, either by automatically completing them with sensible defaults or by asking the users for clarification;
– consider novices' prior knowledge, for example, about the use of undo and redo functions;
– provide only a simplified set of operations for creating simple and composite 3D objects;
– design a workflow similar to what novices would do if they were building real objects, for example, providing wizard procedures that guide novices in designing composite 3D models starting from the bottom;
– provide different commands to select 3D objects;
– understand and execute chained commands;
– understand commands that are relative to the user's point of view;
– grant multiple ways to refer to the axes, for example, by using their names, colors, and user direction;
– favor explicit trigger words to avoid unintentional activation of the voice assistant;
– embrace diversity in naming approaches, since novices often use other words to refer to 3D objects.
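As an illustration of the second requirement (completing vague commands with defaults or asking for clarification), the following sketch uses a hypothetical command vocabulary and is not the Wizard-of-Oz system used in the study:

import re

DEFAULTS = {"size": 10, "color": "gray", "shape": None}

def parse_create_command(utterance: str):
    """Very small rule-based parse of e.g. 'Create a green box, 20 mm wide'."""
    spec = dict(DEFAULTS)
    shapes = {"box": "cube", "cube": "cube", "sphere": "sphere",
              "ball": "sphere", "cylinder": "cylinder"}
    text = utterance.lower()
    for word, shape in shapes.items():
        if word in text:
            spec["shape"] = shape
    m = re.search(r"(\d+)\s*mm", text)
    if m:
        spec["size"] = int(m.group(1))
    for color in ("red", "green", "blue", "yellow", "black", "white"):
        if color in text:
            spec["color"] = color
    if spec["shape"] is None:
        # Incomplete command: ask for clarification instead of guessing the shape.
        return None, "Which shape would you like me to create?"
    return spec, None

print(parse_create_command("Create a green box, 20 mm wide"))
# ({'size': 20, 'color': 'green', 'shape': 'cube'}, None)
print(parse_create_command("Create something small"))
# (None, 'Which shape would you like me to create?')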

Publications

The paper was rejected at CHI ’23 and is currently under submission to INTERACT ’23.

Links to Tangible results

The transcribed and coded data we collected in our study, together with the codebook. We plan to make this data available to the community once the paper is published. For now, please keep it confidential.

https://syncandshare.lrz.de/getlink/fiEFHiEQVQYtHDj5ZSWBdp/

The communication between patients and healthcare institutions is increasingly moving to digital applications. Whereas information about the patient’s wellbeing is typically collected by means of a questionnaire, this is a tedious task for many patients, especially when it has to be done periodically, and may result in incomplete or imprecise input. Much can be gained by making the process of filling in such questionnaires more interactive, by deploying a conversational agent that can not only ask the questions, but also ask follow-up questions and respond to clarification questions by the user. We propose to deploy and test such a system.

Our proposed research aligns well with the WP3 focus on human-AI communication and will lead to re-usable conversation patterns for conducting questionnaires in healthcare. The work will benefit from existing experience with patient-provider communication within Philips and will build on the SUPPLE framework for dialog management and sequence expansion.

Output

A dataset on conversation(s) between a patient and a conversational AI

A dialog model derived from the dataset

Scientific publication

Project Partners:

  • Philips Electronics Nederland B.V., Aart van Halteren
  • Stichting VU, Koen Hindriks

 

Primary Contact: Aart van Halteren, Philips Research

When we go for a walk with friends, we can observe that our movements unconsciously synchronize. This is a crucial aspect of human relations that is known to build trust, liking, and the feeling of connectedness and rapport. In this project, we explore if and how this effect can enhance the relationship between humans and AI systems by increasing the sense of connectedness in the formation of techno-social teams working together on a task.

To evaluate the feasibility of this approach, we plan to build a physical object representing an AI system that can bend in two dimensions to synchronize with the movements of humans. Then, we plan to conduct an initial evaluation in which we will track the upper body motion of the participants and use this data to compute the prototype movement using different transfer functions (e.g., varying the delay and amplitude of the movement).
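As a rough illustration of such a transfer function (assumed frame rate and parameter values; not the prototype's actual control code), the robot's bending angle could follow the tracked upper-body angle with a configurable delay and amplitude gain:

from collections import deque

class SyncTransfer:
    def __init__(self, delay_frames: int = 15, gain: float = 0.8):
        # Ring buffer of past human angles realises the delay.
        self.buffer = deque([0.0] * delay_frames, maxlen=delay_frames)
        self.gain = gain

    def step(self, human_angle_deg: float) -> float:
        """Feed the current human lean angle, get the delayed and scaled robot angle."""
        delayed = self.buffer[0]
        self.buffer.append(human_angle_deg)
        return self.gain * delayed

transfer = SyncTransfer(delay_frames=30, gain=0.6)  # roughly 1 s delay at 30 fps
for angle in [0.0, 5.0, 10.0, 12.0, 8.0]:
    robot_angle = transfer.step(angle)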

Output

Physical prototype that employs bending for synchronization

Study results on the feasibility of establishing trust with embodied AI systems through motion synchronization.

Publication of the results

 

Primary Contact: Florian Müller, LMU Munich

Results Description

When we go for a walk with friends, we can observe an interesting effect: from step lengths to arm movements, our movements unconsciously align; they synchronize. Prior research in social psychology found that this synchronization is a crucial aspect of human relations that strengthens social cohesion and trust. In this micro project, we explored if and how this effect generalizes beyond human-human relationships. We hypothesized that synchronization can enhance the relationship between humans and AI systems by increasing the sense of connectedness in the formation of techno-social teams working together on a task. To evaluate the feasibility of this approach, we built a prototype of a simple non-humanoid robot as an embodied representation of an AI system. The robot tracks the upper body movements of people in its vicinity and can bend to follow human movements and vary the movement synchronization patterns. Using this prototype, we conducted a controlled experiment with 51 participants exploring our concept in a between-subjects design. Using an established questionnaire on trust between people and automation, we found significantly higher trust ratings for synchronized movements. However, we could not find an influence on the willingness to spend money in a trust game inspired by behavioral economics. Taken together, our results strongly suggest a positive effect of synchronized movement on the participants’ feeling of trust toward embodied AI representations.

Publications

To appear May 2023:

Wieslaw Bartkowski, Andrzej Nowak, Filip Ignacy Czajkowski, Albrecht Schmidt, and Florian Müller. 2023. In Sync: Exploring Synchronization to Increase Trust Between Humans and Non-humanoid Robots. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23), April 23–28, 2023, Hamburg, Germany. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3544548.3581193

Links to Tangible results

Paper: https://syncandshare.lrz.de/getlink/fiAUnSaqgQJyEdcw5XJqpN/in_sync_final.pdf
Video 30 sec: https://syncandshare.lrz.de/getlink/fiRjwbk1AoYxKujaEaZ5ax/in_sync_video_short.mp4
Video Full: https://syncandshare.lrz.de/getlink/fiGEX3bGahhbzrChUiXqvL/in_sync_video_full.mp4
Repository (to be filled by the time of publication of the paper): https://github.com/wbartkowski/In-Sync-Robot-Prototype

We study proactive communicative behavior, where robots provide information to humans which may help them to achieve desired outcomes, or to prevent possible undesired ones. Proactive behavior is an under-addressed area in AI and robotics, and proactive human-robot communication even more so. We will combine the past expertise of Sorbonne Univ. (intention recognition) and Örebro Univ. (proactive behavior) to define proactive behavior based on the understanding of the user’s intentions, and then extend it to consider communicative actions based on second-order perspective awareness.

We propose an architecture able to (1) estimate the human's intended goal, (2) infer the robot’s and the human’s knowledge about foreseen possible outcomes of the intended goal, (3) detect opportunities for the robot to act proactively based on the desirability of the intended goal, and (4) select an action from the listed opportunities. The theoretical underpinning of this work will contribute to the study of theory of mind in HRI.
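A hedged sketch of this four-step loop is given below; the goals, world state and function names are hypothetical, while the actual architecture couples intention recognition with equilibrium maintenance:

def estimate_intention(observed_actions):
    # (1) e.g., inference over candidate goals from observed actions
    return "make_tea" if "grab_kettle" in observed_actions else "unknown"

def foresee_outcomes(goal, world_state):
    # (2) what the robot (and presumably the human) expects to happen
    if goal == "make_tea" and world_state.get("kettle_empty", False):
        return ["kettle_boils_dry"]
    return []

def detect_opportunities(outcomes):
    # (3) outcomes undesirable for the human are opportunities to act proactively
    return [o for o in outcomes if o in {"kettle_boils_dry"}]

def select_action(opportunities):
    # (4) pick a communicative action addressing the opportunity
    if "kettle_boils_dry" in opportunities:
        return "say('The kettle is empty, you may want to fill it first.')"
    return None

observed = ["enter_kitchen", "grab_kettle"]
world = {"kettle_empty": True}
goal = estimate_intention(observed)
action = select_action(detect_opportunities(foresee_outcomes(goal, world)))
print(goal, action)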

Output

Jupyter Notebook / Google Colab that presents the code of the proposed architecture and provides plug-and-play interaction.

a manuscript describing the proposed architecture and initial findings of the experiment

Presentations

Project Partners:

  • Sorbonne Université, Mohamed CHETOUANI
  • Örebro University (ORU), Alessandro Saffiotti and Jasmin Grosinger

Primary Contact: Mohamed CHETOUANI, Sorbonne University

Main results of micro project:

The goal of this micro-project is to develop a cognitive architecture able to generate proactive communicative behaviors during human-robot interactions. The general idea is to provide information to humans which may help them to achieve desired outcomes, or to prevent possible undesired ones. Our work proposes a framework that generates and selects among opportunities for acting based on recognizing human intention, predicting environment changes, and reasoning about what is desirable in general. Our framework has two main modules to initiate proactive behavior: intention recognition and equilibrium maintenance.
The main achievements are:
– Integration of two systems, user intention recognition and equilibrium maintenance, in a generic architecture
– Showing the stability of the architecture across many users
– A reasoning mechanism with second-order perspective awareness
The next steps will aim to show knowledge repair, prevent undesired outcomes due to lack of knowledge, and improve trustability, transparency and legibility (user study)

Contribution to the objectives of HumaneAI-net WPs

– A playground system in which HumaneAI-net partners can define their own interactive scenarios to experiment with the robot’s proactivity.

– T3.3: A study of how to model human rationality in order to detect and use computationally defined human beliefs, goals and intentions, and then use that model to make robots proactive. A human-in-the-loop system to support cooperative behavior of robots sharing the environment by generating proactive communication.

– T3.1: A study relating robots that generate proactive communication to possible effects on human cognition and interaction strategies.

Tangible outputs

Many industrial NLP applications emphasise the processing and detection of nouns, especially proper nouns (Named Entity Recognition, NER). However, processing of verbs has been neglected in recent years, even though it is crucial for the development of full NLU systems, e.g., for the detection of intents in spoken language utterances or events in written language news articles. The META-O-NLU microproject focuses on proving the feasibility of a multilingual event-type ontology based on classes of synonymous verb senses, complemented with semantic roles and links to existing semantic lexicons. Such an ontology shall be usable for content- and knowledge-based annotation, which in turn shall allow for developing NLU parsers/analyzers. The concrete goal is to extend the existing Czech-English SynSemClass lexicon (which displays all the necessary features, but only for two languages) by German and Polish, as a first step to show it can be extended to other languages as well.

Output

Common paper co-authored by the proposers (possibly with other partners)

Extended version of SynSemClass (entries in additional languages)

Presentations

Project Partners:

  • Charles University Prague, Jan Hajič
  • German Research Centre for Artificial Intelligence (DFKI), Georg Rehm

Primary Contact: Jan Hajič, Univerzita Karlova (Charles University, CUNI)

Main results of micro project:

The main result of the META-O-NLU microproject is the extension of the original SynSemClass dataset by German classes, or more precisely, the inclusion of German verbs and event descriptors in the existing classes of SynSemClass. Together with the individual verbs, links to existing German lexical resources (GermaNet, E-VALBU and GUP) have been added. Adding a third language demonstrated that future extension to other languages is feasible, in terms of the annotation rules, the dataset itself, and a new web browser that can show all language entries alongside each other with all the external links. The data is freely available in the LINDAT/CLARIAH-CZ repository (and soon also through the European Language Grid), and a web browser for the resource is now also available.
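To give a flavour of the data structure (field names and identifiers below are illustrative and do not reflect the actual SynSemClass schema), a multilingual class entry can be thought of as a class identifier with shared semantic roles and per-language member verbs carrying links to external lexicons; extending the lexicon to a new language then only adds members to existing classes:

# Hypothetical representation of one multilingual class entry.
synsemclass_entry = {
    "class_id": "vec00123",                     # illustrative class identifier
    "roles": ["Agent", "Recipient", "Theme"],   # semantic roles shared by the class
    "members": {
        "en": [{"lemma": "give", "links": {"FrameNet": "Giving"}}],
        "cs": [{"lemma": "dát", "links": {"Vallex": "dát-1"}}],
        "de": [{"lemma": "geben", "links": {"GermaNet": "geben", "E-VALBU": "geben"}}],
    },
}

# Adding a new language only extends the "members" map, which is what makes the
# extension to German (and later Polish) feasible without changing the class inventory.
synsemclass_entry["members"]["pl"] = [{"lemma": "dawać", "links": {}}]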

Contribution to the objectives of HumaneAI-net WPs

Task 3.6 focuses on both spoken and written language-based interactions (dialogues, chats), in particular on questions of multilinguality that are essential to the European vision of human-centric AI. The results of this microproject contribute especially to the multilinguality issue and are directed towards full NLU (Natural Language Understanding) by describing event types, for which no general ontology exists yet. The resulting resource will be used for both text and dialog annotation, to allow for evaluation and possibly also for training of NLU systems.

Tangible outputs

The aim of the project is to investigate both the theoretical and empirical roles of agency in successful human-computer partnerships. For human-centred AI research, understanding agency is a key factor in achieving effective collaboration. Although recent advances in AI have enabled systems to successfully contribute to human-computer interaction, we are interested in extending this such that the interaction acts more like a ‘partnership’. This requires building systems with collaborative agency that users can manipulate in the process. Research questions include: 1) which parameters are relevant to describing system agency, 2) what impact these parameters have on perceived agency, and 3) how to modify them in order to achieve different roles of systems in a process.

Output

Theoretical: Literature review on agency / research paper / define parameters

Empirical: Demo (paper, video, interactive)

Project Partners:

  • Institut national de recherche en sciences et technologies du numérique (INRIA), Janin Koch
  • Ludwig-Maximilians-Universität München (LMU), Albrecht Schmidt
  • Københavns Universitet (UCPH), Kasper Hornbaek
  • Stichting VU, Koen Hindriks
  • Umeå University (UMU), Helena Lindgren

Primary Contact: Janin Koch, Inria

Attachments

Agency_MicroProject_Koch_Mackay_March17.mov

Exloring the Impact of Agency INRIA J Koch Agency_MP3_Berlin.mov

Results Description

The project's objective was to examine the theoretical and practical contributions of agency to successful human-computer partnership. Understanding agency is critical for establishing effective collaboration in human-centered AI research.
The goal of this project was to 1) produce an overview of existing HCI and AI literature on agency, 2) hold a workshop to brainstorm and categorize agency components, define metrics and experimental protocols, and 3) create interactive demos demonstrating various forms of human and system agency.

We conducted individual and collaborative brainstorming sessions with all participants to create an initial overview of current literature in order to establish a common starting point (1). We talked about potential overlaps in our work and how such perspectives influence our current work.
We will hold a workshop at CHI'23 on 'Integrating AI in Human-Human Collaborative Ideation' to examine the role AI can play in such interactive environments, in order to identify distinct dimensions and measures of agency within human-AI interaction (2) [Shin et al., 2023].
Umeå has also investigated how conversations between a human and a socially intelligent robot in a home environment can influence perceptions of agency [Tewari and Lindgren, 2022] and the importance of goal setting in such a scenario [Lindgren and Weck, 2022; Kilic et al., 2023] (3).

While the project is still underway, we will report our findings in the second half of this year.

Publications

Tewari M and Lindgren H (2022), Expecting, understanding, relating, and interacting – older, middle-aged and younger adults’ perspectives on breakdown situations in human–robot dialogues. Front. Robot. AI 9:956709. doi: 10.3389/frobt.2022.956709.

Kilic K, Weck S, Kampik T, Lindgren H. Argument-Based Human-AI Collaboration for Supporting Behavior Change to Improve Health. to appear in Front. AI, 2023.

Lindgren H and Weck S. 2022. Contextualising Goal Setting for Behaviour Change – from Baby Steps to Value Directions. In 33rd European Conference on Cognitive Ergonomics (ECCE2022), October 4–7, 2022, Kaiserslautern, Germany. ACM, New York, NY, USA. https://doi.org/10.1145/3552327.3552342

Joongi Shin, Janin Koch, Andrés Lucero, Peter Dalsgaard, Wendy E. Mackay. Integrating AI in Human-Human Collaborative Ideation. CHI 2023 – SIGCHI conference on Human Factors in computing systems, Apr 2023, Hamburg, Germany. pp.1-5. ⟨hal-04023507⟩

Links to Tangible results

https://www.frontiersin.org/articles/10.3389/frobt.2022.956709/full

https://www.frontiersin.org/articles/10.3389/frai.2023.1069455/full

https://dl.acm.org/doi/abs/10.1145/3552327.3552342

To be released in the proceedings of CHI

We propose to research how autobiographical recall can be detected in virtual reality (VR). In particular, we experimentally investigate what physiological parameters accompany interaction with autobiographical memories in VR. We consider VR as one important representation of Human-AI collaboration.

For this, we plan to (1) record an EEG data set of people’s reactions and responses when recalling an autobiographical memory, (2) label the data set, and (3) do an initial analysis of the dataset to inform the design of autobiographical VR experiences. We would try to automate data collection as much as possible to make it easy to add more data over time.

This will contribute to a longer-term effort in model and theory formation. The main contribution is to WP3. The work is set in Task 3.2 (Human-AI interaction/collaboration paradigms) and aims at better understanding user emotion in VR in order to model self-relevance in AI collaboration (Task 3.4).

Output

dataset on autobiographical recall in VR

a manuscript describing the data set and initial insights into autobiographical recall in VR

Presentations

Project Partners:

  • Ludwig-Maximilians-Universität München (LMU), Albrecht Schmidt
  • German Research Centre for Artificial Intelligence (DFKI), Paul Lukowicz and Patrick Gebhard

Primary Contact: Albrecht Schmidt, Ludwig-Maximilians-Universität München

Main results of micro project:

We have developed VR experiences for research on autobiographical recall in virtual reality (VR). This allows us to experimentally investigate what physiological parameters accompany self-relevant memories elicited by digital content. We have piloted the experiment and are currently recording more data on the recall of autobiographical memories. After data collection is complete, we will label the data set, and do an initial analysis of the dataset to inform the design of autobiographical VR experiences. We have also co-hosted a Workshop on AI and human memory.

Contribution to the objectives of HumaneAI-net WPs

The main contribution is to WP3. The work is set in Task 3.2 (Human-AI interaction/collaboration paradigms) and aims at better understanding user emotion in VR in order to model self-relevance in AI collaboration (Task 3.4). The VR experience is implemented in Unity, and we are happy to share it in the context of a joint project.

Tangible outputs

In this micro-project, we propose investigating human recollection of team meetings and how conversational AI could use this information to create better team cohesion in virtual settings.

Specifically, we would like to investigate how a person's emotion, personality, relationship to fellow teammates, goal, and position in the meeting influence how they remember the meeting. We want to use this information to create memory-aware conversational AI that could leverage such data to increase team cohesion in future meetings.

To achieve this goal, we plan first to record a multi-modal dataset of team meetings in a virtual setting. Second, administer questionnaires to participants at different time intervals following a session. Third, annotate the corpus. Fourth, carry out an initial corpus analysis to inform the design of memory-aware conversational AI.

This micro-project will contribute to a longer-term effort in building a computational memory model for human-agent interaction.

Output

A corpus of repeated virtual team meetings (6 sessions, spaced 1 week apart)

manual annotations (people’s recollection of the team meeting etc.)

automatic annotations (e.g. eye-gaze, affect, body posture etc.)

A paper describing the corpus and insights gained on the design of memory-aware agents from initial analysis

Project Partners:

  • TU Delft, Catholijn Jonker
  • Eötvös Loránd University (ELTE), Andras Lorincz

Primary Contact: Catharine Oertel, TU Delft

Main results of micro project:

1) A corpus of repeated virtual team meetings (4 sessions, spaced 4 days apart).
2) Manual annotations (people's recollection of the team meeting etc.)
3) Automatic annotations (e.g. eye-gaze, affect, body posture etc.)
4) A preliminary paper describing the corpus and insights gained on the design of memory-aware agents from initial analysis

Contribution to the objectives of HumaneAI-net WPs

In this micro-project, we propose investigating human recollection of team meetings and how conversational AI could use this information to create better team cohesion in virtual settings.
Specifically, we would like to investigate how a person's emotion, personality, relationship to fellow teammates, goal, and position in the meeting influence how they remember the meeting. We want to use this information to create memory-aware conversational AI that could leverage such data to increase team cohesion in future meetings.
To achieve this goal, we plan first to record a multi-modal dataset of team meetings in a virtual setting. Second, administer questionnaires to participants at different time intervals following a session. Third, annotate the corpus. Fourth, carry out an initial corpus analysis to inform the design of memory-aware conversational AI.
This micro-project will contribute to a longer-term effort in building a computational memory model for human-agent interaction.

Tangible outputs

  • Dataset: MEMO – Catharine Oertel
  • Publication: MEMO dataset paper – Catharine Oertel
  • Program/code: Memo feature extraction code – Andras Lorincz

We propose to research a scalable human-machine collaboration system with the common goal of executing high-quality actions (e.g., in rehabilitation exercise). We combine video and speech for video-grounded, goal-oriented dialogue. We build on our video and text database, which contains exercises for rehabilitation following knee injuries. We evaluate high-performance body pose estimation tools and compare them to a real-time body pose estimation tool to be developed for smartphones via ‘knowledge distillation’ methods.
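As a rough sketch of the knowledge distillation step (assumed tensor shapes and loss weighting; not the project's actual training code), a small student pose network can be trained to match both ground-truth keypoint heatmaps and the heatmaps of a large teacher network:

import torch
import torch.nn.functional as F

def distillation_loss(student_heatmaps, teacher_heatmaps, gt_heatmaps, alpha=0.5):
    """Blend supervision from ground-truth keypoint heatmaps and the teacher."""
    loss_gt = F.mse_loss(student_heatmaps, gt_heatmaps)
    loss_teacher = F.mse_loss(student_heatmaps, teacher_heatmaps)
    return alpha * loss_gt + (1.0 - alpha) * loss_teacher

# Example shapes: batch of 8 frames, 17 keypoints, 64x48 heatmaps.
student = torch.rand(8, 17, 64, 48, requires_grad=True)
teacher = torch.rand(8, 17, 64, 48)
gt = torch.rand(8, 17, 64, 48)
loss = distillation_loss(student, teacher, gt)
loss.backward()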

The complementing part of the project deals with the texts that we have collected for these exercises and estimates the amount of text needed for dialogues that can guide and correct the quality of exercises. Potential topics/intents include pose relative to the camera, proper light conditions, audio-visual information about pain, notes about execution errors, errors discovered by the computer evaluations, requests for additional information from the patient, and reactions to other, unrelated queries.

Output

Dataset of the dialogues

Publication on the constraints and potentials of existing state-of-the-art methods

Performance evaluation methods and usability studies

Presentations

Project Partners:

  • Eötvös Loránd University (ELTE), András Lőrincz
  • Charles University Prague, Ondřej Dušek

Primary Contact: András Lőrincz, Eotvos Lorand University

Main results of micro project:

Machine-assisted physical rehabilitation is of special interest since (a) it is a relatively narrow field, (b) observation and interaction are multimodal, involving natural language processing, video processing, and speech recognition and generation, and (c) it is a critical medical application. We considered rehabilitation after total knee replacement as a prototype scenario. We used 2D and RGBD cameras and dialogue systems at three levels:
(i) video-based feedback aimed both at documentation and at helping performance improvements,
(ii) an additional rule-based dialogue with specific error detection, and
(iii) extensions with a data-driven dialogue system based on the DialoGPT language model.
We argue that the time is ripe to revitalize existing practices using recent advances in machine learning.
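For level (iii), a hedged sketch of generating a follow-up turn with DialoGPT via the Hugging Face transformers library is shown below; the prompt, model size and sampling settings are illustrative and not the project's configuration:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Previous patient turn (illustrative prompt).
history = "My knee hurts when I bend it during the exercise."
input_ids = tokenizer.encode(history + tokenizer.eos_token, return_tensors="pt")

# Generate a reply; sampling settings are arbitrary defaults for illustration.
output_ids = model.generate(
    input_ids,
    max_length=100,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_p=0.9,
)
reply = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(reply)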

Contribution to the objectives of HumaneAI-net WPs

Video-based dialogue systems meet the goals of the Foundations of Human-AI interactions, whereas the rehabilitation scenario is a prototype for goal-oriented collaboration. The microproject targeted specific topics, including
(i) body motion and pain, both
— in terms of language and potential dialogues and
— in more than 400 video samples that included 50 exercises and, on average, about 7 errors per motion type, to be detected alone or in combination,
and
(ii) dialogues
— from experts and
— from crowdsourcing-based dialogue enhancements

Tangible outputs

  • Publication: DeepRehab: Real Time Pose Estimation on the Edge for Knee Injury Rehabilitation – Bruno Carlos Dos Santos Melício, Gábor Baranyi, Zsófia Gaal, Sohil Zidan, and Andras Lőrincz
    https://e-nns.org/icann2021/
  • Publication: Multimodal technologies for machine-assisted physical rehabilitation – Ondrej Dusek, András Simonyi, Dániel Sindely, Levente Juhász, Gábor Baranyi, Tomas Nekvinda, Márton Véges, Kinga Faragó, András Lőrincz
    submitted