Transformers and self-attention (Vaswani et al., 2017) have become the dominant approach for natural language processing (NLP), with systems such as BERT (Devlin et al., 2019) and GPT-3 (Brown et al., 2020) rapidly displacing the more established RNN and CNN structures in favor of stacked encoder-decoder modules built on self-attention.
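As a quick illustration of the mechanism at the heart of these architectures, the following minimal sketch computes scaled dot-product self-attention in NumPy; the shapes and variable names are our own and stand in for the full multi-head, multi-layer models cited above.

    # Minimal sketch of scaled dot-product self-attention (Vaswani et al., 2017)
    # in NumPy; shapes and variable names are our own illustration.
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
        Q, K, V = X @ Wq, X @ Wk, X @ Wv            # queries, keys, values
        scores = Q @ K.T / np.sqrt(K.shape[-1])     # pairwise token similarity
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)          # softmax over the keys
        return w @ V                                # mix value vectors per token

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))                    # 5 "tokens", dimension 16
    Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)      # (5, 8)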

This micro-project will provide tools and data sets for experiments and a first demonstration of the potential of transformers for multimodal perception and multimodal interaction. We will define research challenges, benchmark data sets and performance metrics for multimodal perception and interaction tasks such as (1) audio-visual narration of scenes, cooking actions and activities, (2) audio-video recordings of lectures and TV programs, (3) audio-visual deictic (pointing) gestures, and (4) perception and evocation of engagement, attention, and emotion.

(Full description and bibliography, about 200 words, available on request.)

Output

Benchmark data and performance targets for a phased set of research challenges of increasing difficulty.

Tools for experiments to explore the use of embeddings, encoder-decoders, self-attention architectures and related problems associated with applying transformers to different modalities.

Concept demonstrations for simple examples of multimodal perception.

Presentations

Project Partners:

  • Institut national de recherche en sciences et technologies du numérique (INRIA), James Crowley
  • Eötvös Loránd University (ELTE), Andras Lorincz
  • Université Grenoble Alpes (UGA), Fabien Ringeval
  • Centre national de la recherche scientifique (CNRS), François Yvon
  • Institut “Jožef Stefan” (JSI), Marko Grobelnik

Primary Contact: James Crowley, INRIA

Main results of micro project:

This micro-project will survey tools and data sets for experiments demonstrating the potential use of transformers for multimodal perception and multimodal interaction. We will define research challenges and performance metrics for multimodal perception and interaction tasks such as audio-visual narration of scenes, cooking actions and activities, audio-visual deictic (pointing) gestures, and perception and evocation of engagement, attention, and emotion. We will provide tutorials on the use of transformers for multimodal perception and interaction.

Contribution to the objectives of HumaneAI-net WPs

This micro-project will aid and encourage the use of transformers and self-attention for multimodal interaction by Humane AI Net researchers, by identifying relevant tools and benchmark data sets, by providing tutorials and training materials for education, and by identifying research challenges for multimodal perception and interaction with transformers.

Tangible outputs

  • Dataset: A survey of tools and data sets for multimodal perception with transformers – James Crowley
  • Other: A tutorial on the use of transformers for multimodal perception. – Francois Yvon
  • Other: Research challenges for the use of transformers for multimodal perception and interaction. – James Crowley

Robots are already in wide use in industrial settings, where interactions with people are well structured and stable. Interactions with robots in home settings are notoriously more difficult. The context of interaction changes over time, depending on the people present, the time of day, the event going on, etc. To cope with all these factors creating uncertainty and ambiguity, people use practices, norms, conventions, etc. to normalize and package certain interactions into standard types of actions performed in order by the parties involved, e.g. getting coffee.

Within this project we will explore how the idea of social practices, which regulate interactions and create expectations in the parties involved, can be used to guide robots in their interactions with people. We will implement a simple scenario with a Pepper robot to uncover the practical obstacles in applying these concepts to robotics.

Output

Three MSc thesis reports

Demo software

Documented example on the AI4EU platform

Presentations

Project Partners:

  • Umeå University (UMU), Frank Dignum
  • Örebro University (ORU), Alessandro Saffiotti

Primary Contact: Frank Dignum, Umeå University

Main results of micro project:

There is a first prototype of the use of social practices in the interaction between a robot and humans. It shows that following a social practice can help in planning the interaction, and can also support recovery when the human deviates from the expected interaction. We produced a first representation of social practices in a data structure usable by the robot planner, a first version of a planner that uses the social practice information, and an execution process that both executes the plan and monitors the progress of the interaction, adapting or re-planning the robot's actions when necessary.
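To make the idea concrete, the sketch below shows one plausible way a social practice could be encoded as a planner-readable data structure; the field names and the coffee example are our own illustration, not the project's actual representation.

    # Hypothetical sketch of a social-practice representation for a robot
    # planner. Field names and the example practice are our own illustration.
    from dataclasses import dataclass, field

    @dataclass
    class SocialPractice:
        name: str
        roles: list                  # parties involved, e.g. ["robot", "guest"]
        expected_steps: list         # canonical ordering of actions
        norms: dict = field(default_factory=dict)  # constraints, e.g. politeness

        def next_expected(self, history):
            """Return the next canonical step, or None if the practice is done."""
            for step in self.expected_steps:
                if step not in history:
                    return step
            return None

    coffee = SocialPractice(
        name="getting coffee",
        roles=["robot", "guest"],
        expected_steps=["greet", "offer_coffee", "confirm_choice", "serve", "close"],
        norms={"wait_for_reply": True},
    )
    # A planner can detect deviations: if the observed history skips a step,
    # it re-plans from the last step that still matches an expectation.
    print(coffee.next_expected(["greet", "offer_coffee"]))  # -> "confirm_choice"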

Contribution to the objectives of HumaneAI-net WPs

The project shows how social practices can be used to guide human-robot interactions. This provides a social context that can be helpful to adapt the actions of the robot to both the situation and the user. The project was a very first attempt to create a practical implementation and thus can only be seen as a basis on which further work can be done to really take advantage of all aspects of social practices.

Tangible outputs

  • Other: AI Planning with Social Practices for the Pepper robot. – Frank Dignum

Interaction between chatbots and humans is often based on frequently occurring interaction patterns, e.g. question-answer. Those patterns usually describe a very brief phase of the interaction. In this micro-project, we want to investigate whether we can design a chatbot for behavior change by including higher-level patterns, adapted from the taxonomy of behavior change techniques (BCTs) [1]. These patterns should describe the components of the interaction over a longer period of time. In addition, we will investigate how to design a user interface in such a way that it sustains the interest of the users.

We will focus on reducing sedentary behavior [2], especially sitting behavior, which can have negative health consequences. The interaction patterns and user interface will be implemented in a prototype. A user study will be conducted to evaluate the different components for effectiveness and engagement.
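As an illustration of what we mean by a higher-level pattern, the sketch below encodes a multi-day behavior-change loop (goal setting, self-monitoring, feedback) as a simple state machine; the states and prompts are invented for the example and are not the prototype's actual dialogue.

    # Illustrative sketch only: one way to encode a higher-level behavior-change
    # pattern (in the spirit of BCTs such as goal setting and feedback on
    # behavior) as a multi-day state machine. States and prompts are invented.
    PATTERN = {
        "set_goal": {"prompt": "How many sitting breaks will you take today?",
                     "next": "monitor"},
        "monitor":  {"prompt": "Did you log your sitting breaks today?",
                     "next": "feedback"},
        "feedback": {"prompt": "You reached {pct:.0f}% of your goal. Keep it up!",
                     "next": "set_goal"},   # the pattern loops across days
    }

    def step(state, user_data):
        node = PATTERN[state]
        print(node["prompt"].format(**user_data))
        return node["next"]

    state = "set_goal"
    for user_data in [{"pct": 0.0}, {"pct": 0.0}, {"pct": 80.0}]:
        state = step(state, user_data)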

Output

Scientific paper

Prototype for consortium

Presentations

Project Partners:

  • Stichting VU, Koen Hindriks, Michel Klein
  • University College London (UCL), Yvonne Rogers

Primary Contact: Michel Klein, Vrije Universiteit Amsterdam

Main results of micro project:

Interaction between chatbots and humans is often based on frequently occurring interaction patterns, e.g. question-answer. Those patterns usually describe a very brief phase of the interaction. In this micro-project, we want to investigate whether we can design a chatbot for behavior change by including higher-level patterns, adapted from the taxonomy of behavior change techniques (BCTs). These patterns should describe the components of the interaction over a longer period of time. In addition, we will investigate how to design a user interface in such a way that it sustains the interest of the users.

A prototype has been developed and is currently being evaluated.

Contribution to the objectives of HumaneAI-net WPs

We investigate how AI systems can collaborate with humans, specifically focusing on changing a specific behavior. We increase our understanding of which forms of interaction between an AI system and a human are effective in achieving behavior change. We also investigate to what extent knowledge about health behavior can contribute to designing realistic and effective communication.

Tangible outputs

  • Publication: Bachelor thesis – Michel Klein

Attachments

Klein M_March17.mkv

This project aims at investigating the construction of humor models to enrich conversational agents through the help of interactive reinforcement learning approaches.

Our methodology consists of deploying an online platform where passersby can play a game of matching sentences with humorous comebacks against an agent.

The data collected from these interactions will help to gradually build the humor models of the agent, following state-of-the-art Interactive Reinforcement Learning techniques.
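A minimal sketch of the kind of interactive update we have in mind follows: each (prompt, comeback) pair is treated as a bandit arm whose humor value is refined from player ratings. The epsilon-greedy rule and the example cards are our own illustration, not the project's final model.

    # Sketch of an interactive-learning loop: treat each (prompt, comeback)
    # pair as a bandit arm whose humor value is updated from player ratings.
    # The update rule, constants and example cards are illustrative only.
    from collections import defaultdict
    import random

    values = defaultdict(float)   # estimated funniness of (prompt, comeback)
    counts = defaultdict(int)
    EPSILON = 0.1                 # exploration rate

    def choose_comeback(prompt, hand):
        if random.random() < EPSILON:
            return random.choice(hand)            # explore a new combination
        return max(hand, key=lambda c: values[(prompt, c)])  # exploit best known

    def update(prompt, comeback, rating):
        key = (prompt, comeback)
        counts[key] += 1
        values[key] += (rating - values[key]) / counts[key]  # running mean

    update("My secret talent is ___.", "interpretive dance", rating=7)
    print(choose_comeback("My secret talent is ___.",
                          ["interpretive dance", "tax law"]))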

We plan to work on this project for 4 months, resulting in an implementation of the platform, a first model for humor-enabled conversational agent and a publication of the obtained results and evaluations.

Output

Online game for collecting humorous interaction data

Humor models for conversational agents

Paper in an international conference or journal related to AI and AI in Games

Project Partners:

  • Centre national de la recherche scientifique (CNRS), Brian Ravenet
  • Instituto Superior Técnico (IST), Rui Prada

Primary Contact: Brian Ravenet, LISN-CNRS (ex LIMSI-CNRS)

Main results of micro project:

The main result of this project will be the creation of an intelligent agent capable of playing a game – Cards Against Humanity – that involves matching sentences with humorous comebacks. The game requires that players be able to combine black and white cards to form the funniest joke possible. Therefore, the AI agent being developed must be able to make funny jokes. Ultimately, this opens perspectives for the development of humor models in conversational AIs, a key social competence in our daily human interactions.

Contribution to the objectives of HumaneAI-net WPs

The micro-project produced for HumaneAI-net a dataset of annotated associations between black and white cards following the game design of Cards Against Humanity. In doing so, it created a unique dataset of humorous associations between concepts, annotated in terms of different humor styles by the participants of the experiment. The preliminary analysis of how the dataset can be leveraged to build different humor models for conversational agents is particularly relevant for tasks T3.3 and T3.4 of WP3. Additionally, the micro-project aims at exploring how to refine the humor models through an interactive learning approach, particularly relevant for task T1.3 of WP1.

Tangible outputs

  • Dataset: Dataset – 1712 jokes, rated on a scale of 1 to 9 in terms of joke level, originality, positivity, entertainment, whether it makes sense and whether it is family-friendly
    – Rui Prada
  • Program/code: Online Game – A game of matching sentences with humorous comebacks against an agent (similar to the game Cards Against Humanity)
    – Ines Batina

Results Description

The main result of this project is the creation of an intelligent agent capable of playing a game – Cards Against Humanity – that involves matching sentences with humorous comebacks.
To achieve this, a dataset of 1712 jokes, rated on a scale of 1 to 9 in terms of joke level, originality, positivity, entertainment, whether it makes sense, and whether it is family-friendly, was collected, and an online game was developed to serve as the foundation of the reinforcement mechanism.

Publications

No publications

Links to Tangible results

Not applicable

Owing to the progress of underlying NLP technologies (speech-to-text, text normalization and compression, machine translation), automatic captioning technologies (ACTs), both intra- and inter-lingual, are rapidly improving. ACTs are useful for many contents and contexts: from talks and lectures to news, fiction and other entertainment content.

While historical systems are based on complex NLP pipelines, recent proposals are based on integrated (end-to-end) systems, which calls into question standard evaluation schemes in which each module is assessed independently of the others.

We focus on evaluating the quality of the output segmentation, where decisions regarding the length, disposition and display duration of the caption need to be taken, all of which have a direct impact on acceptability and readability. We will notably study ways to perform reference-free evaluations of automatic caption segmentation. We will also try to correlate these "technology-oriented" metrics with user-oriented evaluations in typical use cases: post-editing and direct broadcasting.
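To give a flavor of what a reference-free segmentation check could look like, the sketch below scores a caption block against two widely used readability constraints (maximum line length, characters per second). The thresholds are common subtitling guidelines, used here purely for illustration; they are not the metrics the project will ultimately adopt.

    # Sketch of a reference-free segmentation check: flag violations of
    # standard readability constraints. Thresholds are common guidelines,
    # not the project's final metric.
    MAX_CHARS_PER_LINE = 42
    MAX_CPS = 17.0   # characters per second of display time

    def caption_flags(lines, duration_s):
        """Return a list of violated constraints for one caption block."""
        flags = []
        if any(len(line) > MAX_CHARS_PER_LINE for line in lines):
            flags.append("line_too_long")
        n_chars = sum(len(line) for line in lines)
        if duration_s > 0 and n_chars / duration_s > MAX_CPS:
            flags.append("reading_speed_too_high")
        return flags

    print(caption_flags(["This caption line is fine.",
                         "But this second line is much much much too long to read."],
                        duration_s=2.0))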

Output

Survey of existing segmentation metrics

Design of a contrastive evaluation set

Comparison of metrics on multiple languages / tasks

Project Partners:

  • Centre national de la recherche scientifique (CNRS), Francois Yvon
  • Fondazione Bruno Kessler (FBK), Marco Turchi

Primary Contact: François Yvon, CNRS

We aim to evaluate the usefulness of current dialogue dataset annotation and to propose annotation unification and automated enhancements for better user modeling by training on larger amounts of data. Current datasets' annotation is often geared only toward the dialog system learning how to answer, while the user representation should be explicit, consistent and as complete as possible to enable more complex user representations (e.g. cognitive ones). The project will start from existing annotated dialog corpora, such as bAbI++, MultiWOZ and others, and produce extended versions with improved annotation consistency and extra user-representation annotations produced automatically. We will explore unifying annotations from multiple datasets and evaluate the enhanced annotation using our own end-to-end dialogue models based on memory networks. The connection with T3.7 and T3.4 is straightforward, since task-oriented dialogue systems are the very definition of conversational, collaborative AI. T3.6 will be addressed through round-trip translation for data augmentation.
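The sketch below illustrates the unification step we have in mind: corpus-specific dialogue-act labels are mapped onto a shared inventory and each turn is given an explicit user-state slot. The label sets shown are invented examples, not the actual bAbI++ or MultiWOZ tag sets.

    # Illustrative sketch of annotation unification: map corpus-specific
    # dialogue-act labels onto one shared inventory. Label sets are invented
    # examples, not the real bAbI++ / MultiWOZ tag sets.
    UNIFIED_MAP = {
        "multiwoz": {"inform": "INFORM", "request": "REQUEST", "bye": "CLOSE"},
        "babi":     {"api_call": "REQUEST", "answer": "INFORM"},
    }

    def unify(turn, corpus):
        """Attach a unified act label and an explicit user-state slot to a turn."""
        act = UNIFIED_MAP[corpus].get(turn["act"], "OTHER")
        return {**turn, "unified_act": act, "user_state": turn.get("user_state", {})}

    print(unify({"act": "request", "text": "Find me a cheap hotel."}, "multiwoz"))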

Output

Extended and unified versions of publicly available dialog corpora with explicit user-modeling annotations (bAbI++, MultiWOZ, etc.)

A report and papers describing a unified user-modeling annotation scheme with respect to existing dialog annotation datasets, and the results of baseline experiments using the annotated data produced by the project.

Presentations

Project Partners:

  • Centre national de la recherche scientifique (CNRS), P. Paroubek
  • Charles University Prague, O. Dušek

Primary Contact: Patrick Paroubek, LIMSI-CNRS

Main results of micro project:

A corpus of 37,173 annotated dialogues with unified and enhanced annotations built from existing open dialogue resources.

Code & trained models (GPT-2, MarCo) for dialogue response generation on the above corpus.

One paper accepted at the TALN2021 conference: Léon-Paul Schaub, Vojtech Hudecek, Daniel Stancl, Ondrej Dusek, Patrick Paroubek, "Defining And Detecting Inconsistent System Behavior in Task-oriented Dialogues",
https://hal.archives-ouvertes.fr/TALN-RECITAL2021/hal-03265892

One paper to be submitted to the "Dialogue and Discourse" journal.

Ongoing collaboration between LISN (Paris-Saclay University) and the Faculty of Mathematics and Physics (Charles University, Prague).

Contribution to the objectives of HumaneAI-net WPs

By providing an open annotated dialogue resource with unified and enhanced annotations, DIASER offers to the community linguistic material usable both for machine learning experiments and for testing dialog model properties in relation with dialog history management, dialog consistency checking and user modeling aspects.

The result of DIASER is related to issues pertaining to the following tasks: mainly T3.6 Language Based and Multilingual Interaction
with potential links to T3.7 Conversational, Collaborative AI,
T3.4 User Models and Interaction History, T3.2 Human AI Interaction / Collaboration Paradigms, T3.3 Reflexivity and Adaptation in Human AI collaborations.

Tangible outputs

Adapting user interfaces (UIs) requires taking into account both positive and negative effects that changes may have on the user. A carelessly picked adaptation may impose high costs — for example, due to surprise or relearning effort. It is essential to consider differences between users as the effect of an adaptation depends on the user's strategies, e.g. how each user searches for information in a UI. This microproject extends an earlier collaboration between partners on model-based reinforcement learning for adaptive UIs by developing methods to account for individual differences. Here, we first develop computational models to explain and predict users' visual search and pointing strategies when searching within a UI. We apply this model to infer user strategies based on interaction history, and adapt UIs accordingly. The outcomes of this project will be (1) a publication at the ACM CHI conference and (2) integration in our platform for adaptive UIs.
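As a hint of how individual differences might be inferred from interaction history, the sketch below performs a simple Bayesian comparison of two hypothetical search strategies given observed selection times. The strategy names and time profiles are our own assumptions, not the project's actual model.

    # Minimal sketch (our own, not the project's model) of inferring which
    # visual search strategy a user follows, via Bayesian comparison of
    # per-strategy likelihoods of observed selection times.
    import math

    def gaussian(x, mu, sigma):
        return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    # Assumed selection-time profiles (seconds) for two candidate strategies.
    STRATEGIES = {"serial_scan":    {"mu": 2.2, "sigma": 0.6},
                  "spatial_memory": {"mu": 0.9, "sigma": 0.3}}

    def infer_strategy(selection_times):
        posterior = {s: 1.0 for s in STRATEGIES}            # uniform prior
        for t in selection_times:
            for s, p in STRATEGIES.items():
                posterior[s] *= gaussian(t, p["mu"], p["sigma"])
        z = sum(posterior.values())
        return {s: v / z for s, v in posterior.items()}

    print(infer_strategy([1.0, 0.8, 1.1]))   # favors "spatial_memory"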

Output

Model of visual search and pointing in menus. The code will be available on GitHub

The integration of the model in our platform for adaptive UI. The code will be available on GitHub

A demo of the system will be available online

A publication at the conference ACM CHI

Presentations

Project Partners:

  • Sorbonne Université, Gilles Bailly
  • Aalto University, Kashyap Todi

Primary Contact: Gilles Bailly, Sorbonne Université, CNRS, ISIR

Main results of micro project:

This micro-project reinforced the collaboration between Sorbonne Université, Aalto University and the University of Luxembourg through weekly meetings. It aims at elaborating computational models of visual search in adaptive user interfaces. We defined different visual search strategies in adaptive menus, as well as promising interactive mechanisms to revisit how menus are designed. The elaboration of the model is in progress.

Contribution to the objectives of HumaneAI-net WPs

The micro-project aims at empowering humans with advanced user interfaces with the capacity to adapt to the user goals. More precisely, it aims at designing User Interfaces increasing usability and allowing humans and machines to better collaborate. We plan to integrate our model in our platform for adaptive UIs and make a demo of the system available online.

Tangible outputs

This project entails robot online behavioral adaptation during interactive learning with humans. Specifically, the robot shall adapt to each human subject’s specific way of giving feedback during the interaction. Feedback here includes reward, instruction and demonstration, and can be regrouped under the term “teaching signals”. For example, some human subjects prefer a proactive robot while others prefer the robot to wait for their instructions; some only tell the robot when it performs a wrong action, while others reward correct actions, etc. The main outcome will be a new ensemble method of human-robot interaction which can learn models of various human feedback strategies and use them for online tuning of reinforcement learning so that the robot can quickly learn an appropriate behavioral policy. We will first derive an optimal solution to the problem and then compare the empirical performance of ensemble methods to this optimum through a set of numerical simulations.

Output

Paper in IEEE RO-MAN or ACM/IEEE HRI or ACM CHI


Presentations

Project Partners:

  • Sorbonne Université, Mohamed Chetouani
  • ATHINA, Petros Maragos

Primary Contact: Mehdi Khamassi, Sorbonne University

Main results of micro project:

We designed a new ensemble learning algorithm, combining model-based and model-free reinforcement learning, for on-the-fly robot adaptation during human-robot interaction. The algorithm includes a mechanism for the robot to autonomously detect changes in a human's reward function from the human's observed behavior, and to reset the ensemble learning accordingly. We simulated a series of human-robot interaction scenarios to test the robustness of the algorithm. In scenario 1, the human rewards the robot with various feedback profiles: stochastic reward; non-monotonic reward; or punishment for errors without reward for correct responses. In scenario 2, the human teaches the robot through demonstrations, again with different degrees of stochasticity and levels of expertise. In scenario 3, we simulated a human-robot cooperation task of putting a set of cubes in the right box, including abrupt changes in the target box. Results show the generality of the algorithm.
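The sketch below gives a highly simplified picture of this arbitration-plus-reset idea; the performance update, the change detector and all constants are our own stand-ins, not the published algorithm.

    # Schematic sketch (our simplification, not the project's published
    # algorithm): arbitrate between a model-based (MB) and a model-free (MF)
    # expert, and re-open arbitration when the reward signal appears to shift.
    class MetaController:
        def __init__(self, window=20, drop_ratio=0.5):
            self.perf = {"MB": 0.5, "MF": 0.5}   # running performance estimates
            self.rewards = []                    # recent reward history
            self.window, self.drop_ratio = window, drop_ratio

        def select_expert(self):
            # Exploit the expert currently estimated to perform best.
            return max(self.perf, key=self.perf.get)

        def update(self, expert, reward):
            self.perf[expert] += 0.1 * (reward - self.perf[expert])
            self.rewards.append(reward)
            if len(self.rewards) > self.window:
                recent, older = self.rewards[-5:], self.rewards[:-5]
                # Crude change detector: recent rewards collapse vs. history.
                if sum(recent) / 5 < self.drop_ratio * (sum(older) / len(older)):
                    self.perf = {"MB": 0.5, "MF": 0.5}   # reset the ensemble
                    self.rewards = []

    mc = MetaController()
    for r in [1.0] * 24 + [0.0] * 5:             # reward function changes
        mc.update(mc.select_expert(), r)
    print(mc.perf)   # arbitration was re-opened after the detected shift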

Contribution to the objectives of HumaneAI-net WPs

Humans and robots are bound to cooperate more and more within society. This micro-project addresses a major AI challenge: enabling robots to adapt on the fly to different situations and to different, more or less naive, human users. The solution consists in designing a robot learning algorithm which generalizes to a variety of simple human-robot interaction scenarios. Following the HumanE AI vision, interactive learning puts the human in the loop, prompting human-aware robot behavioral adaptation.
The micro-project directly contributes to one of the objectives of WP1 (T1.3), to enable "continuous incremental learning in joint human/AI systems" by "exploiting rich human feedback". It also directly contributes to one of the objectives of WP3 (T3.3), to enable reflexivity and adaptation in human-AI collaboration.

Tangible outputs

  • Publication: Journal paper in preparation – Rémi Dromnelle, Erwan Renaudo, Benoît Girard, Petros Maragos, Mohamed Chetouani, Raja Chatila, Mehdi Khamassi
    in preparation
  • Program/code: Open source code to be uploaded on github – Rémi Dromnelle
    in preparation
  • Publication: Preprint to be made open on HAL – Rémi Dromnelle, Erwan Renaudo, Benoît Girard, Petros Maragos, Mohamed Chetouani, Raja Chatila, Mehdi Khamassi
    in preparation

Results Description

This 6-month project entails robot online behavioral adaptation during interactive learning with humans. Specifically, the robot shall adapt to each human subject’s specific way of giving feedback during the interaction. Feedback here includes reward, instruction and demonstration, and can be regrouped under the term “teaching signals”. For example, some human subjects prefer a proactive robot while others prefer the robot to wait for their instructions; some only tell the robot when it performs a wrong action, while others reward correct actions, etc. The main expected outcome was a new ensemble method of human-robot interaction which can learn models of various human feedback strategies and use them for online tuning of reinforcement learning so that the robot can quickly learn an appropriate behavioral policy.

We designed a new ensemble learning algorithm, combining model-based and model-free reinforcement learning, for on-the-fly robot adaptation during human-robot interaction. The algorithm includes a mechanism for the robot to autonomously detect changes in a human's reward function from the human's observed behavior, and to reset the ensemble learning accordingly. We simulated a series of human-robot interaction scenarios to test the robustness of the algorithm. In scenario 1, the human rewards the robot with various feedback profiles: stochastic reward; non-monotonic reward; or punishment for errors without reward for correct responses. In scenario 2, the human teaches the robot through demonstrations, again with different degrees of stochasticity and levels of expertise. In scenario 3, we simulated a human-robot cooperation task of putting a set of cubes in the right box, including abrupt changes in the target box. Results show the generality of the algorithm.

Humans and robots are bound to cooperate more and more within society. This micro-project addresses a major AI challenge: enabling robots to adapt on the fly to different situations and to different, more or less naive, human users. The solution consists in designing a robot learning algorithm which generalizes to a variety of simple human-robot interaction scenarios. Following the HumanE AI vision, interactive learning puts the human in the loop, prompting human-aware robot behavioral adaptation.
The micro-project directly contributes to one of the objectives of WP1 (T1.3), to enable "continuous incremental learning in joint human/AI systems" by "exploiting rich human feedback". It also directly contributes to one of the objectives of WP3 (T3.3), to enable reflexivity and adaptation in human-AI collaboration.

Publications

Rémi Dromnelle, Erwan Renaudo, Benoît Girard, Petros Maragos, Mohamed Chetouani, Raja Chatila, Mehdi Khamassi (2022). Reducing computational cost during robot navigation and human-robot interaction with a human-inspired reinforcement learning architecture. International Journal of Social Robotics, doi: 10.1007/s12369-022-00942-6, https://link.springer.com/article/10.1007/s12369-022-00942-6.

Links to Tangible results

Open source code: https://github.com/DromnHell/meta-control-decision-making-agent

Publication: Preprint made open on HAL – https://hal.sorbonne-universite.fr/hal-03829879

Many citizen science projects have a crowdsourcing component where several different citizen scientists are requested to fulfill a micro-task (such as tagging an image as either relevant or irrelevant for the evaluation of damage in a natural disaster, or identifying a specimen in a taxonomy). How do we create a consensus between the different opinions/votes? Currently, simple majority voting is used most of the time. We argue that alternative voting schemes (taking into account the errors made by each annotator) could substantially reduce the number of citizen scientists required. This is a clear example of continuous human-in-the-loop machine learning, with the machine creating a model of the humans it has to interact with.

We propose to study consensus building under two different hypotheses: truthful annotators (as a model for most voluntary citizen science projects) and self-interested annotators (as a model for paid crowdsourcing projects).
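The sketch below contrasts plain majority voting with the error-aware alternative we propose to study: each vote is weighted by an estimated annotator accuracy, in the spirit of Dawid-Skene-style models. The annotators and accuracies are invented for illustration.

    # Sketch of an error-aware consensus: weight each annotator's vote by an
    # estimated accuracy (Dawid-Skene spirit). Annotators and accuracies are
    # invented for illustration.
    def weighted_consensus(votes, accuracy, labels=("relevant", "irrelevant")):
        """votes: {annotator: label}; accuracy: {annotator: P(vote == truth)}."""
        posterior = {lab: 1.0 for lab in labels}   # uniform prior over labels
        for ann, vote in votes.items():
            a = accuracy[ann]
            for lab in labels:
                # Annotator is right with prob a, else errs uniformly.
                posterior[lab] *= a if vote == lab else (1 - a) / (len(labels) - 1)
        z = sum(posterior.values())
        return {lab: p / z for lab, p in posterior.items()}

    votes = {"ann1": "relevant", "ann2": "irrelevant", "ann3": "irrelevant"}
    accuracy = {"ann1": 0.95, "ann2": 0.60, "ann3": 0.55}
    # Majority voting says "irrelevant"; the error-aware vote can disagree
    # when the majority is formed by low-accuracy annotators.
    print(weighted_consensus(votes, accuracy))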

Output

Software and documentation for the two new consensus models into the crowdnalysis framework.

New consensus models case study in a citizen science project.

Algorithm for numerical simulations useful to evaluate the efficacy of the consensus models considered in crowdnalysis.

Report of the results of simulations, with suggestions to improve the consensus models.

Presentations

Project Partners:

  • Consejo Superior de Investigaciones Científicas (CSIC), Jesus Cerquides
  • Consiglio Nazionale delle Ricerche (CNR), Daniele Vilone

Primary Contact: Jesus Cerquides, IIIA-CSIC

Main results of micro project:

We have contributed to the implementation of several different probabilistic consensus models in the Crowdnalysis library, which has been released as a Python package.
We have proposed a generic mathematical framework for the definition of probabilistic consensus algorithms, and for performing prospective analysis. This has been published in a journal paper.
We have used the library and the mathematical framework for the analysis of images from the Albanian earthquake scenario.
We used Monte Carlo simulations to understand the best way to aggregate group decisions when evaluating the level of damage in natural catastrophes. The results collected so far, to be published this year, suggest that majority rule is the best option as long as all agents are competent enough for the task. When the number of unqualified agents is no longer negligible, smarter aggregation procedures are needed.
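The following toy simulation (our own, with invented competence values) reproduces the qualitative pattern: majority voting is accurate for a uniformly competent group and degrades once unqualified annotators form a sizeable fraction.

    # Toy Monte Carlo check (ours) of the reported pattern: majority voting
    # degrades once enough annotators are unqualified. Competences invented.
    import random

    def majority_accuracy(competences, trials=10000):
        correct = 0
        for _ in range(trials):
            votes = sum(random.random() < c for c in competences)  # votes for truth
            correct += votes > len(competences) / 2
        return correct / trials

    print(majority_accuracy([0.8] * 5))                      # competent group
    print(majority_accuracy([0.8, 0.8, 0.45, 0.45, 0.45]))   # mixed group, degraded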

Contribution to the objectives of HumaneAI-net WPs

Crowdsourcing can be applied to quickly obtain accurate information in different domains, including disaster management scenarios. This requires computing the consensus among the different annotators. Probabilistic graphical models can be used to build interpretable consensus models. These models can answer questions such as "Who is the most competent annotator for this task?" or "How many annotators do I need for this task?", which provide a clear example of machine learning with the human in the loop, directly related to T1.3 Continuous & incremental learning in joint human/AI systems of WP1.
The evolutionary results will help to determine the best way to proceed in the research, suggesting new theoretical and experimental studies on the topic. They also make it possible to evaluate the interplay between human action and AI learning in crowdsourcing tasks, connected with T3.2 Human-AI Interaction/collaboration paradigms.

Tangible outputs