Contact person: Rui Prada (rui.prada@tecnico.ulisboa.pt)

Internal Partners:

  1. Instituto Superior Técnico, Department of Computer Science,
  2. Eötvös Loránd University, Department of Artificial Intelligence

External Partners:

  1. DFKI Lower Saxony, Interactive Machine Learning Lab
  2. Carnegie Mellon University, Robotics Institute  

 

The project addresses research on interactive grounding. It consists of the development of an Augmented Reality (AR) game, using HoloLens, that supports the interaction of a human player with an AI character in a mixed-reality setting using gestures as the main communicative act. The game will integrate technology to perceive human gestures and poses. The game will present collaborative tasks that require coordination at the level of mutual understanding of the several elements of the task. Players (human and AI) will have different information about the tasks needed to advance in the game and must communicate that information to their partners through gestures. The main grounding challenge will be based on learning the mapping between gestures and the meaning of actions to perform in the game. There will be two levels of gestures to ground: some are task-independent while others are task-dependent. In other words, besides the gestures that communicate explicit information about the game task, the players need to agree on the gestures used to coordinate the communication itself; for example, to signal agreement or doubt, to ask for more information, or to close the communication. These latter gesture types can be transferred from task to task within the game, and probably to other contexts as well. It will be possible to play the game with two humans and study their gesture communication in order to gather the gestures that emerge: a human-inspired gesture set will be collected and will serve as the basis for a gesture dictionary in the AI repertoire. The game will provide different tasks of increasing difficulty. The first tasks will ask the players to perform gestures or poses as mechanisms to open a door to progress to the next level. Later, in a more advanced version of the game, specific and constrained body poses, interaction with objects, and the need to communicate more abstract concepts (e.g., next to, under, to the right, the biggest one, …) will be introduced. The game will be built as a platform to perform studies. It will support studying diverse questions about the interactive grounding of gestures. For example, we can study the way people adapt to and ascribe meaning to the gestures performed by the AI agent; we can study how different gesture profiles influence people’s interpretation, facilitate grounding, and affect task performance; or we can study different mechanisms for the AI to learn its gesture repertoire from humans (e.g., by imitation grounded in the context).

Results Summary

An AR game in which players face a sequence of codebreaking challenges that require them to press buttons in a specific sequence; however, only one of the partners has access to the buttons while the other has access to the solution code. The core gameplay is centred on the communication between the two partners (human and AI virtual agent), which must be performed only by using gestures. In addition to the development of the AR game, we developed some sample AI agents that are able to play with a human player. A version using an LLM was also developed to provide some reasoning for gesture recognition and performance by the AI virtual agent.

Players face a sequence of codebreaking challenges that require them to press buttons in a specific sequence; however, only one of the partners has access to the buttons while the other has access to the solution code. Furthermore, only gesture communication is possible. Therefore, the core gameplay is centred on the communication between the two partners (human and AI virtual agent). Gestures supported in the game are split into two distinct subtypes:

  1. Taskwork gestures: Used for conveying information about the game’s tasks and environment (e.g., an object’s colour).
  2. Teamwork gestures: Used for giving feedback regarding communication (e.g., affirming that a gesture was understood).

The gameplay loop thus requires shared performance, coordination, and communication.

In the current version, the virtual agent is able to play reactively in response to the player’s gestures based on a gesture knowledge base that assigns meaning and action to each gesture. A version using an LLM was also developed to provide some reasoning for gesture recognition and performance by the AI virtual agent.
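
For illustration, a minimal sketch of how such a gesture knowledge base could map recognized gestures to meanings and agent actions; the gesture names, types, and actions below are hypothetical examples, not the game's actual repertoire:

    # Hypothetical sketch of a gesture knowledge base: each recognized gesture
    # is mapped to a meaning and to an action the virtual agent can execute.
    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class GestureEntry:
        meaning: str                  # what the gesture communicates
        gesture_type: str             # "taskwork" or "teamwork"
        action: Callable[[], None]    # reaction of the virtual agent

    def press_button(index: int) -> Callable[[], None]:
        return lambda: print(f"Agent presses button {index}")

    def nod() -> None:
        print("Agent nods to confirm understanding")

    GESTURE_KB: Dict[str, GestureEntry] = {
        "point_left":  GestureEntry("select left button", "taskwork", press_button(0)),
        "point_right": GestureEntry("select right button", "taskwork", press_button(2)),
        "thumbs_up":   GestureEntry("message understood", "teamwork", nod),
    }

    def react(recognized_gesture: str) -> None:
        entry = GESTURE_KB.get(recognized_gesture)
        if entry is None:
            print("Agent signals doubt and asks for clarification")
        else:
            entry.action()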

Tangible Outcomes

  1. The base game – https://github.com/badomate/EscapeHololens 
  2. The extended game – https://github.com/badomate/EscapeMain 
  3. A presentation summarizing the project: https://www.youtube.com/watch?v=WmuWaNdIpcQ
  4. A short demo for the system https://youtu.be/j_bAw8e0lNU?si=STi6sbLzbpknckGG

Contact person: Victor Chitolina Schetinger (victor.schetinger@tuwien.ac.at)

Internal Partners:

  1. Charles University, Rudolf Rosa, rosa@ufal.mff.cuni.cz

External Partners:

  1. IADE – Faculdade de Design, Tecnologia e Comunicação da Universidade Europeia, Edirlei Lima (edirlei.lima@universidadeeuropeia.pt)

 

Results Summary

We were able to generate storyboards by combining text and image generative models. Due to the rapid development of these fields in recent months, however, the quality of the results is no longer up to the state of the art.
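
For illustration, a minimal sketch of the general pipeline (a text model expands a synopsis into scene descriptions, and an off-the-shelf image model renders one panel per scene); the model names, prompts, and post-processing are assumptions, not the prototype's actual configuration:

    # Sketch: chain a text generator with an image generator to draft a storyboard.
    from transformers import pipeline
    from diffusers import StableDiffusionPipeline
    import torch

    # Placeholder models; the prototype's actual models and prompts may differ.
    text_gen = pipeline("text-generation", model="gpt2")
    image_gen = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    synopsis = "Two robots meet in an abandoned theatre and decide to stage a play."
    # Expand the synopsis into scene descriptions with the text model.
    expansion = text_gen(f"Story: {synopsis}\nScene 1:", max_new_tokens=80)[0]["generated_text"]
    scenes = [line.strip() for line in expansion.split("\n") if line.strip()][:3]

    # Render one storyboard panel per scene description with the image model.
    for i, scene in enumerate(scenes):
        panel = image_gen(f"storyboard panel, pencil sketch, {scene}").images[0]
        panel.save(f"panel_{i}.png")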

Tangible Outcomes

  1. prototype: https://ufallab.ms.mff.cuni.cz/cgi-bin/rosa/theaitre-demo/demo_images.py 

Contact person: Steeven Villa (steeven.villa@um.ifi.lmu.de)

Internal Partners:

  1. Ludwig-Maximilians-Universität München (LMU), Steeven Villa and Sebastian Feger

External Partners:

  1. Sheffield Hallam University Enterprises Limited, Daniela Petrelli  

 

Novel AI systems enable individuals to maximize their creative potential by rapidly prototyping ideas based on initial sketches or idea descriptions. A generative AI system is the bridge between an individual’s thought and its physical manifestation. Traditional approaches, on the other hand, require a greater investment of effort, involvement, and time, which has historically been associated with a sense of ownership over the creation and agency (or control) over the creation process. As the paradigm shift caused by AI significantly reduces the amount of work required to achieve a desired result, individuals consistently report low agency and ownership over their creations, and such boundaries are unclear even in the legal sphere. Therefore, it is essential to understand how these variables can be balanced to foster a strong sense of ownership while allowing users to fully exploit the potential of AI systems. In this project, we sought to achieve this understanding by creating an interactive exhibition where visitors to a science museum interact with a generative AI system to create illustrations for a children’s book based on rough sketches and prompts. The participants will be instructed to collaborate with an image-generating AI system to illustrate a children’s storybook with a simple plot. Participants will start with their own sketch or by selecting one from a set. When an illustration is finished, participants will be asked whether they want to sign the illustration with their name, the name of the AI model, or both. Participants will have the option to display their illustrations on the exhibition’s billboard. We will conclude by asking them three brief questions about self-efficacy. The interaction will be logged to record the degree of intervention (iterating over the illustrations, using a starting sketch instead of drawing their own sketch, signature ownership). We plan to carry out a limited quantitative study with observations of visitors’ behavior paired with interviews. With the collected data, we will be able to analyze the correlations between time, effort, ownership, and self-efficacy in the AI-assisted creative process, and ultimately gain insights into how to design such systems to promote a sense of ownership in the user. This project falls under WP3. It examines the pragmatic aspects of communication and collaboration between humans and AI by exploring how participants collaborate with an AI system to translate their initial sketches or prompts into meaningful illustrations for visual narratives and storytelling, all viewed through the lens of the participants’ sense of ownership and agency and how these impact the outcome and the design process.
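
For illustration, a minimal sketch of the kind of per-session interaction log described above; the field names and value ranges are assumptions, not the exhibit's actual schema:

    # Illustrative log entry for one visitor session; the fields mirror the
    # measures described above (iterations, sketch origin, signature choice).
    from dataclasses import dataclass, asdict, field
    import json, time

    @dataclass
    class SessionLog:
        session_id: str
        started_own_sketch: bool        # drew a sketch vs. picked one from the set
        n_iterations: int               # how often the illustration was regenerated
        signature: str                  # "visitor", "ai", or "both"
        displayed_on_billboard: bool
        self_efficacy_answers: list     # three brief questions, e.g. 1-7 Likert
        timestamp: float = field(default_factory=time.time)

    log = SessionLog("s-0042", started_own_sketch=True, n_iterations=3,
                     signature="both", displayed_on_billboard=True,
                     self_efficacy_answers=[5, 6, 4])
    print(json.dumps(asdict(log)))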

Results Summary

This project led to the development of three unique museum exhibitions featured at the Deutsches Museum, Bonn Museum, and Alte Pinakothek. At the core of each exhibition is a generative AI system that enables visitors to co-create children’s book stories, giving them the freedom to shape the narrative using drawing inputs. This AI system is specifically tailored for a museum installation environment, ensuring it remains offline and highly reliable.

A notable feature of the Alte Pinakothek installation (developed in collaboration with Sheffield Hallam University) is the physical prototype’s symmetrical design, which embodies the collaborative dynamic between human creativity and AI assistance. The visitor is seated on one side, interacting directly with the system. Opposite them, a glowing cube represents the AI presence, actively responding to the user’s inputs. This cube subtly illuminates and adapts based on the AI’s activity, providing visual cues that guide and inspire users through the story-making process.

Additionally, two master’s theses were developed within this project’s framework, and initial data from over 300 user interactions at the Deutsches Museum has already been gathered. These interactions will form the basis for future research publications, offering valuable insights into the role of AI in creative storytelling and user engagement within museum settings. A paper submission based on the project is planned.

Tangible Outcomes

  1. Backend: https://github.com/dorj222/storybookcreator 
  2. Video presentation summarizing the project

Contact person: Janin Koch (Janin.Koch@inria.fr)

Internal Partners:

  1. Ludwig-Maximilians-Universität München, Albrecht Schmidt
  2. Københavns Universitet, Kasper Hornbaek
  3. Stichting VU, Koen Hindriks
  4. Umeå University, Helena Lindgren  

 

The aim of the project is to investigate both the theoretical and empirical roles of agency in successful human-computer partnerships. For human-centred AI research, the understanding of agency is a key factor in achieving effective collaboration. Although recent advances in AI have enabled systems to successfully contribute to human-computer interaction, we are interested in extending this such that the interaction acts more like a ‘partnership’. This requires building systems with collaborative agency that users can manipulate in the process. Research questions include: 1) which parameters are relevant to describing system agency, 2) what impact these parameters have on perceived agency, and 3) how to modify them in order to achieve different roles for systems in a process.

Results Summary

We conducted individual and collaborative brainstorming sessions with all participants to create an initial overview of the current literature and establish a common starting point (1). We discussed potential overlaps in our work and how such perspectives influence our current work. We will hold a workshop at CHI’23 on ‘Integrating AI in Human-Human Collaborative Ideation’ to examine the role AI can play in such interactive environments and to identify distinct dimensions and measures of agency within human-AI interaction (2) [Shin et al., 2023]. Umeå has also investigated how conversations between a human and a socially intelligent robot in a home environment can influence perceptions of agency [Tewari and Lindgren, 2022] and the importance of goal setting in such a scenario [Lindgren and Weck, 2022; Kilic et al., 2023] (3).

Tangible Outcomes

  1. Tewari M and Lindgren H (2022), Expecting, understanding, relating, and interacting – older, middle-aged and younger adults’ perspectives on breakdown situations in human–robot dialogues. Front. Robot. AI 9:956709. doi: 10.3389/frobt.2022.956709. https://www.frontiersin.org/journals/robotics-and-ai/articles/10.3389/frobt.2022.956709/full
  2. Kilic K, Weck S, Kampik T, Lindgren H. Argument-Based Human-AI Collaboration for Supporting Behavior Change to Improve Health. Front. AI, 2023. https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.1069455/full
  3. Lindgren H and Weck S. 2022. Contextualising Goal Setting for Behaviour Change – from Baby Steps to Value Directions. In 33rd European Conference on Cognitive Ergonomics (ECCE2022), October 4–7, 2022, Kaiserslautern,Germany. ACM, New York, NY, USA. https://doi.org/10.1145/3552327.3552342 https://dl.acm.org/doi/abs/10.1145/3552327.3552342
  4. Joongi Shin, Janin Koch, Andrés Lucero, Peter Dalsgaard, Wendy E. Mackay. Integrating AI in Human-Human Collaborative Ideation. CHI 2023 – SIGCHI conference on Human Factors in computing systems, Apr 2023, Hamburg, Germany. pp.1-5. ⟨hal04023507⟩  https://dl.acm.org/doi/10.1145/3544549.3573802 

Contact person: Teresa Hirzle (tehi@di.ku.dk)

Internal Partners:

  1. Københavns Universitet (UCPH), Teresa Hirzle
  2. Ludwig-Maximilians-Universität München (LMU), Florian Müller/Julian Rasch  

External Partners:

  1. Saarland University, Martin Schmitz  

 

The use of generative AI in the creation of 3D objects has the potential to greatly reduce the time and effort required for designers and developers, resulting in a more efficient and effective creation of virtual 3D objects. Yet, research still lacks an understanding of suitable interaction modalities and common grounding in this field.

Objective:

The objective of this research project is to explore and compare interaction modalities that are suited to collaboratively creating virtual 3D objects together with a generative AI. To this end, the project aims to investigate how different input modalities, such as voice, touch, and gesture recognition, can be used to generate and alter a virtual 3D object, and how we can create methods for establishing common ground between the AI and the users.

Methodology:

The project is split into two working packages. (1) We investigate and evaluate the use of multi-modal input modalities to alter the shape and appearance of 3D objects in virtual reality (VR). (2) Based on our insights on promising multi-modal interaction concepts, we then develop a prototypical multi-modal VR interface that allows users to collaborate on the creation of 3D objects with a generative AI. This might include, but is not limited to, the AI assistant generating 3D models (e.g., using https://threedle.github.io/text2mesh or Shap-E) or providing suggestions based on the users’ queries. The project will use a combination of experimental and observational methods to evaluate the effectiveness and efficiency of the concepts. This will involve conducting controlled experiments to test the effects of different modalities and AI assistance on the collaborative creation process, as well as observing and analyzing the users’ behavior.
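
For illustration, a minimal sketch of the intended interaction loop under these assumptions: multi-modal input is fused into a text prompt and passed to a text-to-3D backend; generate_mesh is a hypothetical stand-in, not the actual text2mesh or Shap-E API:

    # Sketch: fuse multi-modal user input into a prompt and request a 3D mesh.
    from typing import Optional

    def fuse_inputs(voice: str, pointed_object: Optional[str], gesture: Optional[str]) -> str:
        prompt = voice
        if pointed_object:
            prompt += f", applied to the {pointed_object}"
        if gesture == "stretch":
            prompt += ", elongated shape"
        return prompt

    def generate_mesh(prompt: str) -> str:
        # Placeholder: call the chosen text-to-3D backend and return a mesh file path.
        return f"/tmp/mesh_for_{abs(hash(prompt))}.obj"

    prompt = fuse_inputs("make the chair look medieval", "chair", "stretch")
    mesh_path = generate_mesh(prompt)
    print("Load into the VR scene:", mesh_path)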

Results Summary

The project was conducted and completed, and a paper submission to a top-tier conference is currently under review. The authors therefore requested that the results be obscured for now, due to the strict anonymization policies of the venue. In abstract terms, the project contributed insights into the effectiveness and efficiency of different modalities and AI assistance in enhancing the collaborative process, as well as guidelines for the design of multi-modal interfaces and AI assistance for the collaborative creation of 3D objects.

Contact person: Mauro Dragoni (dragoni@fbk.eu)

Internal Partners:

  1. Fondazione Bruno Kessler (FBK), Mauro Dragoni
  2. Centre national de la recherche scientifique (CNRS), Jean-Claude Martin
  3. Umeå University (UMU), Helena Lindgren

 

We develop a conceptual model of key components relating to supporting healthy behavior change. The model provides a top-level representation of the clinical (from the psychological perspective) enablers and barriers that can be exploited for developing fine-grained models supporting the realization of behavior change paths within and across specific domains. The resulting ontology will form the basis for generating user models (Theory of Mind) and developing reasoning and decision-making strategies for managing conflicting values and motives, which can be used in collaborative and persuasive dialogues with the user. Such knowledge is also fundamental for embedding empathic behavior as well as non-verbal behaviors which can be embodied by a virtual character in the role of a coach. Learning methods can be applied to explore trajectories of behavior change. The produced ontology will represent a valuable resource for the healthcare domain thanks to the knowledge it includes.
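
For illustration, a minimal sketch of how such an ontology could be loaded and queried for barriers and strategies related to a behavior-change goal; the namespace, class, and property names are illustrative assumptions, not the actual HeLiS or functional-status schema:

    # Sketch: load a behaviour-change ontology and query barriers for a goal.
    from rdflib import Graph

    g = Graph()
    g.parse("behaviour_change.ttl", format="turtle")  # local copy of the ontology

    query = """
    PREFIX bc: <http://example.org/behaviour-change#>
    SELECT ?barrier ?strategy WHERE {
        ?barrier a bc:Barrier ;
                 bc:hindersGoal bc:IncreasePhysicalActivity .
        OPTIONAL { ?strategy bc:addressesBarrier ?barrier . }
    }
    """
    for barrier, strategy in g.query(query):
        print(barrier, "->", strategy)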

Results Summary

The resulting ontology forms the basis for generating user models (Theory of Mind) and for developing reasoning and decision-making strategies for managing conflicting values and motives, which can be used in collaborative and persuasive dialogues with the user. Such knowledge is also fundamental for embedding empathic behavior as well as corresponding non-verbal behaviors, which can be displayed by a virtual character embodying the role of a coach, or for dispatching motivational human-computer interactions over different devices (e.g., mobile phone and smartwatch). Concerning the last point, the work done in this microproject served as a trigger for the creation of a new knowledge graph of functional status.

Tangible Outcomes

  1. An Ontology-Based Coaching Solution for Increasing Self-Awareness of Own Functional Status – Mauro Dragoni. In AI4Function 2021: 2nd Workshop on Artificial Intelligence for Function, Disability, and Health, 2021. http://ceur-ws.org/Vol-2926/paper1.pdf
  2. Modeling a Functional Status Knowledge Graph for Personal Health. In HC@AIxIA 2022: Workshop on Artificial Intelligence for Healthcare, 2022. https://ceur-ws.org/Vol-3307/paper4.pdf
  3. Integrating Functional Status Information into Knowledge Graphs to Support Self-Health Management. Data Intelligence Journal (MIT Press), 2023. https://direct.mit.edu/dint/article/5/3/636/114950/Integrating-Functional-Status-Information-into
  4. Extensions of the HeLiS ontology – Mauro Dragoni. https://horus-ai.fbk.eu/helis/
  5. Video presentation summarizing the project

Contact person: Michel Klein (michel.klein@vu.nl)

Internal Partners:

  1. Stichting VU (Vrije Universiteit Amsterdam), Koen Hindriks, Michel Klein
  2. University College London (UCL), Yvonne Rogers

 

Interaction between chatbots and humans is often based on frequently occurring interaction patterns, e.g., question–answer. Those patterns usually describe a very brief phase of the interaction. In this micro-project, we want to investigate whether we can design a chatbot for behavior change by including higher-level patterns, which are adapted from the taxonomy of behavior change techniques (BCTs). These patterns should describe the components of the interaction over a longer period of time. In addition, we investigate how to design a user interface in such a way that it sustains the interest of the users. We focus on reducing sedentary behavior, and especially sitting behavior, which can have negative health consequences. The interaction patterns and user interface will be implemented in a prototype. A user study evaluates the different components on effectiveness and engagement.
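
For illustration, a minimal sketch of a higher-level interaction pattern adapted from a behavior change technique, spanning several turns rather than a single question–answer pair; the pattern name and prompts are illustrative assumptions, not the prototype's actual content:

    # Sketch: a multi-turn interaction pattern implementing one BCT.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class InteractionPattern:
        bct: str                      # behaviour change technique it implements
        steps: List[str]              # ordered prompts over a longer time span
        state: int = 0

        def next_prompt(self) -> str:
            prompt = self.steps[self.state]
            self.state = min(self.state + 1, len(self.steps) - 1)
            return prompt

    goal_setting = InteractionPattern(
        bct="Goal setting (behaviour)",
        steps=[
            "How long do you usually sit without a break?",
            "What standing break would feel realistic for you today?",
            "Shall I check in this afternoon to see how it went?",
        ],
    )
    print(goal_setting.next_prompt())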

Results Summary

We show that the proposed approach provides high-quality semantic segmentation from the robot’s perspective, with accuracy comparable to the original one. In addition, we exploited the gained information and improved the recognition performance of the deep network for the lower viewpoints and showed that the small robot alone is capable of generating high-quality semantic maps for the human partner. The computations are close to real time, so the approach enables interactive applications.

Tangible Outcomes

  1. Video presentation summarizing the project

Contact person: François Yvon (yvon@isir.upmc.fr)

Internal Partners:

  1. Centre national de la recherche scientifique (CNRS), François Yvon
  2. Fondazione Bruno Kessler (FBK), Marco Turchi  

 

Owing to the progress of underlying NLP technologies (speech-to-text, text normalization and compression, machine translation), automatic captioning technologies (ACTs), both intra- and inter-lingual, are rapidly improving. ACTs are useful for many contents and contexts: from talks and lectures to news, fiction, and other entertaining content. While historical systems are based on complex NLP pipelines, recent proposals are based on integrated (end-to-end) systems, which calls into question standard evaluation schemes, where each module can be assessed independently from the others. We focus on evaluating the quality of the output segmentation, where decisions regarding the length, disposition, and display duration of the caption need to be taken, all having a direct impact on acceptability and readability. We will notably study ways to perform reference-free evaluations of automatic caption segmentation. We will also try to correlate these “technology-oriented” metrics with user-oriented evaluations in typical use cases: post-editing and direct broadcasting.

Results Summary

In this MP, we carried out three main tasks: 1) surveyed existing segmentation metrics, 2) designed a contrastive evaluation set, and 3) implemented and compared the metrics on multiple languages/tasks. We released the EvalSubtitle tool for the community, a tool for reference-based evaluation of subtitle segmentation. The repository contains the Subtitle Segmentation Score (Sigma), specifically tailored for evaluating segmentation from system outputs where the text is not identical to a reference (imperfect texts). EvalSub also contains a collection of standard segmentation metrics (F1, WindowDiff, etc.) as well as subtitling evaluation metrics: BLEU on segmented (BLEU_br) and non-segmented text (BLEU_nb), and TER_br. We disseminated and documented our results through a publication.
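
For illustration, a minimal sketch of a reference-based segmentation check in the spirit of the standard metrics bundled in EvalSub (precision/recall/F1 over subtitle-break positions when the system text matches the reference); this is not the Sigma implementation, which also handles imperfect texts:

    # Sketch: F1 over subtitle break positions (<eob>) against a reference.
    def break_positions(segmented: str, break_token: str = "<eob>") -> set:
        pos, count = set(), 0
        for tok in segmented.split():
            if tok == break_token:
                pos.add(count)          # break after the count-th word
            else:
                count += 1
        return pos

    def segmentation_f1(system: str, reference: str) -> float:
        sys_b, ref_b = break_positions(system), break_positions(reference)
        if not sys_b or not ref_b:
            return 0.0
        precision = len(sys_b & ref_b) / len(sys_b)
        recall = len(sys_b & ref_b) / len(ref_b)
        return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    print(segmentation_f1("we meet <eob> tomorrow at noon <eob>",
                          "we meet tomorrow <eob> at noon <eob>"))  # 0.5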

Tangible Outcomes

  1. Karakanta, Alina, François Buet, Mauro Cettolo, and François Yvon. “Evaluating subtitle segmentation for end-to-end generation systems.” arXiv preprint arXiv:2205.09360 (2022). https://aclanthology.org/2022.lrec-1.328.pdf
  2. EvalSubtitle: tool for reference-based evaluation of subtitle segmentation https://github.com/fyvo/EvalSubtitle

Contact person: Karen Joisten, Ettore Barbagallo (karen.joisten@rptu.de; ettore.barbagallo@rptu.de)

Internal Partners:

  1. RPTU Kaiserslautern-Landau

 

This micro-project started from the consideration that AI systems not only are improving in doing what they are expected to do but are also developing a characteristic that no other technological artifact displays, that is, a resemblance to biological systems. This resemblance explains the tendency present not only in the media and social media but also within academic milieus to use anthropomorphic language when talking about what AI systems do or are (“intelligence,” “agency,” “autonomy,” “life cycle,” “learning,” “knowing,” “discriminating,” etc.). Joisten and Barbagallo attempted to discern cases in which anthropomorphic language is inevitable and can scarcely be replaced by better linguistic alternatives, and cases in which philosophers, scientists, and engineers should work together to find more suitable language solutions. The goal of the micro-project was not to suggest that anthropomorphic language use is in all cases incorrect and should always be replaced by more correct language use. The project leaders instead adopted an ethical perspective, arguing that the real risk of unconsciously using anthropomorphic language when speaking of AI systems is not the humanization of AI but the mechanization of human life. The expression “mechanization of human life” refers here to the possible shift in the way human beings intellectually comprehend and emotionally perceive their humanity. Mechanization of human life, therefore, takes place when AI becomes the model and framework of human self-comprehension. An important takeaway of the micro-project is that the issue of anthropomorphic AI language and mechanization of human life cannot be addressed and solved by a single isolated discipline but only in an interdisciplinary effort in which philosophers, ethicists, computer scientists, engineers, social scientists, linguists, and jurists share their expertise.

Results Summary

The main focus of the micro-project was on the ethical consequences of language use when discussing AI systems and human-AI interaction. The research project was conducted in three phases.

1) In the first phase (May to June 2023), the project examined various AI guidelines, such as the Ethics Guidelines for Trustworthy AI (AI HLEG 2019) and others, with the aim of analyzing the language used to describe AI’s functionality and the interrelation between humans and machines. The project’s philosophical emphasis on problems related to language use was based on the phenomenological observation that language shapes our cognitive and emotional relationships to ourselves, the world, and also to our technological artifacts, including AI.

2) In the second phase (July to August 2023), the study addressed more general issues that emerged from the analysis and comparison of the examined AI guidelines. Despite the evident efforts of the guidelines’ authors to use technical, scientific, neutral, and objective language, Joisten and Barbagallo identified several terms—such as “agency,” “learning,” “decision making,” and “autonomy”—that indicate a tendency toward anthropomorphic language. The questions posed by the project implementers were: when is it philosophically and ethically acceptable to employ humanizing language when speaking of AI’s functioning? And when is it more appropriate to replace the terms mentioned above with more suitable concepts?

3) The third phase of the project (October 2023 to February 2024) involved the Research Seminar “Ethics and AI,” which was directed at PhD students, postdoctoral scholars, and research fellows, and took place at the University of Kaiserslautern-Landau (Campus Kaiserslautern) in the winter term of 2023-2024. The seminar gave Joisten and Barbagallo the opportunity to present and discuss their findings with other researchers and colleagues. In the seminar, the project leaders aimed to show that the main ethical risk of using anthropomorphic language in relation to AI is not the humanization of AI, but rather the mechanization of human life.

Tangible Outcomes

  1. Research seminar: Ethics and AI for PhD students, postdoctoral scholars, and research fellows in University of Kaiserslautern-Landau (Winter 2023-2024) https://www.kis.uni-kl.de/campus/all/event.asp?gguid=0x36775C8D0E68413D87103D67948EF327&tguid=0x3E97C1E01A714B9F9C0BEE5AB4FFE5FC

Contact person: John Shawe-Taylor, UCL (j.shawe-taylor@ucl.ac.uk)

Internal Partners:

  1. University College London (UCL), John Shawe-Taylor
  2. Institut “Jožef Stefan” (JSI), John Shawe-Taylor
  3. INESC TEC, Alipio Jorge  

 

Through this work, we explore novel and advanced learner representation models aimed at exploiting learning trajectories to build a transparent, personalised and efficient automatic learning tutor through resource recommendations. We elaborate on the different types of publicly available data sources that can be used to build an accurate trajectory graph of how knowledge should be taught to learners to fulfil their learning goals effectively. Our aim is to capture and utilise the inferred learner state and the understanding the model has about sensible learning trajectories to generate personalised narratives that will allow the system to rationalise the educational recommendations provided to individual learners. Since an educational path consists heavily of building/following a narrative, a properly constructed narrative structure and representation is paramount to the problem of building successful and transparent educational recommenders.

Results Summary

Adding humanly-intuitive model assumptions to the TrueLearn Bayesian learner model, such as 1) learner interest, 2) learner knowledge, and 3) semantic relatedness between content topics, was achieved successfully, leading to improved predictive performance. A dataset of personalised learning pathways of over 20,000 learners has been composed. Analysis of Optimal Transport for generating interpretable narratives using the Earth Mover’s Distance (EMD) of Wikipedia concepts also showed promise in scenarios where there is a limited number of topic annotations per document. Pursuing this idea, a novel method for cross-lingual information retrieval using EMD was developed. Incorporating semantic networks (WordNet, WikiData) to build higher-level reasoning for recommendation also shows promise, albeit with limited results at this point. A successful expansion of the WordNet network using the WikiData network was achieved. The resulting semantic network indicates that the quality of reasoning over Wiki-annotated video lectures can be improved in this way.
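
For illustration, a minimal sketch of comparing two lectures by the Earth Mover’s Distance between their Wikipedia-concept weights; the concept vectors are toy values, and the 1-D concept ordering is a simplification of the semantic distances used in practice:

    # Sketch: EMD between two documents' concept-weight distributions.
    import numpy as np
    from scipy.stats import wasserstein_distance

    # Toy "documents" described by weights over a shared, ordered list of concepts.
    concepts = ["Machine learning", "Bayesian inference", "Education", "Recommender system"]
    doc_a = np.array([0.5, 0.3, 0.1, 0.1])   # lecture on Bayesian ML
    doc_b = np.array([0.1, 0.1, 0.5, 0.3])   # lecture on educational recommenders

    positions = np.arange(len(concepts))     # simplistic 1-D embedding of the concepts
    emd = wasserstein_distance(positions, positions, u_weights=doc_a, v_weights=doc_b)
    print(f"EMD between the two lectures: {emd:.3f}")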

Tangible Outcomes

  1. X5Learn: A Personalised Learning Companion at the Intersection of AI and HCI – Maria Perez-Ortiz. https://dl.acm.org/doi/10.1145/3397482.3450721  
  2. “Why is a document relevant? Understanding the relevance scores in cross-lingual document retrieval.” Novak, Erik, Luka Bizjak, Dunja Mladenić, and Marko Grobelnik. Knowledge-Based Systems 244 (2022): 108545. https://dl.acm.org/doi/10.1016/j.knosys.2022.108545
  3. Towards an Integrative Educational Recommender for Lifelong Learners (Student Abstract). Sahan Bulathwela,  María Pérez-Ortiz, Emine Yilmaz, and John Shawe-Taylor. (2020, April). In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 10, pp. 13759-13760). https://ojs.aaai.org/index.php/AAAI/article/view/7151/7005
  4. “TrueLearn: A Python Library for Personalised Informational Recommendations with (Implicit) Feedback.” Yuxiang Qiu,  Karim Djemili, Denis Elezi, Aaneel Shalman, María Pérez-Ortiz, and Sahan Bulathwela.  arXiv preprint arXiv:2309.11527 (2023). Published through ORSUM workshop, RecSys’23 https://arxiv.org/pdf/2309.11527
  5. “Peek: A large dataset of learner engagement with educational videos.” Bulathwela, Sahan, Maria Perez-Ortiz, Erik Novak, Emine Yilmaz, and John Shawe-Taylor.  arXiv preprint arXiv:2109.03154 (2021). Submitted to ORSUM workshop, RecSys’21 https://arxiv.org/abs/2109.03154
  6. Dataset: PEEK Dataset – Sahan Bulathwela https://github.com/sahanbull/PEEK-Dataset
  7. Program/code: TrueLearn Model – Sahan Bulathwela https://github.com/sahanbull/TrueLearn
  8. Program/code: Semantic Networks for Narratives – Daniel Loureiro https://github.com/danlou/mp_narrative

Contact person: Mohamed Chetouani (mohamed.chetouani@sorbonne-universite.fr)

Internal Partners:

  1. ISIR, Sorbonne University, Mohamed Chetouani, Silvia Tulli
  2. Vrije Universiteit Amsterdam, Kim Baraka  

 

Human-Interactive Robot Learning (HIRL) is an area of robotics that focuses on developing robots that can learn from and interact with humans. This educational module aims to cover the basic principles and techniques of Human-Interactive Robot Learning. This interdisciplinary module will encourage graduate students (Master/PhD level) to connect different bodies of knowledge within the broad field of Artificial Intelligence, with insights from Robotics, Machine Learning, Human Modelling, and Design and Ethics. The module is meant for Master’s and PhD students in STEM fields such as Computer Science, Artificial Intelligence, and Cognitive Science. This work will extend the tutorial presented in the context of the International Conference on Algorithms, Computing, and Artificial Intelligence (ACAI 2021) and will be shared with the Artificial Intelligence Doctoral Academy (AIDA). Moreover, the proposed lectures and assignments will be used as teaching material at Sorbonne University and Vrije Universiteit Amsterdam. We plan to design a collection of approximately 12 1.5-hour lectures, 5 assignments, and a list of recommended readings, organized along relevant topics surrounding HIRL. Each lecture will include an algorithmic part and a practical example of how to integrate such an algorithm into an interactive system. The assignments will encompass the replication of existing algorithms, with the possibility for students to develop their own alternative solutions. Proposed module contents (each lecture approx. 1.5 hours):

  1. Interactive Machine Learning vs Machine Learning – 1 lecture
  2. Interactive Machine Learning vs Interactive Robot Learning (embodied vs non-embodied agents) – 1 lecture
  3. Fundamentals of Reinforcement Learning – 2 lectures
  4. Learning strategies: observation, demonstration, instruction, or feedback – Imitation Learning, Learning from Demonstration – 2 lectures; Learning from Human Feedback: evaluative, descriptive, imperative, contrastive examples – 3 lectures
  5. Evaluation metrics and benchmarks – 1 lecture
  6. Application scenarios: hands-on session – 1 lecture
  7. Design and ethical considerations – 1 lecture
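
For illustration, a minimal sketch of one topic from the module, learning from human evaluative feedback, where a scalar reward given by a person directly updates the value of the action the agent just took (a simplified, TAMER-style update; the actions and feedback rule are toy assumptions):

    # Sketch: a robot's action values shaped directly by human evaluative feedback.
    import random
    from collections import defaultdict

    actions = ["wave", "point", "hand_over_object"]
    value = defaultdict(float)          # human-trained value of each action
    alpha = 0.3                         # learning rate

    def choose_action(epsilon: float = 0.2) -> str:
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: value[a])

    for step in range(20):
        a = choose_action()
        # Stand-in for a real human: rewards the robot for handing the object over.
        human_feedback = 1.0 if a == "hand_over_object" else -0.2
        value[a] += alpha * (human_feedback - value[a])

    print(dict(value))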

Contact person: Patrick Paroubek, LIMSI-CNRS (pap@limsi.fr)

Internal Partners:

  1. Centre national de la recherche scientifique (CNRS), Patrick Paroubek  

External Partners:

  1. Charles University Prague, O. Dušek  

 

We aim to evaluate the usefulness of current dialogue dataset annotation and propose annotation unification and automated enhancements for better user modeling by training on larger amounts of data. Current datasets’ annotation is often geared only toward the dialogue system learning how to answer, while the user representation should be explicit, consistent, and as complete as possible to enable more complex (e.g., cognitive) user modeling. The project will start from existing annotated dialogue corpora and produce extended versions, with improved annotation consistency and extra user-representation annotations produced automatically from existing corpora such as bAbI++ and MultiWOZ, among others. We will explore unifying annotations from multiple datasets and evaluate the enhanced annotation using our own end-to-end dialogue models based on memory networks.
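
For illustration, a minimal sketch of annotation unification: turns annotated under different source schemas are mapped onto one shared dialogue-act set; the mapping tables below are toy assumptions, not the actual DIASER schema:

    # Sketch: map source-specific dialogue-act labels onto a unified label set.
    UNIFIED_ACTS = {"inform", "request", "confirm", "goodbye"}

    SCHEMA_MAPPINGS = {
        "multiwoz": {"Inform": "inform", "Request": "request",
                     "Recommend": "inform", "Bye": "goodbye"},
        "babi":     {"api_call": "request", "answer": "inform"},
    }

    def unify(turn: dict) -> dict:
        mapping = SCHEMA_MAPPINGS[turn["source"]]
        act = mapping.get(turn["act"], "inform")
        assert act in UNIFIED_ACTS
        return {"text": turn["text"], "act": act, "source": turn["source"]}

    print(unify({"source": "multiwoz", "act": "Recommend",
                 "text": "The Eagle is a nice pub in the centre."}))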

Results Summary

  1. A corpus of 37,173 annotated dialogues with unified and enhanced annotations was built from existing open dialogue resources.
  2. Code and trained models (GPT-2, MarCo) for dialogue response generation on the above corpus were generated.
  3. Ongoing collaboration between LISN (Paris-Saclay University) and the Faculty of Mathematics and Physics (Charles University, Prague).

Tangible Outcomes

  1. Schaub, Léon-Paul, Vojtech Hudecek, Daniel Stancl, Ondrej Dusek, and Patrick Paroubek. “Defining and detecting inconsistent system behavior in task-oriented dialogues.” In Traitement Automatique des Langues Naturelles, pp. 142-152. ATALA, 2021. https://hal.science/TALN-RECITAL2021/hal-03265892 https://aclanthology.org/2021.jeptalnrecital-taln.13/ 
  2. Vojtěch Hudeček, Léon-Paul Schaub, Daniel Stancl, Patrick Paroubek, and Ondřej Dušek. 2022. DIASER: A Unifying View On Task-oriented Dialogue Annotation. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1286–1296, Marseille, France. European Language Resources Association. https://aclanthology.org/2022.lrec-1.137/
  3. Dataset: DIASER corpus – Ondrej Dusek: A corpus of 37,173 annotated dialogues with unified and enhanced annotations built from existing open dialogue resources. https://gitlab.com/ufal/dsg/diaser
  4. Video presentation summarizing the project