Contact person: Mehdi Khamassi (mehdi.khamassi@sorbonne-universite.fr)

Internal Partners:

  1. Sorbonne University, Mehdi Khamassi, mehdi.khamassi@sorbonne-universite.fr
  2. ATHINA Research Center, Petros Maragos, maragos@cs.ntua.gr

 

This project entails online behavioral adaptation of a robot during interactive learning with humans. Specifically, the robot shall adapt to each human subject’s specific way of giving feedback during the interaction. Feedback here includes reward, instruction, and demonstration, and can be grouped under the term “teaching signals”. For example, some human subjects prefer a proactive robot while others prefer the robot to wait for their instructions; some only tell the robot when it performs a wrong action, while others reward correct actions, etc. The main outcome is a new ensemble method for human-robot interaction that can learn models of various human feedback strategies and use them for online tuning of reinforcement learning, so that the robot can quickly learn an appropriate behavioral policy. We first derive an optimal solution to the problem and then compare the empirical performance of ensemble methods to this optimum through a set of numerical simulations.

Results Summary

We designed a new ensemble learning algorithm, combining model-based and model-free reinforcement learning, for on-the-fly robot adaptation during human-robot interaction. The algorithm includes a mechanism for the robot to autonomously detect changes in a human’s reward function from the human’s observed behavior, and to reset the ensemble learning accordingly. We simulated a series of human-robot interaction scenarios to test the robustness of the algorithm. In scenario 1, the human rewards the robot with various feedback profiles: stochastic reward; non-monotonic reward; or punishing errors without rewarding correct responses. In scenario 2, the human teaches the robot through demonstrations, again with different degrees of stochasticity and levels of expertise from the human. In scenario 3, we simulated a human-robot cooperation task for putting a set of cubes in the right box; the task includes abrupt changes in the target box. The results show the generality of the algorithm. Humans and robots are bound to cooperate more and more within society. This micro-project addresses a major AI challenge: enabling robots to adapt on-the-fly to different situations and to human users with varying levels of expertise. The solution consists of designing a robot learning algorithm that generalizes to a variety of simple human-robot interaction scenarios. Following the HumanE AI vision, interactive learning puts the human in the loop, prompting human-aware robot behavioral adaptation.
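As a rough illustration of the arbitration-and-reset idea described above (the full algorithm is in the paper and repository listed under Tangible Outcomes), a minimal sketch might look as follows; the class name, thresholds, and the deliberately simplified “model-based” expert are illustrative assumptions, not the released meta-control code.

```python
# Minimal illustrative sketch: a model-free Q-learner and a (trivially simplified)
# model-based expert are arbitrated by tracking which one has recently predicted
# the human's reward better, and the ensemble statistics are reset when observed
# rewards diverge strongly from expectations (a proxy for a change in the human's
# reward function). Not the released code; all names/thresholds are assumptions.
import random
from collections import defaultdict

class EnsembleAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.9, reset_threshold=0.5):
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.q_mf = defaultdict(float)               # model-free Q-values
        self.model = defaultdict(float)              # crude model-based reward estimates
        self.expert_error = {"mf": 0.0, "mb": 0.0}   # running reward-prediction errors
        self.reset_threshold = reset_threshold

    def act(self, state, epsilon=0.1):
        # Meta-control: use the expert with the lower recent prediction error.
        expert = min(self.expert_error, key=self.expert_error.get)
        values = self.q_mf if expert == "mf" else self.model
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: values[(state, a)])

    def update(self, state, action, reward, next_state):
        # Track how surprised each expert is by the human's feedback.
        for name, values in (("mf", self.q_mf), ("mb", self.model)):
            err = abs(reward - values[(state, action)])
            self.expert_error[name] = 0.9 * self.expert_error[name] + 0.1 * err
        # Reset ensemble statistics if the human's reward function seems to have changed.
        if min(self.expert_error.values()) > self.reset_threshold:
            self.expert_error = {"mf": 0.0, "mb": 0.0}
            self.model.clear()
        # Standard Q-learning update for the model-free expert.
        best_next = max(self.q_mf[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q_mf[(state, action)]
        self.q_mf[(state, action)] += self.alpha * td_error
        # The "model-based" expert here simply tracks the average observed reward.
        self.model[(state, action)] += 0.5 * (reward - self.model[(state, action)])
```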

Tangible Outcomes

  1. Rémi Dromnelle, Erwan Renaudo, Benoît Girard, Petros Maragos, Mohamed Chetouani, Raja Chatila, Mehdi Khamassi (2022). Reducing computational cost during robot navigation and human-robot interaction with a human-inspired reinforcement learning architecture. International Journal of Social Robotics. doi: 10.1007/s12369-022-00942-6. Preprint available on HAL: https://hal.sorbonne-universite.fr/hal03829879
  2. Open source code: https://github.com/DromnHell/meta-control-decision-making-agent 

Contact person: Aart van Halteren (a.t.van.halteren@vu.nl)

Internal Partners:

  1. Stichting VU, Koen Hindriks
  2. Philips Electronics Nederland B.V., Aart van Halteren  

 

The communication between patients and healthcare institutions is increasingly moving to digital applications. Information about the patient’s wellbeing is typically collected by means of a questionnaire, but this is a tedious task for many patients, especially when it has to be done periodically, and may result in incomplete or imprecise input. Much can be gained by making the process of filling in such questionnaires more interactive by deploying a conversational agent that can not only ask the questions, but also ask follow-up questions and respond to clarification questions by the user; a minimal sketch of this interaction pattern is given below. We propose to deploy and test such a system. The work benefits from existing experience with patient-provider communication within Philips and builds on the SUPPLE framework for dialog management and sequence expansion.
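The core loop of such a conversational questionnaire agent could look like the following toy sketch; it is an illustrative assumption only, not the SUPPLE framework API, and the question set and helper function are invented for the example.

```python
# Toy sketch of a conversational questionnaire agent: ask items, answer
# clarification questions, and trigger follow-ups. Illustrative only.
QUESTIONS = [
    {"id": "pain",
     "text": "How strong was your pain today (0-10)?",
     "help": "0 means no pain at all, 10 means the worst pain imaginable.",
     "follow_up": lambda ans: "What made the pain worse?" if ans.isdigit() and int(ans) >= 7 else None},
]

def run_questionnaire(ask):
    """`ask` is any function that shows a prompt and returns the user's reply."""
    answers = {}
    for q in QUESTIONS:
        while True:
            reply = ask(q["text"]).strip()
            if reply.endswith("?"):          # user asks a clarification question
                ask(q["help"])               # respond with the item's help text
                continue
            answers[q["id"]] = reply
            follow_up = q["follow_up"](reply)
            if follow_up:                    # conditionally ask a follow-up question
                answers[q["id"] + "_detail"] = ask(follow_up)
            break
    return answers

if __name__ == "__main__":
    print(run_questionnaire(lambda prompt: input(prompt + " ")))
```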

Contact person: Richard Niestroj, VW Data:Lab Munich, Yuanting Liu (liu@fortiss.org; yuanting.liu@fortiss.org)

Internal Partners:

  1. Volkswagen AG, Richard Niestroj
  2. Consiglio Nazionale delle Ricerche (CNR), Mirco Nanni
  3. fortiss GmbH, Yuanting Liu  

 

The goal is to build a simulation environment for testing applications based on connected-car data. AI-based car data applications save people’s time by guiding drivers and vehicles intelligently. This leads to a reduction of the environmental footprint of the transportation sector by reducing local and global emissions. The development and use of a simulation environment enables data privacy compliance in the development of AI-based applications.

Tangible Outcomes

  1. Video presentation summarizing the project

Contact person: Haris Papageorgiou (Athena RC) (haris@athenarc.gr)

Internal Partners:

  1. ATHENA RC / ILSP, Haris Papageorgiou, haris@athenarc.gr
  2. DFKI, Georg Rehm, georg.rehm@dfki.de

 

Knowledge discovery offers numerous challenges and opportunities. In the last decade, a significant number of applications have emerged that rely on evidence from the scientific literature. AI methods offer innovative ways of applying knowledge discovery to the scientific literature, facilitating automated reasoning, discovery, and decision making on data. This micro-project focuses on the task of question answering (QA) for the biomedical domain. Our starting point is a neural QA engine developed by ILSP that addresses experts’ natural language questions by jointly applying document retrieval and snippet extraction on a large collection of PubMed articles, thus facilitating medical experts in their work. DFKI will augment this system with a knowledge graph integrating the output of document analysis and segmentation modules. The knowledge graph will be incorporated into the QA system and used for exact answers and more efficient human-AI interactions. We primarily focus on scientific articles on COVID-19 and SARS-CoV-2.
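To illustrate the retrieve-then-extract pipeline with a knowledge-graph shortcut for exact answers, here is a toy sketch under stated assumptions; the documents, graph, and scoring are invented placeholders and do not reflect the ILSP/DFKI implementation.

```python
# Toy QA pipeline: keyword-overlap document retrieval, snippet extraction, and a
# tiny knowledge-graph lookup for exact answers. Illustrative placeholders only.
import re
from collections import Counter

DOCUMENTS = {
    "doc1": "SARS-CoV-2 is the coronavirus that causes COVID-19. It spreads mainly via respiratory droplets.",
    "doc2": "Remdesivir has been evaluated as an antiviral treatment for hospitalized COVID-19 patients.",
}
# A tiny knowledge graph: (subject, relation) -> object, e.g. built from document analysis.
KNOWLEDGE_GRAPH = {("SARS-CoV-2", "causes"): "COVID-19"}

def tokenize(text):
    return re.findall(r"[a-z0-9\-]+", text.lower())

def retrieve(question, k=1):
    """Rank documents by simple term overlap with the question."""
    q_terms = Counter(tokenize(question))
    scores = {doc_id: sum(q_terms[t] for t in tokenize(text))
              for doc_id, text in DOCUMENTS.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def extract_snippet(question, doc_id):
    """Return the sentence of the document that overlaps most with the question."""
    sentences = re.split(r"(?<=[.!?])\s+", DOCUMENTS[doc_id])
    q_terms = set(tokenize(question))
    return max(sentences, key=lambda s: len(q_terms & set(tokenize(s))))

def answer(question):
    # Exact answer from the knowledge graph if the question matches a known relation.
    for (subj, rel), obj in KNOWLEDGE_GRAPH.items():
        if subj.lower() in question.lower() and rel in question.lower():
            return {"exact_answer": obj, "snippets": []}
    top_docs = retrieve(question)
    return {"exact_answer": None,
            "snippets": [extract_snippet(question, d) for d in top_docs]}

print(answer("SARS-CoV-2 causes which disease?"))
print(answer("Which antiviral treatment was evaluated for COVID-19 patients?"))
```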

Tangible Outcomes

  1. Video presentation summarizing the project

Contact person: Albrecht Schmidt, Robin Welsch (albrecht.schmidt@ifi.lmu.de; robin.welsch@aalto.fi)

Internal Partners:

  1. Ludwig-Maximilians-Universität München (LMU), Albrecht Schmidt
  2. German Research Centre for Artificial Intelligence (DFKI), Paul Lukowicz and Patrick Gebhard  

 

We proposed to research how autobiographical recall can be detected in virtual reality (VR). In particular, we experimentally investigate which physiological parameters accompany interaction with autobiographical memories in VR. We consider VR an important representation of human-AI collaboration. For this, we plan to: (1) record an EEG dataset of people’s reactions and responses when recalling an autobiographical memory, (2) label the dataset, and (3) perform an initial analysis of the dataset to inform the design of autobiographical VR experiences. We will try to automate data collection as much as possible to make it easy to add more data over time. This contributes to a longer-term effort in model and theory formation.

Tangible Outcomes

  1.  Dataset: Pilot dataset – Kunal Gupta & Mark Billinghurst https://github.com/kgupta2789/AMinVR
  2. Dataset: eye tracking data during encoding phase https://github.com/kgupta2789/AMinVR/tree/main/data/pupil
  3. Workshop on Human Memory and AI – Albrecht Schmidt, Antti Oulasvirta, Robin Welsch & Kashyap Todi https://www.humane-ai.eu/event/ai-and-human-memory/
  4. Video presentation explaining the project and the VR experience – Robin Welsch, Kunal Gupta & Albrecht Schmidt https://www.youtube.com/watch?v=mGb7Oi5CHNc&ab_channel=KunalGupta
  5. Experimental video for the encoding phase showing the VR experience https://www.youtube.com/watch?v=mGb7Oi5CHNc

Contact person: Hamraz Javaheri (hamraz.javaheri@dfki.de)

Internal Partners:

  1. DFKI  

External Partners:

  1. Hospital Saarbrücken “Der Winterberg”  

 

In this project, we successfully implemented and clinically evaluated an AR assistance system for pancreatic surgery, enhancing surgical navigation and achieving more precise perioperative outcomes. However, the system’s reliance on preoperative data posed challenges, particularly due to anatomical deformations occurring in the later stages of surgery. In future research, we aim to address this by integrating real-time data sources to further improve the system’s accuracy and adaptability during surgery.

Results Summary

Throughout our project, we developed and clinically evaluated ARAS, an augmented reality (AR) assistance system designed for pancreatic surgery. The system was clinically evaluated by field surgeons during pancreatic tumor resections involving 20 patients. In a matched-pair analysis with 60 patients who underwent surgery without ARAS, the ARAS group demonstrated a significantly shorter operation time than the control group. Although not statistically significant, the ARAS group also exhibited clinically noticeable lower rates of excessive intraoperative bleeding and a reduced need for intraoperative red blood cell (RBC) transfusions. Furthermore, ARAS enabled more precise tumor resections with tumor-free margins, and patients in this group had better postoperative outcomes, including significantly shorter hospital stays. Within this project, we published 2 journal papers (1 accepted and to be published soon), 1 conference paper, and 1 demo paper (Best Demo Paper Award), with 2 more conference papers currently under submission. The project also attracted international and local news and media attention, including coverage by the Deutsche Welle news channel (example links provided below).

Tangible Outcomes

  1. Beyond the visible: preliminary evaluation of the first wearable augmented reality assistance system for pancreatic surgery. International Journal of Computer Assisted Radiology and Surgery. https://doi.org/10.1007/s11548-024-03131-0
  2. Enhancing Perioperative Outcomes of Pancreatic Surgery with Wearable Augmented Reality Assistance System: A Matched-Pair Analysis. Annals of Surgery Open. https://doi.org/10.1097/AS9.0000000000000516
  3. Design and Clinical Evaluation of ARAS: An Augmented Reality Assistance System for Pancreatic Surgery. IEEE ISMAR 2024. https://www.researchgate.net/publication/385116946_Design_and_Clinical_Evaluation_of_ARAS_An_Augmented_Reality_Assistance_System_for_Open_Pancreatic_Surgery_Omid_Ghamarnejad
  4. ARAS: LLM-Supported Augmented Reality Assistance System for Pancreatic Surgery. ISWC/UbiComp 2024. https://doi.org/10.1145/3675094.3677543
  5. Media coverage for the project:
    1. https://www.dw.com/en/artificial-intelligence-saving-lives-in-the-operating-room/video-68125878
    2. https://www.dw.com/de/k%C3%BCnstliche-intelligenz-im-op-saal-rettet-leben/video-68125903
    3. https://www.saarbruecker-zeitung.de/app/consent/?ref=https%3A%2F%2Fwww.saarbruecker-zeitung.de%2Fsaarland%2Fsaarbruecken%2Fsaarbruecken%2Fsaarbruecken-winterberg-klinik-international-im-tv-zu-sehen_aid-106311259
    4. https://www.saarbruecker-zeitung.de/app/consent/?ref=https%3A%2F%2Fwww.saarbruecker-zeitung.de%2Fsaarland%2Fsaarbruecken-mittels-ki-erfolgreiche-operation-an-82-jaehriger-v29_aid-104053203
    5. https://m.focus.de/gesundheit/gesundleben/da-gibt-es-keinen-raum-fuer-fehler-kuenstliche-intelligenz-im-op-saal-rettet-leben_id_259629806.html 

Contact person: Petr Schwarz, Brno University of Technology (schwarzp@fit.vutbr.cz)

Internal Partners:

  1. Brno University of Technology, Petr Schwarz, schwarzp@fit.vutbr.cz
  2. Charles University, Ondrej Dusek, odusek@ufal.mff.cuni.cz

 

This project brings us data, tools, and baselines that enable us to study and improve context exchange between dialog system components and dialog sides (AI agent and human) in voice dialog systems. A better context exchange allows us to build more accurate automatic speech transcription, better dialog flow modeling, more fluent speech synthesis, and more powerful AI agents. The context exchange can be seen as interactive grounding in two senses: among dialog sides (for example, automatic speech transcription rarely uses information from the other dialog side to adapt itself) and among dialog system components (speech synthesis rarely uses dialog context to produce more fluent or expressive speech). The individual project outputs are summarized below.

Results Summary

1) Audio data collection software based on the Twilio platform and WebRTC desktop/mobile device clients. The purpose is to collect audio data of communication between agents (company, service provider, for example, travel info provider) and users. This software enables us to collect very realistic voice dialogs that have high-quality audio (>= 16kHz sampling frequency) on the agent side and low telephone-quality audio on the user side. The code is available here: https://github.com/oplatek/speechwoz

2) We have established a relationship with Paweł Budzianowski (Poly.AI) and Izhak Shafran (Google). Paweł created the MultiWoz database – an excellent dialog corpus (https://arxiv.org/abs/1810.00278) that we use for the text-based experiment. We decided to collect our audio data similarly. Izhak organized DSTC11 Speech Aware Dialog System Technology Challenge (https://arxiv.org/abs/2212.08704) and created artificial audio data for MultiWOZ through speech synthesis, reading, and paraphrasing. Both provided us with the necessary advice for our data collection.

3) Speech dialog data – the data collection platform preparation and data collection are very time-consuming. The data collection is in progress and will be released before June 26th, 2023.

4) Initial experiments with context exchange between dialog sides (user and agent) were performed. These experiments show a clear improvement on the automatic speech recognition component. The results will be re-run with the collected data and published when the collection is finished.

5) Initial experiments were performed with training-instance weighting for response generation, which brings more context into dialog system response generation (a minimal illustrative sketch follows below, after point 6). The experiments were based on the AuGPT system previously developed at CUNI; the code is available here: https://github.com/knalin55/augpt. Instance weighting increases the re-use of context compared to normal training, and can go even beyond its natural occurrence in the data. Simple weighting (a threshold) seems better than designing a complex instance weight (in terms of automated metrics; the limited manual evaluation is not conclusive). Cross-entropy loss works better than unlikelihood loss, with which dialogue success may be reduced.

6) We organized a topic at the JSALT research summer workshop on “Automatic design of conversational models from observation of human-to-human conversation” (https://jsalt2023.univ-lemans.fr/en/automatic-design-of-conversational-models-fromobservation-of-human-to-human-conversation.html; https://www.clsp.jhu.edu/2023-jelinek-summer-workshop; https://jsalt2023.univ-lemans.fr/en/index.html). This is a prestigious workshop organized by Johns Hopkins University every year; this year it is supported and co-organized by the University of Le Mans. Our topic passed a scientific review by more than 40 world-class AI researchers in Baltimore, USA, in December 2022, and was selected for the workshop out of 15 proposals together with three others. The workshop topic builds on the outcome of this micro-project and will reuse the collected data.
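As a minimal illustration of the instance-weighting idea from point 5 (an illustrative assumption only, not the AuGPT training code linked under Tangible Outcomes), the following sketch up-weights the cross-entropy loss of training instances whose responses reuse words from the dialog context:

```python
# Rough sketch of thresholded instance weighting: responses that entrain to the
# dialog context get a higher training weight, and the per-instance language
# modeling loss is scaled by that weight. Illustrative names and values only.
import torch
import torch.nn.functional as F

def context_overlap_weight(context_tokens, response_tokens, threshold=0.2, high=2.0, low=1.0):
    """Simple thresholded weighting: up-weight responses that reuse context words."""
    overlap = len(set(response_tokens) & set(context_tokens)) / max(len(set(response_tokens)), 1)
    return high if overlap >= threshold else low

def weighted_lm_loss(logits, targets, weight, pad_id=0):
    """Cross-entropy over the response tokens, scaled by the instance weight."""
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1),
                           ignore_index=pad_id, reduction="mean")
    return weight * loss

# Example with dummy tensors standing in for a language model's output.
vocab, seq_len = 100, 6
logits = torch.randn(1, seq_len, vocab)
targets = torch.randint(1, vocab, (1, seq_len))
w = context_overlap_weight(["book", "a", "cheap", "hotel"], ["the", "cheap", "hotel", "is", "booked"])
print(weighted_lm_loss(logits, targets, w))
```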

Tangible Outcomes

  1. Nalin Kumar and Ondrej Dusek. 2024. LEEETs-Dial: Linguistic Entrainment in End-to-End Task-oriented Dialogue systems. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 727–735, Mexico City, Mexico. Association for Computational Linguistics https://aclanthology.org/2024.findings-naacl.46/ 
  2. Code for audio data collection: https://github.com/oplatek/speechwoz 
  3. Code for end-to-end response generation: https://github.com/knalin55/augpt 
  4. Report for end-to-end response generation: https://docs.google.com/document/d/1iQB1YWr3wMO8aEB08BUYBqiLh0KreYjyO4EHnb395Bo/edit 
  5. “Automatic design of conversational models from observation of human-to-human conversation” workshop in the prestigious JSALT research summer workshops program https://jsalt2023.univ-lemans.fr/en/automatic-design-of-conversational-models-from-observation-of-human-to-human-conversation.html
  6. workshop proposal: https://docs.google.com/document/d/19PAOkquQY6wnPx_wUXIx2EaInYchoCRn/edit 
  7. presentations from the prestigious JSALT research summer workshop: https://youtu.be/QS5zXkpXV3Q 

Contact person: Florian Müller, LMU (florian.mueller@tu-darmstadt.de)

Internal Partners:

  1. Ludwig-Maximilians-Universität München (LMU), Florian Müller  

External Partners:

  1. Ecole Nationale de l’Aviation Civile (ENAC), Anke Brock  

 

Pilots frequently need to react to unforeseen in-flight events. Taking adequate decisions in such situations requires considering all available information and demands strong situational awareness. Modern on-board computers and technologies like GPS have radically improved pilots’ ability to take appropriate actions and lowered their workload in recent years. Yet, current technologies used in aviation cockpits generally still fail to adequately map and represent 3D airspace. In response, we aim to create an AI aviation assistant that considers all relevant aircraft operation data, focuses on providing tangible action recommendations, and visualizes them for efficient and effective interpretation in 3D space. In particular, we note that extended reality (XR) applications provide an opportunity to augment pilots’ perception through live 3D visualizations of key flight information, including airspace structure, traffic information, airport highlighting, and traffic patterns. While XR applications have been tested in aviation in the past, they are mostly limited to military aviation and the latest commercial aircraft. This ignores the majority of pilots, in general aviation in particular, where such support could drastically increase situational awareness and lower pilot workload. General aviation is the non-commercial branch of aviation, often characterized by single-engine and single-pilot operations.

To develop applications usable across aviation domains, we planned to create a Unity project for XR glasses. Based on this, we planned, in a first step, to systematically and iteratively explore suitable AI-based support informed by pilot feedback in a virtual reality study in a flight simulator. Based on our findings, we would refine the Unity application and investigate opportunities to conduct a real test flight with our external partner ENAC, the French National School of Civil Aviation, which owns a plane. Such a test flight would most likely use the latest augmented reality headsets such as the HoloLens 2. Considering the immense safety requirements for such a real test flight, this part of the project is considered optional at this stage and depends on the findings from the preceding virtual reality evaluation. The system development particularly focuses on the use of XR techniques to create more effective AI-supported traffic advisories and visualizations. With this, we want to advance the coordination and collaboration of AI with human partners, establishing a common ground as a basis for multimodal interaction with AI (WP3 motivated). Further, the MP relates closely to “Innovation projects (WP6 and WP7 motivated)”, calling for solutions that address “real-world challenges and opportunities in various domains such as (…) transportation […]”.

Results Summary

We explored AI and mixed reality for pilot support. One of the results is an early mixed reality prototype for a popular consumer-grade flight simulator that allows pilots to intuitively perceive 3D information that current 2D tools cannot present satisfactorily. Based on this mockup, we conducted a very early exploration of AI support strategies that would allow, for example, converting air traffic control instructions into flight path renderings.
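As a toy illustration of the kind of AI support explored, the sketch below turns a textual air traffic control instruction into structured data that an XR layer could render as a 3D flight path; the grammar and field names are assumptions for illustration and are not part of the prototype described above.

```python
# Toy parser for simple ATC-style phrases; illustrative only.
import re

def parse_atc_instruction(text):
    """Extract heading, altitude, and runway from a simple ATC-style phrase."""
    instruction = {}
    if m := re.search(r"heading (\d{3})", text, re.IGNORECASE):
        instruction["heading_deg"] = int(m.group(1))
    if m := re.search(r"(climb|descend)[^\d]*(\d{3,5}) ?(?:feet|ft)", text, re.IGNORECASE):
        instruction["target_altitude_ft"] = int(m.group(2))
        instruction["vertical"] = m.group(1).lower()
    if m := re.search(r"runway (\d{2}[LRC]?)", text, re.IGNORECASE):
        instruction["runway"] = m.group(1).upper()
    return instruction

print(parse_atc_instruction("Turn left heading 270, climb and maintain 5000 feet, expect runway 25L"))
# -> {'heading_deg': 270, 'target_altitude_ft': 5000, 'vertical': 'climb', 'runway': '25L'}
```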

Contact person: Mireia Diez Sanchez (mireia@fit.vutbr.cz)

Internal Partners:

  1. BUT, Brno University of Technology, Mireia Diez Sanchez, mireia@fit.vutbr.cz; cernocky@fit.vutbr.cz
  2. TUB, Technische Universität Berlin, Tim Polzehl, tim.polzehl@dfki.de; klaus.r.mueller@googlemail.com

 

In this microproject, we pursued enabling access to AI technology for those who might have special needs when interacting with it: “Automatic Speech Recognition made accessible for people with dysarthria”. Dysarthria is a motor speech disorder resulting from neurological injury and is characterized by poor articulation of phonemes. Within automatic speech recognition (ASR), dysarthric speech recognition is a challenging task due to the lack of supervised data and its diversity.

The project studied the adaptation of automatic speech recognition (ASR) systems for impaired speech. Specifically, the micro-project focused on improving ASR systems for speech from subjects with dysarthria and/or stuttering speech impairments of various degrees. The work was developed using German “Lautarchive” data, comprising only 130 hours of untranscribed doctor-patient German speech conversations, and the English TORGO dataset, applying human-in-the-loop methods. We spot individual errors and regions of low certainty in ASR in order to apply human-originated improvements and clarifications in AI decision processes.

Results Summary

In particular, we studied the performance of different ASR systems on dysarthric speech: LF-MMI, Transformer, and wav2vec2. The analysis revealed the superiority of the wav2vec2 models on this task. We investigated the importance of speaker-dependent auxiliary features such as fMLLR and x-vectors for adapting wav2vec2 models to improve dysarthric speech recognition. We showed that, in contrast to hybrid systems, wav2vec2 did not improve when adapting model parameters to each speaker.

We proposed a wav2vec2 adapter module that takes speaker features as auxiliary information to perform effective speaker normalization during fine-tuning. We showed that, when using the adapter module, fMLLR and x-vectors are complementary to each other, and proved the effectiveness of the approach by outperforming the existing state of the art on UASpeech dysarthric speech ASR.

In our cross-lingual experiments, we also showed that combining English and German data for training can further improve the performance of our systems, proving useful in scenarios where few training examples exist for a particular language.
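For illustration, a minimal PyTorch sketch of the adapter idea is given below: a bottleneck module inserted into the wav2vec2 encoder that also receives a per-utterance speaker embedding (e.g., an x-vector or an fMLLR-derived feature). Dimensions, placement, and naming are assumptions; this is not the published recipe.

```python
# Illustrative speaker-adapter sketch: concatenate a speaker embedding with the
# frame-level hidden states, pass through a bottleneck, and add residually.
import torch
import torch.nn as nn

class SpeakerAdapter(nn.Module):
    def __init__(self, hidden_dim=768, speaker_dim=512, bottleneck_dim=256):
        super().__init__()
        self.down = nn.Linear(hidden_dim + speaker_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, hidden_states, speaker_embedding):
        # hidden_states: (batch, time, hidden_dim); speaker_embedding: (batch, speaker_dim)
        spk = speaker_embedding.unsqueeze(1).expand(-1, hidden_states.size(1), -1)
        adapted = self.up(self.act(self.down(torch.cat([hidden_states, spk], dim=-1))))
        return self.norm(hidden_states + adapted)   # residual connection

# Example: adapt frame-level features with a per-utterance speaker embedding.
adapter = SpeakerAdapter()
frames = torch.randn(2, 100, 768)       # dummy wav2vec2 hidden states
xvectors = torch.randn(2, 512)          # dummy speaker embeddings
print(adapter(frames, xvectors).shape)  # torch.Size([2, 100, 768])
```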

 

Tangible Outcomes

  1. M. K. Baskar, T. Herzig, D. Nguyen, M. Diez, T. Polzehl, L. Burget, J. Černocký, “Speaker adaptation for Wav2vec2 based dysarthric ASR”. Proc. Interspeech 2022, 3403-3407, doi: 10.21437/Interspeech.2022-10896 https://www.isca-speech.org/archive/pdfs/interspeech_2022/baskar22b_interspeech.pdf
  2. Open-source tool for training ASR models for dysarthric speech. The repository contains a baseline recipe to train a TDNN-CNN hybrid ASR system, prepared for the TORGO dataset, and an end-to-end model using the ESPnet framework, prepared for the UASpeech dataset. https://github.com/creatorscan/Dysarthric-ASR

Contact person: Jesus Cerquides (j.cerquides@csic.es)

Internal Partners:

  1. CSIC Consejo Superior de Investigaciones Científicas, Jesus Cerquides
  2. CNR Consiglio Nazionale delle Ricerche, Daniele Vilone   

 

Many citizen science projects have a crowdsourcing component where several different citizen scientists are requested to fulfill a micro task (such as tagging an image as either relevant or irrelevant for the evaluation of damage in a natural disaster, or identifying a specimen into its taxonomy). How do we create a consensus between the different opinions/votes? Currently, most of the time, simple majority voting is used. We argue that alternative voting schemes (taking into account the errors performed by each annotator) could severely reduce the number of citizen scientists required. This is a clear example of continuous human-in-the-loop machine learning with the machine creating a model of the humans that it has to interact with. We propose to study consensus building under two different hypotheses: truthful annotators (as a model for most voluntary citizen science projects) and self-interested annotators (as a model for paid crowdsourcing projects).

Results Summary

We have contributed to the implementation of several different probabilistic consensus models in the Crowdnalysis library, which has been released as a Python package.

We have proposed a generic mathematical framework for the definition of probabilistic consensus algorithms, and for performing prospective analysis. This has been published in a journal paper.

We have used the library and the mathematical framework for the analysis of images from the Albanian earthquake scenario.

We exploited Monte Carlo simulations to understand the best way to aggregate group decisions when evaluating the level of damage in natural catastrophes. The results suggest that the majority rule is the best option as long as all agents are competent enough to address the task. Otherwise, when the number of unqualified agents is no longer negligible, smarter procedures must be found.
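To make the contrast concrete, the toy sketch below compares simple majority voting with a Dawid-Skene-style EM consensus that models each annotator’s error rates; it is an illustrative example, not the Crowdnalysis API listed under Tangible Outcomes.

```python
# Majority voting vs. a Dawid-Skene-style EM consensus over annotator labels.
import numpy as np

def majority_vote(labels):
    """labels: (n_tasks, n_annotators) array of class indices."""
    n_classes = labels.max() + 1
    counts = np.apply_along_axis(lambda row: np.bincount(row, minlength=n_classes), 1, labels)
    return counts.argmax(axis=1)

def dawid_skene(labels, n_classes, n_iter=20):
    n_tasks, n_annotators = labels.shape
    # Initialize the posterior over true labels with majority-vote proportions.
    posterior = np.zeros((n_tasks, n_classes))
    for t in range(n_tasks):
        posterior[t] = np.bincount(labels[t], minlength=n_classes)
    posterior /= posterior.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: class priors and per-annotator confusion matrices (soft counts).
        prior = posterior.mean(axis=0)
        confusion = np.full((n_annotators, n_classes, n_classes), 1e-6)
        for a in range(n_annotators):
            for t in range(n_tasks):
                confusion[a, :, labels[t, a]] += posterior[t]
        confusion /= confusion.sum(axis=2, keepdims=True)
        # E-step: recompute the posterior over each task's true label.
        for t in range(n_tasks):
            log_p = np.log(prior)
            for a in range(n_annotators):
                log_p += np.log(confusion[a, :, labels[t, a]])
            posterior[t] = np.exp(log_p - log_p.max())
            posterior[t] /= posterior[t].sum()
    return posterior.argmax(axis=1)

# Three annotators label five images as damaged (1) or not (0); annotator 3 is noisy.
votes = np.array([[1, 1, 0], [0, 0, 1], [1, 1, 1], [0, 0, 0], [1, 1, 0]])
print(majority_vote(votes), dawid_skene(votes, n_classes=2))
```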

Tangible Outcomes

  1. Program/code: Crowdnalysis Python package – Jesus Cerquides (cerquide@iiia.csic.es) https://pypi.org/project/crowdnalysis/ 

Contact person: Teresa Hirzle (tehi@di.ku.dk)

Internal Partners:

  1. UCPH, Teresa Hirzle, kash@di.ku.dk
  2. LMU, Florian Müller, albrecht.schmidt@ifi.lmu.de

External Partners:

  1. Saarland University, Martin Schmitz
  2. Universität Innsbruck, Pascal Knierim  

 

Research on Extended Reality (XR) and Artificial Intelligence (AI) is booming, which has led to an emerging body of literature at their intersection. However, the main topics in this intersection are unclear, as are the benefits of combining XR and AI. In this microproject, we conducted a scoping review that highlights how XR is applied in AI research and vice versa. We screened 2619 publications from 203 international venues published between 2017 and 2021, followed by an in-depth review of 311 papers. Based on our review, we identify five main topics at the intersection of XR and AI, showing how research in the two fields can benefit each other. Furthermore, we present a list of commonly used datasets, software, libraries, and models to help researchers interested in this intersection. Finally, we present 13 research opportunities and recommendations for future work in XR and AI research.

Results Summary

We conducted a scoping review covering 311 papers published between 2017 and 2021.

First, we screened 2619 publications from 203 venues to cover the broad spectrum of XR and AI research. For the search, we inductively built a set of XR and AI terms. The venues include research from XR, AI, Human-Computer Interaction, Computer Graphics, Computer Vision, and others. After a two-phase screening process, we reviewed and extracted data from 311 full papers based on a code book with 26 codes about the research direction, contribution, and topics of the papers, as well as the algorithms, tools, datasets, models, and data types the researchers used to address research questions on XR and AI. The extracted data for these codes form the basis for our predominantly narrative synthesis. As a result, we found five main topics at the intersection of XR and AI: (1) Using AI to create XR worlds (28.6%), (2) Using AI to understand users (19.3%), (3) Using AI to support interaction (15.4%), (4) Investigating interaction with intelligent virtual agents (IVAs) (8.0%), and (5) Using XR to Support AI Research (2.3%).

The remaining 23.8% of the papers apply XR and AI to an external problem, such as for medical training applications (3.5%), or for simulation purposes (3.0%). Finally, we summarise our findings in 13 research opportunities and present ideas and recommendations for how to address them in future work. Some of the most pressing issues are a lack of generative use of AI to create worlds, understand users, and enhance interaction, a lack of generalisability and robustness, and a lack of discussion about ethical and societal implications.

In terms of the call topics, we analysed whether XR can serve as a tool to establish and enhance interactive grounding in human-AI interaction. Here, we found that there is a lack of understanding of user experience during human-AI interaction using XR technology. Typically, AI is used for content creation and to enhance interaction techniques. However, we did not find many papers that use XR to support human-AI interaction. Some works look into artificial agents and how interaction with them can be realised through XR, but most of these do not yet work in real time and are mostly based on mock-up scenes.

Tangible Outcomes

  1. Teresa Hirzle, Florian Müller, Fiona Draxler, Martin Schmitz, Pascal Knierim, and Kasper Hornbæk. 2023. When XR and AI Meet – A Scoping Review on Extended Reality and Artifcial Intelligence. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI ’23), April 23–28, 2023, Hamburg, Germany. ACM, New York, NY, USA, 45 pages. https://doi.org/10.1145/3544548.3581072 https://thirzle.com/pdf/chi23_xrai_scoping_review_hirzle.pdf 
  2. Reviewed Papers and Coding Spreadsheet: https://thirzle.com/supplement/chi23_xrai_scoping_review_hirzle.zip 
  3. CHI’23 conference presentation: https://youtu.be/VDg-2Pz9lj8?feature=shared 

Contact person: John Shawe-Taylor (j.shawe-taylor@ucl.ac.uk)

Internal Partners:

  1. Knowledge 4 All Foundation, Davor Orlic, davor.orlic@gmail.com
  2. University College London, John Shawe-Taylor, j.shawe-taylor@ucl.ac.uk
  3. Institut Jožef Stefan, Davor Orlic and Marko Grobelnik, davor.orlic@gmail.com

 

K4A proposed a microproject to extend its existing prototype of the online learning platform X5LEARN (https://x5learn.org/), developed by K4A, UCL, and JSI and its new IRCAI center under the auspices of UNESCO. It is a standalone, learner-facing web application designed to give access, through an innovative interface, to a portfolio of openly licensed educational resources (OER) in video and textual format. It is designed for lifelong learners looking for specific content and wanting to expand their knowledge, and our aim is to extend it to AI-related topics. The updated application will be released via IRCAI, a newly designated AI center, and integrated with AI4EU with strong HumanE AI branding. The main reason to push the product with IRCAI is that UNESCO is positioning itself as the main UN agency to promote humanistic Artificial Intelligence, a major international policy on the Ethics of AI, and to champion OER, which is in line with HumanE AI.

Results Summary

Under this microproject, a series of extensions to the X5Learn platform was added. A new user-friendly interface was developed and deployed. As X5Learn is an intelligent learning platform, a series of human-centric AI technologies, backed by scientific research, were developed to enable educational recommendation, intelligent previewing of information, and scalable question generation, helping different stakeholders such as teachers and learners. The results have been published in peer-reviewed conferences such as AAAI, AIED, and CHIIR, as well as in the journal Sustainability. The new learning platform is now available to the public, including a Python library that implements the recommendation algorithms developed.

Tangible Outcomes

  1. Maria Pérez Ortiz, Sahan Bulathwela, Claire Dormann, Meghana Verma, Stefan Kreitmayer, Richard Noss, John Shawe-Taylor, Yvonne Rogers, and Emine Yilmaz. 2022. Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos? In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval (CHIIR ’22). Association for Computing Machinery, New York, NY, USA, 90–101. https://doi.org/10.1145/3498366.3505814 
  2. Maria Perez-Ortiz, Claire Dormann, Yvonne Rogers, Sahan Bulathwela, Stefan Kreitmayer, Emine Yilmaz, Richard Noss, and John Shawe-Taylor. 2021. X5Learn: A Personalised Learning Companion at the Intersection of AI and HCI. In 26th International Conference on Intelligent User Interfaces – Companion (IUI ’21 Companion). Association for Computing Machinery, New York, NY, USA, 70–74. https://doi.org/10.1145/3397482.3450721 
  3. Sahan Bulathwela, María Pérez-Ortiz, Emine Yilmaz, and John Shawe-Taylor. 2022. Power to the Learner: Towards Human-Intuitive and Integrative Recommendations with Open Educational Resources. Sustainability 14, 18: 11682. https://doi.org/10.3390/su141811682 
  4. [arxiv] Bulathwela, S., Pérez-Ortiz, M., Holloway, C., & Shawe-Taylor, J. (2021). Could ai democratise education? socio-technical imaginaries of an edtech revolution. arXiv preprint arXiv:2112.02034.https://arxiv.org/abs/2112.02034 
  5.  X5Learn Platform: https://x5learn.org/ 
  6.  TrueLearn Codebase: https://github.com/sahanbull/TrueLearn 
  7.  TrueLearn Python library: https://truelearn.readthedocs.io 
  8.  X5Learn Demo Video: https://youtu.be/aXGL05kbzyg 
  9.  Longer lecture about the topic: https://youtu.be/E11YUWad7Lw 
  10.  Workshop presentation (AAAI’21): https://www.youtube.com/watch?v=gYtmL2XdxHg 
  11.  Workshop Presentation (AAAI’21): https://youtu.be/4v-fizLvHwA