Contact person: Hamraz Javaheri (hamraz.javaheri@dfki.de)

Internal Partners:

  1. DFKI  

External Partners:

  1. Hospital Saarbrücken “Der Winterberg”  

 

In this project, we successfully implemented and clinically evaluated an AR assistance system for pancreatic surgery, enhancing surgical navigation and improving perioperative outcomes. However, the system’s reliance on preoperative data posed challenges, particularly due to anatomical deformations occurring in the later stages of surgery. In future research, we aim to address this by integrating real-time data sources to further improve the system’s accuracy and adaptability during surgery.

Results Summary

Throughout our project, we developed and clinically evaluated ARAS, an augmented reality (AR) assistance system designed for pancreatic surgery. The system was clinically evaluated by field surgeons during pancreatic tumor resections involving 20 patients. In a matched-pair analysis with 60 patients who underwent surgery without ARAS, the ARAS group demonstrated a significantly shorter operation time than the control group. Although not statistically significant, the ARAS group also exhibited clinically noticeable lower rates of excessive intraoperative bleeding and a reduced need for intraoperative red blood cell (RBC) transfusions. Furthermore, ARAS enabled more precise tumor resections with tumor-free margins, and patients in this group had better postoperative outcomes, including significantly shorter hospital stays. In this project, we published two journal papers (one accepted and to appear soon), one conference paper, and one demo paper (Best Demo Paper Award); two further conference papers are currently under submission. The project also attracted international and local news and media attention, including coverage by the Deutsche Welle news channel (example links provided below).

Tangible Outcomes

  1. Beyond the visible: preliminary evaluation of the first wearable augmented reality assistance system for pancreatic surgery, International Journal of Computer Assisted Radiology and Surgery (https://doi.org/10.1007/s11548-024-03131-0)
  2. Enhancing Perioperative Outcomes of Pancreatic Surgery with Wearable Augmented Reality Assistance System: A Matched-Pair Analysis, Annals of Surgery Open (https://doi.org/10.1097/AS9.0000000000000516)
  3. Design and Clinical Evaluation of ARAS: An Augmented Reality Assistance System for Pancreatic Surgery, IEEE ISMAR 2024 (https://www.researchgate.net/publication/385116946_Design_and_Clinical_Evaluation_of_ARAS_An_Augmented_Reality_Assistance_System_for_Open_Pancreatic_Surgery_Omid_Ghamarnejad)
  4. ARAS: LLM-Supported Augmented Reality Assistance System for Pancreatic Surgery, ISWC/UbiComp 2024 (https://doi.org/10.1145/3675094.3677543)
  5. Media coverage for the project:
    1. https://www.dw.com/en/artificial-intelligence-saving-lives-in-the-operating-room/video-68125878
    2. https://www.dw.com/de/k%C3%BCnstliche-intelligenz-im-op-saal-rettet-leben/video-68125903
    3. https://www.saarbruecker-zeitung.de/saarland/saarbruecken/saarbruecken/saarbruecken-winterberg-klinik-international-im-tv-zu-sehen_aid-106311259
    4. https://www.saarbruecker-zeitung.de/saarland/saarbruecken-mittels-ki-erfolgreiche-operation-an-82-jaehriger-v29_aid-104053203
    5. https://m.focus.de/gesundheit/gesundleben/da-gibt-es-keinen-raum-fuer-fehler-kuenstliche-intelligenz-im-op-saal-rettet-leben_id_259629806.html 

Contact person: Dilhan Thilakarathne (dilhan.thilakarathne@ing.com)

Internal Partners:

  1. ING Groep NV, Dilhan Thilakarathne
  2. Umeå University (UMU), Andrea Aler Tubella  

 

After choosing a formal definition of fairness (we limit ourselves to definitions based on group fairness through equal resources or equal opportunities), one can attain fairness on the basis of this definition in two ways: directly incorporating the chosen definition into the algorithm through in-processing (as an additional constraint besides the usual error minimization, or via adversarial learning), or introducing an additional layer to the pipeline through post-processing (treating the model as a black box and using its inputs and predictions to alter the decision boundary so that it approximates the ideal fair outcomes, e.g. using a Glass-Box methodology).
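To make the two intervention points concrete, the sketch below contrasts them on a toy classifier: an in-processing variant that adds a demographic-parity penalty to the training loss, and a post-processing variant that treats the trained scores as a black box and picks group-specific decision thresholds. This is a minimal illustration under invented data and hyperparameters, not the implementation used in this work.

```python
# Minimal sketch (illustrative, not this project's implementation):
# in-processing vs. post-processing fairness interventions on a toy
# logistic model. Group fairness notion: demographic parity.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x = features, a = protected group (0/1), y = binary label.
n = 2000
a = rng.integers(0, 2, n)
x = rng.normal(0, 1, (n, 3)) + a[:, None] * 0.5
y = (x @ np.array([1.0, -0.5, 0.3]) + 0.8 * a + rng.normal(0, 1, n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# --- In-processing: cross-entropy plus a squared demographic-parity penalty ---
w = np.zeros(3)
lam, lr = 5.0, 0.1
for _ in range(500):
    p = sigmoid(x @ w)
    grad_ce = x.T @ (p - y) / n
    gap = p[a == 1].mean() - p[a == 0].mean()   # parity gap in mean scores
    dgap = (x[a == 1] * (p[a == 1] * (1 - p[a == 1]))[:, None]).mean(0) \
         - (x[a == 0] * (p[a == 0] * (1 - p[a == 0]))[:, None]).mean(0)
    w -= lr * (grad_ce + lam * 2 * gap * dgap)  # gradient of loss + lam * gap^2

# --- Post-processing: black-box scores, group-specific thresholds ---
scores = sigmoid(x @ w)                          # any black-box score would do
t0 = np.quantile(scores[a == 0], 0.5)
t1 = np.quantile(scores[a == 1], 0.5)            # equalize acceptance rates
decisions = np.where(a == 0, scores > t0, scores > t1)
```

The qualitative difference studied here is visible even in this toy setting: the in-processing penalty changes which model is learned, while the post-processing thresholds change only who crosses the decision boundary, so different individuals are affected by each intervention.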

We aim to compare both approaches, providing guidance on how best to incorporate fairness definitions into the design pipeline, focusing on the following research questions: Is there any qualitative difference between fairness acquired through in-processing and fairness attained by post-processing? What are the advantages of each method (e.g. performance, amenability to different fairness definitions)?

Results Summary

The work focuses on the choice between in-processing and post-processing, showing that this choice is not value-free: it has serious implications for who will be affected by a fairness intervention. The work suggests how translating technical engineering questions into ethical decisions can concretely contribute to the design of fair models and to the societal discussion around them.

The experimental study provides evidence that is robust with respect to different implementations, discussed for the case of a credit risk application. At the same time, assessing the impact of the resulting classification can have implications for the specific context of the original problem. We summarize these results in a paper addressing the difference between in-processing and post-processing methods for ML models, focusing on fairness vs. performance trade-offs.

Tangible Outcomes

  1. Ethical implications of fairness interventions: what might be hidden behind engineering choices? – Andrea Aler Tubella, Flavia Barsotti, Ruya Gokhan Kocer, Julian Alfredo Mendez. https://doi.org/10.1007/s10676-022-09636-z
  2. Video presentation summarizing the project

 

Contact person: Florian Müller, LMU (florian.mueller@tu-darmstadt.de)

Internal Partners:

  1. Ludwig-Maximilians-Universität München (LMU), Florian Müller  

External Partners:

  1. Ecole Nationale de l’Aviation Civile (ENAC), Anke Brock  

 

Pilots frequently need to react to unforeseen in-flight events. Taking adequate decisions in such situations requires considering all available information and demands strong situational awareness. Modern on-board computers and technologies like GPS have radically improved pilots’ ability to take appropriate actions and lowered their workload in recent years. Yet, current technologies used in aviation cockpits generally still fail to adequately map and represent 3D airspace.

In response, we aim to create an AI aviation assistant that considers all relevant aircraft operation data, focuses on providing tangible action recommendations, and visualizes them for efficient and effective interpretation in 3D space. In particular, we note that extended reality (XR) applications provide an opportunity to augment pilots’ perception through live 3D visualizations of key flight information, including airspace structure, traffic information, airport highlighting, and traffic patterns. While XR applications have been tested in aviation in the past, they are mostly limited to military aviation and the latest commercial aircraft. This ignores the majority of pilots, in general aviation in particular, where such support could drastically increase situational awareness and lower pilot workload. General aviation is the non-commercial branch of aviation, often involving single-engine and single-pilot operations.

To develop applications usable across aviation domains, we planned to create a Unity project for XR glasses. Based on this, we planned to first explore, systematically and iteratively, suitable AI-based support informed by pilot feedback in a virtual reality study in a flight simulator. Based on our findings, we would refine the Unity application and investigate opportunities to conduct a real test flight with our external partner ENAC, the French National School of Civil Aviation, who own a plane. Such a test flight would most likely use the latest augmented reality headsets, such as the HoloLens 2. Considering the immense safety requirements for such a real test flight, this part of the project is considered optional at this stage and depends on the findings from the preceding virtual reality evaluation.

The system development particularly focuses on the use of XR techniques to create more effective AI-supported traffic advisories and visualizations. With this, we want to advance the coordination and collaboration of AI with human partners, establishing a common ground as a basis for multimodal interaction with AI (WP3 motivated). Further, the MP relates closely to “Innovation projects (WP6 and WP7 motivated)”, calling for solutions that address “real-world challenges and opportunities in various domains such as […] transportation […]”.

Results Summary

We explored AI and mixed reality for pilot support. One of the results is an early mixed reality prototype for a popular consumer-grade flight simulator that allows pilots to intuitively perceive 3D information that current 2D tools cannot present satisfactorily. Based on this mockup, we conducted a very early exploration of AI support strategies that would allow, for example, converting air traffic control instructions into flight path renderings.
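As a purely illustrative sketch of that last idea (not the project’s code; the instruction grammar, coordinates, and waypoint spacing are all invented for the example), converting a simplified, already-transcribed ATC instruction into renderable 3D waypoints could look like this:

```python
# Illustrative sketch: parse a simplified, already-transcribed ATC
# instruction into structured commands, then interpolate 3D waypoints
# that a renderer (e.g. a Unity scene) could draw as a flight path.
import math
import re

INSTRUCTION = "turn left heading 270, descend and maintain 3000 feet"

def parse_atc(text):
    """Extract heading and altitude commands from a plain-text instruction."""
    commands = {}
    heading = re.search(r"heading (\d{3})", text)
    altitude = re.search(r"(\d{3,5}) feet", text)
    if heading:
        commands["heading_deg"] = int(heading.group(1))
    if altitude:
        commands["altitude_ft"] = int(altitude.group(1))
    return commands

def to_waypoints(cmd, lat=48.35, lon=11.78, alt_ft=5000.0, steps=5):
    """Interpolate a few 3D waypoints toward the commanded heading/altitude."""
    hdg = math.radians(cmd.get("heading_deg", 0))
    target_alt = cmd.get("altitude_ft", alt_ft)
    waypoints = []
    for i in range(1, steps + 1):
        frac = i / steps
        # crude flat-earth step of ~0.01 degrees per waypoint along the heading
        waypoints.append((lat + 0.01 * i * math.cos(hdg),
                          lon + 0.01 * i * math.sin(hdg),
                          alt_ft + (target_alt - alt_ft) * frac))
    return waypoints

print(to_waypoints(parse_atc(INSTRUCTION)))
```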

Contact person: Mireia Diez Sanchez (mireia@fit.vutbr.cz)

Internal Partners:

  1. BUT, Brno University of Technology, Mireia Diez Sanchez, mireia@fit.vutbr.cz; cernocky@fit.vutbr.cz
  2. TUB, Technische Universität Berlin, Tim Polzehl, tim.polzehl@dfki.de; klaus.r.mueller@googlemail.com

 

In this microproject, “AI: Automatic Speech Recognition made accessible for people with dysarthria”, we pursued enabling access to AI technology for those who might have special needs when interacting with it. Dysarthria is a motor speech disorder resulting from neurological injury, characterized by poor articulation of phonemes. Within automatic speech recognition (ASR), dysarthric speech recognition is a challenging task due to the lack of supervised data and the diversity of speech disorders.

The project studied the adaptation of automatic speech recognition (ASR) systems to impaired speech. Specifically, the micro-project focused on improving ASR systems for speech from subjects with dysarthria and/or stuttering of various degrees. The work was developed using the German “Lautarchive” data, comprising only 130 hours of untranscribed doctor-patient German speech conversations, and the English TORGO dataset, applying human-in-the-loop methods: we spot individual errors and regions of low certainty in ASR output in order to apply human-originated improvement and clarification in AI decision processes.
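A minimal sketch of that human-in-the-loop step is given below (illustrative only: the threshold and confidence values are invented, and real word-level confidences would come from the ASR decoder):

```python
# Illustrative sketch (not the project's pipeline): flag low-certainty
# ASR regions, with a little surrounding context, for human review.
def flag_for_review(words, confidences, threshold=0.6, context=1):
    """Return text spans around words whose confidence falls below threshold."""
    spans = []
    for i, c in enumerate(confidences):
        if c < threshold:
            lo, hi = max(0, i - context), min(len(words), i + context + 1)
            if spans and lo <= spans[-1][1]:
                spans[-1] = (spans[-1][0], hi)   # merge overlapping spans
            else:
                spans.append((lo, hi))
    return [" ".join(words[lo:hi]) for lo, hi in spans]

hyp = "the patient reports severe pain in the upper abdomen".split()
conf = [0.95, 0.92, 0.88, 0.41, 0.90, 0.93, 0.91, 0.55, 0.89]
print(flag_for_review(hyp, conf))  # regions a human annotator would check
```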

Results Summary

In particular, in this work we studied the performance of different ASR systems on dysarthric speech: LF-MMI, Transformer, and wav2vec2 models. The analysis revealed the superiority of the wav2vec2 models on this task. We investigated the importance of speaker-dependent auxiliary features such as fMLLR and xvectors for adapting wav2vec2 models to improve dysarthric speech recognition. We showed that, in contrast to hybrid systems, wav2vec2 did not improve when model parameters were adapted per speaker.

We proposed a wav2vec2 adapter module that takes speaker features as auxiliary information to perform effective speaker normalization during finetuning. We showed that, when using the adapter module, fMLLR features and xvectors are complementary to each other, and demonstrated the effectiveness of the approach by outperforming the existing state of the art on UASpeech dysarthric speech ASR.
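The PyTorch sketch below shows the general shape of such a speaker adapter (a conceptual illustration, not the exact published architecture; dimensions, placement, and the bottleneck design are assumptions):

```python
# Conceptual sketch: speaker features (e.g. an xvector or fMLLR-derived
# vector) are projected into a bottleneck and injected into wav2vec2
# hidden states through a residual adapter during finetuning.
import torch
import torch.nn as nn

class SpeakerAdapter(nn.Module):
    def __init__(self, hidden_dim=768, spk_dim=512, bottleneck=128):
        super().__init__()
        self.spk_proj = nn.Linear(spk_dim, bottleneck)
        self.down = nn.Linear(hidden_dim, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_dim)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, hidden, spk_emb):
        # hidden: (batch, time, hidden_dim); spk_emb: (batch, spk_dim)
        spk = self.spk_proj(spk_emb).unsqueeze(1)   # (batch, 1, bottleneck)
        z = torch.relu(self.down(hidden) + spk)     # speaker-conditioned bottleneck
        return self.norm(hidden + self.up(z))       # residual connection

# Usage idea: insert after a wav2vec2 transformer block, freeze the
# backbone, and train only the adapter on the dysarthric speech data.
adapter = SpeakerAdapter()
out = adapter(torch.randn(2, 50, 768), torch.randn(2, 512))
```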

In our cross-lingual experiments, we also showed that combining English and German data for training can further improve the performance of our systems, which is useful in scenarios where only few training examples exist for a particular language.

 

Tangible Outcomes

  1. M. K. Baskar, T. Herzig, D. Nguyen, M. Diez, T. Polzehl, L. Burget, J. Černocký, “Speaker adaptation for Wav2vec2 based dysarthric ASR”. Proc. Interspeech 2022, 3403-3407, doi: 10.21437/Interspeech.2022-10896 https://www.isca-speech.org/archive/pdfs/interspeech_2022/baskar22b_interspeech.pdf
  2. Open-source tool for training ASR models for dysarthric speech. The repository contains a baseline recipe to train a TDNN-CNN hybrid ASR system, prepared for the TORGO dataset, and an end-to-end model using the ESPnet framework, prepared for the UASpeech dataset. https://github.com/creatorscan/Dysarthric-ASR

Contact person: John Shawe-Taylor (j.shawe-taylor@ucl.ac.uk)

Internal Partners:

  1. Knowledge 4 All Foundation, Davor Orlic, davor.orlic@gmail.com
  2. University College London, John Shawe-Taylor, j.shawe-taylor@ucl.ac.uk
  3. Institut Jožef Stefan, Davor Orlic and Marko Grobelnik, davor.orlic@gmail.com

 

K4A proposed a microproject to extend its existing prototype of the online learning platform X5LEARN (https://x5learn.org/), developed by K4A, UCL, and JSI together with JSI’s new IRCAI center under the auspices of UNESCO. It is a standalone, learner-facing web application designed to give access, through an innovative interface, to a portfolio of openly licensed educational resources (OER) in video and textual format. It is designed for lifelong learners looking for specific content and wanting to expand their knowledge, and our aim is to extend it to AI-related topics. The updated application will be released via IRCAI, a newly designated AI center, and integrated with AI4EU with heavy HumaneAI branding. The main reason to push the product with IRCAI is that UNESCO is positioning itself as the main UN agency to promote humanist artificial intelligence, drive a major international policy on the Ethics of AI, and champion OER, all of which is in line with HumaneAI.

Results Summary

Under this microproject, a series of extensions to the X5Learn platform was added, and a new user-friendly interface was developed and deployed. As X5Learn is an intelligent learning platform, a series of human-centric AI technologies, backed by scientific research, were developed to enable educational recommendation, intelligent previewing of information, and scalable question generation that can help different stakeholders such as teachers and learners. The results have been published in peer-reviewed conferences such as AAAI, AIED, and CHIIR, as well as in the journal Sustainability. The new learning platform is now available to the public, together with a Python library that implements the recommendation algorithms developed.
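As a conceptual illustration of the learner-modelling idea behind such recommendations (this is not TrueLearn’s actual API; see the library links under Tangible Outcomes for the real implementation; all names here are hypothetical):

```python
# Conceptual sketch: track per-topic learner mastery from engagement
# signals and rank educational resources by how well they match the
# learner's current knowledge state.
class LearnerModel:
    def __init__(self, lr=0.3):
        self.knowledge = {}   # topic -> estimated mastery in [0, 1]
        self.lr = lr

    def update(self, topics, engaged):
        """Nudge mastery estimates up on engagement, down otherwise."""
        for t in topics:
            k = self.knowledge.get(t, 0.5)
            self.knowledge[t] = k + self.lr * ((1.0 if engaged else 0.0) - k)

    def score(self, topics):
        """Prefer resources whose topics sit near the middle of the mastery scale."""
        ks = [self.knowledge.get(t, 0.5) for t in topics]
        return sum(4 * k * (1 - k) for k in ks) / len(ks)

learner = LearnerModel()
learner.update(["bayesian-inference"], engaged=True)
resources = {"video-1": ["bayesian-inference"], "video-2": ["unseen-topic"]}
ranked = sorted(resources, key=lambda r: learner.score(resources[r]), reverse=True)
print(ranked)   # resources ordered by predicted usefulness to this learner
```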

Tangible Outcomes

  1. Maria Pérez Ortiz, Sahan Bulathwela, Claire Dormann, Meghana Verma, Stefan Kreitmayer, Richard Noss, John Shawe-Taylor, Yvonne Rogers, and Emine Yilmaz. 2022. Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos? In Proceedings of the 2022 Conference on Human Information Interaction and Retrieval (CHIIR ’22). Association for Computing Machinery, New York, NY, USA, 90–101. https://doi.org/10.1145/3498366.3505814 
  2. Maria Perez-Ortiz, Claire Dormann, Yvonne Rogers, Sahan Bulathwela, Stefan Kreitmayer, Emine Yilmaz, Richard Noss, and John Shawe-Taylor. 2021. X5Learn: A Personalised Learning Companion at the Intersection of AI and HCI. In 26th International Conference on Intelligent User Interfaces – Companion (IUI ’21 Companion). Association for Computing Machinery, New York, NY, USA, 70–74. https://doi.org/10.1145/3397482.3450721 
  3. Sahan Bulathwela, María Pérez-Ortiz, Emine Yilmaz, and John Shawe-Taylor. 2022. Power to the Learner: Towards Human-Intuitive and Integrative Recommendations with Open Educational Resources. Sustainability 14, 18: 11682. https://doi.org/10.3390/su141811682 
  4. [arXiv] Bulathwela, S., Pérez-Ortiz, M., Holloway, C., & Shawe-Taylor, J. (2021). Could AI democratise education? Socio-technical imaginaries of an EdTech revolution. arXiv preprint arXiv:2112.02034. https://arxiv.org/abs/2112.02034 
  5.  X5Learn Platform: https://x5learn.org/ 
  6.  TrueLearn Codebase: https://github.com/sahanbull/TrueLearn 
  7.  TrueLearn Python library: https://truelearn.readthedocs.io 
  8.  X5Learn Demo Video: https://youtu.be/aXGL05kbzyg 
  9.  Longer lecture about the topic: https://youtu.be/E11YUWad7Lw 
  10.  Workshop presentation (AAAI’21): https://www.youtube.com/watch?v=gYtmL2XdxHg 
  11.  Workshop Presentation (AAAI’21): https://youtu.be/4v-fizLvHwA 

Contact person: Francesco Spinnato, Riccardo Guidotti (francesco.spinnato@sns.it)

Internal Partners:

  1. Generali Italia
  2. CNR Pisa
  3. Università di Pisa

 

For the insurance business, a connected car is a vehicle in which an embedded telematics device streams acceleration data, GPS position, and other physical parameters of the moving car. This live stream is used for automatic real-time detection of car crashes. The project focused on the development of an XAI layer that translates the logical outcome of the underlying LSTM used for crash detection into a human-readable format.
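A minimal sketch of this kind of pipeline is shown below (illustrative only, not the production system: the feature names, model size, and the gradient-saliency explanation are assumptions standing in for the actual XAI method):

```python
# Illustrative sketch: an LSTM crash detector over telematics windows,
# plus a simple gradient-saliency explanation rendered as operator-
# readable text.
import torch
import torch.nn as nn

FEATURES = ["accel_x", "accel_y", "accel_z", "speed"]   # assumed channels

class CrashLSTM(nn.Module):
    def __init__(self, n_feat=4, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_feat, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                                # x: (batch, time, n_feat)
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out[:, -1]))      # crash probability

def explain(model, window):
    """Report which time step and channel drove the crash score."""
    x = window.clone().requires_grad_(True)
    prob = model(x.unsqueeze(0)).squeeze()
    prob.backward()
    sal = x.grad.abs()                                   # (time, n_feat) saliency
    t = int(sal.sum(dim=1).argmax())                     # most influential sample
    f = int(sal.sum(dim=0).argmax())                     # most influential channel
    return (f"Crash probability {float(prob):.2f}: decision driven mainly by "
            f"'{FEATURES[f]}' around sample {t} of the window.")

model = CrashLSTM()
print(explain(model, torch.randn(100, 4)))               # one telematics window
```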

Results Summary

  • Industrial outcome: when the LSTM automatically labels a signal event from a car telematics box as a ‘crash’, an emergency live call from a contact center to the driver’s phone is triggered for health assessment and further help. If the driver is not responding or is out of reach, further actions can follow (e.g. a call to the emergency services). To improve the efficiency of this emergency procedure, it is vital for the contact center operator to reduce the number of false positive events (e.g. by being able to read the outcome of the box and discriminate a false positive event).
  • Societal outcome: improved efficiency in connected car crash detection (a reduction of false positives) can reduce the number of car crashes with fatal or severe injury outcomes and also improve road safety.

Tangible Outcomes

  1. Explaining Crash Predictions on Multivariate Time Series Data, Lecture Notes in Computer Science (LNAI, volume 13601). https://link.springer.com/chapter/10.1007/978-3-031-18840-4_39