Making sense of data is a central challenge in creating human-understandable descriptions of complex situations. When data refer to process executions, techniques exist that discover explicit descriptions in terms of formal models. Many research works frame the discovery task as a one-class supervised learning problem. Work on deviance mining has nonetheless highlighted the need to characterise behaviours that exhibit certain characteristics and forbid others (e.g., the slower or less frequent ones), which calls for a binary supervised learning task.

In this microproject we focus on the discovery of declarative process models, expressed through Linear Time Temporal Logic (LTL), as a binary supervised learning task, where the input log reports both positive and negative behaviours. We therefore investigate how valuable information can be extracted and formalised into an “optimal” model, according to user preferences (e.g., model generality or simplicity). By iteratively including further examples, the user can also refine the discovered models.

Output

Paper to be submitted to relevant journal

Machine learning tool

Artificial data set

Presentations

Project Partners:

  • Fondazione Bruno Kessler (FBK), Chiara Ghidini
  • Università di Bologna (UNIBO), Federico Chesani

 

Primary Contact: Chiara Ghidini, FBK

Main results of micro project:

The microproject has produced so far two main results:
– A two-step approach for the discovery of temporal-logic patterns as a binary supervised learning problem, that is, starting from a set of “positive” traces (execution traces whose behaviour we want to observe in the discovered patterns) and a set of “negative” traces (execution traces whose behaviour we do not want to observe in the discovered patterns). In the first step, sets of patterns (possible models) that accept all positive traces and reject as many negative traces as possible are discovered. In the second step, the model(s) that optimize a given criterion, for instance generality or simplicity, are selected among the discovered models.
– Two synthetic labelled (“positive” and “negative”) event log datasets used for the synthetic evaluation of the proposed approach.
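The two steps can be illustrated with a minimal sketch. It is not the project's implementation: the three pattern templates, the toy traces and the `discover` function are hypothetical, the candidate set is tiny, and "simplicity" is read here simply as "fewest patterns".

```python
from itertools import combinations

# Hypothetical Declare-style templates, evaluated on finite traces
# (a trace is a list of activity names).
def response(a, b):
    """Every occurrence of `a` is eventually followed by `b`."""
    return lambda trace: all(b in trace[i + 1:] for i, x in enumerate(trace) if x == a)

def precedence(a, b):
    """`b` occurs only after `a` has occurred."""
    return lambda trace: all(a in trace[:i] for i, x in enumerate(trace) if x == b)

def existence(a):
    """`a` occurs at least once."""
    return lambda trace: a in trace

def discover(positives, negatives, candidates):
    # Step 1: keep only the patterns satisfied by every positive trace and
    # record which negative traces each of them rejects.
    rejects = {
        name: frozenset(i for i, t in enumerate(negatives) if not check(t))
        for name, check in candidates.items()
        if all(check(t) for t in positives)
    }
    best = frozenset().union(*rejects.values())
    # Step 2 (simplicity criterion, assumed): the smallest set of patterns
    # that still rejects as many negative traces as possible.
    names = sorted(rejects)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            if frozenset().union(*(rejects[n] for n in combo)) == best:
                return list(combo), len(best)

POSITIVES = [["a", "b", "c"], ["a", "c", "b"]]
NEGATIVES = [["b", "a"], ["a"], ["c", "b"]]
CANDIDATES = {
    "response(a,b)": response("a", "b"),
    "response(b,c)": response("b", "c"),   # violated by a positive trace
    "precedence(a,b)": precedence("a", "b"),
    "existence(c)": existence("c"),
}
MODEL, REJECTED = discover(POSITIVES, NEGATIVES, CANDIDATES)
# MODEL == ['existence(c)', 'precedence(a,b)'], REJECTED == 3
```

The exhaustive search over subsets is only viable for a handful of candidates; a different selection criterion (e.g. generality) would replace the loop in step 2.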

Contribution to the objectives of HumaneAI-net WPs

The results of the microproject mainly contribute to WP1 (Human-in-the-Loop Machine Learning, Reasoning and Planning). On the one hand, the micro-project aims at leveraging machine learning techniques (sub-symbolic learning) to provide LTL patterns (symbolic representation) of a set of “positive” traces, while excluding the “negative” ones (T1.1). On the other hand, the micro-project is a first step towards including the human in the loop of the discovery of LTL patterns representing all and only the cases the human wants to represent (T1.3). The user could iteratively refine the discovered patterns so as to be sure to include all the cases she is interested in, while excluding all those she wants to exclude.

Tangible outputs

In this micro-project, we propose investigating human recollection of team meetings and how conversational AI could use this information to create better team cohesion in virtual settings.

Specifically, we would like to investigate how a person’s emotion, personality, relationship to fellow teammates, goal and position in the meeting influence how they remember the meeting. We want to use this information to create memory-aware conversational AI that could leverage such data to increase team cohesion in future meetings.

To achieve this goal, we plan first to record a multi-modal dataset of team meetings in a virtual setting. Second, to administer questionnaires to participants at different time intervals after a session. Third, to annotate the corpus. Fourth, to carry out an initial corpus analysis to inform the design of memory-aware conversational AI.

This micro-project will contribute to a longer-term effort in building a computational memory model for human-agent interaction.

Output

A corpus of repeated virtual team meetings (6 sessions, spaced 1 week apart)

manual annotations (people’s recollection of the team meeting etc.)

automatic annotations (e.g. eye-gaze, affect, body posture etc.)

A paper describing the corpus and insights gained on the design of memory-aware agents from initial analysis

Project Partners:

  • TU Delft, Catholijn Jonker
  • Eötvös Loránd University (ELTE), Andras Lorincz

 

Primary Contact: Catharine Oertel, TU Delft

Main results of micro project:

1) A corpus of repeated virtual team meetings (4 sessions, spaced 4 days apart).
2) Manual annotations (people's recollection of the team meeting etc.)
3) Automatic annotations (e.g. eye-gaze, affect, body posture etc.)
4) A preliminary paper describing the corpus and insights gained on the design of memory-aware agents from initial analysis

Contribution to the objectives of HumaneAI-net WPs

In this micro-project, we propose investigating human recollection of team meetings and how conversational AI could use this information to create better team cohesion in virtual settings.
Specifically, we would like to investigate how a person's emotion, personality, relationship to fellow teammates, goal and position in the meeting influence how they remember the meeting. We want to use this information to create memory-aware conversational AI that could leverage such data to increase team cohesion in future meetings.
To achieve this goal, we plan first to record a multi-modal dataset of team meetings in a virtual setting. Second, to administer questionnaires to participants at different time intervals after a session. Third, to annotate the corpus. Fourth, to carry out an initial corpus analysis to inform the design of memory-aware conversational AI.
This micro-project will contribute to a longer-term effort in building a computational memory model for human-agent interaction.

Tangible outputs

  • Dataset: MEMO – Catharine Oertel
  • Publication: MEMO dataset paper – Catharine Oertel
  • Program/code: Memo feature extraction code – Andras Lorincz

Transformers and self-attention (Vaswani et al., 2017) have become the dominant approach for natural language processing (NLP), with systems such as BERT (Devlin et al., 2019) and GPT-3 (Brown et al., 2020) rapidly displacing more established RNN and CNN structures with an architecture composed of stacked encoder-decoder modules using self-attention.

This micro-project will assess tools and data sets for experiments and provide an initial demonstration of the potential of transformers for multimodal perception and multimodal interaction. We explore research challenges, benchmark data sets and performance metrics for multimodal perception and modeling tasks such as (1) audio-visual narration of scenes, actions and activities, (2) audio-video recordings of lectures and TV programs, and (3) perception and evocation of engagement, attention, and emotion.

(full description and bibliography exceeds 200 words – available on request).

Presentations

Project Partners:

  • Institut national de recherche en sciences et technologies du numérique (INRIA), James Crowley
  • Eötvös Loránd University (ELTE), Andras Lorincz
  • Université Grenoble Alpes (UGA), Fabien Ringeval
  • Centre national de la recherche scientifique (CNRS), François Yvon
  • Institut “Jožef Stefan” (JSI), Marko Grobelnik

Primary Contact: James Crowley, INRIA

Main results of micro project:

This micro-project has explored the potential of transformers for multimodal perception and interaction to support Humane AI, providing:
1) A tutorial on the use of transformers for multimodal interaction.
2) A report on available tools for experiments.
3) A survey of data sets and research challenges for experiments.
These results have opened a new approach to building practical tools for interaction and collaboration between people and intelligent systems.

Contribution to the objectives of HumaneAI-net WPs

This microproject has promoted the use of transformers and self-attention for multimodal interaction by Humane AI Net researchers, by identifying relevant tools and benchmark data sets, by providing tutorials and training materials for education, and by identifying research challenges for multimodal perception and interaction with transformers.

Tangible outputs

The goal is to devise a data generation methodology that, given a data sample, can approximate the stochastic process that generated it. The methodology can be useful in many contexts where we need to share data while preserving user privacy.

The existing literature on data generation based on Bayesian neural networks and hidden Markov models is restricted to static and propositional data. We focus on time-evolving data and preference data.

We will study two main aspects: (1) a generator that produces realistic data with the same properties as the original data, and (2) how to inject drift into the data generation process in a controlled manner. The idea is to model the stochastic process through a dependency graph among random variables, so that drift can be modeled simply by changing the structure of the underlying graph through a morphing process.
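As a rough illustration of drift through graph morphing, the sketch below samples records from a linear-Gaussian model over a dependency graph and gradually interpolates the edge weights from one graph to another. Everything in it is an assumption made for illustration: the linear-Gaussian form, the linear interpolation used as "morphing", and the names `sample`, `morph` and `generate`. It also assumes that both graphs share one topological order, so that every intermediate graph stays acyclic, and that `drift_end` is greater than `drift_start`.

```python
import random

def sample(weights, order, rng):
    """Draw one record; `weights` maps (parent, child) to a coefficient
    and `order` lists the variables in topological order."""
    values = {}
    for node in order:
        mean = sum(w * values[p] for (p, c), w in weights.items() if c == node)
        values[node] = mean + rng.gauss(0.0, 1.0)  # unit-variance noise
    return values

def morph(src, dst, alpha):
    """Interpolate edge weights; an edge absent from a graph has weight 0 there."""
    edges = set(src) | set(dst)
    return {e: (1 - alpha) * src.get(e, 0.0) + alpha * dst.get(e, 0.0) for e in edges}

def generate(src, dst, order, n, drift_start, drift_end, seed=0):
    """Emit n records; the graph morphs from src to dst between the two time steps."""
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        alpha = min(1.0, max(0.0, (t - drift_start) / (drift_end - drift_start)))
        stream.append(sample(morph(src, dst, alpha), order, rng))
    return stream

# Before the drift x influences y directly; afterwards only through z.
SRC = {("x", "y"): 2.0}
DST = {("x", "z"): 1.0, ("z", "y"): 2.0}
ORDER = ["x", "z", "y"]
STREAM = generate(SRC, DST, ORDER, n=300, drift_start=100, drift_end=200)
```

Moving `drift_start` and `drift_end` closer together gives a more abrupt drift; spreading them apart gives a gradual one.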

Output

1 Conference/Journal Paper

1 Prototype

Dataset Samples

Project Partners:

  • INESC TEC, Joao Gama
  • Universiteit Leiden (ULEI), Holger Hoos
  • Consiglio Nazionale delle Ricerche (CNR), Giuseppe Manco

 

Primary Contact: Joao Gama, INESC TEC, University of Porto

Project Description (150 words)

Methods for injecting constraints in Machine Learning (ML) can help bridge the gap between symbolic and subsymbolic models, and address fairness and safety issues in data-driven AI systems. The recently proposed Moving Targets approach achieves this via a decomposition, where a classical ML model deals with the data and a separate constraint solver with the constraints.

Different applications call for different constraints, solvers, and ML models: this flexibility is a strength of the approach, but it also makes the approach difficult to set up and analyze.

Therefore, this project will rely on the AI Domain Definition Language (AIDDL) framework to obtain a flexible implementation of the approach, making it simpler to use and allowing the exploration of more case studies, different constraint solvers, and algorithmic variants. We will use this implementation to investigate various new constraint types integrated with the Moving Targets approach (e.g. SMT, MINLP, CP).

Output

Stand-alone moving targets system distributed via the AI4EU platform

Interactive tutorial to be available on the AI4EU platform

Scientific paper discussing the outcome of our evaluation and the resulting system

Presentations

Project Partners:

  • Örebro University (ORU), Uwe Köckemann
  • Università di Bologna (UNIBO), Michele Lombardi

 

Primary Contact: Uwe Köckemann, Örebro University

Main results of micro project:

The moving targets method integrates machine learning and constraint optimization to enforce constraints on a machine learning model. The AI Domain Definition Language (AIDDL) provides a modeling language and framework for integrative AI.

We have implemented the moving targets algorithm in the AIDDL framework for integrative AI. This has benefits for modeling, experimentation, and usability. On the modeling side, this enables us to provide applications of “moving targets” as regular machine learning problems extended with constraints and a loss function. On the experimentation side, we can now easily switch the learning and constraint solvers used by the “moving targets” algorithm, and we have added support for multiple constraint types. Finally, we made the “moving targets” method easier to use, since it can now be controlled through a small model written in the AIDDL language.
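The decomposition into a learner and a constraint solver can be sketched in a few lines of plain Python. This is a simplified illustration and neither the AIDDL implementation nor the exact published formulation: the learner is a one-dimensional least-squares line, the "solver" is a closed-form projection onto the constraint "targets must be non-negative", and the `weight` parameter and all function names are hypothetical. With a squared loss and a convex constraint set, projecting the weighted blend of labels and predictions is the exact minimiser of the combined distance to both.

```python
def fit_line(xs, ys):
    """Learner: ordinary least squares for y = w * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, my - w * mx

def project_nonnegative(targets):
    """Constraint 'solver' (assumed): closest non-negative targets."""
    return [max(0.0, t) for t in targets]

def moving_targets(xs, ys, project, weight=1.0, iterations=10):
    w, b = fit_line(xs, ys)  # unconstrained pre-training
    for _ in range(iterations):
        preds = [w * x + b for x in xs]
        # Master step: feasible targets close to both labels and predictions.
        blend = [(y + weight * p) / (1.0 + weight) for y, p in zip(ys, preds)]
        targets = project(blend)
        # Learner step: refit the model on the adjusted targets.
        w, b = fit_line(xs, targets)
    return w, b

XS = [0.0, 1.0, 2.0, 3.0]
YS = [-2.0, 0.0, 2.0, 4.0]
PLAIN = fit_line(XS, YS)                                   # prediction at x = 0 is -2
CONSTRAINED = moving_targets(XS, YS, project_nonnegative)  # prediction at x = 0 is about -0.35
```

In this toy run the violation at x = 0 shrinks but does not vanish, because the targets are also kept close to the original labels and the model is restricted to a straight line.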

Our tangible outcomes are listed below.

Contribution to the objectives of HumaneAI-net WPs

T1.1 (Linking Symbolic and Subsymbolic Learning)

Moving targets provides a convenient approach to enforce constraint satisfaction in subsymbolic ML methods, within the limits of model bias. Our AIDDL integration pulls this idea all the way to the modeling level where, e.g., a fairness constraint can be added with a single line.

T1.4 (Compositionality and Auto ML)

The moving targets method, combined with an easy way of modeling constraints via AIDDL, may increase trust in fully automated machine learning pipelines.

T2.6 (Dealing with Lack of Training Data)

Training data may be biased in a variety of ways depending on how it was collected. We provide a convenient way to experiment with constraining such data sets and possibly overcome unwanted bias due to lack of data.

Tangible outputs

This project aims at investigating the construction of humor models to enrich conversational agents through the help of interactive reinforcement learning approaches.

Our methodology consists in deploying an online platform where passersby can play a game of matching sentences with humorous comebacks against an agent.

The data collected from these interactions will help to gradually build the humor models of the agent following state of the art Interactive Reinforcement Learning techniques.

We plan to work on this project for 4 months, resulting in an implementation of the platform, a first model for humor-enabled conversational agent and a publication of the obtained results and evaluations.

Output

Online game for collecting humorous interaction data

Humor models for conversational agents

Paper in an international conference or journal related to AI and AI in games

Project Partners:

  • Centre national de la recherche scientifique (CNRS), Brian Ravenet
  • Instituto Superior Técnico (IST), Rui Prada

 

Primary Contact: Brian Ravenet, LISN-CNRS (ex LIMSI-CNRS)

Main results of micro project:

The main result of this project will be the creation of an intelligent agent capable of playing a game, Cards Against Humanity, that involves matching sentences with humorous comebacks. The game requires players to combine black and white cards to form the funniest joke possible. Therefore, the AI agent under development must be able to make funny jokes. Ultimately, this opens perspectives for the development of humor models in conversational AIs, humor being a key social competence in our daily human interactions.

Contribution to the objectives of HumaneAI-net WPs

The micro-project produced for HumaneAI-net a dataset of annotated associations between black and white cards following the game design of Cards Against Humanity. By doing so, the micro-project led to the creation of a unique dataset of humorous associations between concepts, annotated in terms of different humor styles by the participants of the experiment. The preliminary analysis of how the dataset can be leveraged to build different humor models for conversational agents is particularly relevant for tasks T3.3 and T3.4 of WP3. Additionally, the micro-project aims at exploring how to refine the humor models through an interactive learning approach, which is particularly relevant for task T1.3 of WP1.

Tangible outputs

  • Dataset: Dataset – 1712 jokes, rated on a scale of 1 to 9 in terms of joke level, originality, positivity, entertainment, whether it makes sense and whether it is family-friendly
    – Rui Prada
  • Program/code: Online Game – A game of matching sentences with humorous comebacks against an agent (similar to the game Cards Against Humanity)
    – Ines Batina

Algebraic Machine Learning (AML) offers new opportunities in terms of transparency and control. However, it comes with many challenges regarding software and hardware implementations. To understand the hardware needs of this new method, it is essential to analyze the algorithm and its computational complexity. With this understanding, the final goal of this microproject is to investigate the feasibility of various hardware options for AML, particularly in-memory processing hardware acceleration.

Output

Simulation model for a PIM architecture using AML

Report

Presentations

Project Partners:

  • Algebraic AI S.L., Fernando Martin Maroto
  • Technische Universität Kaiserslautern (TUK), Christian Weis
  • German Research Centre for Artificial Intelligence (DFKI), Matthias Tschöpe

 

Primary Contact: Fernando Martin Maroto, Algebraic AI

Main results of micro project:

We have carried out a theoretical study of the efficiency of the AML sparse crossing algorithm and identified in-memory processing, and FPGA combined with in-memory processing, as the two feasible options for Algebraic Machine Learning. We are currently working on a prototype implementation that involves FPGA and in-memory processing of bit arrays in commercial Upmem RAM memories.

Contribution to the objectives of HumaneAI-net WPs

This work is critical to speed up the calculation of Algebraic Machine Learning models and in so doing contribute to:
1- Bidirectional human-machine communication using formal expressions
2- Possibility to set goals and establish limits via formal constraints
3- Reduced dependency on statistics, which can help overcome bias
4- Transparency by design
5- Possibility for decentralized, cooperative distributed machine learning

Tangible outputs

  • Program/code: AML engine prototype using bitarrays – Fernando Martin Maroto
    www.algebraic.ai

This micro-project will study the adaptation of automatic speech recognition (ASR) systems for impaired speech. Specifically, it will focus on improving ASR systems for speech from subjects with dysarthria and/or stuttering of various degrees. The work will be developed using the German “Lautarchive” data, comprising only 130 hours of untranscribed German doctor-patient conversations, and/or the English TORGO dataset. Applying human-in-the-loop methods, we will spot individual errors and regions of low certainty in ASR output in order to apply human-originated improvement and clarification in AI decision processes.
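One simple way to picture the "regions of low certainty" step is to group consecutive low-confidence words of a hypothesis into spans that a human can then review. The sketch below assumes the recognizer exposes a confidence score per word; the input format, the 0.6 threshold and the function name are hypothetical and not taken from the project.

```python
def low_confidence_regions(words, threshold=0.6):
    """Return (start, end) index pairs, end exclusive, of maximal runs of
    words whose confidence is below the threshold."""
    regions, start = [], None
    for i, (_, confidence) in enumerate(words):
        if confidence < threshold:
            if start is None:
                start = i            # a new uncertain run begins here
        elif start is not None:
            regions.append((start, i))
            start = None
    if start is not None:            # run extends to the end of the utterance
        regions.append((start, len(words)))
    return regions

# Hypothetical recognizer output: (word, confidence) pairs.
HYPOTHESIS = [("the", 0.97), ("patient", 0.91), ("reports", 0.42),
              ("pain", 0.35), ("since", 0.88), ("monday", 0.30)]
REGIONS = low_confidence_regions(HYPOTHESIS)
# REGIONS == [(2, 4), (5, 6)]
```

Each span can then be shown to an annotator together with the corresponding audio segment.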

Output

Paper for ICASSP 2021 and/or Interspeech 2022

Presentations

Project Partners:

  • Brno U, Mireia Diez
  • Technische Universität Berlin (TUB), Tim Polzehl

 

Primary Contact: Mireia Diez Sanchez, Brno University of Technology

Main results of micro project:

The project has run for less than 50% of its allocated time.

Contribution to the objectives of HumaneAI-net WPs

WP1 Learning, Reasoning and Planning with Human in the Loop
T1.1 Linking symbolic and subsymbolic learning

WP3 Human AI Interaction and Collaboration
T3.1 Foundations of Human-AI interaction and Collaboration
T3.6 Language-based and multilingual interaction
T3.7 Conversational, Collaborative AI

WP6 Applied research with industrial and societal use cases
T6.3 Software platforms and frameworks
T6.5 Health related research agenda and industrial use cases

Tangible outputs

  • Publication: –
  • Other: Internal report – Mireia Diez Sanchez, mireia@fit.vutbr.cz

Knowledge discovery offers numerous challenges and opportunities. In the last decade, a significant number of applications have emerged that rely on evidence from the scientific literature. AI methods offer innovative ways of applying knowledge discovery to the scientific literature, facilitating automated reasoning, discovery and decision making on data.

This micro-project will focus on the task of question answering (QA) for the biomedical domain. Our starting point is a neural QA engine developed by ILSP that addresses experts’ natural language questions by jointly applying document retrieval and snippet extraction on a large collection of PUBMED articles, thus supporting medical experts in their work. DFKI will augment this system with a knowledge graph integrating the output of document analysis and segmentation modules. The knowledge graph will be incorporated in the QA system and used for exact answers and more efficient Human-AI interactions. We will primarily focus on scientific articles on Covid-19 and SARS-CoV-2.

Output

Paper(s) in a conference or/and journal

Demonstrator

Presentations

Project Partners:

  • ATHINA, Haris Papageorgiou
  • German Research Centre for Artificial Intelligence (DFKI), Georg Rehm

 

Primary Contact: Haris Papageorgiou, ATHENA RC/Institute for Language & Speech Processing

This project continues the collaboration between FBK and TUW about defeasible knowledge in description logics in the Contextualized Knowledge Repository (CKR) framework.

In applications, knowledge can hold by default and be overridden in more specific contexts. For example, in a tourism event recommendation system, events can appear as suggested to a class of tourists in a general context: in the more specific context of a particular tourist, preferences can be refined to more precise interests, which may override those at higher contexts.
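The overriding behaviour in the tourism example can be pictured with a small lookup over a context hierarchy, where the most specific context that says anything about an event wins. This is only an intuition aid: it does not reproduce the description-logic semantics of CKR or its ASP realization, and the context names, events and the `is_suggested` function are hypothetical.

```python
# Each context points to its more general parent context.
PARENT = {"anna": "sport_tourists", "sport_tourists": "world", "world": None}

# Suggestions asserted locally in each context (defaults live in "world").
SUGGESTED = {
    "world": {"museum_tour": True, "football_match": False},
    "sport_tourists": {"football_match": True},                # overrides the default
    "anna": {"football_match": False, "tennis_match": True},   # overrides again
}

def is_suggested(context, event):
    """Walk from the given context towards the root; the first (most
    specific) context with a statement about the event decides."""
    while context is not None:
        local = SUGGESTED.get(context, {})
        if event in local:
            return local[event]
        context = PARENT[context]
    return False  # nothing known anywhere: not suggested

FOR_SPORT_TOURISTS = is_suggested("sport_tourists", "football_match")  # True
FOR_ANNA = is_suggested("anna", "football_match")                      # False
```

A plain lookup like this only handles a single chain of contexts; the project targets richer hierarchies with several relations, which is where preferences between models come in.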

The goal of this project is to enhance the answer set programming (ASP) based realization of CKR to deal with complex context hierarchies: we use an ASP extension recently proposed by TUW, ASP with algebraic measures, which allows for reasoning on orderings induced by the organization of defeasible knowledge. This collaboration will provide a prototype for reasoning over CKR hierarchies, but also an application for ASP with algebraic measures.

Output

Prototype implementation: realization of reasoning service for query answering over CKR with contextual hierarchies. The prototype will be made available in AI4EU platform.

Report on formalization: technical report and paper submission defining the formal aspects of model selection for contextual hierarchies via ASP with Algebraic Measures, with some initial evaluations in the prototype.

Presentations

Project Partners:

  • Fondazione Bruno Kessler (FBK), Loris Bozzato
  • TU Wien, Thomas Eiter

 

Primary Contact: Loris Bozzato, FBK

Main results of micro project:

The goal of this project is to reason on complex contextualized knowledge bases using an answer set programming extension with algebraic measures and show the capabilities of this formalism.

The main formal contributions of this project are:
– an extension of the CKR contextual framework to reason about defeasible information over multi-relational contextual hierarchies.
– an ASP-based modelling of multi-relational CKRs, where the combination of model preferences is realized via algebraic measure expressions.
– an asprin-based implementation of query answering in a fragment of multi-relational CKRs, extending the existing CKR datalog translation.
– a study of further capabilities of algebraic measures, showing the possibilities for reasoning on model aggregation.

Contribution to the objectives of HumaneAI-net WPs

The results of the MP are relevant for AI as they show a combination of non-monotonic contextualized DLs in the CKR framework and Logic Programming with numerical measures in weighted LARS.

With respect to the HumaneAI vision, the resulting framework provides a tool for representing, e.g., complex social structures and the contextualization of information relative to such social organizations. With respect to the WP1 objectives, the work combines different AI areas, and follows the direction of joining symbolic and numeric knowledge representation and reasoning methods with notions of uncertainty.

Tangible outputs

HumanE-AI research needs data to advance. Researchers often struggle to make progress for lack of data. At the same time, collecting a rich and accurate dataset is no easy task. Therefore, we propose to share through the AI4EU platform the datasets already collected by different research groups. The datasets will be curated to be ready to use for researchers.

Possible extensions and variations of such datasets will also be generated using artificial techniques and published on the platform.

A performance baseline will be provided for each dataset, in form of publication reference, developed model or written documentation.

The relevant legal framework will be investigated with specific attention to privacy and data protection, so as to highlight limitations and challenges for the use and extension of existing datasets, as well as for future collection of multimodal data for perception modelling.

Output

Publication of OPPORTUNITY dataset (and other datasets if time available) on the AI4EU platform. [lead: UoS, contributor: DFKI]

Publication of baseline performance pipeline for OPPORTUNITY dataset (and other datasets if time available) on AI4EU platform. [lead: UoS, contributor: DFKI]

Investigation of data loader and pipeline integration on AI4EU experiment to load HAR datasets and pre-existent pipelines, with a focus on the OPPORTUNITY dataset (and other datasets if time available) [lead: UoS, contributor: DFKI]

Generation of variation [lead: DFKI]

Survey publications describing datasets and performance baseline [lead: DFKI, contributor: UoS]

Presentations

Project Partners:

  • University of Sussex (UOS), Mathias Ciliberto
  • German Research Centre for Artificial Intelligence (DFKI), Vitor Fortes Rey
  • Vrije Universiteit Brussel (VUB), Arno de Bois

 

Primary Contact: Mathias Ciliberto, University of Sussex

Main results of micro project:

Collection, curation and publication of 4 datasets for Multi Modal Perception and Modeling (WP2):
– OPPORTUNITY++:
    – Activities of daily living
    – Sensor rich
    – New additional anonymised, annotated video with OpenPose tracks
– Capacitive Gym:
    – 7 popular gym workouts
    – 11 subjects, each with 5 separate days
    – Capacitive sensors in 3 positions
    – New dataset
– HCI FreeHand dataset:
    – Freehand synthetic gestures
    – Multiple 3D accelerometers
– SkodaMini dataset:
    – Car manufacturing gestures
    – Multiple 3D accelerometers and gyroscopes
– Beach volleyball (https://ieee-dataport.org/open-access/wearlab-beach-volleyball-serves-and-games)

Contribution to the objectives of HumaneAI-net WPs

Multi-modal perception and modeling needs data to progress, but recording a new rich and accurate dataset that allows for comparative evaluations by the scientific community is no easy task. Therefore, we gathered rich datasets for multimodal perception and modelling of human activities and gestures. We curated the datasets to make them easy to use for research, thanks to clear documentation and file formats.
The highlight of this microproject is the OPPORTUNITY++ dataset of activities of daily living, a multi-modal extension of the well-established OPPORTUNITY dataset. We enhanced this dataset, which contains wearable sensor data, with previously unreleased data, including video and motion tracking data, which make OPPORTUNITY++ a truly multi-modal dataset with wider appeal, for example to the computer vision community.
In addition, we released other well-established activity datasets (HCI FreeHand and SkodaMini), as well as datasets involving novel sensor modalities (CapacitiveGym) and a skill-assessment dataset (Wearlab BeachVolleyball).

Tangible outputs

ML models are now used in decision-making processes for real-world problems, where they learn a function that maps observed features to decision outcomes. However, these models usually do not convey causal information about the associations in observational data. As a result, they are not easily understandable for the average user, who can neither retrace the model's steps nor rely on its reasoning. It is therefore natural to investigate more explainable methodologies, such as causal discovery approaches, since they apply processes that mimic human reasoning. We propose using such methodologies to create more explainable models that replicate human thinking and are easier for the average user to understand. More specifically, we suggest applying them to methods such as decision trees and random forests, which are by themselves highly explainable correlation-based methods.

Output

1 Conference Paper

1 Prototype

Dataset Repository

Project Partners:

  • INESC TEC, Joao Gama
  • Università di Pisa (UNIPI), Dino Pedreschi
  • Consiglio Nazionale delle Ricerche (CNR), Fosca Giannotti

 

Primary Contact: Joao Gama, INESC TEC, University of Porto

Main results of micro project:

1) Journal paper submitted to WIREs Data Mining and Knowledge Discovery:
Methods and Tools for Causal Discovery and Causal Inference
Ana Rita Nogueira, Andrea Pugnana, Salvatore Ruggieri, Dino Pedreschi, João Gama
(under evaluation)

2) Github repository of datasets, software, and papers related to causal discovery and causal inference research

https://github.com/AnaRitaNogueira/Methods-and-Tools-for-Causal-Discovery-and-Causal-Inference

Contribution to the objectives of HumaneAI-net WPs

The HumanE-AI project envisions a society of increasing interactions between humans and artificial agents. Throughout the project, causal models are relevant for plausible models of human behavior, man-machine explanations, and upgrading machine-learning algorithms with causal-inference mechanisms.

The output of the micro-project presents an in-depth study of causal discovery and causal inference. Moreover, the GitHub repository of datasets, papers, and code will be an excellent source of resources for those who want to study the topic.

Tangible outputs