Contact person: Florian Müller (florian.mueller@um.ifi.lmu.de)

Internal Partners:

  1. LMU Munich, Florian Müller, florian.mueller@um.ifi.lmu.de  

External Partners:

  1. University of Bari, Giuseppe Desolda, giuseppe.desolda@uniba.it 

 

Manufacturing tools like 3D printers have become accessible to the wider society, making the promise of digital fabrication for everyone seemingly reachable. While the actual manufacturing process is largely automated today, users still require knowledge of complex design applications not only to produce ready-designed objects, but also to adapt them to their needs or design new objects from scratch. To lower the barrier to designing and customizing personalized 3D models, we imagine an AI-powered system that assists users in creating 3D objects for digital fabrication. Reaching this vision requires a common understanding – a common ground – between the users and the AI system. As a first step, in this micro-project we explored novices’ mental models in voice-based 3D design by conducting a high-fidelity Wizard of Oz study with 22 participants with no 3D design skills. We asked the participants to perform 14 tasks revolving around basic concepts of 3D design for digital modeling, such as the creation of objects, the manipulation of objects (e.g., scaling, rotating, and moving them), and the creation of composite objects. We performed a thematic analysis of the collected data, assessing how the mental models of novices translate into voice-based 3D design.

Results Summary

We found that future AI assistants supporting novice users in voice-based digital modeling must:

  • manage the corrections users make during and after commands to fix errors;
  • deal with vague and incomplete commands, either by completing them with sensible defaults or by asking the users for clarification;
  • consider novices’ prior knowledge, for example about the use of undo and redo functions;
  • provide only a simplified set of operations for creating simple and composite 3D objects;
  • design a workflow similar to what novices would do if they were building real objects, for example wizard procedures that guide novices in designing composite 3D models from the bottom up;
  • provide different commands to select 3D objects;
  • understand and execute chained commands;
  • understand commands that are relative to the users’ point of view;
  • grant multiple ways to refer to the axes, for example by their names, colors, and direction relative to the user;
  • favor explicit trigger words to avoid unintentional activation of the voice assistant;
  • embrace diversity in naming, since novices often use other words to refer to 3D objects.
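To illustrate the second guideline, here is a minimal sketch of completing a vague command with sensible defaults or asking for clarification; the command representation and the defaults are our illustrative assumptions, not the study's actual system:

```python
# Hypothetical handler for a parsed voice command such as "create a cube".
# DEFAULTS and the command format are illustrative assumptions.
DEFAULTS = {"size": 1.0, "position": (0, 0, 0)}

def handle_create(command: dict):
    if "shape" not in command:
        # Too ambiguous to guess: ask the user instead of failing silently.
        return {"action": "clarify", "question": "What shape should I create?"}
    # Complete the vague command with sensible defaults.
    return {"action": "create", "params": {**DEFAULTS, **command}}

print(handle_create({"shape": "cube"}))
# -> {'action': 'create', 'params': {'size': 1.0, 'position': (0, 0, 0), 'shape': 'cube'}}
```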

Contact person: Mirco Nanni (mirco.nanni@isti.cnr.it)

Internal Partners:

  1. Università di Pisa (UNIPI), Giuliano Cornacchia, Giovanni Mauro, and Dino Pedreschi
  2. Consiglio Nazionale delle Ricerche (CNR), Mirco Nanni and Luca Pappalardo
  3. German Research Centre for Artificial Intelligence (DFKI), Paul Lukowicz

External Partners:

  1. University of Rome, Matteo Bohm 

 

Satellite navigation systems like TomTom, Google Maps, and OpenStreetMap-based services are pervasively used in our cities to help drivers reach their destinations faster. They are typically optimized to minimize an individual driver’s travel time, without considering their aggregate impact on traffic, whether a street can absorb the traffic, or whether some routes compromise drivers’ safety.

When dealing with socio-technical systems, the aggregation of selfishly optimal individual suggestions may have a negative impact at the collective level, worsening the well-being of society as a whole and leading to discomfort, particularly when individuals share a resource with limited capacity, as in the case of road networks. For instance, if the navigation system recommends that all vehicles travel on the same road to reach a destination, congestion may emerge on that road.

The adverse traffic effects unintentionally exacerbated by navigation systems can also harm the environment. Vehicular traffic is a critical hazard to urban sustainability, contributing to one of the most severe problems affecting conurbations: air pollution, which damages both human health and the environment.

This study aims to assess and quantify the impact of navigation systems on traffic and the urban environment through realistic simulations and what-if analyses. In particular, we consider CO2 emissions to measure the effect of traffic on the environment. Furthermore, we assess how the percentage of vehicles following different types of routing indications impacts CO2 emissions and their concentration on the roads of the network.

Results Summary

We set up precise and realistic traffic simulations using SUMO (Simulation of Urban MObility) under different scenarios in which various proportions of the vehicles follow the recommendations of satellite navigators, to assess the impact of routing strategies on the individual and collective dimensions. From the simulations, we find that the extreme settings, where no vehicle follows a navigator and where all cars follow the same navigator, are the worst in terms of their impact on sustainability. In contrast, a balanced mix of vehicles following a navigation system and vehicles ignoring its instructions appears to have a lighter impact on the city’s roads and the environment, making vehicles’ emissions more sustainable. In the balanced setting, with 40% of cars following the suggestions of satellite navigators, we observed the lowest total CO2 emissions, with pollution distributed more evenly over the road network.
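The following sketch shows how such a scenario can be driven from SUMO's Python API (TraCI). The configuration file name and the rerouting cadence are illustrative assumptions, not the project's exact setup (the simulation bundle below contains the real one):

```python
import random
import traci  # SUMO's official Python API

ROUTED_SHARE = 0.4            # fraction of vehicles that follow the navigator
SUMO_CFG = "milano.sumocfg"   # hypothetical config; see the simulation bundle

traci.start(["sumo", "-c", SUMO_CFG])
routed, total_co2, step = set(), 0.0, 0
while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()
    step += 1
    # Newly departed vehicles join the "routed" group with probability 0.4.
    for veh in traci.simulation.getDepartedIDList():
        if random.random() < ROUTED_SHARE:
            routed.add(veh)
    for veh in traci.vehicle.getIDList():
        if veh in routed and step % 60 == 0:
            # Re-route on current travel times, mimicking a satellite navigator.
            traci.vehicle.rerouteTraveltime(veh)
        total_co2 += traci.vehicle.getCO2Emission(veh)  # mg in the last step
traci.close()
print(f"Total CO2: {total_co2 / 1e6:.1f} kg")
```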

Tangible Outcomes

  1. A bundle to replicate a simulation with SUMO over Milano with 15k vehicles and 40% routed ones: https://kdd.isti.cnr.it/~nanni/Simulation_bundle_40percent_routed.zip
  2. Video presentation summarizing the project 

Contact person: Frank Dignum (dignum@cs.umu.se)

Internal Partners:

  1. Umeå University (UMU), Frank Dignum
  2. Consiglio Nazionale delle Ricerche (CNR), Eugenia Polizzi, Giulia Andrighetto and Mario Paolucci
  3. Leiden University, Mark Dechesne

External Partners:

  1. UU and National Police in Netherlands, Mijke van den Hurk 

 

In this project we investigate whether normative behavior can be detected in Facebook groups. As a first step, we hypothesize about possible norms that could lead to a group becoming more extreme on social media, or about whether groups that become more extreme develop certain norms that distinguish them from other groups and that could be detected. An example of such a norm could be that a (self-proclaimed) leader of a group is massively supported by retweets, likes, or affirmative messages, along with evidence of verbal sanctioning of counter-normative replies. Simulations and analyses of historical Facebook data (using manual detection in specific case studies and, more broadly, NLP) will help reveal the existence of normative behavior and its potential change over time.

Results Summary

The project delivered detailed analyses of the tweets around the US elections and the subsequent riots. While we expected to discover patterns in the tweets indicating more extreme behavior, it turned out that extremist expressions are quickly banned from Twitter and find a home on more niche social platforms (in this case Parler). The main conclusion of this project is therefore that we need to find the connections between users on different social media platforms in order to track extreme behavior.

To see how individuals might contribute to behavior that is not in the interest of society, we cannot analyze a single social media platform: extremist expressions in particular quickly move from mainstream social media to niche platforms, which themselves change quickly over time. The connection between individual and societal goals is thus difficult to observe by analyzing data from a single platform; on the other hand, it is very difficult to link users across platforms. Our core contribution can be summarized in two points:

  1. Identification of radical behavior in Parler groups
  2. Characterizing the language use of radicalized communities detected on Parler

Tangible Outcomes

  1. Video presentation summarizing the project

Contact person: Richard Benjamins (richard.benjamins@telefonica.com)

Internal Partners:

  1. Telefónica Investigación y desarrollo S.A. (TID), Richard Benjamins
  2. Volkswagen AG, Richard Niestroj
  3. Università di Bologna (UNIBO), Laura Sartori
  4. Consiglio Nazionale delle Ricerche (CNR), Fosca Giannotti  

External Partners:

  1. City Council, Valladolid, Pedro de Alarcon, pedroantoniode.alarconsanchez@telefonica.com

 

Globally, nine out of ten people breathe polluted air, and air pollution is the direct cause of death of more than seven million people per year. Between 20% and 40% of deaths due to serious diseases are caused by air pollution (source: https://www.stateofglobalair.org/sites/default/files/documents/2020-10/soga-global-profile-factsheet.pdf). In Spain, 10,000 people die every year due to air pollution (almost tripling traffic deaths), and in Madrid alone there are 5,000 pollution deaths per year (14 per day). Transportation by combustion engines is responsible for about 30% of air pollution, and in large cities this share is higher. Urban areas and their local governments face immense challenges with accelerating levels of NO2, ozone, particulate matter, and CO2 emissions, among other pollutants. In their mission to ensure cleaner air for their cities, the first and most important step is to collect accurate and consistent data, both to ensure healthy air quality levels for citizens and to identify the major air pollution hotspots. Moreover, cities are increasingly looking at their transit systems to cut the emissions that impact public health and the environment.

Until now, monitoring air quality has required great effort from cities. For local governments, air quality management can be costly because of the expensive equipment required to monitor the key pollutants. There are several sources of pollutants: industrial activities, construction, and residential heating, among others, but road traffic of fossil-fuel vehicles is the most prevalent source of dangerous pollutants such as NO2 and ozone (O3). However, the way actual traffic volumes are investigated is still relatively manual, using roadside interview data and manual counters, although IoT sensors are increasingly deployed. Not only is this expensive, it is often also inaccurate, providing only a small snapshot of how traffic really moves around cities and countries. By using mobility data and IoT, the authorities can shift to Big Data and AI: rather than relying on small samples, they can receive more frequent, precise, and granular insights. That is an important complement for informing decisions on air quality, as traffic, along with weather conditions, is closely correlated with air pollution levels.

European regulation requires cities not to exceed pollutant thresholds. However, measurements often take place at the district level, ignoring the fact that air quality might differ from street to street. Moreover, air quality does not have the same importance in a residential area as in a more industrial one, and the type of use also matters: schools, hospitals, sports facilities, et cetera.

The combination of mobility data (generated from anonymized and aggregated mobile phone data of the telecommunications sector), IoT pollution and climate sensor data from moving vehicles, and open data can provide actionable insights about traffic patterns and pollution, so that authorities and policymakers can better measure, predict, and manage cities’ mobility and pollution. This micro-project is strategically aligned with Europe’s Green Deal and the EU Data Strategy.

Results Summary

Artificial intelligence algorithms help increase the spatio-temporal accuracy of the monitoring activity and provide predictions of future (dangerous) pollution levels, so authorities can take preventive action. We performed a series of innovation activities, from the development of a prototype in one city (Madrid) to its validation in a second city (Valladolid), including a social and ethical impact analysis to understand whether air-quality-related decisions affect social groups equally. The prototype, built in collaboration with the city of Madrid, exploits both privately held data and publicly available (open) data to monitor air quality at street level. Data sources include traffic, vegetation, temperature/wind speed, and demographics. The system allows cities to perform evidence-based policy- and decision-making. An important feature of the system is that it brings heterogeneous data, algorithms, advanced visualization, and filtering control together in a single platform. This capability is key to performing exploratory data analysis and finding insights.
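As a rough illustration of how such heterogeneous sources can feed a street-level model, here is a minimal sketch; the file name, column names, and the choice of a gradient-boosting regressor are assumptions for illustration, not the project's actual pipeline:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hypothetical merged dataset: one row per street segment and hour.
df = pd.read_csv("street_observations.csv")
features = ["traffic_intensity", "wind_speed", "temperature",
            "vegetation_index", "population_density"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["no2_ugm3"], test_size=0.2, random_state=0)

model = GradientBoostingRegressor().fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```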

This project uses industrial data from the telecommunications industry, combined with open data and IoT-generated data, to mitigate an important societal problem, while at the same time showing a way in which the telecom sector can create value using artificial intelligence and data. It is aligned with the European Data Strategy, the Guidelines for Trustworthy AI, and the European Green Deal. This is the first in a series of three micro-projects.

Tangible Outcomes

  1. Press release through the participating organizations’ websites raising awareness about the issue: https://unstats.un.org/unsd/undataforum/blog/7-ways-mobile-data-is-being-used-to-change-the-world/ 
  2. video explaining the project Air Quality for All (AQ4A) that could be used for government and business presentations https://www.youtube.com/watch?v=WBNf5F9Kp7c
  3. Source of the presentation slides https://www.humane-ai.eu/_micro-projects/mps/MP-23/MP-6.10-airquality_v2_Berlin.pptx

Contact person: Paolo Ferragina (paolo.ferragina@unipi.it)

Internal Partners:

  1. UNIPI, Paolo Ferragina, paolo.ferragina@unipi.it
  2. ISTI-CNR, Giulio Rossetti, giulio.rossetti@isti.cnr.it  

External Partners:

  1. Scuola Normale Superiore 

 

The micro-project aims at designing a recommender system able to foster pluralistic viewpoints in news recommendations. The first step consists of quantifying the political bias of a news article. While this issue has been widely investigated in the US context, as far as we know no such work has been performed in the European context. In this scenario, we have already built a dataset with more than 8 million European news articles labeled by political leaning, popularity, and distribution area. Since a publicly available dataset of this size and richness of annotations does not exist for the EU media landscape, we believe it could have enormous value for subsequent academic studies. Additionally, we are currently leveraging AI-based NLP techniques to define a topic-modeling algorithm and a multilingual classifier able to identify the main topics and the political leaning of each article.

Results Summary

The tangible objective of this micro-project was to develop two datasets of European news with political-leaning labels. This was needed to tackle the next step of the project: building a bias-minimizing recommender system for European news.

The first dataset comprises millions of European news articles and has been enriched with metadata from Eurotopics.net. Each entry contains the main text, title, publication date, language, and news source, together with news-source metadata, namely the political leaning of the source and its country.

We then built an article-level bias classifier, attempting to predict the political label of single articles using the labels obtained through distant supervision. Applying explainable AI to the classifier, we concluded that it was effectively predicting the news source rather than the political leaning.
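The distant-supervision setup can be sketched as follows; the toy data and the TF-IDF/logistic-regression pipeline are illustrative assumptions, not the project's actual models:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Distant supervision: each article inherits the political-leaning label
# of its news source (toy examples; the real dataset has millions).
articles = [
    ("article text from a left-leaning outlet ...", "left"),
    ("article text from a right-leaning outlet ...", "right"),
    ("another left-leaning article ...", "left"),
    ("another right-leaning article ...", "right"),
]
texts, labels = zip(*articles)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
```

Inspecting which features carry the most weight (e.g., outlet-specific vocabulary) is one way to detect the shortcut described above, where the classifier learns to recognize the source instead of the leaning.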

To overcome this issue, we built a second dataset, which has the same features as the first one described above, with the addition of a topic per article, chosen from among seven macro-topics.

The immediate plan is to perform political-bias classification on the new dataset, filtering out all the articles that carry no political bias, such as those dealing with sports or gossip.

 

Tangible Outcomes

  1. Dataset without topics https://drive.google.com/file/d/1Qq2khT7lM-5_oHSNJhbK_-EATNdOSY-n/view?usp=sharing 
  2. Dataset with topics:
    https://drive.google.com/file/d/1KGy-FcLulACK_Fa3Abd9Xr4qaurBnu2S/view?usp=sharing 
  3. Repo with the code used to build and study the datasets: https://github.com/LorenzoBellomo/EU-NewsDataset 
  4. Repo with the political bias classifier: https://github.com/LorenzoBellomo/BiasClassification   
  5. Report summarizing the detailed results
    https://sobigdata.d4science.org/catalogue-sobigdata?path=/dataset/pluralistic_recommendation_in_news_-_report

 

Contact person: Jesus Cerquides (j.cerquides@csic.es)

Internal Partners:

  1. Consejo Superior de Investigaciones Científicas (CSIC), Jesus Cerquides, cerquide@iiia.csic.es

External Partners:

  1. University of Geneva, Jose Luis Fernandez Marquez  

 

Social media generates large amounts of almost real-time data, which can be extremely valuable in an emergency situation, especially for providing information within the first 72 hours after a disaster event. Although there are abundant state-of-the-art machine learning techniques to automatically classify social media images, and some work on geolocating them, the operational problem in the event of a new disaster remains unsolved.

Currently, the state-of-the-art approach to this first-response mapping is to filter the images and then submit those to be geolocated to a crowd of volunteers [1], assigning the images to the volunteers at random.

The project is aimed at leveraging the power of crowdsourcing and artificial intelligence (AI) to assist emergency responders and disaster relief organizations in building a damage map from a zone recently hit by a disaster. Specifically, the project involves the development of a platform that can intelligently distribute geolocation tasks to a crowd of volunteers based on their skills. The platform uses machine learning to determine the skills of the volunteers based on previous geolocation experiences.

Thus, the project concentrates on two different tasks:

  • Profile Learning. Based on the previous geolocations of a set of volunteers, learn a profile of each volunteer that encodes their geolocation capabilities. These profiles should be understood as competency maps, representing the capability of the volunteer to provide an accurate geolocation for an image coming from a specific geographical area.
  • Active Task Assignment. Use the volunteer profiles efficiently in order to maximize geolocation quality while maintaining a fair distribution of geolocation tasks among volunteers (a sketch of both tasks follows this list).
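Under simple assumptions (a per-region Beta posterior as the competency map, and a load cap as the fairness criterion; both are our illustrative choices, not necessarily the project's), the two tasks could look like this:

```python
from collections import defaultdict

class VolunteerProfile:
    """Competency map: a Beta posterior over accuracy, per geographic region."""
    def __init__(self):
        self.counts = defaultdict(lambda: [1, 1])  # Beta(1, 1) prior
    def update(self, region, correct):
        self.counts[region][0 if correct else 1] += 1
    def accuracy(self, region):
        s, f = self.counts[region]
        return s / (s + f)  # posterior mean accuracy

def assign(image_region, profiles, load, slack=2):
    # Fairness: only volunteers near the minimum current load are eligible.
    eligible = [v for v in profiles if load[v] <= min(load.values()) + slack]
    # Quality: among those, pick the best expected accuracy for this region.
    return max(eligible, key=lambda v: profiles[v].accuracy(image_region))
```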

In the first stage, we envision an experimental framework with realistically generated artificial data, which acts as a feasibility study. This will be published as a paper in a major conference or journal. Simultaneously we plan to integrate both the profile learning and the active task assignment with the crowdnalysis library, a software outcome of our previous micro-project. Furthermore, we plan to organize a geolocation workshop to take place in Barcelona with participation from the JRC, University of Geneva, United Nations, and IIIA-CSIC.

In the near future, the system will generate reports and visualizations to help these organizations quickly understand the distribution of damages. The resulting platform could enable more efficient and effective responses to natural disasters, potentially saving lives and reducing the impact of these events on communities.

[1] Fathi, Ramian, Dennis Thom, Steffen Koch, Thomas Ertl, and Frank Fiedrich. “VOST: A Case Study in Voluntary Digital Participation for Collaborative Emergency Management.” Information Processing and Management 57, no. 4 (July 1, 2020): 102174. https://doi.org/10.1016/j.ipm.2019.102174 

Results Summary

The project focused on improving the accuracy and efficiency of geolocating social media images during emergencies by using crowdsourced volunteers. Key results include the development of two models: a profile-learning model to gauge volunteers’ geolocation abilities, and a task-assignment model that optimizes image distribution based on volunteer skills. These models outperform traditional random-assignment approaches, reducing annotation requirements and improving the quality of the geolocation consensus without sacrificing accuracy. This method holds promise for disaster-response applications. We had three main outputs:

  1. Open-source implementation of the volunteer profiling and consensus geolocation algorithms into the crowd analysis library.
  2. Papers evaluating the different geolocation-consensus and active task-assignment strategies
  3. an online workshop to collect expert feedback about the topic

Tangible Outcomes

  1. Ballester, Rocco, Yanis Labeyrie, Mehmet Oguz Mulayim, Jose Luis Fernandez-Marquez, and Jesus Cerquides. “Crowdsourced Geolocation: Detailed Exploration of Mathematical and Computational Modeling Approaches.” Cognitive Systems Research 88 (December 1, 2024): 101266. https://doi.org/10.1016/j.cogsys.2024.101266 .
  2. Ballester, R., Labeyrie, Y., Mulayim, M.O., Fernandez-Marquez, J.L. and Cerquides, J., 2023. Mathematical and Computational Models for Crowdsourced Geolocation. In Artificial Intelligence Research and Development (pp. 301-310). IOS Press. https://ebooks.iospress.nl/doi/10.3233/FAIA230699
  3.  Firmansyah, H. B., Bono, C. A., Lorini, V., Cerquides, J., & Fernandez-Marquez, J. L. (2023). Improving Disaster Response by Combining Automated Text Information Extraction from Images and Text on Social Media. In Artificial Intelligence Research and Development (pp. 320-329). IOS Press. https://ebooks.iospress.nl/doi/10.3233/FAIA230701
  4. Cerquides J., Mülâyim M.O. Crowdnalysis: A software library to help analyze crowdsourcing results (2024), 10.5281/zenodo.5898579 https://github.com/IIIA-ML/crowdsourced_geolocation

Contact person: Elisabetta Biondi (elisabetta.biondi@iit.cnr.it)

Internal Partners:

  1. Consiglio Nazionale delle Ricerche (CNR), Elisabetta Biondi, elisabetta.biondi@iit.cnr.it
  2. Central European University (CEU), Janos Kertesz, kerteszj@ceu.edu, Gerardo Iniguez, IniguezG@ceu.edu

 

The Friedkin-Johnsen (FJ) model is a very popular model in opinion dynamics, validated on real groups and well investigated from the standpoint of opinion polarization. Previous research has focused almost exclusively on static networks, where links between nodes do not evolve over time. In this micro-project, we filled this gap by designing a variant of the Friedkin-Johnsen model that embeds the dynamicity of social networks. Furthermore, we designed a novel definition of global polarization that combines network features and opinion distribution, to capture the existence of clustered opinions. We analyzed the polarization effect of the new dynamic model and identified the impact of the network structure.

Results Summary

Human social networks are very complex systems, and their structure has an essential impact on opinion dynamics. However, since our main goal was to study the impact of the opinion dynamics model per se, we decided to deal with two different network topologies: an Erdős–Rényi (ER) graph and a stochastic block model (SBM).

— Design of the dynamic Friedkin-Johnsen (FJ) model. We implemented a rewiring policy that has been extensively studied in discrete opinion diffusion models: edges connecting nodes with different opinions are replaced with other edges. We adapted this scheme to the FJ model’s opinions, which lie in the range [-1,1], in both the asynchronous and synchronous versions. Given two parameters θ (the disagreement threshold) and p_rew (the rewiring probability):

  • with probability 1-p_rew, the FJ update is applied;
  • with probability p_rew, if i and j disagree, i.e. |x_i - x_j| > θ, the edge (i,j) is replaced with an edge (i,k) where k agrees with i, i.e. |x_i - x_k| <= θ.

This algorithm was designed and implemented for the ER graph. In the case of the SBM, we limited the potential candidates for rewiring to nodes within a maximum of two hops, to prevent the block structure from becoming entirely irrelevant. The rationale behind this choice is the triadic closure mechanism, which suggests that individuals are more inclined to choose new acquaintances among the friends of their friends.
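A minimal sketch of one synchronous step, assuming a uniform FJ susceptibility λ and unweighted edges (the project's exact parameterization may differ):

```python
import random
import networkx as nx

def fj_rewiring_step(G, x, x0, lam=0.5, theta=0.5, p_rew=0.1):
    """One synchronous step: FJ opinion update plus threshold-based rewiring."""
    # FJ update: average of neighbours, anchored to the innate opinion x0.
    new_x = {}
    for i in G:
        nbrs = list(G[i])
        avg = sum(x[j] for j in nbrs) / len(nbrs) if nbrs else x[i]
        new_x[i] = lam * avg + (1 - lam) * x0[i]
    # Rewiring: each disagreeing edge (i, j) is, with probability p_rew,
    # replaced by an edge (i, k) towards an agreeing, non-adjacent node k.
    for i, j in list(G.edges()):
        if abs(x[i] - x[j]) > theta and random.random() < p_rew:
            candidates = [k for k in G if k != i and not G.has_edge(i, k)
                          and abs(x[i] - x[k]) <= theta]
            if candidates:
                G.remove_edge(i, j)
                G.add_edge(i, random.choice(candidates))
    return new_x

G = nx.erdos_renyi_graph(20, 0.3, seed=1)
x0 = {i: random.uniform(-1, 1) for i in G}
x = dict(x0)
for _ in range(100):
    x = fj_rewiring_step(G, x, x0)
```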

— Design of the polarization metric. Designing the polarization metric required a definition of a highly polarized network. We defined it as a network in which two distinct opinions are clustered into two tightly connected communities. To capture this, we needed to consider both the network structure and the distribution of opinions, so we used two different metrics: bimodality for the opinion distribution, and homogeneity for its correspondence with the network structure.

— Bimodality. The bimodality coefficient measures the extent to which a distribution is bimodal. It is calculated from the skewness and kurtosis values and represents how similar the distribution is to one with two modes.
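One common form is Sarle's bimodality coefficient, shown below; whether the project used this exact variant (or a finite-sample correction) is an assumption on our part:

```python
from scipy.stats import kurtosis, skew

def bimodality_coefficient(xs):
    """Sarle's coefficient: values above 5/9 ~ 0.56 suggest bimodality."""
    g = skew(xs)
    k = kurtosis(xs, fisher=False)  # non-excess (Pearson) kurtosis
    return (g ** 2 + 1) / k

# Two tight opinion clusters at the extremes score close to 1:
print(bimodality_coefficient([-1.0, -0.9, -0.8, 0.8, 0.9, 1.0]))
```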

— Homogeneity. To measure the homogeneity of the opinion distribution with respect to the network structure, we examined the local distribution of nodes’ opinions: whether each node’s opinion was similar to those of its neighbors, which would suggest that it was in line with the overall opinion distribution over the network. The final homogeneity value is close to zero if the distribution of opinions is close to linear.

— Experimental evaluation. We developed a Python simulator that computes the dynamic FJ model (rewiring included) and the polarization metrics over time, given a network and initial opinions. To test the model, we ran simulations on a small network of 20 nodes and compared the outcomes of the FJ model with and without rewiring. For the ER network, the initial opinions were uniformly distributed over [-1,1]. For the SBM networks, we employed a different configuration: initial opinions were drawn uniformly from [-0.5,-0.1] or [0.1,0.5], depending on which block the node belonged to. In conclusion, this micro-project delivered a dynamic version of the FJ model for the synchronous and asynchronous cases, together with a new definition of polarization that considers both the distribution of opinions and the network topology. To assess the model’s effectiveness, we conducted simulations on two different network types: an ER network and an SBM network. Our findings indicate that the rewiring process has significant effects on polarization, but these effects depend on the initial network.

 

Tangible Outcomes

  1. Github link of the code of the simulator for the new dynamic model: https://github.com/elisabettabiondi/FJ_rewiring_basic.git 

 

Contact person: Giuseppe Manco (giuseppe.manco@icar.cnr.it)

Internal Partners:

  1. Consiglio Nazionale delle Ricerche (CNR), Giuseppe Manco
  2. INESC TEC 
  3. Università di Pisa (UNIPI) 

 

This microproject investigates methods for learning deep probabilistic models based on latent representations that can explain and predict event evolution within social media. Latent variables are particularly promising in situations where the level of uncertainty is high, due to their capabilities in modeling the hidden causal relationships that characterize data and ultimately guarantee robustness and trustworthiness in decisions. In addition, probabilistic models can efficiently support simulation, data generation and different forms of collaborative human-machine reasoning.

Results Summary

Our micro-project investigated methods for modeling event interactions through temporal processes. We revisited the notion of event modeling and provided the mathematical foundations that characterize the literature on the topic. We defined an ontology that categorizes existing approaches into three families: simple, marked, and spatio-temporal point processes. For each family, we systematically reviewed the existing approaches and provided an in-depth discussion.

Specifically, we investigated recent machine and deep learning methods for modeling temporal processes, focusing on prediction problems over event sequences in order to understand their structural and temporal dynamics. Understanding these dynamics can provide insights into the complex patterns that govern the process and can be used to forecast future events. Among existing approaches, we investigated probabilistic models based on latent representations, which are an appropriate choice for modeling event sequences.

Event sequences are pervasive in several application contexts, such as business processes and smart industry, as well as scenarios involving human activities, especially information diffusion in social media. Our study therefore focused on works aiming at the prediction of events within social media, where interactions among individuals on content-sharing platforms such as Twitter and Instagram can be modeled as event sequences, the events being user actions over time. In addition, we provided an overview of other application scenarios such as healthcare, finance, disaster management, public security, and daily life. The analyzed literature provides several datasets, which we categorized according to the application scenarios they can be used for. For each dataset, we reported its description, the papers containing experiments on it, and, when available, a source web link.
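For concreteness, here is a minimal sketch of the simplest family surveyed, a self-exciting (Hawkes) temporal point process with intensity λ(t) = μ + Σ_{t_i < t} α·exp(−β(t − t_i)), simulated with Ogata's thinning algorithm; the parameter values are arbitrary:

```python
import math
import random

def simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=20.0):
    """Ogata thinning: each event raises the intensity, which then decays."""
    events, t = [], 0.0
    while True:
        # Intensity just after t upper-bounds the decaying intensity
        # until the next event occurs.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += random.expovariate(lam_bar)  # candidate event time
        if t >= horizon:
            return events
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        if random.random() < lam_t / lam_bar:  # accept with prob lambda/bound
            events.append(t)

print(simulate_hawkes())  # bursty timestamps, e.g. retweet cascades
```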

Tangible Outcomes

  1. [arxiv] “Modeling Events and Interactions through Temporal Processes – A Survey” https://arxiv.org/abs/2303.06067 [under review] ACM Computing Surveys
  2. A list of relevant datasets. https://github.com/Angielica/datasets_point_processes
  3. Survey showing Point Processes resources. https://github.com/Angielica/temporal_processes

Contact person: Andrea Passarella (andrea.passarella@iit.cnr.it)

Internal Partners:

  1. Consiglio Nazionale delle Ricerche (CNR), Andrea Passarella, andrea.passarella@iit.cnr.it
  2. Central European University (CEU), János Kertész, kerteszj@ceu.edu

 

We envision a human-AI ecosystem in which AI-enabled devices act as proxies of humans and try to learn collectively a model in a decentralized way. Each device learns a local model that needs to be combined with the models learned by the other nodes, in order to improve both the local and global knowledge. The challenge of doing so in a fully decentralized AI system entails understanding how to compose models coming from heterogeneous sources and, in case of potentially untrustworthy nodes, decide who can be trusted and why. In this micro-project, we focus on the specific scenario of model “gossiping” for accomplishing a decentralized learning task and we study what models emerge from the combination of local models, where the combination takes into account the social relationships between the humans associated with the AI. We use synthetic graphs to represent social relationships, and large-scale simulation for performance evaluation.

Results Summary

The micro-project developed a modular simulation framework to test decentralised machine learning algorithms on top of large-scale complex social networks. The framework is written in Python, exploiting state-of-the-art libraries such as networkx (to generate network models) and PyTorch (to implement ML models). The simulator is modular: it accepts networks in the form of datasets as well as synthetic models. Local data are allocated to each node, which trains a local ML model of choice. Communication rounds are implemented, through which local models are aggregated and re-trained on local data. Benchmarks are included, namely federated learning and centralised learning. Initial simulation results assess the accuracy of decentralised learning (social AI gossiping) on Barabási-Albert networks, showing that social AI gossiping achieves accuracy comparable to the centralised and federated learning versions (which, unlike it, rely on centralised elements). The work has been continued in a follow-up micro-project (TMP-034). The simulator developed in this micro-project and in its follow-up (TMP-034) has been used in the publications reported here.
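The core of such a communication round can be sketched as follows; the toy model, graph size, and plain neighbour averaging are illustrative assumptions, not necessarily the simulator's exact aggregation rule:

```python
import networkx as nx
import torch

G = nx.barabasi_albert_graph(n=50, m=2, seed=0)
models = {i: torch.nn.Linear(10, 2) for i in G}  # one local model per node

def gossip_round(G, models):
    """Each node averages its parameters with its social neighbours'."""
    new_states = {}
    for i in G:
        group = [models[i]] + [models[j] for j in G[i]]
        new_states[i] = {
            name: torch.stack([m.state_dict()[name] for m in group]).mean(0)
            for name in models[i].state_dict()
        }
    for i in G:
        models[i].load_state_dict(new_states[i])
        # ...followed by a few local training steps on node i's private data.

gossip_round(G, models)
```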

Tangible Outcomes

  1. Palmieri, Luigi, Lorenzo Valerio, Chiara Boldrini, and Andrea Passarella. “The effect of network topologies on fully decentralized learning: a preliminary investigation.” In Proceedings of the 1st International Workshop on Networked AI Systems, pp. 1-6. 2023. https://dl.acm.org/doi/10.1145/3597062.3597280
  2. Luigi Palmieri, Lorenzo Valerio, Chiara Boldrini, Andrea Passarella, Marco Conti, “Exploring the Impact of Disrupted Peer-to-Peer Communications on Fully Decentralized Learning in Disaster Scenarios”, 8th International Conference on Information and Communication Technologies for Disaster Management (ICT-DM 2023). https://doi.org/10.1109/ICT-DM58371.2023.10286953 https://arxiv.org/abs/2310.02986
  3. Luigi Palmieri, Chiara Boldrini, Lorenzo Valerio, Andrea Passarella, and Marco Conti. “Impact of Network Topology on the Performance of Decentralized Federated Learning.” Computer Networks 253 (2024). https://www.sciencedirect.com/science/article/pii/S1389128624005139 
  4. [arxiv] Valerio, L., Boldrini, C., Passarella, A., Kertész, J., Karsai, M., and Iñiguez, G., “Coordination-free Decentralised Federated Learning on Complex Networks: Overcoming Heterogeneity”, arXiv e-prints, 2023. https://arxiv.org/abs/2312.04504 
  5. [arxiv] Palmieri, Luigi, Chiara Boldrini, Lorenzo Valerio, Andrea Passarella, Marco Conti, and János Kertész. “Robustness of Decentralised Learning to Nodes and Data Disruption.” arXiv preprint arXiv:2405.02377 (2024). https://arxiv.org/abs/2405.02377
  6. Code: SAIsim, C. Boldrini, L. Valerio, A. Passarella, https://zenodo.org/record/5780042#.Ybi2sX3MLPw

Contact person: Jennifer Renoux (jennifer.renoux@oru.se)

Internal Partners:

  1. Örebro University (ORU), Jennifer Renoux
  2. Instituto Superior Técnico (IST), Ana Paiva

 

Social dilemmas are situations in which the interests of individuals conflict with those of the team, and in which maximum benefit can be achieved if enough individuals adopt prosocial behavior (i.e., focus on the team’s benefit at their own expense). In a human-agent team, the adoption of prosocial behavior is influenced by various features displayed by the artificial agent, such as transparency or small talk. One feature still unstudied is expository communication, i.e., communication performed with the intent of providing factual information without favoring any party. We implemented a public goods game with information asymmetry (i.e., agents in the game do not have the same information about the environment) and performed a user study in which we manipulated the amount of information the artificial agent provides to the team, examining how varying levels of information increase or decrease human prosocial behavior.

Results Summary

This micro-project has led to the design and development of an experimental platform to test how communication from an artificial agent influences a human’s pro-social behavior.

The platform comprises the following components:

– a fully configurable mixed-motive public goods game, allowing a single human player to play with artificial agents, plus an artificial “coach” giving feedback on the human’s actions. Configuration is done through JSON files (number and types of agents, type of feedback, game configuration…). The game, called “Pest Control”, is a public goods game in which players must prevent a spreading pest from reaching their farm while gathering as many coins as possible; an artificial agent can give feedback to the player. In this implementation, one human player controls the game and four artificial agents play with them. The game has been used as the basis for a user study investigating the impact of expository information on a human’s prosociality (the underlying payoff structure is sketched after this list).

– a set of questionnaires designed to evaluate the prosocial behavior of the human player during a game
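For reference, the standard public goods payoff that mixed-motive games of this kind build on is sketched below; the JSON keys and the exact scoring of “Pest Control” are illustrative assumptions, not the platform's actual schema:

```python
import json

# Hypothetical game configuration in the spirit of the JSON files above.
config = json.loads("""{
  "n_agents": 5, "endowment": 10, "multiplier": 1.5,
  "feedback_type": "expository"
}""")

def payoffs(contributions, endowment, multiplier):
    pot = multiplier * sum(contributions)   # the public good grows
    share = pot / len(contributions)        # and is split equally
    # Each player keeps what they did not contribute, plus their share.
    return [endowment - c + share for c in contributions]

print(payoffs([10, 0, 5, 5, 2], config["endowment"], config["multiplier"]))
```

With a multiplier below the number of players, contributing nothing is individually optimal while full contribution maximizes the team total, which is exactly the dilemma such a study manipulates.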

Tangible Outcomes

  1. Pest control game demo https://jrenoux.github.io/pestcontrolgame/demo/index.html
  2. The Pest Control Game experimental platform – Jennifer Renoux*, Joana Campos, Filipa Correia, Lucas Morillo, Neziha Akalin, Ana Paiva https://github.com/jrenoux/pest-control-game-source
  3. Video presentation summarizing the project

Contact person: Eugenia Polizzi

Internal Partners:

  1. Consiglio Nazionale delle Ricerche (CNR), ISTC, Eugenia Polizzi
  2. Fondazione Bruno Kessler (FBK), Marco Pistore

 

The goal of the project is to investigate the role of social norms on misinformation in online communities. This knowledge can help identify new interventions in online communities that help prevent the spread of misinformation. To accomplish the task, the role of norms was explored by analyzing Twitter data gathered through the Covid19 Infodemics Observatory, an online platform developed to study the relationship between the evolution of the COVID-19 epidemic and the information dynamics on social media. This study can inform a further set of microprojects addressing norms in AI systems through theoretical modelling and social simulations.

Results Summary

In this MP, we diagnosed and visualized a map of existing social norms underlying fake news related to COVID-19. Through the analysis of millions of geolocated tweets collected during the COVID-19 pandemic, we identified structural and functional network features supporting an “illusion of the majority” on Twitter. Our results suggest that the majority of fake (and other) content related to the pandemic is produced by a minority of users, and that there is a structural segmentation into a small “core” of very active users responsible for a large amount of fake news and a larger “periphery” that mainly retweets the content produced by the core. This discrepancy between the size and identity of the users involved in the production and diffusion of fake news suggests that a distorted perception of what users believe to be the majority opinion may pressure users (especially those in the periphery) to comply with the group norm and further contribute to the spread of misinformation in the network.

Tangible Outcomes

  1. The voice of few, the opinions of many: evidence of social biases in Twitter COVID-19 fake news sharing – Piergiorgio Castioni, Giulia Andrighetto, Riccardo Gallotti, Eugenia Polizzi, Manlio De Domenico   https://arxiv.org/abs/2112.01304
  2. Video presentation summarizing the project

 

Contact person: Frank Dignum (dignum@cs.umu.se)

Internal Partners:

  1. Umeå University (UMU), Frank Dignum
  2. Instituto Superior Técnico (IST), Rui Prada, Maria Inês Lobo, and Diogo Rato

 

For systems to function effectively in cooperation with humans and other AI systems, they have to be aware of their social context. They should take the social aspects of their context into account especially in their interactions, but they can also use the social context to manage those interactions. Using the social context in the deliberation about interaction steps allows for an effective and focused dialogue geared towards a specific goal accepted by all parties in the interaction. In this project we started from the Dialogue Trainer system, which allows authoring very simple but directed dialogues to train (medical) students to have effective conversations with patients. Based on this tool, in which the social context is taken into account only through the authors of the dialogue, we designed a system that actually deliberates about the social context.

Results Summary

The MP addresses the following limitations of scripted dialogue training systems:

  • Dialogue is not self-made: players are unable to learn relevant communication skills.
  • Dialogue is predetermined: the agent does not need to adapt to changes in the context.
  • Dialogue tree is very large: the editor may have difficulty managing the dialogue.

Therefore, this project’s goal is the creation of a flexible dialogue system, in which a socially aware conversational agent will deliberate and provide context-appropriate responses to users, based on defined social practices, identities, values, or norms. Scenarios in this dialogue system should be easy to author as well.

The main result is a Python prototype of a dialogue system with an architecture based on Cognitive Social Frames and Social Practices, whose dialogue scenarios are easy to edit in Twine, a widely used tool. We also submitted a workshop paper (see Tangible Outcomes).
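As a rough reading of how such deliberation could work (this is our illustrative sketch, not the prototype's actual API), the agent can select the social frame most salient in the current context and let the frame's social practice constrain the appropriate dialogue moves:

```python
# Hypothetical frames and context; all names are illustrative only.
FRAMES = [
    {"name": "doctor-patient", "cues": {"location": "clinic"},
     "moves": ["greet_formally", "ask_symptoms", "show_empathy"]},
    {"name": "casual", "cues": {"location": "street"},
     "moves": ["greet_informally", "small_talk"]},
]

def salience(frame, context):
    """Fraction of the frame's contextual cues matched by the situation."""
    cues = frame["cues"]
    return sum(context.get(k) == v for k, v in cues.items()) / len(cues)

context = {"location": "clinic", "role_other": "patient"}
frame = max(FRAMES, key=lambda f: salience(f, context))
print(frame["name"], "->", frame["moves"])  # only context-appropriate moves
```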

Tangible Outcomes

  1. Socially Aware Interactions: From Dialogue Trees to Natural Language Dialogue Systems. I. Lobo, D. Rato, R. Prada, F. Dignum In: , et al. Chatbot Research and Design. CONVERSATIONS 2021. Lecture Notes in Computer Science(), vol 13171. Springer, Cham. https://link.springer.com/chapter/10.1007/978-3-030-94890-0_8
  2. Prototype of dialogue system – ines.lobo@tecnico.ulisboa.pt https://github.com/GAIPS/socially-aware-interactions
  3. Video presentation summarizing the project