Many citizen science projects have a crowdsourcing component in which several citizen scientists are asked to complete a micro task (such as tagging an image as relevant or irrelevant for evaluating damage in a natural disaster, or classifying a specimen within a taxonomy). How do we build a consensus among the different opinions/votes? Currently, simple majority voting is used most of the time. We argue that alternative voting schemes that take into account the errors made by each annotator could substantially reduce the number of citizen scientists required. This is a clear example of continuous human-in-the-loop machine learning, with the machine creating a model of the humans it has to interact with.
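To make the contrast concrete, the following minimal sketch compares simple majority voting with a vote that weights each annotator by an assumed, already known accuracy. It is an illustration only and not the crowdnalysis implementation: the function names, the accuracy values, and the simplifying assumption that an annotator's errors are spread uniformly over the wrong labels are all hypothetical.

```python
import math
from collections import Counter

def majority_vote(votes):
    """Return the most common label (ties go to the label seen first)."""
    return Counter(votes).most_common(1)[0][0]

def accuracy_weighted_vote(votes, accuracies, labels=("relevant", "irrelevant")):
    """Return the label with the highest posterior under a uniform prior.

    Assumption (hypothetical): annotator i is correct with probability
    accuracies[i], strictly between 0 and 1, and otherwise picks one of
    the wrong labels uniformly at random.
    """
    n_wrong = len(labels) - 1
    scores = {}
    for label in labels:
        log_likelihood = 0.0
        for vote, acc in zip(votes, accuracies):
            p = acc if vote == label else (1.0 - acc) / n_wrong
            log_likelihood += math.log(p)
        scores[label] = log_likelihood
    return max(scores, key=scores.get)

# One reliable annotator disagrees with two near-random annotators.
votes = ["relevant", "irrelevant", "irrelevant"]
accuracies = [0.95, 0.55, 0.55]  # illustrative values, not measured data

majority_label = majority_vote(votes)
weighted_label = accuracy_weighted_vote(votes, accuracies)
print(majority_label, weighted_label)
```

In this toy case the two rules disagree: the majority follows the two weak annotators, while the weighted vote follows the single reliable one. In practice the accuracies are not known in advance and have to be estimated from the annotations themselves, which is the role of the probabilistic consensus model.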

We propose to study consensus building under two different hypotheses: truthful annotators (as a model for most voluntary citizen science projects) and self-interested annotators (as a model for paid crowdsourcing projects).


Software and documentation for the two new consensus models, integrated into the crowdnalysis framework.

Case study applying the new consensus models in a citizen science project.

Algorithm for numerical simulations to evaluate the efficacy of the consensus models considered in crowdnalysis.

Report on the simulation results, with suggestions for improving the consensus models.


Project Partners:

  • Consejo Superior de Investigaciones Científicas (CSIC), Jesus Cerquides
  • Consiglio Nazionale delle Ricerche (CNR), Daniele Vilone

Primary Contact: Jesus Cerquides, IIIA-CSIC

Main results of micro project:

We have contributed to the implementation of several different probabilistic consensus models in the Crowdnalysis library, which has been released as a Python package.
We have proposed a generic mathematical framework for the definition of probabilistic consensus algorithms, and for performing prospective analysis. This has been published in a journal paper.
We have used the library and the mathematical framework for the analysis of images from the Albanian earthquake scenario.
We used Monte Carlo simulations to understand the best way to assess group decisions when evaluating the correct level of damage in natural catastrophes. The results collected so far, which are expected to be published this year, suggest that the majority rule is the best option as long as all the agents are competent enough to address the task. Otherwise, when the number of unqualified agents is no longer negligible, smarter procedures must be devised.
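The kind of Monte Carlo experiment described above can be sketched as follows. This is a simplified illustration and not the simulation code used in the project: it assumes a binary task, independent agents, and hypothetical per-agent probabilities of answering correctly (0.8 for competent agents, 0.5 for unqualified ones). All names and parameter values are assumptions.

```python
import random

def majority_accuracy(n_agents, n_unqualified, p_competent=0.8,
                      p_unqualified=0.5, n_trials=5000, seed=0):
    """Estimate how often a majority of agents gives the correct answer.

    Assumptions (hypothetical): binary task, independent agents, and an
    odd n_agents so that the majority is never tied.
    """
    rng = random.Random(seed)
    accuracies = ([p_unqualified] * n_unqualified
                  + [p_competent] * (n_agents - n_unqualified))
    hits = 0
    for _ in range(n_trials):
        # Count the agents that answer correctly in this trial.
        correct_votes = sum(rng.random() < p for p in accuracies)
        if 2 * correct_votes > n_agents:
            hits += 1
    return hits / n_trials

# Estimated majority-rule accuracy as unqualified agents replace competent ones.
results = {m: majority_accuracy(9, m) for m in (0, 3, 6, 9)}
for n_unqualified, accuracy in results.items():
    print(f"{n_unqualified} unqualified of 9: estimated accuracy {accuracy:.3f}")
```

Varying `n_unqualified` shows how the reliability of the majority rule depends on the composition of the group; the same loop could be reused to compare the majority rule against other aggregation procedures on identical simulated votes.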

Contribution to the objectives of HumaneAI-net WPs

Crowdsourcing can be applied to quickly obtain accurate information in different domains, including disaster management scenarios. This requires computing the consensus among the different annotators. Probabilistic graphical models can be used to build interpretable consensus models. These models can answer questions such as “Who is the most competent annotator for this task?” or “How many annotators do I need for this task?”. They provide a clear example of machine learning with a human in the loop and are directly related to T1.3 Continuous & incremental learning in joint human/AI systems in WP2.
The evolutionary results will help identify the best way to proceed in the research, suggesting new theoretical and experimental studies to address the topic. They therefore make it possible to evaluate the interplay between human action and AI learning in crowdsourcing tasks, which connects with T3.2 Human-AI Interaction/collaboration paradigms.

Tangible outputs