Contact person: Joao Gama (INESC TEC) (jgama@fep.up.pt)

Internal Partners:

  1. INESC TEC, Joao Gama
  2. CNR, Giuseppe Manco
  3. ULEI, Holger Hoos

The goal is to devise a data generation methodology that, given a data sample, can approximate the stochastic process that generated it. Such a methodology is useful in the many contexts where data must be shared while preserving user privacy. Existing approaches to data generation, based on Bayesian neural networks or hidden Markov models, are restricted to static and propositional data. We focus instead on time-evolving data and preference data. We study two main aspects: (1) how to build a generator that produces realistic data with the same properties as the original data, and (2) how to inject drift into the data generation process in a controlled manner. The idea is to model the stochastic process as a dependency graph among random variables, so that drift can be modeled simply by changing the structure of the underlying graph through a morphing process.
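
The graph-morphing idea can be illustrated with a minimal Python sketch. It is not the model implemented in the project repository; it only assumes, for illustration, that the dependency graph is a small set of binary variables with weighted edges and that morphing is a linear interpolation between a source and a target set of edge weights. The variable names, the weights, the logistic link, and the functions `morph`, `sample_record`, and `generate_stream` are all hypothetical.

```python
import math
import random

# Hypothetical example: three binary variables listed in topological order.
VARIABLES = ["A", "B", "C"]

# Edge weights graph[parent][child]; 0.0 means "no dependency".
# Both graphs are invented for this sketch.
SOURCE_GRAPH = {"A": {"B": 3.0, "C": 0.0}, "B": {"C": 0.0}, "C": {}}
TARGET_GRAPH = {"A": {"B": 0.0, "C": 3.0}, "B": {"C": 0.0}, "C": {}}


def morph(source, target, alpha):
    """Interpolate edge weights: alpha=0 gives source, alpha=1 gives target."""
    return {
        parent: {
            child: (1.0 - alpha) * weight + alpha * target[parent][child]
            for child, weight in children.items()
        }
        for parent, children in source.items()
    }


def sample_record(graph, rng, bias=-1.5):
    """Ancestral sampling: each variable is Bernoulli, logistic in its parents."""
    record = {}
    for child in VARIABLES:
        # Only variables sampled earlier (already in `record`) can be parents.
        logit = bias + sum(
            graph[parent].get(child, 0.0) * record[parent] for parent in record
        )
        prob = 1.0 / (1.0 + math.exp(-logit))
        record[child] = 1 if rng.random() < prob else 0
    return record


def generate_stream(n, drift_start, drift_end, seed=0):
    """Yield n records; the graph morphs gradually between the two drift points."""
    rng = random.Random(seed)
    for t in range(n):
        # alpha is clamped to [0, 1]: 0 before the drift, 1 after it.
        alpha = min(1.0, max(0.0, (t - drift_start) / (drift_end - drift_start)))
        yield sample_record(morph(SOURCE_GRAPH, TARGET_GRAPH, alpha), rng)


if __name__ == "__main__":
    stream = list(generate_stream(n=1000, drift_start=400, drift_end=600))
    print(stream[0], stream[-1])
```

In this sketch the drift is controlled by two parameters only, `drift_start` and `drift_end`: before the first, `B` depends on `A`; after the second, `C` depends on `A` instead; in between, the dependency shifts gradually. Setting the two points close together approximates an abrupt drift.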

Tangible Outcomes

  1. Implementation of the model presented in the paper “Modelling Concept Drift in Dynamic Data Streams for Recommender Systems”, available on GitHub: https://github.com/fsp22/mcd_dds4rs