The goal is to devise a data generation methodology that, given a data sample, can approximate the stochastic process that generated it. The methodology can be useful in many contexts where we need to share data while preserving user privacy.
Existing data generation approaches in the literature, based on Bayesian neural networks or hidden Markov models, are restricted to static and propositional data. We focus instead on time-evolving data and preference data.
We will study two aspects: (1) a generator that produces realistic data with the same properties as the original data, and (2) how to inject drift into the data generation process in a controlled manner. The idea is to model the stochastic process as a dependency graph among random variables, so that drift can be modeled simply by changing the structure of the underlying graph through a morphing process.
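To make the graph-morphing idea concrete, here is a minimal sketch. It is not the project's prototype: the binary variables, the toy conditional probability, the one-edge-per-step schedule, and all function names (`topological_order`, `sample_record`, `morph_schedule`, `generate_stream`) are assumptions chosen for illustration. Records are drawn by ancestral sampling from a dependency graph, and drift is injected by editing the edge set step by step from a source structure to a target structure.

```python
import random

def topological_order(nodes, edges):
    """Return nodes ordered so that every parent precedes its children."""
    parents = {n: {p for p, c in edges if c == n} for n in nodes}
    order, placed = [], set()
    while len(order) < len(nodes):
        ready = [n for n in nodes if n not in placed and parents[n] <= placed]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        for n in ready:
            order.append(n)
            placed.add(n)
    return order

def sample_record(nodes, edges, rng):
    """Ancestral sampling of one binary record from the dependency graph."""
    record = {}
    for n in topological_order(nodes, edges):
        active = [record[p] for p, c in edges if c == n]
        # Assumed toy conditional: P(n = 1) grows with the share of active parents.
        p_one = 0.2 + 0.6 * (sum(active) / len(active)) if active else 0.5
        record[n] = 1 if rng.random() < p_one else 0
    return record

def morph_schedule(source, target):
    """Edge sets leading from source to target, one edge edit per step."""
    current = set(source)
    steps = [frozenset(current)]
    for edge in sorted(source - target):   # first drop edges absent from the target
        current.discard(edge)
        steps.append(frozenset(current))
    for edge in sorted(target - source):   # then add the edges the target needs
        current.add(edge)
        steps.append(frozenset(current))
    return steps

def generate_stream(nodes, source, target, records_per_step, seed=0):
    """Emit (step, record) pairs while the graph morphs from source to target."""
    rng = random.Random(seed)
    stream = []
    for step, edges in enumerate(morph_schedule(source, target)):
        for _ in range(records_per_step):
            stream.append((step, sample_record(nodes, edges, rng)))
    return stream

# Hypothetical example: C initially depends on B and ends up depending on A.
nodes = ["A", "B", "C"]
source = {("A", "B"), ("B", "C")}
target = {("A", "B"), ("A", "C")}
stream = generate_stream(nodes, source, target, records_per_step=100)
```

Removing edges before adding them keeps every intermediate graph a subgraph of either the source or the target, so each one stays acyclic whenever both endpoints are. The number of records emitted per step controls how gradual or abrupt the drift appears in the stream.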
Output
1 Conference/Journal Paper
1 Prototype
Dataset Samples
Project Partners:
- INESC TEC, Joao Gama
- Universiteit Leiden (ULEI), Holger Hoos
- Consiglio Nazionale delle Ricerche (CNR), Giuseppe Manco
Primary Contact: Joao Gama, INESC TEC, University of Porto
Results Description
The goal is to devise a data generation methodology that, given a data sample, can approximate the stochastic process that generated it. The methodology can be useful in many contexts where we need to share data while preserving user privacy.
Existing data generation approaches in the literature, based on Bayesian neural networks or hidden Markov models, are restricted to static and propositional data. We focus instead on time-evolving data and preference data.
We study two aspects:
(1) a generator that produces realistic data with the same properties as the original data;
(2) how to inject drift into the data generation process in a controlled manner.
The idea is to model the stochastic process through a dependency graph among random variables, so that the drift can be simply modeled by changing the structure of the underlying graph through a morphing process.
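For aspect (1), the text does not say which properties of the original data should be preserved. As one possible reading, the sketch below assumes that per-field marginal distributions are among them and compares original and generated records using total variation distance. The function names and the dict-based record format are hypothetical, not taken from the project's prototype.

```python
from collections import Counter

def marginal(records, field):
    """Empirical distribution of one field over a list of dict records."""
    if not records:
        raise ValueError("need at least one record")
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {value: c / total for value, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def marginal_gaps(original, generated, fields):
    """Per-field distance: 0.0 means identical marginals, 1.0 means disjoint."""
    return {
        f: total_variation(marginal(original, f), marginal(generated, f))
        for f in fields
    }
```

Matching marginals is only a necessary condition for realism. A fuller check would also compare joint or temporal statistics, which this sketch leaves out.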
This MP fits the goals of WP1 and WP5.
Publications
Luciano Caroprese, Francesco Sergio Pisani, Bruno Veloso, Matthias König, Giuseppe Manco, Holger H. Hoos, and João Gama. Modelling Concept Drift in Dynamic Data Streams for Recommender Systems (under evaluation)
Links to Tangible Results
- Paper (under second revision)
- Software Prototype