Making sense of data is a central challenge in creating human-understandable descriptions of complex situations. When the data refer to process executions, techniques exist that discover explicit descriptions in the form of formal models. Many research works treat the discovery task as a one-class supervised learning problem. Work on deviance mining has nonetheless highlighted the need to characterise behaviours that exhibit certain characteristics and forbid others (e.g., the slower or less frequent ones), which calls for a binary supervised learning formulation.
In this microproject we focus on the discovery of declarative process models, expressed in Linear Temporal Logic (LTL), as a binary supervised learning task in which the input log reports both positive and negative behaviours. We therefore investigate how valuable information can be extracted and formalised into an “optimal” model according to user preferences (e.g., model generality or simplicity). By iteratively including further examples, the user can also refine the discovered models.
Output
Paper to be submitted to relevant journal
Machine learning tool
Artificial data set
Presentations
Project Partners:
- Fondazione Bruno Kessler (FBK), Chiara Ghidini
- Università di Bologna (UNIBO), Federico Chesani
Primary Contact: Chiara Ghidini, FBK
Main results of micro project:
The microproject has so far produced two main results:
– A two-step approach for the discovery of temporal-logic patterns as a binary supervised learning problem, that is, starting from a set of "positive" traces (execution traces whose behaviour we want to observe in the discovered patterns) and a set of "negative" traces (execution traces whose behaviour we do not want to observe in the discovered patterns). In the first step, the approach discovers sets of patterns (possible models) that accept all positive traces and discard as many of the negative ones as possible. In the second step, it selects, among the discovered models, the model(s) that optimise a given criterion, such as generality or simplicity.
– Two synthetic labelled ("positive" and "negative") event log datasets used for the synthetic evaluation of the proposed approach.
Contribution to the objectives of HumaneAI-net WPs
The results of the microproject mainly contribute to WP1 (Human-in-the-Loop Machine Learning, Reasoning and Planning). On the one hand, the microproject aims to leverage machine learning techniques (sub-symbolic learning) to provide LTL patterns (a symbolic representation) that describe a set of "positive" traces while excluding the "negative" ones (T1.1). On the other hand, it is a first step towards including the human in the loop of LTL pattern discovery, so that the patterns represent all and only the cases the human wants to represent (T1.3). The user could iteratively refine the discovered patterns to make sure that they include all the cases she is interested in and exclude all those she wants to leave out.
Tangible outputs
- Dataset: LoanApproval1 – Federico Chesani
https://drive.google.com/drive/folders/15BwG4PJq8iIMh9Sr9dpMXAYBY-qp7QDE?usp=sharing
- Dataset: LoanApproval2 – Federico Chesani
https://drive.google.com/drive/folders/1fcJ8itzdMbNOjEAeV6nUEeI5B6__aB_c?usp=sharing
- Program/code: Discovery Framework – Sergio Tessaris
https://zenodo.org/record/5158528
Results Description
The microproject has produced three main results:
– A two-step approach for the discovery of temporal-logic patterns as a binary supervised learning problem, that is, starting from a set of "positive" traces (execution traces whose behaviour we want to observe in the discovered patterns) and a set of "negative" traces (execution traces whose behaviour we do not want to observe in the discovered patterns). In the first step, the approach discovers sets of patterns (possible models) that accept all positive traces and discard as many of the negative ones as possible. In the second step, it selects, among the discovered models, the model(s) that optimise a given criterion, such as generality or simplicity.
– Two synthetic labelled ("positive" and "negative") event log datasets used for the synthetic evaluation of the proposed approach.
– A paper describing the approach, as well as the approach evaluation.
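The two-step idea can be illustrated with a small Python sketch. It is not the Discovery Framework's implementation: the two pattern templates (response and precedence), the exhaustive search over small constraint sets, the function names and the toy log are all assumptions made for illustration. Step 1 keeps the constraints satisfied by every positive trace and collects the constraint sets that reject the most negative traces; step 2 uses simplicity, read here as the fewest constraints, as the selection criterion.

```python
from itertools import combinations, permutations

# Toy labelled log (hypothetical, not taken from the LoanApproval datasets).
POSITIVE = [["apply", "check", "approve"], ["apply", "check", "reject"]]
NEGATIVE = [["apply", "approve"], ["check", "apply", "approve"]]


def response(a, b):
    """Pattern: every occurrence of a is eventually followed by b."""
    def holds(trace):
        return all(b in trace[i + 1:] for i, e in enumerate(trace) if e == a)
    return (f"response({a},{b})", holds)


def precedence(a, b):
    """Pattern: b occurs only if a occurred earlier in the trace."""
    def holds(trace):
        return all(a in trace[:i] for i, e in enumerate(trace) if e == b)
    return (f"precedence({a},{b})", holds)


def candidate_constraints(positive, negative):
    """Ground both templates on all activity pairs and keep the
    constraints that hold on every positive trace."""
    activities = sorted({e for t in positive + negative for e in t})
    grounded = [tpl(a, b) for tpl in (response, precedence)
                for a, b in permutations(activities, 2)]
    return [c for c in grounded if all(c[1](t) for t in positive)]


def count_rejected(model, negative):
    """A trace is rejected when it violates at least one constraint."""
    return sum(any(not holds(t) for _, holds in model) for t in negative)


def discover(positive, negative, max_size=2):
    """Step 1: constraint sets (up to max_size constraints) that accept
    all positive traces and reject as many negative traces as possible."""
    cands = candidate_constraints(positive, negative)
    best, models = -1, []
    for k in range(max_size + 1):  # k = 0 is the empty model
        for model in combinations(cands, k):
            n = count_rejected(model, negative)
            if n > best:
                best, models = n, [model]
            elif n == best:
                models.append(model)
    return models


def select_simplest(models):
    """Step 2: prefer the models with the fewest constraints."""
    smallest = min(len(m) for m in models)
    return [m for m in models if len(m) == smallest]


if __name__ == "__main__":
    simplest = select_simplest(discover(POSITIVE, NEGATIVE))
    print([[name for name, _ in m] for m in simplest])
    # prints [['response(apply,check)']]
```

On this toy log a single constraint, response(apply,check), already rejects both negative traces, so it is the only model returned under the simplicity criterion. A generality criterion would need a different second step, for instance one that compares the sets of traces each model accepts.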
Publications
Chesani, F., Francescomarino, C. D., Ghidini, C., Grundler, G., Loreti, D., Maggi, F. M., Mello, P., Montali, M., and Tessaris, S. (2022). Shape your process: Discovering declarative business processes from positive and negative traces taking into account user preferences. In Almeida, J. P. A., Karastoyanova, D., Guizzardi, G., Montali, M., Maggi, F. M., and Fonseca, C. M., editors, Enterprise Design, Operations, and Computing – 26th International Conference, EDOC 2022, Bozen-Bolzano, Italy, October 3-7, 2022, Proceedings, volume 13585 of Lecture Notes in Computer Science, pages 217–234. Springer.
Chesani, F., Francescomarino, C. D., Ghidini, C., Loreti, D., Maggi, F. M., Mello, P., Montali, M., and Tessaris, S. (2022). Process discovery on deviant traces and other stranger things. IEEE Transactions on Knowledge and Data Engineering, pages 1–17. DOI: https://doi.org/10.1109/TKDE.2022.3232207
Links to Tangible results
Loan Approval1: dataset. https://drive.google.com/drive/folders/15BwG4PJq8iIMh9Sr9dpMXAYBY-qp7QDE?usp=sharing
Loan Approval2: dataset. https://drive.google.com/drive/folders/1fcJ8itzdMbNOjEAeV6nUEeI5B6__aB_c?usp=sharing
Discovery Framework: program/code https://zenodo.org/record/5158528
Experiments: https://github.com/stessaris/negdis-experiments/tree/v1.0