Title Details
Buss, Alina ; Kecht, Christoph ; Kratsch, Wolfgang ; Röglinger, Maximilian ; Sadeghianasl, Sareh ; Wynn, Moe T.:
From Words to Workflows: Extracting Object-Centric Event Logs from Textual Data.
In: Intelligent Information Systems : CAiSE 2025 Forum and Doctoral Consortium ; Proceedings. - Cham : Springer, 2025. - pp. 37-44
ISBN 978-3-031-94590-8
DOI: https://doi.org/10.1007/978-3-031-94590-8_5
Abstract
Organizations generate vast amounts of data in unstructured formats, such as textual descriptions, which remain largely untapped for process mining. This data is particularly valuable because it often captures critical exception cases and intricate dependencies that are absent from structured datasets but crucial for understanding process deviations. Importantly, these unstructured sources frequently preserve the object-centric nature of real-world processes, information that is typically flattened or lost in traditional, case-centric event log formats. In this paper, we harness this potential and address this research gap by introducing a novel approach to extract Object-Centric Event Logs (OCELs) from unstructured textual descriptions using natural language processing techniques and large language models. Our approach consists of two subcomponents: a collector and a refiner. The collector aims to extract activities, timestamps, entities, and their properties from textual descriptions, while the refiner integrates, cleans, and refines the extracted information from multiple descriptions. We implement both subcomponents in heuristic and generative forms, creating four distinct extractor variants that are compared against each other on synthetic textual descriptions derived from six publicly available OCEL datasets. Our results reveal that a generative collector combined with a heuristic refiner exhibits the strongest generalization capabilities on unseen textual descriptions.
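The collector/refiner split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expressions, the `Event` class, the function names, the toy vocabulary (order, item, package; placed, picked, packed, shipped), and the simplified output dictionary are all assumptions made for illustration, and the output only loosely resembles an OCEL structure. Only a heuristic collector and a heuristic refiner are shown; a generative variant would presumably expose the same function signatures and delegate to a large language model.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    activity: str
    timestamp: str
    objects: tuple  # tuple of (object_type, object_id) pairs


# Hypothetical patterns for a toy order-handling domain (assumptions).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}")
ACTIVITY = re.compile(r"\b(placed|picked|packed|shipped)\b", re.IGNORECASE)
OBJECT = re.compile(r"\b(order|item|package)\s+([A-Za-z]?\d+)\b", re.IGNORECASE)


def heuristic_collector(text):
    """Extract one event per sentence that has a timestamp and an activity."""
    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        ts = TIMESTAMP.search(sentence)
        act = ACTIVITY.search(sentence)
        if not (ts and act):
            continue  # skip sentences without a recognizable event
        objs = tuple(sorted((t.lower(), i) for t, i in OBJECT.findall(sentence)))
        events.append(Event(act.group(1).lower(), ts.group(0), objs))
    return events


def heuristic_refiner(event_lists):
    """Merge events from several descriptions, drop duplicates, sort by time."""
    unique = {e for events in event_lists for e in events}
    ordered = sorted(unique, key=lambda e: (e.timestamp, e.activity))
    objects = sorted({o for e in ordered for o in e.objects})
    return {
        "events": [
            {
                "id": f"e{n}",
                "activity": e.activity,
                "timestamp": e.timestamp,
                "objects": [oid for _, oid in e.objects],
            }
            for n, e in enumerate(ordered, start=1)
        ],
        "objects": [{"id": oid, "type": otype} for otype, oid in objects],
    }


def extract(descriptions, collector=heuristic_collector, refiner=heuristic_refiner):
    """Compose any collector with any refiner into one extractor variant."""
    return refiner([collector(d) for d in descriptions])


if __name__ == "__main__":
    texts = [
        "At 2024-03-01T09:15 order o1 with item i1 was placed. "
        "At 2024-03-01T10:00 item i1 was picked.",
        "At 2024-03-01T10:00 item i1 was picked. "
        "At 2024-03-01T12:30 order o1 was shipped.",
    ]
    for event in extract(texts)["events"]:
        print(event)
```

Passing the collector and refiner as parameters to `extract` mirrors the idea of combining two interchangeable subcomponents into four extractor variants; in this sketch, the "picked" event that appears in both descriptions is kept only once after refinement.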