Process Mining Between the Lines : Extracting Object-Centric Event Logs From Textual Data

Title data

Buss, Alina ; Kecht, Christoph ; Kratsch, Wolfgang ; Röglinger, Maximilian ; Sadeghianasl, Sareh ; Wynn, Moe T.:
Process Mining Between the Lines : Extracting Object-Centric Event Logs From Textual Data.
In: Information Systems. Vol. 140 (2026). - 102713.
ISSN 0306-4379
DOI: https://doi.org/10.1016/j.is.2026.102713

Project information

Project financing: QUAPRO

Abstract

Organizations generate vast amounts of unstructured textual data – a valuable source of information that frequently remains underutilized for process mining. However, textual descriptions often record exceptions and manual activities absent from structured data and therefore enable a better understanding of deviations from the expected business process behavior. Importantly, unstructured sources typically retain the object-centric characteristics of real-world processes – information that gets flattened or lost in case-centric event logs. Yet existing approaches primarily target structured data sources or produce case-centric event logs. To address this gap, we present an automated approach to derive object-centric event logs directly from unstructured textual descriptions. The approach comprises two subcomponents: a collector that identifies events and objects (including their attributes and relationships), and a refiner that consolidates and cleans the extracted information. We instantiate each subcomponent in heuristic and generative implementations and create four pairwise combinations of collector and refiner instances to assess the effectiveness of heuristic natural language processing and generative artificial intelligence techniques. We compare these variants quantitatively and qualitatively in a controlled, artificial setting based on synthesized texts and demonstrate the practical utility on two naturally occurring corpora (fire status updates and a legal judgment). Our results show that the configurations with a generative collector achieve the highest extraction quality. In particular, the fully generative variant produces coherent and standardized event and object labels. Overall, this study fills a notable research gap by enabling the incorporation of textual information into process mining applications.
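
The collector/refiner split and the four pairwise combinations described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the regular expression, the dictionary layout, and the sample sentence are assumptions made purely for illustration, and the generative components are left as placeholders rather than tied to any particular language model.

```python
from itertools import product
import re


def heuristic_collector(text):
    """Toy rule-based collector (assumption): '<verb> <object type> <object id>' becomes an event."""
    events, objects = [], {}
    for verb, obj_type, obj_id in re.findall(r"(\w+) (order|invoice) (\w+)", text):
        events.append({"activity": verb, "objects": [obj_id]})
        objects[obj_id] = obj_type
    return {"events": events, "objects": objects}


def heuristic_refiner(log):
    """Toy refiner (assumption): lowercase activity labels and drop repeated mentions."""
    seen, events = set(), []
    for event in log["events"]:
        key = (event["activity"].lower(), tuple(event["objects"]))
        if key not in seen:
            seen.add(key)
            events.append({"activity": key[0], "objects": list(key[1])})
    return {"events": events, "objects": log["objects"]}


def generative_collector(text):
    # Hypothetical placeholder: a language-model call would go here.
    raise NotImplementedError("plug in a language model of your choice")


def generative_refiner(log):
    # Hypothetical placeholder: a language-model call would go here.
    raise NotImplementedError("plug in a language model of your choice")


COLLECTORS = {"heuristic": heuristic_collector, "generative": generative_collector}
REFINERS = {"heuristic": heuristic_refiner, "generative": generative_refiner}

# The four pairwise collector/refiner combinations mentioned in the abstract.
VARIANTS = list(product(COLLECTORS, REFINERS))


def extract(text, collector="heuristic", refiner="heuristic"):
    """Run one variant: collect events and objects, then refine them."""
    return REFINERS[refiner](COLLECTORS[collector](text))


if __name__ == "__main__":
    sample = "Anna Approves order o1. Bob sends invoice i7. Anna approves order o1."
    print(VARIANTS)
    print(extract(sample))
```

Keeping collectors and refiners behind the same two function signatures is one way to make the variants interchangeable, so that each combination can be evaluated on the same input texts.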

Further data

Item Type: Article in a journal
Refereed: Yes
Keywords: Process mining; Object-centric event logs; Natural language processing; Large language models; Generative artificial intelligence
Institutions of the University: Faculties > Faculty of Law, Business and Economics > Department of Business Administration
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Administration XVII - Information Systems and Value-Based Business Process Management
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Administration XVII - Information Systems and Value-Based Business Process Management > Chair Business Administration XVII - Information Systems and Value-Based Business Process Management - Univ.-Prof. Dr. Maximilian Röglinger
Research Institutions
Research Institutions > Affiliated Institutes
Research Institutions > Affiliated Institutes > Branch Business and Information Systems Engineering of Fraunhofer FIT
Research Institutions > Affiliated Institutes > FIM Research Center for Information Management
Result of work at the UBT: Yes
DDC Subjects: 000 Computer Science, information, general works > 004 Computer science
300 Social sciences > 330 Economics
Date Deposited: 07 Apr 2026 12:34
Last Modified: 07 Apr 2026 12:34
URI: https://eref.uni-bayreuth.de/id/eprint/96604