ERef Bayreuth

Data-Centric Fine-Tuning of Small Language Models for Automatic Extraction of Technical Requirements

Title details

Müller, Leopold ; Schwarz, Nina ; Böcking, Lars ; Bereczuk, Andreas ; Stagge, Hanno ; Kratsch, Wolfgang ; Kühl, Niklas:
Data-Centric Fine-Tuning of Small Language Models for Automatic Extraction of Technical Requirements.
In: IEEE Access. Vol. 13 (2025), pp. 135301-135315.
ISSN 2169-3536
DOI: https://doi.org/10.1109/ACCESS.2025.3591739

Abstract

Training small language models for specific tasks often encounters a significant challenge: the limited availability of high-quality labeled data, which can restrict model performance. This constraint is especially critical in organizations handling sensitive data, where large language models cannot be readily deployed due to privacy concerns, high costs, and dependency on external providers. While existing research demonstrates the effectiveness of large language models in automating text-based tasks, there is a lack of methods tailored to fine-tuning small language models for secure, domain-specific applications with minimal labeled data. To address this gap, we introduce a data-centric fine-tuning method that systematically enhances training data through prompt engineering, making small language model fine-tuning feasible and effective in data-constrained environments. This study evaluates the proposed method on a real-world use case with an industry partner, focusing on the automatic extraction of technical requirements from full-text documents and using both quantitative metrics and qualitative expert assessments. Our findings reveal that the fine-tuned small language model achieves accuracy and consistency comparable to human service providers, while outperforming baseline models, including GPT-4-turbo, on key evaluation metrics. These results underscore the potential of data-centric fine-tuning for adapting small language models to high-stakes, privacy-sensitive tasks, offering a scalable alternative to large language models in real-world use cases.

Further details

Publication type:	Journal article
Peer-reviewed contribution:	Yes
Keywords:	data-centric AI; fine-tuning; large language models; requirement extraction
Institutions of the University:	Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair of Information Systems and Human-Centric Artificial Intelligence > Chair of Information Systems and Human-Centric Artificial Intelligence - Univ.-Prof. Dr.-Ing. Niklas Kühl
	Research institutions > Institutes affiliated with the University > Institutsteil Wirtschaftsinformatik des Fraunhofer FIT
	Research institutions > Institutes affiliated with the University > FIM Forschungsinstitut für Informationsmanagement
Title originated at UBT:	Yes
DDC subject areas:	000 Computer science, information and general works > 004 Computer science; 300 Social sciences > 330 Economics
Deposited on:	01 Sep 2025 08:08
Last modified:	01 Sep 2025 08:08
URI:	https://eref.uni-bayreuth.de/id/eprint/94553