Data-Centric Fine-Tuning of Small Language Models for Automatic Extraction of Technical Requirements

Title data

Müller, Leopold ; Schwarz, Nina ; Böcking, Lars ; Bereczuk, Andreas ; Stagge, Hanno ; Kratsch, Wolfgang ; Kühl, Niklas:
Data-Centric Fine-Tuning of Small Language Models for Automatic Extraction of Technical Requirements.
In: IEEE Access. Vol. 13 (2025), pp. 135301-135315.
ISSN 2169-3536
DOI: https://doi.org/10.1109/ACCESS.2025.3591739

Abstract

Training small language models for specific tasks often encounters a significant challenge: the limited availability of high-quality labeled data, which can restrict model performance. This constraint is especially critical in organizations handling sensitive data, where large language models cannot be readily deployed due to privacy concerns, high costs, and dependency on external providers. While existing research demonstrates the effectiveness of large language models in automating text-based tasks, there is a lack of methods tailored to fine-tuning small language models for secure, domain-specific applications with minimal labeled data. To address this gap, we introduce a data-centric fine-tuning method that systematically enhances training data through prompt engineering, making small language model fine-tuning feasible and effective in data-constrained environments. This study evaluates the proposed method on a real-world use case with an industry partner, focusing on the automatic extraction of technical requirements from full-text documents and using both quantitative metrics and qualitative expert assessments. Our findings reveal that the fine-tuned small language model achieves accuracy and consistency comparable to human service providers, while outperforming baseline models, including GPT-4-turbo, on key evaluation metrics. These results underscore the potential of data-centric fine-tuning for adapting small language models to high-stakes, privacy-sensitive tasks, offering a scalable alternative to LLMs and enabling the application of small language models to real-world use cases.

Further data

Item Type: Article in a journal
Refereed: Yes
Institutions of the University: Faculties
Faculties > Faculty of Law, Business and Economics
Faculties > Faculty of Law, Business and Economics > Department of Business Administration
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Informatics and Human-Centered Artificial Intelligence
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Informatics and Human-Centered Artificial Intelligence > Chair Business Informatics and Human-Centered Artificial Intelligence - Univ.-Prof. Dr.-Ing. Niklas Kühl
Research Institutions
Research Institutions > Central research institutes
Research Institutions > Central research institutes > Research Center for AI in Science and Society
Research Institutions > Affiliated Institutes
Research Institutions > Affiliated Institutes > Branch Business and Information Systems Engineering of Fraunhofer FIT
Research Institutions > Affiliated Institutes > FIM Research Center for Information Management
Result of work at the UBT: Yes
DDC Subjects: 000 Computer Science, information, general works > 004 Computer science
600 Technology, medicine, applied sciences > 620 Engineering
Date Deposited: 28 Aug 2025 05:51
Last Modified: 06 Nov 2025 11:08
URI: https://eref.uni-bayreuth.de/id/eprint/94546