Honey, I Shrunk the Language Model : Impact of Knowledge Distillation Methods on Performance and Explainability

Title data

Hendriks, Daniel ; Spitzer, Philipp ; Kühl, Niklas ; Satzger, Gerhard:
Honey, I Shrunk the Language Model : Impact of Knowledge Distillation Methods on Performance and Explainability.
In: IEEE Transactions on Knowledge and Data Engineering. (April 2026).
ISSN 1558-2191
DOI: https://doi.org/10.1109/TKDE.2026.3671872

Official URL: Full text

Abstract in another language

Artificial Intelligence (AI) has increasingly influenced modern society, recently in particular through significant advancements in Large Language Models (LLMs). However, high computational and storage demands of LLMs still limit their deployment in resource-constrained environments. Knowledge distillation addresses this challenge by training a small student model from a larger teacher model. Previous research has introduced several distillation methods for both generating training data and training the student model. Despite their relevance, the effects of state-of-the-art distillation methods on model performance and explainability have not been thoroughly investigated and compared. In this work, we enlarge the set of available methods by applying critique-revision prompting to distillation for data generation and by synthesizing existing training methods. We systematically compare the distillation methods on the widely used Commonsense Question-Answering (CQA), Extended Stanford Natural Language Inference (ESNLI), and StrategyQA datasets. While we measure performance via student model accuracy, we employ a human-grounded study to evaluate explainability. We contribute new distillation methods and their comparison in terms of both performance and explainability. This should further advance the distillation of small language models and, thus, contribute to broader applicability and faster diffusion of language models.

Further data

Item Type: Article in a journal
Refereed: Yes
Keywords: Training; Data Models; Training Data; Computational Modeling; Cognition; Adaptation Models; Question Answering (Information Retrieval)
Institutions of the University: Faculties > Faculty of Law, Business and Economics > Department of Business Administration
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Informatics and Human-Centered Artificial Intelligence
Faculties > Faculty of Law, Business and Economics > Department of Business Administration > Chair Business Informatics and Human-Centered Artificial Intelligence > Chair Business Informatics and Human-Centered Artificial Intelligence - Univ.-Prof. Dr.-Ing. Niklas Kühl
Research Institutions
Research Institutions > Affiliated Institutes
Research Institutions > Affiliated Institutes > FIM Research Center for Information Management
Result of work at the UBT: Yes
DDC Subjects: 000 Computer Science, information, general works > 004 Computer science
300 Social sciences > 330 Economics
Date Deposited: 24 Apr 2026 05:48
Last Modified: 24 Apr 2026 05:48
URI: https://eref.uni-bayreuth.de/id/eprint/96902