Stratify or Die: Rethinking Data Splits in Image Segmentation

Title Information

Jami, Naga Venkata Sai Jitin ; Altstidl, Thomas ; Mueller, Jonas ; Li, Jindong ; Zanca, Dario ; Eskofier, Björn ; Leutheuser, Heike:
Stratify or Die: Rethinking Data Splits in Image Segmentation.
In: Advances in Neural Information Processing Systems 38: NeurIPS 2025. - Cambridge, Mass., 2026. - pp. 72945-72974
DOI: https://doi.org/10.48550/arXiv.2509.21056

Project Information

Project funding: Bayerische Forschungsstiftung

Abstract

Random splitting of datasets in image segmentation often leads to unrepresentative test sets, resulting in biased evaluations and poor model generalization. While stratified sampling has proven effective for addressing label distribution imbalance in classification tasks, extending these ideas to segmentation remains challenging due to the multi-label structure and class imbalance typically present in such data. Building on existing stratification concepts, we introduce Iterative Pixel Stratification (IPS), a straightforward, label-aware sampling method tailored for segmentation tasks. Additionally, we present Wasserstein-Driven Evolutionary Stratification (WDES), a novel genetic algorithm designed to minimize the Wasserstein distance, thereby optimizing the similarity of label distributions across dataset splits. We prove that WDES is globally optimal given enough generations. Using newly proposed statistical heterogeneity metrics, we evaluate both methods against random sampling and find that WDES consistently produces more representative splits. Applying WDES across diverse segmentation tasks, including street scenes, medical imaging, and satellite imagery, leads to lower performance variance and improved model evaluation. Our results also highlight the particular value of WDES in handling small, imbalanced, and low-diversity datasets, where conventional splitting strategies are most prone to bias.
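
The idea of searching for a split whose label distributions match can be illustrated with a small sketch. This is not the authors' WDES implementation: it is a simplified, mutation-only evolutionary search with elitist acceptance, written under the assumption that each image is summarized by its per-class pixel fractions. The names `wasserstein_1d`, `split_score`, `evolve_split`, and `toy_fractions`, the averaging over classes, and the synthetic data are all hypothetical choices made for illustration.

```python
import random
from bisect import bisect_right


def wasserstein_1d(u, v):
    """1-D Wasserstein-1 distance between two empirical samples."""
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(u_sorted + v_sorted)
    dist = 0.0
    # Integrate |CDF_u - CDF_v| over each interval between adjacent values.
    for left, right in zip(grid, grid[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        dist += abs(cdf_u - cdf_v) * (right - left)
    return dist


def split_score(fractions, test_idx):
    """Mean per-class distance between train and test pixel fractions."""
    test_set = set(test_idx)
    train = [f for i, f in enumerate(fractions) if i not in test_set]
    test = [fractions[i] for i in test_idx]
    n_classes = len(fractions[0])
    total = sum(
        wasserstein_1d([f[c] for f in train], [f[c] for f in test])
        for c in range(n_classes)
    )
    return total / n_classes


def evolve_split(fractions, test_size, generations=500, seed=0):
    """Assumed simplification: mutate one test member, keep only improvements."""
    rng = random.Random(seed)
    n = len(fractions)
    best = rng.sample(range(n), test_size)  # random split as starting point
    start_score = best_score = split_score(fractions, best)
    for _ in range(generations):
        candidate = list(best)
        in_test = set(candidate)
        outside = [i for i in range(n) if i not in in_test]
        # Mutation: swap one test image for one training image.
        candidate[rng.randrange(test_size)] = rng.choice(outside)
        score = split_score(fractions, candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, start_score, best_score


def toy_fractions(n_images=40, seed=1):
    """Synthetic, imbalanced per-image class pixel fractions (3 classes)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_images):
        raw = [rng.random(), rng.random() ** 3, rng.random() ** 6]
        total = sum(raw)
        rows.append([x / total for x in raw])
    return rows


FRACTIONS = toy_fractions()
TEST_IDX, START, FINAL = evolve_split(FRACTIONS, test_size=8)
print(f"random split score: {START:.4f}, after search: {FINAL:.4f}")
```

Because a candidate is accepted only when it lowers the score, the final split is never worse than the random starting split under this metric. A full genetic algorithm would additionally maintain a population and use crossover, which this sketch omits.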

Further Information

Publication type: Article in a book
Refereed: No
Institutions of the University: Faculties > Faculty of Mathematics, Physics and Computer Science > Institute of Computer Science > Chair of Machine Learning in Medicine
Faculties > Faculty of Mathematics, Physics and Computer Science > Institute of Computer Science > Chair of Machine Learning in Medicine > Chair of Machine Learning in Medicine - Univ.-Prof. Dr. Heike Leutheuser
Result of work at the UBT: Yes
DDC Subjects: 000 Computer science, information and general works > 004 Computer science
Date Deposited: 21 May 2026 10:50
Last Modified: 21 May 2026 10:50
URI: https://eref.uni-bayreuth.de/id/eprint/97285