Result: Conformal Prediction for Long-Tailed Classification

Title:
Conformal Prediction for Long-Tailed Classification
Contributors:
University of California [Berkeley] (UC Berkeley), University of California (UC), Sciences environnementales guidées par les données (IROKO), Centre Inria d'Université Côte d'Azur, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Perpignan Via Domitia (UPVD)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Université de Montpellier Paul-Valéry (UMPV)-Université de Perpignan Via Domitia (UPVD)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Université de Montpellier Paul-Valéry (UMPV)-Institut Montpelliérain Alexander Grothendieck (IMAG), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), ANR-20-CHIA-0001,CAMELOT,Apprentissage automatique et optimisation coopératifs.(2020)
Publisher Information:
CCSD, 2025.
Publication Year:
2025
Collection:
collection:CNRS
collection:INRIA
collection:UNIV-MONTP3
collection:UNIV-PERP
collection:INRIA-SOPHIA
collection:I3M_UMR5149
collection:INSMI
collection:INRIASO
collection:INRIA_TEST
collection:INRIA34
collection:TESTALAIN1
collection:LIRMM
collection:IMAG-MONTPELLIER
collection:INRIA2
collection:UNIV-MONTPELLIER
collection:UNIV-COTEDAZUR
collection:ANR
collection:UPVM-TI
collection:UM-2015-2021
collection:UM-EPE
collection:INRIA-ETATSUNIS
collection:IROKO
collection:ANR-IA-20
collection:ANR-IA
Original Identifier:
ARXIV: 2507.06867
HAL: hal-05157207
Document Type:
E-resource preprint; Preprints; Working Papers
Language:
English
Relation:
info:eu-repo/semantics/altIdentifier/arxiv/2507.06867
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edshal.hal.05157207v2
Database:
HAL

Further Information

Many real-world classification problems, such as plant identification, have extremely long-tailed class distributions. For prediction sets to be useful in such settings, they should (i) provide good class-conditional coverage, ensuring that rare classes are not systematically omitted from the prediction sets, and (ii) be of reasonable size, allowing users to easily verify candidate labels. Unfortunately, existing conformal prediction methods, when applied to the long-tailed setting, force practitioners to make a binary choice between small sets with poor class-conditional coverage and extremely large sets with very good class-conditional coverage. We propose methods with guaranteed marginal coverage that smoothly trade off between set size and class-conditional coverage. First, we propose a conformal score function, prevalence-adjusted softmax, that targets a relaxed notion of class-conditional coverage called macro-coverage. Second, we propose a label-weighted conformal prediction method that allows us to interpolate between marginal and class-conditional conformal prediction. We demonstrate our methods on Pl@ntNet and iNaturalist, two long-tailed image datasets with 1,081 and 8,142 classes, respectively.
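
The two extremes that the abstract contrasts, marginal and class-conditional conformal prediction, can be illustrated with a minimal sketch of the standard split-conformal baselines. This is not the authors' prevalence-adjusted softmax score or their label-weighted method, neither of which is specified in this record. The sketch assumes the common score 1 - softmax probability, all function names are hypothetical, and macro-coverage is taken here to mean the unweighted average of per-class coverage, which is an assumption.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    if n == 0:
        return np.inf  # no calibration data: every label must be included
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

def true_label_scores(cal_probs, cal_labels):
    """Assumed score: 1 - softmax probability assigned to the true label."""
    return 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]

def marginal_threshold(cal_probs, cal_labels, alpha):
    """One threshold shared by all classes (marginal coverage)."""
    return conformal_quantile(true_label_scores(cal_probs, cal_labels), alpha)

def classwise_thresholds(cal_probs, cal_labels, alpha):
    """One threshold per class (class-conditional coverage).
    Rare classes with little calibration data get an infinite threshold,
    which is what makes the resulting sets very large."""
    scores = true_label_scores(cal_probs, cal_labels)
    n_classes = cal_probs.shape[1]
    return np.array([conformal_quantile(scores[cal_labels == k], alpha)
                     for k in range(n_classes)])

def prediction_sets(probs, thresholds):
    """Boolean matrix (n_samples, n_classes); thresholds may be a scalar
    or a per-class vector (broadcast across rows)."""
    return (1.0 - probs) <= thresholds

def macro_coverage(sets, labels):
    """Unweighted mean of per-class coverage over classes present in labels."""
    covered = sets[np.arange(len(labels)), labels]
    return float(np.mean([covered[labels == k].mean()
                          for k in np.unique(labels)]))
```

With a scalar threshold the sets stay small, but rare classes can be missed. With per-class thresholds, any class that has too few calibration examples is always included. That is the binary choice the abstract describes.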