Conference paper, Year: 2022

Crowdsourcing label noise simulation on image classification tasks

Abstract

It is common to collect labelled datasets using crowdsourcing. Yet label quality depends heavily on task difficulty and on the workers' abilities. With such datasets, the lack of ground truth makes it hard to assess annotation quality. There are few open-access crowdsourced datasets, and even fewer that provide both tasks of heterogeneous difficulty and all workers' answers before aggregation. We propose a new crowdsourcing simulation framework with quality control. It allows us to empirically evaluate different learning strategies on the obtained labels. Our goal is to separate different sources of noise: workers that provide no information on the true label versus poorly performing workers that remain useful on easy tasks.
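To illustrate the kind of simulation the abstract describes, the Python sketch below generates answers from two worker archetypes: a spammer whose answers carry no information about the true label, and ability-limited workers whose accuracy degrades with task difficulty. This is a hypothetical sketch, not the authors' framework; the worker names, the linear difficulty model, and the majority-vote baseline are illustrative assumptions.

import numpy as np

# Hypothetical sketch, not the authors' code: simulate two sources of
# crowdsourcing noise on a K-class labelling task.
rng = np.random.default_rng(0)
K = 10                                  # number of classes
n_tasks = 1000                          # tasks to annotate
y_true = rng.integers(K, size=n_tasks)  # ground-truth labels
difficulty = rng.uniform(size=n_tasks)  # assumed scale: 0 = easy, 1 = hard

def spammer(y, rng):
    # Uninformative worker: uniform random answers, independent of the task.
    return rng.integers(K, size=y.shape)

def limited_worker(y, ability, rng):
    # Poorly performing but honest worker: the probability of answering
    # correctly decreases as task difficulty exceeds the worker's ability
    # (assumed model; an incorrect draw may still hit the true label).
    p_correct = np.clip(0.5 + ability - difficulty, 0.05, 0.95)
    is_correct = rng.random(y.shape) < p_correct
    return np.where(is_correct, y, rng.integers(K, size=y.shape))

# One spammer plus three workers of increasing ability.
answers = np.stack(
    [spammer(y_true, rng)]
    + [limited_worker(y_true, a, rng) for a in (0.3, 0.5, 0.7)]
)

# Majority vote across workers as a simple aggregation baseline.
votes = np.apply_along_axis(np.bincount, 0, answers, minlength=K)
y_hat = votes.argmax(axis=0)
print("majority-vote accuracy:", (y_hat == y_true).mean())

Under such a simulation, both archetypes lower majority-vote accuracy, but only the ability-limited workers contribute signal on easy tasks, which is precisely the separation between noise sources that the framework aims to study.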
Main file: jds22.pdf (1.03 MB)
Origin: files produced by the author(s)

Dates and versions

hal-04562503, version 1 (29-04-2024)

Identifiers

  • HAL Id: hal-04562503, version 1

Cite

Tanguy Lefort, Benjamin Charlier, Alexis Joly, Joseph Salmon. Crowdsourcing label noise simulation on image classification tasks. JDS 2022 - 53es Journées de Statistique, Jun 2022, Lyon, France. ⟨hal-04562503⟩