Abstract
This paper proposes a methodology for identifying data samples that are likely to be mislabeled in a c-class classification dataset. The methodology relies on the assumption that the generalization error of a model learned from the data decreases when the label of a mislabeled sample is changed to its correct class. The classification model used in the paper is OP-ELM, which also provides a fast way to estimate the generalization error through PRESS Leave-One-Out. The methodology is tested on two toy datasets and on real-life datasets, for one of which expert knowledge about the identified potential mislabels was sought.
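The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's OP-ELM implementation: as a stand-in, it assumes a plain random-feature ELM with a small ridge term, and the names `hat_matrix`, `press_mse`, `flag_mislabel_candidates`, `n_hidden`, and `ridge` are hypothetical. Because the hat matrix does not depend on the labels, the PRESS Leave-One-Out error can be re-evaluated cheaply for every trial relabeling; a sample is flagged when some alternative label lowers that error.

```python
import numpy as np


def hat_matrix(X, n_hidden=20, ridge=1e-3, seed=0):
    """Hat matrix of a random-feature ELM (an assumed stand-in for OP-ELM)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    H = np.hstack([H, np.ones((len(X), 1))])     # constant (bias) column
    # Ridge-regularised projection: P = H (H^T H + ridge * I)^-1 H^T
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return H @ np.linalg.solve(A, H.T)


def press_mse(P, Y):
    """PRESS leave-one-out MSE: training residuals divided by (1 - P_ii)."""
    residuals = Y - P @ Y
    loo_residuals = residuals / (1.0 - np.diag(P))[:, None]
    return float(np.mean(loo_residuals ** 2))


def flag_mislabel_candidates(X, labels, n_classes, **elm_kwargs):
    """Return (index, proposed_class, error_reduction), largest reduction first.

    A sample is listed when relabeling it to another class lowers the
    PRESS leave-one-out error of the model.
    """
    P = hat_matrix(X, **elm_kwargs)
    one_hot = np.eye(n_classes)
    Y = one_hot[labels]                          # one-hot targets
    base_error = press_mse(P, Y)
    candidates = []
    for i in range(len(labels)):
        for c in range(n_classes):
            if c == labels[i]:
                continue
            Y_trial = Y.copy()
            Y_trial[i] = one_hot[c]              # trial relabeling of sample i
            error = press_mse(P, Y_trial)
            if error < base_error:
                candidates.append((i, c, base_error - error))
    return sorted(candidates, key=lambda t: -t[2])


if __name__ == "__main__":
    # Toy data: two separated Gaussian blobs with one deliberately flipped label.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2, 0.5, size=(30, 2)),
                   rng.normal(2, 0.5, size=(30, 2))])
    labels = np.array([0] * 30 + [1] * 30)
    labels[5] = 1                                # inject a mislabel
    for index, new_class, gain in flag_mislabel_candidates(X, labels, 2):
        print(index, new_class, gain)
```

This sketch tests one relabeling at a time against a fixed baseline; the published methodology may search, rank, or iterate differently, and flagged samples are only candidates for review.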
| Original language | English |
|---|---|
| Peer-reviewed scientific journal | Neurocomputing |
| Volume | 159 |
| Issue | July |
| Pages (from-to) | 242-250 |
| Number of pages | 9 |
| ISSN | 0925-2312 |
| DOI | |
| Status | Published - 14.02.2015 |
| MoE publication type | A1 Original article in a scientific journal |
Keywords
- 512 Business and management