This paper proposes a methodology for identifying samples that are likely to be mislabeled in a c-class classification dataset. The methodology rests on the assumption that the generalization error of a model learned from the data decreases when the label of a mislabeled sample is changed to its correct class. The classification model used in the paper is OP-ELM (Optimally Pruned Extreme Learning Machine), which also provides a fast estimate of the generalization error through the PRESS Leave-One-Out statistic. The methodology is tested on two toy datasets and on real-life datasets; for one of the latter, expert knowledge about the identified potential mislabels was sought.
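The core assumption can be illustrated with a minimal sketch, written under several assumptions that are not taken from the paper. It uses a plain ELM (fixed random tanh hidden layer with a least-squares output layer) rather than OP-ELM, so there is no neuron ranking or pruning. It also checks every single-label change exhaustively, which is a simplification chosen for illustration and may differ from the search procedure the paper actually uses. The names `press_mse`, `flag_candidates`, `one_hot` and `random_hidden_layer` are hypothetical.

```python
import numpy as np


def press_mse(H, T):
    """Leave-one-out MSE of the least-squares fit T ~ H @ beta, via PRESS.

    Exact when H has full column rank; no model is refitted per sample.
    """
    P = np.linalg.pinv(H)                      # pseudo-inverse, shape (L, N)
    hat_diag = np.einsum("ij,ji->i", H, P)     # diagonal of the hat matrix H @ P
    resid = T - H @ (P @ T)                    # ordinary training residuals
    loo_resid = resid / (1.0 - hat_diag)[:, None]
    return float(np.mean(loo_resid ** 2))


def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot target rows."""
    return np.eye(n_classes)[labels]


def flag_candidates(H, labels, n_classes, tol=1e-12):
    """Return (sample index, alternative label) pairs whose single change lowers PRESS."""
    labels = np.asarray(labels)
    baseline = press_mse(H, one_hot(labels, n_classes))
    flagged = []
    for i in range(len(labels)):
        for k in range(n_classes):
            if k == labels[i]:
                continue
            trial = labels.copy()
            trial[i] = k                       # change one label only
            if press_mse(H, one_hot(trial, n_classes)) < baseline - tol:
                flagged.append((int(i), int(k)))
    return flagged


def random_hidden_layer(X, n_neurons, rng):
    """Assumed plain-ELM features: a bias column plus fixed random tanh neurons."""
    W = rng.normal(size=(X.shape[1], n_neurons))
    b = rng.normal(size=n_neurons)
    return np.column_stack([np.ones(len(X)), np.tanh(X @ W + b)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data (made up for this sketch): two Gaussian blobs, five labels flipped.
    X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
    y = np.repeat([0, 1], 100)
    flipped = rng.choice(len(y), size=5, replace=False)
    y[flipped] = 1 - y[flipped]
    H = random_hidden_layer(X, 20, rng)
    print("deliberately flipped:", sorted(flipped.tolist()))
    print("flagged candidates:  ", flag_candidates(H, y, 2))
```

Because the hidden-layer matrix does not depend on the labels, the pseudo-inverse and hat-matrix diagonal could be computed once and reused across all trial label changes; the sketch recomputes them for readability.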
- 512 Business and Management
- Extreme Learning Machine
Akusok, A., Veganzones, D., Miche, Y., Björk, K-M., du Jardin, P., Severin, E., & Lendasse, A. (2015). MD-ELM: Originally Mislabeled Samples Detection using OP-ELM Model. Neurocomputing, 159(July), 242-250. https://doi.org/10.1016/j.neucom.2015.01.055