Abstract
In this paper, we examine the general regression problem when some of the data are missing. To provide reliable estimates of the regression function (approximation), we develop a novel methodology based on a Gaussian Mixture Model and the Extreme Learning Machine. The Gaussian Mixture Model, adapted to handle missing values, models the data distribution, while the Extreme Learning Machine makes it possible to devise a multiple imputation strategy for the final estimation. With multiple imputation and an ensemble of many Extreme Learning Machines, the final estimate improves on the one obtained by completing the data once with mean imputation. The proposed methodology has longer running times than simple methods, but the overall increase in accuracy justifies this trade-off.
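The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: a single Gaussian, estimated from complete rows only, stands in for the Gaussian Mixture Model fitted with missing values; test inputs are assumed complete; and all function names (`sample_missing`, `train_elm`, `fit_ensemble`, `predict_ensemble`) are hypothetical. Missing entries are drawn from the conditional distribution given the observed entries, one Extreme Learning Machine is trained per imputed data set, and the predictions are averaged.

```python
import numpy as np

def sample_missing(x, mu, cov, rng):
    """Draw the NaN entries of x from the Gaussian conditional on the observed entries."""
    m = np.isnan(x)
    if not m.any():
        return x.copy()
    o = ~m
    filled = x.copy()
    if not o.any():  # nothing observed: sample from the marginal
        filled[m] = rng.multivariate_normal(mu, cov)
        return filled
    S_oo = cov[np.ix_(o, o)]
    S_mo = cov[np.ix_(m, o)]
    S_mm = cov[np.ix_(m, m)]
    K = np.linalg.solve(S_oo, S_mo.T).T           # S_mo @ inv(S_oo)
    cond_mu = mu[m] + K @ (x[o] - mu[o])
    cond_cov = S_mm - K @ S_mo.T
    cond_cov = (cond_cov + cond_cov.T) / 2        # enforce symmetry numerically
    filled[m] = rng.multivariate_normal(cond_mu, cond_cov)
    return filled

def train_elm(X, y, n_hidden, rng):
    """ELM: random fixed hidden layer, output weights by least squares."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def fit_ensemble(X, y, n_imputations=10, n_hidden=20, seed=0):
    """Train one ELM per stochastic imputation of X (NaN marks a missing value)."""
    rng = np.random.default_rng(seed)
    # Assumption: a single Gaussian from complete rows replaces the paper's GMM.
    complete = X[~np.isnan(X).any(axis=1)]
    mu = complete.mean(axis=0)
    cov = np.cov(complete, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    models = []
    for _ in range(n_imputations):
        X_imp = np.array([sample_missing(x, mu, cov, rng) for x in X])
        models.append(train_elm(X_imp, y, n_hidden, rng))
    return models

def predict_ensemble(models, X):
    """Average the predictions of all ELMs (X is assumed complete here)."""
    return np.mean([predict_elm(m, X) for m in models], axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = X[:, 0] + 0.5 * X[:, 1]
    X_miss = X.copy()
    X_miss[rng.random(X.shape) < 0.1] = np.nan    # roughly 10% of entries missing
    models = fit_ensemble(X_miss, y)
    print(predict_ensemble(models, X[:5]))
```

Sampling from the conditional distribution, rather than filling in its mean, is what lets the imputed data sets differ from one another, so that averaging over the ensemble reflects the uncertainty about the missing values.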
Original language | English |
---|---|
Peer-reviewed scientific journal | Neurocomputing |
Volume | 174, Part A |
Issue number | January |
Pages (from-to) | 220-231 |
Number of pages | 12 |
ISSN | 0925-2312 |
DOIs | |
Publication status | Published - 22.01.2016 |
MoE publication type | A1 Journal article - refereed |
Keywords
- 512 Business and Management
- Extreme Learning Machine
- Missing data
- Multiple imputation
- Gaussian mixture model
- Mixture of Gaussians
- Conditional distribution