Emerging Missing Data Estimation Problems: Heteroskedasticity; Dynamic Programming and Impact of Missing Data


Author(s): Tshilidzi Marwala (University of the Witwatersrand, South Africa)
Copyright: 2009
Pages: 26
Source title: Computational Intelligence for Missing Data Imputation, Estimation, and Management: Knowledge Optimization Techniques
Source Author(s)/Editor(s): Tshilidzi Marwala (University of the Witwatersrand, South Africa)
DOI: 10.4018/978-1-60566-336-4.ch013

Keywords: Artificial Intelligence / Computational Intelligence / Computer Science & IT / Information Science Reference

Abstract

This chapter is divided into three parts. The first part presents a computational intelligence approach for predicting missing data in the presence of concept drift using an ensemble of multi-layered feed-forward neural networks. An algorithm that detects concept drift by measuring heteroskedasticity is proposed. The six instances preceding each missing value are used to approximate it. The algorithm is applied to simulated time series data sets resembling non-stationary data from a sensor. The results show that predicting missing data in non-stationary time series is possible but remains a challenge. The second part presents an algorithm that uses dynamic programming and neural networks to solve the problem of missing data imputation. A model that uses autoassociative neural networks and genetic algorithms serves as the basis; however, the neural networks are not trained on the entire data set. Instead, the data are broken up into granules and a separate model is created for each. The models are tested on a real data set, and the results show that the proposed method is effective in missing data estimation. The third part studies the impact of missing data estimation on fault classification in mechanical systems. The fault classification task is implemented using the extension network as well as Gaussian mixture models. When the imputed values are used with the extension network, a fault classification accuracy of 95% is observed for single-missing-entry cases and 92% for two-missing-entry cases, while the complete data set gives a classification accuracy of 97%. The Gaussian mixture model, by comparison, gives 94% for single-missing-entry cases and 92% for two-missing-entry cases, while the complete data set gives a classification accuracy of 96%.
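
As a rough illustration of the first part, the sketch below shows one way a change in variance (a simple heteroskedasticity proxy) could flag concept drift, and how the six values preceding a gap could feed an estimator. It is not the chapter's algorithm: the variance-ratio test, the threshold of 4.0, every function name, and the window-mean predictor are assumptions made for illustration only. In particular, the window mean is a stand-in for the ensemble of feed-forward neural networks described in the abstract.

```python
from statistics import fmean, pvariance

WINDOW = 6  # the abstract uses six instances prior to the missing value


def observed(series, start, stop):
    """Return series[start:stop], rejecting out-of-range or missing entries."""
    if start < 0 or stop > len(series):
        raise ValueError("not enough history before the missing value")
    window = series[start:stop]
    if any(v is None for v in window):
        raise ValueError("history window contains missing values")
    return window


def variance_ratio(series, idx, width=WINDOW):
    """Hypothetical heteroskedasticity proxy: variance of the window just
    before idx divided by the variance of the window preceding that one."""
    recent = pvariance(observed(series, idx - width, idx))
    earlier = pvariance(observed(series, idx - 2 * width, idx - width))
    if earlier == 0.0:
        return 1.0 if recent == 0.0 else float("inf")
    return recent / earlier


def drift_detected(series, idx, threshold=4.0):
    """Flag drift when the variance changes by more than `threshold` times
    in either direction. The threshold value is an arbitrary assumption."""
    ratio = variance_ratio(series, idx)
    return ratio > threshold or ratio < 1.0 / threshold


def estimate_missing(series, idx, width=WINDOW):
    """Placeholder estimator: the mean of the six preceding values.
    A trained neural network ensemble would replace this in practice."""
    return fmean(observed(series, idx - width, idx))


# Simulated sensor readings: a calm regime followed by a volatile one,
# with the value at position 12 missing (None).
stable = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9]
volatile = [5.0, -3.0, 6.0, -4.0, 5.0, -5.0]
drifting_series = stable + volatile + [None]
steady_series = stable + stable + [None]

drifting_flag = drift_detected(drifting_series, 12)
steady_flag = drift_detected(steady_series, 12)
estimate = estimate_missing(drifting_series, 12)
```

In a fuller pipeline, a drift flag like this might trigger retraining or reweighting of the ensemble members before the gap is filled; how the chapter actually handles that step is not stated in the abstract.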
