Impact of Imputation Strategies on Fairness in Machine Learning | Journal of Artificial Intelligence Research


Published: Jun 27, 2022

DOI: https://doi.org/10.1613/jair.1.13197

Keywords:

machine learning

Simon Caton

School of Computer Science, University College Dublin

https://orcid.org/0000-0001-9379-3879

Saiteja Malisetty

University of Nebraska at Omaha

Christian Haas

Department of Strategy and Innovation, Vienna University of Economics and Business (WU)

https://orcid.org/0000-0002-2690-5962

Abstract

Research on Fairness and Bias Mitigation in Machine Learning often uses a set of reference datasets for the design and evaluation of novel approaches or definitions. While these datasets are well structured and useful for comparing approaches, they do not reflect the fact that datasets used in real-world applications often have missing values. When such missing values are encountered, the use of imputation strategies is commonplace. However, because imputation strategies can alter the distribution of the data, they can also affect the performance, and potentially the fairness, of the resulting predictions, a topic not yet well understood in the fairness literature. In this article, we investigate the impact of different imputation strategies on classical performance and fairness in classification settings. We find that the selected imputation strategy, along with other factors including the type of classification algorithm, can significantly affect performance and fairness outcomes. The results of our experiments indicate that the choice of imputation strategy is an important factor when considering fairness in Machine Learning. We also provide some insights and guidance for researchers to help navigate imputation approaches for fairness.
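
The mechanism described in the abstract can be illustrated with a small toy example. The sketch below is not the authors' experimental setup: the data, the two groups "A" and "B", the fixed threshold classifier, and the helper names (`impute`, `positive_rate`, `parity_gap`) are all hypothetical and chosen only to show that two common imputation choices (mean and median) can lead to different gaps in positive prediction rates between groups.

```python
from statistics import mean, median

# Hypothetical toy data: (group, feature); None marks a missing value.
# Missing values occur only in group "B" (an assumption for illustration).
rows = [("A", 10), ("A", 20), ("A", 30), ("A", 40),
        ("B", 10), ("B", 20), ("B", None), ("B", None)]

# Assumed fixed decision rule: predict positive if feature >= THRESHOLD.
THRESHOLD = 21


def impute(rows, strategy):
    """Fill missing features with strategy(observed values), e.g. mean or median."""
    observed = [x for _, x in rows if x is not None]
    fill = strategy(observed)
    return [(g, fill if x is None else x) for g, x in rows]


def positive_rate(rows, group):
    """Share of positive predictions within one group."""
    preds = [x >= THRESHOLD for g, x in rows if g == group]
    return sum(preds) / len(preds)


def parity_gap(rows):
    """Absolute difference in positive prediction rates between groups A and B."""
    return abs(positive_rate(rows, "A") - positive_rate(rows, "B"))


gap_mean = parity_gap(impute(rows, mean))
gap_median = parity_gap(impute(rows, median))
print("gap with mean imputation:  ", gap_mean)
print("gap with median imputation:", gap_median)
```

In this constructed case the mean of the observed values lies just above the threshold and the median just below it, so the imputed rows in group "B" are classified differently under the two strategies. Real datasets and trained classifiers will behave differently; the point is only that the imputation choice feeds directly into group-level prediction rates.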

Issue

Vol. 74 (2022)

Section

Articles
