
Bias in Data‐Driven Artificial Intelligence Systems - An Introductory Survey

Eirini Ntoutsi et al., Wiley Online Library



Artificial Intelligence (AI)-based systems are widely employed nowadays to make decisions that have far-reaching impacts on individuals and society. Their decisions might affect everyone, everywhere, and at any time, raising concerns about potential human rights issues. It is therefore necessary to move beyond traditional AI algorithms optimized for predictive performance and to embed ethical and legal principles in their design, training, and deployment, ensuring social good while still benefiting from the huge potential of AI technology. The goal of this survey is to provide a broad multidisciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions, and to suggest new research directions toward approaches well-grounded in a legal framework. In this survey, we focus on data-driven AI, as a large part of AI is nowadays powered by (big) data and powerful machine learning algorithms. Unless otherwise specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the basis of demographic features such as race, sex, and so forth.

This article is categorized under:

  • Commercial, Legal, and Ethical Issues > Fairness in Data Mining
  • Commercial, Legal, and Ethical Issues > Ethical Considerations
  • Commercial, Legal, and Ethical Issues > Legal Issues

1 Introduction

Artificial Intelligence (AI) algorithms are widely employed by businesses, governments, and other organizations to make decisions that have far-reaching impacts on individuals and society. Their decisions might influence everyone, everywhere, and at any time, offering solutions to problems faced in different disciplines or in daily life, but at the same time entailing risks like being denied a job or a medical treatment. The discriminatory impact of AI-based decision-making on certain population groups has already been observed in a variety of cases. For instance, the COMPAS system for predicting the risk of re-offending was found to predict higher risk values for black defendants (and lower for white ones) than their actual risk (Angwin, Larson, Mattu, & Kirchner, 2016) (racial bias). In another case, Google's Ads tool for targeted advertising was found to serve significantly fewer ads for high-paying jobs to women than to men (Datta, Tschantz, & Datta, 2015) (gender bias). Such incidents have led to an ever-increasing public concern about the impact of AI on our lives.

Bias is not a new problem; rather, "bias is as old as human civilization" and "it is human nature for members of the dominant majority to be oblivious to the experiences of other groups."1 However, AI-based decision-making may magnify pre-existing biases and create new classifications and criteria with huge potential for new types of bias. These constantly increasing concerns have led to a reconsideration of AI-based systems toward new approaches that also address the fairness of their decisions. In this paper, we survey recent technical approaches to bias and fairness in AI-based decision-making systems, discuss their legal grounds2 as well as open challenges, and outline directions toward AI solutions for societal good. We divide the works into three broad categories:

  • Understanding bias. Approaches that help understand how bias is created in the society and enters our socio‐technical systems, is manifested in the data used by AI algorithms, and can be modeled and formally defined.
  • Mitigating bias. Approaches that tackle bias in different stages of AI‐decision making, namely, preprocessing, in‐processing, and post‐processing methods focusing on data inputs, learning algorithms, and model outputs, respectively.
  • Accounting for bias. Approaches that account for bias proactively, via bias‐aware data collection, or retroactively, by explaining AI‐decisions in human terms.
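To make the mitigation categories concrete, the sketch below illustrates one well-known preprocessing idea, reweighing (in the spirit of Kamiran and Calders' preprocessing method): each training sample receives a weight that removes the statistical dependence between the protected group and the label before a model is trained. The function and variable names are illustrative, not taken from any specific library.

```python
from collections import Counter

def reweigh(groups, labels):
    """Preprocessing-style bias mitigation: assign each sample the weight
    w = P(group) * P(label) / P(group, label), so that in the weighted data
    the protected group and the label are statistically independent."""
    n = len(labels)
    group_counts = Counter(groups)               # counts per protected group
    label_counts = Counter(labels)               # counts per outcome label
    joint_counts = Counter(zip(groups, labels))  # joint (group, label) counts
    return [
        (group_counts[g] / n) * (label_counts[y] / n) / (joint_counts[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

# Toy data: group "a" mostly receives the positive label, group "b" rarely does.
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 0, 0, 1]
weights = reweigh(groups, labels)
# → [0.75, 0.75, 1.5, 0.75, 0.75, 1.5]
```

Under-represented (group, label) combinations get weights above 1 and over-represented ones below 1; a downstream learner that accepts per-sample weights then sees a balanced picture. In-processing methods would instead alter the learning objective itself, and post-processing methods would adjust the trained model's outputs.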

Figure 1 provides a visual map of the topics discussed in this survey.

Figure 1: Overview of topics discussed in this survey

This paper complements existing surveys that either have a strong focus on machine ethics, such as Yu et al. (2018), study a specific subproblem, such as explaining black-box models (Atzmueller, 2017; Guidotti et al., 2019), or focus on specific contexts, such as the Web (Baeza-Yates, 2018), by providing a broad categorization of the technical challenges and solutions, comprehensive coverage of the different lines of research, and a discussion of their legal grounds.

We are aware that the problems of bias and discrimination are not limited to AI and that technology can be deployed (consciously or unconsciously) in ways that reflect, amplify, or distort real-world perceptions and the status quo. As the roots of these problems are not only technological, it is naive to believe that technological solutions alone will suffice. Rather, more than technical solutions are required, including socially acceptable definitions of fairness and meaningful interventions to ensure the long-term well-being of all groups. These challenges require multidisciplinary perspectives and a constant dialogue with society, as bias and fairness are multifaceted and volatile. Nevertheless, as AI technology penetrates our lives, it is extremely important for technology creators to be aware of bias and discrimination and to ensure responsible usage of the technology, keeping in mind that a technological approach on its own is not a panacea for all sorts of bias and AI problems.

2 Understanding Bias

Bias is an old concept in machine learning (ML), traditionally referring to the assumptions made by a specific model (inductive bias) (Mitchell, 1997). A classical example is Occam's razor preference for the simplest hypothesis. The many facets of human bias have been studied by many disciplines, including psychology, ethnography, and law. In this survey, we consider as bias the inclination or prejudice of a decision made by an AI system for or against one person or group, especially in a way considered to be unfair. Given this definition, we focus on how bias enters AI systems and how it is manifested in the data comprising the input to AI algorithms. Tackling bias entails answering the question of how to define fairness such that it can be considered in AI systems; we discuss different fairness notions employed by existing solutions. Finally, we close this section with legal implications of data collection and bias definitions...
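One fairness notion commonly employed by the solutions surveyed here is statistical (demographic) parity: the rate of positive decisions should be independent of the protected attribute. A minimal sketch of the corresponding metric follows; the function name, the choice of protected group, and the toy data are illustrative assumptions.

```python
def statistical_parity_difference(preds, groups, protected="b"):
    """Difference in positive-decision rates between the protected group
    and everyone else; 0 means perfect demographic parity."""
    prot = [p for p, g in zip(preds, groups) if g == protected]
    rest = [p for p, g in zip(preds, groups) if g != protected]
    return sum(prot) / len(prot) - sum(rest) / len(rest)

# Toy decisions: group "b" receives positive outcomes less often than group "a".
preds  = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = statistical_parity_difference(preds, groups)  # 0.25 - 0.75 = -0.5
```

A gap far from zero signals disparate outcomes across groups; other notions discussed in this survey, such as equalized odds, additionally condition on the true label rather than comparing raw decision rates.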
