Antiracism in AI: How to Build Bias Checkpoints Into Your Development and Delivery Process

By Allison Langley

Allison Langley, Welltok's Data Science Manager, recently sat down with Managed Healthcare Executive to discuss antiracism in AI, and building checkpoints into development and delivery processes. See below for the full article. 


Much of the data that AI depends on is tainted with racial bias.

Artificial Intelligence (AI) is a powerful tool in healthcare because it can process large volumes of data to inform decisions that drive health improvements, reduce costs and streamline resource allocation. But are healthcare organizations making these important decisions based on biased data that is the byproduct of systemic racism?

Because of socioeconomic inequities in healthcare delivery, and the resulting gaps in research data and medical records, AI is less likely to represent Black patients’ information adequately and accurately. When machine learning models are trained on biased data, they are likely to reproduce inequities. In fact, research suggests that biases inherent in data and algorithms can actually intensify existing health disparities for racial minorities.

The potential harm of biased AI is so great that many regulatory bodies are taking bold steps to set up legal guardrails, such as the European Union’s recent proposal to regulate riskier AI applications and the Federal Trade Commission’s pledge to enforce laws that ensure truthful, fair and equitable use of AI.

Most industry professionals are aware of this problem, but its persistence suggests that many of the teams building AI solutions may still be overlooking critical points where bias can enter an AI system. To ensure healthcare AI solutions have a positive impact and don’t propagate existing health inequities, healthcare leaders should ensure that AI implementation processes include bias checkpoints at each phase of development and delivery.

Here are a few examples of questions that healthcare leaders should consider when building machine learning solutions for their organization:

Are you building solutions that address the health needs of all, or are you placing higher priority on problems that affect advantaged groups?

Much of the conversation around AI bias has focused on data and models, but the first checkpoint should occur much earlier in the process when a problem that needs to be addressed is first identified.

There are many healthcare concerns that affect Black Americans disproportionately, such as healthcare coverage, chronic health conditions, mental health, cancer, infant death, and more. According to the CDC, Black Americans are more likely to die at younger ages from all causes, in part because young Black Americans are living with conditions that are more common at older ages in other populations, such as high blood pressure, diabetes, and stroke. Many social factors, such as unemployment, poverty, and prohibitive healthcare costs, also affect Black Americans at younger ages.

Evidence suggests that digital health technologies can make a difference in these disparities, if healthcare organizations are willing to take the time to deeply understand the needs of their users. The specific considerations will vary greatly by use case, but product and solutions teams should be well-informed about health inequities and ensure that they are building technology for all, not just solving for the more visible problems of advantaged populations.

Do you have balanced data from users across races, genders, and socioeconomic backgrounds, or are you missing data from underserved and marginalized groups?

If data is biased, the product will be too. Unfortunately, many of the most commonly used data sources in healthcare technology are not equally representative of all populations. For example, research has documented biases in randomized controlled trials, electronic health records, administrative health data, and social media data that leave minority populations underrepresented.

Because access to healthcare is unequal, clinical records alone are not enough. Incorporating nonclinical data such as social determinants of health (SDOH) indicators may help bridge this gap. A recent study found that including SDOH indicators in their machine learning model improved prospective risk adjustment for health plan payments in several vulnerable populations. With a more holistic data approach, you can understand some of the larger societal factors that drive health inequities and include these in modeling.
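As a concrete illustration, here is a minimal sketch (in Python, with hypothetical group labels, benchmark shares, and threshold) of the kind of representation check a team might run on training data before modeling:

```python
from collections import Counter

def representation_gaps(group_labels, population_shares, tolerance=0.05):
    """Compare each group's share of the training data to a benchmark
    population share and flag groups underrepresented by more than
    `tolerance` (an illustrative threshold)."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    flagged = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total
        if expected - observed > tolerance:
            flagged[group] = {"expected": expected, "observed": round(observed, 3)}
    return flagged

# Hypothetical training set: group B makes up 30% of the benchmark
# population but only 10% of the records, so it gets flagged.
labels = ["A"] * 80 + ["B"] * 10 + ["C"] * 10
benchmarks = {"A": 0.60, "B": 0.30, "C": 0.10}
gaps = representation_gaps(labels, benchmarks)
```

The benchmark shares would in practice come from census or member-population data rather than being hard-coded.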

Have you identified and accounted for any biases in your prediction definition and algorithm design?

Once imbalances are rectified within these datasets, it is important to think carefully about how each variable is interpreted and how predictive outcomes are defined. When predicting an unknown variable, assumptions must sometimes be made about which data features are most likely to have predictive power. However, you should evaluate the feature selection mechanism, rather than relying on a “black box” model, so that you can assess any biases that may be present in the correlations.

Additionally, you should scrutinize the prediction label itself to ensure that you are directly measuring what you intend to measure. Research shows that an algorithm widely used to predict health need is racially biased against Black patients, reducing the number of Black patients identified for extra healthcare by more than half. This bias arises because the model uses health cost as a proxy for health need, which fails to account for structural inequities in the healthcare system that result in less money being spent on Black patients who have the same level of need as other populations. Changing the prediction label to remove this false equivalence resulted in a 28.8% increase in the number of Black patients who would receive additional help.
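The cost-versus-need pitfall described above can be shown with a deliberately simplified sketch (hypothetical patients and numbers, not the published study's data): ranking on historical cost selects different patients than ranking on a direct measure of clinical need.

```python
def select_for_extra_care(patients, score_key, top_n=2):
    """Pick the top_n patients by the chosen score. Which label you
    rank on ('cost' vs. 'chronic_conditions') changes who is selected."""
    ranked = sorted(patients, key=lambda p: p[score_key], reverse=True)
    return {p["id"] for p in ranked[:top_n]}

# Hypothetical patients: p3 and p4 have the greatest clinical need,
# but structural inequities mean less was historically spent on them.
patients = [
    {"id": "p1", "cost": 9000, "chronic_conditions": 2},
    {"id": "p2", "cost": 8000, "chronic_conditions": 1},
    {"id": "p3", "cost": 3000, "chronic_conditions": 5},
    {"id": "p4", "cost": 2500, "chronic_conditions": 4},
]

by_cost = select_for_extra_care(patients, "cost")                # {'p1', 'p2'}
by_need = select_for_extra_care(patients, "chronic_conditions")  # {'p3', 'p4'}
```

The two selections do not overlap at all here; with a cost label, the highest-need patients would never be flagged for extra help.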

After implementing the model, are you monitoring its performance and impact across different populations?

While not often discussed, perhaps the most important bias checkpoint is the measurement of real-world impact. In healthcare, the decisions you automate with machine learning are high-stakes, and you cannot afford to assume that your organization has perfectly eradicated bias in earlier steps. Therefore, it is critical to measure the downstream impact of the solution to ensure the model generalizes well to both the majority and minority groups within a user population. In the highest-risk use cases, such as models used for diagnosis and treatment decisions, potential harms should be explicitly identified and measured.
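In practice, this kind of monitoring often means computing performance metrics separately per demographic group rather than in aggregate. A minimal sketch (hypothetical field names and an illustrative post-deployment sample) computing per-group recall, i.e., the share of truly high-need patients the model actually flagged:

```python
def recall_by_group(records):
    """Compute recall (share of truly high-need patients the model
    flagged) within each demographic group. Field names are hypothetical."""
    stats = {}
    for r in records:
        g = stats.setdefault(r["group"], {"tp": 0, "fn": 0})
        if r["actual"]:
            if r["predicted"]:
                g["tp"] += 1
            else:
                g["fn"] += 1
    return {group: g["tp"] / (g["tp"] + g["fn"])
            for group, g in stats.items() if g["tp"] + g["fn"] > 0}

# Illustrative sample: the model misses far more high-need patients
# in group B (recall 0.5) than in group A (recall 0.9).
records = (
    [{"group": "A", "actual": 1, "predicted": 1}] * 9 +
    [{"group": "A", "actual": 1, "predicted": 0}] * 1 +
    [{"group": "B", "actual": 1, "predicted": 1}] * 5 +
    [{"group": "B", "actual": 1, "predicted": 0}] * 5
)
recalls = recall_by_group(records)
```

A gap like this in an aggregate metric would be invisible, which is exactly why disaggregated monitoring belongs in the delivery process.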

Removing bias from machine learning solutions in the healthcare space requires dedicated attention from design to deployment. This technology has the potential to have a profound positive impact on the world, if the necessary steps are taken to ensure the industry is not propagating racial disparities or inequities.

Original article appeared in Managed Healthcare Executive - April 27, 2021
