In the healthcare sector, bias in machine learning algorithms can result in patients not receiving the care they deserve. In the worst cases, this can lead to wasted resources, serious harm, and even death.
A variety of factors can introduce bias into AI models, including how data are collected and prepared, how models are developed, and how they are evaluated.
The broad scope of ML applications, and the potential for patient harm following deployment across health systems, demands an increased emphasis on evaluating and mitigating bias in machine learning models. These tools are primarily trained on data that reflect historical inequities in disease diagnosis and care delivery, and they could perpetuate these inequities by shaping clinical decision-making.
There are many steps in developing an ML model, and each can be a source of bias. The most common cause of algorithmic bias is inadequate training data. When an ML model is trained on datasets in which certain groups are underrepresented, the algorithm may produce erroneous predictions for those groups. For example, facial-analysis experiments have shown that algorithms trained predominantly on faces from one demographic group misidentify faces from underrepresented groups at much higher rates.
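A minimal sketch, using entirely hypothetical data, of how underrepresentation skews a model: a single decision threshold fit to minimize overall error ends up tracking the majority group and misclassifying the minority group, even though the fitting procedure itself is neutral.

```python
# Hypothetical illustration: a threshold classifier fit on imbalanced data.
# Group "A" (90 records) dominates the objective, so the fitted threshold
# matches A's label boundary and systematically errs on group "B".

def fit_threshold(data):
    """Pick the threshold with the lowest *overall* error rate."""
    candidates = sorted(x for x, _, _ in data)
    best_t, best_err = None, float("inf")
    for t in candidates:
        err = sum(1 for x, y, _ in data if (x > t) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def group_error(data, group, t):
    """Error rate of the rule 'predict positive when x > t' within one group."""
    rows = [(x, y) for x, y, g in data if g == group]
    return sum(1 for x, y in rows if (x > t) != y) / len(rows)

# Majority group "A" (90 records): true label is positive when x > 0.
data = [(x / 10, x > 0, "A") for x in range(-45, 45)]
# Minority group "B" (10 records): true label is positive only when x > 2.
data += [(x, x > 2, "B") for x in [0.5, 1.0, 1.5, 2.5, 3.0, 3.5, -1, -2, 4, 1.8]]

t = fit_threshold(data)
print(f"threshold={t:.1f}")                       # lands at A's boundary
print(f"error A: {group_error(data, 'A', t):.2f}")  # near zero
print(f"error B: {group_error(data, 'B', t):.2f}")  # much higher
```

The overall error rate looks excellent here, which is exactly why aggregate metrics can hide this failure mode: the harm only shows up when performance is broken out by group.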
Various strategies can mitigate this issue, including training the model on fresh, unbiased data, requiring regular audits, and incentivizing operators to review their algorithms for bias. In addition, legal measures, such as class action lawsuits, can hold companies accountable when they fail to take these steps. In the meantime, the authors recommend that healthcare professionals and other stakeholders become more aware of these issues, both by encouraging broader, more diverse teams to work on these technologies and by advocating for checklists that ensure technical diligence, transparency, and equity in developing new algorithms.
The use of ML in medicine is evolving rapidly. In this context, research addressing the impact and mitigation of bias in clinical ML is still emerging. A growing understanding of the potential harms and benefits of ML is needed to guide model development and implementation, including considerations of algorithmic fairness.
The data sets used to train ML algorithms often reflect the inequities and biases that have long plagued the healthcare system, including differences in the quality of care that clinicians deliver to white patients and patients of color. This variation in treatment becomes immortalized in the data, and the societal inequalities it reflects can be transmitted to the ML models built from it.
Despite developers’ best intentions, bias can slip into ML models. One team recently discovered that its ML algorithm for predicting childhood sepsis was biased against Hispanic patients. The team realized the algorithm might be misinterpreting delays in bloodwork for Hispanic children as signs of severe illness, which could lead to slower diagnoses and, ultimately, higher mortality for those children.
Several methods for bias mitigation have been described in the literature, including preprocessing through sampling before an algorithm is built, in-processing through incorporating mathematical approaches that incentivize a model to learn balanced predictions, and post-processing by incorporating human oversight into the design of ML algorithms. Nevertheless, the underlying causes of bias and their implications for healthcare outcomes remain complex.
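The pre-processing approach mentioned above can be sketched concretely. This is a minimal illustration of reweighing, a standard pre-processing technique: each training record is weighted so that group membership and outcome are statistically independent in the weighted data, which removes the incentive for a model to use group membership as a shortcut. The data below are hypothetical.

```python
# Sketch of pre-processing via reweighing: weight each (group, label) pair
# by expected frequency (under independence) / observed frequency, so the
# weighted training data show no association between group and outcome.
from collections import Counter

def reweigh(records):
    """records: list of (group, label) tuples. Returns one weight per record."""
    n = len(records)
    group_counts = Counter(g for g, _ in records)
    label_counts = Counter(y for _, y in records)
    pair_counts = Counter(records)
    weights = []
    for g, y in records:
        # Frequency this pair would have if group and label were independent,
        # divided by the frequency it actually has in the data.
        expected = group_counts[g] * label_counts[y] / n
        weights.append(expected / pair_counts[(g, y)])
    return weights

# Hypothetical skewed data: positive outcomes are common for group A, rare for B.
records = [("A", 1)] * 40 + [("A", 0)] * 10 + [("B", 1)] * 10 + [("B", 0)] * 40
w = reweigh(records)
# Underrepresented pairs like ("B", 1) get weight > 1; overrepresented
# pairs like ("A", 1) get weight < 1. Total weight still equals len(records).
```

The resulting weights would then be passed to any learner that accepts per-sample weights; in-processing and post-processing methods intervene later in the pipeline but pursue the same balance.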
A study that found many hospitals’ clinical algorithms exhibited racial bias has heightened awareness of how ML models can directly harm patients. But the work still lies ahead: preventing and mitigating these harms requires a more holistic approach that addresses every step of the ML process, from data collection and preparation to model development to deployment in clinical settings.
Bias can emerge at any of these steps, but the most common sources are the underlying social inequities that shape the data sets on which an algorithm is trained. For example, racial bias in a sepsis-detection algorithm may stem from Black patients’ health records being less accessible and more likely to contain outdated information, whereas white patients’ records are more likely to be complete and up-to-date.
Bias can also arise when the people building an ML model encode their personal and professional preferences into their algorithms. These personal and professional biases may be unconscious or subtle, but they can significantly affect the final result. The central challenge is that reducing bias in an algorithm is difficult to do without some loss of overall accuracy, which is why it is important to incorporate diversity, equity, and inclusion into all of these processes.
ML algorithms start with data sets that often reflect the racial inequities and biases that have long plagued healthcare. For example, a clinical algorithm many hospitals use to determine which patients require extra care was shown in 2019 to exhibit racial bias because it interpreted patients’ past healthcare spending as a proxy for need, and spending reflects historical income and wealth disparities.
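A toy numerical sketch, with hypothetical figures, of why a spending proxy misallocates care: if one group spends less per unit of actual need (for example, because of access barriers), ranking patients by spending enrolls fewer of that group's high-need patients than ranking by need itself would.

```python
# Hypothetical proxy-bias illustration: selecting the top-k patients for an
# extra-care program by past *spending* (the proxy) vs. actual *need*.

def top_k_by(patients, key, k):
    """Return the k patients ranked highest by the given key."""
    return sorted(patients, key=key, reverse=True)[:k]

# (group, need, spending): group "B" spends half as much per unit of need,
# e.g. because of barriers to accessing care.
patients = [("A", need, need * 1.0) for need in range(1, 11)]
patients += [("B", need, need * 0.5) for need in range(1, 11)]

# Proxy-based selection (what the biased algorithm effectively did).
chosen = top_k_by(patients, key=lambda p: p[2], k=10)
share_B = sum(1 for g, _, _ in chosen if g == "B") / 10

# Need-based selection (the intended target).
chosen_need = top_k_by(patients, key=lambda p: p[1], k=10)
share_B_need = sum(1 for g, _, _ in chosen_need if g == "B") / 10
# Although both groups have identical need distributions, the spending
# proxy gives group B a smaller share of program slots.
```

The proxy looks reasonable in aggregate, which is why this kind of bias survived in production: both rankings are "accurate" against their own target, but only one target measures what the program actually cares about.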
When a machine learning model is biased, the results can exacerbate healthcare inequities and create new ones. Roundtable participants noted that addressing the issue of bias in ML models requires a multidisciplinary approach that involves not only engineering but also legal, marketing, and strategy functions. Creating a bias impact statement identifying the societal, ethical, and moral issues associated with an automated decision is a crucial first step toward this goal.
While the ML community recognizes a need for normative theories of fairness, disputes about what constitutes an unfair ML algorithm typically focus on operational definitions of fairness based on statistical metrics of diagnostic accuracy that compare performance for traditionally disadvantaged and advantaged groups. Although these metrics have the advantage of being directly applicable to real-world decisions, they fail to address many underlying causes of algorithmic unfairness, including the choice of convenient, seemingly effective proxies for ground truth and flaws in the design process.
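One of the operational metrics described above can be made concrete in a few lines. This sketch, with hypothetical predictions and labels, compares true-positive rates between two groups, which underlies the "equal opportunity" definition of fairness: a large gap means sick patients in one group are missed more often.

```python
# Minimal sketch of a group-comparison fairness metric: the true-positive
# rate (TPR) gap between two patient groups. Data are hypothetical.

def tpr(rows):
    """rows: list of (y_true, y_pred). TPR = TP / (TP + FN)."""
    positives = [(y, p) for y, p in rows if y == 1]
    return sum(1 for y, p in positives if p == 1) / len(positives)

# Each tuple is (true label, model prediction); 1 = has the condition.
group_a = [(1, 1)] * 18 + [(1, 0)] * 2 + [(0, 0)] * 30  # TPR = 18/20
group_b = [(1, 1)] * 12 + [(1, 0)] * 8 + [(0, 0)] * 30  # TPR = 12/20

gap = tpr(group_a) - tpr(group_b)
# A nonzero gap means the model misses sick patients in group B more often,
# even if overall accuracy looks acceptable when the groups are pooled.
```

As the passage notes, a metric like this can flag a disparity but cannot explain it: a clean TPR gap of zero says nothing about whether the ground-truth labels themselves were a flawed proxy in the first place.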