Senior Data Scientist
In recent years, several events have prompted the marketing industry to reevaluate how machine learning and AI impact our society. Many conversations have centered around diversity and equality, motivating marketers to rethink our current approaches. One critical component is how audience selection shapes marketing campaigns, and which people do or do not receive a campaign's messages as a result of algorithmic decisioning.
All aspects of marketing are increasingly driven by algorithms, which can inherit and amplify biases. Typically, efforts to address modeling bias focus on exclusionary variables such as age, gender, ethnicity, and others. Depending on the vertical or industry, these are subject to specific regulations guiding how the variables can or cannot be used. But when dealing with powerful machine learning systems, simply removing these variables from training data does not guarantee that a model will be unbiased.
A variety of forms of bias may appear in algorithms. Even when sensitive variables such as gender, ethnicity, and sexual orientation are excluded, AI systems still make decisions in the context of training data that may reflect human prejudices or historical inequalities.
-IAB, “Understanding Bias in AI for Marketing”, November 2021
This may surprise many marketers, especially those familiar with the responses from platforms like Facebook and Google, which have removed these variables from look-alike modeling or audience selection in response to regulatory pressure. So how can bias still enter a model even when attempts are being made to actively prevent it?
Once training data has been selected for the model development sample, it goes through automated processes in which the model tries to predict the best candidates for the desired outcome. If all customers are wealthy individuals, for example, a look-alike model will mirror that seed set. Now let's say a marketer is aware of this and, to be more equitable, instructs their data scientist to exclude income as a predictor variable. All other approved variables are still going to be evaluated in combination, so an algorithm may find that a home value variable, along with other potentially unexpected variables, works as a proxy for income and make a prediction similar to the one it would have made had income been included.
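One simple way to look for the proxy effect described above is to check how strongly each remaining predictor correlates with the variable that was excluded. The sketch below is only illustrative: the variable names, the toy values, and the 0.7 cutoff are all assumptions, and a single pairwise correlation will not catch proxies that only emerge from combinations of variables.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot act as a linear proxy
    return cov / sqrt(var_x * var_y)


def find_proxies(excluded, predictors, threshold=0.7):
    """Return predictors whose |correlation| with the excluded variable
    meets the threshold. The threshold is an illustrative assumption."""
    flagged = {}
    for name, values in predictors.items():
        r = pearson(excluded, values)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Toy data (hypothetical): income in $K, held out of the model itself.
income = [40, 55, 70, 90, 120, 150]
predictors = {
    "home_value": [150, 210, 260, 340, 450, 560],
    "household_size": [3, 1, 4, 2, 3, 2],
}

proxies = find_proxies(income, predictors)
print(proxies)  # home_value is flagged; household_size is not
```

A flagged variable is a prompt for review, not an automatic removal: whether it stays in the model depends on the use case and any applicable regulations.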
Maintaining a Human Element to Model Development
Machine learning processes have an element of opaqueness, and after reading the example above, it might feel like there is not much else to do to limit bias. But data scientists and members of other business teams, such as strategy and compliance, can play a role in subsequent model development steps.
Model development workflows are not always linear, meaning data scientists can review the performance reporting of the algorithms and loop back to adjust. For example, they can spend time evaluating the top variables influencing the model. This will provide an initial sense of how biased a model may be. To take it a step further, the model can be used to score another data set. Alliant compares the distribution of scoring results by rank to the overall distribution of a representative sample of the US population. If the top variables are pointing too heavily to one group, this can be an indicator for the data scientist to do additional exploration.
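The comparison of top-ranked scores against a reference population can be sketched as follows. This is not Alliant's actual process; it is a minimal illustration that assumes each scored record carries a group label, that reference shares for those groups are known, and that a ratio band of 0.8 to 1.25 is a reasonable trigger for a closer look. All names and numbers here are hypothetical.

```python
from collections import Counter


def group_shares(groups):
    """Share of each group label in an iterable of labels."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def representation_ratios(scored, reference_shares, top_fraction=0.1):
    """For each group, divide its share among the top-scored records
    by its share in the reference population.

    scored: list of (score, group) tuples.
    A ratio near 1.0 means the group appears in the top ranks about as
    often as in the reference population.
    """
    ranked = sorted(scored, key=lambda rec: rec[0], reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    top_shares = group_shares(group for _, group in ranked[:top_n])
    return {
        g: top_shares.get(g, 0.0) / share
        for g, share in reference_shares.items()
        if share > 0
    }


# Toy scored file (hypothetical): model score and group label.
scored = [
    (0.95, "A"), (0.90, "A"), (0.85, "A"), (0.80, "A"), (0.75, "B"),
    (0.60, "B"), (0.55, "A"), (0.50, "B"), (0.45, "B"), (0.40, "B"),
]
reference_shares = {"A": 0.5, "B": 0.5}  # assumed population shares

ratios = representation_ratios(scored, reference_shares, top_fraction=0.5)

# Illustrative tolerance band; the right band depends on the use case.
flagged = {g: r for g, r in ratios.items() if r < 0.8 or r > 1.25}
print(ratios)
print(flagged)
```

In this toy example group A is overrepresented in the top half of the file and group B is underrepresented, so both fall outside the band and would warrant additional exploration.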
At this point, a data scientist may want to share their findings with legal or compliance teams for further assessment. Legal teams and data strategists can recommend mitigation measures based on the specific use case.
If it is determined that bias has surpassed a certain threshold in a model, the data scientist can return to the earlier stages to refine the model development process. There are many ways to make these adjustments. Modifying predictor variables is one approach. It may also make sense to remove extreme outliers from a modeling sample, or to balance the seed set to reduce inherent biases. In some cases it may make sense to test a different modeling methodology altogether. This will be an iterative process, and it may take several attempts to find the right approach.
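As one example of the adjustments above, a seed set can be balanced by downsampling every group to the size of the smallest one. The sketch below assumes records are dictionaries with a grouping field (here a hypothetical "income_band"); it is one of several possible balancing strategies, and it trades sample size for balance.

```python
import random
from collections import defaultdict


def balance_seed_set(records, key, seed=0):
    """Downsample each group to the size of the smallest group.

    records: list of dicts; key: the field to balance on.
    A fixed seed keeps the draw reproducible between model iterations.
    """
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec[key]].append(rec)

    target = min(len(members) for members in by_group.values())
    rng = random.Random(seed)

    balanced = []
    for members in by_group.values():
        balanced.extend(rng.sample(members, target))
    return balanced


# Toy seed set (hypothetical): heavily skewed toward one income band.
seed_set = (
    [{"id": i, "income_band": "high"} for i in range(6)]
    + [{"id": 100 + i, "income_band": "low"} for i in range(2)]
)

balanced = balance_seed_set(seed_set, key="income_band")
counts = {}
for rec in balanced:
    counts[rec["income_band"]] = counts.get(rec["income_band"], 0) + 1
print(counts)
```

After balancing, the model would be rebuilt and the earlier checks repeated, since a change to the seed set can shift which variables the model relies on.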
What we’ve outlined here only begins to scratch the surface of the ways that biases can be reduced in machine learning and AI. Alliant’s VP of Data Science, Malcolm Houtz, contributed to the IAB whitepaper “Understanding Bias in AI for Marketing”, which is a must-read for companies looking to develop frameworks for better AI solutions.
Interested in learning more about audience modeling? You can always reach out to Alliant directly!