Dr. Min Sun, Chief AI Scientist at Appier, explains, “AI bias basically means AI or ML is making decisions with a certain bias towards a specific outcome or relying on a subset of features. A common example is a facial recognition system that has been trained with mainly Caucasian people. As a result, the system cannot make accurate judgements about people from different cultural groups.”
In other words, the model either bases its decisions on a set of features that does not represent the data it is meant to handle, or it performs badly on types of data it did not see during training.
Where Does AI Bias Come From?
AI and machine learning systems are trained on data sets that are acquired through various mechanisms. The training data consists of inputs, or features, which the system uses to make decisions. The outputs it learns to produce are sometimes called labels. Some sets of features can be biased towards specific results. In these cases, the system performs poorly on types of data it has not been exposed to during training and gives suboptimal results.
It is important to note, says Dr Sun, that the machine learning model itself is not the origin of the bias. The bias comes from the data that the model is trained with. “In some cases, the system will perform very well on certain types of data but badly on another,” he adds.
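One simple way to see whether a system performs well on some types of data but badly on others is to break its accuracy down by group instead of reporting a single overall number. The sketch below is illustrative only: the function name, the group labels "A" and "B", and the evaluation records are all hypothetical and do not come from Appier.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, true_label, predicted_label) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        if y_true == y_pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation records. Assume group "A" was well represented
# in the training data and group "B" was not.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1),
]

per_group = accuracy_by_group(records)
print(per_group)  # {'A': 1.0, 'B': 0.25}
```

A large gap between groups, as in this made-up example, suggests the training data may not represent every group the model is being used on.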
There are some clear examples of how bias in AI models can lead to negative consequences. In the United States, models have been trained with historical data in which African American, Hispanic and other minority groups are overrepresented in crime statistics. When such models have been used to inform sentencing, the result has been harsher penalties for members of those minorities.
If the data is skewed in one particular way, the model will make decisions based on that skewing of the data.
Not All Bias Is Bad
However, for marketers, some bias in the data used to train AI models is valuable.
“We often talk about bias in a very negative sense, but you may actually want your model to push in a particular direction rather than be absolutely neutral,” says Dr Sun. “If everything is neutral then your model faces a much harder learning task.”
This is because curating the right set of training data for an absolutely neutral model can take a very long time before the model delivers benefits. If you are serving a particular customer base, then training the AI model with data for that customer base can be highly beneficial. Exploiting bias can help ensure your AI model delivers value from the outset of its deployment.
For example, if a business is selling fashion products that are targeted at young women aged between 18 and 25, then a recommendation engine that is powered by AI can use inherent bias to suggest further purchases based on other people in that target customer group. As the customer makes more decisions, the model can learn the preferences of that customer and deliver better targeted suggestions.
“If you want to have a stronger performance at the beginning of the model’s use then some bias is useful because it lets you maximize your return from the model initially,” says Dr Sun. “When the model you are using is serving data that has the same bias as your training data, we should actually exploit this bias so your performance is good when you start using the model.”
Leveraging AI Bias in Marketing
Once the model has been in operation, you can make its decisions more accurate by using data collected while it has been working. For example, a recommendation engine may start by making suggestions based on people the model believes are like your customers. But then, as the model learns more about your customers, it can make recommendations that are more specific to them.
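The shift from segment-level suggestions to customer-specific ones can be sketched as a weighted blend: the segment's click rate acts as a starting assumption that counts for less as an individual customer's own history grows. This is a minimal illustration, not how any particular recommendation engine works. The function name, the `prior_weight` parameter, and all of the numbers are assumptions made for the example.

```python
def blended_score(segment_rate, user_clicks, user_views, prior_weight=20):
    """Estimate a customer's interest in a product category.

    segment_rate: click rate observed across the target segment (the bias
                  the model starts with).
    prior_weight: hypothetical number of "pseudo-views" credited to the
                  segment rate; larger values make the model slower to
                  move away from the segment-level assumption.
    """
    return (segment_rate * prior_weight + user_clicks) / (prior_weight + user_views)

# A brand-new customer has no history, so the score equals the segment rate.
new_user = blended_score(0.10, user_clicks=0, user_views=0)

# An active customer with 30 clicks in 80 views pulls the score towards
# their own behavior (30 / 80 = 0.375) and away from the segment rate.
active_user = blended_score(0.10, user_clicks=30, user_views=80)

print(round(new_user, 3), round(active_user, 3))
```

With these made-up figures, the new customer is scored purely on the segment's behavior, while the active customer's score sits between the segment rate and their own observed click rate.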
By using the bias in the data, it is possible to reduce the initial cost of deploying AI, because collecting unbiased data is more expensive.
“For example, if you want to sell cosmetics, then you should exploit selling to women and girls in the beginning. Then, as you want to keep increasing your volume, you can figure out what additional features you need to suggest cosmetics to men,” says Dr Sun.
While exploiting the bias in the data used to train AI and machine learning models can be beneficial, it is important to recognize that the bias in your data can lead to negative consequences.
“If you exploit the bias at the beginning and then lock onto that bias, you might not improve further,” says Dr Sun. “For example, if you exploit a certain age group and do very well, after a while you realize your volume cannot increase anymore. If you don’t do something to overcome this bias, the campaign will gradually become harder to scale and more costly due to competition, since you may think only this certain group has the best performance and therefore you only target them.”
Overcoming AI Bias
When AI and machine learning systems are trained using biased data and that bias is not recognized and addressed, there can be significant consequences. You may miss a potentially valuable customer niche and fail to keep scaling your market share. Being able to assess this risk and take action is critical.
One way to do this is to change the way that data is collected and see whether this has an impact on model performance. You can then conduct an A/B test, testing the model with different data sets to see which delivers better outcomes. As well as offering a path to optimization, this ensures new data does not decrease the effectiveness of the model.
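To judge whether one variant in an A/B test really delivers better outcomes, rather than just appearing to by chance, one common option is a two-proportion z-test on conversion counts. The sketch below assumes variant A uses the model trained on the original data and variant B uses the model trained on the newly collected data. The function name and the conversion figures are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare two conversion rates; return (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: 100 conversions from 2,000 users on variant A,
# 130 conversions from 2,000 users on variant B.
z, p_value = two_proportion_z_test(100, 2000, 130, 2000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value (0.05 is a common but arbitrary threshold) suggests the difference between the variants is unlikely to be chance alone. A positive z here means variant B converted at a higher rate than variant A.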
Although refining the data collection is critical, it can be very costly without more insight. So, it is important to evaluate how the model values specific features and combinations of features. By using domain knowledge, you can further refine the model. Deciding whether to refine the model or refine the data collection comes down to return on investment: the expense of changing the data collection methodology needs to be weighed against the expense of reevaluating the importance of features in the model.
It is important to understand, says Dr Sun, that once an ML model starts operating, its initial results will be directed by the data that it is trained with. However, once the model is in operation, the system itself can continue to collect data and learn. We see that with online advertising.
“The machine, using the data it is trained with, will determine where to place a particular advertisement. Based on how users interact, the model will learn where to place ads in the future,” explains Dr Sun.
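One simple way to picture a system that keeps learning where to place ads from user interactions is an epsilon-greedy strategy: most of the time it shows the placement with the best observed click rate, and occasionally it tries another one so that it does not lock onto its initial bias. This is a generic textbook technique used here for illustration, not a description of Appier's system. The class name, the placement names, and the simulated click rates are all hypothetical.

```python
import random

class EpsilonGreedyPlacer:
    """Choose ad placements, mostly exploiting the best one seen so far."""

    def __init__(self, placements, epsilon=0.1, seed=None):
        self.placements = list(placements)
        self.epsilon = epsilon  # share of decisions spent exploring
        self.rng = random.Random(seed)
        self.shows = {p: 0 for p in self.placements}
        self.clicks = {p: 0 for p in self.placements}

    def click_rate(self, placement):
        shows = self.shows[placement]
        return self.clicks[placement] / shows if shows else 0.0

    def choose(self):
        # Explore: occasionally pick a placement at random.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.placements)
        # Exploit: otherwise pick the best observed click rate so far.
        return max(self.placements, key=self.click_rate)

    def record(self, placement, clicked):
        self.shows[placement] += 1
        if clicked:
            self.clicks[placement] += 1

# Simulated users with made-up "true" click rates per placement.
true_rates = {"sidebar": 0.02, "in_feed": 0.05}
user_rng = random.Random(0)
placer = EpsilonGreedyPlacer(true_rates, epsilon=0.1, seed=42)

for _ in range(1000):
    placement = placer.choose()
    clicked = user_rng.random() < true_rates[placement]
    placer.record(placement, clicked)

print(placer.shows)
```

Setting `epsilon` to 0 would mean the system never explores, which mirrors the risk of exploiting a bias and then locking onto it.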
AI and machine learning algorithm bias is a challenge, but marketers who are aware of the implications of bias can be prepared and use it as a tool. When bias is understood, it can be used to assist AI models in their initial operating phase to deliver recommendations before the model learns from more data it collects in the field. But when it is not recognized, it can lead to unwanted results.