Google were questioned in the US Congress about whether their search algorithms are biased against conservative-leaning websites, an accusation that they, unsurprisingly, deny. A New York Times opinion piece questioned the strength of the US legislative response to bias in AI algorithms. Amazon has been caught up in claims of racial and gender bias in tools it has created for facial recognition.
The headlines rarely discuss what bias means. To tackle the problem of bias, and to understand just how complex the issue could become in machine learning, we first need to understand its origins. The Amazon controversy is one of the clearer examples. Researchers from MIT claimed that the Amazon software misidentified the gender of female faces, and of darker-skinned faces, more frequently than that of lighter-skinned male faces. Whether this bias arises because the data used to train the algorithm is not representative, or whether it is a consequence of the model itself, is not clear. Perhaps it is the machine learning algorithm picking out the bias that is built into both analogue and digital photography?
Even if the data itself is representative and free of systematic error, bias can still exist in the models generated from it. To remove bias, we need a way to quantify it. Statistical bias is a measure of how far the average prediction is from the observed value. Bias differs from random variation, which is measured through the variance: how much the predictions vary about the average prediction. An algorithm that misclassifies the gender of male and female faces at a high but equal rate has a high variance but a low bias. An algorithm that misclassifies the gender of female faces at a statistically significantly higher rate has a high bias, even if, averaged over both female and male faces, it classifies correctly at a high rate. The challenge in statistical learning is what is called the bias-variance trade-off. Models with low bias tend to have high variance, so although they might misclassify male and female faces equally, they do so at the price of too much misclassification overall. Much of the fine-tuning of machine learning is about finding the optimal balance between bias and variance, which is usually considered to be when the model makes the most correct predictions overall.
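The distinction between bias and variance can be made concrete with a small numerical sketch. The snippet below (an illustrative assumption, not any particular recognition system) compares two hypothetical predictors of a single true value: one that is consistently off by the same amount (high bias, low variance), and one that is centred on the truth but noisy (low bias, high variance).

```python
import random
import statistics

random.seed(42)

# The value we are trying to predict.
true_value = 10.0

# A biased, low-variance predictor: consistently predicts about 2 above the truth.
biased_preds = [true_value + 2.0 + random.gauss(0, 0.1) for _ in range(1000)]

# An unbiased, high-variance predictor: centred on the truth but very noisy.
noisy_preds = [true_value + random.gauss(0, 3.0) for _ in range(1000)]

def bias(preds, truth):
    # Statistical bias: how far the average prediction sits from the observed value.
    return statistics.mean(preds) - truth

def variance(preds):
    # Variance: how much the predictions vary about their own average.
    return statistics.pvariance(preds)

print(f"biased predictor: bias={bias(biased_preds, true_value):.2f}, "
      f"variance={variance(biased_preds):.2f}")
print(f"noisy predictor:  bias={bias(noisy_preds, true_value):.2f}, "
      f"variance={variance(noisy_preds):.2f}")
```

Running this shows the biased predictor has a bias near 2 but a tiny variance, while the noisy predictor has a bias near 0 but a large variance, which is exactly the trade-off described above: neither property alone tells you which model makes the most correct predictions overall.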
To try to understand statistical bias in machine learning, I’m now working on a classification problem focussed on my favourite topic, microstructures. The question I’m asking is: can machine learning classify microstructures when it only ‘sees’ a subset of the image data? More on this to follow …