Yesterday, after a talk I gave on our progress in using machine learning to emulate phase separation, I was asked whether I was a convert to the machine learning approach. This led to a discussion about when and why machine learning might have a role to play in physics. This is an important question. The danger of machine learning is that it can be used to predict anything, and its black-box nature makes it difficult to interrogate the integrity of the predictions. This is why validation and cross-checking are essential, but before reaching that stage, I think there is a fairly straightforward way to evaluate whether ML may have a role.
ML works well when you have a lot of data but lack a model to describe that data, or when you don’t want to be constrained by a particular choice of model. The data might be generated by solving equations or it might be experimental. The next step is to establish whether you are trying to do interpolation or extrapolation. Extrapolation, that is, trying to predict the output when the input falls outside the range of the training inputs, should be done with considerable caution, and is probably best avoided! As I’ve written previously, unlike many other machine learning algorithms, Gaussian Processes provide not just a prediction but also a measure of the uncertainty of that prediction. This uncertainty typically becomes very large when extrapolating, which is an immediate sign that the prediction should not be taken too seriously.
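To make the point about uncertainty concrete, here is a minimal sketch of Gaussian Process regression written with NumPy only. It is not the emulator from the talk: the sine-curve training data, the helper names `rbf_kernel` and `gp_predict`, the squared-exponential kernel with unit length scale, and the small noise term are all assumptions chosen purely for illustration. It compares the predictive standard deviation at a point inside the training range with one well outside it.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP (hypothetical helper)."""
    # Covariance of the training inputs, with a small noise term for stability.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    # Cross-covariance between training and test inputs.
    K_s = rbf_kernel(x_train, x_test)
    # Prior variance at each test point (the kernel variance, here 1.0).
    prior_var = np.full(len(x_test), 1.0)

    # Solve via Cholesky factorisation rather than an explicit inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Illustrative training data: a sine curve sampled on [0, 5].
x_train = np.linspace(0.0, 5.0, 12)
y_train = np.sin(x_train)

x_inside = np.array([2.3])   # interpolation: inside the training range
x_outside = np.array([9.0])  # extrapolation: far outside the training range

mean_in, std_in = gp_predict(x_train, y_train, x_inside)
mean_out, std_out = gp_predict(x_train, y_train, x_outside)

print(f"interpolation: mean = {mean_in[0]:.3f}, std = {std_in[0]:.3f}")
print(f"extrapolation: mean = {mean_out[0]:.3f}, std = {std_out[0]:.3f}")
```

Inside the training range the standard deviation should be small, while far outside it the standard deviation returns to roughly the prior value set by the kernel variance. In practice the kernel hyperparameters would be fitted to the data rather than fixed by hand as they are here.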