The ‘aha’ moment

I dived into machine learning by buying Kevin Murphy’s excellent book. It is comprehensive, and therefore daunting, but I was given a huge boost of confidence to continue when I realised that I, like most other scientists, and indeed anyone else who has taken high school science, was already familiar with a rudimentary form of machine learning. For me, at least, it turned out to be a wonderful starting point for turning on the lights inside the black box. Remember

y = mx + c,

the equation that fits a straight line through some data? We measure some property, y, as we vary some other quantity, x, and attempt to describe the relationship using m and c. So how does this help me to start demystifying AI? Before computers became so commonplace, we were taught at school how to find the line of best fit by eye, which is fine for rough estimates but not good enough if you are planning to write a research article. These days we use computer software to do the work for us. It is easier and much more reliable, but it also means I’ve stopped thinking about the math that lies behind finding the line of best fit. For such a simple equation, the math turns out to be quite sophisticated, as much in terms of concepts as in the calculations themselves. Reading about it also reminded me that finding a line of best fit requires us to make assumptions about the nature of our measurements. The most common assumption is that the scatter we see in our data, whatever its causes, follows a Gaussian distribution. This tends to work well for most scientific data, but other reasonable assumptions exist.
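To make the hidden math concrete, here is a minimal sketch of what the fitting software does under the hood: the closed-form least-squares formulas for m and c, which are what the Gaussian-scatter assumption leads to. The function name and the example numbers are my own, for illustration only.

```python
# A minimal sketch of least-squares fitting of y = m*x + c,
# i.e. the math the software performs when we ask for a line of
# best fit, assuming Gaussian scatter in the measurements y.
def fit_line(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # m minimises the sum of squared vertical distances from the line
    m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
        sum((x - x_mean) ** 2 for x in xs)
    # the best-fit line always passes through the mean point
    c = y_mean - m * x_mean
    return m, c

# Hypothetical measurements, roughly y = 2x + 1 with some scatter
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
m, c = fit_line(xs, ys)
```

A handful of lines of high school statistics, yet this is exactly the ‘learning’ step: the data go in, and estimates of m and c come out.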

So how does this amount to an ‘aha’ moment? I’ve used this equation, and more complicated variants of it, on many occasions with little thought for the algorithm the software uses. But the point is that there is an algorithm, and it is straightforward enough for anyone who has studied high school statistics to work through the math. I can know what is going on inside the black box, even if most of the time I’m not particularly interested. In other words, I’ve already used a rudimentary form of machine learning! What is more, I’ve happily been doing so for years in complete ignorance. The machine takes my data and churns out a description of how it thinks future data might behave. The best guess for y if I take another measurement at a different point x? It simply uses the m and the c that it learnt from the previous data. If I like, I can also use the new data to improve my guesses for m and c.

With hindsight this all seems rather simple and obvious, but it turns out to provide a great way into understanding the foundations of machine learning and this particular form of artificial intelligence.

First steps …

This blog documents my attempts to make sense of artificial intelligence and how I can use it to do better the science I love doing. I’m hoping others who also want to make sense of this technology might be curious to follow along as I try to find my way through.

As a teenager in the 1980s I was fascinated by the excitement and promises of artificial intelligence. Concepts such as the lambda calculus and programming languages such as LISP captivated me, but at the time I didn’t have the skills to do any more than look on from a distance. Back then the claims for what AI would achieve were lofty, with the darkest implications perhaps best imagined in the sci-fi classic, The Terminator.

AI slowly drifted from my consciousness: I was too busy juggling studying for a degree, and then a PhD, in physics with the pursuit of rock and roll fame and fortune. Happily, one of those worked out, and I’ve now been teaching and researching in theoretical physics for over 25 years.

Over the past year, I’ve almost by accident started to think about artificial intelligence, or more precisely machine learning, and how I might use it to help solve some of the riddles that I’m working on. One use of AI in the physical sciences that has really taken off is the ‘needle in a haystack’ problem. Advances in both hardware and software have enabled modern scientific instruments to collect amounts of data far too vast to sift through by hand, even for the most dedicated PhD student. Teaching machines what kind of blips in the data we are looking for, and then letting them loose on more data to find similar blips, is what AI is really good at. This is the black box magic of AI: we don’t care how the AI finds the blips, we just want to know where they are. For many physicists, it is this black box nature that makes us suspicious and, I believe, has limited the uptake of AI beyond looking for needles.

This blog follows from an ‘aha’ moment I had recently. In this post I share my thoughts on why I started to renew my interest in AI.