Since the beginnings of Artificial Intelligence, there has been a dream of creating a computerised expert, filled with our knowledge, and able to infer new conclusions of its own. These computer programs became known as expert systems, often using elaborate trees constructed of rules: “if symptom A and not symptom B then ask about C, if symptom C then diagnosis is D”. But unlocking the knowledge held within scientific journals and coded in the neurons of specialists was not so easy. In medicine, for example, medical conditions and symptoms are not either true or false. Not all people experience the same symptoms, and the appearance of some symptoms is hugely more significant than others. There can also be a large number of symptoms, leading to rules with excessive numbers of variables. So the old “decision tree” methods were often cumbersome and made bad decisions, sometimes ruling out possibilities for no other reason than data being presented in an unexpected order.
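In code, such a rule chain might look like the following minimal sketch; the symptom names and the diagnose function are hypothetical illustrations, not taken from any real expert system:

```python
# A tiny rule-based "expert system" in the GOFAI style described above.
# The symptoms A, B, C and diagnosis D are hypothetical placeholders.
def diagnose(symptoms):
    # Rule: "if symptom A and not symptom B then ask about C"
    if symptoms.get("A") and not symptoms.get("B"):
        # Rule: "if symptom C then diagnosis is D"
        if symptoms.get("C"):
            return "D"
    return "no diagnosis"

print(diagnose({"A": True, "B": False, "C": True}))  # -> D
print(diagnose({"A": True, "B": True, "C": True}))   # -> no diagnosis
```

Note how brittle this is: every symptom must be strictly true or false, and any combination the rules do not anticipate simply falls through to "no diagnosis".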
The failure of these rationalist “good old fashioned AI” (GOFAI) methods led many scientists to rethink their approach. Clearly an expert system needed to represent knowledge in some form, and clearly that knowledge needed to be combined with data to infer some form of decision. But how best to achieve these goals?
One common solution was to use fuzzy logic, where the binary true-or-false rules were replaced with linguistic variables such as “partially true” or “mostly false”. But fuzzy logic still suffered from problems: it might enable the expert system to express degrees of truth, but in fields such as medicine, a partially true diagnosis is not ideal. Instead, it would be much more useful if the system provided the probability of a diagnosis being true.
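A minimal sketch of these fuzzy degrees of truth, assuming illustrative thresholds and symptom values that are not from any real system:

```python
# Fuzzy degrees of truth: each symptom has a membership value in
# [0, 1] instead of being strictly true or false. The thresholds
# and symptom values here are illustrative assumptions.
def truth_label(degree):
    if degree >= 0.75:
        return "mostly true"
    if degree >= 0.25:
        return "partially true"
    return "mostly false"

# Combine two fuzzy truths with the standard fuzzy AND (the minimum).
runny_nose = 0.8   # symptom strongly present
fever = 0.3        # symptom weakly present
virus = min(runny_nose, fever)
print(truth_label(virus))  # -> partially true
```

The output illustrates the limitation described above: a “partially true” virus diagnosis tells us something, but not how likely each competing illness actually is.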
For example, while a runny nose might result in a “partially true” diagnosis that you have a virus in a fuzzy logic system, it would be more helpful if the system could infer that you have a certain probability of suffering from several different illnesses, some much more probable than others. The solution, as exemplified by the medical decision support system Promedas, was to use Bayesian inference rules.
Thomas Bayes was a mathematician born in London in 1702. Amongst his works, he wrote about probability. Instead of being concerned with, say, the probability of drawing a black ball from a bag of a certain number of black and white balls, Bayes was interested in the inverse probability of the event. In other words, if you had drawn more black balls than white balls from the bag, what was the probability that the bag contained more black balls, and so that you would draw another black one next? Given a hypothesis (for example that the next ball will be black) and some information about which balls have been picked previously, Bayes figured out the maths to infer the probability that the hypothesis was true or not. Several decades later, the French mathematician Pierre-Simon Laplace developed these ideas further, creating a more general version of Bayes’ theorem for use in astronomy and physics.
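To make the ball example concrete, here is a small Python sketch of Bayes’ rule, assuming just two hypothetical bag compositions and equal prior belief in each (the mixes are invented for illustration):

```python
# Bayes' rule on the ball-drawing example. Two hypotheses about the bag:
#   H1: mostly black (7 black, 3 white) -> P(black draw) = 0.7
#   H2: mostly white (3 black, 7 white) -> P(black draw) = 0.3
prior_h1, prior_h2 = 0.5, 0.5   # no idea at first which bag we have
like_h1, like_h2 = 0.7, 0.3     # P(draw black | hypothesis)

# We draw one black ball (with replacement). Update our beliefs:
evidence = like_h1 * prior_h1 + like_h2 * prior_h2
post_h1 = like_h1 * prior_h1 / evidence   # P(H1 | black drawn)
post_h2 = like_h2 * prior_h2 / evidence   # P(H2 | black drawn)

# Probability that the NEXT ball is black, given what we have seen:
next_black = like_h1 * post_h1 + like_h2 * post_h2
print(round(post_h1, 2), round(next_black, 2))  # -> 0.7 0.58
```

A single black draw shifts belief towards the mostly-black bag, and the prediction for the next draw follows directly from that updated belief.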
Amazingly, two centuries later, systems such as Promedas now use Bayesian inference to provide decision support. It’s ideal for an application such as medicine, where we have plenty of evidence that certain symptoms tend to be observed with specific illnesses.
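The idea can be sketched in a few lines of Python. The illnesses, prior probabilities, and likelihoods below are invented purely for illustration and bear no relation to Promedas’s real medical data:

```python
# Hedged sketch of Bayesian diagnosis in the spirit of a system like
# Promedas. All numbers are made-up illustrations, not medical data.
priors = {"cold": 0.20, "flu": 0.05, "allergy": 0.10}
# P(runny nose | illness), one likelihood per candidate illness:
likelihood = {"cold": 0.9, "flu": 0.8, "allergy": 0.7}

# Posterior over the candidate illnesses given a runny nose,
# normalised over just these three hypotheses for simplicity:
evidence = sum(likelihood[d] * priors[d] for d in priors)
posterior = {d: likelihood[d] * priors[d] / evidence for d in priors}
for d, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{d}: {p:.2f}")
```

Unlike the fuzzy “partially true” verdict, this gives a ranked list of illnesses, each with a probability, which is exactly the kind of output a clinician can act on.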