Nobel Prize in Physics 2024: Machine Learning and Artificial Neural Networks

The prize will be awarded to researchers from the USA and Canada for the development of artificial neural networks.

The Nobel Prize in Physics will be awarded this year to two researchers from the United States and Canada for the development of tools in machine learning and artificial neural networks. John Joseph Hopfield from Princeton University and Geoffrey Everest Hinton from the University of Toronto will receive the prize for developing computational tools that mimic the workings of the nervous system.

Pioneers in the field of artificial intelligence. Hinton (right) and Hopfield.

Imitating the Human Brain

The first modern computer, described by John von Neumann in 1945, was inspired by two ideas. The first was the concept of an imaginary universal machine, a system capable of performing any computable task, introduced in 1936 by Alan Turing, who is often considered the father of computers and artificial intelligence. The second was presented in a 1943 paper by neuroscientist Warren McCulloch and logician Walter Pitts, who proposed a mathematical model of how neurons in the human brain function, representing neurons as logic gates. By connecting these gates in various configurations, it becomes possible to compute virtually anything and even to create systems with memory.
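To make the idea of neurons as logic gates concrete, here is a minimal Python sketch of a threshold unit in the spirit of the McCulloch-Pitts model. The function names, weights, and thresholds are illustrative assumptions, and the negative weight used for inhibition is a common simplification rather than the exact formulation of the 1943 paper.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def gate_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return mp_neuron([a, b], [1, 1], threshold=2)


def gate_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return mp_neuron([a, b], [1, 1], threshold=1)


def gate_not(a):
    # An inhibitory (negative) weight with threshold 0 inverts the input.
    return mp_neuron([a], [-1], threshold=0)


def gate_xor(a, b):
    # Connecting gates yields richer functions: (a OR b) AND NOT (a AND b).
    return gate_and(gate_or(a, b), gate_not(gate_and(a, b)))
```

Wiring the three basic gates together, as in `gate_xor`, is the sense in which connected units can compute more complex functions.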

The path toward creating such memory systems and ultimately developing artificial intelligence was paved by the ideas of psychologist Donald Hebb. In 1949, Hebb proposed a theory explaining how learning occurs in the brain. According to his hypothesis, basic learning occurs by way of changes in the strength of connections between neurons. That is, when neurons frequently activate one another, the connection between them strengthens, while connections between neurons that do not work together weaken.
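Hebb's hypothesis is often summarized as a simple weight-update rule. The Python sketch below is one common textbook form, not Hebb's own formulation: neuron activity is coded as +1 (active) or -1 (inactive), and `learning_rate` is an assumed parameter. One side effect of this coding is that two neurons that are inactive together also have their connection strengthened.

```python
def hebbian_update(weights, activity, learning_rate=0.1):
    """Return a new weight matrix after one Hebbian step.

    activity holds +1 (active) or -1 (inactive) for each neuron.
    Pairs in the same state get a stronger connection; pairs in
    opposite states get a weaker one.
    """
    n = len(activity)
    updated = [row[:] for row in weights]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections
                updated[i][j] += learning_rate * activity[i] * activity[j]
    return updated


# Neurons 0 and 1 repeatedly fire together while neuron 2 stays silent.
weights = [[0.0] * 3 for _ in range(3)]
for _ in range(5):
    weights = hebbian_update(weights, [1, 1, -1])
```

After the loop, the connection between neurons 0 and 1 has grown, while the connections from either of them to neuron 2 have become negative.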

Basic learning in the brain occurs through changes in the strength of connections between neurons, and the same process takes place in an artificial neural network. | From the Nobel Prize website.

These foundational ideas led to attempts to develop artificial neural networks that mimic the brain's neural architecture. These networks simulate the brain's characteristics and learning processes using artificial neurons: software-simulated nodes that hold different values and are interconnected by "weights" representing the strength of the connections in the network. As with the connections between biological neurons, these weights can change in response to external stimuli, enabling the system to learn. Traditional programming provides explicit instructions for solving a task. Artificial neural networks, by contrast, learn from examples, so they can be trained to perform complex or ill-defined tasks that are hard to describe with clear rules, such as facial recognition, detecting cancerous tumors, or distinguishing between similar breeds of cats, such as a Persian and an Angora.

Consider the challenge of defining what a house cat is. If we tried to create a list of rules to define a house cat, it would prove to be a complex and nearly endless task. For instance, a rule that states, "a cat is a furry animal with four legs," might also describe rats or aardvarks. What about cats that have lost a leg, hairless cats, or those standing on two legs, like the Cat in the Hat? And how exactly do we define "animal"?

Instead, we can train a neural network to recognize a cat. This involves feeding the network many pre-classified images labeled as "cat" or "not cat." After initial processing, the data is passed to other parts of the network, which process it further, abstracting different features to identify common characteristics of cats and distinguish them from non-cats. During training, the weights between the "neurons" are adjusted each time the network incorrectly classifies a known image, until the network learns to reliably classify the training images by identifying patterns and features shared by "cat" images. With successful training, the network can then recognize cats in new, unseen images. This training process, which relies on examples and guidance, is known as supervised learning.
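The core loop of supervised learning, adjusting weights only when a labeled example is misclassified, can be sketched with a single artificial neuron (a perceptron). This is a deliberately tiny stand-in for an image classifier: the two numeric features per example, the toy data, and all names below are hypothetical, and real image networks use many layers and a different training method.

```python
def predict(weights, bias, features):
    """Return 1 ("cat") if the weighted sum is positive, else 0 ("not cat")."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=100, learning_rate=0.1):
    """Adjust weights each time a labeled example is misclassified."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


# Hypothetical toy data: (whisker_score, ear_pointiness) -> 1 "cat", 0 "not cat".
training_set = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
]
weights, bias = train_perceptron(training_set)
```

Once training stops, `predict(weights, bias, features)` can be applied to feature pairs that were not in the training set, which mirrors the step of recognizing cats in unseen images.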

Initial attempts to develop such networks began as early as the 1960s but were largely abandoned due to limited success. Interest in these networks was revived in the 1980s, driven by several groundbreaking works, including those by Hopfield and Hinton.

How do you define a cat? Cats that don't necessarily fit the same definition | Shutterstock, Asichka.

Complex Systems

John Hopfield was born in 1933 in Chicago to physicist parents. By the age of 25, he had earned a PhD in physics from Cornell University and began working at Bell Laboratories. He later became a researcher at Princeton University, and, in 1980, was appointed professor of chemistry and biology at the California Institute of Technology (Caltech). There, he sought to bridge the gap between physics and biological systems, focusing on the development of a computerized neural network.

Hopfield drew inspiration from his work on systems composed of many particles, particularly magnetic systems where neighboring components influence one another, creating long-range effects across the entire system. Unlike conventional artificial neural networks, where artificial neurons are arranged in layered sequences, Hopfield designed a network in which all neurons are interconnected. He introduced the concept of energy into the system, similar to magnetic systems, calculated using the neurons' values and the strengths of their connections. An image is stored by adjusting the connections between the neurons so that the image corresponds to a minimum energy state. When another image is later introduced, for example a distorted or partial version of a stored one, the values of the neurons are updated step by step to lower the energy until the network settles into a minimum. This process allows the network to "remember" and reproduce the images it was trained on.

The network he developed in 1982, now known as the "Hopfield network," operates similarly to our associative memory, as when we try to recall a particular word by repeating its sound and searching for similar words. The Hopfield network was a system capable of retrieving specific pieces of information, such as a word or an image, based on similar elements.
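The following Python sketch illustrates this kind of associative recall on a very small scale. It assumes the common textbook formulation: neuron values of +1 or -1, weights set by a Hebbian rule, and an energy of -1/2 times the sum of w_ij * s_i * s_j over all pairs. The two 8-value patterns stand in for images and are purely illustrative.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix from +1/-1 patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def energy(weights, state):
    """Energy of a state: lower values correspond to stored memories."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until no value changes."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored_a = [1, 1, 1, 1, -1, -1, -1, -1]
stored_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored_a, stored_b])

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # stored_a with its first value flipped
restored = recall(weights, noisy)
```

In this example the corrupted input has a higher energy than the stored pattern, and the update loop moves it back to `stored_a`, which is the sense in which the network retrieves a memory from a similar input.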

In 1982, John Hopfield developed the Hopfield Network. Illustration of an artificial neural network | © Johan Jarnestad/The Royal Swedish Academy of Sciences.

Multilayer Network

Geoffrey Hinton was born in 1947 in London, England. He studied experimental psychology at the University of Cambridge and earned a PhD in artificial intelligence from the University of Edinburgh in Scotland in 1978. After facing difficulties securing funding for his research in the UK, Hinton moved to the United States and later to the University of Toronto in Canada.

Hinton built upon Hopfield’s network and upon concepts from statistical physics, which describes the general behavior of systems composed of many particles. For instance, in a gas made up of numerous molecules, statistical physics can predict the probability that a particular molecule will have a certain speed, based on factors such as temperature, pressure, and volume of the gas. Different types of distributions describe such systems, with the Boltzmann distribution being one of the most widely known. The network developed by Hinton relies on the Boltzmann distribution and is accordingly named the "Boltzmann Machine." The Boltzmann Machine is an early example of generative artificial intelligence (GenAI). While it is inefficient and requires lengthy computation times, it laid the foundation for today's advanced image and text generation models.
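As a small illustration of the Boltzmann distribution, the Python sketch below assigns each state a probability proportional to exp(-E / T), where E is the state's energy and T the temperature. It assumes units in which Boltzmann's constant equals 1, and the three energy values are arbitrary examples.

```python
import math


def boltzmann_probabilities(energies, temperature=1.0):
    """Return P(state) proportional to exp(-E / T) for each listed energy."""
    lowest = min(energies)  # shift energies for numerical stability
    factors = [math.exp(-(e - lowest) / temperature) for e in energies]
    total = sum(factors)
    return [f / total for f in factors]


energies = [0.0, 1.0, 2.0]
cold = boltzmann_probabilities(energies, temperature=0.5)
hot = boltzmann_probabilities(energies, temperature=5.0)
```

Low-energy states are always the most likely, but at a low temperature they dominate far more strongly than at a high temperature, where the probabilities become nearly equal.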

In such a network, the neurons are divided into two main layers: the visible layer, where data is fed into the network, and the hidden layer, which is connected to the visible layer and processes the data. As the network trains on a set of images, the weights are updated iteratively. After a sufficient number of iterations, the network reaches a state in which its overall behavior stabilizes, even as the values of individual neurons continue to change. From this state, the network can generate new images that resemble those it was trained on. Hinton is also known for back-propagation, a method for training multilayer networks that adjusts the weights by calculating errors and propagating them backward through the layers, which he and his colleagues presented in a seminal 1986 paper.
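The two-layer structure can be sketched in Python as follows. This example assumes the restricted variant of the Boltzmann machine, in which there are no connections within a layer, and it shows only how neuron values are sampled, not how the weights are learned. The weights and biases are hand-picked for illustration rather than trained, and all names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_layer(inputs, weights, biases, rng):
    """Sample binary units of one layer given the other layer.

    weights[i][j] links input unit i to output unit j.
    """
    outputs = []
    for j, b in enumerate(biases):
        activation = b + sum(x * weights[i][j] for i, x in enumerate(inputs))
        # Each unit turns on with a probability set by its total input.
        outputs.append(1 if rng.random() < sigmoid(activation) else 0)
    return outputs


def gibbs_step(visible, weights, visible_bias, hidden_bias, rng):
    """One visible -> hidden -> visible sampling pass."""
    hidden = sample_layer(visible, weights, hidden_bias, rng)
    transposed = [list(column) for column in zip(*weights)]
    new_visible = sample_layer(hidden, transposed, visible_bias, rng)
    return new_visible, hidden


rng = random.Random(0)  # fixed seed so the run is repeatable
weights = [[2.0, -2.0], [2.0, -2.0], [-2.0, 2.0], [-2.0, 2.0]]  # 4 visible x 2 hidden
visible_bias = [0.0, 0.0, 0.0, 0.0]
hidden_bias = [0.0, 0.0]

visible = [1, 1, 0, 0]
for _ in range(5):
    visible, hidden = gibbs_step(visible, weights, visible_bias, hidden_bias, rng)
```

Because each unit is switched on or off at random according to a probability, individual values keep fluctuating from pass to pass even when the overall statistics of the network have settled.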

Hinton’s research continued to shape the fields of artificial intelligence and deep learning. Among other things, he worked with Google for several years before leaving the company in 2023, expressing a desire for the freedom to speak openly about the potential risks posed by advances in artificial intelligence.

Throughout their careers, both Hopfield and Hinton have received numerous awards and honors for their work. Hinton was elected to the Royal Society in 1998, received the BBVA Foundation Frontiers of Knowledge Award in 2016, and received the Turing Award, the most prestigious award in computer science, in 2018. Hopfield was elected to the U.S. National Academy of Sciences in 1973 and was awarded the Albert Einstein World Award of Science in 2005 and the Boltzmann Medal in 2022, among other honors.

In the Boltzmann Machine, the neurons are divided into two groups: the visible layer and the hidden layer. Illustration of Boltzmann Machines (right and center) compared to Hopfield Networks | Nobel Prize website.

From Genetics to Driving

This year’s Nobel Prize in Physics is yet another example of the broad applicability of physics, which repeatedly demonstrates its practical relevance across various fields, even those that may initially seem unrelated to "pure" physics. The researchers' breakthrough was inspired by the principles and mindset of statistical physics, which describes systems composed of numerous components. Hopfield discovered that, much like complex physical systems, a network of biological neurons, or of artificial ones built from electrical circuits, can develop unexpected collective properties that enable it to perform computations.

Although the research recognized by this year's Nobel Prize in Physics did not focus on "pure" physics, machine learning tools and artificial neural networks are now used across many scientific disciplines, including physics itself. They are used to analyze vast amounts of data from particle physics experiments, astronomical observations, and other data-intensive fields such as genetics. These systems are becoming increasingly valuable in applications ranging from medical diagnostics and autonomous driving to customer service, and their significance in our daily lives is expected to keep growing in the coming years.

Prize Week

Yesterday it was announced that the Nobel Prize in Physiology or Medicine will be awarded this year to Gary Ruvkun and Victor Ambros from the United States for their discovery of microRNA molecules and their role in gene expression regulation. On Wednesday, the laureates of the Nobel Prize in Chemistry will be announced, followed by the Nobel Prize in Literature on Thursday. On Friday, the laureates of the Nobel Peace Prize will be announced by the Oslo committee. The week of announcements will conclude on Monday with the laureates of the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel.