Dating Back to Pavlov: New Twists in Behavioral Association Theories

The researchers conducted experiments on C. elegans, a roundworm with only about 300 neurons that offers a simple laboratory model for studying how an animal learns.

A multi-dimensional model to explain the learning process of an animal over time.

Physicists have developed a dynamic model of animal behavior that could shed light on the long-standing mysteries of associative learning, dating back to Pavlov’s famous canine experiments. The study, which was performed on the widely used laboratory organism C. elegans, was published in the Proceedings of the National Academy of Sciences (PNAS).

“We showed how learned associations are not mediated by just the strength of an association, but by multiple, nearly independent pathways — at least in the worms,” says Ilya Nemenman, an Emory professor of physics and biology whose lab led the theoretical analyses for the paper. “We expect that similar results will hold for larger animals as well, including maybe in humans.”

“Our model is dynamical and multi-dimensional,” adds William Ryu, an associate professor of physics at the Donnelly Centre at the University of Toronto, whose lab led the experimental work. “It explains why this example of associative learning is not as simple as forming a single positive memory. Instead, it’s a continuous interplay between positive and negative associations that are happening at the same time.”

The first author of the paper is Ahmed Roman, who worked on the project as an Emory graduate student and is now a postdoctoral fellow at the Broad Institute. Konstantine Palanski, a former graduate student at the University of Toronto, is also an author.

The conditioned reflex

More than 100 years ago, Ivan Pavlov discovered the “conditioned reflex” in animals through his experiments on dogs. For example, after a dog was trained to associate a sound with the subsequent arrival of food, the dog would start to salivate when it heard the sound, even before the food appeared.

About 70 years later, psychologists built on Pavlov’s insights to develop the Rescorla-Wagner model of classical conditioning. This mathematical model describes a conditioned association by its time-dependent strength. That strength increases when the animal can use the conditioned stimulus (the sound, in the case of Pavlov’s dogs) to reduce its surprise at the arrival of the unconditioned stimulus (the food).
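
As a rough illustration of that idea, the sketch below applies the standard Rescorla-Wagner update, in which the association strength V changes on each trial in proportion to the prediction error (the reward actually received minus V). The function name, the learning rate of 0.3, and the trial counts are illustrative assumptions, not values from the study.

```python
def rescorla_wagner(rewards, learning_rate=0.3, v_start=0.0):
    """Return the association strength V after each trial.

    rewards: one value per trial (1.0 = food follows the sound, 0.0 = no food).
    learning_rate: illustrative constant; combines stimulus salience terms.
    """
    v = v_start
    history = []
    for reward in rewards:
        surprise = reward - v          # prediction error on this trial
        v += learning_rate * surprise  # strength moves toward the outcome
        history.append(v)
    return history


# 20 trials in which food follows the sound, then 20 trials with no food.
acquisition = rescorla_wagner([1.0] * 20)
no_food = rescorla_wagner([0.0] * 20, v_start=acquisition[-1])

print(round(acquisition[-1], 3))  # strength close to 1 after training
print(round(no_food[-1], 3))      # strength close to 0 after food-free trials
```

In this single-variable picture, the association is summarized by one number that only moves when a trial occurs, so the strength stays wherever the last trial left it.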

Such insights helped set the stage for modern theories of reinforcement learning in animals, which in turn enabled reinforcement learning algorithms in artificial intelligence systems. But many mysteries remain, including some related to Pavlov’s original experiments.

After Pavlov trained dogs to associate the sound of a bell with food, he would repeatedly expose them to the bell without food. During the first few trials without food, the dogs continued to salivate when the bell rang. If the trials continued long enough, the dogs “unlearned” and stopped salivating in response to the bell. The association was said to be “extinguished.”

Pavlov discovered, however, that if he waited a while and then retested the dogs, they would once again salivate in response to the bell, even if no food was present. Neither Pavlov nor more recent associative-learning theories could accurately explain or mathematically model this spontaneous recovery of an extinguished association.

Teasing out the puzzle

Researchers have explored such mysteries through experiments with C. elegans. The one-millimeter roundworm has only about 1,000 cells, roughly 300 of which are neurons. That simplicity gives scientists a tractable system for testing how the animal learns. At the same time, C. elegans’ neural circuitry is just complicated enough to connect some of the insights gained from studying its behavior to more complex systems.

Earlier experiments have established that C. elegans can be trained to prefer a cooler or warmer temperature by conditioning it at a certain temperature with food. In a typical experiment, the worms are placed in a petri dish with a gradient of temperatures but no food. Those trained to prefer a cooler temperature will move to the cooler side of the dish, while the worms trained to prefer a warmer temperature go to the warmer side.

But what exactly do these results mean? Some believe that the worms crawl toward a particular temperature in expectation of food. Others argue that the worms simply become habituated to that temperature, so they prefer to hang out there even without a food reward.

The puzzle could not be resolved due to a major limitation of many of these experiments — the lengthy amount of time it takes for a worm to traverse a nine-centimeter petri dish in search of the preferred temperature.

Measuring how learning changes over time

Nemenman and Ryu sought to overcome this limitation. They wanted to develop a practical way to precisely measure the dynamics of learning, or how learning changes over time.

Ryu’s lab used a microfluidic device to shrink the experimental model of nine-centimeter petri dishes into four-millimeter droplets. The researchers could rapidly run experiments on hundreds of worms, each worm encased within its individual droplet.

“We could observe in real time how a worm moved across a linear gradient of temperatures,” Ryu says. “Instead of waiting for it to crawl for 30 minutes or an hour, we could much more quickly see which side of the droplet, the cold side or the warm side, that the worm preferred. And we could also follow how its preferences changed with time.”

Their experiments confirmed that if a worm is trained to associate food with a cooler temperature, it will move to the cooler side of the droplet. Over time, however, with no food present, this learned preference seemingly decays.

“We found that suddenly the worms wanted to spend more time on the warm side of the droplet,” Ryu says. “That’s surprising because why would the worms develop a different preference and even avoidance of the temperature they had come to associate with food?”

Eventually, the worm begins moving back and forth between the cooler and warmer temperatures.

The researchers hypothesized that the worm does not simply forget the positive memory of food associated with cooler temperatures but instead starts to negatively associate the cooler side with the absence of food. That spurs it to head for the warmer side. Then, as more time passes, it begins to form a negative association between the warmer temperature and the absence of food. Combined with the residual positive association with the cold, this makes the worm migrate back to the cooler side.

“The worm is always learning, all the time,” Ryu explains. “There is an interplay between the drive of a positive association and a negative association that causes it to start oscillating between cold and warm.”

“It’s like when you lose your keys”

Nemenman’s team developed theoretical equations to describe the interactions over time between the two independent variables — the positive, or excitatory, association that drives a worm toward one temperature and the negative, or inhibitory, association that drives it away from that temperature.
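
A toy sketch can show how two such variables might produce a preference that flips and then flips back. It is not the set of equations from the paper: the functional forms, the parameter values, and the names `net_preference` and `preferred_side` are all assumptions chosen for illustration. Here the excitatory association with the cold decays slowly, while a single lumped inhibitory drive away from the cold builds quickly during food-free testing and then fades.

```python
import math


def net_preference(t, e0=1.0, tau_e=20.0, a=2.0, tau_rise=1.0, tau_fall=5.0):
    """Net drive toward the cold side at time t (arbitrary units).

    All parameters are hypothetical. Positive values favor the cold side,
    negative values favor the warm side.
    """
    # Excitatory association with the cold: starts at e0, decays slowly.
    excitatory = e0 * math.exp(-t / tau_e)
    # Lumped inhibitory drive away from the cold: starts at zero,
    # rises on the tau_rise timescale, then fades on the tau_fall timescale.
    inhibitory = a * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))
    return excitatory - inhibitory


def preferred_side(t):
    """Side of the droplet favored at time t in this toy model."""
    return "cold" if net_preference(t) > 0 else "warm"


for t in (0, 3, 10):
    print(t, preferred_side(t))
# prints:
# 0 cold
# 3 warm
# 10 cold
```

With these made-up numbers, the measured preference depends on when the measurement is taken, even though neither underlying variable changes sign.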

“The side that the worm gravitates toward depends on when exactly you take the measurements,” Nemenman explains. “It’s like when you lose your keys you may check the desk where you usually keep them first. If you don’t see them there right away, you run around different places looking for them. If you still don’t find them, you go back to the original desk figuring you just didn’t look hard enough.”

The researchers repeated the experiments under different conditions. They trained the worms at different starting temperatures and starved them for different durations of time before testing their temperature preference, and the worms’ behaviors were correctly predicted by the equations.

They also tested their hypothesis by genetically modifying the worms, knocking out the insulin-like signaling pathway known to serve as a negative association pathway.

“We perturbed the biology in specific ways and when we ran the experiments, the worm’s behavior changed as predicted by our theoretical model,” Nemenman says. “That gives us more confidence that the model reflects the underlying biology of learning, at least in C. elegans.”

The researchers hope that others will test their model in studies of larger animals across species.

“Our model provides an alternative quantitative model of learning that is multi-dimensional,” Ryu says. “It explains results that are difficult, or in some cases impossible, for other theories of classical conditioning to explain.”

Reference: “A dynamical model of C. elegans thermal preference reveals independent excitatory and inhibitory learning pathways” by Ahmed Roman, Konstantine Palanski, Ilya Nemenman and William S. Ryu, 20 March 2023, Proceedings of the National Academy of Sciences.
DOI: 10.1073/pnas.2215191120

The study was funded by the Natural Sciences and Engineering Research Council of Canada, the Human Frontier Science Program, and the National Science Foundation.