Bennett wraps the history of intelligence around his theory of the five breakthroughs: steering (approach/avoid), learning, simulating (mental models), mentalizing (understanding your own and others' mental states), and language. He makes a compelling case, but more importantly, the framework is a great vehicle for describing the dozens of very specific skills that we have evolved over the past 550 million years, since the first bilateral creatures appeared.
Bennett comes from a commercial AI background but dropped into the neuroscience community to understand how brains solved some of the fundamental problems. That he was quickly taken in and invited into the labs of giants such as Karl Friston and Joe LeDoux indicates the depth of his knowledge and curiosity. The book tells dozens of stories of the development of key ideas by researchers in neuroscience, primatology, and AI. One example is the story of the Credit Assignment Problem, or how Pavlov's dog knows it's the bell, as opposed to some other stimulus, that matters. That problem took decades to articulate and pin down in the AI world. Without that perspective, you might not even think that it is a problem to be solved.
This book presents concepts so clearly that I gained new insights into ideas I have read about several times, for example, the granular neocortex, which is named for the granule cells in layer four. The description of Theory of Mind is fantastic.
Breakthrough #1 is steering.
Bilateral creatures could steer left or right. They developed brains to enable opposing valence signals to be integrated into a single steering decision. Dopamine and serotonin first appear to enable motivation and reward. These ancient worms learned what to approach and what to avoid. These first brains created the affective axes of valence & arousal, which yield the basic states: pleasure, pain, satiation, and stress.
p. 81 "Once animals began approaching specific things and avoiding others, the ability to tweak what was considered good and bad became a matter of life and death."
Here are two quotes from Breakthrough #2, reinforcement learning.
"Gambling and social feeds [FB/Instagram] work by hacking into our five-hundred-million-year-old preference for surprise, producing a maladaptive edge case that evolution has not had time to account for."
p. 145 "Curiosity and reinforcement learning coevolved because curiosity is a requirement for reinforcement learning to work. With the newfound ability to recognize patterns, remember places, and flexibly change behavior based on past rewards and punishments, the first vertebrates were presented with a new opportunity: for the first time, learning became, in and of itself, an extremely valuable activity. The more patterns a vertebrate recognized and the more places she remembered, the better she would survive. And the more new things she tried, the more likely she was to learn the correct contingencies between her actions and their corresponding outcomes. And so it was 500 million years ago in the tiny brain of our fish-like ancestors when curiosity first emerged."
Bennett published a paper outlining the key ideas in 2021 that was interesting, but a little dry, as technical papers will be. https://www.frontiersin.org/articles/10.3389/fnana.2021.693346/full
The book is fantastic!!
Breakthrough #1 - Reflexes drive valence responses without any learning required, making choices based on evolutionarily hard-coded rules.
Breakthrough #2 - The vertebrate basal ganglia and amygdala can then learn new behaviors based on what has historically been reinforced by these reflexes, making choices based on maximizing reward.
Breakthrough #3 - The mammalian aPFC can then learn a generative model of this model-free behavior and construct explanations, making choices based on imagined goals (e.g., drinking water). This could be considered a first-order model.
Breakthrough #4 - The primate gPFC can then learn a more abstract generative model (a second-order model) of this aPFC-driven behavior and construct explanations of intent itself, making choices based on mind states and knowledge.
p. 206 "The frontal neocortex of a human brain contains three main subregions: motor cortex, granular prefrontal cortex (gPFC), and agranular prefrontal cortex (aPFC). The words granular and agranular differentiate parts of the prefrontal cortex based on the presence of granule cells, which are found in layer four of the neocortex. In the granular prefrontal cortex, the neocortex contains the normal six layers of neurons. However, in the agranular prefrontal cortex, the fourth layer of neocortex (where granule cells are found) is weirdly missing. Thus, the parts of the prefrontal cortex that are missing layer four are called the agranular prefrontal cortex (aPFC), and the parts of the prefrontal cortex that contain a layer four are called the granular prefrontal cortex (gPFC). It is still unknown why, exactly, some areas of frontal cortex are missing an entire layer of neocortex, but we will explore some possibilities in the coming chapters."
p. 289 "There are three broad abilities that seem to have emerged in early primates:
- Theory of mind: inferring intent and knowledge of others
- Imitation learning: acquiring novel skills through observation
- Anticipating future needs: taking an action now to satisfy a want in the future, even though I do not want it now
These may not, in fact, have been separate abilities but rather emergent properties of a single new breakthrough: the construction of a generative model of one’s own mind, a trick that can be called “mentalizing.” We see this in the fact that these abilities emerge from shared neural structures (such as the gPFC) that evolved first in early primates. We see this in the fact that children seem to acquire these abilities at similar developmental times..."
A few more random quotes. This book deserves better treatment. However, since I listened to it, I was not able to mark things in the margins. sigh.
p. 106 "Sutton decomposed reinforcement learning into two separate components: an actor and a critic. The critic predicts the likelihood of winning at every moment during the game; it predicts which board configurations are great and which are bad. The actor, on the other hand, chooses what action to take and gets rewarded not at the end of the game but whenever the critic thinks that the actor’s move increased the likelihood of winning. The signal on which the actor learns is not rewards, per se, but the temporal difference in the predicted reward from one moment in time to the next. Hence Sutton’s name for his method: temporal difference learning."
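To make the actor/critic split in that quote concrete for myself, here is a minimal Python sketch. It is not from the book and not Sutton's original setup: the toy "corridor" game (walk left or right, win only by reaching the right end), the function names, and the learning rates are all my own assumptions. The point is only that the actor learns from the temporal difference in the critic's predictions, not from the final reward directly.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(n_states=7, episodes=2000, alpha=0.1, beta=0.1,
                       gamma=0.95, seed=0):
    """Toy corridor (an assumed example): both ends are terminal,
    and only reaching the right end pays a reward of 1."""
    rng = random.Random(seed)
    values = [0.0] * n_states                      # critic: predicted reward per state
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actor: [left, right] preferences

    for _ in range(episodes):
        state = n_states // 2                      # start in the middle
        while 0 < state < n_states - 1:
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state = state - 1 if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            terminal = next_state in (0, n_states - 1)

            # Temporal difference: how much did the critic's prediction
            # change from this moment to the next?
            target = reward + (0.0 if terminal else gamma * values[next_state])
            td_error = target - values[state]

            # The critic nudges its prediction toward the target.
            values[state] += alpha * td_error

            # The actor is "rewarded" by the TD error, not the raw reward.
            for a in (0, 1):
                chosen = 1.0 if a == action else 0.0
                prefs[state][a] += beta * td_error * (chosen - probs[a])

            state = next_state
    return values, prefs


if __name__ == "__main__":
    values, prefs = train_actor_critic()
    for s in range(1, 6):
        print(s, round(values[s], 2), round(softmax(prefs[s])[1], 2))
```

Running it prints, for each interior state, the critic's prediction and the actor's probability of stepping right. Moves that raise the critic's prediction get reinforced long before the game ends, which is how the credit assignment gets spread back over earlier moves.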
p. 135 "The same visual object can activate different patterns depending on its rotation, distance, or location in your visual field. This creates what is called the invariance problem: how to recognize a pattern as the same despite large variances in its inputs."
p. 200 "The second reason model-based reinforcement learning is hard is that choosing what to simulate is hard. In the same paper that Marvin Minsky identified the temporal credit assignment problem as an impediment to artificial intelligence, he also identified what he called the “search problem”: In most real-world situations, it is impossible to search through all possible options."
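A quick back-of-the-envelope sketch of why the search problem bites. The numbers are my own illustrative assumptions, not from the book: if there are a fixed number of options at each step, the count of distinct action sequences grows exponentially with how far ahead you simulate.

```python
def count_action_sequences(options_per_step, steps_ahead):
    """Number of distinct plans when every step offers the same number of options."""
    return options_per_step ** steps_ahead


if __name__ == "__main__":
    # Assumed example: 10 options at every step.
    for steps in (1, 5, 10, 20):
        print(steps, count_action_sequences(10, steps))
```

Even at a modest 10 options per step, looking 20 steps ahead means 10^20 candidate plans, so a brain (or an AI) has to be choosy about what it bothers to simulate.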
2023-11-26 YON <> jch.com/notes/BennettIntelligence.html