The ARC of Intelligence
It is not unreasonable to suppose that, from a human perspective, abstract reasoning and conceptualization are the distinguishing features of great Intelligence. This belief, particularly in symbolic reasoning as a primary feature of intelligence, influenced early AI researchers such as Herbert Simon. Ultimately, those early efforts proved brittle and unscalable. While current Large Language Models (LLMs) display capabilities that suggest reasoning, their reasoning is not as robust as that of humans. Hence, I concur with Prof. Melanie Mitchell, who argues in her blog post that forming abstractions and concepts is one of the most important challenges in AI. Francois Chollet's Abstraction and Reasoning Corpus (ARC) is a step towards addressing this challenge. An AI system's success on ARC tasks could intensify discussions about AGI.
What is AGI?
Ben Goertzel, notable for popularizing the term in his book Artificial General Intelligence, set the stage for widespread discussion of the topic. With the release of ChatGPT, AGI became a very popular and widely discussed subject, especially as the likes of Eliezer Yudkowsky dramatically raised concerns of AI doom. In this post, I’d like to explore the concept of Artificial General Intelligence (AGI), which is the stated goal of companies like OpenAI and DeepMind. In a recent interview on The Dwarkesh Podcast, Shane Legg suggested that AGI can be viewed as an AI system which matches human performance on a broad category of tasks in as many domains as possible. He goes on to state that if we find it difficult to create a task on which the AI fails, we can assume we possess an AGI.
DeepMind paper on Operationalizing AGI progress
At the moment, there is no universally accepted, unambiguous definition of AGI. In an effort to clarify thinking around the subject, DeepMind released a paper titled “Levels of AGI: Operationalizing Progress on the Path to AGI”. In the paper, the authors introduce six principles they believe contribute to a clear definition of AGI. The principles are:
Focus on Capabilities and not processes: The definition should emphasize what AGI can achieve, rather than the mechanisms behind its functionality.
Focus on Generality and Performance: Both the breadth of tasks (generality) and the effectiveness in performing these tasks (performance) are essential components of AGI.
Focus on cognitive and metacognitive (rather than physical) tasks: The authors suggest that the ability to complete physical tasks may increase generality but should not be considered a necessary prerequisite.
Focus on Potential and not deployment: AGI should be defined by its potential capabilities rather than its practical deployment in real-world applications. Focusing on deployment introduces legal and social hurdles.
Focus on Ecological validity for benchmarking tasks: Emphasis should be placed on tasks that humans value, where value includes economic, social, and other forms of worth people recognize.
Focus on the path towards AGI rather than a single endpoint: The authors propose a level-based approach for evaluating AGI. We should recognize the progression towards AGI rather than focusing solely on the final goal.
In addition, they identify Performance and Generality as the two characteristics core to AGI. A screenshot from the paper stating this is shown below.
The dimensions of performance and generality are used to introduce a levels-of-AGI classification which ranges from Level 0: No AI to Level 5: Superhuman.
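To make the classification concrete, here is a minimal Python sketch of the performance axis. The intermediate level names and descriptions are my paraphrase of the paper and should be checked against the original.

```python
from enum import Enum

class AGILevel(Enum):
    """Performance levels roughly following DeepMind's 'Levels of AGI' paper.

    Intermediate level names/descriptions are my paraphrase and may not
    match the paper exactly.
    """
    NO_AI = 0        # e.g. a calculator: no learned capability
    EMERGING = 1     # comparable to an unskilled human
    COMPETENT = 2    # at least median skilled-human performance
    EXPERT = 3       # roughly 90th percentile of skilled humans
    VIRTUOSO = 4     # roughly 99th percentile of skilled humans
    SUPERHUMAN = 5   # outperforms all humans

# The paper crosses this axis with a generality axis (Narrow vs. General),
# so a system is described by a (performance level, generality) pair.
```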
My thoughts and opinions
DeepMind's paper has influenced my understanding of AI systems, for which I am grateful. However, my perspective slightly diverges from their task-oriented approach to performance and generality.
I choose to disregard performance and focus solely on generality. Rather than viewing generality as a simple yes-or-no attribute, I see it as existing on a continuum. This perspective allows us to understand generality through various lenses: the diversity of domains it covers, the range of problems across those domains, and the mechanisms employed. Therefore, I define generality in AGI as the breadth of mechanisms an agent utilizes to effectively address challenges across a wide spectrum of domains.
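As a toy illustration of generality-as-a-continuum, the sketch below scores an agent by the breadth of (domain, problem, mechanism) triples it covers. The scoring rule is purely illustrative and not drawn from any paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    domain: str      # e.g. "sensorimotor" or "strategic"
    problem: str     # e.g. "navigation", "credit assignment"
    mechanism: str   # e.g. "TD learning", "look-ahead search"

def generality_score(capabilities: set[Capability]) -> int:
    """Illustrative measure: generality grows with the number of distinct
    domains, problems, and mechanisms an agent can bring to bear."""
    domains = {c.domain for c in capabilities}
    problems = {c.problem for c in capabilities}
    mechanisms = {c.mechanism for c in capabilities}
    return len(domains) + len(problems) + len(mechanisms)
```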
Domains
On this aspect of Generality, I have been heavily influenced by Christopher Summerfield’s book, Natural General Intelligence. I highly recommend it to anyone seeking a better understanding of the issues, thinking, and research directions in AI. Here is a quote from the book:
So here lies the major problem for AI research - the natural world is complex in terms of both sensory signals and the tasks we encounter. Successfully negotiating our environment thus jointly requires both sensorimotor coordination (to solve problems with high input complexity, such as Atari) and careful planning and foresight (to solve problems with fiendish strategic complexity like Go).
I therefore view an AGI system as having to operate in two domains: the sensorimotor domain and the strategic domain. While giant strides have been made in problem solving, search, and symbolic reasoning, the challenges of the sensorimotor domain remain open, as summed up in Moravec’s paradox. I consider the sensorimotor domain necessary for an agent to acquire a deep understanding of our human experience. More on this later.
Problems and Mechanisms
To help flesh out the approach I outlined above, I found Max Bennett’s A Brief History of Intelligence and Daniel Dennett’s From Bacteria to Bach and Back very insightful. Max Bennett’s book is particularly good, as he chronicles a sequence of breakthroughs that led to the development of human intelligence. I view each breakthrough as highlighting problems animals encountered in the world. I’ll quickly summarize the breakthroughs and the problems I think they solved:
Breakthrough 1: Steering - The development of a bilateral body design to reduce navigational choices, neural architectures which encoded stimuli as good or bad to solve the problem of survival, and associative learning to change steering based on past experience, leading to better decisions and increased chances of survival.
Breakthrough 2: Reinforcing - The development of dopamine as a Temporal Difference (TD) learning signal, the basal ganglia as an actor-critic system enabling natural agents to better solve the credit assignment problem, and the hippocampus and perception of 3D space for better navigation (a minimal TD-learning sketch appears after this list).
Breakthrough 3: Simulating - The development of the neocortex and the agranular prefrontal cortex (aPFC), which helped animals simulate the world and engage in counterfactual learning. This was a more advanced solution to the credit assignment problem. An artificial equivalent of this solution is AlphaZero, which elegantly combines search and learning.
Breakthrough 4: Mentalizing - Abilities such as theory of mind and imitation learning. For agents in social groups, these abilities enable them to better navigate interactions and adapt, ultimately boosting their chances of survival.
Breakthrough 5: Speaking - The development of language, which boosts communication between agents and provides more solutions to problems encountered in social groups, such as coordination.
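Since breakthrough 2 centers on temporal-difference learning, here is a minimal TD(0) value-estimation sketch in Python. The toy chain environment, state names, and hyperparameters are made up purely for illustration.

```python
def td0_value_estimation(episodes, states, alpha=0.1, gamma=0.9):
    """Minimal TD(0) prediction: update V(s) from one-step transitions.

    `episodes` is a list of trajectories, each a list of
    (state, reward, next_state) tuples. Everything here is illustrative.
    """
    V = {s: 0.0 for s in states}
    for trajectory in episodes:
        for state, reward, next_state in trajectory:
            # TD error: how much better or worse the outcome was than expected.
            td_error = reward + gamma * V[next_state] - V[state]
            V[state] += alpha * td_error
    return V

# Toy usage: a three-state chain A -> B -> C with a reward on reaching C.
states = ["A", "B", "C"]
episodes = [[("A", 0.0, "B"), ("B", 1.0, "C")] for _ in range(50)]
print(td0_value_estimation(episodes, states))
```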
Moving on to Daniel Dennett’s book: he makes the point of competence without comprehension and argues that comprehension of the sort humans have may emerge from a compounding or composition of competencies. Furthermore, he offers a classification of natural agents based on competencies, which roughly maps to Max Bennett’s breakthroughs.
Darwinian creatures: Hard-coded behaviors, not very skilled learners
Skinnerian creatures: Better at adjusting behaviour based on Reinforcement.
Popperian creatures: Extract information from the world and generate hypothetical behaviors offline. They look before leaping. AlphaZero does this in its look-ahead search.
Gregorian creatures: These agents have “thinking tools”. The creatures in this category have abilities which include abstract conceptualization and reasoning. Humans belong in this group.
This leads to the following table, which provides a rough mapping between the competencies/breakthroughs and Daniel Dennett’s classification of creatures.
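As a rough sketch of that mapping in code: the pairing below is my own reading of the two lists above, not a quote from either book, and the exact pairing in the original table may differ.

```python
# My rough reconstruction of the breakthrough-to-creature mapping;
# the original table may pair these differently.
breakthrough_to_creature = {
    "Steering":    "Darwinian",   # hard-coded approach/avoid behaviours
    "Reinforcing": "Skinnerian",  # behaviour adjusted by reinforcement
    "Simulating":  "Popperian",   # offline hypotheticals, look-ahead
    "Mentalizing": "Gregorian",   # theory of mind as a thinking tool
    "Speaking":    "Gregorian",   # language as the paradigm thinking tool
}
```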
I believe a refined, detailed, and well-parameterized version of the above table may present a more satisfactory approach to formalizing AGI development.
Drawing inspiration from the Feynman quote “What I cannot create, I do not understand”, I outline below a methodology for building an AI system. Through building, we may gain a more nuanced understanding of the nature and capabilities of AI/AGI systems.
A focus on the problems Intelligence solves. The first two principles in the DeepMind paper suggest this, but it needs to be explicitly stated. To help identify these problems, we can take inspiration from the natural world, which gives us an existence proof of Intelligence. I feel this point needs emphasis: when I say take inspiration from biology, I don’t mean in terms of “how” biology achieves intelligence. The best reference to buttress this point is David Marr’s three levels of analysis, which distinguish a computational level (problem identification), an algorithmic level, and a physical implementation level. We should take inspiration from nature/biology at the computational level.
Next, we should create the “primitives/mechanisms” that solve a subset of the problems identified. This is harder, but it becomes more tractable once the problem has been clearly stated. For instance, a cryptographic system needs to provide confidentiality, i.e. solve the problem of eavesdroppers. To meet this requirement, we have encryption as a mechanism (see the sketch after this list).
Provide algorithmic implementations of each mechanism.
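Here is a toy sketch of that three-step decomposition in Python, using the cryptography example above. The XOR cipher stands in for “an algorithmic implementation of encryption” only; it is not secure and the class/field names are my own.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Requirement:
    problem: str              # what must be solved (computational level)
    mechanism: str            # the primitive that addresses it
    implementation: Callable  # an algorithm realizing the mechanism

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy 'encryption' used only to illustrate step 3; not secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

confidentiality = Requirement(
    problem="eavesdroppers can read messages",
    mechanism="encryption",
    implementation=xor_cipher,
)

# The same decomposition applied to intelligence: identify the problem
# (e.g. credit assignment), name the mechanism (e.g. TD learning),
# then supply an algorithmic implementation.
```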
With these steps, we can then delineate the features and functions of an AI system, forming the basis of our definition. This leads me to tentatively define AGI as an agent with the fundamental set of capabilities for generating concepts/thinking tools geared towards solving problems across the complex sensorimotor and strategic domains.
Image Schemas, Abstractions, Concepts and AGI Architecture
There is a tight relationship between the notions of image schemas, affordances, and abstractions/concepts which demands clarification.
The works of cognitive scientists Mark Johnson and George Lakoff on Image Schemas highlight the importance of embodiment in understanding our cognitive processes. They propose that our abilities in abstract reasoning and conceptualization are deeply rooted in the inferential structures developed from our sensorimotor experiences. Johnson and Lakoff refer to these inferential structures as Image Schemas. This concept is closely linked to the idea of 'affordances,' which refers to the opportunities for action that objects present to us based on their properties and our capabilities. In simpler terms, affordances are what the environment offers the individual, guiding our interactions with the world around us.
Expanding on this, 'concepts' are the building blocks of our thoughts and knowledge. I couldn’t have articulated the notion of ‘concepts’ any better than Melanie Mitchell here:
Consider, for example, the “simple” concept on top of. In its most concrete definition, an object or location being “on top of” another object or location refers to a spatial configuration (“the cat is on top of the TV”) but the concept can be abstracted in any number of ways: being on top of a social hierarchy, on top of one’s game, on top of the world (i.e., extremely happy), staying on top of one’s work, singing at the top of one’s voice, being born at the top of a decade, and so on.
Forming and abstracting concepts is at the heart of human intelligence. These abilities enable humans to understand and create internal models of the world—often involving physical knowledge or experience, such as “something on top of something else”—and to use these models to make sense of new information, often via analogy, and to decide how to behave in novel situations. I particularly like the definition of “concept” given by the cognitive psychologist Lawrence Barsalou: “A concept is a competence or disposition for generating infinite conceptualizations of a category.” And the driving force for generating such conceptualizations is analogy, such as when we map the spatial notion of “on top of” to a temporal, vocal, or social notion. Douglas Hofstadter put it this way: “A concept is a package of analogies.”
Concepts, then, are the mental representations that we use to categorize our experiences and understand the world. I consider ‘concepts’ to be the linguistic expression of the preverbal Image Schemas introduced by Johnson and Lakoff.
As for ‘abstractions’, I find it hard to distinguish them from ‘concepts’, as people tend to use the two interchangeably. Here is Christopher Summerfield on abstraction:
The problem of abstraction is to learn the explicit representations of the relations that exist in data
Referring to Melanie Mitchell’s example of the ‘on top of’ concept, we can equivalently name the ‘on top of’ concept a ‘relational abstraction’.
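As a very small illustration of what an explicit representation of such a relation could look like on a grid, here is a sketch. The coordinate convention and object encoding are my own assumptions, not from Mitchell or Summerfield.

```python
def on_top_of(obj_a: set[tuple[int, int]], obj_b: set[tuple[int, int]]) -> bool:
    """Explicit relational predicate on a 2-D grid.

    Objects are sets of (row, col) cells; rows grow downward, so a smaller
    row index means 'higher'. True if some cell of obj_a sits directly
    above a cell of obj_b. Conventions here are illustrative assumptions.
    """
    return any((r + 1, c) in obj_b for (r, c) in obj_a)

# Toy usage: a single cell resting on a two-cell base.
cat = {(0, 1)}
tv = {(1, 0), (1, 1)}
print(on_top_of(cat, tv))  # True
```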
Hence, an agent, natural or artificial, may need to explore its world, extract affordances, and transform them into abstractions/concepts. The combination of image schemas, concepts, and abstractions allows us to understand and navigate the world by categorizing and generalizing our experiences.
Shown below is a rough sketch of the two domains, and the links between them, which I believe an AGI should possess.
Conceptually, I envision an AGI system resembling a programming language compiler. The front-end, like a compiler's parser, would extract patterns from complex signals into an Intermediate Representation (IR). The back-end would be akin to a compiler's instruction-selection module, employing the IR for high-level planning and abstract reasoning.
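To make the analogy concrete, here is a structural sketch in Python. All class and method names are made up for illustration; the bodies are placeholders, not a working system.

```python
class PerceptualFrontEnd:
    """Analogue of a compiler's parser: raw signals -> intermediate representation."""
    def to_ir(self, observation):
        # A real system would extract objects, relations, and affordances
        # from pixels or other sensor streams here.
        return {"objects": [], "relations": []}  # placeholder IR

class ReasoningBackEnd:
    """Analogue of instruction selection: IR -> plans and abstract inferences."""
    def plan(self, ir, goal):
        # A real system would search and plan over the IR here.
        return ["no-op"]  # placeholder plan

class CompilerLikeAgent:
    def __init__(self):
        self.front_end = PerceptualFrontEnd()
        self.back_end = ReasoningBackEnd()

    def act(self, observation, goal):
        ir = self.front_end.to_ir(observation)  # "parse" the world
        return self.back_end.plan(ir, goal)     # reason over the IR
```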
Therefore, the path to AGI, as I see it, lies through Instrumental Intelligence. This form of intelligence emphasizes the necessity for an agent to develop by engaging with the real world or a high-fidelity simulation of it. It's about gaining a deep understanding of the world's affordances, forming abstractions from sensorimotor experiences, and developing concepts that can be applied across various contexts. This approach not only mirrors human cognitive development but also aligns with the complex and dynamic nature of real-world interactions.
Challenges and Future Directions
So, what does all this mean for the ARC challenge created by Francois Chollet? I am doubtful that a system trained via conventional means on images will robustly solve the ARC challenge or variations of it. We might need to train an agent which learns to act, navigate, and alter its external environment, in either the real world or a simulated one. In this manner, the agent acquires the inferential structures, or image schemas, which form the bedrock of abstract reasoning and conceptualization. This will equip the agent with the capability to create the thinking tools necessary for solving novel problems.
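A sketch of the kind of interaction loop I have in mind follows. The environment and agent interfaces are hypothetical, and the self-supervised update is only a placeholder for whatever learning signal such a system would actually use.

```python
def embodied_pretraining(env, agent, episodes=1000):
    """Hypothetical loop: the agent acts in an environment and is trained
    on the consequences of its own actions, the hope being that
    image-schema-like structure emerges from that predictive pressure."""
    for _ in range(episodes):
        obs = env.reset()
        done = False
        while not done:
            action = agent.act(obs)
            next_obs, reward, done = env.step(action)
            # Self-supervised signal: learn to predict the effect of acting.
            agent.update(obs, action, next_obs, reward)
            obs = next_obs
```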
For now, the concepts of Intelligence and AGI are still in a “Keplerian” state. Our accounts of Intelligence and AGI are still largely descriptive, reflecting empirical observations and some intuitions. Ideally, we want laws, theories, predictions, equations, quantities, and deep explanations of Intelligence as a phenomenon. It might be that we need someone, or a group, to do for Intelligence what Alan Turing did for Computation. Rodney Brooks alluded to this in an article for MIT Technology Review titled Newer Math?. We need someone to set Intelligence on a firm foundation within some formal system akin to mathematics.