Hi there! I'm Shrijith Venkatrama, founder of Hexmos. Right now, I’m building LiveAPI, a tool that makes generating API docs from your code ridiculously easy.
Moravec's Paradox challenges our intuition about intelligence.
While we marvel at computers playing chess or solving complex equations, the tasks we find trivial—like tying a shoe or accurately assessing a friend’s mood—remain elusive for machines.
This blog dives deep into this paradox, examining both historical insights and its modern implications in AI evaluations, such as the recent FrontierMath benchmark referenced by Andrej Karpathy and the insightful xkcd comic.
1. Unpacking the Paradox
At its core, Moravec's Paradox highlights that:
- **High-level reasoning** (e.g., mathematical problem-solving, logic puzzles) is relatively straightforward for computers.
- **Sensorimotor and perceptual tasks** (e.g., visual recognition, object manipulation) demand enormous computational effort and remain challenging for AI.
This counterintuitive observation—first articulated in the 1980s by pioneers like Hans Moravec, Marvin Minsky, and Rodney Brooks—forces us to rethink what “intelligence” really means.
While computers can rapidly process vast datasets and execute deterministic tasks, the seemingly “menial” functions that come naturally to us involve deeply ingrained evolutionary skills.
A Quick Comparison
Below is a table that summarizes this duality:
Task Category | Typical Computer Performance | Typical Human Performance |
---|---|---|
Closed, Deterministic Tasks | Chess, algebra, formal logic | Capable, but slower and more error-prone |
Sensorimotor/Perceptual Tasks | Limited performance in dynamic, real-world scenarios | Object recognition, spatial navigation, manual tasks |
Autonomous Problem-Solving | Requires well-defined prompts (e.g., FrontierMath evals) | Fluid, adaptive reasoning in unstructured environments |
Contextual & Multimodal Understanding | Struggles with long-term coherence and context | Natural language understanding and everyday perception |
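One way to feel this asymmetry in practice: the computer side of the table is often a single library call away, while the human side has no comparable one-liner. A toy illustration in Python (`sympy` is a real symbolic-math library; the commented-out functions are deliberately fictional):

```python
import sympy as sp

# High-level reasoning: solving a quadratic is one library call.
x = sp.symbols("x")
print(sp.solve(x**2 - 5 * x + 6, x))  # prints [2, 3]

# Sensorimotor tasks: there is no equivalent one-liner.
# fold_shirt(), tie_shoe(), and assess_friends_mood() do not exist;
# each hides decades of unsolved robotics and perception research.
```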
2. LLM Evals and the FrontierMath Benchmark
Recent developments in Large Language Model (LLM) evaluations bring a modern twist to Moravec's Paradox.
New benchmarks—such as the FrontierMath benchmark—demonstrate that while LLMs are inching closer to expert-level performance in structured domains like math and coding, they falter when asked to perform tasks that require continuous, autonomous reasoning.
In simple terms, you could easily feed an LLM a neatly packaged problem, but ask it to “think on its feet” like a human intern, and you’ll see its limitations.
This phenomenon echoes the paradox: the tasks that seem simple to us, like piecing together a coherent narrative or handling long context windows, are those that AI struggles with the most.
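To make the contrast concrete, here is a minimal sketch of both kinds of eval in Python. `ask_model` is a hypothetical stand-in for whatever LLM client you use, and the grading logic is deliberately naive; the point is the asymmetry between the two tasks, not the harness itself:

```python
from typing import Callable

# 'ask_model' is a hypothetical stand-in for any LLM client call,
# e.g., a thin wrapper around your provider's chat API.
AskModel = Callable[[str], str]

def eval_closed_task(ask_model: AskModel) -> bool:
    """A closed, deterministic task: one prompt, one checkable answer."""
    answer = ask_model("What is 17 * 23? Reply with only the number.")
    return answer.strip() == "391"

def eval_open_task(ask_model: AskModel, max_turns: int = 5) -> bool:
    """An open-ended task: the model must plan and self-correct across
    turns, and there is no single right answer to grade against."""
    transcript = [
        "Goal: draft a migration plan for a legacy service, asking "
        "clarifying questions whenever information is missing. "
        "Write 'FINAL PLAN:' once you are confident it is complete."
    ]
    for _ in range(max_turns):
        reply = ask_model("\n".join(transcript))
        transcript.append(reply)
        if "FINAL PLAN:" in reply:
            # Grading open-ended work is itself the hard problem; here
            # we only check that the model reached a conclusion at all.
            return True
    return False
```

The closed task grades itself; the open-ended one already forces judgment calls about what counts as success, which is exactly the gap the paradox predicts.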
3. Historical Roots and Biological Underpinnings
The origins of Moravec's Paradox lie in both the history of artificial intelligence and the biological evolution of human skills:
- **Evolutionary Perspective:** Human sensorimotor abilities are the product of millions of years of natural selection. Our brain’s evolved systems allow us to effortlessly recognize faces, navigate our environment, and even fold a shirt. In contrast, abstract reasoning—a relatively recent development in our evolutionary history—has not been as finely honed.
- **Early AI Ambitions:** In the early days of AI, researchers were confident that once the “hard” problems (like logic and algebra) were solved, the “easy” ones would fall into place. Marvin Minsky and others soon discovered that mimicking a one-year-old’s perceptual and motor skills was an entirely different beast. This historical miscalculation is well captured in the xkcd comic, which humorously illustrates how tasks we take for granted can be enormously challenging for machines.
**Bold takeaway:** The natural abilities we perform without thought have been refined over billions of years, making them incredibly hard to replicate through computational means.
4. Looking Forward: The Future of AI Evaluations
As we push the boundaries of AI, the challenge is clear: we need evaluation frameworks that test not only closed-form reasoning but also the “menial” tasks that are deceptively complex. Consider the following points:
- **Long Context Windows & Coherence:** How do we ensure that AI maintains a coherent narrative over thousands of words? (A minimal sketch of such a probe follows this list.)
- **Autonomy in Problem-Solving:** Unlike a calculator that executes clear instructions, can AI systems adapt and self-correct in unstructured environments?
- **Multimodal Input/Output:** Future benchmarks must account for challenges in processing images, audio, and text simultaneously.
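On the first point, here is a minimal sketch of a long-context recall probe, reusing the hypothetical `ask_model` stand-in from the earlier snippet; the planted fact, the filler text, and the pass criterion are all invented for illustration, and real coherence evals demand far more than simple retrieval:

```python
import random

def eval_long_context_recall(ask_model, filler_paragraphs: int = 200) -> bool:
    """Bury one fact deep inside a long context and check whether the
    model can still retrieve it -- a crude probe of long-range recall."""
    fact = "The maintenance code for the east turbine is 7421."
    filler = ["A routine paragraph about unrelated plant operations."] * filler_paragraphs
    filler.insert(random.randrange(len(filler)), fact)  # plant the needle
    prompt = (
        "\n\n".join(filler)
        + "\n\nQuestion: What is the maintenance code for the east turbine? "
        + "Reply with only the number."
    )
    return ask_model(prompt).strip() == "7421"
```

Retrieval is the easy end of this spectrum; checking whether a ten-thousand-word narrative stays internally consistent has no such crisp pass/fail test.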
A brief table outlining the evolving challenges might look like this:
Challenge | Current AI Strength | The Unsolved Puzzle |
---|---|---|
Deterministic Reasoning | Strong performance in structured tasks | Limited flexibility in unstructured problem-solving |
Perceptual and Sensorimotor Tasks | Basic pattern recognition with curated data | Real-time, context-aware perception and interaction |
Long-term Coherence | Capable with short-term context | Struggles with extended, dynamic narratives |
Multimodal Integration | Specialized models for individual data types | Seamless integration across varied modalities |
The goal is to bridge this gap by designing tests that capture the “effortless” skills of everyday human experience—essentially, creating evals for the tasks that have been evolving in nature for millennia.
5. Concluding Thoughts: The Duality of Intelligence
Moravec's Paradox forces us to re-examine the nature of intelligence itself.
It reminds us that the ease with which we perform everyday tasks is the result of deep, evolutionary refinement—a benchmark that modern AI still struggles to reach.
As we continue to build and evaluate intelligent systems, embracing this duality is crucial. In a world where LLMs can out-calculate a human on a math problem but falter at stitching together a coherent story or navigating a cluttered room, we are reminded that intelligence is not monolithic.
Each advancement in AI invites us to question what it means to be truly “smart.”

What are your thoughts on the next frontier for AI evaluation?
Share your insights and join the debate in the comments below.
Feel free to engage, challenge, or expand on these ideas—after all, the debate over what truly constitutes intelligence is far from settled.