Hi there! I'm Shrijith Venkatrama, founder of Hexmos. Right now, I’m building LiveAPI, a tool that makes generating API docs from your code ridiculously easy.
Neural networks might seem complex, but at their core, they rely on a simple yet powerful concept: derivatives. Andrej Karpathy's micrograd proves this beautifully: it's just two Python files with less than 150 lines of code, yet it captures the fundamental ideas behind neural networks.
This blog breaks down micrograd step by step, starting with the very foundation: what derivatives really mean and how we compute them. You'll learn:
- How backpropagation works by understanding derivatives in the simplest way
- The difference between symbolic and computational differentiation
- How small input changes affect output (positive, negative, and zero slopes)
- Why neural networks don’t need explicit derivative formulas
With visual explanations, simple code snippets, and practical insights, by the end of this post, you’ll have a solid grasp of how gradients drive learning in neural networks—without drowning in unnecessary complexity. Let’s dive in.
Karpathy's micrograd is just 2 files of Python (< 150 LOC)
micrograd consists of just two small files (a quick usage sketch follows the list):
- `engine.py`: Less than 100 lines of code; defines the `Value` class, the code that powers the neural network
- `nn.py`: Defines `Neuron`, `Layer` and `MLP` (Multi-Layer Perceptron). In total, around 60 lines of code.
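To make that concrete, here's a minimal sketch of what using the `Value` class from `engine.py` looks like; the specific expression is just an illustration:

```python
from micrograd.engine import Value

# Build a tiny expression graph out of Value objects
a = Value(2.0)
b = Value(-3.0)
c = a * b + b**2   # c = (2 * -3) + (-3)^2 = 3

# Backpropagation fills in the gradient of c with respect to each input
c.backward()
print(c.data)  # 3.0
print(a.grad)  # dc/da = b = -3.0
print(b.grad)  # dc/db = a + 2*b = -4.0
```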
Fundamentally, the core ideas behind neural networks can be captured in just under 150 lines of simple Python code. Most of the additional complexity in other libraries is about efficiency.
Groundwork for Understanding the Definition of Derivatives
The first goal is to understand the concept of derivatives with some examples. So we do the following to prepare some groundwork (a sketch follows this list):
- Define a function `f` that takes in a scalar input and gives a scalar output
- Generate a range of values for `x` and `y` (input/output)
- Plot the values
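Here's a rough sketch of that groundwork, using the same quadratic that shows up in the next section (plotting with `numpy` and `matplotlib` is one choice; any plotting tool works):

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    # Scalar in, scalar out (also works elementwise on numpy arrays)
    return 3*x**2 - 4*x + 5

# Generate a range of input values and the corresponding outputs
xs = np.arange(-5, 5, 0.25)
ys = f(xs)

# Plot the values
plt.plot(xs, ys)
plt.xlabel("x")
plt.ylabel("f(x)")
plt.show()
```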
Two Ways of Calculating Derivative
The task is to find the derivative of the function at particular points, such as x = 3.
In school, we are usually taught the symbolic method. For the expression `3*x**2 - 4*x + 5`, we can work out the derivative expression to be `6*x - 4`.
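If you want to double-check that symbolic result computationally, a library like `sympy` (not part of micrograd) can do it:

```python
import sympy as sp

x = sp.symbols("x")
expr = 3*x**2 - 4*x + 5
print(sp.diff(expr, x))  # 6*x - 4
```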
But since we're dealing with neural networks, the expressions involved can be huge, and nobody writes them down explicitly.
So instead of taking a symbolic approach, we take a computational (numerical) approach.
However, it is useful to first understand what derivatives mean at a conceptual level, before we move on to the computations.
The Meaning of a Differentiable Function
The key formula is the limit definition of the derivative:
f'(x) = lim(h → 0) [f(x + h) - f(x)] / h
Here `h` is a very small value, and it keeps getting smaller, vanishing towards 0.
The question is: what is the trend of a function's output when there's a small bump (increase) in its input?
At a higher level, we are asking: at a point `x`, if we increase it by a tiny amount `h` to get `x + h`, does the output increase or decrease? And what is the magnitude of that change in the output?
The resulting value of the formula is a slope. If a bump in the input produces a positive slope, the output increases.
If a bump in the input produces a negative slope, the output decreases.
Also, at the specific point x = 2/3 on the plot of `f`, a slight increase in the input keeps the output (essentially) the same - that is, we have a zero slope. For this function, the symbolic derivative `6*x - 4` is exactly 0 at x = 2/3.
Numerical Exploration
The above intuition can be validated with some numerical exploration, using a specific `x` value and a tiny `h` value.
Positive Slope Example
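A sketch of such a check at `x = 3`, where the slope of our quadratic should be positive (`6*3 - 4 = 14`):

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.000001
x = 3.0
slope = (f(x + h) - f(x)) / h
print(slope)  # ~14.0: a tiny bump in x increases the output
```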
Negative Slope Example
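The same check at `x = -3`, where the slope should be negative (`6*(-3) - 4 = -22`):

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.000001
x = -3.0
slope = (f(x + h) - f(x)) / h
print(slope)  # ~-22.0: a tiny bump in x decreases the output
```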
Zero Slope Example
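And at `x = 2/3`, where the symbolic derivative `6*x - 4` is exactly zero:

```python
def f(x):
    return 3*x**2 - 4*x + 5

h = 0.000001
x = 2/3
slope = (f(x + h) - f(x)) / h
print(slope)  # ~0.0: a tiny bump in x barely changes the output
```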
Reference
The spelled-out intro to neural networks and backpropagation: building micrograd (Andrej Karpathy)