Ever felt like your database is a messy, overstuffed closet? Too many redundancies, anomalies, and weird dependencies?
Well, SQL normalization is here to tidy things up!
Normalization is the process of organizing a relational database to reduce redundancy and improve data integrity.
It involves breaking down tables into smaller, structured tables based on rules called normal forms (NFs).

Speed Up DB Queries Like a Pro
Athreya aka Maneshwar ・ Feb 6
1NF: First Normal Form (Atomicity is the Key)
What’s the issue?
Unnormalized tables often contain repeating groups and non-atomic (multi-valued) data, making data retrieval and updates inefficient.
How does 1NF fix it?
1NF ensures that:
- Each column contains atomic values (indivisible values).
- No repeating groups or arrays exist in any row.
Example:
Unnormalized Table (Before 1NF)
StudentID | Name | Courses |
---|---|---|
101 | Alice | Math, Science |
102 | Bob | English |
103 | Charlie | Math, History |
Here, the Courses column contains multiple values, violating 1NF.
1NF Table (After Fix)
StudentID | Name | Course |
---|---|---|
101 | Alice | Math |
101 | Alice | Science |
102 | Bob | English |
103 | Charlie | Math |
103 | Charlie | History |
Now, each column contains atomic values, and there are no repeating groups.
2NF: Second Normal Form (Eliminating Partial Dependencies)
What’s the issue?
Even after 1NF, we might still have partial dependencies, where a non-key attribute depends on only part of a composite primary key.
How does 2NF fix it?
- Remove partial dependencies by creating separate tables.
Example:
In our 1NF table, StudentID and Course together form the primary key.
If we add StudentAge, it only depends on StudentID, not the whole key.
1NF Table (Before 2NF)
StudentID | Name | Age | Course |
---|---|---|---|
101 | Alice | 20 | Math |
101 | Alice | 20 | Science |
102 | Bob | 22 | English |
Here, Age depends only on StudentID, not on Course.
That's a partial dependency.
2NF Tables (After Fix)
Students Table
StudentID | Name | Age |
---|---|---|
101 | Alice | 20 |
102 | Bob | 22 |
Enrollments Table
StudentID | Course |
---|---|
101 | Math |
101 | Science |
102 | English |
Now, no column depends on only part of the primary key.
3NF: Third Normal Form (Eliminating Transitive Dependencies)
What’s the issue?
Even after 2NF, we might still have transitive dependencies, where a non-key column depends on another non-key column instead of the primary key.
How does 3NF fix it?
- Ensure that every non-key column depends ONLY on the primary key and not on another non-key attribute.
Example:
Let’s say we store the Department along with the Instructor.
2NF Table (Before 3NF)
CourseID | Course | Instructor | Department |
---|---|---|---|
201 | Math | Shiv | Science |
202 | English | Jhon | Arts |
Here, Department depends on Instructor, not directly on CourseID.
That’s a transitive dependency.
3NF Tables (After Fix)
Courses Table
CourseID | Course | Instructor |
---|---|---|
201 | Math | Shiv |
202 | English | Jhon |
Instructors Table
Instructor | Department |
---|---|
Shiv | Science |
Jhon | Arts |
Now, every column depends ONLY on the primary key, eliminating transitive dependencies.
BCNF: Boyce-Codd Normal Form (Stricter Than 3NF)
What’s the issue?
- 3NF doesn’t always eliminate overlapping candidate keys, which can lead to anomalies.
How does BCNF fix it?
- Ensures every determinant (an attribute that functionally determines another) is a candidate key.
Example:
Let’s say we have a table storing course assignments with instructors.
3NF Table (Before BCNF)
StudentID | Course | Instructor |
---|---|---|
101 | Math | Shiv |
101 | Science | Mike |
Here, Instructor determines Course, but Instructor is not a candidate key.
This violates BCNF.
BCNF Tables (After Fix)
StudentCourses Table
StudentID | Course |
---|---|
101 | Math |
101 | Science |
CourseInstructors Table
Course | Instructor |
---|---|
Math | Shiv |
Science | Mike |
Understanding BCNF
Boyce-Codd Normal Form (BCNF) is an advanced level of normalization that builds on 3NF by addressing cases where 3NF still allows certain functional dependencies to exist improperly.
BCNF ensures that every functional dependency (FD) has a superkey as its determinant.
This means that if a column (or set of columns) determines another column, it must be a candidate key—a minimal set of attributes that uniquely identify a row.
A simple way to understand BCNF is:
- If X → Y, then X must be a candidate key.
Why is BCNF necessary?
3NF still allows anomalies when a non-trivial functional dependency exists where a non-superkey determines another column.
BCNF eliminates these issues by enforcing stricter rules.
How do we achieve BCNF?
If a table violates BCNF, we can:
- Decompose the table into smaller tables to eliminate the dependency.
- Ensure that all determinants are candidate keys so that each functional dependency is properly structured.
By following these steps, we ensure data integrity and eliminate redundancy, making our database more efficient.
I’ve been working on a super-convenient tool called LiveAPI.
LiveAPI helps you get all your backend APIs documented in a few minutes
With LiveAPI, you can quickly generate interactive API documentation that allows users to execute APIs directly from the browser.
If you’re tired of manually creating docs for your APIs, this tool might just make your life easier.
Top comments (1)
That us a very good explanation of the concept