Back to articles list

How to Normalize a Database Into 2NF and 3NF (With Examples)

We present a practical example of how to normalize a table that is in the first normal form (1NF) into a set of tables in the third normal form (3NF).

Database normalization is a very important concept in database design. Every database designer should know the rules for different normal forms. In practice, the most important normal form is the third normal form (3NF). If you want to be a database designer, you should know how to check if a table is in 3NF. You should also know how to normalize a table that’s not in 3NF into one in 3NF.

As you recall, a table is in 3NF if:

1. It is in second normal form (2NF).
2. All non-prime attributes are directly (non-transitively) dependent on the entire candidate key.

Typically, you normalize a table from 1NF to 3NF in two steps: first you normalize it into 2NF, then you normalize that into 3NF. In this article, we’ll show you an example of normalization from 1NF through 2NF into 3NF. We’ll also discuss problems with 1NF tables that are solved when you normalize them into 2NF.

Normalization to 2NF

We’ll work with a table named `Courses1`, which contains information about courses offered at a certain university: the titles of the courses, the lecturers, the departments, and the recommended textbooks for each course. The table is in the first normal form, since all column values are atomic. Here’s the table:

`Courses1`

coursedepartmentlecturertextbook
Relational DatabasesComputer ScienceJeremy BrownDatabase Systems
Relational DatabasesComputer ScienceJeremy BrownIntroduction to Databases
Cloud DatabasesComputer ScienceJeremy BrownDatabase Systems
Cloud DatabasesComputer ScienceJeremy BrownBigQuery Manual
AlgorithmsComputer ScienceJames CormenIntro to Algorithms
Data ScienceMathematicsJane DowningDatabase Systems
Data ScienceMathematicsJane DowningStatistics
CalculusMathematicsJohn SmithIntro to Calculus

There are several problems with various operations on data in this table:

• `INSERT`: You can't add a course with no textbooks.
• `UPDATE`: To change the lecturer for the course `Relational Databases`, you have to change two rows.
• `DELETE`: If you remove the `Algorithms` course, you also remove the lecturer `James Cormen`.

These problems come from the fact that this table is not in 2NF. Let’s see why. There are three non-trivial functional dependencies in this table:

• `course, textbook -> lecturer, department`: The `course` and `textbook` determine the `lecturer` and the `department`.
• `course -> lecturer, department`: The `course` determines the `lecturer` and the `department`.
• `lecturer -> department`: The `lecturer` determines the department.

The candidate key in this table is the set {`course`, `textbook`}.

It is not in 2NF, because we have functional dependencies with only a part of the candidate key on the left hand-side of a dependency. The functional dependency `course` -> `lecturer`, `department` violates the rules for 2NF.

To normalize this, we need to get rid of the functional dependency that violates 2NF: `course` -> `department`, `lecturer`.

The normalization procedure usually involves decomposing a table into two or more tables that contain the same information. In this case, we’ll extract the columns of the functional dependency violating 2NF, that is, `course`, `department`, and `lecturer`, into a separate table, `Courses2`, with the candidate key {`course`}. The remaining column in `Courses1` is the `textbook` column. Of course, we can’t just put it into a new table, because we’ll lose the course-textbook relationship. So, we will put both columns, `course` and `textbook`, into another table, `CourseTextbooks`.

To decompose a table into a set of columns, you identify these columns, take the data from those columns only, and remove the duplicates. For example, we create a table named `Courses2` by first taking the columns `course`, `department`, and `lecturer` from `Courses1`. We then take the data from these columns and remove the duplicates. There are two rows with all of the values `Relational` `Databases`, `Computer Science`, and `Jeremy Brown` in `Courses1`, but there is only one row with all of these values in `Courses2`.

Here’s the final result of the split:

Courses2

coursedepartmentlecturer
Relational DatabasesComputer ScienceJeremy Brown
Cloud DatabasesComputer ScienceJeremy Brown
AlgorithmsComputer ScienceJames Cormen
Data ScienceMathematicsJane Downing
CalculusMathematicsJohn Smith

CourseTextbooks

coursetextbook
Relational DatabasesDatabase Systems
Relational DatabasesIntroduction to Databases
Cloud DatabasesDatabase Systems
Cloud DatabasesBigQuery Manual
AlgorithmsIntro to Algorithms
Data ScienceDatabase Systems
Data ScienceStatistics
CalculusIntro to Calculus

You can reconstruct the same information in `Courses1`: simply join the data from `Courses2` and `CourseTextbooks`.

Both of these tables are in 2NF. There are no functional dependencies within `CourseTextbooks`, and the candidate key is {`course`, `textbook`}. The functional dependencies in `Courses1` are:

• `course` -> `department`, `lecturer`.
• `lecturer` -> `department`.

The candidate key is {`course`}, and there are no functional dependencies on only a part of a key, so it is in 2NF.

The following problems have been removed in the new database:

• `INSERT`: You can now add a course with no textbooks. Simply add the course to `Courses2` without adding rows to `CourseTextbooks`.
• `UPDATE`: You can change the lecturer for the course `Relational Databases` by changing just one row in `Courses2`.

This problem still persists:

• `DELETE`: If we remove the `Algorithms` course, we also remove the lecturer `James Cormen`.

We’ll address this problem in a little.

The Algorithm for Normalizing to 2NF

Here’s the general algorithm for normalizing a table from 1NF to 2NF.

Suppose you have a table `R` with scheme `S` which is in 1NF but not in 2NF. Let A -> B be a functional dependency that violates the rules for 2NF, and suppose that the sets A and B are distinct (A ∩ B = ∅).

• Let C = S – (A U B). In other words:
• A = attributes on the left-hand side of the functional dependency.
• B = attributes on the right-hand side of the functional dependency.
• C = all other attributes.
• We can split `R` into two parts:
• `R1`, with scheme C U A.
• `R2`, with scheme A U B.
• The original relation can be recovered as the natural join of `R1` and `R2`: R = `R1 NATURAL JOIN R2`.

Normalization to 3NF

Let’s look again at `Courses2`. It is in 2NF but not in 3NF.

`Courses2`

coursedepartmentlecturer
Relational DatabasesComputer ScienceJeremy Brown
Cloud DatabasesComputer ScienceJeremy Brown
AlgorithmsComputer ScienceJames Cormen
Data ScienceMathematicsJane Downing
CalculusMathematicsJohn Smith

We have the following functional dependencies in this table:

• `course` -> `lecturer`, `department`.
• `lecturer` -> `department`.

The candidate key is {`course`}.

The table is not in 3NF, because there is a transitive functional dependency from the candidate key `course` to `department` (via `course` -> `lecturer` and `lecturer` -> `department`). The functional dependency `lecturer` -> `department` violates the rules for 3NF.

To convert `Course2` into 3NF, we have to remove this functional dependency. The procedure is similar to the one we performed for 2NF: we extract the columns `lecturer` and `department` into a separate table, `Lecturers`. The remaining column in `Courses2` is `course`. We need to keep the course-lecturer connection, so the other table is `Courses3` with columns `course` and `lecturer`.

Here is the result:

Lecturers

lecturerdepartment
Jeremy BrownComputer Science
James CormenComputer Science
Jane DowningMathematics
John SmithMathematics

Courses3

courselecturer
Relational DatabasesJeremy Brown
Cloud DatabasesJeremy Brown
AlgorithmsJames Cormen
Data ScienceJane Downing
CalculusJohn Smith

The problems we had in 2NF are now resolved:

• `INSERT`: You can now add a course with no textbooks. Simply add the course to `Courses3` without adding rows to `CourseTextbooks`.
• `UPDATE`: You can change the lecturer for `Relational Databases` by changing just one row in `Courses3`.
• `DELETE`: Removing the `Algorithms` course does not remove the lecturer `James Cormen`.

Get Started With Database Normalization

To normalize a table from 1NF to 3NF, you need to normalize it to 2NF first then to 3NF. In the normalization process, you decompose a table into multiple tables that contain the same information as the original table. The normalization process usually removes many problems related to data modification.

If you liked this article, check out other normalization articles on our blog.

If you’re a student taking database classes, make sure to create a free academic account in Vertabelo, our online ER diagram drawing tool. It allows you to draw logical and physical ER diagrams directly in your browser.

Vertabelo supports PostgreSQL, SQL Server, Oracle, MySQL, Google BigQuery, Amazon Redshift, and other relational databases. Try it out – see how easy it is to get started!