Database Normalization & Everything you need to know about normalization

Gautam Gandhi
3 min readMay 23, 2021

--

Definition: Database normalization is a technique for organizing the data in the database.

Normalization deals with redundancy(repetition) and database anomalies.

Database Anomalies

Insertion Anomaly

Imagine for a new admission, until and unless students select a branch, data of the student inserted in the table for branch column will be NULL.

Also, there are multiple students in the same branch them all of them have the same branch details, this means branch information is repeated for all those students.

These are Insertion Anomaly

Update Anomaly

Imagine let's say one teacher name ‘X’ from your department leaves the college. Then we have to update the record of ‘X’ in every table, At all the places where the ‘X’ name appears.

Failing in doing so will result in an update anomaly and it will lead to data inconsistency.

Delete Anomaly

Imagine let's say there is only one employee in a department and that employee leaves due to some reason. Department info is only present with that employee records. Then deleting the data of that employee will also delete the department info.

This is a delete anomaly.

Normalization deals with all these anomalies.

Normalization Forms

Normalization rules

  1. First Normal Form
  2. Second Normal Form
  3. Third Normal Form
  4. BCNF
  5. Fourth Normal Form

First Normal Form

For the table to be in the first normal form then it should follow some rules:

a. It should have an Atomic (single) valued column.

b. Values stored in the column should be of the same Domain(type).

Second Normal Form

For the table to be in the second normal form then it should follow some rules:

a. Table should be in first normal form.

b. It should not have any Partial dependency.

Partial Dependency: It is a dependency where a non-prime attribute(column) in a table depends on only a part of the primary key and not on the whole key.

Third Normal Form

For the table to be in the third normal form then it should follow some rules:

a. Table should be in second normal form.

b. It should not have any Transitive dependency.

Transitive Dependency: When a non-prime attribute(column) depends on other non-prime attributes(column) and that attribute is dependent on the primary key.

example:

NP -> NP -> P

NP(Non-Prime)

P(Prime)

NOTE: We should normalization up to a certain extend. Having normalization too deep is also not advisable. that’s why we only do normalization up to 3rd normal form.

BCNF (Boyce-Codd Normal Form)

Higher version of Third Normal Form.

For the table to be in the BCNF normal form then it should follow some rules:

a. Table must be in Third normal form.

b. It deals with multiple overlapping candidate keys.

Fourth Normal Form

For the table to be in the fourth normal form then it should follow some rules:

a. Table must be in BCNF.

b. It deals with Multivalued dependency.

Multivalued Dependency: When two attributes in a table are dependent on each other but, both dependent on a third attribute.

--

--