Database Normalization & Everything you need to know about normalization
--
Definition: Database normalization is a technique for organizing the data in the database.
Normalization deals with redundancy(repetition) and database anomalies.
Database Anomalies
Insertion Anomaly
Imagine for a new admission, until and unless students select a branch, data of the student inserted in the table for branch column will be NULL.
Also, there are multiple students in the same branch them all of them have the same branch details, this means branch information is repeated for all those students.
These are Insertion Anomaly
Update Anomaly
Imagine let's say one teacher name ‘X’ from your department leaves the college. Then we have to update the record of ‘X’ in every table, At all the places where the ‘X’ name appears.
Failing in doing so will result in an update anomaly and it will lead to data inconsistency.
Delete Anomaly
Imagine let's say there is only one employee in a department and that employee leaves due to some reason. Department info is only present with that employee records. Then deleting the data of that employee will also delete the department info.
This is a delete anomaly.
Normalization deals with all these anomalies.
Normalization rules
- First Normal Form
- Second Normal Form
- Third Normal Form
- BCNF
- Fourth Normal Form
First Normal Form
For the table to be in the first normal form then it should follow some rules:
a. It should have an Atomic (single) valued column.
b. Values stored in the column should be of the same Domain(type).
Second Normal Form
For the table to be in the second normal form then it should follow some rules:
a. Table should be in first normal form.
b. It should not have any Partial dependency.
Partial Dependency: It is a dependency where a non-prime attribute(column) in a table depends on only a part of the primary key and not on the whole key.
Third Normal Form
For the table to be in the third normal form then it should follow some rules:
a. Table should be in second normal form.
b. It should not have any Transitive dependency.
Transitive Dependency: When a non-prime attribute(column) depends on other non-prime attributes(column) and that attribute is dependent on the primary key.
example:
NP -> NP -> P
NP(Non-Prime)
P(Prime)
NOTE: We should normalization up to a certain extend. Having normalization too deep is also not advisable. that’s why we only do normalization up to 3rd normal form.
BCNF (Boyce-Codd Normal Form)
Higher version of Third Normal Form.
For the table to be in the BCNF normal form then it should follow some rules:
a. Table must be in Third normal form.
b. It deals with multiple overlapping candidate keys.
Fourth Normal Form
For the table to be in the fourth normal form then it should follow some rules:
a. Table must be in BCNF.
b. It deals with Multivalued dependency.
Multivalued Dependency: When two attributes in a table are dependent on each other but, both dependent on a third attribute.