top of page

Database Normalization

Writer's picture: Cristian DuqueCristian Duque

Updated: Dec 1, 2023



This concept was introduced by the British computer scientist Edgar F. Codd in 1970 as part of his relational model.

"Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity." Database normalization - Wikipedia


Database normalization is a process of dividing large tables into smaller ones and linking them using database relationships. It involves defining the structure of a database with columns (attributes) and tables (relations).


Not applying database normalization can lead to several anomalies:

  • Insertion anomaly: Adding data to a table without knowing the values of all the attributes.

  • Update anomaly: This happens when incomplete updates are made due to redundant data in the table.

  • Deletion anomaly: Removing certain data can result in unintended loss of other relevant data.



"The process of normalization involves applying rules to a set of data. Each of these rules transforms the data to a certain structure, called a normal form." A Step-By-Step Guide to Normalization in DBMS With Examples (databasestar.com)


A normal form is applied to individual tables or relations. There are six normal forms, but the first three are the most commonly used. To be considered a third normal form, a database must satisfy the following requirements:

FIRST NORMAL FORM (1NF): Atomic values

  • Each table row should have a unique identifier, known as the primary key (PK).

  • Data stored in each table column should only contain a single value.

  • There are no repeating groups of table columns.


SECOND NORMAL FORM (2NF)

  • Built on 1NF rules.

  • Non-key attributes are fully dependent on the primary key. Only data associated with the PK is stored in each table.

THIRD NORMAL FORM (3NF)



BENEFITS

  • Consistency: It minimises data redundancy and the potential for inconsistencies and errors.

  • Integrity: It facilitates database management through relational integrity.

  • Data persistence: It stores data effectively and efficiently.

  • Structure: It builds a well-organised database design.

DRAWBACKS

  • Complexity: Normalization to higher normal forms (4NF, 5NF) can be time-consuming and difficult to satisfy.

  • Joins: Several tables can produce relationships that are interrelated using joins.

  • Flexibility: Data is organised in a specific form, which can limit the flexibility of a database.


Check out my video:


20 views

Comments


bottom of page