This concept was introduced by the British computer scientist Edgar F. Codd in 1970 as part of his relational model.
"Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity." Database normalization - Wikipedia
Database normalization is a process of dividing large tables into smaller ones and linking them using database relationships. It involves defining the structure of a database with columns (attributes) and tables (relations).
Not applying database normalization can lead to several anomalies:
Insertion anomaly: Adding data to a table without knowing the values of all the attributes.
Update anomaly: This happens when incomplete updates are made due to redundant data in the table.
Deletion anomaly: Removing certain data can result in unintended loss of other relevant data.
"The process of normalization involves applying rules to a set of data. Each of these rules transforms the data to a certain structure, called a normal form." A Step-By-Step Guide to Normalization in DBMS With Examples (databasestar.com)
A normal form is applied to individual tables or relations. There are six normal forms, but the first three are the most commonly used. To be considered a third normal form, a database must satisfy the following requirements:
FIRST NORMAL FORM (1NF): Atomic values
Each table row should have a unique identifier, known as the primary key (PK).
Data stored in each table column should only contain a single value.
There are no repeating groups of table columns.
SECOND NORMAL FORM (2NF)
Built on 1NF rules.
Non-key attributes are fully dependent on the primary key. Only data associated with the PK is stored in each table.
THIRD NORMAL FORM (3NF)
2NF rules must be satisfied.
"There should be no intra-table dependencies between the columns in each table." Database Normalization in SQL with Examples – SQLServerCentral. In other words, every non-key attribute must depend solely on the primary key.
BENEFITS
Consistency: It minimises data redundancy and the potential for inconsistencies and errors.
Integrity: It facilitates database management through relational integrity.
Data persistence: It stores data effectively and efficiently.
Structure: It builds a well-organised database design.
DRAWBACKS
Complexity: Normalization to higher normal forms (4NF, 5NF) can be time-consuming and difficult to satisfy.
Joins: Several tables can produce relationships that are interrelated using joins.
Flexibility: Data is organised in a specific form, which can limit the flexibility of a database.
Check out my video:
Comments