This article was automatically translated from the original Turkish version.


Database Normalization

Category: Computer Science, Database Systems
Subcategories: Database Design, Relational Databases, Software Engineering

Database normalization is a process applied in relational database design to ensure data consistency, eliminate data redundancy, and simplify the overall structure of the database. It organizes data into logically structured tables in which each fact is stored once. By preserving data integrity, normalization makes data processing and update operations more reliable and efficient.


The normalization process typically progresses through stages known as "normal forms." Each stage examines the dependencies among the attributes (columns) of a table to produce a simpler and more consistent structure. Normalization plays a crucial role in database design because it protects data integrity and reduces the insertion, update, and deletion anomalies that can occur during data manipulation.

Historical Development

Database normalization was introduced in the early 1970s by E. F. Codd as part of his relational data model. Codd argued that data should be stored in an organized, structured manner and proposed a set of rules to achieve this, which became the formal normal forms. In Codd’s relational model, data is stored in tables (relations); each row is identified by a primary key, and tables are linked to one another through foreign keys that reference those primary keys.


Early database designs often stored data in loosely structured ways, which led to inconsistencies during processing. Codd’s normalization process was widely adopted as a way to eliminate such problems. The first normal form (1NF) requires that each cell in a table contain only a single value, while the subsequent forms (2NF, 3NF, etc.) examine data dependencies in greater detail.

Primary Objectives of Normalization

The primary objectives of database normalization are as follows:

Reducing Data Redundancy

Storing the same data in multiple locations within a database can lead to inconsistencies during updates and deletions. Normalization eliminates this redundancy by storing each piece of information in only one place, thereby ensuring a consistent database structure.

Ensuring Data Consistency

A normalized design helps data remain consistent. Because each fact is updated in only one location, the risk of conflicting copies and incorrectly stored data is greatly reduced. This preserves the integrity of the database and keeps updates accurate.
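
The effect of updating data in a single location can be sketched in Python. This is a minimal illustration only: the order and customer records, the field names, and the email addresses are all hypothetical and do not come from the article.

```python
# Hypothetical denormalized rows: the customer's email is repeated on every order.
orders_flat = [
    {"order_id": 1, "customer_id": 10, "email": "ada@example.com"},
    {"order_id": 2, "customer_id": 10, "email": "ada@example.com"},
]

# Updating only one copy leaves two different emails for the same customer.
orders_flat[0]["email"] = "ada@new.example.com"
emails_flat = {row["email"] for row in orders_flat if row["customer_id"] == 10}

# Normalized layout (assumed for illustration): the email is stored once,
# keyed by customer_id, and orders only reference the customer.
customers = {10: {"email": "ada@example.com"}}
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 10},
]
customers[10]["email"] = "ada@new.example.com"
emails_normalized = {customers[o["customer_id"]]["email"] for o in orders}

print(len(emails_flat))        # 2 distinct values: the copies disagree
print(len(emails_normalized))  # 1 value: every order sees the same email
```

In the flat layout, correctness depends on every copy being updated together; in the normalized layout there is only one copy to change.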

Simplifying Data Manipulation

Because normalization organizes each record according to well-defined rules, insert, update, and delete operations are simpler and less error-prone. Database operations are carried out more effectively because the data is properly structured.

Improving Database Performance

Database normalization improves database efficiency by reducing data redundancy and complexity. Smaller, non-redundant tables make inserts and updates faster and storage more compact, although queries that combine data from several tables require joins.

Efficient Use of Storage Space

Because redundant copies are eliminated, the same information is not stored repeatedly, and the database occupies less storage space overall.

Normalization Stages (Normal Forms)

Database normalization is typically carried out through a series of "normal forms." Each normal form seeks to organize dependencies and reduce the complexity of data structures to a specific level. These stages are as follows:

1. First Normal Form (1NF)

A table is in First Normal Form (1NF) if each cell contains only one value. Each column must hold a single value per row, and each row must be unique. In other words, the data in every cell must be atomic: it cannot be meaningfully subdivided further.

Example:

In the table below, multiple categories for a product are stored in a single cell separated by commas.



This table does not comply with 1NF because the "Categories" column contains multiple values. To bring it into 1NF, each category must be placed in a separate row. The corrected table is as follows:
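
As an illustration, the conversion to 1NF could be sketched in Python as follows. The product names, categories, and the helper `to_first_normal_form` are hypothetical and are not taken from the article's table.

```python
# Hypothetical non-1NF rows: several categories are packed into one cell.
products_raw = [
    {"product_id": 1, "name": "Laptop", "categories": "Electronics, Computers"},
    {"product_id": 2, "name": "Desk", "categories": "Furniture"},
]


def to_first_normal_form(rows):
    """Emit one row per (product, category) so every cell holds a single value."""
    result = []
    for row in rows:
        for category in row["categories"].split(","):
            result.append({
                "product_id": row["product_id"],
                "name": row["name"],
                "category": category.strip(),
            })
    return result


products_1nf = to_first_normal_form(products_raw)
for row in products_1nf:
    print(row)
```

After the conversion, the pair (`product_id`, `category`) identifies each row, and no cell contains a comma-separated list.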



2. Second Normal Form (2NF)

Second Normal Form (2NF) requires that a table already satisfy 1NF and that every non-prime attribute be fully functionally dependent on the entire primary key. Partial dependencies, in which an attribute depends on only part of a composite key, must be eliminated.

Example:

In a student table containing information about students, their courses, and instructors, each row is identified by the combination of student and course. The instructor’s name depends only on the course, not on the student, which creates a partial dependency. To achieve 2NF, the course and instructor information must be moved to a separate table.



In the table above, the "Instructor" column depends on the "Course" column but not on the "StudentID" column. To satisfy 2NF, a new table must be created to separate this dependency.
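
A minimal Python sketch of this decomposition is shown below. The student IDs, course names, and instructor names are hypothetical, and the composite key (`student_id`, `course`) is an assumption made for the illustration.

```python
# Hypothetical rows keyed by (student_id, course); instructor repeats per course.
enrollments_flat = [
    {"student_id": 1, "course": "Databases", "instructor": "Dr. Kaya"},
    {"student_id": 2, "course": "Databases", "instructor": "Dr. Kaya"},
    {"student_id": 1, "course": "Algorithms", "instructor": "Dr. Demir"},
]

# The instructor depends on the course alone, so it moves to a table keyed by course.
course_instructors = {}
for row in enrollments_flat:
    known = course_instructors.setdefault(row["course"], row["instructor"])
    if known != row["instructor"]:
        # The flat table already contained contradictory copies.
        raise ValueError(f"conflicting instructors for {row['course']}")

# The enrollment table keeps only the key columns.
enrollments = sorted({(row["student_id"], row["course"]) for row in enrollments_flat})

print(course_instructors)
print(enrollments)
```

Each instructor name is now stored once per course, and the enrollment table contains no attribute that depends on only part of its key.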

3. Third Normal Form (3NF)

Third Normal Form (3NF) requires that a table already satisfy 2NF and that all transitive dependencies be eliminated. If a non-key attribute depends on another non-key attribute, which in turn depends on the primary key, then the dependent attribute must be moved to a separate table.

Example:

In a company employee table containing information about employees, their departments, and department managers, a transitive dependency exists. Employee data is not directly related to the manager; rather, the manager’s information is indirectly dependent on the department. To eliminate this dependency, department information must be stored in a separate table.



In this table, the "Manager" column depends only on the "Department" column and indirectly on the "EmployeeID" column. To satisfy 3NF, department information must be stored in a separate table.
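
The separated design could be sketched with Python's built-in `sqlite3` module as follows. The table names, column names, and sample employees are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Manager depends on the department, so it lives in the departments table.
    CREATE TABLE departments (
        department TEXT PRIMARY KEY,
        manager    TEXT NOT NULL
    );
    -- Employees reference a department instead of repeating its manager.
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL REFERENCES departments(department)
    );
    INSERT INTO departments VALUES ('Sales', 'Zeynep'), ('IT', 'Murat');
    INSERT INTO employees VALUES (1, 'Ali', 'Sales'), (2, 'Elif', 'Sales'), (3, 'Can', 'IT');
""")

# Changing a department's manager is now a single-row update.
conn.execute("UPDATE departments SET manager = 'Deniz' WHERE department = 'Sales'")

# The manager of each employee is recovered with a join.
rows = conn.execute("""
    SELECT e.employee_id, e.name, d.manager
    FROM employees AS e
    JOIN departments AS d ON d.department = e.department
    ORDER BY e.employee_id
""").fetchall()
print(rows)
conn.close()
```

Both Sales employees see the new manager after one update, because the manager is no longer copied into every employee row.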

4. Fourth Normal Form (4NF) and Fifth Normal Form (5NF)

Fourth Normal Form (4NF) addresses multivalued dependencies. If a table records two or more independent multivalued facts about the same entity (for example, the skills and the spoken languages of an employee), each of these facts must be stored in its own table.
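
A small Python sketch can illustrate a 4NF decomposition. The employee, skills, and languages below are hypothetical, and skills are assumed to be independent of languages.

```python
# Hypothetical table mixing two independent multivalued facts about one employee.
# Every skill must be paired with every language, so rows multiply.
employee_facts = {
    ("Ayse", "SQL", "English"),
    ("Ayse", "SQL", "German"),
    ("Ayse", "Python", "English"),
    ("Ayse", "Python", "German"),
}

# 4NF decomposition: one table per independent multivalued fact.
employee_skills = {(emp, skill) for emp, skill, _ in employee_facts}
employee_languages = {(emp, lang) for emp, _, lang in employee_facts}

# Joining the two tables on the employee reproduces the original rows.
rejoined = {
    (emp, skill, lang)
    for emp, skill in employee_skills
    for emp2, lang in employee_languages
    if emp == emp2
}

print(len(employee_facts), len(employee_skills), len(employee_languages))
print(rejoined == employee_facts)
```

Adding a third language to the combined table would require one new row per skill, whereas the decomposed design needs a single new row in `employee_languages`.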


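Fifth Normal Form (5NF), also called project-join normal form, requires that a table cannot be decomposed any further into smaller tables without losing information, except by decompositions that follow from its candidate keys. Formally, every join dependency in the table must be implied by the candidate keys. It is the highest commonly applied level of normalization and is used when data must be decomposed to eliminate complex join anomalies.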
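
The kind of join anomaly that 5NF targets can be sketched in Python. The agents, companies, and products below are hypothetical. The data is also assumed to obey the rule that an agent sells a product for a company whenever the agent works for that company, the company makes the product, and the agent sells that product.

```python
# Hypothetical three-way relationship: (agent, company, product).
sales = {
    ("Ali", "Acme", "car"),
    ("Ali", "Acme", "truck"),
    ("Ali", "Globex", "car"),
    ("Ali", "Globex", "truck"),
    ("Ece", "Acme", "car"),
}

# Decompose into three pairwise projections.
agent_company = {(a, c) for a, c, _ in sales}
company_product = {(c, p) for _, c, p in sales}
agent_product = {(a, p) for a, _, p in sales}

# Joining only two projections invents a row that was never recorded.
two_way = {
    (a, c, p)
    for a, c in agent_company
    for c2, p in company_product
    if c == c2
}
spurious = two_way - sales

# Joining all three projections filters the spurious row out again.
three_way = {(a, c, p) for a, c, p in two_way if (a, p) in agent_product}

print(sorted(spurious))
print(three_way == sales)
```

Under the stated assumption, the table can be replaced by the three smaller tables without losing or inventing information, which is the situation in which a 5NF decomposition applies.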

Author Information

Author: Sıla Temel, December 11, 2025 at 8:21 AM
