Tuesday, July 24, 2012

Database Normalization

Database normalization is the process of efficiently organizing data in a database. The normalization process has two goals:
  1. Eliminating redundant data, for example, the same data stored in more than one table.
  2. Ensuring data dependencies make sense.
Both of these are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored. Normalization consists of a series of guidelines that help you create a good database structure.
Normalization guidelines are divided into normal forms; think of a form as the format or the way a database structure is laid out. The aim of normal forms is to organize the database structure so that it complies with the rules of first normal form, then second normal form, and finally third normal form.
It's your choice to take it further and go to fourth normal form, fifth normal form, and so on, but generally speaking, third normal form is enough.


  1. First Normal Form (1NF)
  2. Second Normal Form (2NF)
  3. Third Normal Form (3NF)
First Normal Form (1NF) -----------

First Rule of 1NF:

You must define the data items. This means looking at the data to be stored, organizing the data into columns, defining what type of data each column contains, and finally putting related columns into their own table.
For example, you put all the columns relating to locations of meetings in the Location table, those relating to members in the MemberDetails table, and so on.

Second Rule of 1NF:

The next step is ensuring that there are no repeating groups of data. Suppose we have the following table:
CREATE TABLE CUSTOMERS(
       ID   INT              NOT NULL,
       NAME VARCHAR (20)     NOT NULL,
       AGE  INT              NOT NULL,
       ADDRESS  CHAR (25),
       ORDERS   VARCHAR(155)
);
If we populate this table for a single customer who has multiple orders, it would look something like this:
ID   NAME    AGE  ADDRESS          ORDERS
100  Sachin  36   Lower West Side  Cannon XL-200
100  Sachin  36   Lower West Side  Battery XL-200
100  Sachin  36   Lower West Side  Tripod Large
But as per 1NF, we need to ensure that there are no repeating groups of data. So let us break the above table into two parts and join them using a key as follows:
CUSTOMERS table:
CREATE TABLE CUSTOMERS(
       ID   INT              NOT NULL,
       NAME VARCHAR (20)     NOT NULL,
       AGE  INT              NOT NULL,
       ADDRESS  CHAR (25),
       PRIMARY KEY (ID)
);
This table would have the following record:
ID   NAME    AGE  ADDRESS
100  Sachin  36   Lower West Side
ORDERS table:
CREATE TABLE ORDERS(
       ID   INT              NOT NULL,
       CUSTOMER_ID INT       NOT NULL,
       ORDERS   VARCHAR(155),
       PRIMARY KEY (ID)
);
This table would have the following records:
ID  CUSTOMER_ID  ORDERS
10  100          Cannon XL-200
11  100          Battery XL-200
12  100          Tripod Large
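
As a quick check that nothing is lost by the split, the two tables can be joined back together on CUSTOMER_ID. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module and a throwaway in-memory database, neither of which is part of the original example, and other database engines may treat the column types slightly differently.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMERS(
    ID      INT         NOT NULL,
    NAME    VARCHAR(20) NOT NULL,
    AGE     INT         NOT NULL,
    ADDRESS CHAR(25),
    PRIMARY KEY (ID)
);
CREATE TABLE ORDERS(
    ID          INT NOT NULL,
    CUSTOMER_ID INT NOT NULL,
    ORDERS      VARCHAR(155),
    PRIMARY KEY (ID)
);
INSERT INTO CUSTOMERS VALUES (100, 'Sachin', 36, 'Lower West Side');
INSERT INTO ORDERS VALUES (10, 100, 'Cannon XL-200');
INSERT INTO ORDERS VALUES (11, 100, 'Battery XL-200');
INSERT INTO ORDERS VALUES (12, 100, 'Tripod Large');
""")

# Join the two tables on the customer key to rebuild the original flat rows.
flat_rows = conn.execute("""
    SELECT c.ID, c.NAME, c.AGE, c.ADDRESS, o.ORDERS
    FROM CUSTOMERS AS c
    JOIN ORDERS AS o ON o.CUSTOMER_ID = c.ID
    ORDER BY o.ID
""").fetchall()

for row in flat_rows:
    print(row)
```

The customer's name, age, and address are now stored once, yet the join still yields one row per order.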

Third Rule of 1NF:

The final rule of first normal form is to create a primary key for each table, which we have already done.
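
To see what the primary key buys you, here is a small sketch of the database rejecting a second row with the same ID. It assumes Python's built-in sqlite3 module with an in-memory database, and the second customer ('Rahul') is a made-up row used purely for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT         NOT NULL,
    NAME    VARCHAR(20) NOT NULL,
    AGE     INT         NOT NULL,
    ADDRESS CHAR(25),
    PRIMARY KEY (ID)
)
""")
conn.execute("INSERT INTO CUSTOMERS VALUES (100, 'Sachin', 36, 'Lower West Side')")

duplicate_rejected = False
try:
    # Hypothetical second row that reuses ID 100; the primary key forbids it.
    conn.execute("INSERT INTO CUSTOMERS VALUES (100, 'Rahul', 40, 'Upper East Side')")
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("Insert rejected:", exc)

# Only the first row should be stored.
row_count = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
print("Rows stored:", row_count)
```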


Second Normal Form (2NF) --------------------

Second normal form states that a table must meet all the rules for 1NF and that no column may depend on only part of the primary key (no partial dependencies).
Consider a customer-order relation in which you want to store customer ID, customer name, order ID, order detail, and date of purchase:
CREATE TABLE CUSTOMERS(
       CUST_ID    INT              NOT NULL,
       CUST_NAME VARCHAR (20)      NOT NULL,
       ORDER_ID   INT              NOT NULL,
       ORDER_DETAIL VARCHAR (20)  NOT NULL,
       SALE_DATE  DATETIME,
       PRIMARY KEY (CUST_ID, ORDER_ID)
);
This table is in first normal form, in that it obeys all the rules of first normal form. In this table, the primary key consists of CUST_ID and ORDER_ID. Combined, they are unique, assuming the same customer would hardly ever order the same thing twice.
However, the table is not in second normal form because some columns depend on only part of the primary key. CUST_NAME depends on CUST_ID alone, and there's no real link between a customer's name and what he or she purchased. Likewise, order detail and purchase date depend on ORDER_ID but not on CUST_ID, because there's no link between a CUST_ID and an ORDER_DETAIL or its SALE_DATE.
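
The practical cost of such a partial dependency is that the customer's name is repeated on every order row, so an incomplete update can leave the table contradicting itself. The sketch below illustrates this under a few assumptions: it uses Python's built-in sqlite3 module with an in-memory database, and the sample rows and the new name 'Sachin T' are invented for the demonstration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMERS(
    CUST_ID      INT         NOT NULL,
    CUST_NAME    VARCHAR(20) NOT NULL,
    ORDER_ID     INT         NOT NULL,
    ORDER_DETAIL VARCHAR(20) NOT NULL,
    SALE_DATE    DATETIME,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);
-- Hypothetical sample rows: one customer with two orders.
INSERT INTO CUSTOMERS VALUES (1, 'Sachin', 10, 'Tripod Large', '2012-07-01');
INSERT INTO CUSTOMERS VALUES (1, 'Sachin', 11, 'Battery XL-200', '2012-07-02');
""")

# Rename the customer, but touch only one of the two rows (an incomplete update).
conn.execute(
    "UPDATE CUSTOMERS SET CUST_NAME = 'Sachin T' WHERE CUST_ID = 1 AND ORDER_ID = 10"
)

# The same CUST_ID now maps to two different names.
names = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT CUST_NAME FROM CUSTOMERS WHERE CUST_ID = 1 ORDER BY CUST_NAME"
    )
]
print(names)
```

Nothing in the single-table design stops this inconsistency, which is why the name belongs in a table keyed by CUST_ID alone.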
To make this table comply with second normal form, you need to separate the columns into three tables.
First, create a table to store the customer details as follows:
CREATE TABLE CUSTOMERS(
       CUST_ID    INT              NOT NULL,
       CUST_NAME VARCHAR (20)      NOT NULL,
       PRIMARY KEY (CUST_ID)
);
Next, create a table to store details of each order. Because the purchase date depends on ORDER_ID alone, SALE_DATE belongs here:
CREATE TABLE ORDERS(
       ORDER_ID   INT              NOT NULL,
       ORDER_DETAIL VARCHAR (20)  NOT NULL,
       SALE_DATE  DATETIME,
       PRIMARY KEY (ORDER_ID)
);
Finally, create a third table storing just CUST_ID and ORDER_ID to keep track of all the orders for a customer:
CREATE TABLE CUSTOMERORDERS(
       CUST_ID    INT              NOT NULL,
       ORDER_ID   INT              NOT NULL,
       PRIMARY KEY (CUST_ID, ORDER_ID)
);
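
With the data split three ways, a customer's orders are found by joining through the link table, and a name change becomes a single-row update. The following is a rough sketch rather than a definitive schema: it assumes Python's built-in sqlite3 module with an in-memory database, spells the link table CUSTOMERORDERS, leaves SALE_DATE out to keep the example short, and uses invented sample rows.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMERS(
    CUST_ID   INT         NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    PRIMARY KEY (CUST_ID)
);
CREATE TABLE ORDERS(
    ORDER_ID     INT         NOT NULL,
    ORDER_DETAIL VARCHAR(20) NOT NULL,
    PRIMARY KEY (ORDER_ID)
);
CREATE TABLE CUSTOMERORDERS(
    CUST_ID  INT NOT NULL,
    ORDER_ID INT NOT NULL,
    PRIMARY KEY (CUST_ID, ORDER_ID)
);
-- Hypothetical sample rows: one customer with two orders.
INSERT INTO CUSTOMERS VALUES (1, 'Sachin');
INSERT INTO ORDERS VALUES (10, 'Tripod Large');
INSERT INTO ORDERS VALUES (11, 'Battery XL-200');
INSERT INTO CUSTOMERORDERS VALUES (1, 10);
INSERT INTO CUSTOMERORDERS VALUES (1, 11);
""")

# The name lives in exactly one row, so one UPDATE changes it everywhere.
conn.execute("UPDATE CUSTOMERS SET CUST_NAME = 'Sachin T' WHERE CUST_ID = 1")

# Follow the link table from the customer to each of the customer's orders.
orders_for_customer = conn.execute("""
    SELECT c.CUST_NAME, o.ORDER_DETAIL
    FROM CUSTOMERORDERS AS co
    JOIN CUSTOMERS AS c ON c.CUST_ID = co.CUST_ID
    JOIN ORDERS AS o ON o.ORDER_ID = co.ORDER_ID
    WHERE c.CUST_ID = 1
    ORDER BY o.ORDER_ID
""").fetchall()

for row in orders_for_customer:
    print(row)
```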

Third Normal Form (3NF)----------------------------------------

A table is in third normal form when the following conditions are met:
  • It is in second normal form.
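  • All non-key fields depend directly on the primary key, and not on any other non-key field.
The problem arises when one non-key field depends on another non-key field instead of on the key itself. For example, in the table below, street name, city, and state are treated as bound to the zip code rather than directly to the customer.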
CREATE TABLE CUSTOMERS(
       CUST_ID       INT              NOT NULL,
       CUST_NAME     VARCHAR (20)      NOT NULL,
       DOB           DATE,
       STREET        VARCHAR(200),
       CITY          VARCHAR(100),
       STATE         VARCHAR(100),
       ZIP           VARCHAR(12),
       EMAIL_ID      VARCHAR(256),
       PRIMARY KEY (CUST_ID)
);
The dependency between zip code and address is called a transitive dependency: the address fields depend on ZIP, which in turn depends on CUST_ID. To comply with third normal form, all you need to do is move the Street, City, and State fields into their own table keyed by zip code, named ADDRESS here:
CREATE TABLE ADDRESS(
       ZIP           VARCHAR(12),
       STREET        VARCHAR(200),
       CITY          VARCHAR(100),
       STATE         VARCHAR(100),
       PRIMARY KEY (ZIP)
);
Next, redefine the CUSTOMERS table so that it keeps only ZIP as a reference to the address:
CREATE TABLE CUSTOMERS(
       CUST_ID       INT              NOT NULL,
       CUST_NAME     VARCHAR (20)      NOT NULL,
       DOB           DATE,
       ZIP           VARCHAR(12),
       EMAIL_ID      VARCHAR(256),
       PRIMARY KEY (CUST_ID)
);
The advantages of removing transitive dependencies are mainly twofold. First, the amount of data duplication is reduced and therefore your database becomes smaller.
The second advantage is data integrity. When duplicated data changes, there's a big risk of updating only some of the data, especially if it's spread out in a number of different places in the database. For example, if address and zip code data were stored in three or four different tables, then any changes in zip codes would need to ripple out to every record in those three or four tables.
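
The integrity benefit can be sketched with the ADDRESS and CUSTOMERS tables from this section: when a city is stored once per zip code, a single UPDATE is seen by every customer in that zip code. This is only an illustration under stated assumptions. It uses Python's built-in sqlite3 module with an in-memory database, and the zip code 'Z1', the city names, and the two customers are placeholder values, not real data.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ADDRESS(
    ZIP    VARCHAR(12),
    STREET VARCHAR(200),
    CITY   VARCHAR(100),
    STATE  VARCHAR(100),
    PRIMARY KEY (ZIP)
);
CREATE TABLE CUSTOMERS(
    CUST_ID   INT         NOT NULL,
    CUST_NAME VARCHAR(20) NOT NULL,
    DOB       DATE,
    ZIP       VARCHAR(12),
    EMAIL_ID  VARCHAR(256),
    PRIMARY KEY (CUST_ID)
);
-- Placeholder rows: two customers sharing one zip code.
INSERT INTO ADDRESS VALUES ('Z1', 'Main Street', 'Oldtown', 'Some State');
INSERT INTO CUSTOMERS VALUES (1, 'Sachin', NULL, 'Z1', NULL);
INSERT INTO CUSTOMERS VALUES (2, 'Rahul', NULL, 'Z1', NULL);
""")

# The city is stored once, so one UPDATE is enough.
conn.execute("UPDATE ADDRESS SET CITY = 'Newtown' WHERE ZIP = 'Z1'")

# Every customer in that zip code sees the new city through the join.
cities = conn.execute("""
    SELECT c.CUST_NAME, a.CITY
    FROM CUSTOMERS AS c
    JOIN ADDRESS AS a ON a.ZIP = c.ZIP
    ORDER BY c.CUST_ID
""").fetchall()

for row in cities:
    print(row)
```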
