Thursday, September 4, 2008

List of Normalization Techniques

Normalization

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process:
eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense
(only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored.

Database normalization, sometimes referred to as canonical synthesis, is a technique for designing relational database
tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of
logical or structural problems, namely data anomalies.

For the most part, the first 3 normal forms are common sense.
1. No repeating elements or groups of elements
2. No partial dependencies on a concatenated key
3. No dependencies on non-key attributes

First Normal Form (1NF) - No repeating elements or groups of elements

i. No two rows are identical
ii. Each table entries having single value
iiiv. Eliminate duplicative columns from the same table
iv. Create separate tables for each group of related data and
v. Identify each row with a unique column or set of columns (the primary key).


Second Normal Form (2NF) - No partial dependencies on a concatenated key

* Meet all the requirements of the first normal form.
* Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
* Create relationships between these new tables and their predecessors through the use of foreign keys.

i. The table must be in 1NF.
ii. A table is 2NF if it is in 1NF and all fields are dependent on the whole of the primary key,
or a relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key of the relation.
iii. None of the non-prime attributes of the table are functionally dependent on a part (proper subset) of a candidate key;
in other words, all functional dependencies of non-prime attributes on candidate keys are full functional dependencies.
For example, consider an "Employees' Skills" table whose attributes are Employee ID, Employee Name, and Skill;
and suppose that the combination of Employee ID and Skill uniquely identifies records within the table.
Given that Employee Name depends on only one of those attributes – namely, Employee ID – the table is not in 2NF.


Third Normal Form (3NF) - No dependencies on non-key attributes

A 3NF definition that is equivalent to Codd's, but expressed differently, was given by Carlo Zaniolo in 1982.

i. The table must be in 2NF.
ii. Remove columns that are not dependent upon the primary key.
iii. Transitive dependencies must be eliminated. All attributes must rely only on the primary key.
So, if a database has a table with columns Student ID, Student, Company, and Company Phone Number, it is not in 3NF. This is because the Phone number relies on the Company. So, for it to be in 3NF, there must be a second table with Company and Company Phone Number columns; the Phone Number column in the first table would be removed.if you have three columns in a single column. A COLUMN IS STUDENT ID , B COLUMN IS STUDENT NAME AND C COLUMN IS STUDENT ADDRESS

A -> B
A -> C
C -> B

Boyce-Codd normal form (BCNF)

A table is in Boyce-Codd normal form (BCNF) if and only if, for every one of its non-trivial functional dependencies,
X -> Y,
X is a superkey—that is,
X is either a candidate key or a superset thereof.


Fourth Normal Form (4NF)
-------------------------

* Meet all the requirements of the third normal form.
* A relation is in 4NF if it has no multi-valued dependencies.

A table is in fourth normal form (4NF) if and only if, for every one of its non-trivial multivalued dependencies
X ->> Y,
X is a superkey—that is,
X is either a candidate key or a superset thereof.

* For example: if you can have two phone numbers values and two email address values,
then you should not have them in the same table.


Normalizing an Example Table

These steps demonstrate the process of normalizing a fictitious student table.
1. First Normal Form: Repeating groups across columns

The designer might attempt to get around this restriction by defining multiple Telephone Number columns:

Customer ID First Name Surname TelNo1 TelNo2 TelNo3

2. First Normal Form: Repeating groups within columns

Storing Multiple Telephone Numbers on a single columns

TelNo
555-403-1659, 555-776-4100

3. First Normal Form: Unnormalized table:

Student# Advisor Adv-Room Class1 Class2 Class3
1022 Jones 412 101-07 143-01 159-02
4123 Smith 216 201-01 211-02 214-01

4. First Normal Form: No Repeating Groups

Tables should have only two dimensions. Since one student has several classes, these classes should be listed in a separate table. Fields Class1, Class2, and Class3 in the above records are indications of design trouble.

Spreadsheets often use the third dimension, but tables should not. Another way to look at this problem is with a one-to-many relationship, do not put the one side and the many side in the same table. Instead, create another table in first normal form by eliminating the repeating group (Class#), as shown below:

Student# Advisor Adv-Room Class#
1022 Jones 412 101-07
1022 Jones 412 143-01
1022 Jones 412 159-02
4123 Smith 216 201-01
4123 Smith 216 211-02
4123 Smith 216 214-01

5. Second Normal Form: Eliminate Redundant Data

Note the multiple Class# values for each Student# value in the above table. Class# is not functionally dependent on Student# (primary key), so this relationship is not in second normal form.

The following two tables demonstrate second normal form:

Students:

Student# Advisor Adv-Room
1022 Jones 412
4123 Smith 216


Registration:

Student# Class#
1022 101-07
1022 143-01
1022 159-02
4123 201-01
4123 211-02
4123 214-01

4. Third Normal Form: Eliminate Data Not Dependent On Key

In the last example, Adv-Room (the advisor's office number) is functionally dependent on the Advisor attribute. The solution is to move that attribute from the Students table to the Faculty table, as shown below:

Students:

Student# Advisor
1022 Jones
4123 Smith


Faculty:

Name Room Dept
Jones 412 42
Smith 216 42

No comments:

Popular Posts