Introduction
As the database grows, the ER diagram representation becomes more complex and crowded. It is creates a difficult situation to understand the requirement and their structure as a whole. Similarly, if ER diagram is represented at very high level, it again creates a difficulty in understanding the system. But representation at high level and till the minute levels is very necessary to understand the system well. These concepts are well defined by generalization and specialization. Sometimes, we would have divided the entities into two or more entities to be more accurate in design. But when compared to the whole database or user, it can be combined to one entity. Such a process is called as aggregation.
Once designing ER diagram is complete, we need to put it into logical structure. But how it can be done? Let us discuss this in the last section.
Generalization
In our Employee example, we have seen different types of employees like Engineer, Accountant, Salesperson, Clerk etc. Similarly each employee belongs to different departments. We can represent it in an ER diagram as below. When you see this diagram for the first time, you will not understand it quickly. One will take time to understand it or he might misunderstand some requirement.
What if we group all the sub departments into one department and different employees into one employee? However sub departments and different employee types have same features in their own domain. So if we merge the child entities into their parent, it makes the diagram simpler, hence easy to understand. This method of merging the branches into one is called generalization. We can see the generalized structure of requirement to understand it quickly. So above ER diagram will be changed to as below:
Isn’t it simpler? Generalization is the bottom up approach which helps to design the requirement at high level. Thus making one to understand quickly.
Specialization
It is opposite approach of generalization. Here, each entity is further divided into sub levels to understand it deeper. In the above example, Department entity is further divided into sub departments to understand how they are scattered. This method of representation helps the developer to code correctly and quickly. It is a top down approach of breaking higher level entity to low level entity. Once the entities are understood at higher level, it makes easy to understand the requirement at low level.
One more example of specialization would be Person. We can further divide person as STUDENT, TEACHER, ENGINEER, SOLDIER etc. (Merging STUDENT, TEACHER, ENGINEER etc into PERSON is an example of generalization).
Aggregation
Look at below ER diagram of STUDENT, COURSE and SUBJECTS. What does it infer? Student attends the Course, and he has some subjects to study. At the same time, Course offers some subjects. Here a relation is defined on a relation. But ER diagram does not entertain such a relation. It supports mapping between entities, not between relations. So what can we do in this case?
If we look at STUDENT and COURSE from SUBJECT’s point of view, it does not differentiate both of them. It offers it’s subject to both of them. So what can we do here is, merge STUDENT and COURSE as one entity. This process of merging is called aggregation. It is completely different from generalization. In generalization, we merge entities of same domain into one entity. In this case we merge related entities into one entity.
Here we have merged STUDENT and COURSE into one entity STUDENT_COURSE. This new entity forms the mapping with SUBJECTS. The new entity STUDENT_COURSE, in turn has two entities STUDENT and COURSE with ‘Attends’ relationship.
Transform ER Diagram into Tables
Since ER diagram gives us the good knowledge about the requirement and the mapping of the entities in it, we can easily convert them as tables and columns. i.e.; using ER diagrams one can easily created relational data model, which nothing but the logical view of the database.
There are various steps involved in converting it into tables and columns. Each type of entity, attribute and relationship in the diagram takes their own depiction here. Consider the ER diagram below and will see how it is converted into tables, columns and mappings.
The basic rule for converting the ER diagrams into tables is
- Convert all the Entities in the diagram to tables.
All the entities represented in the rectangular box in the ER diagram become independent tables in the database. In the below diagram, STUDENT, COURSE, LECTURER and SUBJECTS forms individual tables.
- All single valued attributes of an entity is converted to a column of the table
All the attributes, whose value at any instance of time is unique, are considered as columns of that table. In the STUDENT Entity, STUDENT_ID, STUDENT_NAME form the columns of STUDENT table. Similarly, LECTURER_ID, LECTURER_NAME form the columns of LECTURER table. And so on.
- Key attribute in the ER diagram becomes the Primary key of the table.
In diagram above, STUDENT_ID, LECTURER_ID, COURSE_ID and SUB_ID are the key attributes of the entities. Hence we consider them as the primary keys of respective table.
- Declare the foreign key column, if applicable.
In the diagram, attribute COURSE_ID in the STUDENT entity is from COURSE entity. Hence add COURSE_ID in the STUDENT table and assign it foreign key constraint. COURSE_ID and SUBJECT_ID in LECTURER table forms the foreign key column. Hence by declaring the foreign key constraints, mapping between the tables are established.
- Any multi-valued attributes are converted into new table.
A hobby in the Student table is a multivalued attribute. Any student can have any number of hobbies. So we cannot represent multiple values in a single column of STUDENT table. We need to store it separately, so that we can store any number of hobbies, adding/ removing / deleting hobbies should not create any redundancy or anomalies in the system. Hence we create a separate table STUD_HOBBY with STUDENT_ID and HOBBY as its columns. We create a composite key using both the columns.
- Any composite attributes are merged into same table as different columns.
In the diagram above, Student Address is a composite attribute. It has Door#, Street, City, State and Pin. These attributes are merged into STUDENT table as individual columns.
- One can ignore derived attribute, since it can be calculated at any time.
In the STUDENT table, Age can be derived at any point of time by calculating the difference between DateOfBirth and current date. Hence we need not create a column for this attribute. It reduces the duplicity in the database.
These are the very basic rules of converting ER diagram into tables and columns, and assigning the mapping between the tables. Table structure at this would be as below:
Let us see some of the special cases.
- Converting Weak Entity
Weak entity is also represented as table. All the attributes of the weak entity forms the column of the table. But the key attribute represented in the diagram cannot form the primary key of this table. We have to add a foreign key column, which would be the primary key column of its strong entity. This foreign key column along with its key attribute column forms the primary key of the table.
In our example above, SUBJECTS is the weak entity. Hence, we create a table for it. Its attributes SUBJECT_ID and SUBJECT_NAME forms the column of this table. Although SUBJECT_ID is represented as key attribute in the diagram, it cannot be considered as primary key. In order to add primary key to the column, we have to find the foreign key first. COURSE is the strong entity related to SUBJECT. Hence the primary key COURSE_ID of COURSE is added to SUBJECT table as foreign key. Now we can create a composite primary key out of COURSE_ID and SUBJECT_ID.
- Representing 1:1 relationship
Imagine SUBJECT is not a weak entity, and we have LECTURER teaches SUBJECT relation. It is a 1:1 relation. i.e.; one lecturer teaches only one subject. We can represent this case in two ways
- Create table for both LECTURER and SUBJECT. Add the primary key of LECTURER in SUBJECT table as foreign key. It implies the lecturer name for that particular subject.
- Create table for both LECTURER and SUBJECT. Add the primary key of SUBJECT in LECTURER table as foreign key. It implies the subject taught by the lecturer.
- Representing 1:N relationship
Consider SUBJECT and LECTURER relation, where each Lecturer teaches multiple subjects. This is a 1: N relation. In this case, primary key of LECTURER table is added to the SUBJECT table. i.e.; the primary key at 1 cardinality entity is added as foreign key to N cardinality entity
- Representing M:N relationship
Consider the example, multiple students enrolled for multiple courses, which is M:N relation. In this case, we create STUDENT and COURSE tables for the entities. Create one more table for the relation ‘Enrolment’ and name it as STUD_COURSE. Add the primary keys of COURSE and STUDENT into it, which forms the composite primary key of the new table.
That is, in this case both the participating entities are converted into tables, and a new table is created for the relation between them. Primary keys of entity tables are added into new table to form the composite primary key. We can add any additional columns, if present as attribute of the relation in ER diagram.
- Self Referencing 1:N relation
Consider the example of HOD and Lecturers. Here one of the Lecturers is a HOD of the department. i.e.; one HOD has multiple lecturers working with him. In this case, we create LECTURER table for the Lecturer entity. Create the columns and primary keys as usual. In order to represent HOD, we add one more column to LECTURER table which is same column as primary key, but acts as a foreign key. i.e.; LECTURER_ID is the primary key of LECTURER table. We add one more column HOD, which will have LECTURER_ID of the HOD. Hence LECTURER table will show HOD’s Lecturer ID for each Lecturer. In this case, primary key column acts as a foreign key in the same table.
- Self Referencing M:N relation
Consider Student and Teacher example as ‘Manny students have Many Teachers teaching the subjects’. Here relation between Student and Teacher is M:N. In this case, create independent tables for student and teacher, and set their primary keys. Then we create a new table for the relationship ‘have’ as STUDENT_TEACHER, which will have student and teacher combination, and any other columns if applicable. Basically, student-teacher combination is the two primary key columns from respective tables, hence establishing the relationship between them. Both the primary keys from both tables act as a composite primary key in the new table. This reduces the storing of redundant data and consistency in the database.