Three ways to implement inheritance in a database using the Vertabelo data modeler.
Inheritance is a common modeling technique used in modern software development. In data modeling, you can use inheritance in the logical model creation process. However, implementing inheritance in a physical database model is not straightforward; standard SQL does not provide inheritance statements for physical implementation.
In this article, we will cover three basic strategies used to implement inheritance in a relational database. And we’ll show examples of inheritance modeling being implemented in Vertabelo.
Inheritance in Database Modeling
We know that the inheritance technique is widely used to define the classes and subclasses of an object-oriented software model. However, it is also possible to use inheritance when defining a database model. When we create the logical model, we can apply the concept of inheritance to represent the idea that an entity (often called the child entity) is derived (i.e. it inherits) from another entity (the parent entity). Usually, the child entity has all the elements of its parent entity, plus some additional attributes.
Vertabelo allows us to use inheritance in database logical data models to represent the idea of a child entity being derived from a parent entity, as we can see in the following image.
In the above logical model, we used inheritance to represent the idea that “any car is a vehicle” and also any “bike is a vehicle”. However, some attributes are common to all vehicles (like brand and model); we put these in the vehicle
entity. The other attributes are specific to cars or bikes, so we put them in the relevant child entity.
After we finish the creation of the logical model, we need to convert it into a physical model. A physical model is a set of tables, attributes, primary keys, foreign keys, and other elements that we create using SQL in a given relational database management system.
Remember, implementing inheritance in a relational database is not straightforward. There are no built-in inheritance mechanisms in standard SQL.
Let’s consider three possible strategies to convert the entities connected by inheritance in the logical model into the tables of the physical model. These strategies are:
- One table per inheritance hierarchy.
- One table per entity.
- One table per entity with all attributes.
Before going to the next section, you may want to read the article How to Generate a Physical Diagram from a Logical Diagram in Vertabelo. It explains the complete process of creating a physical model from a logical model using Vertabelo.
3 Ways to Model Inheritance in a Database
One Table Per Inheritance Hierarchy
Before we describe this strategy, let’s expand our previous inheritance logical model to include a three-level inheritance hierarchy.
When we convert the logical model into a physical model using this method of implementing inheritance, we will get only one table to represent all the inheritance hierarchy. This table will have one column for each attribute in the hierarchy. In other words, all the attributes from any entity in the inheritance will be in the generated table.
It is important to note that the generated inheritance table will have an additional column called discriminator
, which will be used to identify what type of entity in the hierarchy the record is representing. In our example, the discriminator will indicate if the record is a car or a bike.
If we use Vertabelo to immplement this strategy, we will obtain the following physical model. Note that all the inheritance hierarchy entities are implemented by using only one table that has the name of the first level inheritance entity in the logical data model: vehicle
. The other table in the physical model is related to the entity in the logical model (car_owner
), which was not part of the inheritance hierarchy.
Note that the table vehicle
has columns taken from four different entities in the inheritance model: vehicle
, car
, bike
, and electrical_bike
. Many columns are nullable; in this table we can represent car objects or bike objects; thus, if a record is representing a car, all bike related attributes will be NULL
. There are also two extra columns that did not come from the inheritance model. The first is the discriminator
column, which is used to identify what type of object (car or bike) is represented by that record. The other is car_owner_person_id
, which is part of the foreign key pointing to the car_owner
table.
One of the advantages of this strategy is its simplicity. As we have only one table, access to all vehicles will be simple because all vehicles are in the same table. By the same token, access to all subtype elements will also be simple; we only need to use a WHERE
clause like WHERE discriminator = ‘Bike’
.
On the other hand, the disadvantage is the need to use a discriminator to distinguish between different types of objects so we can know which fields are relevant. If the discriminator value is for a bike object, then the table columns related to car objects are not relevant.
In the following example data set, we can see how data is stored when using this physical strategy and how the discriminator
works:
Discriminator | Vehicle id | brand | model | car_id | seats | Car type | car_owner person_id | Bike id | Wheel size | Bike type | Electrical Bike_id | Battery Charge time | Autonomy time |
Bike | 100 | Stajvelo | EL1 | NULL | NULL | NULL | NULL | 1 | 30 inch | City electrical | 1000 | 1 hour | 6 hours |
Car | 101 | BMW | Z8 | 1 | 2 | sport | 1092341 | NULL | NULL | NULL | NULL | NULL | NULL |
Bike | 102 | Stajvelo | EL2 | NULL | NULL | NULL | NULL | 2 | 30 inch | Off Road electrical | 1001 | 2 hours | 9 hours |
Bike | 103 | Stajvelo | CT1 | NULL | NULL | NULL | NULL | 3 | 28 inch | City traditional | NULL | NULL | NULL |
You can see how the data is stored in the table. For readability reasons, I used a text value (‘Car’, ‘Bike’) in the discriminator
column instead of an integer value (as the physical model defined). When the discriminator is ‘Bike’, you can see that all columns for a car object are NULL. And when the discriminator is ‘Car’, all columns for a bike object are NULL.
If we have relationships in the logical data model between an entity in the inheritance hierarchy and another entity outside the hierarchy, then the physical model will treat these relationships as if they were relationships with the parent entity; references are generated as if they were to the parent table. An example of that is the table car_owner
, which is related to the table vehicle
.
In the article How to Model Inheritance in a Relational Database, you will learn key concepts in inheritance data modeling, such as supertypes and subtypes. Give it a read for more information on this topic.
One Table Per Entity
This strategy of inheritance implementation will create one table in the physical model for each entity in the logical model’s inheritance hierarchy. The supertype entity in the logical model will be the parent table and all the subtype entities will be converted to tables with foreign keys referencing the primary key of the parent table. Each table will have only the columns of the entity from which it derives.
The resulting physical model is simple and we no longer need the discriminator
column. All the relationships between tables in the inheritance hierarchy will be one-to-one. One disadvantage for this strategy is that, in order to gather all the attributes of the supertype objects (for example vehicles), we need to traverse multiple tables.
In the following image, we will show how the physical model looks when we apply the “one table per entity” strategy.
Note that we have one table per entity in the inheritance hierarchy. We also have foreign keys between every parent and child entity connected by inheritance. Another important change is that the discriminator
column is not needed; different objects (cars, bikes) are stored in different tables. One other point to mention is the relationship between car_owner
is not with the vehicle
table, as it was in the previous physical model. Now the relationship is between the car_owner
and car
tables.
Let’s see how the data will be stored in this physical model. We will use the same records as in the previous section.
In the generated physical model, we have one-to-one relationships connecting every entity with its parent entity. When the tables are created, these relationships are implemented as foreign keys. For example, the column vehicle_id
is a foreign key of the table bike
, pointing to the primary key of the table vehicle
. Thus, bike.vehicle_id
will only contain valid values from vehicle.vehicle_id.
For example, if the values of bike.vehicle_id
are 100, 102 and 103, and we look at vehicle
, all of them exist as valid vehicle_id
values.
One Table Per Entity with All Attributes
Finally, we will show the third strategy we can use to create a physical database model that implements inheritance. In this strategy, we create one table per entity in the inheritance hierarchy, but each table will have all the attributes of its parent entities, as we can see below.
In this physical data model, each entity in the inheritance hierarchy appears with all the columns of its parent entity. One point to note is the composition of the primary keys in the tables car
and bike
. For example, the bike
primary key is formed by the columns bike_id
and vehicle_id
.
One advantage of this approach is that we don’t need to access several tables (as we did in the previous approach). If we need to access all the bike attributes, then all of them will be in the bike
table; the same idea applies to the electrical_bikes
or car
tables. Another advantage is that we don’t need a discriminator
column
The disadvantage of this strategy is data duplication: modifying data will take more work. For example, the brand
and model
attributes for bikes are stored in 3 tables: vehicle
, bike
, and electrical_bike
.
The relationships between tables of the inheritance hierarchy will continue as one-to-one relations, as it did in the previous strategy. Each child table will have a foreign key pointing to its parent table. However, when primary keys are composed of two columns, then foreign keys must follow the same format. You can see that the foreign key in electrical_bike
that points to the bike
table is now a composite foreign key. You can see the values of foreign keys in the following image of how data is stored when this strategy is used:
Note the duplication of data in the above image. Some columns like bike_type
or wheel_size
are duplicated, while others (like brand
and model
) are in three tables. Look at the model
values ‘EL1’ and ‘EL2’. They are in the tables electrical_bike
, bike
, and vehicle
.
Database Tables Can Inherit, Too
In this article, we covered how to model inheritance in databases and how to convert a logical database model that includes an inheritance hierarchy to a physical model. We showed three basic strategies for creating the physical tables representing the inheritance hierarchy – and as Vertabelo supports these three implementation strategies, we showed images with the physical model generated in Vertabelo and a sample of how the data would be stored. Hopefully, this will help you use inheritance in your next data model.