5 Steps for an Effective Database Model
Database design is the process of producing a detailed model of a database. This model contains the necessary logical (table names, column names) and physical (column datatypes, foreign keys) choices to translate the design into a data definition language (aka SQL), which can be used to create the actual physical database.
When I need to create the design for a new database, in other words, the data layer for an application, I follow a few mental steps that I think can help others when they need to go through the same process. And, to be honest, for me, I progress through the first steps mentally without actually working on the technical details – and sometimes at a more subconscious level.
For me, the first step is to get a high-level grasp of the topic and an understanding of the business or functional area. Generally this is referred to as the business domain.
What is the domain that this solution needs to address? What are the issues in this domain? What are the types of information that need to be held in the database? Take the example of a human resources database for a company: you would need to model employees, their marital status, employment status, salary, holiday periods, etc.
What is the functionality that is required? In other words, what are the Use Cases related to this data? What types of functionality do you need to support: creating and maintaining (update, delete, edit) items, reporting and analysis, etc? Hopefully, the functional requirements of the application have already been defined, but that is not always the case. Make a real effort to have a high-level understanding of how the data will be used.
Now this gets interesting: what functionality is allowed for an employee? Should all basic CRUD (Create, Retrieve, Update, Delete) functionality be allowed – creating new employees, editing employees when their situation or employment status changes (s/he gets married or divorced, resigns, is fired, etc)? Most likely you will allow only Create-Retrieve-Update functionality since employee records may need to be kept for a very long period (e.g. 10 years) and should not be immediately deleted.
Now you should have a concept in your head of what you need to create and you know the types of interactions that are necessary with the data (and therefore with the database).
To actually build the database, you need to start working with the database entities: modelling the main entities of the system.
The good thing about thinking about the domain and the functionality is that you probably have actually defined what the main entities in the database are likely to be. You know what the contents of the database are and how the content will be used.
When considering the domain, we already mentioned most of the entities for a human resources database: employees’ marital status, employment status and salary. Can marital status and salary simply be columns on the employees table or is it necessary to keep a history of what an employee’s salary was in the past? Usually, you need to keep the employment history so we should add tables for status history, salary history, and probably also marital history.
The next level is to understand how the entities are related. What entities are linked to what other entities (e.g. users to the items that they have created)?
Should these relationships be well-defined or casual in the database (foreign keys or loose relations with the related ids stored, but not actually defined as a foreign key in the physical model)?
We said that several columns of the employee table will have a well-defined value, such as their status: single, married, divorced. Here is a perfect example where we might link a column to a table of appropriate values via a foreign key so that the database itself ensures the integrity of the data. It is also possible to rely on the application that is creating rows in the database, but why not use the power of a database’s foreign keys to ensure data integrity? That way, you can avoid having the application introduce errors into the data.
5. Formal Design
Now that you know the entities and relationships, you are ready to build a model or an Entity Relationship Diagram (ERD) of the database, and that should not take too long as you know what you want to create. This is where tools come in handy.
Let us consider Vertabelo for creating the formal design.
First, create a model for the database and start adding in the entities that you thought of previously. Vertabelo will remind you that you need to define primary keys for each table; I recommend using id fields as that will give you more potential flexibility for the future. Next, add in the relationships that you considered previously.
I typically add timestamps with the date/time of the creation of each row, so that the information can be displayed in the application (for example “Created 24 December 2014”).
In the business area that I work in, financial services, it is also very important to keep a record of the last user that modified a row and when the row was modified to have at least some traceability of changes. Of course, other business areas may not have this need for traceability. So we want a reference between “user last changed” to the table of users. However, we may want to allow a user to be deleted even if he or she was the last user that changed a row. If that is the case (that a user can be deleted), then we need to loosen that referential integrity constraint and remove the foreign key from the “user last changed” to the table of users.
After creating the basic model, you should be able to start thinking about improvements. What additional information might be stored in each entity? What additional details and attributes exist for each entity?
I have found these steps to be very effective in helping me create my database models.
What steps do you use to design yours?