September 2, 2021 - 9 minutes read

What Is Data Modeling?

Kiana is a Google Cloud Certified Professional Data Engineer who currently works as a Google BigQuery developer for Pointwest. When she’s not working, she’s passionate about video games and is an aspiring streamer. These days, she mainly streams Valorant, as she is an avid enthusiast of competitive first-person shooters.

Tags:

data modeling

Have you ever wondered how data modeling works?

Data modeling is the process of creating a data model. A data model can be thought of as a blueprint that describes the structure and internal relationships in a database. It is meant to demonstrate how objects within the database interact with each other to fulfill the prescribed business logic.

Since relational databases are composed of tables, a data model is a diagram that illustrates the relationship between these tables. It helps identify the purpose and dependencies of each table. This is especially true when working with databases like MySQL and PostgreSQL, just to name a few.

The usual data model creation process consists of three phases: conceptual, logical, and physical. Additionally, there are several different types of data models, such as ER diagrams, object-oriented data models, and dimensional data models.

Though there are many considerations in building a data model, your decision will be largely driven by the business requirements. This is because one of the main purposes of a database system is to provide business intelligence. In turn, this allows analysts to extract meaningful insights that enable decision-makers to make data-driven decisions.

Data Model Elements

A data model normally consists of three elements: entities, attributes, and relationships. These elements are instrumental in the database design process, as they represent actual database objects.

Note the ER diagram above. It describes a data model for a school records system.

Furthermore, it demonstrates the usage of entities, attributes, and relationships. We’ll take a closer look at these elements below.

Entities

An entity is an object, such as a person or place, whose data will be stored in the database. For instance, in a database that is meant to keep track of class records, we are most likely going to want to know information about students, teachers, and classes. Thus, in our database, the entities would be students, teachers, and classes.

Attributes

An attribute is a particular bit of information about an entity. For instance, a student entity would have a name attribute. It would be quite hard to access student records without the student’s name, right? Other important attributes for a student would be their age, address, and class records.

Relationships

Finally, relationships describe how the entities in a database relate to each other. There are many kinds of relationships. Here are some examples:

The one-to-many relationship describes a relationship between two entities where Entity A can be linked to many instances of Entity B but Entity B can only be linked to one instance of Entity A. A good example of this relationship is one between a teacher and class. A teacher can teach several classes, but usually a class has one teacher.

Types of Data Models

The types of data models can vary according to the level of abstraction. Abstraction in programming refers to how simple a concept appears to be. A highly abstracted concept is one that is simplified for easier understanding. On the other hand, a concept that has a low level of abstraction includes many technical details – making it different for inexperienced programmers to understand.

For instance, a basic data model would likely be a conceptual model because it is highly abstracted. The creation of data models can be thought of as a progressive process, where we start with the most basic foundation and build our way to a more sophisticated and detailed diagram.

In this article, we’ll only cover the basic definition of each phase. For a more in-depth guide, see this article on conceptual, logical, and physical data models. And we’ll use the entity-relationship diagram – also called an ER diagram or ERD – to draw our models. Check this article out to learn why we need an ER diagram.

Conceptual Data Model

This is the simplest model. There are normally only two elements included: the entity and their relationships. You can see this in the example below:

This conceptual data model describes a database system that keeps track of school records. There are three entities: student, teacher, and class. A student can have many teachers and classes. A teacher can have many students and classes. Finally, a class can have many students but only one teacher.

The conceptual data model makes it easy for non-technical people to understand what is going on between the entities. It’s intended to be as intuitive as possible to convey important information to stakeholders – whether they have a technical understanding of databases or not. When you’re building sophisticated database models, it’s wise to start at the conceptual level and slowly add more detail.

Logical Data Model

The logical model is one of the most-used data models, as it strikes a balance between readability and technicality. It includes entities, relationships, attributes, primary keys, and foreign keys.

In essence, it describes the logical flow of data within a database without including low-level details, such as the database and data types to be used. From a software standpoint, the logical data model is quite enough to start building an actual database.

Building on the conceptual data model from before, here is a sample logical data model. Compared to the conceptual data model, this model includes attributes and primary keys. The student entity would have a StudentID as a primary key and unique identifier; they’d also have other important attributes such as name, and age.

The same concept applies to the Teacher entity and the Class entity. The relationship between the entities remains the same.

Physical data model

The physical data model represents the lowest level of abstraction. It includes details specific to the intended database management system (e.g. PostgreSQL, Oracle, MySQL, etc.). Entities will become tables and attributes will become columns, as is the case in an actual database. Each column will have a specific data type, such as INT, VARCHAR, or DATE.

Because the physical model contains details that are specific to the database being used, it would require going into details beyond the scope of this article. However, a physical model looks like a more detailed version of the logical model shown above. For more information and examples, see the articles How to Generate a Physical Diagram from a Logical Diagram in Vertabelo and 8 Things to Consider When Creating a Physical Data Model.

Phases of Data Modeling

1. Gather Business Requirements

You must dedicate time to a very thorough discussion with business stakeholders. There are certain prerequisites to creating a data model. The most important one is to know the business purpose of the database you are designing. Ultimately, databases are meant to provide business value that fosters smart decisions.

The following questions provide essential information:

What is the business domain?
What information needs to be stored in the database?
What problems will be addressed by this database?

You can find out more about which business questions to ask before modeling. From there, as a database designer, you must now align several factors with the business goal(s). These are:

Performance
Cost
Security

2. Collaborate with Other Teams

Your database is unlikely to be the entire solution. There will usually be some kind of user interface, web application, or mobile application that will facilitate the convenient usage of your database.

Thus, expect that you’ll be collaborating with other teams, such as UX/UI designers, front-end developers, back-end developers, and more. The database is the heart of the solution, but it will need support.

As a database designer, you must be able to fulfill certain needs. For instance, you need to ensure that the data format and types you’ve chosen will comply with the expected format and types once it arrives at the application side.

A simple example would be that of a school-record database system. Let’s say that the web developers have requested that you save student IDs as UUIDs (unique user ID) instead of as integers. This is because a UUID is an object type that defines a unique user ID and can be generated programmatically via libraries. It is the industry standard when it comes to defining user IDs.

Discussions about details like that are very important as software design is in general a highly collaborative process. Very few successful software solutions were created by a single person. Hence, learning the art of collaboration and communication is an essential skill for any developer.

3. Research Industry Standards and Templates

Instead of reinventing the wheel, chances are that some database designer out there has already achieved part of what you need for a certain project. For instance, there are probably thousands of school-records database systems in the world.

You need to do your due diligence to uncover what data model for school records is considered the best and most efficient in the industry. This is going to save you a lot of time; designing a database from scratch is very costly in terms of time and resources.

However, there is probably something about your business requirements that will make your database unique. That is the part that you need to work on from scratch. But because you found the industry-standard design for other parts of the database, you will be able to focus more energy on that unique aspect.

Check out Vertabelo’s collection of sample ER diagrams to examine some industry-standard data models.

4. Start Modeling the Database

At this point, you have everything you need to start the actual modeling. You have done the following:

Gathered business requirements.
Collaborated with other teams.
Researched industry standards.

As a result, you will have a thorough knowledge of what entities, attributes, and relationships are needed. These three elements are the building blocks of any database. Simply start with a conceptual data model, progress to the logical data model, and finally end with the physical data model.

At the end of the process, you will be able to execute the implementation of the database with ease while fulfilling the business goals posed by stakeholders. To learn more, read this article on why database design is more than just an ERD.

Data Modeling in a Nutshell

Once you have reached this part of the article, you should know the essentials of data modeling. To recap, a data model can be thought of as a blueprint that describes the structure and relationships in a database. There are a few key elements that make up a data model such as entities, attributes, and relationships.

The three types of data models by level of abstraction are the conceptual (the simplest), logical (the most used), and physical (the most technical) data models. Finally, the prerequisites to creating a data model include thorough discussions with business stakeholders, collaborating with other teams, and doing research on industry standards.

To further expand your knowledge, read this article on database modeling tips.

Tags:

data modeling

Data Model Elements

Entities

Attributes

Relationships

Types of Data Models

Conceptual Data Model

Logical Data Model

Physical data model

Phases of Data Modeling

1. Gather Business Requirements

2. Collaborate with Other Teams

3. Research Industry Standards and Templates

4. Start Modeling the Database

Data Modeling in a Nutshell

You may also like

Explaining an ER Diagram, With Steps and Use Cases

What Is a One-to-One Relationship in a Database?

What Is a One-to-Many Relationship in a Database? An Explanation with Examples

What Is a Many-to-Many Relationship in a Database? An Explanation with Three Examples

What Are Conceptual, Logical, and Physical Data Models?

Why Do You Need an ER Diagram?