June 16, 2022 - 10 minutes read

A Beginner’s Guide to Database Security

Stephanie Bailey is a technical writer who covers databases, programming, and data science topics. When she’s not learning a new programming language or researching a new article, she’s probably walking her dog. You can find more of her work on her website:

smbwriting.com

Tags:

database
security

How do database pros guard against data breaches? A quick overview of database security essentials: authentication, encryption, user access, and more.

Some time ago, I came across WIRED's article on the data breach at Vastaamo, a Finnish mental health provider network. It read like a cyberthriller: someone hacking into a massive database of patients' deepest, darkest secrets and using them to blackmail the company, the individual patients, and the community as a whole. One specific part of the article got me thinking:

“[Vastaamo founder Ville] Tapio wouldn’t go into technical detail about the system, but in court documents he suggests it was browser-based and stored patients’ records on a MySQL server. [...] But the slick exterior concealed deep vulnerabilities.[...] It didn’t anonymize the records. It didn’t even encrypt them. The only thing protecting patients’ confessions and confidences were a couple of firewalls and a server login screen.”

I’m not a cybersecurity expert. But I can see the tremendous problem in a system that relies on firewalls and a couple of passwords to secure information – while ignoring encryption and shortchanging user access control.

Setting aside the firewall aspect, let’s talk about what database professionals can do to keep sensitive information safe. Specifically, we’ll focus on preventing leaks and hacks. (Guarding against malware, data loss, and other potential problems is a real concern, but it’s outside the scope of this article.)

If you’re a database administrator or designer, you should already know this stuff; if you’re just learning the craft or if you’re a business owner or manager, this can be your introduction to the world of database security.

Database Security Essentials

Like everything else in tech, database security is a balancing act between ease of use and tight security. Ultra-tight security can mean restricting access to almost no one – but even that has a level of risk because of social engineering, which we’ll discuss later.

In the real world, people must access sensitive information; this could include company employees, the information’s owners (e.g. a patient reading their own medical records), or third parties (e.g. a specialist consulting with a family doctor on a patient’s case). The goal will always be to ensure that only the people who legitimately need sensitive information are the ones who get it. Ever.

To that end, there are many ways that database and IT professionals can safeguard sensitive data. I’ll divide these methods into three broad categories:

Direct methods, which can be incorporated into the actual database structure.
Indirect methods, which relate to the application using the database or the technical infrastructure the database is using.
Social engineering, which relates to how people can be manipulated into giving unauthorized persons access to sensitive data.

We’ll start with the direct methods. Please note this isn’t meant to be a comprehensive list of all database security measures; it’s just a quick overview of the basic principles.

Direct Database Security

We can group six database security methods into the direct category:

Authentication.
Authorization.
Encryption.
Password hashing.
Separation of duties.
Database auditing and monitoring.

While the last two aren’t exactly part of the database itself, they deal with it so closely – and can affect it so directly – that we’re including them here.

As you’ll see, there are some familiar faces among this lot.

Authentication

First up, we have authentication. Connecting to a database is a lot like signing into your email account: you have to provide the right username and password to get access. The purpose of authentication is obvious: it’s the gatekeeper that decides who can and cannot access the database.

Authorization

However, not all database users are created equal. Some have more rights than others. This is similar to user roles in Google Docs: Some people can edit, some people can comment, others can view. Likewise, different user groups can do different things in a database. Much depends on how each organization defines user roles; as a broad generalization, we can say that:

Administrators can set up accounts for new users, change the database itself, grant and revoke permissions for various users, and view all the data.
Data analysts can work with data directly, but they can’t set up new accounts.
Business team leaders might only have access to their department’s data, while team members might only see what’s relevant to their role (e.g. their own sales data).

The guiding principle behind authorization is limiting data exposure. By limiting each user’s access to only what they need for their role, we reduce the vulnerability of sensitive data. It’s the most workable compromise between usability and ultra-tight security.
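To make the idea concrete, here is a minimal sketch of a role check in Python. The role names, the permission names, and the is_allowed helper are all assumptions invented for illustration, loosely following the roles listed above; a real database would enforce this with its own permission system rather than application code like this.

```python
from enum import Flag, auto


class Permission(Flag):
    """Actions a user may perform; the names are illustrative only."""
    READ_OWN_TEAM = auto()
    READ_ALL = auto()
    MODIFY_SCHEMA = auto()
    MANAGE_USERS = auto()


# Hypothetical role-to-permission mapping, loosely based on the roles above.
ROLE_PERMISSIONS = {
    "administrator": (Permission.READ_ALL
                      | Permission.MODIFY_SCHEMA
                      | Permission.MANAGE_USERS),
    "data_analyst": Permission.READ_ALL,
    "team_leader": Permission.READ_OWN_TEAM,
}


def is_allowed(role: str, needed: Permission) -> bool:
    """Return True only if the role holds every permission in `needed`."""
    # An unknown role gets no permissions at all (deny by default).
    granted = ROLE_PERMISSIONS.get(role, Permission(0))
    return (granted & needed) == needed


print(is_allowed("data_analyst", Permission.READ_ALL))      # True
print(is_allowed("data_analyst", Permission.MANAGE_USERS))  # False
```

The important design choice is the default: a role that isn’t in the mapping gets nothing, so forgetting to configure someone limits their access instead of exposing data.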

Encryption

As the Vastaamo breach illustrates, more is needed than just a few passwords and a limit on the number of people who have access to data. Encryption adds another layer of security by making the data unreadable to anyone who doesn’t hold the key. Antivirus software, such as Norton or McAfee, complements encryption and access control by detecting and blocking malware that could compromise the database, but it is a separate line of defense and does not replace either one.

We’re all fairly familiar with what encryption means. If not, here’s the definition given by the cybersecurity platform Trellix:

“Data encryption scrambles data into “ciphertext” to render it unreadable to anyone without the correct decryption key or password...There are two main types of data encryption: symmetric encryption (the same key is used to both encrypt and decrypt the data) … and asymmetric encryption, which uses two mathematically related keys, a public key and a private key. The public key is used to encrypt the data, while a corresponding but separate private key is required to decrypt the data.”

In today’s world, the question isn’t if some unauthorized person will gain access to a database; it’s when this will happen. Scrambling sensitive data as described above isn’t a foolproof solution, but it does guard against some attacks.
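If you’d like to see the “same key encrypts and decrypts” idea in action, here is a toy sketch in Python. It uses a one-time pad (XOR with a random key as long as the message), which I picked only because it fits in a few lines of standard-library code; the function names are made up for this example. Please don’t treat it as production code: a real system would normally rely on a vetted algorithm such as AES, through a maintained library or the database’s own encryption features.

```python
import secrets


def generate_key(length: int) -> bytes:
    """Create a random key exactly as long as the message (one-time pad)."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte.

    The same function both encrypts and decrypts, which is what makes
    this a symmetric scheme.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


record = "Patient notes: confidential".encode("utf-8")
key = generate_key(len(record))

ciphertext = xor_bytes(record, key)     # scrambled; useless without the key
recovered = xor_bytes(ciphertext, key)  # the same key reverses the scrambling

print(recovered.decode("utf-8"))  # Patient notes: confidential
```

Notice that the security of the data now depends entirely on who can read the key, which is why key storage deserves as much care as the data itself.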

Password Hashing

While encryption is a two-way process, hashing is a one-way process. As the name suggests, password hashing takes the password string (a string is any group of letters, numbers, and symbols, like this sentence) and turns it into a totally different string. WIRED describes hashing as “random-looking strings of characters into which the passwords have been mathematically transformed to prevent them from being misused.” In a database, it’s common to see passwords hashed, but it is certainly possible to have other types of information (e.g. social security numbers, credit card numbers, etc.) treated the same way.

Because a hash can’t simply be decrypted, the database never has to store the password (or other sensitive value) itself, which can make some types of hashing a bit more secure than some types of encryption. The key phrase is “some types”, as both encryption and hashing are only as good as their underlying algorithms. Some are hard to break, others are … less hard. So, assuming there’s a robust hashing algorithm in charge, how does it work?

Let’s suppose you’re signing into your online bank account. When you set up your online account with your bank, you chose the password MyBankAccount. (It’s not a very good password, but that’s another article.) Instead of storing MyBankAccount in their databases, the bank automatically ran your lousy password through their hash algorithm and stored the result, asdsni204umdnai89435nnomsh9kcn48nosfdiph. Now, whenever you log in to your bank account, the bank simply re-runs that hash algorithm on the password you type and checks the result against the value they’ve stored. If it doesn’t match exactly, your request is denied. And because it’s much harder to reverse engineer the hashed value, it’s much safer than storing your password as plain text.
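The bank scenario can be sketched with Python’s standard library. This is only an illustration under a few assumptions of mine: the PBKDF2 algorithm, the iteration count, and the function names are my choices, not anything a particular bank uses. I’ve also added a random “salt” (extra random bytes mixed into each password before hashing), which the scenario above doesn’t mention but which is common practice so that two users with the same password don’t end up with the same stored value.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str,
                  salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is made if none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(attempt: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-run the hash on the login attempt and compare with the stored value."""
    _, attempt_digest = hash_password(attempt, salt)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(attempt_digest, stored_digest)


# Account setup: only the salt and the digest are stored, never the password.
salt, stored = hash_password("MyBankAccount")

# Login attempts.
print(verify_password("MyBankAccount", salt, stored))  # True
print(verify_password("mybankaccount", salt, stored))  # False
```

Even a one-character difference (here, just capitalization) produces a completely different digest, so “close” guesses get an attacker nowhere.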

Separation of Duties

In one form or another, human error accounts for approximately 88% of data breaches. The design, development, implementation, and migration of a database are all prone to simple mistakes. Thus, it’s standard practice to have more than one person review development work. Much like proofreading an important message to your boss, it helps to have another pair of eyes examine code before it goes into production. The person who writes the code should not be the person who verifies that it works as intended.
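Teams often back this rule up with tooling so that it doesn’t depend on memory or goodwill. Here is a small sketch of such a check; the ChangeRequest class, its fields, and the can_deploy function are hypothetical names I made up for this example, not part of any real review tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeRequest:
    """A hypothetical record of a proposed database change."""
    description: str
    author: str
    reviewer: Optional[str] = None  # nobody has reviewed it yet


def can_deploy(change: ChangeRequest) -> bool:
    """Allow deployment only after review by someone other than the author."""
    return change.reviewer is not None and change.reviewer != change.author


change = ChangeRequest("Add index on patients.last_name", author="alice")
print(can_deploy(change))  # False: not reviewed at all

change.reviewer = "alice"
print(can_deploy(change))  # False: the author reviewed their own work

change.reviewer = "bob"
print(can_deploy(change))  # True: a second person signed off
```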

Auditing and Monitoring

This is another familiar concept: tracking the activity and performance of a database as a sort of health indicator and warning system. Depending on who you ask, database monitoring can be done by tools, while auditing is done by an experienced human (or team of humans).

Either way, the purpose is one of the following:

To ensure that the system is performing as expected.
To investigate why a problem is arising.
In the case of a security event, to determine what went wrong.

Auditing and monitoring might seem like the least technically impressive of these security methods, but they are an integral part of data safety.
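As a rough illustration of what an audit trail can look like, here is a sketch that uses Python’s built-in sqlite3 module and a database trigger to record every update to a table. The table, column, and trigger names are assumptions made up for this example. It also leaves out details a real audit log needs, such as which user made the change; production database systems typically provide their own, more complete auditing features.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE patients (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        notes TEXT
    );

    CREATE TABLE audit_log (
        id         INTEGER PRIMARY KEY,
        patient_id INTEGER NOT NULL,
        action     TEXT NOT NULL,
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- Write one audit row every time a patient record is updated.
    CREATE TRIGGER patients_update_audit
    AFTER UPDATE ON patients
    BEGIN
        INSERT INTO audit_log (patient_id, action)
        VALUES (OLD.id, 'UPDATE');
    END;
""")

conn.execute(
    "INSERT INTO patients (name, notes) VALUES (?, ?)", ("Jane Doe", "initial")
)
conn.execute("UPDATE patients SET notes = ? WHERE id = ?", ("revised", 1))

# The trigger logged the update automatically, with no application code.
for row in conn.execute("SELECT patient_id, action FROM audit_log"):
    print(row)  # (1, 'UPDATE')
```

Because the trigger lives inside the database, the change is logged no matter which application made it, which helps when you later have to work out what went wrong.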

Indirect Database Security

There are more indirect database security measures than we can comfortably cover in this article: database and application firewalls, application security, server and hardware security, and the complex world of network security. On top of this, there are maintenance, upgrade, and patching tasks that are essential to the safety of databases residing on or being accessed by different systems.

Much (if not all) of this security is outside the role of the database administrator, designer, or developer. Data privacy tools can help protect sensitive information, but when you factor in things like user or employee activities (e.g. employees who choose weak passwords), it’s painfully clear that true data security can’t be achieved by one party in the transaction; everyone has to be on board. And that brings us to social engineering.

Social Engineering and Database Security

Social engineering, in this context, is (mis)using social skills to get access to a database.

Social engineering is anything but new; it’s a trope in every heist movie, which has at least one stock figure – the charming conman or conwoman who wheedles information out of the goodhearted but gullible mark. It’s the stuff of fiction, but it works remarkably well in real life.

There are two things that make social engineering particularly appealing:

People generally want to be helpful and sympathetic.
It doesn’t require any spectacular coding skills.

In other words, it’s the low-hanging fruit of the hacking world. For an example of how simple and effective social engineering can be, watch this short YouTube clip.

In this scenario, the hacker will pose as someone who has legitimate reasons to access the data – such as the new company VP, who you somehow forgot to include on the allowed user list. Or they might claim to be associated with an Internet service provider, a business partner, etc. Or, as we saw in the above clip, just someone who’s having a really bad day and needs to get into their phone account. Once the hacker has the needed login credentials, they can basically do whatever they want.

From a security standpoint, the problem with this approach is that any of these situations could easily happen. And, sympathetic and/or fearful creatures that we are, we want to help the person experiencing them (or, we want to right our “mistake” and avoid losing our jobs!). This mindset is exactly what social engineering preys upon.

This is outside of a database professional’s direct control. But, as someone with an understanding of the risks involved, you can explain and advocate for security measures. For example, any company can:

Require that all access requests are verified by a known and trusted second person.
Educate all employees about what social engineering attacks look like.
Conduct online safety classes and teach data security ‘hygiene’ (e.g. creating strong passwords, etc.).

For a readable, in-depth look at social engineering, check out this CSO article.

Database Security Is Everyone’s Job

In conclusion, it’s fair to say that database security is the responsibility of anyone who has access to any kind of data. In other words, it’s the job of everyone in the company. While database designers and administrators may be the only ones with the explicit role of database security, weak passwords and social engineering are still widely exploited. That’s why cybersecurity experts continue to evangelize for awareness and training across all roles.

We’ve just barely scratched the surface of database security best practices. If you’d like to learn more about this topic, let us know in the comments.