Figuring out authentication is part of a secure data storage strategy. Find out how to store auth data safely in your database.
Almost every application requires user authentication; that is why authentication data storage is a common feature of database and application design. Special attention is needed to ensure data is kept secure and to avoid breaches that can compromise sensitive information.
Do I Need Authentication or Authorization?
Although both words are frequently used in a similar way, they do not mean the same thing. Most systems require both authentication and authorization, so let’s first explain what each one means.
Authentication is a process that verifies that a person (in software application terms, the user) is whoever they say they are. It uses different mechanisms (password, security questions, fingerprint recognition, etc.) to confirm that the user is the person they are claiming to be. Modern systems combine many of these mechanisms and add additional methods (like one-time codes sent to email or phone) to authenticate users.
Authorization verifies that the user has permission to perform a specific task or access specific data. A set of policies or rules may be established to define the actions a user can perform. Authorization is usually processed AFTER authentication.
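The distinction can be sketched in a few lines of Python. This is only an illustration: the `USERS` and `PERMISSIONS` dictionaries, the function names, and the sample account are hypothetical, and the password is kept as a bare SHA-256 digest purely to keep the sketch short (safer storage is covered later in this article).

```python
import hashlib
import hmac

# Hypothetical in-memory stores; a real application would use a database.
USERS = {"alice": hashlib.sha256(b"correct horse").hexdigest()}
PERMISSIONS = {"alice": {"read_report"}}


def authenticate(login_name: str, password: str) -> bool:
    """Authentication: is this user who they claim to be?"""
    stored = USERS.get(login_name)
    if stored is None:
        return False
    candidate = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored)


def authorize(login_name: str, action: str) -> bool:
    """Authorization: may this user perform this action?"""
    return action in PERMISSIONS.get(login_name, set())


def perform(login_name: str, password: str, action: str) -> str:
    # Authorization is only checked after authentication succeeds.
    if not authenticate(login_name, password):
        return "denied: unknown user or wrong password"
    if not authorize(login_name, action):
        return "denied: not permitted"
    return "ok"
```

Here `perform("alice", "correct horse", "delete_report")` passes authentication but fails authorization, because the action is not in that user's permission set.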
The article Applying Simple Access Control to a Small Data Model explains more about the difference between authentication and authorization. For now, we’ll review some authentication methods and explain how to securely store authentication data in a database.
Common Data Authentication Methods
There are many authentication methods applications employ to verify a user’s identity:
- Passwords are the most common method, where a user provides a password to confirm their identity.
- Biometric methods include scanning fingerprints, faces, or even retinas. This usually requires specific hardware and the user’s physical presence, so such methods are often used to grant physical access to buildings or areas.
- Two-Factor Authentication requires the user to provide a password and a second value to verify their identity. Popular second verification methods include:
  - Secret questions.
  - Security tokens.
  - Security codes sent to email, SMS, authenticator apps, etc.
  - Scratch codes that users can print out and manually enter.
- Social Authentication: Authentication is delegated to a social network (like Facebook, Twitter, or LinkedIn) or other platforms (like Google or Microsoft).
Not all of these methods require data to be stored in your own database, so we’ll concentrate on the authentication methods that do require storing sensitive information there.
Storing Passwords in a Database
Now we are going to review some of the best practices to store passwords in a database.
Plain Text, Encrypted or Hashed?
It should be quite obvious that storing passwords as plain text in our database is not a good idea, since it would mean that anyone with access to the database would be able to see all user passwords. Once we discard the option to store passwords in plain text, should we choose encryption or hashing?
Encryption is a two-way process that "scrambles" readable text and converts it into something that is "illegible" until it is decrypted (using a "decryption key") and converted back to readable text. There are many encryption algorithms (like AES, DES, Twofish, and Blowfish).
Hashing is a one-way process that converts a string (usually a legible one) into another (illegible) string. No matter the size of the input string, a hashing mechanism returns a fixed length string output. As with encryption, there are several hashing algorithms (like MD5, SHA-1, SHA-2) that can be used to hash user passwords; we will see some of them later in the article.
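The fixed-length property is easy to observe with Python's standard `hashlib` module. The sketch below uses SHA-256 (a member of the SHA-2 family) only as an example; the helper name `sha256_hex` is made up for this illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("abc")
long_digest = sha256_hex("a much longer input string " * 100)

# Both digests are 64 hexadecimal characters (256 bits),
# regardless of how long the input was.
print(len(short_digest), len(long_digest))
```

The same input always produces the same digest, which is what lets an application compare hashes instead of comparing the passwords themselves.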
You may initially be tempted to use encryption to store your passwords, but that is not the right approach! If you store encrypted passwords, the application will have to decrypt the stored password (using a decryption key) and compare it – every single time the user logs in. But if someone gets to know the decryption key, that person would be able to decrypt and access all stored passwords. That is a security concern.
If you store hashed rather than encrypted passwords, the application can simply hash the entered password and compare it with the stored hash. This way, nobody can decrypt the stored values.
Adding Salt and Pepper
Don’t worry – you’re still reading an article about login data and not a cooking recipe! Salting and peppering in this context refer to additional security measures taken to ensure passwords stored in a database are kept secure. Let’s discuss them.
Salting consists of generating a long and random string for each user and adding it to each typed-in password before hashing. That "salt" value must be stored together with the hashed password. This mechanism offers some notable benefits:
- If two users select the same password, the hashed values would be different and nobody could detect that both users share the same password. This is because the passwords are hashed after they’ve been concatenated with the random string. And since each salt string is randomly generated, the combination of password and string will be unique.
- Since passwords are usually short (most users do not use more than 10 or 12 characters), hackers have developed “rainbow tables” containing already hashed values for short strings. They can compare these values with a stored hash. Adding a lengthy string to the password makes using pre-calculated tables like rainbow tables more difficult.
To generate the salt for each user, use a reliable random generator like SecureRandom, which is recommended by OWASP. The formula to calculate the hashed value would be:
Hashed Password = HASH(INDIVIDUAL SALT + PASSWORD)
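As a minimal sketch of this formula in Python, the standard `secrets` module can play the role that SecureRandom plays in Java. SHA-256 is used here only to mirror the formula literally, and the function names and the sample password are assumptions made for this illustration.

```python
import hashlib
import hmac
import secrets


def new_salt(num_bytes: int = 16) -> bytes:
    """Generate a random salt from a cryptographically secure source."""
    return secrets.token_bytes(num_bytes)


def hash_password(password: str, salt: bytes) -> str:
    """HASH(INDIVIDUAL SALT + PASSWORD), returned as hexadecimal text."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def verify_password(password: str, salt: bytes, stored_hash: str) -> bool:
    """Re-hash the typed-in password with the stored salt and compare."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)


# Two users who pick the same password get different stored hashes,
# because each one has their own random salt.
salt_a, salt_b = new_salt(), new_salt()
hash_a = hash_password("hunter2", salt_a)
hash_b = hash_password("hunter2", salt_b)
print(hash_a != hash_b)
```

At login, the application reads the user's salt and hash from the database and calls `verify_password` with the password that was typed in.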
Peppering is simply adding an additional string to the “password + salt” combination before hashing it. (This extra string is often called “Secret” or “Pepper”; it’s not as frequently implemented as salting.) We are not going to dig into all the details of peppering; you can find them on Wikipedia. The main differences between peppering and salting are:
- The pepper string is common to all passwords.
- The pepper string is stored on a different layer (like the application layer) than the database. Even if the database is compromised through SQL injection or a lost backup, the passwords remain protected, because the pepper part of the formula is stored elsewhere and is not available to the attacker.
The general formula to calculate the hashed value is:
Hashed Password = HASH(INDIVIDUAL SALT + PASSWORD + COMMON PEPPER)
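The peppered formula might look like this in Python. This is a sketch under stated assumptions: the environment variable name `APP_PASSWORD_PEPPER` is hypothetical, the fallback value exists only so the example runs as written, and SHA-256 is used simply to mirror the formula.

```python
import hashlib
import hmac
import os
import secrets

# Assumption: the pepper is supplied by the application layer through a
# hypothetical environment variable and is never written to the database.
# The fallback value is for demonstration only.
PEPPER = os.environ.get("APP_PASSWORD_PEPPER", "demo-only-pepper").encode("utf-8")


def hash_with_pepper(password: str, salt: bytes, pepper: bytes = PEPPER) -> str:
    """HASH(INDIVIDUAL SALT + PASSWORD + COMMON PEPPER) as hexadecimal text."""
    return hashlib.sha256(salt + password.encode("utf-8") + pepper).hexdigest()


def verify_with_pepper(password: str, salt: bytes, stored_hash: str,
                       pepper: bytes = PEPPER) -> bool:
    """Re-hash with the stored salt and the application's pepper, then compare."""
    return hmac.compare_digest(hash_with_pepper(password, salt, pepper), stored_hash)


salt = secrets.token_bytes(16)                   # stored in the database, per user
stored_hash = hash_with_pepper("hunter2", salt)  # stored in the database
```

Someone who obtains only the database holds `salt` and `stored_hash` but not `PEPPER`, so they cannot recompute the hash for a guessed password.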
Selecting the Right Algorithm
Some cryptographic algorithms are older and should be avoided for password hashing because of known weaknesses. MD5 and SHA-1 have practical collision attacks and are considered broken. The SHA-2 family and the newer SHA-3 remain sound as general-purpose hash functions, but they are designed to be fast, which also makes brute-force guessing fast. For password storage, current guidance such as OWASP's favors deliberately slow, tunable algorithms like Argon2, bcrypt, scrypt, or PBKDF2 (which can be built on SHA-2). The more computation each hash requires, the more costly dictionary-based, brute-force, and rainbow table attacks become.
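To illustrate how computation time can be raised on purpose, here is a minimal sketch that assumes PBKDF2-HMAC-SHA256 (built on SHA-2) from Python's standard `hashlib` module as the slow hash. The iteration count and the function names are assumptions for this example; the right work factor depends on your hardware and on current guidance.

```python
import hashlib
import hmac
import secrets

# Assumption: an illustrative work factor. Higher values slow down each
# attacker guess, but also each legitimate login.
DEFAULT_ITERATIONS = 600_000


def slow_hash(password: str, salt: bytes,
              iterations: int = DEFAULT_ITERATIONS) -> bytes:
    """Derive a 32-byte hash by applying HMAC-SHA256 `iterations` times."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def verify(password: str, salt: bytes, stored: bytes,
           iterations: int = DEFAULT_ITERATIONS) -> bool:
    """Recompute the slow hash with the same parameters and compare."""
    return hmac.compare_digest(slow_hash(password, salt, iterations), stored)


salt = secrets.token_bytes(16)
stored = slow_hash("hunter2", salt)
print(verify("hunter2", salt, stored))
```

Because the iteration count affects the result, an application that uses this approach would likely need to record it next to the hash and salt so that it can be raised later without invalidating existing passwords.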
A Basic Model for Login Information
Now that we have explained the best way to store login data in a database, let’s take a quick look at a simple data model that stores user information:
In this diagram, we have a UserAccount entity with the following attributes:
- UserAccountID: An auto-generated ID.
- LoginName: The name the user enters to log in to the system. Some systems may use an email address as a login name, but I would recommend keeping this as a separate attribute and allowing several options like a username, an email, or even a phone number.
- FirstName: The user’s first name.
- LastName: The user’s last name.
- PasswordHash: A hash string for the user’s password plus the salt and pepper combination.
- PasswordSalt: A unique, randomly generated string that is concatenated with the user’s password before it is hashed and stored.
- PasswordDate: The date when the user last updated or created their password; this is useful when the password needs to be renewed.
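As a rough sketch, the UserAccount entity could be created and populated with Python's built-in `sqlite3` module. The column types, the NOT NULL and UNIQUE constraints, the `create_account` helper, and the sample values are all assumptions made for this example; the model above only names the attributes.

```python
import sqlite3

SCHEMA = """
CREATE TABLE UserAccount (
    UserAccountID INTEGER PRIMARY KEY AUTOINCREMENT,
    LoginName     TEXT NOT NULL UNIQUE,
    FirstName     TEXT,
    LastName      TEXT,
    PasswordHash  TEXT NOT NULL,
    PasswordSalt  TEXT NOT NULL,
    PasswordDate  TEXT NOT NULL
);
"""


def create_account(conn: sqlite3.Connection, login_name: str, first_name: str,
                   last_name: str, password_hash: str, password_salt: str) -> int:
    """Insert a user and return the generated UserAccountID."""
    cur = conn.execute(
        "INSERT INTO UserAccount "
        "(LoginName, FirstName, LastName, PasswordHash, PasswordSalt, PasswordDate) "
        "VALUES (?, ?, ?, ?, ?, date('now'))",
        (login_name, first_name, last_name, password_hash, password_salt),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(SCHEMA)

# Placeholder strings stand in for a real hash and salt.
account_id = create_account(conn, "jdoe", "Jane", "Doe",
                            "<hex digest>", "<hex salt>")
row = conn.execute(
    "SELECT LoginName, PasswordDate FROM UserAccount WHERE UserAccountID = ?",
    (account_id,),
).fetchone()
```

Only the hash and the salt are written to the table; the plain-text password never is, and neither is a pepper if one is used.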
After you have designed your data structure to store passwords in your database, you should consider reading the article Email Confirmation and Recovering Passwords to enhance your application with the features described in the article.
Open Authentication Standards
We have just reviewed some considerations and recommendations for safely storing passwords in our databases. But a secure platform is not easy to design and implement, so you may also need to think about relying on experts for authentication.
There are open standards that can be used to delegate authentication to a reliable third party, like Google, Facebook or Twitter. Let’s do a quick overview of them.
OpenID
This standard allows authentication against an identity provider. As shown in the image below, the user logs in to the identity provider and it sends a certificate or key that confirms the user’s identity to the application making the request.
In its current form (OpenID Connect), this standard is an extension (you can see it as a specially defined use case) of the OAuth 2.0 Framework explained below. When you log in to Google to access your YouTube account, you are using OpenID. OpenID allows only the ID, profile, email, address and/or phone number to be shared with the calling application.
OAuth (Open Authorization)
OAuth is an authorization standard, not an authentication standard, but it can be used for “pseudo-authentication”. Rather than asking a third party to certify that a user is who they claim to be, this authorization API provides a key to access some part of the user’s account with the provider (usually containing basic information like name and email address) and/or to perform actions like sending messages, posting tweets, etc.
Although the provider does not certify an identity, the fact that it provides a key to access a specific account can be considered as proof that the user sending the key is actually the person they say they are.
If you use your Google or Facebook account to access a third-party application and you receive a message that you’re sharing your basic information with the third party, then you are using the more generic OAuth protocol rather than OpenID. The third-party application is accessing some of your information, or performing tasks on your behalf, at the authorization provider rather than just validating your identity.
Although each method was designed for a different purpose, they can both be used to allow access to an application by delegating the authentication process to a well-known provider. This allows you to avoid designing and implementing a proprietary authentication feature – and the security risks that may exist if this feature is not well designed and implemented.
More About Storing Authentication Data in a Database
If you want to review more on storing authentication data in a database, see the article How to Store Authentication Data in a Database Part 4. It includes a sample data model used to store delegated authentication information (like authentication or authorization tokens and additional data) in your database.