This is the fourth in our multi-part series on data modeling for information security as well as data characteristics. A simple data model for a fictional website that supports shared-interest organizations (bird-watching clubs, etc.) has provided us with content for exploring data modeling from a security viewpoint.
In Oscar Wilde’s play Lady Windermere’s Fan, Lord Darlington tags a cynic as “somebody who knows the price of everything, and the value of nothing.” Sadly, the information in our databases can be unconsciously treated in the same way. Is a customer account worth the sum of its purchases? What do we suffer if we lose four hours of marketing data during holiday shopping season?
We data modelers won’t be making those assessments, but we must store their relevant data on behalf of the people who will. We’ll have to fill the gaps of implicit data structure. In this article, we’ll see how to add this important security element to our database.
Show Me the Money!
How much should we protect each data object? Consider them in the light of Confidentiality, Integrity, and Availability — the key qualities determining the security of an information system. We must also allow for the difference between these measures on an “intrinsic” basis and how much this data may affect security.
There are two reasons to do this. First, it will help us to see how to protect the data in our club database. Should some tables be encrypted? Put in other schemas? Perhaps downstream we will attach Virtual Private Database controls? This information will help us to choose appropriate safeguards.
Secondly, we’ll ponder the data from a raw accounting perspective: What is its total value? What could we lose in case of data corruption? What is our liability if personal data is disclosed? When we add this information to our schema, we add a critical metric to our stored data: dollars and cents. This lets the people paying the bills determine what they can afford for security and, in monetary terms, how much that security is worth.
All the Relationships Are Spelled Out
Let’s recap the state of our model. As of the last article, the data structure has been filled in. Persons, clubs, membership, photos, albums, and content are all there. How they tie together is there. This schema is ready to store data with the relationships explicitly captured throughout, while implicit relationships have been eliminated as far as possible.
Attributing Value and Sensitivity
Now we’ll figure out how to put numbers to data. We really can’t attach a single value to a data item telling us how much to protect it. However, we can’t — and needn’t — go into a collection of metrics, either. We’ll focus on how much a piece of data can earn for us, and how much losing or disclosing that data can cost us.
We use the terms “value” and “sensitivity” for this – a positive and a negative measurement, if you will. Value is often considered in terms of future value or opportunity. Sensitivity is very much defensive; it relates to risks on a financial level (regulatory or legal penalties) and in loss of reputation or goodwill.
Valuation relates directly to Integrity and Availability. We will judge this in terms of what benefits the data can generate, or how much damage will be done if access to it is lost. We address sensitivity mainly in terms of Confidentiality, which must be measured by the damage or liability if it is revealed.
The Common Structure of Valuation and Sensitivity
Now let us consider valuation and sensitivity against our database. As we view the data model again, we find that these qualities are relative only to a club or a person. A club or a person benefits from the value of something, and they suffer when something sensitive is made public. Therefore, each of these assessments is captured with regard to a club or a person.
As we look at our data entities, we will ensure that each one that has a value (benefit) also carries a sensitivity (risk), and vice versa. So each entity involved will have both: separate Valuation and Sensitivity fields. They’ll be optional or defaulted in most cases. Also, both will be weighed in terms of money: a currency value, precise to hundredths of a U.S. dollar. (For the sake of clarity, we’ll use just one currency.) Celebrate it or bemoan it, money is our only usable metric for either. To leverage this commonality, we will refer to the pair collectively as “Importance”.
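To make the Importance pair concrete, here is a minimal Python sketch of the two fields as optional dollar amounts rounded to the cent. The class name `Importance`, the helper `to_cents`, and the rule that amounts are stored as non-negative magnitudes are all assumptions made for illustration; they are not part of the model described in this series.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

CENT = Decimal("0.01")


def to_cents(amount) -> Decimal:
    """Round an amount to hundredths of a U.S. dollar."""
    return Decimal(str(amount)).quantize(CENT, rounding=ROUND_HALF_UP)


@dataclass(frozen=True)
class Importance:
    """Hypothetical value object: one Valuation/Sensitivity pair."""

    valuation: Optional[Decimal] = None    # benefit, in dollars; optional
    sensitivity: Optional[Decimal] = None  # risk, in dollars; optional

    def __post_init__(self):
        for name in ("valuation", "sensitivity"):
            raw = getattr(self, name)
            if raw is None:
                continue  # the field was left unassigned
            amount = to_cents(raw)
            if amount < 0:
                # Assumption: both measures are stored as magnitudes.
                raise ValueError(f"{name} must not be negative")
            # The dataclass is frozen, so normalize via object.__setattr__.
            object.__setattr__(self, name, amount)


album_importance = Importance(valuation="1250.5", sensitivity=300)
print(album_importance.valuation, album_importance.sensitivity)
```

Using `Decimal` rather than floating point keeps the amounts exact to the cent, which matters once these figures are summed across many rows.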
As data modelers, we cannot actually put numbers on this ourselves. Even as the site or database operator, we don’t know enough to assign these values; besides, the data isn’t completely ours. For data that is specific to a club, we need to let that club assign its own importance levels and its rules for using those levels. Then we apply their rules to their data.
Let’s start with the types of entities clubs can assign.
Club Data
The Club entities are:
- Club
- Club_Office
- Officer
- Member
- Album
- Album_Photo
- Photo
We’ll add Valuation and Sensitivity columns to each of these. Because these columns are attached to the Club, their names are specific, e.g. club_sensitivity.
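As a rough sketch of this step, the snippet below adds the pair of columns to each Club entity using Python's built-in `sqlite3` module. The lowercase table names, the stub `id` columns, and the name `club_valuation` (chosen to mirror `club_sensitivity`) are assumptions; the real tables carry many more columns, and a production database would use its own DDL dialect.

```python
import sqlite3

# Assumed table names, derived from the Club entity list above.
CLUB_TABLES = [
    "club", "club_office", "officer", "member",
    "album", "album_photo", "photo",
]
CLUB_IMPORTANCE_COLUMNS = ("club_valuation", "club_sensitivity")


def add_club_importance(conn: sqlite3.Connection) -> None:
    """Add nullable valuation/sensitivity columns to every Club table."""
    for table in CLUB_TABLES:
        for column in CLUB_IMPORTANCE_COLUMNS:
            # Names come from the fixed lists above, never from user input.
            # SQLite accepts NUMERIC(12,2) but does not enforce the scale.
            conn.execute(
                f"ALTER TABLE {table} ADD COLUMN {column} NUMERIC(12,2)"
            )


conn = sqlite3.connect(":memory:")
for table in CLUB_TABLES:
    # Stub tables standing in for the real schema.
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
add_club_importance(conn)

for table in CLUB_TABLES:
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    print(table, columns)
```

Leaving the columns nullable matches the idea that these fields are optional until the club assigns its own figures.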
Here is our set of focus tables for Club, including Person:
Personal Data
Now we need to address the Person entity. Again, we do not assign the values here – that is the prerogative of the person. Naturally, we need to add Importance columns to Person. But to better support personal privacy, we are going to slice this entity finer. After all, privacy is key to data sensitivity.
First, we will add a new column called monicker that is like a username or alias. Club members can use that for identification rather than their actual names. We will provide a valuation/sensitivity column pair for the name-monicker association. These will be person_name_valuation and person_name_sensitivity. The rest of the fields are controlled by these two pairs.
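The Person changes might look something like the following sketch, again using Python's `sqlite3`. Only `monicker`, `person_name_valuation`, and `person_name_sensitivity` come from the text; the name columns, the general `person_valuation`/`person_sensitivity` pair, and the `person_public` view are hypothetical, added to show how the monicker can stand in for the real name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        id                      INTEGER PRIMARY KEY,
        first_name              TEXT,            -- hypothetical column
        last_name               TEXT,            -- hypothetical column
        monicker                TEXT NOT NULL UNIQUE,
        person_valuation        NUMERIC(12,2),   -- assumed name, general pair
        person_sensitivity      NUMERIC(12,2),   -- assumed name, general pair
        person_name_valuation   NUMERIC(12,2),   -- name-monicker association
        person_name_sensitivity NUMERIC(12,2)    -- name-monicker association
    )
""")

# Hypothetical view: what other club members see, with no real names.
conn.execute("CREATE VIEW person_public AS SELECT id, monicker FROM person")

conn.execute(
    "INSERT INTO person (first_name, last_name, monicker,"
    " person_name_valuation, person_name_sensitivity)"
    " VALUES (?, ?, ?, ?, ?)",
    ("Ada", "Example", "heron_fan", 0, 500),
)
print(conn.execute("SELECT * FROM person_public").fetchall())
```

Separating the name pair from the general pair lets a person rate the link between their real name and their monicker as far more sensitive than the rest of their profile.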
A Person’s club activity is as much their interest as the Club’s. Therefore we will add the same Importance fields to Member and Officer.
Now we could add person_importance fields to the Photo entity, but look at the photo_content column. A photo can involve multiple persons, and this is part of what we store in photo_content. Therefore, we will put the importance fields on photo_content instead of on Photo.
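One way to picture this is sketched below, under the assumption that photo_content is stored as one row per person appearing in a photo. The column names `photo_id`, `person_id`, `person_valuation`, and `person_sensitivity`, as well as the choice to sum sensitivities per photo, are illustrative guesses rather than the schema used in this series.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photo  (id INTEGER PRIMARY KEY);
    CREATE TABLE person (id INTEGER PRIMARY KEY, monicker TEXT);
    -- Assumed shape: one row for each person who appears in a photo.
    CREATE TABLE photo_content (
        photo_id           INTEGER NOT NULL REFERENCES photo(id),
        person_id          INTEGER NOT NULL REFERENCES person(id),
        person_valuation   NUMERIC(12,2),
        person_sensitivity NUMERIC(12,2),
        PRIMARY KEY (photo_id, person_id)
    );
""")


def photo_sensitivity(conn: sqlite3.Connection, photo_id: int):
    """Sum the sensitivity that each pictured person assigned to a photo."""
    row = conn.execute(
        "SELECT COALESCE(SUM(person_sensitivity), 0)"
        " FROM photo_content WHERE photo_id = ?",
        (photo_id,),
    ).fetchone()
    return row[0]


conn.execute("INSERT INTO photo (id) VALUES (1)")
conn.executemany(
    "INSERT INTO person (id, monicker) VALUES (?, ?)",
    [(1, "heron_fan"), (2, "owl_spotter")],
)
conn.executemany(
    "INSERT INTO photo_content VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 50), (1, 2, 0, 200.5)],
)
print(photo_sensitivity(conn, 1))
```

Because each pictured person carries their own row, two people in the same photo can assign very different sensitivities without either figure overwriting the other.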
The “Sensitized” Model
We have modified our data model to ascribe data value and data sensitivity everywhere it’s needed. The following is our final schema.
We have been careful to avoid distorting the original schema with additional relationships or constraints. This is critical because we are taking that schema as an accurate analysis of the real data with real business rules.
Attaching any kind of inherent importance to your data is difficult. It’s worse if you are trying to apply it to a database without support in the model or schema. This article demonstrates a technique to attach this information in a way that doesn’t distort the intrinsic business parts of the model.
The flexibility and modifiability of Value and Sensitivity are key goals here. As you start applying real values to these attributes, you will find you need to modify them and revise your approach. That’s one reason for individually attaching these values to the tables themselves, rather than having them offboard. The downside is that it gets quite complicated, due to the many locations for these values. This can even show up in how the model is used. We will take up the multi-faceted issue of managing complexity in information security in our next article.