Let’s take a look at the following table:
Customer | Purchase date | Product name | Amount | Price | Total price |
---|---|---|---|---|---|
Joe Smith | 2014-02-14 | Yoga mat | 1 | 80 | 80 |
Jane Bauer | 2014-02-16 | Yoga block | 2 | 30 | 60 |
Joe Smith | 2014-02-14 | Yoga block | 2 | 30 | 60 |
Joe Smith | 2014-02-14 | Yoga strap | 1 | 10 | 10 |
Thomas Apple | 2014-02-18 | Dumbbells 2kg | 2 | 30 | 60 |
Jane Bauer | 2014-02-16 | Yoga mat | 1 | 80 | 80 |
What’s wrong with this table? It’s difficult to modify data in it. Upon modification, several anomalies can occur:
Insert anomalies
It’s impossible to insert a product into the table if the product hasn’t been bought by a customer yet. Similarly, it’s impossible to insert a customer who hasn’t made a purchase yet.
Update anomalies
It’s difficult to update data in the table. If you want to change the name of the product, you have to update all rows where the product is bought. You cannot change the price of the product for all future purchases.
Delete anomalies
If you delete the Thomas Apple purchase (say, because the order was cancelled), you will also delete the product “Dumbbells 2kg.”
How do you deal with tables like this? You have to normalize them! A detailed post on data normalization will appear soon. Stay tuned!