Back to articles list
- 6 minutes read

Modeling a Database for Recording Sales. Part 1

Storing sales data properly and later combining it can lead to creating a predictive model with a high rate of accuracy. In this and the next few articles we’ll analyze a database design for recording sales.

Everyone lives by selling something.

Robert Louis Stevenson

In today’s world, selling products is ubiquitous. And salespeople who have access to robust tools that leverage historical data to analyze trends and enable an enterprise to adjust business strategies accordingly have an advantage over their competitors. There are lots of parameters that can affect company results: the current global economic situation, clients’ location, age, material and marital status, and history of previous contacts or sales to clients.

We’ll start with a very simple example: a database model for sales in a coffee shop. In subsequent articles, we'll extend the model to selling products in other branches.

Sales Model

In this article we’ll analyze just a part of model that contains sales data with other parts missing.

We still have connections to missing tables and we’ll look at the model as a black-box assuming that the following is correct for table sale:

  • user_has_role_id refer to id in user_has_role (as presented in my previous article in the section “Time component added”) and stores information about the user that created sale record



This model enables us to create sales records with multiple items. Each item is related to a product from our catalog. The moment when we generate a sale can be different from the moment when the sale is paid for. For example, for a cup of coffee these moments will differ in a matter of minutes or hours. If our shop sold telecommunication devices, the difference can be a few days, maybe even months.

Tables

Let’s take a look at table definition and explain the purpose and usage of attributes.

Sales design - product table

The most important table in the model is product. It’s used to store details about products we’ll offer to our clients. Products are usually delivered to a client and paid for once, usually at delivery time. Also, products are usually physical objects like cars, phones, packages of sugar, or cups of coffee.

We’ll talk about selling non-physical things (services) in the next articles.

Attributes in the product table are:

  • name – the name of the product in system
  • price_per_unit – cost of product per unit (e.g., 1 cup of coffee costs 1.8 Euro, 1 car costs 17,500 Euro, 1 kg of rice costs 2 Euros)
  • basic_unit – base unit when we’re selling a product (e.g., piece, kg, liter)
  • tax_percentage – percent of the price_per_unit to be charged as tax. We must assume that tax percentage wouldn’t be the same for all products
  • limited – this field is set to True if we have a limited quantity on stock and False otherwise (e.g., we can order any quantity we need for our store from a distributer)
  • in_stock – if limited=True this attribute shows how many we have available to sell
  • active_for_sale – if this attribute is False than we’re currently not offering that product for sale, otherwise we can offer it to clients

We can get a list of products we can offer to clients with the following query:

SELECT product.id, product.price_per_unit, 
       product.basic_unit, product.limited, product.in_stock
FROM product
WHERE product.active_for_sale = True
AND (product.limited = False OR
      (product.limited = True and product.in_stock > 0))

Sales design - sale_status table

The table sale_status is just a simple dictionary that contains all statuses that a sale can have in the system (e.g., “receipt issued”, “receipt paid”).

Sales design - sale table

The table sale is the second most important table in this model. So far, this table has no connection with clients to whom we sold products because, in our coffee shop example, we don’t need to know such information. In part 2, the model will be extended to cover such cases.

Attributes in the table and their meanings are:

  • time_created – time when a sale record was generated in the system (e.g., automatic time that the record was created when we generated a sale for coffee in our coffee shop or a manually added time if we want it so)
  • time_paid – generally we can expect that some sales will be paid within a few days or even a month after creation (e.g., if we deliver software and create a receipt we can wait up to 90 days to get paid in some countries, if everything goes by the law)
  • sale_amount – original amount intended to be charged to the client
  • sale_amount_paid – amount that the client actually paid. It can be null because at the moment we create a receipt we don’t always have this information
  • tax_amount – sum of all tax amounts for items on that receipt
  • sale_status_id – reference to sale_status table
  • user_has_role_id – reference to user and his role at the moment he or she entered the receipt into system

We can get the issued and paid amount (according to time_created) within a period of time with query like this:

SELECT SUM(sale.sale_amount) AS amount_issued,
       SUM(sale.sale_amount_paid) AS amount_paid 
FROM sale
WHERE sale.time_created >= @start_time 
AND sale.time_created <= @end_time;

To get the exact amount paid within a period of time we must use a query like this:

SELECT SUM(sale.sale_amount_paid) AS amount_paid 
FROM sale
WHERE sale.time_paid >= @start_time 
AND sale.time_paid <= @end_time;

The query below will calculate the issued and paid amount within a period of time with the issue date and payment date checked separately:

SELECT 
SUM(CASE WHEN sale.time_created >= @start_time 
    AND sale.time_created <= @end_time 
    THEN sale.sale_amount END) AS amount_issued,
SUM(CASE WHEN sale.time_paid >= @start_time 
    AND sale.time_paid <= @end_time 
    THEN sale.sale_amount_paid END) AS amount_paid 
FROM sale

In all examples @start_time and @end_time are variables containing the start time and end time of period for which we want to check issued and paid SUM.

Sales design - sale_item table

The table sale_item connects products and sales. Of course, we must assume that we’ll have multiple items on one receipt so we need this table to have a many-to-many relationship.

Attributes and theirs meanings are:

  • quantity_sold – quantity of product that was sold and is charged on that sale/receipt (e.g., 3 coffees)
  • price_per_unit – same value as product.price_per_unit at the moment when the sale was created. We have to save it because price_per_unit in the product table can change over time
  • price – product of quantity_sold and price_per_unit; a small redundancy that helps us to avoid this calculation in queries. Generally, the sum of all item prices belonging to the same sale should be equal to the sale.sale_amount
  • tax_amount – tax amount for that item on receipt
  • sale_id – id of sale that this item belongs to
  • product_id – product id related to this item

We could now easily make a simple report, how many products/services we sold in period and at what price.

SELECT product.name, SUM(sale_item.quantity_sold) AS quantity, 
       SUM(sale_item.price) AS price
FROM sale, sale_item, product
WHERE sale.id = sale_item.sale_id
AND sale_item.product_id = product.id
AND sale.time_created >= @start_time 
AND sale.time_created <= @end_time
GROUP BY product.id

go to top