Practical SQL: Designing and Creating a Relational Database

9 min readNov 27, 2020

--

SQL is a programming language that is used by most relational database management systems (RDBMS) to manage data stored in tabular form (i.e. tables).

A relational database consists of multiple tables that relate to each other. The relation between tables is formed in the sense of shared columns.

There are many different relational database management systems (e.g. MySQL, PostgreSQL, SQL Server). The SQL syntax they adapt might differ slightly. However, the difference is very small so if you learn how to use one, you can easily switch to another one.

In this post, we will design a simple relational database for a retail business and create it using MySQL.

Below is the schema of the sales database. It contains 4 relational tables. The first line is the name of table. The other lines represent the columns in table.

Schema of sales database (image by author)

You may have noticed the initials “PK” (primary key) and “FK” (foreign key) next to some column names.

Primary key is the column that uniquely identifies each row. It is like the index of a pandas dataframe.
Foreign key is what relates a table to another one. Foreign key contains the primary key of another table. For instance, the “item_id” in the purchase table is a foreign key. It stores the rows from the primary key in the item table.

Foreign keys are fundamental in creating relational tables. Consider a case where we store item descriptions and prices in the purchase table instead of storing them in a separate item table.

If the price of an item changes, we need to change the price for all the rows in the purchase table that contains that item. It is a lot of work. When the item prices are stored in a separate table, we just need to make one update.

Since the purchase table is related to the item table by using the “item_id” as foreign key, we don’t have to worry about updating the purchase table.

The design of the database is set. We can start to build it now. I will be using the terminal to write SQL syntax. There are also other options such as MySQL Workbench.

We first need to connect to the MySQL server from the terminal:

~$ sudo mysql -u root

The next step is to create the sales database and enter it:

mysql> create database sales;
mysql> use sales;

We are now in the sales database but it does not contain any table yet. We can start creating the tables.

The first one is the store table because all the other ones have foreign keys that relate to another table. The related tables through foreign keys must be created first.

mysql> CREATE TABLE store(
    -> store_id INT PRIMARY KEY,
    -> address VARCHAR(20) NOT NULL,
    -> manager VARCHAR(20) NOT NULL
    -> );

The syntax of CREATE TABLE statement consists of the table name and column descriptions in parenthesis. The primary key phrase is used to indicate the primary key column.

Each column description contains the column name and data type. Here are the data types used by SQL:

Int: integer
Decimal(M, N): Floating point number. M is total number of digits, N is the number of decimal digits.
Varchar(N): String (text) of lenght N
Blob: Binary large object
Date: ‘YYYY-MM-DD’
Timestamp: ‘YYYY-MM-DD HH:MM:SS’

We can now create the customer table.

mysql> CREATE TABLE customer(
    -> cust_id INT PRIMARY KEY,
    -> f_name VARCHAR(20) NOT NULL,
    -> l_name VARCHAR(20) NOT NULL,
    -> gender CHAR(1),
    -> store_id INT,
    -> FOREIGN KEY (store_id) REFERENCES store(store_id) 
    -> ON DELETE SET NULL
    -> );

You may have notices that we used the phrase “NOT NULL” in the description of some columns. It indicates that rows in this column cannot contain any null values (i.e. cannot be empty).

Note: The primary key columns cannot have any null values but we do not need to specify it explicitly by using the “NOT NULL” phrase.

The store_id column in customer table is a foreign key. We first need to create and then set as foreign key.

The foreign key statement also requires to specify the reference table and column names.

ON DELETE statement is used to specify what will happen to the values in the foreign key column when the values they refer to get deleted. For instance, the store_id column in the customer table is a foreign key. It refers to the store_id column in the store table (primary key of the store table). If a store gets deleted from the store table, some values in the customer table are also affected. ON DELETE SET NULL indicates that these values will be replaced with NULL.

We have two more tables to create which are purchase and item tables. Since purchase table contains a foreign key related to the item table, we need to create the item table first.

mysql> CREATE TABLE item(
    -> item_id INT,
    -> description VARCHAR(20),
    -> price DECIMAL(6,2) NOT NULL,
    -> store_id INT,
    -> PRIMARY KEY(item_id),
    -> FOREIGN KEY(store_id) REFERENCES store(store_id) 
    -> ON DELETE SET NULL
    -> );

As you can see, we can also declare a column as primary key after it is created.

The last table is the purchase table which contains 6 columns and three of them are foreign keys.

mysql> CREATE TABLE purchase(
    -> purchase_id INT PRIMARY KEY,
    -> item_qty INT NOT NULL,
    -> date DATE NOT NULL,
    -> item_id INT,
    -> cust_id INT,
    -> store_id INT,
    -> FOREIGN KEY(item_id) REFERENCES item(item_id)
    -> ON DELETE SET NULL,
    -> FOREIGN KEY(cust_id) REFERENCES customer(cust_id)
    -> ON DELETE SET NULL,
    -> FOREIGN KEY(store_id) REFERENCES store(store_id)
    -> ON DELETE SET NULL
    -> );

We now have 4 tables in our database. We can check them with SHOW TABLES statement.

mysql> show tables;
+-----------------+
| Tables_in_sales |
+-----------------+
| customer        |
| item            |
| purchase        |
| store           |
+-----------------+

We have successfully created a simple relational database. The structure of table can be viewed using the DESCRIBE statement.

mysql> describe customer;
+----------+-------------+------+-----+---------+-------+
| Field    | Type        | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| cust_id  | int(11)     | NO   | PRI | NULL    |       |
| f_name   | varchar(20) | NO   |     | NULL    |       |
| l_name   | varchar(20) | NO   |     | NULL    |       |
| gender   | char(1)     | YES  |     | NULL    |       |
| store_id | int(11)     | YES  | MUL | NULL    |       |
+----------+-------------+------+-----+---------+-------+
5 rows in set (0.11 sec)

The tables do not contain any data now. In the next post, I will explain how to populate these tables with appropriate data and run queries to retrieve data.

The SELECT statement is used to create queries and retrieve data from tables. If you are or plan to be a data scientist, you are likely to use the select statement much more than the other ones.

The queries can be as simple as getting all the data in a table. However, we are more likely to run complex queries to retrieve the data partially. The SELECT statement allows us to create queries based on conditions, relations, and statistical measures.

Thank you for reading. Please let me know if you have any feedback.

Practical SQL: Designing and Creating a Relational Database

Written by uio uzxc

No responses yet