SQL (structured query language) is the language of databases. On its own, it is a very powerful tool, and as a full stack dev, it is super important we understand how this language works. In this blog post, I am going to discuss SQL join tables in depth: what they are, why they are necessary, the different types of joins you can perform, and how to use them. This article is for you if you are beginning your journey with SQL and have a working knowledge of the language, including some basic syntax. Difficulty level: intermediate beginner.
What is a Join Table? Why wouldn't I just make a new table?
SQL join tables are used to associate records from different tables; think of them as linking tables. Making a new table to represent a relationship would work, however, you would need to make a new table for each instance that is created.
If you intend to grow your db, this would become unmanageable very quickly. So a join table allows you to create a many-to-many relationship between two or more tables in an efficient way.
Wait...what is a many-to-many relationship again?
A many-to-many relationship is where multiple records in one table can be associated with multiple records in another table.
For example, imagine you have a chatbot with two db tables: a "statements" table that collects all the training data for your bot, and "tag" table, which has a collection of helper tags with an associated id. A statement can have multiple tags, and a tag can have multiple statements; this helps the bot classify statements and make searching for an appropriate response easier, creating a many-to-many relationship between the two tables.
In order to represent this relationship in SQL, you would need to use a third table, a join table, that sits between the two. Let's call it "tag_association". This table would have two columns of foreign keys that reference the primary keys of both the "statements" and "tag" tables. The join table creates our necessary many-to-many relationship and maintains the referential integrity of "statement" and "tag", simultaneously avoiding data duplication. Also, it is self-scaling, which means as your data updates, so does your relationship model.
The Importance of Relationships
SQL is a relational database language, which means data is organized into tables with rows and columns. Each row represents a specific type of data, and each column is an instance of that data. Naturally, the process of querying the databases heavily relies on relationships: different types of joins in SQL can represent different types of relationships between tables in a database. There are several types of joins in SQL, including (but not entirely limited to):
Inner Join: This type of join returns only the rows that have matching values in both tables being joined. An inner join can be used to represent a one-to-many or many-to-many relationship between two tables.
Left Outer Join: This type of join returns all the rows from the left table and matching rows from the right table. If there is no matching row in the right table, then the result will contain NULL values for the right table columns. A left outer join can be used to represent a one-to-many relationship, where the left table contains the primary key and the right table contains the foreign key.
Right Outer Join: This type of join returns all the rows from the right table and matching rows from the left table. If there is no matching row in the left table, then the result will contain NULL values for the left table columns. A right outer join can be used to represent a one-to-many relationship, where the right table contains the primary key and the left table contains the foreign key.
Full Outer Join: This type of join returns all the rows from both tables being joined, with NULL values for any unmatched columns. A full outer join can be used to represent a many-to-many relationship between two tables.
WHEW! That is a lot of information. Want to practice?
Let's go ahead a practice creating some join tables using MyComplier:. We are going to use some examples from W3School's SQL program (which is extensive and contains a lot of helpful information in learning the language).
-- create a table
CREATE TABLE orders (
OrderID INTEGER PRIMARY KEY,
CustomerID INTEGER,
OrderDate TEXT NOT NULL
);
-- insert some values
INSERT INTO orders VALUES (10308, 2, '1996-09-18');
INSERT INTO orders VALUES (10309, 39, '1996-09-19');
INSERT INTO orders VALUES (10310, 77, '1996-09-20');
-- fetch some values
SELECT * FROM orders WHERE CustomerID = 2;
Go ahead copy and paste this code into the complier link above (the link will take you to the main page, just click on the "Run Your SQL Code" button, it is giant and yellow, can't miss it). You can see we are creating a table called orders which has columns of OrderID, CustomerId and OrderDate. We are then inserting values into the table as particular instance orders. Then we are retrieving, or fetching an entire instance (annotated by the *) from our table where customerID matches 2. Hit the green run play button at the top and see what happens.
Now we are going to create another table, customers and populate with some data. You can copy and paste the code below, or type it yourself to improve muscle memory:
-- create another table
CREATE TABLE customers (
CustomerID INTEGER PRIMARY KEY,
CustomerName TEXT NOT NULL,
ContactName TEXT NOT NULL,
Country TEXT NOT NULL
);
-- insert some values
INSERT INTO customers VALUES (1, 'Alfreds Futterkiste', 'Maria Anders', 'Germany');
INSERT INTO customers VALUES (2, 'Ana Trujillo Emparedados y helados', 'Ana Trujillo', 'Mexico');
INSERT INTO customers VALUES (3, 'Antonio Moreno Taquería', 'Antonio Moreno', 'Mexico');
-- fetch some values
SELECT * FROM customers WHERE CustomerID = 2;
Hitting run, you will see that we are retuning the instance from customers where the id is 2.
Let's write for an Inner Join Table.
Great! Now it time to test how we can join these tables together. Comment out the two SELECT commands, and write the code below:
SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDate
FROM Orders
INNER JOIN Customers ON Orders.CustomerID=Customers.CustomerID;
Before you press run, can you guess what the new table will look like? Take a minute to study the above code and take an educated guess at what type of relationship we are creating. We see we are selecting from our both of our tables: orderID from orders, customerName from customers, and orderDate from orders. So there will be three columns of organized data.
The next line of code specifies the first table we are joining, orders. The next line, INNER JOIN Customers specifies that an inner join should be performed with the Customers table. ON Orders.CustomerID=Customers.CustomerID specifies the join condition, which is that the CustomerID column in the Orders table must match the CustomerID column in the Customers table.
Can you assume which is the instance that will be created? Run the code to see if you're right!
You can see that the only instance that is created is the one where the customer ID is in both tables, the id of 2. This is a very small example, so can you imagine ways to scale the table to see more instances in our join table?
Left Join Example
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
ORDER BY Customers.CustomerName;
As we may recall, a left join returns all the rows of the left table (in this instance customers) and only the matching rows of the right table (the orderId where customer.customerId=orders.customerID). There is only one occurrence of this in our tables, the order 10308, so the join table contains only one orderID. The other spaces for this may return Null or empty. So we see all three customer names printed, but only Anna Trujillo's id printed.
How to use the join tables within my application?
Join tables are used when we have many-to-many or a many-through relationship. For instance, say we have db that has coffee drinks and coffee shops as tables. We also have a customer tables that allows you to make an account, say for placing an online order. A join table (orders) allows us to make a relationship of orders, connecting one order with a customer_id, a drink_id and a shop_id. This way, the system has knowledge base of who ordered what and from where.
Conclusion
SQL is a beast of a language, and if you would like to dive into it, you could very well become a SQL master. However, with the benefits of abstraction and using tools like SQLAlchemy, our databases become easier to write. Understanding depth of the SQL language to write DB's can help us in our abstraction have a better understanding of what is happening in our program under the hood. It can be dense, but once you emerge from the darkest forest, the sun shines brighter!
If there is anything that I missed or that you think I ought to have included in this blog post, please let me know. And as always, Happy Coding!
Top comments (0)