Joins in relational databases are a feature that allows you to combine rows from multiple tables into one based on a related column between them. There are several types of joins in relational databases, including inner joins, outer joins, and cross joins. Each type has its own purpose and use cases.
Although joins can be useful, beware of using them with large amounts of data as they can introduce performance problems. Nobody wants performance problems, right?
Let's look at each join type in more detail.
An inner join returns only the rows that have matching values in both tables. If there is no match between the two tables, the row is not included in the result set.
Let's assume that we have following two tables:
EmployeeID | EmployeeName | DepartmentID |
---|---|---|
1 | John Doe | 101 |
2 | Gonzo Baloni | 102 |
3 | Jim Beam | 103 |
4 | Jane Silvester | 101 |
5 | Carla Overwood | 105 |
DepartmentID | DepartmentName |
---|---|
101 | Engineering |
102 | Fishihg |
103 | Finance |
104 | Grass cutting |
If we needed to know the department name for each employe, we would use an inner join to combine these tables
based on their common column, DepartmentID
.
SELECT employees.EmployeeName, departments.DepartmentName
FROM employees
INNER JOIN departments ON employees.DepartmentID = departments.DepartmentID;
This query joins the Employees
table with the Departments
table where their
DepartmentID
values match. The resulting table is:
EmployeeName | DepartmentName |
---|---|
John Doe | Engineering |
Gonzo Baloni | Fishing |
Jim Beam | Finance |
Jane Silvester | Engineering |
Left join returns all records from the left (first specified) table and the matched records from the right table. If
there is no match, a NULL
value will be assigned to the right table column.
We will again work with the same tables:
EmployeeID | EmployeeName | DepartmentID |
---|---|---|
1 | John Doe | 101 |
2 | Gonzo Baloni | 102 |
3 | Jim Beam | 103 |
4 | Jane Silvester | 101 |
5 | Carla Overwood | 105 |
DepartmentID | DepartmentName |
---|---|
101 | Engineering |
102 | Fishihg |
103 | Finance |
104 | Grass cutting |
Left join query looks like this:
SELECT employees.EmployeeName, departments.DepartmentName
FROM employees
LEFT JOIN departments ON employees.DepartmentID = departments.DepartmentID;
The resulting table would be:
EmployeeName | DepartmentName |
---|---|
John Doe | Engineering |
Gonzo Baloni | Fishing |
Jim Beam | Finance |
Jane Silvester | Engineering |
Carla Overwood | NULL |
Right join is the opposite of the left join. It returns all records from the right (second specified) table and the
matched records from the left table. If there is no match, a NULL
value will be assigned to the left
table column.
So, if we were to run the following query with the same tables as before:
EmployeeID | EmployeeName | DepartmentID |
---|---|---|
1 | John Doe | 101 |
2 | Gonzo Baloni | 102 |
3 | Jim Beam | 103 |
4 | Jane Silvester | 101 |
5 | Carla Overwood | 105 |
DepartmentID | DepartmentName |
---|---|
101 | Engineering |
102 | Fishihg |
103 | Finance |
104 | Grass cutting |
SELECT employees.EmployeeName, departments.DepartmentName
FROM employees
RIGHT JOIN departments ON employees.DepartmentID = departments.DepartmentID;
the resulting table would look like this:
EmployeeName | DepartmentName |
---|---|
John Doe | Engineering |
Jane Silvester | Engineering |
Gonzo Baloni | Fishing |
Jim Beam | Finance |
NULL | Grass cutting |
Notice that we have two rows for the Engineering
department. This is because there are two employees
assigned to that department.
A self join is a concept where a table is joined with itself. It might sound unusual at first, but it's quite useful in scenarios where you need to compare rows within the same table or construct hierarchical relationships.
To perform a self join, you can use table aliases to refer to the same table in different roles within the query. This way, you can distinguish between the "left" and "right" instances of the same table.
Let's consider a modified version of the employees
table. Here, we have an additional column
ManagerID
that references EmployeeID
column within the same table.
EmployeeID | EmployeeName | DepartmentID | ManagerID |
---|---|---|---|
1 | John Doe | 101 | NULL |
2 | Gonzo Baloni | 102 | 1 |
3 | Jim Beam | 103 | 1 |
4 | Jane Silvester | 101 | 2 |
5 | Carla Overwood | 105 | 2 |
Now, imagine if an angry customer asked Gonzo Baloni to call his manager. To find out who directly reports to whom, we can perform a self join on our table:
SELECT
employee.EmployeeName AS Employee,
manager.EmployeeName AS Manager
FROM employees AS employee
JOIN employees AS manager ON manager.EmployeeID = employee.ManagerID;
In the resulting table, we can easily see that Gonzo reports to John Doe. We don't even know who the guy is, such a generic name.
EmployeeName | ManagerName |
---|---|
Gonzo Baloni | John Doe |
Jim Beam | John Doe |
Jane Silvester | Gonzo Baloni |
Carla Overwood | Gonzo Baloni |
A cross join returns the Cartesian product of two tables. It means that each row from the first table is combined with
each row from the second table. This means that if the first table has N
rows and the second table has
M
rows, the resulting table will have N * M
rows.
Cross join does not require a join condition, which is a difference from other types of joins.
To demonstrate a cross join, let's imagine these two tables:
ColorID | Color |
---|---|
1 | Red |
2 | Green |
3 | Blue |
ProductID | Product |
---|---|
1 | Shirt |
2 | Hat |
If we wanted to list all possible combination of products and colors, we would use a Cross join.
SELECT colors.Color, products.Product
FROM colors
CROSS JOIN products;
The resulting table looks like this:
Color | Product |
---|---|
Red | Shirt |
Red | Hat |
Green | Shirt |
Green | Hat |
Blue | Shirt |
Blue | Hat |