Summary: in this tutorial, you will learn about the MariaDB subqueries to form more complex queries.
What is a subquery
A subquery is a query nested in another query.
See the following countries
table from the sample database:
Suppose that you want to find countries whose areas are bigger than 5,000,000 km2. To achieve this, you may come up with these steps:
First, find identifications of countries whose areas are greater than 5,000,000:
select country_id
from countries
where area > 5000000;
Code language: SQL (Structured Query Language) (sql)
Then, use these ids in the following query to get the countries’ information:
select
name,
area
from countries
where country_id in (12,15,31,38,42,182,224)
order by
area,
name;
Code language: SQL (Structured Query Language) (sql)
This approach has some issues:
- First, if the country’s areas are updated, you need to update your second query.
- Second, you need to execute two queries to the database server and manually update the ids in the second query to get the result.
Subqueries can help resolve these issues. Instead of executing two queries at different time, you nest the first query in the second query as follows:
select
name,
area
from
countries
where country_id in (
select
country_id
from
countries
where
area > 5000000
)
order by
area,
name;
Code language: SQL (Structured Query Language) (sql)
The following query is called a subquery:
select country_id
from countries
where area > 5000000
Code language: SQL (Structured Query Language) (sql)
And the query that includes the subquery is known as the outer query.
This tutorial focuses on the three types of subqueries: scalar subqueries, row subqueries, and the subqueries that appears in the from
clause of a select
statement
Scalar queries
A scalar subquery returns a single value. A scalar subquery can be used in an SQL statement where a literal or single column value is used.
The following query returns the largest area of all countries:
select max(area)
from countries;
Code language: SQL (Structured Query Language) (sql)
Since the query returns a single value, you can use it as a subquery to find the country that has the largest area:
select *
from countries
where area = (
select max(area)
from countries
);
Code language: SQL (Structured Query Language) (sql)
Row subqueries
A row subquery returns a single row.
See the following countries
and country_stats
tables:
The following statement returns the average population and GDP of all countries in 2018:
select
avg(population),
avg(gdp)
from country_stats
where year = 2018;
Code language: SQL (Structured Query Language) (sql)
This query returns a row that consists of two columns. You can use this query as a subquery to find countries whose population and GDP are greater than the average population and GDP of all countries in 2018:
select
name
from
country_stats
inner join countries
using (country_id)
where
year = 2018 and
(population, gdp) > (
select
avg(population),
avg(gdp)
from country_stats
where year = 2018)
order by
name;
Code language: SQL (Structured Query Language) (sql)
Subqueries in the from clause
Typically, you place a table in the from clause of a select
statement.
Since a query also returns a result set which is the same as a virtual table, you can place a subquery in the from clause.
select select_list
from (subquery) as table_alias
...;
Code language: SQL (Structured Query Language) (sql)
A subquery that appears in the from clause is commonly called a derived table. A derived table must have an alias or you will get a syntax error.
For example, the following statement returns the average area of all regions:
select avg(region_area)
from (
select sum(area) region_area
from countries
group by region_id
) t;
Code language: SQL (Structured Query Language) (sql)
In this example:
- First, the subquery returns the total area for each region.
- Then, the outer query calculates the average area of all regions.
In this tutorial, you learned about the MariaDB subquery to form more complex queries.