Explore the use of subqueries in SQL to enhance your queries. Understand their function in SELECT, FROM, and WHERE clauses for dynamic calculations, modular queries, and dynamic data filtering.
Improve your readability in:
Subqueries, often referred to as inner queries or nested queries, are queries embedded inside another SQL query. They allow you to dynamically filter, transform, or manipulate data by providing intermediate results for use in the main (outer) query. Subqueries can appear in the SELECT, FROM, or WHERE clauses of a query, and each placement serves a different purpose. Let’s understand more using the provided books, authors, and sales tables from Chapter 3! Subqueries in the SELECT Clause Subqueries in the SELECT clause are used to calculate or fetch derived values for each row in the main query’s result set. You can use subqueries to retrieve a value from another table and display it as a computed column. To find total sales revenue for each book: SELECT b.book_title , ( SELECT SUM(s.price) FROM sales s WHERE s.book_id = b.book_id ) AS total_revenue FROM books b; Total sales revenue for each book Total sales revenue for each book The (SELECT SUM(s.price) FROM sales s WHERE s.book_id = b.book_id) subquery calculates the total sales revenue for each book. This derived value is included as a new column (total_revenue) in the result set. The result allows you to see both the book_title and its total_revenue. Subqueries in the FROM Clause Subqueries in the FROM clause are often referred to as “derived tables” or “inline views.” They allow you to create a temporary table or result set that can be treated like a regular table in the main query. For ranking authors by total sales, use: SELECT author_name , total_sales FROM ( SELECT a.author_name , SUM(s.price) AS total_sales FROM authors a JOIN books b ON a.author_id = b.author_id JOIN sales s ON b.book_id = s.book_id GROUP BY a.author_name ) AS author_sales ORDER BY total_sales DESC; Rank authors by total sales Rank authors by total sales In the FROM clause, the subquery calculates the total_sales for each author by summing up the price column from the sales table. This intermediate result set (author_sales) is treated as a derived table. The main query retrieves data from the derived table, sorting authors by their total_sales. Subqueries in the WHERE Clause Subqueries in the WHERE clause are used to filter rows in the main query based on the results of another query. Let’s find books that have not been sold! SELECT book_title FROM books WHERE book_id NOT IN ( SELECT book_id FROM sales ); Books that have not been sold Books that have not been sold The subquery (SELECT book_id FROM sales) retrieves all book_id that have been sold. The main query then filters out the book_id that appeared in the subquery result and returns books that have not been sold. Correlated Subqueries A correlated subquery is executed once for every row of the outer query. It references columns from the outer query, meaning the subquery depends on each row being processed in the outer query to execute. To find the titles of books that have more pages than the average number of pages for books in their genre: SELECT book_title FROM books b WHERE total_pages > ( SELECT AVG(total_pages) FROM books WHERE book_genre = b.book_genre ); Books that have more pages than the average in their genre Books that have more pages than the average in their genre Here, the subquery calculates the average number of pages for each genre (b.book_genre), and this calculation is repeated for every row in the books table because the subquery depends on the outer query’s book_genre. 💡 This can lead to performance issues when dealing with large datasets, as the subquery is repeatedly executed for each outer query row. If a subquery is correlated, it will execute for every row in the outer query. On the other hand, if a subquery is non-correlated, it executes only once, making it more efficient in most cases. Use subqueries for modularity, breaking down complex problems into manageable parts and easier to understand. Keep in mind that subqueries can be resource-intensive, especially if they are not properly indexed or if they involve large datasets. Consider using filters in subqueries or completely changing them by using joins or Common Table Expressions (CTEs) for improved readability and performance. References https://en.wikibooks.org/wiki/Structured_Query_Language/SELECT:_Subquery https://en.wikipedia.org/wiki/Correlated_subquery
Originally published at https://blog.ardenov.com.
Top comments (0)