DEV Community

Cover image for SQL 101 | Chapter 9: Common Table Expression (CTE)
Yujin
Yujin

Posted on

SQL 101 | Chapter 9: Common Table Expression (CTE)

Common Table Expression (CTE) is defined using the WITH keyword, writing the query logic within it, and then referencing it in your main query.

Improve your readability in:

SQL 101 | Chapter 9: Common Table Expression (CTE)

Common Table Expression (CTE) is defined using the WITH keyword, writing the query logic within it, and then referencing it in your main query.

favicon blog.ardenov.com

Common Table Expression (CTE) is introduced using the WITH keyword and serves as a temporary result set that is available only for the duration of the main query. Unlike subqueries, which can make SQL statements harder to read, CTE improves overall clarity, especially in queries that involve multiple steps. As an illustration, you can imagine CTE expressing the raw table into different expressions (temporary result sets) as needed to be processed into the final result. Transforming raw table to temporary result sets with CTE Transforming raw table to temporary result sets with CTE Why Use a Common Table Expression (CTE)? Improved Readability A Common Table Expression (CTE) helps break down complex SQL logic into smaller, modular blocks of code. This approach makes a query easier to understand and maintain. Reusable Logic One of the key benefits of CTE is its reusability. Once defined, the CTE can be referenced multiple times in the main query. This avoids the need to rewrite the same subquery logic repeatedly, which saves the time and ensures consistency in your queries. Step-by-Step Queries CTE is especially helpful when a query involves intermediate calculations or transformations. By defining each step separately, you can build the query incrementally. This makes the query easier to understand and debug. Recursive Queries Recursion in CTE is essential for handling hierarchical or tree-structured data. Whether you are working with organizational charts, file systems, or other similar structures, recursion simplifies complex queries and enhances their efficiency. Basic Syntax of a CTE The basic syntax of a CTE is defined using the WITH keyword, writing the query logic within it, and then referencing it in your main query. WITH cte_name AS ( SELECT column_1 FROM table_1 WHERE condition ) SELECT column_1 FROM cte_name; Here, cte_name acts as a placeholder for the CTE, which behaves like a temporary table for the main query. The logic within the CTE (SELECT statement) is executed first, and the result can be used like a normal table in the following query. If you use multiple CTE statements, use a comma as a separator between each CTE statement. WITH cte_1 AS ( SELECT column_1 , column_2 FROM table_1 WHERE condition ) , cte_2 AS ( SELECT column_1 , column_3 FROM table_2 WHERE condition ) SELECT c1.column_1 , column_2 , column_3 FROM cte_1 AS c1 LEFT JOIN cte_2 AS c2 ON c1.column_1 = c2.column_1; Simple CTE in Practice Suppose you want to find books that have a higher number of pages than the average page count in their genre. Using a CTE, you can first calculate the average page count for each genre, and then use this result in the main query. WITH AveragePages AS ( SELECT book_genre , AVG(total_pages) AS avg_pages FROM books GROUP BY book_genre ) SELECT b.book_title , b.book_genre , b.total_pages , ap.avg_pages FROM books b JOIN AveragePages ap ON b.book_genre = ap.book_genre WHERE b.total_pages > ap.avg_pages; Books with above-average page counts in their genre Books with above-average page counts in their genre The AveragePages CTE calculates the average page count for each genre. In the main query, JOIN combines each book’s data with the corresponding average page count for its genre. The main query then compares each book’s total_pages to the average for its genre and filters books with above-average page counts (> ap.avg_pages). Chaining Multiple CTEs You can chain multiple CTEs together to build more complex queries step by step. In this case to identify books belonging to top-selling genres, where the genres have generated total sales exceeding $5,000: WITH GenreSales AS ( SELECT b.book_genre , SUM(s.price) AS total_genre_sales FROM books b JOIN sales s ON b.book_id = s.book_id GROUP BY b.book_genre ) , TopGenres AS ( SELECT book_genre FROM GenreSales WHERE total_genre_sales > 5000 ) SELECT b.book_title , b.book_genre FROM books b JOIN TopGenres tg ON b.book_genre = tg.book_genre; Books belonging to top-selling genres Books belonging to top-selling genres This first CTE (GenreSales) calculates total sales for each genre. The second CTE (TopGenres) filters the genres identified in GenreSales to only include those where the total sales exceed $5,000. Finally the main query retrieves book_title and book_genre for books that belong to the top-selling genres (TopGenres). Recursive CTE Query A recursive CTE in SQL is created using a WITH RECURSIVE clause. This allows you to define a CTE that references itself, enabling you to work with hierarchical or recursive data. Below is the general syntax: WITH RECURSIVE cte_name AS ( -- Anchor Query: Base result set SELECT column_1, column_2, ... FROM table_name WHERE condition_for_base_case UNION ALL -- Recursive Query: Recursive logic for self-referencing SELECT column_1, column_2, ... FROM table_name JOIN cte_name ON condition_for_recursive_step ) -- Final Query: Use the recursive CTE SELECT column_1, column_2, ... FROM cte_name; Recursive CTE can be used to navigate parent-child relationships, perform tree traversals, or build hierarchies. For this practice, we use the books table with an additional rule. The parent-child relationship is derived dynamically based on the book_id. Specifically, the parent_book_id is determined as book_id - 1. If the book_id is 1, it is considered the root book and does not have a parent. The query will show the entire hierarchy, indicating the level (depth) of each book in the hierarchy. WITH RECURSIVE BookHierarchy AS ( -- Anchor Query: Start -- with the root book -- (where book_id - 1 = 0) SELECT book_id , book_title , CAST(NULL AS INT) AS parent_book_id , 1 AS level FROM books WHERE book_id - 1 = 0 UNION ALL -- Recursive Query: Find -- child books by dynamically -- calculating parent_book_id SELECT b.book_id , b.book_title , bh.book_id AS parent_book_id , bh.level + 1 AS level FROM books b JOIN BookHierarchy bh ON b.book_id - 1 = bh.book_id ) SELECT book_id , book_title , parent_book_id , level FROM BookHierarchy ORDER BY level, book_id; Books Hierarchy Books Hierarchy Anchor query initializes the hierarchy by selecting the root book, where book_id - 1 = 0. In this case, Book 1 is the root book, as its parent_book_id is NULL. Recursive Query identifies child books by unifying the books table with the recursive CTE (BookHierarchy), using the condition b.book_id - 1 = bh.book_id. For every match, the parent_book_id is assigned as the book_id of the parent book. The level is incremented by 1 to indicate the depth of the child in the hierarchy. The query outputs the entire hierarchy of books, ordered by their level in the hierarchy. Still confused? Here is a Recursive Illustration. Watch it until it ends! 😆 Still confused? Here is a Recursive Illustration. Watch it until it ends! 😆 References https://en.wikipedia.org/wiki/Hierarchical_and_recursive_queries_in_SQL#Common_table_expression https://www.postgresql.org/docs/9.4/queries-with.html https://www.atlassian.com/data/sql/using-common-table-expressions


Originally published at https://blog.ardenov.com.

Top comments (0)