hexfloor

Posted on Oct 6, 2023

SQL Pro tips : AWS Athena SQL UNPIVOT : CROSS JOIN UNNEST

#aws #database #bigdata #sql

Introduction

TL;DR : Why to use UNPIVOT over UNION ALL ? Performance, UNION ALL scans all the data for each subquery, with UNPIVOT the scan happens only once.
We shall perform now the same operation with AWS Athena
Feel free to check the Oracle XE CROSS JOIN UNPIVOT tips or a AWS documentation.
I'm writing this guide to provide a simple step-by-step manual easy to understand.

UNPIVOT

If you are new to Athena you may wish to go through the quick start guide

Let's start straight away with something trivial :

select 'SUNDAY' as day0, 'MONDAY' as day1;

and use array

select array['SUNDAY', 'MONDAY'] as day;

now, let's flatten the array with UNNEST :

select * from unnest(array['SUNDAY', 'MONDAY']) t(day);

and add another column into the output :

with 
    in_data as (select * from unnest(array['SUNDAY', 'MONDAY'], array[0, 1]) t(day, day_order))
select * from in_data;

That was fast and straight to the point! Let's explore CROSS JOIN

CROSS JOIN

In a nutshell CROSS JOIN
operation is simple = it's just a Cartesian product of the two tables, take all the possible combinations from the values on the left with the values on the right.
Few examples :

with 
    week_data as (select '2024W01' as week_ref),
    day_data as (select 'SUNDAY' as day_ref)
select * from week_data cross join day_data;

with UNION ALL :

with 
    week_data as (select '2024W01' as week_ref),
    day_data as (select 'SUNDAY' as day_ref 
    union all select 'MONDAY' as day_ref)
select * from week_data cross join day_data;

and with array :

with 
    week_data as (select '2024W01' as week_ref),
    day_data as (select array['SUNDAY','MONDAY'] as day_ref)
select * from week_data cross join day_data;

let's flatten the array :

with 
    week_data as (select '2024W01' as week_ref),
    day_data as (select * from unnest(array['SUNDAY','MONDAY']) r(day_ref))
select * from week_data cross join day_data;

and add another one to show the full power of the CROSS JOIN :

with 
    week_data as (select * from unnest(array['2024W01', '2024W02']) l(week_ref)),
    day_data as (select * from unnest(array['SUNDAY','MONDAY']) r(day_ref))
select * from week_data cross join day_data;

Finally we are ready to combine both CROSS JOIN and UNNEST

CROSS JOIN UNNEST

The operation I'm going to perform is named UNPIVOT in Oracle XE, see here at the same time BigQuery has both UNPIVOT and UNNEST, you may check here

Let's have some test data :

with 
    in_data as (select '2024W01' as week_ref, 'SUNDAY' as day0, 0 as day_order0, 'MONDAY' as day1, 1 as day_order1)
select * from in_data;

Firstly, let's generate the result with UNION ALL :

with 
    in_data as (select '2024W01' as week_ref, 'SUNDAY' as day0, 0 as day_order0, 'MONDAY' as day1, 1 as day_order1)
select week_ref, day0 as day, day_order0 as day_order from in_data
union all 
select week_ref, day1 as day, day_order1 as day_order from in_data;

The query above will scan the in_data as many times as many subqueries you have, in this example we have 2 subqueries.

and finally the performance optimization with CROSS JOIN UNNEST :

with 
    in_data as (select '2024W01' as week_ref, 'SUNDAY' as day0, 0 as day_order0, 'MONDAY' as day1, 1 as day_order1)
select week_ref, day, day_order
from in_data
cross join unnest(array[day0, day1], array[day_order0, day_order1]) t(day, day_order);

Using UNPIVOT increases performance of your data analytics requests and saves the energy.
Enjoy !

Forem

SQL Pro tips : AWS Athena SQL UNPIVOT : CROSS JOIN UNNEST

Introduction

UNPIVOT

CROSS JOIN

CROSS JOIN UNNEST

Top comments (0)

Read next

A conversation with your architecture

Beyond LIKE: Advanced Text Search and Keyword Matching in Postgres using Full Text Search

Iterations

Amazon Q Developer Tips: No.22 Amazon Q Keyboard shortcuts