Maria is a SQL enthusiast who works for The Coffee Company. Maria has mainly handled the databases and ensured everything operates nicely.
Recently she discovered the MERGE
commands of Postgres. She wants to create a Proof Of Concept about improving the ETL of the company processes with this command.
Data Architecture
Before manipulating any data, Maria wants to refresh her mind with the current architecture in the database.
Data architecture is simple for now, and the company does not need a complex one.
For each ETL process, data goes into the staging schema first. Then, if everything is fine, we can put the data in the production schema.
She creates a POC database with the same schema to avoid messing up with the production database and schema.
POC database
Did you know that when you make a database in PostgreSQL, it uses a template database called template1? And after creating her database, she establishes the schemas with a staging schema and a production schema with the tables in it.
CREATE DATABASE coffee_testing;
CREATE SCHEMA staging;
CREATE SCHEMA production;
CREATE TABLE staging.coffee_stock;
CREATE TABLE production.coffee_stock;
Better Processed Coffee
Recently the coffee company has decided to update and add some references to decaffeinated Coffee in their database.
To be more respectful of the environment, they have decided to add decaffeinated Coffee through the SWP process. And through the process with high-pressure CO2 (supercritical) at 31°C instead of chemically decaffeinated Coffee.
These data can be perfect for testing the new ETL process.
Move data into other tables
MERGE INTO
is a specific Postgres Command that enables :
-
UPDATE
INSERT
-
DELETE
You can batch load data that matches some conditions in different tables, even tables from different schemas, but not from other databases.
Maria wants to move her data from the coffee_stock of the staging to the coffee_stock production in her test database.
Some decaffeinated coffee exists on the production table, whereas others dont. If this coffee exists, we want to update the stock. If the coffee does not, we want to insert the coffee and the quantity.
MERGE INTO production.coffee_stock cs
USING staging.coffee_stock cse
ON cs.coffee_id = cse.coffee_id
WHEN MATCHED THEN
UPDATE SET quantity = cse.quantity
WHEN NOT MATCHED THEN
INSERT (coffee_id,quantity)
VALUES (cse.coffee_id, cse.quantity)
LGTM
After this test, she sends a PR to her colleagues for feedback. In the majority, they liked it, and this could significantly improve their ETL.
Indeed MERGE
has a straightforward syntax and avoids using procedural language in our PostgreSQL database.
We can see which SQL command is executed in what conditions. We can perform the basic DML (UPDATE, DELETE, INSERT) commands with it.
Now Maria has a lot of work to migrate their ETL. Of course, she will begin with one of the less important ones to test it in production, ensure everything is fine, and then migrate everything bite by bite.
PS :
I did not choose the name of Maria on purpose, this was not related to MariaDB. But I found this fun so let's keep it :)
Keep in Touch
On Twitter : @yet_anotherDev
On Linkedin : Lucas Barret
Top comments (2)
Nice article! I also wrote about MERGE operator here
I am definitely going to check this out