Lucas Barret

Posted on Jun 23, 2023

Logged VS Unlogged table Performance in Postgres

#postgres #performance #database #data

Introduction

Durability and Consistency are essential for relational databases. They are part of the ACID principles.

To ensure this durability and consistency, relational databases use several techniques. One of these is to use a WAL file (Write-Ahead Log file).

Let's see what this means and how it affects the performance of our read and writes operations. And determines what can suit more for an ETL process.

Tables

Logged Tables

When you create a table in a relational database, you make a logged table. Every manipulation (DML statement) on your data will be logged to the WAL file before being run on your database in a LOGGED table.

CREATE TABLE logged_table(test integer);

For example, if you do an insert and then delete in a transaction as follows:

BEGIN;
INSERT INTO logged_table VALUES (1);
DELETE FROM logged_table;
COMMIT;

This will write the WAL file, the insertion you want to perform, and the deletion you wish to achieve.
Then it will perform the insertion and, eventually, the deletion.

This ensures :

Data stay consistent
Recovering data in case of systems failures

Unlogged Tables

There is another type of table in a relational database unlogged table.

CREATE UNLOGGED TABLE unlogged_table(test integer);

Performance

As you can tell, writing every operation in a log file before executing has an overhead cost.

With the simple tables, we had created before :

select now();
insert into logged_table values(generate_series(1,20000000));
select now();

        clock_timestamp
-------------------------------
 2023-06-19 09:07:55.525075+04
(1 row)

INSERT 0 20000000
        clock_timestamp
-------------------------------
 2023-06-19 09:08:20.054688+04
(1 row)


-------------------------------------------------------------


select now();
insert into unlogged_table values(generate_series(1,20000000));
select now();

insert into test2 values(generate_series(1,20000000));
select clock_timestamp();
        clock_timestamp
-------------------------------
 2023-06-19 09:07:37.594111+04
(1 row)

INSERT 0 20000000
       clock_timestamp
------------------------------
 2023-06-19 09:07:41.37622+04
(1 row)

Unlogged tables enable us to manipulate data faster. Here for a significant amount of data, we have a more than 50% difference in speed.

ETL process needs

Does an ETL process need for WAL? It might not be. Indeed ETL processes are processes that are often rerun when data changes. It is easy, and we must ensure data are correctly formatted and stored.

Speed is more valuable than ACID principles here since we can rerun our pipeline without fearing losing our data.

In this case, the unlogged table can be a valuable asset.

Conclusion

Our favorite RDBMS has many features, and the more you dig into it, the more you realize that there are things to discover.

Using any feature for the sake of it is something other than what we want. But in this case, knowing the difference between a Logged table and an Unlogged table can result in winning times and speed of execution in our ETL pipeline.

See you around for the following article. I hope you enjoy it!

Keep in Touch

On Twitter : @yet_anotherDev

On Linkedin : Lucas Barret

DEV Community

Logged VS Unlogged table Performance in Postgres

Introduction

Tables

Logged Tables

Unlogged Tables

Performance

ETL process needs

Conclusion

Keep in Touch

Top comments (0)

Read next

Populating a Vertica Database with Random Data

How to Install Vertica DB Client (vsql) on Ubuntu 24.04

CTE | Common Table Expression | Real world scenarios examples

RAG Web Scraping