DEV Community

Manoj Gohel
Manoj Gohel

Posted on

Stop Using UUIDs in Your Database

How UUIDs can Destroy SQL Database Performance.
One of the most common way to uniquely identify rows in a database is by using UUID fields.

This approach, however, comes with performance caveats that you must be aware of.

In this article, we discuss two performance issues that may arise when using UUIDs as keys in your database tables.

So without further ado… Let’s jump right in!

What are UUIDs?

UUID stands for Universally Unique Identifier.

There are many versions of UUID, but in this article we will consider the most popular one: UUIDv4.

Here is an example of how a UUIDv4 looks like:

NOTE: Each UUID has the digit 4 in the same position to denote the version.

Problem 1 — Insert Performance

When a new record is inserted into a table, the index associated with the primary key must be updated to maintain optimal query performance.

Indexes are constructed using the B+ Tree data structure.

If you want to learn more about how Indexes and B+ Trees work, I highly suggest watching this great video from Abdul Bari.

TL;DR: for every record insertion, the underlying B+ Tree must be rebalanced to optimize query performance.

10.2 B Trees and B+ Trees. How they are useful in Databases
0:00 / 39:41•Live

The rebalancing process becomes highly inefficient for UUIDv4.

This is because of the inherent randomness of UUIDs, making it harder to keep the tree in balance.

As you scale, you will have millions of nodes to rebalance, which dramatically decreases the insert performance when using UUID keys.

NOTE: Other options such as UUIDv7 could be a better option, since they have inherent ordering which makes it easier to index them.

Problem 2 — Higher Storage

Let’s consider the size of a UUID with an auto-incrementing integer key:

By comparison, auto-incrementing integers consume 32bits per value, while UUIDs consume 128bits per value.

This is 4x more per row.

Additionally, most people store UUIDs in human-readable form, which means a UUID could consume up to 688bits per value.

This is approximately 20x more per row.

Let’s evaluate how UUIDs can actually impact your storage by simulating a realistic database.

We will adapt the tables used by Josh Tried Coding in this example:

This example uses a Neon postgresql database.

  • Table 1 will contain 1 million rows with UUIDs.
  • Table 2 will contain 1 million rows with auto-incrementing integers.

Here are the results, let’s break down each statistic one by one:

Total table size: When considering both table sizes, the UUID table is approximately 2.3x larger than the Integer table!

ID field size: An individual UUID field requires 9.3x more storage space than an equivalent integer field!

ID column size: When excluding other attributes in each table, there is a 3.5x size difference between the UUID and Integer columns!

Conclusion

UUIDs are a great way to ensure uniqueness between records in a table.

These problems are prevalent at scale, so UUIDs will not actually cause noticeable performance degradation for most people.

Although these problems are prevalent at scale, it is important to realise the implications of using UUIDs in your tables and ensure optimal database design.

Top comments (0)