Hey, as a follow-up to Part 2, I recently started reviewing indexes in one of my prod databases. It turned out to be very educational.
The database has been developed for a few years, with thousands of new features and migrations applied along the way. I was surprised at how many indexes had been created and never used, or had not been used for a long time.
Why is index management so important? Because indexes are costly! The costs of indexing:
- Indexes eat storage, and you need high-speed, highly available storage for that.
- Index maintenance takes CPU and extra IOPS on every INSERT/UPDATE/DELETE query, and adds latency to these queries.
- Indexes take time to rebuild when restoring the database during incident recovery.
Any index is a trade-off: on one side it accelerates queries and reduces IOPS, on the other it creates additional workload on every write.
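The write-side cost is easy to see on a toy table. Here is a minimal sketch (the table and index names are made up): run the same bulk INSERT before and after adding an index and compare the timings and buffer counts.
-- Toy example: observe the extra work an index adds to INSERTs
CREATE TABLE bench_events (id bigint, payload text);
-- Baseline: no extra indexes
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO bench_events
SELECT g, md5(g::text) FROM generate_series(1, 100000) AS g;
CREATE INDEX bench_events_payload_idx ON bench_events (payload);
-- Same INSERT again: more buffers touched and higher latency,
-- because every new row must also be added to the index
EXPLAIN (ANALYZE, BUFFERS)
INSERT INTO bench_events
SELECT g, md5(g::text) FROM generate_series(100001, 200000) AS g;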
During the last database review, I found 50 GB of unused indexes in a 600 GB database. How did I find that?
How to find unused indexes:
-- Unused indexes
SELECT
    relname AS table_name,
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 -- No scans
ORDER BY pg_relation_size(indexrelid) DESC;
-- Total size of unused indexes
SELECT
    pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS indexes_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
The unused indexes are sorted by size, so the biggest ones are the priority to fix first. The total size of unused indexes shows how bad the overall situation is.
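Two caveats worth keeping in mind: idx_scan only counts scans since the statistics were last reset, and indexes that back PRIMARY KEY or UNIQUE constraints should not be dropped even if they are never scanned. Here is a variant of the query above that excludes constraint-backed indexes (a sketch, adjust to your needs):
-- Unused indexes, excluding those that back PRIMARY KEY / UNIQUE constraints
SELECT
    s.relname AS table_name,
    s.indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique
ORDER BY pg_relation_size(s.indexrelid) DESC;
-- How long the statistics have been accumulating
SELECT stats_reset FROM pg_stat_database WHERE datname = current_database();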
How to find indexes that cover the same columns:
-- Duplicate indexes cover the same columns
SELECT
    indrelid::regclass AS table_name,
    array_agg(indexrelid::regclass) AS duplicate_indexes
FROM pg_index
GROUP BY indrelid, indkey
HAVING COUNT(*) > 1;
The results can be quite revealing and are another good reason to review the indexes.
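Before dropping one of a pair, it is worth comparing the full definitions, because indexes grouped by indkey can still differ in other ways (partial WHERE clause, operator class, uniqueness). A quick check for a single table (orders is a hypothetical name):
-- Compare the full definitions of all indexes on one table
SELECT
    indexrelid::regclass AS index_name,
    pg_get_indexdef(indexrelid) AS definition
FROM pg_index
WHERE indrelid = 'orders'::regclass;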
The next query gives a sorted view of the idx_scan / (seq_scan + idx_scan) percentage for each table. It is a good occasion to review the index coverage of your tables.
-- Index usage vs seq scan
SELECT
    relname AS table_name,
    seq_scan, seq_tup_read,
    idx_scan, idx_tup_fetch,
    round(100 * idx_scan::numeric / NULLIF(seq_scan + idx_scan, 0), 2) AS index_usage_percent
FROM pg_stat_user_tables
ORDER BY index_usage_percent DESC NULLS LAST;
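Small tables are often scanned sequentially by design, so a low percentage there is not a problem. A possible refinement (the 10 MB cutoff is arbitrary) that looks only at larger tables and puts the worst coverage first:
-- Index usage vs seq scan, large tables only, worst coverage first
SELECT
    relname AS table_name,
    pg_size_pretty(pg_relation_size(relid)) AS table_size,
    seq_scan, idx_scan,
    round(100 * idx_scan::numeric / NULLIF(seq_scan + idx_scan, 0), 2) AS index_usage_percent
FROM pg_stat_user_tables
WHERE pg_relation_size(relid) > 10 * 1024 * 1024 -- arbitrary 10 MB threshold
ORDER BY index_usage_percent ASC NULLS LAST;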
An example query for a migration that drops an unused index:
DROP INDEX IF EXISTS index_name;
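On a busy table it is usually safer to drop the index concurrently, so the DROP does not hold an exclusive lock on the table for its whole duration. Note that CONCURRENTLY cannot run inside a transaction block, which matters for migration tools that wrap each migration in one:
-- Does not block reads/writes on the table while the index is dropped
DROP INDEX CONCURRENTLY IF EXISTS index_name;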
The reasons why this index "crisis" happened:
- a lot of features were added to and later removed from the application; indexes were created for them, but we forgot to review and remove those indexes;
- a lot of indexes were created because it seemed obvious that developers would filter or join by these columns, but in real life the Postgres planner does not want to use them;
- the indexes had never been reviewed.
Conclusions:
- Indexes are needed to run queries fast, but they cost CPU on INSERTs/UPDATEs/DELETEs and take extra storage space.
- An index review should be included in regular database audit and maintenance procedures.
- The queries above are extremely useful for getting an overview of the current index situation in your Postgres database.
PS: I use AI a lot as a learning partner or an advisor, but not in production. No AI was used to write this article.