DEV Community

Cover image for 8 Postgres Extensions You Need to Know
Team Timescale for Timescale

Posted on • Originally published at timescale.com

8 Postgres Extensions You Need to Know

For a more in-depth explainer, jump to the full article.


PostgreSQL's flexibility comes largely from its extension ecosystem.

Let's explore eight powerful extensions that can transform your database capabilities with minimal setup.

1. PostGIS 🌎

What it does: Transforms PostgreSQL into a database system that can efficiently handle spatial data. It introduces additional data types such as geometry, geography, raster, and more, along with a suite of functions, operators, and indexing capabilities tailored to these spatial types.

Perfect for: IoT applications, location-based services, and geospatial analysis.

Installation:

CREATE EXTENSION postgis;
Enter fullscreen mode Exit fullscreen mode

PostGIS sample query:

We want to know: “How many taxis picked up passengers within 400 meters of Times Square on New Year’s Day 2016?”

-- How many taxis picked up rides within 400m of Times Square on New Years Day?
-- Times Square coordinates: (40.7589, -73.9851)
SELECT time_bucket('30 minutes', pickup_datetime) AS thirty_min, COUNT(*) AS near_times_sq
FROM rides
WHERE ST_Distance(pickup_geom, ST_Transform(ST_SetSRID(ST_MakePoint(-73.9851,40.7589),4326),2163)) < 400
AND pickup_datetime < '2016-01-01 14:00'
GROUP BY thirty_min ORDER BY thirty_min
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode

The above query produces the following results:

-[ RECORD 1 ]-+--------------------
thirty_min    | 2016-01-01 00:00:00
near_times_sq | 74
-[ RECORD 2 ]-+--------------------
thirty_min    | 2016-01-01 00:30:00
near_times_sq | 102
-[ RECORD 3 ]-+--------------------
thirty_min    | 2016-01-01 01:00:00
near_times_sq | 120
-[ RECORD 4 ]-+--------------------
thirty_min    | 2016-01-01 01:30:00
near_times_sq | 98
-[ RECORD 5 ]-+--------------------
thirty_min    | 2016-01-01 02:00:00
near_times_sq | 112
Enter fullscreen mode Exit fullscreen mode

2. pg_stat_statements 📊

What it does: Tracks execution statistics for all queries executed by a Postgres database. It'll help you debug queries, identify slow queries, and generally give you deeper information about how your queries are running.

Perfect for: Performance tuning and identifying slow queries.

Installation:

CREATE EXTENSION pg_stat_statements;
Enter fullscreen mode Exit fullscreen mode

Pg_stat_statements sample query

We’re interested in: “What are the top 5 I/O-intensive SELECT queries?”

SELECT query, calls, total_time, rows, shared_blks_hit, shared_blks_read
FROM pg_stat_statements
WHERE query LIKE 'SELECT%'
ORDER BY shared_blks_read DESC, calls DESC
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode

The output would look like this:


-[ RECORD 1 ]--—+---------------------------------------------------
query             | SELECT * FROM customer_data WHERE created_at > $1
calls             | 500
total_time        | 23000
rows              | 500000
shared_blks_hit   | 100000
shared_blks_read  | 75000
-[ RECORD 2 ]-----+---------------------------------------------------
query             | SELECT name, address FROM orders WHERE status = $1
calls             | 450
total_time        | 15000
rows              | 450000
shared_blks_hit   | 95000
shared_blks_read  | 55000
-[ RECORD 3 ]-----+---------------------------------------------------
query             | SELECT COUNT(*) FROM transactions WHERE amount > $1
calls             | 300
total_time        | 12000
rows              | 300000
shared_blks_hit   | 85000
shared_blks_read  | 50000
-[ RECORD 4 ]-----+---------------------------------------------------
query             | SELECT product_id FROM inventory WHERE quantity < $1
calls             | 400
total_time        | 16000
rows              | 400000
shared_blks_hit   | 80000
shared_blks_read  | 45000
-[ RECORD 5 ]-----+---------------------------------------------------
query             | SELECT * FROM user_logs WHERE user_id = $1 AND activity_date > $2
calls             | 350
total_time        | 17500
rows              | 350000
shared_blks_hit   | 75000
shared_blks_read  | 40000
Enter fullscreen mode Exit fullscreen mode

3. pgcrypto 🔒

What it does: Adds cryptographic functions to PostgreSQL for encryption, hashing, and more.

Perfect for: Secure storage of sensitive data and password management.

Installation:

CREATE EXTENSION pgcrypto;
Enter fullscreen mode Exit fullscreen mode

Pgcrypto sample query

Here’s how you might use pgcrypto to encrypt and decrypt data. Let’s say you want to store encrypted user passwords. First, you would encrypt a password when inserting it into a table:

INSERT INTO users (username, password) VALUES ('john_doe', crypt('my_secure_password', gen_salt('bf')));
Enter fullscreen mode Exit fullscreen mode

In this statement, crypt is a function provided by pgcrypto that encrypts the password using the Blowfish algorithm, which is indicated by gen_salt('bf')

Next, to authenticate a user, you would compare a stored password against one provided during login:

SELECT username FROM users WHERE username = 'john_doe' AND password = crypt('input_password'
Enter fullscreen mode Exit fullscreen mode

4. pg_partman 📑

What it does: Simplifies the creation and management of table partitions. Partitioning is a key database technique that involves splitting a large table into smaller, more manageable pieces while allowing you to access the data as if it were one table.

Perfect for: This automation is particularly useful for large, time-series datasets that can grow rapidly.

Installation:

CREATE EXTENSION pg_partman;
Enter fullscreen mode Exit fullscreen mode

Pg_partman sample query

Consider a scenario where you have a large table of IoT device data that you want to partition by day.

Here’s how you might set up a daily partition scheme for a table called device_data using pg_partman:

-- Create a parent table
CREATE TABLE device_data (
    time timestamptz NOT NULL,
    device_id int NOT NULL,
    data jsonb NOT NULL
);

-- Set up pg_partman to manage daily partitions of the device_data table
SELECT partman.create_parent('public.device_data', 'time', 'partman', 'daily');
Enter fullscreen mode Exit fullscreen mode

In this setup, create_parent is a function provided by pg_partman that takes the parent table name and the column to partition on (time), as well as the schema (partman) and the partition interval (daily).

5. postgres_fdw 🌉

What it does: Allows you to use a Foreign Data Wrapper to access tables on remote Postgres servers (hence the name "fdw"). A Foreign Data Wrapper lets you create proxies for data stored in other Postgres databases to query them as if they were coming from a table in the current database.

Perfect for: Distributed databases and multi-server setups.

Installation:

CREATE EXTENSION postgres_fdw;
Enter fullscreen mode Exit fullscreen mode

Postgres_fdw sample query

Here’s how you create a connection to your foreign server:

CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '123.45.67.8', dbname postgres, port '5432');
Enter fullscreen mode Exit fullscreen mode

This query connects to a database hosted on IP address 123.45.67.8, with the name postgres at port 5432. Now, create a user mapping so that users on your database can access the foreign server:

CREATE USER MAPPING FOR postgres 
SERVER myserver 
OPTIONS (user 'postgres', password 'password');
Enter fullscreen mode Exit fullscreen mode

6. pgvector 🔍

What it does: Adds support for vector operations in Postgres—enabling similarity search, nearest-neighbor search, and more.

Perfect for: Applications like recommendation systems, image retrieval, and semantic search.

Installation:

CREATE EXTENSION vector;
Enter fullscreen mode Exit fullscreen mode

Pgvector sample query
Say you want to find the most similar images to a given feature vector. Here’s how you might use pgvector to perform a nearest-neighbor search:

-- Assuming we have a table with image features stored as vectors
-- Table: image_features
-- Columns: id (integer), features (vector)

-- Given a query vector, find the 5 most similar images
SELECT id, features
FROM image_features
ORDER BY features <-> 'query_vector'::vector
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode

This query orders the results by the distance between the query_vector and the features column, effectively returning the closest matches

7. hstore 🔑

What it does: A key-value store within Postgres, that stores sets of key/value pairs in a single Postgres data type.

Perfect for: Semi-structured data with varying attributes that need fast indexing and flexible schema requirements without table alterations.

Installation:

CREATE EXTENSION hstore;
Enter fullscreen mode Exit fullscreen mode

Hstore sample query
Here’s an example of how you might use hstore to store and query product data with varying attributes:

-- Create a table with an hstore column for storing product attributes
CREATE TABLE products (
    id serial PRIMARY KEY,
    name text NOT NULL,
    attributes hstore
);

-- Insert a product with attributes into the table
INSERT INTO products (name, attributes)
VALUES ('Smartphone', 'color => "black", storage => "64GB", battery => "3000mAh"');

-- Query to find products with a specific attribute
SELECT name
FROM products
WHERE attributes @> 'storage => "64GB"';
Enter fullscreen mode Exit fullscreen mode

8. pgcre 📝

What it does: Integrates Perl Compatible Regular Expressions (PCRE) into PostgreSQL, providing advanced string-matching functionality beyond PostgreSQL's built-in regex capabilities.

Perfect for: Applications requiring sophisticated text analysis with complex pattern-matching needs like parsing logs, searching text, or validating string formats with advanced regex features.

Installation

CREATE EXTENSION pgpcre
Enter fullscreen mode Exit fullscreen mode

pgpcre sample query
If you want to search for email addresses in a column of unstructured text, you might use a PCRE pattern for matching emails as follows:

-- Assuming we have a table named messages with a column named content
-- Table: messages
-- Column: content (text)
-- Use pgpcre to match email addresses within the content
SELECT content, pcre_match('^\S+@\S+$', content) AS email
FROM messages
WHERE pcre_match('^\S+@\S+$', content) IS NOT NULL;
Enter fullscreen mode Exit fullscreen mode

Is TimescaleDB a PostgreSQL Extension?

Yes! TimescaleDB is another top ⭐️ extension to know. It adds a query and ingest performance boost to Postgres via automatic partitioning, query planner enhancements, improved materialized views, columnar compression, and much more.

What it does: Enhances Postgres for time-series workloads with features for real-time analytics and efficient data management.

Perfect for: Applications handling time-series data that need faster queries, better storage efficiency, and simplified continuous aggregations.

TimescaleDB example query
Suppose we have a sensor data table that records temperature readings from various devices every minute. We want to analyze the average temperature per hour for a specific device. Here’s how you might construct such a query using TimescaleDB:

-- Assuming we have a hypertable named sensor_readings with columns time, device_id, and temperature
-- Table: sensor_readings
-- Columns: time (TIMESTAMP WITH TIME ZONE), device_id (INT), temperature (DOUBLE PRECISION)

-- Calculate the average temperature per hour for device with ID 1
SELECT time_bucket('1 hour', time) AS one_hour_bucket, 
       AVG(temperature) AS avg_temperature
FROM sensor_readings
WHERE device_id = 1
GROUP BY one_hour_bucket
ORDER BY one_hour_bucket;
Enter fullscreen mode Exit fullscreen mode

Conclusion

PostgreSQL extensions offer incredible functionality with minimal setup. Most require just a single command to install, yet can dramatically enhance what your database can do.

What's your favorite PostgreSQL extension? Have you tried any of these? Let me know in the comments!

Top comments (0)