Raunak Jain

Posted on Feb 22

How to persist data in a dockerized postgres database using volumes?

#docker #devops #volume

When you run a Postgres database in Docker, it is very important to keep your data safe even if the container stops or is removed. In this article, we will learn how to persist data in a dockerized Postgres database using Docker volumes. I will use simple words and short sentences so it is easy to follow. This guide is written for beginners who work with Docker and databases.

Introduction

Postgres is a popular database. Many people use it to store application data. When you run Postgres in a Docker container without naming a volume, the data is tied to that one container: the official image puts it in an anonymous volume that is hard to find and easy to delete. If the container is removed and replaced, the new container starts with an empty database. That is why data persistence is very important.

Docker volumes let you store data outside the container. This means that even if you delete the container, your data remains safe. For more on this idea, you can read an article on understanding Docker volumes that explains the basics.

In this guide, we will cover:

What Docker volumes are and why they are needed.
How to create and use volumes with Postgres.
How to use Docker Compose to manage your Postgres container with persistent data.
Best practices for data persistence.

Before we start, it is useful to have a basic understanding of Docker. If you are new to Docker, you might find an introductory article on Docker helpful.

Why Persist Data in a Dockerized Postgres Database?

When you run Postgres in a container, anything written to the container’s own file system lives only as long as that container. Containers are designed to be ephemeral. When you upgrade to a new image or recreate the container, you get a fresh file system and the old files are gone. By using volumes, you can make your data persist independently of the container lifecycle.

There are many benefits to data persistence:

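Safety: Data remains even when the container is deleted.
Portability: You can attach the same volume to a new container. Moving the data to another host takes a backup and a restore, because a local volume lives on one machine.
Backup: It is easier to back up the data stored in volumes.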

Persisting data is very important in production. For a detailed step-by-step walkthrough, you can check a guide on creating and using Docker volumes that gives you more examples.

What Are Docker Volumes?

Docker volumes are storage areas that exist outside of the container’s writable layer. They are managed by Docker and are independent of the container lifecycle. Volumes are the recommended way to persist data.

Here are some key points about volumes:

They are stored on the host machine, in an area managed by Docker (on Linux, usually under /var/lib/docker/volumes).
They can be easily backed up or moved.
They work across container updates and removals.

When you use volumes, the data for your Postgres database will be stored on your host and will not be deleted when the container is removed.

For a deeper look at how volumes work, you can see more details on listing and inspecting Docker volumes.
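The points above correspond to a few docker volume subcommands. Here is a small sketch that wraps them in a shell function. The function name show_volume is made up for this example, and pgdata is simply the volume name used in the rest of this guide.

```shell
# Hypothetical helper: create a named volume if needed, then show where Docker keeps it.
show_volume() {
  name="${1:-pgdata}"                       # volume name; pgdata is only a default
  docker volume create "$name" >/dev/null   # does nothing harmful if the volume exists
  docker volume ls --filter "name=$name"    # list volumes whose name matches
  docker volume inspect --format '{{ .Mountpoint }}' "$name"  # path where the data lives
}
```

Calling show_volume with no argument uses pgdata. On Linux the printed path is usually under /var/lib/docker/volumes. On Docker Desktop that path is inside Docker's virtual machine, not directly on your own disk.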

Setting Up a Dockerized Postgres Database

Let us start by creating a simple Docker command to run a Postgres container. Open your terminal and run the following command:

docker run --name my_postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres

This command does the following:

Runs a container from the official Postgres image.
Sets the password for the Postgres user.
Runs the container in detached mode.

At this point, the database is running. However, this container has no named volume. If you remove the container and start a new one, the new database will be empty.

Using Volumes for Data Persistence

To persist data, you need to add a volume to the container. You can do this with the -v flag. Here is an example:

docker run --name my_postgres -e POSTGRES_PASSWORD=mysecretpassword -d -v pgdata:/var/lib/postgresql/data postgres

Let us break down the new part of this command:

The flag -v pgdata:/var/lib/postgresql/data tells Docker to create (or use an existing) volume named pgdata.
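/var/lib/postgresql/data is the directory where the official Postgres image has stored its database files by default. Newer major versions of the image may use a different layout, so check the image documentation for the tag you use, and consider pinning a specific tag instead of relying on the latest one.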

Now, any data that Postgres writes to this directory is stored in the pgdata volume. Even if the container is deleted, the volume will still exist and keep your data safe.

If you want to learn more about how to mount host directories or volumes, you can read an article on mounting host directories to docker containers.
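A quick way to convince yourself that the volume works is to write a row, remove the container, start a new one on the same volume, and read the row back. The sketch below does that. The function names wait_ready and check_persistence and the notes table are made up for this example. It assumes the my_postgres container from the command above is already running.

```shell
# Hypothetical smoke test: prove that rows survive removing the container.
wait_ready() {
  # pg_isready ships with Postgres; retry until the server accepts connections
  until docker exec my_postgres pg_isready -U postgres >/dev/null 2>&1; do
    sleep 1
  done
}

check_persistence() {
  wait_ready
  # Write one row into a throwaway table
  docker exec my_postgres psql -U postgres \
    -c "CREATE TABLE IF NOT EXISTS notes (body text);" \
    -c "INSERT INTO notes VALUES ('still here');"
  # Remove the container; the pgdata volume is left in place
  docker rm -f my_postgres
  # Start a brand new container on the same volume
  docker run --name my_postgres -e POSTGRES_PASSWORD=mysecretpassword -d \
    -v pgdata:/var/lib/postgresql/data postgres
  wait_ready
  # Read the row back from the new container
  docker exec my_postgres psql -U postgres -c "SELECT body FROM notes;"
}
```

If the last query prints the row still here, the data survived the container removal. Running the function more than once adds one more row each time.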

Using Docker Compose for Better Management

For development and production, it is common to use Docker Compose. With Docker Compose, you can manage multiple containers and set up volumes in a clear and organized way.

Create a file called docker-compose.yml with the following content:

version: "3.8"
services:
  postgres:
    image: postgres
    container_name: my_postgres
    environment:
      - POSTGRES_PASSWORD=mysecretpassword
    volumes:
      - pgdata:/var/lib/postgresql/data
    ports:
      - "5432:5432"

volumes:
  pgdata:

This file defines a Postgres service with:

The Postgres image.
A volume named pgdata mounted to the directory where Postgres stores data.
Port mapping so you can access the database from your host on port 5432.

Using Docker Compose makes it easier to start, stop, and manage your containers. You can start the service by running:

docker-compose up -d

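And stop it with the command below. It removes the container but keeps the named volume. Be careful with the -v flag: docker-compose down -v also deletes the volume and the data in it.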

docker-compose down

A guide on writing a simple docker-compose yml file can help you learn more about this process.
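One detail to be aware of: Compose usually creates the volume under a longer name, with the project name (by default the folder name) in front, so the volume may not be called exactly pgdata. The sketch below looks up the real name. It assumes that Compose labels the volumes it creates with com.docker.compose.volume, which is the behaviour I have seen. The function name compose_volume_name is made up for this example.

```shell
# Hypothetical helper: print the real name of a Compose-managed volume.
compose_volume_name() {
  # -q prints only names; the label filter matches the key used in docker-compose.yml
  docker volume ls -q --filter "label=com.docker.compose.volume=${1:-pgdata}"
}
```

Use the printed name whenever you refer to the volume outside of Compose, for example in a docker run command.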

Backing Up and Restoring Your Data

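One benefit of using volumes is that they can be easily backed up. To back up your Postgres data, you can use the docker run command with volume mounting. Stop the Postgres container first, because copying the files of a running database can produce a backup that does not restore cleanly. For example: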

docker run --rm -v pgdata:/data -v $(pwd):/backup alpine tar czvf /backup/pgdata_backup.tar.gz -C /data .

This command does the following:

Runs a temporary Alpine container.
Mounts the pgdata volume to /data inside the container.
Mounts the current directory to /backup inside the container.
Creates a compressed backup file of the volume.

To restore the backup, stop the Postgres container and run a similar command in reverse. Restore into an empty volume so that old and new files do not mix:

docker run --rm -v pgdata:/data -v $(pwd):/backup alpine sh -c "cd /data && tar xzvf /backup/pgdata_backup.tar.gz"

Backing up your data is crucial for production environments. With volumes, backups become much more manageable.
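If you cannot stop the database, a logical backup is a common alternative to copying files. The sketch below uses pg_dumpall and psql, which ship with Postgres. The function names dump_all and restore_all and the file name pg_backup.sql are made up for this example. It assumes a running container named my_postgres and the default postgres user.

```shell
# Hypothetical helpers for a logical (SQL) backup of the running my_postgres container.
dump_all() {
  # pg_dumpall writes every database and role as SQL to stdout; save it on the host
  docker exec my_postgres pg_dumpall -U postgres > "${1:-pg_backup.sql}"
}

restore_all() {
  # -i keeps stdin open so psql inside the container can read the dump file
  docker exec -i my_postgres psql -U postgres < "${1:-pg_backup.sql}"
}
```

A dump like this is plain SQL, so it can also be loaded into a different Postgres version, which a copy of the raw data files generally cannot.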

Best Practices for Data Persistence with Postgres

When working with dockerized databases, keep these best practices in mind:

Use Named Volumes: Always use named volumes instead of anonymous ones. This makes it easier to manage and back up your data. The guide on creating and using Docker volumes can help you understand this better.
Keep Your Data Separate: Do not store your database data in the container’s file system. Always use volumes to ensure data persistence.
Regular Backups: Schedule regular backups of your volumes. This way, you can restore data in case of failure.
Monitor Your Volumes: Use Docker commands to list and inspect your volumes periodically. It helps in managing storage and detecting issues early.
Security: Protect your volumes by setting proper permissions and using secure storage solutions when needed.

By following these practices, you can maintain a robust and reliable database in your Docker environment.
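For the regular backups mentioned above, it helps if every backup file gets its own name. Here is a minimal sketch of a function that you could call from a scheduled job. The function name backup_volume and the file name pattern are made up for this example. As with any file-level copy, it is safest to run it while the Postgres container is stopped.

```shell
# Hypothetical helper: archive a volume into a timestamped tar.gz file.
backup_volume() {
  vol="${1:-pgdata}"             # volume to back up
  dest="${2:-$PWD}"              # host directory that receives the archive
  stamp=$(date +%Y%m%d-%H%M%S)   # for example 20250222-031500
  # Mount the volume read-only so the backup job cannot change the data
  docker run --rm -v "$vol":/data:ro -v "$dest":/backup alpine \
    tar czf "/backup/${vol}_${stamp}.tar.gz" -C /data .
}
```

Called as backup_volume pgdata /some/backup/dir, it writes one new archive per run and never overwrites an older one. Old archives still have to be cleaned up separately.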

Troubleshooting Common Issues

Sometimes, you may run into issues when persisting data. Here are some common problems and solutions:

Volume Not Persisting Data

If you see that data is not being saved, check that:

The volume is mounted to the correct directory (/var/lib/postgresql/data).
The container is using the correct volume name.
You have not accidentally overridden the volume mount in your Docker Compose file or run command.
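To check these points, you can ask Docker which mounts a container really has. A minimal sketch, where the function name show_mounts is made up for this example and my_postgres is the container name used earlier:

```shell
# Hypothetical helper: print a container's mounts as JSON.
show_mounts() {
  # Look at Type, Name and Destination for each mount in the output
  docker inspect --format '{{ json .Mounts }}' "${1:-my_postgres}"
}
```

In the output, a named volume shows its name (such as pgdata) in the Name field. A long random-looking name means the container is using an anonymous volume instead.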

Permission Problems

Postgres may have issues with file permissions on the volume. To fix this:

Ensure that the volume’s permissions match what Postgres expects.
You might need to adjust the user or group settings in your container.
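To see what the permissions look like, you can inspect the volume from a throwaway container. This is only a sketch, and the function name volume_owner is made up for this example. It prints the numeric owner, group, and mode of the top directory of the volume.

```shell
# Hypothetical helper: show numeric uid:gid and mode of a volume's top directory.
volume_owner() {
  # stat -c takes a format string: %u = user id, %g = group id, %a = octal mode
  docker run --rm -v "${1:-pgdata}":/data alpine stat -c '%u:%g %a' /data
}
```

The owner should match the user that Postgres runs as inside your image. As far as I know that is uid 999 in the Debian-based official image and uid 70 in the Alpine variant, but check your own image to be sure.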

Backup and Restore Failures

If your backup or restore commands fail:

Double-check the paths you use in your docker run commands.
Make sure that the volume is not being used by another container during the backup.

For more help on troubleshooting, you can read articles on how to list and inspect Docker volumes.

Real-World Example: Using a Dockerized Postgres with Persistent Data

Imagine you are building an application that uses Postgres to store user data. You need to make sure that every time you update your application, the data remains intact. By using Docker volumes, you can safely upgrade your containers without data loss.

Here is a step-by-step scenario:

Create the Docker Compose File:

Write the docker-compose.yml as shown earlier. This file defines the Postgres service and the named volume pgdata.

Start Your Services:

Run the command:

   docker-compose up -d

This command will start the Postgres container with data persisted in the pgdata volume.

Connect Your Application:

Your application can connect to Postgres on localhost:5432 (or the appropriate network address if you are using Docker networks).

Test Data Persistence:

Add some data using your application. Then, stop and remove the container with:

   docker-compose down

Restart the service with:

   docker-compose up -d

Your data should still be there because it was stored in the volume.

Back Up the Data:

Run the backup command provided earlier to secure your data. Compose puts the project name (by default the folder name) in front of volume names, so the volume may be called something like myproject_pgdata. Check the exact name with docker volume ls and use it in the backup command.

This process shows a typical workflow for using volumes to persist data in a dockerized Postgres database.

Additional Considerations

Environment Variables: When running Postgres, you can set environment variables (like POSTGRES_PASSWORD, POSTGRES_USER, and POSTGRES_DB) to customize your database. The image reads them only when it initializes an empty data directory. Once the volume holds a database, changing these values has no effect, so keep them consistent across container restarts.
Scaling Out: If you plan to scale your database or run it in a production environment, consider using external storage solutions or managed volume plugins. These can offer better performance and high availability.
Using Docker Networks: In a multi-container environment, make sure your containers are on the same Docker network. This allows your application to connect to Postgres seamlessly.
Learning More: Docker is a powerful tool and learning more about its capabilities can help you improve your deployments. For example, reading about mounting host directories may provide additional insights into managing storage.
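The environment variables and the network setup can be combined in one docker run command. Here is a minimal sketch. All names in it (start_on_network, app_net, app_postgres, appdata, the user app, and the database appdb) are made up for this example. It uses a fresh volume on purpose, because the variables only take effect when the data directory is still empty.

```shell
# Hypothetical helper: start Postgres on a user-defined network with a custom user and database.
start_on_network() {
  # Create the network; ignore the error if it already exists
  docker network create app_net 2>/dev/null || true
  # POSTGRES_USER and POSTGRES_DB are applied only on first start of an empty volume
  docker run --name app_postgres --network app_net -d \
    -e POSTGRES_USER=app \
    -e POSTGRES_PASSWORD=mysecretpassword \
    -e POSTGRES_DB=appdb \
    -v appdata:/var/lib/postgresql/data \
    postgres
}
```

Other containers started with --network app_net can then reach the database by using the container name app_postgres as the host name, on port 5432.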

Conclusion

Persisting data in a dockerized Postgres database using volumes is an essential skill. With Docker volumes, you can store your database files outside the container, ensuring that your data is safe even if the container is removed or updated.

In this guide, we learned:

The importance of data persistence when running a Postgres container.
What Docker volumes are and why they are used.
How to run a Postgres container with a volume using the -v flag.
How to use Docker Compose to manage your Postgres container and set up persistent storage.
Best practices and troubleshooting tips to ensure your data remains safe and accessible.
Backup and restore techniques to secure your data further.

By following these steps, you can create a reliable and robust Postgres database that is dockerized and has persistent storage. This makes it much easier to update your applications without worrying about data loss. As you gain more experience with Docker and Postgres, you will find that these methods can be adapted to many different scenarios.

Remember, data persistence is not only about keeping your data safe. It also makes your deployments more flexible and easier to manage. For more detailed information on the basics of Docker and how it can improve your development workflow, check out our introduction to Docker article.

I hope this guide helps you understand how to persist data in a dockerized Postgres database using volumes. Keep practicing, testing, and refining your setup. Happy containerizing and good luck with your projects!

DEV Community