By the end of this blog you will understand the basics of PostgreSQL architecture and where it store its data especially databases and tables, we will walk through a live example in which we will create a database and a table, and then navigate to where they are stored.
Postgres Installation:
To follow this blog you should have PostgreSQL already installed on your machine, you can install it from here (but pay attention to where you install PostgreSQL, the default option will be on C:\Program Files\PostgreSQL\version, don't forget to add the path of /bin folder to your environment variables to be able to access PostgreSQL from your terminal using psql.
Creating a database with a table:
We will first create a database called test_db with a table called users, we will make it as simple as possible:
postgres=# CREATE DATABASE test_db;
CREATE DATABASE
postgres=# \c test_db
test_db=# CREATE TABLE users (id SERIAL PRIMARY KEY, username TEXT);
CREATE TABLE
Then insert a row (tuple) into users table:
test_db=# INSERT INTO users (username) VALUES ('ahmed hisham');
INSERT 0 1
A database cluster:
PostgreSQL runs on a single host and manages one database cluster (a database cluster is a collection of databases), as you can see below, each database contains its tables.
In PostgreSQL all objects including (databases and tables which are refered to as objects) are managed by a respective object identifiers (OIDs), this is a unique value for each object. Each database is a subdirectory named by its oid. PostgreSQL stores its databases in the base folder PostgreSQL/YOUR_VERSION/data/base. The parent directory (data) or PGDATA has the following architecture:
Each database is a subdirectory in /base folder and it is named by its oid as described above, tables are files stored inside database subdirectory, and named by its oid (table oid) as well, so our test_db is inside /base directory, and users table is a file inside test_db subdirectory.
Our mission is to move to users table file and check if the row we inserted above to users table is really stored there. But wait, all databases subdirectories are named by their oid and we don't know the oid of test_db as if you ls inside base subdirectory you will get the following output (list of database oids):
ahmeduser@ahmed:/usr/local/pgsql-12/bin/data/base$ ls
1 12674 12675 32768 40960 49152 pgsql_tmp
So to get test_db subdirectory oid we need to know its oid, to do that run the following command:
test_db=# SELECT datname, oid FROM pg_database where datname='test_db';
datname | oid
---------+-------
test_db | 49152
(1 row)
The above command go to pg_database which is a system catalog that stores information about databases and their corresponding oids. We need to do the same for users table and get its oid:
test_db=# SELECT relname, oid FROM pg_class where relname='users';
relname | oid
---------+-------
users | 57346
(1 row)
pg_class is the same as pg_database but for tables.
Now let's navigate to test_db subdirectory using its oid 49152:
ahmeduser@ahmed:/usr/local/pgsql-12/bin/data/base$ cd 49152
ahmeduser@ahmed:/usr/local/pgsql-12/bin/data/base/49152$
Then open users table file using any editor you like, I will use vim in my linux terminal:
ahmeduser@ahmed:/usr/local/pgsql-12/bin/data/base/49152$ vim 57346
You should get a similar result, you can notice that the username value I inserted was ahmed hisham which appears at the bottom of the file (note: these characters @^ represent free spaces, free spaces are necessary for PostgreSQL to manage how to store and scan data inside table files).
We are done!, you should by now have an idea of where and how PostgreSQL stores databases and tables, I hope that was useful, feel free to ask any question.
Top comments (0)