In database management, primary keys are crucial for ensuring data integrity and establishing relationships between tables. However, sometimes, tables may be created without primary keys, which can lead to data anomalies. In this article, we’ll explore a Bash script that connects to a PostgreSQL instance, iterates through its databases, and identifies tables that lack primary keys.
Overview of the Script
The script is designed to:
- Connect to a PostgreSQL database instance.
- Retrieve a list of databases, excluding system databases.
- For each database, check for tables that do not have primary keys.
- Output the results.
Prerequisites
To run the script, ensure you have:
- Bash installed.
- Access to a PostgreSQL instance with sufficient permissions to query system catalogs.
- The psql command-line tool available.
The Script
#!/bin/bash
###################################################################################
#
# iterate_all_the_DBs_on_the_DB_instance_and_find_the_tables_with_no_pks_access.sh
#
# This scripts iterates databases and
# find the tables with no primary keys
#
# Date: 10-Jan-2024
#
# Author: Dmitry
#
###################################################################################
trim() {
local var="$*"
# remove leading whitespace characters
var="${var#"${var%%[![:space:]]*}"}"
# remove trailing whitespace characters
var="${var%"${var##*[![:space:]]}"}"
printf '%s' "$var"
}
export PGCONNECT_TIMEOUT=15
export DB_HOST="localhost"
export DB_PORT="5432"
the_query_number_of_dbs_to_process="
select count(M.*)
from
(select datname
from pg_database
where datname not in ('postgres', 'template0', 'template1')
order by 1) M
"
the_query_dbs_to_process="
select M.*
from
(select datname
from pg_database
where datname not in ('postgres', 'template0', 'template1')
order by 1) M
"
num_of_dbs_to_process=$(psql -h $DB_HOST -p $DB_PORT -U postgres -t -c "$the_query_number_of_dbs_to_process" postgres)
num_of_dbs_to_process=`trim $num_of_dbs_to_process`
echo "The number of DBs to process: $num_of_dbs_to_process"
export the_yyyymmdd=$(date '+%Y%m%d')
export hh24miss=$(date '+%H%M%S')
idx=1
for DBs_to_process in $(psql -h $DB_HOST -p $DB_PORT -U postgres -t -c "$the_query_dbs_to_process" postgres); do
nTablesWithoutPKs=$(psql -h $DB_HOST -p $DB_PORT -U postgres -d $DBs_to_process -t << EOF
select count(1)
from information_schema.tables tab
left join information_schema.table_constraints tco
on tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null;
EOF
)
nTablesWithoutPKs=`trim $nTablesWithoutPKs`
if [ "$nTablesWithoutPKs" -gt "0" ]; then
the_output=$(psql -h $DB_HOST -p $DB_PORT -U postgres -d $DBs_to_process -t << EOF
select tab.table_schema,
tab.table_name
from information_schema.tables tab
left join information_schema.table_constraints tco
on tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null
order by table_schema,
table_name;
EOF)
echo "DB: $DBs_to_process"
echo "$the_output"
fi
idx=$(( idx + 1 ))
done
echo " "
echo "End"
echo " "
The Script Breakdown
Here’s a detailed breakdown of the script:
1. Setup and Configuration
export PGCONNECT_TIMEOUT=15
export DB_HOST="localhost"
export DB_PORT="5432"
The script starts by setting connection parameters for the PostgreSQL instance, including the host and port.
2. SQL Queries for Database Retrieval
the_query_number_of_dbs_to_process="
select count(M.*)
from
(select datname
from pg_database
where datname not in ('postgres', 'template0', 'template1')
order by 1) M
"
the_query_dbs_to_process="
select M.*
from
(select datname
from pg_database
where datname not in ('postgres', 'template0', 'template1')
order by 1) M
"
Two SQL queries are defined:
- One to count the number of databases to process.
- The other to list the names of those databases, excluding the default system databases.
3. Counting and Processing Databases
num_of_dbs_to_process=$(psql -h $DB_HOST -p $DB_PORT -U postgres -t -c "$the_query_number_of_dbs_to_process" postgres)
num_of_dbs_to_process=$(trim "$num_of_dbs_to_process")
echo "The number of DBs to process: $num_of_dbs_to_process"
The script executes the first query to get the number of databases and trims any whitespace from the output.
4. Iterating Through Each Database
for DBs_to_process in $(psql -h $DB_HOST -p $DB_PORT -U postgres -t -c "$the_query_dbs_to_process" postgres); do
The script iterates over each database and performs the following checks:
5. Checking for Tables Without Primary Keys
nTablesWithoutPKs=$(psql -h $DB_HOST -p $DB_PORT -U postgres -d $DBs_to_process -t << EOF
select count(1)
from information_schema.tables tab
left join information_schema.table_constraints tco
on tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null;
EOF
)
This query checks for base tables in the database that lack a primary key. If such tables are found, the script collects their schema and names:
the_output=$(psql -h $DB_HOST -p $DB_PORT -U postgres -d $DBs_to_process -t << EOF
select tab.table_schema,
tab.table_name
from information_schema.tables tab
left join information_schema.table_constraints tco
on tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null
order by table_schema,
table_name;
EOF)
6. Outputting the Results
if [ "$nTablesWithoutPKs" -gt "0" ]; then
echo "DB: $DBs_to_process"
echo "$the_output"
fi
Conclusion
This script serves as a practical tool for database administrators and developers to quickly identify tables lacking primary keys in their PostgreSQL databases. Regularly auditing your databases for such issues can help maintain data integrity and ensure a robust database design.
Feel free to modify the script according to your specific needs, such as changing database credentials or extending functionality to include other types of constraints.
Top comments (0)