Hi there! I'm Shrijith Venkatrama, the founder of Hexmos. Right now, I’m building LiveAPI, a super-convenient tool that simplifies engineering workflows by generating awesome API docs from your code in minutes.
In this tutorial series, I am building DBChat for myself - a simple tool for using AI chat to explore and evolve databases.
See previous posts to get more context:
- Building DBChat - Explore and Evolve Your DB with Simple Chat (Part 1)
- DBChat: Getting a Toy REPL Going in Golang (Part 2)
## Problem: How Best to Specify the Databases We Want to Work With?
My initial thought was to support a `connect <connection_string>` command in the REPL: the user starts the REPL and enters the connection details directly.

But on further consideration, giving databases friendly names looked like the better approach.
So, I settled on the following approach for now:

- `~/.dbchat.toml` - a configuration file for DBChat in the home folder. Initially it will be simple: just a `[connections]` section listing various database URLs.
- A new `connect` command within the shell. One can connect using either a stored connection name or a literal connection string; that is, both `connect <name from .dbchat.toml>` and `connect <connection literal>` will be supported.
In the next sections, I will explain how these two pieces were implemented.
## Configuring Database Connections in `~/.dbchat.toml`
The sample configuration is initially assumed to be something like this:
```toml
# DBChat Sample Configuration File
# Copy this file to ~/.dbchat.toml and modify as needed

[connections]
# Format: name = "connection_string"
local = "postgresql://postgres:postgres@localhost:5432/postgres"
liveapi = "postgresql://user:password@ip:port/database_name"
```
I implemented `cmd/dbchat/utils/config.go` to read and list connections from the configuration, like so:
```go
package utils

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/BurntSushi/toml"
)

// Config holds the application configuration
type Config struct {
	Connections map[string]string `toml:"connections"`
}

// LoadConfig reads the configuration from ~/.dbchat.toml
func LoadConfig() (*Config, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return nil, fmt.Errorf("error getting home directory: %v", err)
	}

	configPath := filepath.Join(home, ".dbchat.toml")

	var config Config
	if _, err := toml.DecodeFile(configPath, &config); err != nil {
		// Return empty config if file doesn't exist
		if os.IsNotExist(err) {
			return &Config{Connections: make(map[string]string)}, nil
		}
		return nil, fmt.Errorf("error reading config file: %v", err)
	}

	return &config, nil
}

// ListConnections returns a formatted string of all configured connections
func ListConnections(config *Config) string {
	if len(config.Connections) == 0 {
		return "No connections configured in ~/.dbchat.toml"
	}

	var sb strings.Builder
	for name := range config.Connections {
		sb.WriteString(fmt.Sprintf("- %s\n", name))
	}
	return sb.String()
}
```
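For context, here is a minimal sketch of how the REPL handler might load this configuration at startup so that later commands (like `connect` below) can resolve friendly names. The `Handler` struct, the `repl` package name, and the import path are assumptions for illustration, not the exact code from the repository:

```go
package repl

import (
	"database/sql"

	"dbchat/cmd/dbchat/utils" // import path is an assumption
)

// Handler holds REPL state; the field names mirror the connect case shown
// below (h.config, h.db), but the exact struct is illustrative only.
type Handler struct {
	config *utils.Config
	db     *sql.DB
}

// NewHandler loads ~/.dbchat.toml once at startup so that
// `connect <name>` can look up stored connection strings later.
func NewHandler() (*Handler, error) {
	cfg, err := utils.LoadConfig()
	if err != nil {
		return nil, err
	}
	return &Handler{config: cfg}, nil
}
```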
## The `connect` Command Implementation

There are two variations supported in the `connect` command:

- `connect <connection name>`
- `connect <literal connection>`

So, in the REPL Eval handling, I added a new case for handling `connect`:
```go
case cmd == "connect":
	if len(fields) == 1 {
		return `Usage: connect <connection_string | connection_name>
You can either provide a full connection string:
Example: connect postgresql://username:password@localhost:5432/dbname
Format: postgresql://[user]:[password]@[host]:[port]/[dbname]?[params]
Or use a connection name from ~/.dbchat.toml:
Example: connect local
Available connections in config:
` + utils.ListConnections(h.config)
	}

	connectionStr := strings.Join(fields[1:], " ")

	// Check if the argument matches a configured connection name
	if configStr, exists := h.config.Connections[connectionStr]; exists {
		connectionStr = configStr
	}

	// Close existing connection if it exists
	if h.db != nil {
		h.db.Close()
	}

	// Connect using the connection string
	newDb, err := db.Connect(connectionStr)
	if err != nil {
		return fmt.Sprintf("Failed to connect: %v", err)
	}
	h.db = newDb
	return "Successfully connected to PostgreSQL database! 🎉"
```
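The `db.Connect` helper used above isn't shown in this post. As a rough sketch, assuming the standard `lib/pq` driver, it could look something like this:

```go
package db

import (
	"database/sql"
	"fmt"

	_ "github.com/lib/pq" // PostgreSQL driver (assumed); registers itself with database/sql
)

// Connect opens a PostgreSQL connection and verifies it with a ping.
// This is a sketch of the helper used by the connect command, not the
// exact implementation from the repository.
func Connect(connStr string) (*sql.DB, error) {
	conn, err := sql.Open("postgres", connStr)
	if err != nil {
		return nil, fmt.Errorf("error opening connection: %v", err)
	}
	if err := conn.Ping(); err != nil {
		conn.Close()
		return nil, fmt.Errorf("error pinging database: %v", err)
	}
	return conn, nil
}
```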
## The `dump schema` Command To Get Database Context
The most important action for us is to get the whole schema of the connected database. The schema matters because we can use it to inform the LLM layer later, and that context will help the LLM generate useful and accurate SQL queries.
So I created `cmd/dbchat/db/schema.go`:
```go
package db

import (
	"database/sql"
	"fmt"
	"strings"
)

const schemaQuery = `
WITH tables AS (
    SELECT
        t.table_schema,
        t.table_name,
        t.table_type,
        obj_description((quote_ident(t.table_schema)||'.'||quote_ident(t.table_name))::regclass, 'pg_class') as table_comment
    FROM information_schema.tables t
    WHERE t.table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY t.table_schema, t.table_name
),
columns AS (
    SELECT
        c.table_schema,
        c.table_name,
        c.column_name,
        c.data_type,
        c.is_nullable,
        c.column_default,
        col_description((quote_ident(c.table_schema)||'.'||quote_ident(c.table_name))::regclass,
                        c.ordinal_position) as column_comment
    FROM information_schema.columns c
    WHERE c.table_schema NOT IN ('pg_catalog', 'information_schema')
    ORDER BY c.table_schema, c.table_name, c.ordinal_position
)
SELECT
    t.table_schema,
    t.table_name,
    t.table_type,
    t.table_comment,
    string_agg(
        format(
            '  %s %s%s%s%s',
            c.column_name,
            c.data_type,
            CASE WHEN c.is_nullable = 'NO' THEN ' NOT NULL' ELSE '' END,
            CASE WHEN c.column_default IS NOT NULL THEN ' DEFAULT ' || c.column_default ELSE '' END,
            CASE WHEN c.column_comment IS NOT NULL THEN ' -- ' || c.column_comment ELSE '' END
        ),
        E'\n'
    ) as columns
FROM tables t
LEFT JOIN columns c
    ON c.table_schema = t.table_schema
    AND c.table_name = t.table_name
GROUP BY t.table_schema, t.table_name, t.table_type, t.table_comment
ORDER BY t.table_schema, t.table_name;
`

// DumpSchema returns a formatted string containing the database schema
func DumpSchema(db *sql.DB) (string, error) {
	if db == nil {
		return "", fmt.Errorf("database connection required. Use 'connect' command first")
	}

	rows, err := db.Query(schemaQuery)
	if err != nil {
		return "", fmt.Errorf("error querying schema: %v", err)
	}
	defer rows.Close()

	var sb strings.Builder
	var currentSchema string

	for rows.Next() {
		var (
			schema, tableName, tableType, columns string
			tableComment                          sql.NullString
		)
		if err := rows.Scan(&schema, &tableName, &tableType, &tableComment, &columns); err != nil {
			return "", fmt.Errorf("error scanning row: %v", err)
		}

		// Print schema header if we're entering a new schema
		if schema != currentSchema {
			if currentSchema != "" {
				sb.WriteString("\n")
			}
			sb.WriteString(fmt.Sprintf("Schema: %s\n", schema))
			sb.WriteString(strings.Repeat("=", len(schema)+8) + "\n\n")
			currentSchema = schema
		}

		// Print table information
		sb.WriteString(fmt.Sprintf("Table: %s (%s)\n", tableName, tableType))
		if tableComment.Valid {
			sb.WriteString(fmt.Sprintf("Comment: %s\n", tableComment.String))
		}
		sb.WriteString("Columns:\n")
		sb.WriteString(columns)
		sb.WriteString("\n\n")
	}

	if err = rows.Err(); err != nil {
		return "", fmt.Errorf("error iterating rows: %v", err)
	}

	return sb.String(), nil
}
```
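Hooking this into the REPL mirrors the `connect` case above. A sketch of what the Eval branch might look like (the exact command parsing is an assumption, since this post doesn't show that case):

```go
case cmd == "dump" && len(fields) > 1 && fields[1] == "schema":
	// Dump the schema of the currently connected database.
	schemaDump, err := db.DumpSchema(h.db)
	if err != nil {
		return fmt.Sprintf("Failed to dump schema: %v", err)
	}
	return schemaDump
```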
## The Demo: Configure, Connect and Dump Schema From A PostgreSQL Database
## Next Steps
Since we now have the database connection and schema dump mechanisms in place, we are ready to move on to the next stage: LLM integration.

In the next stage, we will combine natural-language user queries with the database schema to generate SQL queries.

In the final stage, the generated query must also be executed to produce results, but that is not an immediate concern.
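As a rough preview of that stage, the prompt for the LLM could simply be assembled from the schema dump and the user's question. The `buildPrompt` helper below is hypothetical and only meant to illustrate the idea:

```go
package llm

import "fmt"

// buildPrompt is a hypothetical helper for the upcoming LLM stage: it combines
// the schema dump with the user's natural-language request into one prompt.
func buildPrompt(schema, userQuestion string) string {
	return fmt.Sprintf(
		"You are a PostgreSQL assistant.\n\nDatabase schema:\n%s\n\nUser request: %s\n\nRespond with a single SQL query.",
		schema, userQuestion,
	)
}
```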
Like, subscribe, or share to encourage the development of this series of posts. Thanks for reading.