DEV Community

Thiago

GOCDC and Postgres

Change Data Capture (CDC) and Postgres

Recently, I wrote this post, where I show you how to set up Change Data Capture (CDC) in Postgres. So, if you don't know how to set up CDC in Postgres, I totally recommend taking a quick look at that tutorial first.

GOCDC

GoCdc is an open-source API for data streaming, written (well, still in development) in Go, that I would like to share with you. In short, the concept behind it is similar to Debezium, but in Go, which in my opinion makes it easier for anyone to hack on and adapt to their own needs. The focus of GoCdc is simplicity: simple to set up, simple to modify, and so on.

Hands-on πŸ› οΈ

Required:
Docker -> https://www.docker.com/get-started

1 - Creating the Project

First of all, create a docker-compose.yml file in the directory of your preference.

version: "3"
services:
  gocdc:
    image: "133thiago/gocdc:latest"
    ports:
      - "8000:8000"

  db:
    image: "postgres:11"
    container_name: "my_postgres"
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=example_db
    ports:
      - "5432:5432"
    command:
      - "postgres"
      - "-c"
      - "wal_level=logical"
    volumes:
      - my_dbdata:/var/lib/postgresql/data

  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"

  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

volumes:
  my_dbdata:

2 - Running in Docker

Now run docker-compose up -d, and then docker ps. You should see the containers up and running.

3 - Postgres Setup

As I said at the beginning of this post, this step is covered by my previous tutorial -> [How to use Change Data Capture (CDC) with Postgres](https://dev.to/thiagosilvaf/how-to-use-change-database-capture-cdc-in-postgres-37b8).

IMPORTANT: Just make sure you either create your database as example_db, matching the value of POSTGRES_DB in our docker-compose.yml, or change that value in the docker-compose.yml to the database that you created. It is up to you.

4 - Kafka config:

Now, it's time to create our Kafka topic, which is where our connector will send the database changes.
First, run docker ps again and grab the CONTAINER ID of your Kafka container; it will be something like 4bed6164a8e9:

CONTAINER ID        IMAGE                          COMMAND                  CREATED             STATUS              PORTS

4bed6164a8e9        wurstmeister/kafka             "start-kafka.sh"         5 hours ago         Up About an hour    0.0.0.0:9092->9092/tcp

Then, run the following command:

docker exec -it 4bed6164a8e9 bash # Remember to replace the Container ID with yours

Finally, let's create the topic. I'm going to call it test, but it is up to you. Just bear in mind you will need the topic name later!

bash-4.4# kafka-topics.sh --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic test

5 - Tinder match: Postgres ❀️ Kafka

GOCDC provides a REST interface for creating the connection between the database and Kafka. If you look again at our docker-compose.yml file, GOCDC is using port 8000, so let's do a POST to localhost:8000/connectors/postgres

curl --location --request POST 'http://localhost:8000/connectors/postgres' \
--header 'Content-Type: application/json' \
--data-raw '{
    "connector_name": "Conn PG Test",
    "db_host": "localhost",
    "db_port": 5432,
    "db_user": "postgres",
    "db_pass": "postgres",
    "db_name": "example_db",
    "db_slot": "slot",
    "kafka_brokers": [ "localhost:9092" ],
    "kafka_topic": "test",
    "lookup_interval": 5000
}'

Let's dive into the JSON object sent in this request:
connector_name: No big deal here, just an identifier. It must be unique, though, so you can use it to edit the connector later via a PUT request.

db_host: The IP where your Postgres is running. In our example, it is localhost.

db_port: The Port to access our Postgres database.

db_user AND db_pass: User and Password of our Postgres database.

db_name: The name of the database.

db_slot: The name of the Replication Slot.

kafka_brokers: An array with our Kafka brokers. Well, we're running only one in our example, but you can set multiple.

kafka_topic: Here is the Topic that we created in Step 4.

lookup_interval: Here you tell the connector how often it should execute a CDC lookup. In our example, every 5 seconds (yes, the parameter value is in milliseconds, so 5000 = 5s).

6 - Kafka Consumer:

So, we now have our Postgres database up and running, our Kafka "cluster" also up and running, and the connector created!
Assuming you have the database and a table created in Postgres (Step 3), connect to your Kafka container once again (Step 4) and run the following command:

bash-4.4# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test

And now you will be able to see every insert, update, and delete from your database being sent to your Kafka consumer.

I hope you enjoyed the tutorial! If you got stuck at some step, please leave a comment and I'll do my best to help! 🖖

Top comments (1)

Austin Cawley-Edwards

Thanks for the nice article -- I think it might be nice to mention that you are the author of GOCDC :)