What is Kafka?
Kafka is an open-source distributed event streaming platform designed for handling real-time data feeds.
Originally developed at LinkedIn and later open-sourced under the Apache Software Foundation, Kafka is now widely used for building high-throughput, fault-tolerant, and scalable data pipelines, real-time analytics, and event-driven architectures.
What Problem Does Kafka Solve?
Before Kafka, traditional message queues like RabbitMQ and ActiveMQ were widely used, but they had limitations in handling massive, high-throughput real-time data streams.
Kafka was designed to address these issues by providing:
- Large-scale data handling – Kafka is optimized for ingesting, storing, and distributing high-volume data streams across distributed systems.
- Fault tolerance – Kafka replicates data across multiple nodes, ensuring that even if a broker fails, data remains available.
- Durability – Messages persist on disk, allowing consumers to replay events when needed.
- Support for event-driven architecture – It enables asynchronous communication between microservices, making it ideal for modern cloud applications.
When to Use Kafka
Kafka is the right choice when you need:
- High-throughput, real-time data processing – Ideal for log processing, financial transactions, and IoT data streams.
- Microservices decoupling – Kafka acts as an intermediary, allowing microservices to communicate asynchronously without direct dependencies.
- Event-driven systems – If your architecture revolves around reacting to changes (e.g., a user event triggering multiple downstream actions), Kafka is a solid choice.
- Reliable message delivery with persistence – Unlike traditional message queues, which typically delete messages once they are consumed, Kafka retains messages for a configurable period, ensuring durability and replayability.
- Scalability and fault tolerance – Kafka’s distributed nature allows it to scale horizontally while maintaining fault tolerance through replication.
How Kafka Works
Kafka consists of several key components:
1. Message
A message is the smallest unit of data in Kafka.
It can be a JSON object, a string, or any binary data.
Messages may have an associated key, which determines which partition the message will be stored in.
2. Topic
A topic is a logical channel where messages are sent by producers and read by consumers. Topics help categorize messages (e.g., logs, transactions, orders).
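If automatic topic creation is disabled on the broker, topics can be created explicitly. Here is a minimal sketch using the kafkajs admin client (the same library used in the examples later in this post); the topic name and partition count are made-up illustration values:

const { Kafka } = require("kafkajs");

const kafka = new Kafka({ clientId: "admin-client", brokers: ["localhost:9092"] });
const admin = kafka.admin();

async function createTopic() {
  await admin.connect();
  // Create a hypothetical "orders" topic with 3 partitions.
  await admin.createTopics({
    topics: [{ topic: "orders", numPartitions: 3, replicationFactor: 1 }],
  });
  await admin.disconnect();
}

createTopic().catch(console.error);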
3. Producer
A producer is a Kafka client that publishes messages to a topic. Messages can be sent in three ways:
- Fire and forget – The producer sends the message without waiting for confirmation, ensuring maximum speed but risking data loss.
- Synchronous send – The producer waits for an acknowledgment from Kafka before proceeding, ensuring reliability but adding latency.
- Asynchronous send – The producer sends the message and handles the broker's acknowledgment later via a callback, offering a balance between speed and reliability.
Kafka also lets producers configure acknowledgments (acks) to balance consistency and performance (see the sketch after this list):
- acks=0 – No acknowledgment required (fastest, but messages can be lost silently).
- acks=1 – The message is acknowledged once the leader broker has written it (a balance of speed and safety).
- acks=all – The message is acknowledged only when all in-sync replicas confirm receipt (slowest but safest).
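In kafkajs, the acknowledgment level is passed per send call; a minimal sketch (acks: -1 is kafkajs's spelling of "all"):

// Inside an async function, with a connected kafkajs producer
// (see the full producer example later in this post).
await producer.send({
  topic: "family-topic",
  acks: -1, // -1 = all in-sync replicas (safest), 1 = leader only, 0 = fire and forget
  messages: [{ value: "important event" }],
});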
Producer Optimizations
- Message Compression & Batching – Kafka producers can batch and compress messages before sending them to brokers. This improves throughput and reduces disk usage at the cost of some CPU overhead (see the sketch after this list).
- Avro Serializer/Deserializer – Using Avro instead of JSON requires defining schemas upfront, but it improves performance and reduces storage consumption.
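As a sketch of what compression and batching look like in kafkajs (compression is set per send call, and all messages in that call are sent as a single compressed batch):

const { CompressionTypes } = require("kafkajs");

// Inside an async function, with a connected producer:
await producer.send({
  topic: "family-topic",
  compression: CompressionTypes.GZIP, // the batch is gzip-compressed before sending
  messages: [
    { value: "event 1" },
    { value: "event 2" }, // batched together with event 1 in one request
  ],
});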
4. Partition
Kafka topics are divided into partitions, which allow for parallel processing and scalability.
Messages in a partition are ordered and immutable.
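Since the key decides the partition, messages that share a key always land in the same partition and are therefore consumed in order relative to each other. A minimal sketch (the keys and payloads are made up):

// Inside an async function, with a connected producer:
await producer.send({
  topic: "family-topic",
  messages: [
    { key: "user-42", value: "logged in" },    // hash of the key picks the partition
    { key: "user-42", value: "placed order" }, // same key => same partition => ordered
    { key: "user-7", value: "logged in" },     // different key, possibly another partition
  ],
});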
5. Consumer
A consumer reads messages from partitions and keeps track of its position using an offset.
Consumers can reset offsets to reprocess older messages.
Kafka consumers work on a polling model, meaning they continuously request data from the broker rather than the broker pushing data to them.
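Offset resets can be done with consumer.seek in kafkajs; a minimal sketch (the partition and offset values are illustrative, and seek only takes effect once run has started):

// Rewind partition 0 of the topic to offset 0 to replay it from the start.
consumer.run({
  eachMessage: async ({ topic, partition, message }) => {
    console.log(`${topic}[${partition}] @ ${message.offset}: ${message.value.toString()}`);
  },
});
consumer.seek({ topic: "family-topic", partition: 0, offset: "0" });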
Consumer Optimizations
- Partition Assignment Strategies – control how partitions are distributed across the consumers in a group:
  - Range – each consumer gets a consecutive block of partitions.
  - Round Robin – partitions are distributed evenly, one at a time, across consumers.
  - Sticky – keeps existing assignments as stable as possible during rebalancing.
  - Cooperative Sticky – like Sticky, but rebalances incrementally so unaffected consumers keep consuming.
- Batch Size Configuration – consumers can define how many records or how much data should be retrieved per poll cycle (see the sketch after this list).
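In kafkajs, these knobs live on the consumer configuration; a minimal sketch with illustrative values:

// Given a `kafka` instance as in the examples below.
const consumer = kafka.consumer({
  groupId: "email-group",
  minBytes: 1024,                    // broker waits until at least 1 KB is available...
  maxWaitTimeInMs: 500,              // ...or 500 ms have passed, whichever comes first
  maxBytes: 5 * 1024 * 1024,         // upper bound on the data returned per fetch
  maxBytesPerPartition: 1024 * 1024, // upper bound per partition
});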
6. Consumer Group
A consumer group is a set of consumers that work together to process messages from a topic.
Kafka ensures that a single partition is consumed by only one consumer within a group, maintaining order.
7. Offset Management
When a consumer reads a message, it updates its offset – the position of the last processed message. Kafka supports two commit strategies:
- Auto-commit – Kafka automatically commits the offset at regular intervals.
- Manual commit – The application explicitly commits the offset, either synchronously or asynchronously.
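With kafkajs, auto-commit is the default; a manual commit looks roughly like this sketch (processMessage is a hypothetical application function, and the committed offset is the next offset to read, hence the + 1):

await consumer.run({
  autoCommit: false, // disable periodic auto-commit
  eachMessage: async ({ topic, partition, message }) => {
    await processMessage(message); // hypothetical business logic
    // Commit only after successful processing, so a crash before this
    // line causes the message to be redelivered rather than lost.
    await consumer.commitOffsets([
      { topic, partition, offset: (Number(message.offset) + 1).toString() },
    ]);
  },
});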
8. Broker
A broker is a Kafka server that stores messages, assigns offsets, and handles client requests.
Multiple brokers form a Kafka cluster for scalability and fault tolerance.
9. ZooKeeper
ZooKeeper manages cluster metadata, tracks brokers, and handles leader elections.
However, newer Kafka versions replace ZooKeeper with the built-in KRaft consensus protocol, and as of Kafka 4.0 the ZooKeeper dependency has been removed entirely.
Example: Kafka in Action
To understand Kafka better, let's look at a simple example where a producer sends messages to a topic, and two different consumers process those messages separately: one simulating an email notification service and the other storing messages in a database.
Set Up Kafka (docker-compose.yml)
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:latest
    container_name: zookeeper
    restart: always
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka
    restart: always
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_INTERNAL://kafka:29092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,PLAINTEXT_INTERNAL://0.0.0.0:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
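With this file in place, docker compose up -d starts both containers, and the broker becomes reachable from the host at localhost:9092, which is the address the code below connects to.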
Producer Code (producer.js)
const { Kafka } = require("kafkajs");

const kafka = new Kafka({
  clientId: "family-producer",
  brokers: ["localhost:9092"],
});

const producer = kafka.producer();

async function sendMessage() {
  await producer.connect();
  console.log("🟢 Producer connected");

  const message = {
    id: Date.now(),
    content: `Hi Mom! Time is ${new Date().getMinutes()}:${new Date().getSeconds()}`,
  };

  // Publish the message to "family-topic" as a JSON string.
  await producer.send({
    topic: "family-topic",
    messages: [{ value: JSON.stringify(message) }],
  });

  console.log(`📨 Sent: ${JSON.stringify(message)}`);
  await producer.disconnect();
}

sendMessage().catch(console.error);
Consumer for Email Notifications (consumer.js)
const { Kafka } = require("kafkajs");

const kafka = new Kafka({
  clientId: "family-email-consumer",
  brokers: ["localhost:9092"],
});

const consumer = kafka.consumer({ groupId: "email-group" });

async function consumeMessages() {
  await consumer.connect();
  await consumer.subscribe({ topic: "family-topic", fromBeginning: true });
  console.log("🟢 Email Consumer Connected");

  await consumer.run({
    eachMessage: async ({ message }) => {
      const msg = JSON.parse(message.value.toString());
      // Simulate sending an email notification for each message.
      console.log(`📧 Email sent: "${msg.content}"\n`);
    },
  });
}

consumeMessages().catch(console.error);
Consumer for Database Storage (dbconsumer.js)
const { Kafka } = require("kafkajs");

const kafka = new Kafka({
  clientId: "family-db-consumer",
  brokers: ["localhost:9092"],
});

const consumer = kafka.consumer({ groupId: "db-group" });

async function consumeMessages() {
  await consumer.connect();
  await consumer.subscribe({ topic: "family-topic", fromBeginning: true });
  console.log("🟢 DB Consumer Connected");

  await consumer.run({
    eachMessage: async ({ message }) => {
      const msg = JSON.parse(message.value.toString());
      // Simulate persisting the message to a database.
      console.log(`💾 Storing message in DB: "${msg.content}"\n`);
    },
  });
}

consumeMessages().catch(console.error);
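Note that the two consumers use different group IDs (email-group and db-group), so each group receives its own copy of every message: run producer.js and both consumers will process the same event independently. If they shared a group ID, each message would instead be delivered to only one of them.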
Final Thoughts
Kafka is a powerful tool that has transformed real-time data processing.
However, while it offers incredible scalability and durability, it’s crucial to evaluate whether it's the right fit for your architecture.
Stay tuned! I will write a follow-up article comparing Kafka vs. Redis to explore their use cases and when to choose one over the other. 🚀
I’ve been working on a super-convenient tool called LiveAPI.
LiveAPI helps you get all your backend APIs documented in a few minutes.
With LiveAPI, you can quickly generate interactive API documentation that allows users to execute APIs directly from the browser.
If you’re tired of manually creating docs for your APIs, this tool might just make your life easier.