DEV Community

Streaming Audio: A Confluent podcast about Apache Kafka®

Real-Time Stream Processing with Kafka Streams ft. Bill Bejeck

Kafka Streams is a native streaming library for Apache Kafka® that consumes messages from Kafka, performs operations such as filtering a topic’s messages, and produces the output back into Kafka. After working as a developer in stream processing, Bill Bejeck (Apache Kafka Committer and Integration Architect, Confluent) has found his calling in sharing knowledge and authoring his book, “Kafka Streams in Action.” As a Kafka Streams expert, Bill is also the author of the Kafka Streams 101 course on Confluent Developer, where he delves into what Kafka Streams is, how to use it, and how it works.

Kafka Streams provides an abstraction over Kafka consumers and producers, sparing you administrative details such as writing and maintaining the framework code that plain Kafka consumers and producers require to process streams. Kafka Streams is declarative: you state what you want to do, rather than how to do it. Because Kafka Streams uses the KafkaConsumer internally, it inherits the consumer's dynamic scaling properties and relies on the consumer group protocol to redistribute the workload dynamically. When Kafka Streams applications are deployed separately but have the same application.id, they are logically still one application.
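As a minimal sketch of the shared application.id idea, the Java snippet below builds the configuration for two separately deployed instances. The keys `application.id` and `bootstrap.servers` are the standard Kafka Streams configuration names; the class name, the helper method, the application name `orders-filter-app`, and the broker address `localhost:9092` are placeholders assumed for illustration. The sketch uses only `java.util.Properties` and does not start a Kafka Streams application; in a real deployment these properties would be handed to Kafka Streams along with a topology.

```java
import java.util.Properties;

final class SharedApplicationIdSketch {

    // Builds the configuration that one instance would pass to Kafka Streams.
    // Both arguments are hypothetical placeholder values.
    static Properties instanceConfig(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("bootstrap.servers", bootstrapServers);
        return props;
    }

    public static void main(String[] args) {
        // Two instances, possibly on different machines, configured identically.
        Properties instanceA = instanceConfig("orders-filter-app", "localhost:9092");
        Properties instanceB = instanceConfig("orders-filter-app", "localhost:9092");

        // Matching application.id values are what make the instances
        // one logical application that shares the workload.
        boolean sameApp = instanceA.getProperty("application.id")
                .equals(instanceB.getProperty("application.id"));
        System.out.println("Same logical application: " + sameApp);
    }
}
```

Giving an instance a different application.id would instead make it a separate logical application that does not share the workload with the others.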

Kafka Streams has two processing APIs. The declarative API, or domain-specific language (DSL), is a high-level language that enables you to build anything needed with a processor topology, whereas the Processor API lets you specify a processor topology node by node, providing the ultimate flexibility. To underline the differences between the two APIs, Bill says it’s almost like using an object-relational mapping (ORM) framework versus SQL.

The Kafka Streams 101 course is designed to get you started with Kafka Streams and to help you learn the fundamentals:

  • How streams and tables work 
  • How stateless and stateful operations work 
  • How to handle time windows and out-of-order data
  • How to deploy Kafka Streams

EPISODE LINKS

Episode source