Rishabh Agarwal

How Pinterest uses Kafka for Long-Term Data Storage

I spent hours diving into this so you don’t have to!

Here is what I learned:

  • Pinterest doesn't store all data on Kafka brokers forever.
  • Older data is moved to remote storage such as Amazon S3.
  • They built a tool called Segment Uploader to automate this process.
  • The Segment Uploader periodically transfers older data from Kafka brokers to remote storage.
  • Segment Uploader runs as a sidecar alongside the Kafka broker.
  • They also developed a specialized Consumer Library to fetch data intelligently.
  • The library fetches old data directly from remote storage and new data from Kafka brokers.
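
To make the flow above concrete, here is a minimal in-memory sketch in Python. It is not Pinterest's code: the names `Segment`, `Broker`, `upload_old_segments`, and `read_from` are hypothetical, a plain dict stands in for remote storage like S3, and the `keep_local` cutoff is an assumption about how "older" might be decided. The sketch also drops a segment from the broker right after copying it, which is a simplification of "moved".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int    # offset of the first record in this segment
    records: list[str]  # record payloads, in offset order


@dataclass
class Broker:
    """Hypothetical stand-in for one partition's log on a Kafka broker."""
    segments: list[Segment] = field(default_factory=list)

    def earliest_offset(self) -> int | None:
        # Oldest offset still held locally, or None if the log is empty.
        return self.segments[0].base_offset if self.segments else None


def upload_old_segments(broker: Broker, remote: dict[int, Segment],
                        keep_local: int = 1) -> int:
    """Move all but the newest `keep_local` segments to remote storage.

    Plays the role of the Segment Uploader sidecar. `remote` is a dict
    keyed by base offset (an assumption; real storage would be an object
    store). Returns the number of segments moved.
    """
    cutoff = max(len(broker.segments) - keep_local, 0)
    old = broker.segments[:cutoff]
    for seg in old:
        remote[seg.base_offset] = seg
    broker.segments = broker.segments[cutoff:]
    return len(old)


def read_from(offset: int, broker: Broker,
              remote: dict[int, Segment]) -> list[tuple[str, str]]:
    """Return (source, record) pairs for every record at or after `offset`.

    Plays the role of the Consumer Library: old records come from remote
    storage, newer ones from the broker. Assumes every remote segment is
    older than every broker segment, which the uploader above guarantees.
    """
    tiers = (
        ("remote", [remote[key] for key in sorted(remote)]),
        ("broker", broker.segments),
    )
    out: list[tuple[str, str]] = []
    for source, segments in tiers:
        for seg in segments:
            for i, record in enumerate(seg.records):
                if seg.base_offset + i >= offset:
                    out.append((source, record))
    return out


if __name__ == "__main__":
    broker = Broker([Segment(0, ["a", "b"]), Segment(2, ["c", "d"]), Segment(4, ["e"])])
    remote: dict[int, Segment] = {}
    upload_old_segments(broker, remote, keep_local=1)
    for source, record in read_from(1, broker, remote):
        print(source, record)
```

The caller only supplies an offset and never has to know which tier holds the data. That is the idea behind having a single consumer library sit in front of both the brokers and the remote store.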

By combining Kafka’s real-time capabilities with cost-efficient remote storage, Pinterest ensures scalability, reliability, and efficient long-term data management.


PS - I recently published an article in my free newsletter that covers this case study in depth, with visuals: https://designsystemsweekly.substack.com/p/how-pinterest-leverages-kafka-for