Introduction
If you have heard of Rust, you are probably aware of how performant the language is compared to most other programming languages in use today. To explore how much of a difference we are talking about, I ran a very simple load test on two AWS Lambda functions: one on Node and the other on Rust. Both run the exact same logic, pushing a sample payload to Kafka. The results were interesting.
The setup
The code, as I mentioned, takes an incoming event payload and pushes it to a Kafka topic. I set up an MSK cluster for this. Various factors contribute to application performance, and one of them is payload size. I used a payload of 256 KB, as I imagine most use cases would do just fine with a payload that large. A rough sketch of the handler shape is shown below.
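To make the shape of the function concrete, here is a minimal sketch of what such a handler might look like in Rust, assuming the lambda_runtime crate; the `send_to_kafka` helper and topic name are hypothetical placeholders, not the actual code from the repositories linked further down.

```rust
// Minimal sketch of a Rust Lambda handler that forwards the incoming
// payload to Kafka. Crate names (lambda_runtime, serde_json, tokio) are
// real; `send_to_kafka` is a hypothetical helper standing in for the
// actual producer code (sketched further below).
use lambda_runtime::{run, service_fn, Error, LambdaEvent};
use serde_json::Value;

async fn handler(event: LambdaEvent<Value>) -> Result<Value, Error> {
    // Serialize the incoming event payload (roughly 256 KB in the test).
    let payload = serde_json::to_vec(&event.payload)?;

    // Hypothetical helper wrapping the Kafka producer; topic is a placeholder.
    send_to_kafka("sample-topic", &payload).await?;

    Ok(serde_json::json!({ "status": "queued" }))
}

async fn send_to_kafka(_topic: &str, _payload: &[u8]) -> Result<(), Error> {
    // Producer logic elided here; see the producer sketch below.
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    run(service_fn(handler)).await
}
```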
The load test was done using Artillery. It’s perfect for running tests quickly without requiring too much effort on configuration. With an API Gateway configured in front of both Lambda functions, Artillery hits each endpoint with the payload at the defined throughput.
To monitor the performance metrics, I decided to use OpenTelemetry with Datadog. Otel, if you aren’t aware, is an observability framework for instrumenting your application to gather logs, traces, and metrics. It lets you publish all collected data to a monitoring platform of your choice by following a consistent pattern, making it vendor-agnostic as long as the platform supports Otel. Datadog has good support for Otel and offers a host of features under its product umbrella when it comes to complete visibility into your applications.
Instrumenting the Node function was quite straightforward, as I only had to include the Otel layers and use the Datadog CDK construct. The Rust function (which also uses the CDK construct), on the other hand, required a more manual approach since auto-instrumentation is not available. I was able to cobble together the pieces by referring to this repository from James Eastham.
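To give a flavour of what manual instrumentation looks like, here is a rough sketch using the `tracing` crate; this is my own simplified illustration, not the exact setup from the referenced repository, and the span/field names are made up. In the real setup the subscriber would also carry an OpenTelemetry layer exporting to Datadog, which is version-dependent and elided here.

```rust
// Rough sketch of manually instrumenting a function with the `tracing`
// crate: the #[instrument] attribute opens a span around each call and
// records the listed fields as span/event attributes.
use tracing::{info, instrument};

#[instrument(name = "kafka_publish", skip(payload))]
async fn publish(topic: &str, payload: &[u8]) {
    info!(bytes = payload.len(), topic, "publishing payload");
    // ... produce to Kafka inside this span ...
}

fn init_tracing() {
    // Plain fmt subscriber for illustration only; in practice you would
    // attach an OpenTelemetry layer so spans are exported instead of printed.
    tracing_subscriber::fmt().with_target(false).init();
}
```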
The Rust Kafka client library I used internally can be explored in this repository. The Lambda code and CDK IaC can be found here.
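For anyone who hasn’t produced to Kafka from Rust before, here is a generic sketch using the widely used `rdkafka` crate. This is not necessarily how the linked client library is built; the broker address, topic, and key are placeholders.

```rust
// Generic sketch of producing a single record to an MSK topic with the
// `rdkafka` crate. Broker address and topic name below are placeholders.
use std::time::Duration;

use rdkafka::config::ClientConfig;
use rdkafka::producer::{FutureProducer, FutureRecord};

async fn produce(payload: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    // Build the producer from config; in a real Lambda you would create
    // this once and reuse it across invocations.
    let producer: FutureProducer = ClientConfig::new()
        .set("bootstrap.servers", "my-msk-broker:9092") // placeholder
        .set("message.timeout.ms", "5000")
        .create()?;

    // Send the record and wait for the broker acknowledgement.
    producer
        .send(
            FutureRecord::to("sample-topic").payload(payload).key("key"),
            Duration::from_secs(5),
        )
        .await
        .map_err(|(err, _msg)| err)?;

    Ok(())
}
```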
Comparing test results
Looking at the median (p50) metric, the latency difference doesn’t seem significant. The p99 does paint a different picture, though. Memory utilization also showed quite a difference, with Rust taking around 70–90 MB and Node around 150–176 MB (sometimes even more).
However, I do want to highlight a few things before we crown Rust the supreme winner. Though both Lambdas are essentially doing the same thing, the underlying Kafka clients aren’t something I created; depending on how those libraries are built, the overall performance can vary. The Node Lambda also requires two additional layers for instrumenting with Otel, since I opted to auto-instrument rather than set up the tracing code manually, unlike the Rust function. That auto-instrumentation could also affect the performance of the Node function and how the trace/metric data is collected and exported; maybe not significantly, but it is worth taking note of.
Switch to Rust?
It’s clear that Rust offers a great advantage in cost and performance, but is it worth the steep learning curve? I think yes, although I have been playing around with Rust on and off and am still nowhere close to being comfortable with it.
While it’s great to use the best tool for the job, sometimes it requires a little more thought. If you are looking to get going quickly with a new product development cycle, you might want to take a step back before going all-in, unless everyone on your backend team is proficient in Rust and can comfortably handle the curveballs that come with the evolution of the product you are building. It is also quite likely that some of the tools in your workflow don’t yet offer support for Rust.
Closing thoughts
In the serverless ecosystem, because Lambdas can be designed to handle isolated pieces of your application, it becomes a lot easier to mix and match. You could pick the function with the highest throughput that also consumes the most memory and has the longest execution times, and start your Rustification there (yes, it’s fine to call it that). In essence, you want to hand-pick the parts of the application that are performance-critical and then explore the feasibility of porting them to Rust.
Though Rust’s ecosystem isn’t quite there yet when compared to other languages, its community support and overall appeal are growing. And that’s a good sign to start getting your hands dirty and begin the journey sooner rather than later.
If you have tried doing anything interesting with Rust, I’d love to hear it!