Introduction
In the article series Spring Boot 3 application on AWS Lambda, we introduced Spring Cloud Function and its AWS Adapter in part 8. In part 9 we explained how to implement an AWS Lambda function with Spring Cloud Function AWS using Java 21 and Spring Boot 3.2, and in part 10 we measured the performance (cold and warm start times) of the Lambda function, including enabling Lambda SnapStart and applying various priming techniques on top of it.
As Spring Boot 3.2, used in the example, was released more than a year ago and the current version at the time of preparing this article (end of December 2024) was already 3.4, I decided to update all my examples and re-measure Lambda performance. Spring Cloud Function AWS Adapter and many other dependencies used in the example application also received version updates. I also decided to extend the Lambda performance measurements by using different Java compilation options and to better visualize the effect of the Lambda SnapStart snapshot tiered cache.
How to write AWS Lambda function with Spring Cloud Function AWS using Java 21 managed runtime and Spring Boot 3.4
The Spring Cloud Function concepts introduced in part 8 and the explanations from part 9 about how to implement AWS Lambda with Spring Cloud Function AWS are still valid.
The simple sample application also remains the same, see the architecture below:
I updated all the dependencies to the newest versions at the time of writing (end of December 2024) and published the source code in the spring-boot-3.4-with-spring-cloud-function repository. We use Spring Boot version 3.4.0 and Spring Cloud Function AWS Adapter 4.2.0. Of course, other minor or major updates of the libraries have been released since then, but I assume that no changes besides the version updates in the pom.xml are required to make the application work. As far as I remember, I only made one big change compared to the previous version: I found out that I can exclude the dependency on the web application server Tomcat (spring-boot-starter-tomcat) from the pom.xml, as it's not required because we use Amazon API Gateway. The same was true for the AWS Serverless Java Container.
The following needs to be installed in order to build and deploy the sample application:
- Java 21, for example Amazon Corretto 21
- Apache Maven
- AWS CLI
- AWS SAM
In order to build the application, execute mvn clean package.
In order to deploy the application, execute sam deploy -g.
In order to create the product with id equal to 1, execute
curl -X PUT -d '{ "id": 1, "name": "Print 10x13", "price": 0.15 }' -H "X-API-Key: a6ZbcDefQW12BN56WEJ34" https://{$API_GATEWAY_URL}/prod/products
In order to retrieve the product with id equal to 1, execute
curl -H "X-API-Key: a6ZbcDefQW12BN56WEJ34" https://{$API_GATEWAY_URL}/prod/products/1
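If you prefer to call the API from Java instead of curl, the two requests above can be sketched with the JDK's built-in java.net.http client. This is only an illustration: the class name ProductApiRequests and its parameters are hypothetical and not part of the sample repository, and the base URL has to be replaced by your own API Gateway stage URL. Like the curl commands, the sketch sets only the X-API-Key header.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of the sample repository.
final class ProductApiRequests {

    // Builds the PUT request that creates the product.
    // baseUrl is assumed to be the API Gateway stage URL ending in "/prod".
    static HttpRequest putProduct(String baseUrl, String apiKey, String productJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/products"))
                .header("X-API-Key", apiKey)
                .PUT(HttpRequest.BodyPublishers.ofString(productJson))
                .build();
    }

    // Builds the GET request that retrieves the product with the given id.
    static HttpRequest getProduct(String baseUrl, String apiKey, int id) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/products/" + id))
                .header("X-API-Key", apiKey)
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: ProductApiRequests <baseUrl> <apiKey>");
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        String json = "{ \"id\": 1, \"name\": \"Print 10x13\", \"price\": 0.15 }";

        // Create the product, then read it back.
        HttpResponse<String> created = client.send(
                putProduct(args[0], args[1], json), HttpResponse.BodyHandlers.ofString());
        System.out.println("PUT status: " + created.statusCode());

        HttpResponse<String> fetched = client.send(
                getProduct(args[0], args[1], 1), HttpResponse.BodyHandlers.ofString());
        System.out.println("GET status: " + fetched.statusCode() + ", body: " + fetched.body());
    }
}
```

Keeping request construction separate from sending makes it easy to check the method, URL and headers without touching the network.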
What I noticed was that the compiled artifact to be deployed on AWS Lambda was smaller with Spring Boot version 3.4 than with 3.2 (24,000 KB vs 26,800 KB). I assume that the artifacts of the updated dependency versions got a bit bigger, but that the exclusion of the Tomcat dependency, which is several MB in size, outweighed this. We know that the smaller the artifact size, the shorter the Lambda function cold start time. So, let's see.
Measuring cold and warm start time of the AWS Lambda function with Spring Cloud Function AWS using Java 21 managed runtime and Spring Boot 3.4
All techniques to measure AWS Lambda performance introduced in part 10 are also still valid. We will apply them all: SnapStart on its own and, on top of it, DynamoDB and API Gateway request invocation priming.
The results of the experiment below are again based on reproducing more than 100 cold and approximately 100,000 warm starts with the Lambda function GetProductByIdFunction with a 1024 MB memory setting for the duration of 1 hour. The experiments were performed with the Java Corretto version java:21.v27. I used the load test tool hey, but you can use whatever tool you want, like Serverless-artillery or Postman.
I additionally measured Lambda performance using 2 different Java compilation options: tiered compilation, which is the default compilation option in Java 21, and the compilation option -XX:TieredStopAtLevel=1, for which you need to additionally set the JAVA_TOOL_OPTIONS environment variable of the Lambda function to "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" in the template.yaml as shown below:
Globals:
  Function:
    Runtime: java21
    ....
    Environment:
      Variables:
        JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"
    .....
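To verify that the JVM actually picked up the setting, you can read the effective flag values at runtime. The following is a minimal sketch that assumes a HotSpot-based JVM such as Amazon Corretto; the class name CompilationFlagCheck is hypothetical and not part of the sample application. It uses the HotSpotDiagnosticMXBean from the JDK's com.sun.management package.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the sample repository.
final class CompilationFlagCheck {

    // Returns the effective value of a HotSpot VM flag such as "TieredStopAtLevel".
    // Assumes a HotSpot-based JVM; other JVMs may not provide this MXBean.
    static String vmFlag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // The environment variable as seen by the process (null if not set).
        System.out.println("JAVA_TOOL_OPTIONS = " + System.getenv("JAVA_TOOL_OPTIONS"));
        // The values the JVM is really running with.
        System.out.println("TieredCompilation = " + vmFlag("TieredCompilation"));
        System.out.println("TieredStopAtLevel = " + vmFlag("TieredStopAtLevel"));
    }
}
```

Logging these values once during initialization of the Lambda function is a cheap way to make sure a measurement run really used the compilation option you intended.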
I'd also like to better visualize the effect of the Lambda SnapStart snapshot tiered cache by showing the performance measurements for all 100 cold starts, and also for the last 70 cold starts only, dropping approximately the first 30, slower cold starts. Depending on how often the respective Lambda function is updated and some layers of the cache are invalidated, a Lambda function can experience thousands or tens of thousands of cold starts, so that the first, longer-lasting cold starts are no longer significant. You can read more about the effect of the Lambda SnapStart snapshot tiered cache in the article and talk by Mike Danilov, AWS Lambda Under the Hood. I also investigated this effect for pure Java 21 on AWS Lambda in my article AWS SnapStart - Part 17 Impact of the snapshot tiered cache on the cold starts with Java 21.
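To make the split between "all" and "last 70" concrete, here is a minimal sketch of how such percentiles could be computed from a chronologically ordered list of cold start durations. It is not the tooling used for the measurements in this article: the class name ColdStartPercentiles is hypothetical, the sample durations are made up, and the nearest-rank percentile definition is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper, not the tooling used for the measurements in this article.
final class ColdStartPercentiles {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // Keeps only the last `keep` samples of a chronologically ordered series,
    // i.e. drops the earliest (typically slowest) cold starts.
    static double[] lastN(double[] chronological, int keep) {
        int from = Math.max(0, chronological.length - keep);
        return Arrays.copyOfRange(chronological, from, chronological.length);
    }

    public static void main(String[] args) {
        // Made-up durations in ms in the order they occurred, NOT measured values.
        double[] coldStarts = {2300, 2250, 2200, 1900, 1880, 1860, 1850, 1840, 1830, 1820};
        double[] tail = lastN(coldStarts, 7);

        System.out.printf("all:    p50=%.0f p90=%.0f%n",
                percentile(coldStarts, 50), percentile(coldStarts, 90));
        System.out.printf("last 7: p50=%.0f p90=%.0f%n",
                percentile(tail, 50), percentile(tail, 90));
    }
}
```

With only 70 samples, the nearest-rank p99 and p99.9 are already the maximum value, which would be consistent with the identical p99, p99.9 and max values in the "last 70" rows of the tables.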
So let's look at the results of the measurements. The abbreviation c stands for cold start and w for warm start.
Cold (c) and warm (w) start time in ms with tiered compilation:
Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No SnapStart enabled | 4792.29 | 4869.54 | 4923.38 | 5032.83 | 5098.66 | 5102.64 | 6.11 | 6.83 | 8.00 | 20.09 | 50.44 | 1464.91 |
SnapStart enabled but no priming applied, all | 1880.38 | 1937.62 | 2260.04 | 2294.18 | 2301.07 | 2302.85 | 6.40 | 7.16 | 8.33 | 20.82 | 55.04 | 1729.04 |
SnapStart enabled but no priming applied, last 70 | 1847.88 | 1879.34 | 1936.42 | 2107.16 | 2107.16 | 2107.16 | 6.25 | 7.04 | 8.20 | 20.33 | 48.84 | 1513.19 |
SnapStart enabled with DynamoDB invocation priming, all | 1015.91 | 1059.46 | 1348.03 | 1382.14 | 1404.42 | 1404.94 | 6.20 | 6.99 | 8.20 | 20.17 | 47.68 | 707.41 |
SnapStart enabled with DynamoDB invocation priming, last 70 | 997.5 | 1020.28 | 1070.52 | 1162.7 | 1162.7 | 1162.7 | 6.15 | 6.88 | 8.07 | 19.69 | 46.55 | 584.19 |
SnapStart enabled with API Gateway request invocation priming, all | 621.28 | 670.31 | 858.87 | 1105.99 | 1147.66 | 1147.84 | 6.11 | 6.83 | 8.13 | 19.15 | 37.30 | 189.05 |
SnapStart enabled with API Gateway request invocation priming, last 70 | 616.99 | 638.74 | 693.62 | 763.9 | 763.9 | 763.9 | 6.11 | 6.83 | 8.00 | 18.85 | 35.01 | 189.05 |
Cold (c) and warm (w) start time in ms with -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation option:
Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No SnapStart enabled | 4806.68 | 4869.54 | 4938.16 | 5047.95 | 5058.05 | 5062.18 | 6.21 | 6.93 | 8.13 | 20.73 | 54.60 | 1509.42 |
SnapStart enabled but no priming applied, all | 1931.82 | 2008.61 | 2282.75 | 2345.19 | 2440.85 | 2441.58 | 6.45 | 7.21 | 8.46 | 20.99 | 59.13 | 1652.3 |
SnapStart enabled but no priming applied, last 70 | 1891.82 | 1956.28 | 2028.09 | 2268.08 | 2268.08 | 2268.08 | 6.40 | 7.16 | 8.33 | 20.49 | 54.60 | 1522.27 |
SnapStart enabled with DynamoDB invocation priming, all | 986.88 | 1041.61 | 1333.29 | 1369.76 | 1405.82 | 1406.81 | 6.20 | 6.93 | 8.07 | 19.69 | 44.03 | 773.12 |
SnapStart enabled with DynamoDB invocation priming, last 70 | 980.16 | 1002.13 | 1048.31 | 1097.18 | 1097.18 | 1097.18 | 6.10 | 6.82 | 8.00 | 19.38 | 40.98 | 562.86 |
SnapStart enabled with API Gateway request invocation priming, all | 626.27 | 654.43 | 860.59 | 968.32 | 999.79 | 1000.54 | 6.11 | 6.83 | 8.13 | 19.46 | 39.75 | 205.29 |
SnapStart enabled with API Gateway request invocation priming, last 70 | 605.95 | 630.86 | 660.27 | 707.5 | 707.5 | 707.5 | 6.02 | 6.72 | 8.00 | 18.85 | 37.30 | 174.81 |
Conclusion
In this article, we updated our sample application to use Spring Boot 3.4, Spring Cloud Function AWS Adapter version 4.2.0 and other dependencies in their most recent versions as of the end of December 2024. We also measured Lambda performance with different methods and with different Java compilation options. My general impression is that tiered compilation produces slightly better results in terms of lower cold and warm start times for the measurements where Lambda SnapStart wasn't enabled, or where it was enabled but priming wasn't used. It was the other way around with SnapStart enabled and either priming technique (DynamoDB and API Gateway request priming), although the results varied a bit depending on the percentile.
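The comparison between the two compilation options can be reproduced with simple arithmetic on the tables above. The sketch below subtracts the cold start p50 values of the -XX:TieredStopAtLevel=1 table from those of the tiered compilation table; the numbers are copied from the two tables, while the class name CompilationOptionComparison and the selection of rows are my own choices for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class CompilationOptionComparison {

    // Negative result: tiered compilation was faster by that many ms.
    // Positive result: -XX:TieredStopAtLevel=1 was faster by that many ms.
    static double delta(double tiered, double stopAtLevel1) {
        return tiered - stopAtLevel1;
    }

    public static void main(String[] args) {
        // Cold start p50 in ms from the two tables: {tiered, TieredStopAtLevel=1}
        Map<String, double[]> coldP50 = new LinkedHashMap<>();
        coldP50.put("No SnapStart", new double[] {4792.29, 4806.68});
        coldP50.put("SnapStart, no priming, last 70", new double[] {1847.88, 1891.82});
        coldP50.put("SnapStart, DynamoDB priming, last 70", new double[] {997.5, 980.16});
        coldP50.put("SnapStart, API Gateway priming, last 70", new double[] {616.99, 605.95});

        coldP50.forEach((scenario, v) ->
                System.out.printf("%-42s %+8.2f ms%n", scenario, delta(v[0], v[1])));
    }
}
```

The same calculation can be repeated for the other percentiles and for the warm start columns, where the sign of the difference is not always the same.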
Compared to the Lambda performance measurements for Spring Boot 3.2, which we only did for the -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation option, we observe slightly lower cold and warm start times for Spring Boot 3.4, which might be the result of the smaller deployment artifact size and other performance optimizations made in the updated versions. It's worth investigating which unnecessary dependencies can additionally be excluded in the pom.xml, as we've already done for the spring-boot-starter-logging and spring-boot-starter-tomcat dependencies; see the following code snippet from the pom.xml:
<dependencies>
  <dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <exclusions>
      <exclusion>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-logging</artifactId>
      </exclusion>
      <exclusion>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-tomcat</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
Maybe there are other dependencies that can be excluded as well, for example a JPA implementation, as we use the DynamoDB NoSQL database. In that case we could further reduce the Lambda deployment artifact size and therefore the cold start times.
What we also clearly observe is the described effect of the Lambda SnapStart snapshot tiered cache. So don't stop after measuring only the first several cold starts, as they are really quite slow but improve significantly with subsequent invocations. The first, longer-lasting cold starts might not significantly impact the overall performance of your application.
In the next part of this series, we'll update this sample application so that it can be deployed on AWS Lambda as a Docker container image.
If you have read my article(s) and liked its content, please support me by following me on my GitHub account and giving my repos a star.