I promised to write an entire article on this topic after struggling for a couple of hours to get the Graphviz
library working in AWS Lambda. In this article I want to cover the tools I used to create a working version of the binary.
Build target
If you're looking to compile a binary for your Lambda, you can either run an EC2 instance with the Amazon Linux 2 AMI or use the amazonlinux:2 Docker image. Here's how to get an interactive shell running:
$ docker run --rm -it amazonlinux:2 bash
In either case, keep in mind that these don't ship with compilation tools by default, so you will have to install them. Running sudo yum groupinstall "Development Tools" gets you a decent set to start with (inside the Docker container you're already root, so you can drop sudo). To find out which shared libraries a binary depends on, you can use ldd.
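To make the ldd step a bit more concrete, here's a minimal Node.js sketch that turns ldd output into a list of libraries and flags the ones that can't be resolved. The parseLdd helper and the sample output are made up for illustration. In practice you would feed it the text printed by running ldd against your own binary inside the build container.

```javascript
// Hypothetical helper: parse the text printed by `ldd <binary>`.
// Each entry has the library name, the resolved path (or null) and a
// `missing` flag that is true when ldd reported "not found".
function parseLdd(output) {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      // Typical line: "libz.so.1 => /lib64/libz.so.1 (0x00007f...)"
      const arrow = line.match(/^(\S+)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-f]+\))?$/);
      if (arrow) {
        const missing = arrow[2] === "not found";
        return { name: arrow[1], path: missing ? null : arrow[2], missing };
      }
      // Lines without "=>", e.g. the dynamic loader or linux-vdso.
      const name = line.replace(/\s+\(0x[0-9a-f]+\)$/, "");
      return { name, path: name.startsWith("/") ? name : null, missing: false };
    });
}

// Made-up sample output, only here to exercise the parser.
const sample = [
  "\tlinux-vdso.so.1 (0x00007ffd5a5f2000)",
  "\tlibz.so.1 => /lib64/libz.so.1 (0x00007f1c2a000000)",
  "\tlibexpat.so.1 => not found",
  "\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a400000)",
].join("\n");

const libs = parseLdd(sample);
const missing = libs.filter((lib) => lib.missing).map((lib) => lib.name);
console.log("Libraries to bundle or install:", missing);
```

The resolved paths are the candidates to copy next to your binary. Anything flagged as missing has to be installed in the build environment first. The sketch doesn't handle statically linked binaries, for which ldd prints a message instead of a list.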
Once you've compiled your binary, copy it to the development environment for testing.
Development environment
When working on an AWS Lambda function with a custom binary, it's definitely useful to have a fast feedback loop. You can run a container with docker-lambda for this; it replicates the filesystem permissions and the binaries bundled by default (among other things).
If you spawn commonly used binaries, like /usr/bin/find, make sure they are bundled by default (if they're not, you can include them yourself):
$ docker run --rm --entrypoint /bin/bash lambci/lambda:provided -c 'ls -lha /usr/bin'
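The same check can be done from inside a function. Below is a rough sketch of a lookup that walks the PATH and reports whether a binary is present and executable. findOnPath is a hypothetical helper, not part of docker-lambda, and the binary names in the loop (find, plus dot, the Graphviz executable) are only examples.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: return the first executable file called `name`
// found in the directories listed in `pathValue`, or null if there is none.
function findOnPath(name, pathValue = process.env.PATH || "") {
  for (const dir of pathValue.split(path.delimiter).filter(Boolean)) {
    const candidate = path.join(dir, name);
    try {
      // Must be a regular file (not a directory) and executable by us.
      if (fs.statSync(candidate).isFile()) {
        fs.accessSync(candidate, fs.constants.X_OK);
        return candidate;
      }
    } catch {
      // Not in this directory, or not executable: keep looking.
    }
  }
  return null;
}

// Example binaries only; list whatever your function spawns.
for (const bin of ["find", "dot"]) {
  console.log(`${bin}: ${findOnPath(bin) || "not found on PATH"}`);
}
```

Running something like this at the top of a handler while you iterate locally makes a missing binary show up as a clear log line instead of a spawn error further down.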
It works on my machine
Once you have the necessary binaries and libraries, you can include them in your Lambda deployment package or, if you want to reuse them, upload them as a layer. Which flavor of serverless tooling you use doesn't really matter.
Layers are mounted in the /opt directory, so some environment tweaking is required to run these custom binaries. Supposing your layer includes bin and lib directories, you will need to append /opt/bin and /opt/lib to the $PATH and $LD_LIBRARY_PATH environment variables respectively. Since this is needed at runtime, it will be the only language-specific step required. Here's how you could do it in Node.js:
// Make binaries and shared libraries from the layer (mounted under /opt) discoverable.
process.env.PATH = `${process.env.PATH}:/opt/bin`;
process.env.LD_LIBRARY_PATH = `${process.env.LD_LIBRARY_PATH}:/opt/lib`;

module.exports.handler = async () => {
  /* code */
};
For a list of environment variables, including runtime-specific ones, see the AWS Lambda documentation on environment variables.
Layer tooling
No matter which serverless flavour you're using, if you want to download a layer from an ARN you will need to use the AWS CLI. And sometimes, while you're still browsing alternatives for a layer, you want to check its contents before downloading it, for instance to verify the binary it adds or to check the size of the files it contains.
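As a rough illustration of the AWS CLI route, the sketch below wraps aws lambda get-layer-version-by-arn, whose JSON response includes a Content object with a temporary download link and the zipped size. The helper names, the sample response and the ARN in the comment are all made up. The real call needs the AWS CLI installed and credentials configured.

```javascript
const { execFileSync } = require("child_process");

// Hypothetical helper: pick the download-related fields out of the JSON
// printed by `aws lambda get-layer-version-by-arn`.
function layerDownloadInfo(response) {
  const content = response.Content || {};
  return {
    url: content.Location,       // temporary link to the layer zip
    sizeBytes: content.CodeSize, // size of the zipped layer
    sha256: content.CodeSha256,  // hash of the zipped layer
  };
}

// Hypothetical wrapper around the CLI call; not executed in this sketch.
function describeLayer(arn) {
  const stdout = execFileSync(
    "aws",
    ["lambda", "get-layer-version-by-arn", "--arn", arn, "--output", "json"],
    { encoding: "utf8" }
  );
  return layerDownloadInfo(JSON.parse(stdout));
}

// Made-up response with the same shape, to show what comes back.
const sampleResponse = {
  Content: {
    Location: "https://example.com/layer.zip",
    CodeSize: 1024,
    CodeSha256: "abc123=",
  },
};
console.log(layerDownloadInfo(sampleResponse));

// Real usage would look like this (placeholder ARN):
// describeLayer("arn:aws:lambda:us-east-1:123456789012:layer:my-layer:1");
```

The size is available without downloading anything. Listing the files inside the layer still means fetching the zip from the returned link and inspecting it.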
I've built a small tool that helps you check the binaries included in a layer, as well as its size and permissions: describeawslayer.com.
Convenience kills reproducibility
This is kinda the motto behind Docker Hub's registry success, right? I know AWS SAR exists, but it's not necessarily targeted at dependencies. It feels like there's still room for a layer manager (or layer registry?) to fill the gap of reusing existing layers. I'm aware that there's also awesome-lambda-layers, but it falls short of showing whether those layers are actively used, or even of listing an ARN for every item. I trust that a tool like this could speed up the development lifecycle and would expand the notion of what we think is suited for serverless even further.
If you used a different tool set for troubleshooting your binary within AWS Lambda or if you encountered different problems I would love to hear from you in the comments section.
Top comments (3)
Thanks for the write-up. Did you consider using the AWS SAM/LambCI "build" variant containers to avoid installing developer tools? Everyone has this and it follows the "lambci/lambda:build-" naming convention. SAM uses these build variants too. I make heavy use of them not only for building layers (github.com/customink/ruby-vips-lambda) but for local development/test too.
Also, have you seen Michael Hart's yumda project (github.com/lambci/yumda) on LambCI? In some cases it can be useful. I tend to put layer building into 3 buckets.
Hope that helps!
Thanks for reading! I did know about the build images from lambci, but I wanted to illustrate that in some cases you won't get all the necessary tools and that you can still craft your way to a build image that suits your needs.
It's the first time I've read about yumda, but it sounds like exactly what I was after originally. I had a similar experience with what you mention about not using layers for code: it was kind of a pain and it limits your tree-shaking possibilities in real-world projects. It's a super interesting conversation on when it's worth using a layer, and I think binaries are one of those use cases.
I came across the need to use GraphViz in Lambda.
Successfully did it: schemaviz.surge.sh/
There is a working Lambda layer on github: github.com/Nummulith/SchemaViz
I'm happy to answer any questions or receive feedback.