Lambda Internals: Why AWS Lambda Will Not Help With Machine Learning

#serverless #aws #lambda #cloudcomputing

Explore the constraints in the use of AWS Lambda in machine learning, and discover the capabilities of Cloudflare for GPU-accelerated serverless computing.

AWS Lambda, built atop Firecracker, offers a lightweight, secure, and efficient serverless computing environment. I'm very passionate about this technology but there is one caveat that we need to understand - it can't use GPU. In today’s short article, I invite you to continue our dive into the world of Lambda we started recently, and I will explain to you why the use of Lambda in Machine Learning is limited.

What’s Wrong with Firecracker?

First, let’s see how Firecracker works on the inside.

Firecracker's architecture uniquely supports memory oversubscription, allowing it to allocate more virtual memory to VMs than physically available, enhancing its efficiency.

How is it possible to allocate more virtual memory to VMs than is physically available? This cool feature of Firecracker, called memory oversubscription, leverages the fact that not all applications use their maximum allocated memory simultaneously. By monitoring usage patterns, Firecracker dynamically allocates physical memory among VMs based on current demand, increasing the efficiency and density of workloads.

This strategy allows for a high number of microVMs to run concurrently on a single host, optimizing resource utilization and reducing costs.

This architecture leverages microVMs for rapid scaling and high-density workloads. But does it work for GPU? The answer is no. You can look at the old 2019 GitHub issue and the comments to it to get the bigger picture of why it is so.

But to be short, memory oversubscription will not work with GPU because of the following reasons:

With current GPU hardware, performing device pass-through implies physical memory which would remove your memory oversubscription capabilities;
You can only run one customer workload securely per physical GPU, and switching takes too long.

Can We Use Lambda in Machine Learning at All?

AWS Lambda cannot directly handle GPU-intensive tasks like advanced machine learning, 3D rendering, or scientific simulations, but it can still play a crucial role. By managing lighter aspects of these workflows and coordinating with more powerful compute resources, Lambda serves as an effective orchestrator or intermediary, ensuring that heavy lifting is done where best suited, thus complementing the overall machine learning ecosystem.

This includes tasks such as initiating and managing data preprocessing jobs, coordinating interactions between different AWS services, handling API requests, and automating routine operational workflows, thereby optimizing the overall process efficiency.

For heavy-duty machine learning computations that require GPUs, Lambda can seamlessly integrate with AWS's more robust computing services like Amazon EC2 or Amazon SageMaker. This integration allows Lambda to delegate intensive tasks to these services, thereby playing a vital role in a distributed machine learning architecture. Lambda's serverless model also offers scalability and cost-efficiency, automatically adjusting resource allocation based on the workload, which is particularly beneficial for variable machine learning tasks.

As we can see, AWS Lambda may not execute the most computationally intensive tasks of a machine learning workflow, but its role is an orchestrator. Its scalability and its ability to integrate diverse services and resources make it an invaluable component of the machine learning ecosystem.

But I Want a Serverless GPU! Is It Impossible in All of the Worlds?

For those exploring serverless architectures that require GPU capabilities, it's essential to look beyond AWS Lambda to platforms designed with GPU support in mind. While AWS Lambda, a pioneer in serverless computing, does not directly offer GPU capabilities, the technological ecosystem is vast and diverse, offering other platforms that cater to this specific need.

If you want to explore serverless architectures that necessitate GPU support for tasks such as deep learning, video processing, or complex simulations, a notable example can be Cloudflare, as its recently presented WebGPU which supports Durable Objects, and it is built with modern cloud-native workloads in mind.

Unlike Lambda, the use of Durable Objects makes it possible to perform tasks such as:

Machine learning - implement ML applications like neural networks and computer vision algorithms using WebGPU compute shaders and matrices;
Scientific computing - perform complex scientific computation like physics simulations and mathematical modeling using the GPU;
High performance computing - unlock breakthrough performance for parallel workloads by connecting WebGPU to languages like Rust, C/C++ via WebAssembly.

What’s also important, the use of Durable Objects ensures memory and GPU access safety, and it also guarantees a reduced driver overhead and better memory management.

In essence, while the quest for serverless GPU computing may seem daunting within the confines of AWS Lambda's current capabilities, the broader technological ecosystem offers promising avenues. Through innovative platforms that embrace device pass-through and GPU support, the dream of a serverless GPU is becoming a reality for those willing to explore the cutting-edge of cloud computing technology.

Wrapping Up

Finalizing our today’s explanation, the quest for serverless GPU capabilities, while elusive within the constraints of the use of Lambda in Machine Learning, is far from a lost cause. The landscape of serverless computing is rich and varied, offering innovative platforms like Cloudflare that bridge the gap between the desire for serverless architectures and the necessity for GPU acceleration.

By leveraging the Cloudflare's cutting-edge Durable Objects technology, this platform offer a glimpse into the future of serverless computing where GPU resources are accessible and scalable, aligning perfectly with the needs of modern, resource-intensive applications.

As the serverless paradigm continues to evolve, it's clear that the limitations of today are merely stepping stones to the innovations of tomorrow. The journey towards a fully realized serverless GPU environment is not only possible but is already underway, promising a new era of efficiency, performance, and scalability in cloud-native applications.

Stay tuned with our special series on AWS Lambda, and feel free to contact us if you need professional cloud computing development services!