When working with Google Vertex AI's API, one of the challenges developers face is its reliance on access tokens that are valid for only 3600 seconds (or 1 hour). Unlike other APIs that support long-lived API tokens, Vertex AI requires you to generate and refresh these short-lived access tokens. This introduces additional complexity when deploying applications, especially in containerized environments where stateless operations are preferred.
In this blog post, we'll explore how to integrate Google Vertex AI's authentication requirements into an Elixir application running inside a Docker container on Fly.io. We'll also discuss how to use s6-overlay for process orchestration: running database migrations, setting up the gcloud CLI, and keeping the container stateless by avoiding hardcoded credentials like service account key files.
To authenticate with Google Vertex AI, you need to generate an access token using the gcloud CLI. This involves setting up a service account, obtaining a JSON key file, and authenticating the CLI. In a containerized environment, this setup must be automated and integrated seamlessly into your application's lifecycle.
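At its simplest, the token flow looks like this. This is a sketch, not the exact script from my setup: it assumes the gcloud CLI is on the PATH and that KEY_FILE (a placeholder name) points at a service account JSON key with access to Vertex AI.

```shell
#!/bin/sh
# Sketch: minting a short-lived Vertex AI access token with the gcloud CLI.
# KEY_FILE is an assumed path; use your own service account key location.
KEY_FILE=${KEY_FILE:-/tmp/sa-key.json}

fetch_token() {
  # Authenticate the CLI with the service account, then print a token
  # that is valid for roughly 3600 seconds.
  gcloud auth activate-service-account --key-file="$KEY_FILE" &&
    gcloud auth print-access-token
}

if command -v gcloud >/dev/null 2>&1 && [ -f "$KEY_FILE" ]; then
  TOKEN=$(fetch_token)
  echo "token length: ${#TOKEN}"
else
  echo "gcloud or key file unavailable; skipping"
fi
```

Because the token expires after an hour, something in the container has to be able to re-run this flow, which is exactly why the CLI setup needs to be part of the container's lifecycle rather than a one-time manual step.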
Additionally, when deploying to platforms like Fly.io, there are specific constraints:
Fly.io Deployment Flow: After a successful deployment, Fly.io runs migration commands as part of the release process.
Firecracker VMs: Fly.io runs containers as lightweight virtual machines (Firecracker VMs), which can interfere with traditional ENTRYPOINT-based process setups.
These challenges necessitate a robust solution for managing processes within the container, ensuring that:
Database migrations run before the app starts.
The gcloud CLI is properly configured during the app's lifecycle.
The container remains stateless, avoiding hardcoded credentials.
For container process orchestration, I’ve come to rely on s6-overlay. It’s a lightweight, powerful tool that allows you to manage multiple processes in a container, define dependencies between them, and ensure they start and stop in the correct order. It’s perfect for scenarios where you need to run one-off tasks (like database migrations) alongside long-running processes (like an Elixir app).
In this case, I needed to set up a dependency graph like this:
migration → Elixir app → gcloud
Migration: A one-shot task that runs database migrations.
Elixir app: The main application that runs continuously.
gcloud: A one-shot task that sets up the gcloud CLI and generates the access token.
Both the migration and gcloud tasks are one-shot processes—they only need to run once per container lifecycle. The Elixir app, however, is a long-running process. If gcloud fails, it only affects the function that relies on the Google Vertex API, allowing the rest of the app to continue running.
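With s6-overlay v3, a dependency graph like this is declared as s6-rc service definitions: each service is a directory containing a type file (oneshot or longrun) and a dependencies.d directory naming what it waits for. Here's a minimal sketch; the service names (migrate, app, gcloud-setup) are illustrative, and ROOT defaults to a scratch directory so the script can run anywhere, whereas in the container it would be /etc/s6-overlay/s6-rc.d.

```shell
#!/bin/sh
# Sketch: declaring the migration -> app -> gcloud chain as s6-rc services.
ROOT=${ROOT:-./s6-rc.d}

mk() {  # mk <name> <type> [dependency...]
  name=$1; type=$2; shift 2
  mkdir -p "$ROOT/$name/dependencies.d"
  printf '%s\n' "$type" > "$ROOT/$name/type"
  for dep in "$@"; do : > "$ROOT/$name/dependencies.d/$dep"; done
}

mk migrate oneshot            # one-shot: run database migrations
mk app longrun migrate        # long-running Elixir app, waits for migrate
mk gcloud-setup oneshot app   # one-shot: gcloud auth, runs once app is up

# Register the top-level services in the default "user" bundle.
mkdir -p "$ROOT/user/contents.d"
: > "$ROOT/user/contents.d/app"
: > "$ROOT/user/contents.d/gcloud-setup"
```

Each oneshot additionally needs an up script and each longrun a run script containing the actual command to execute; those are omitted here for brevity.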
Deploying to Fly.io added another layer of complexity. Fly.io uses Firecracker VMs to run containers, and this introduced an unexpected issue: s6-overlay wouldn't start as usual, since its /init is designed to run as PID 1 and Fly.io's own init occupies that slot. After some digging, I found a GitHub gist where someone had faced a similar problem and shared a solution: wrap /init in unshare so it runs as PID 1 inside its own PID namespace, with a freshly mounted /proc. With a few tweaks, I was able to adapt their approach to get s6-overlay working in the Fly.io environment.
ENTRYPOINT [ \
"unshare", "--pid", "--fork", "--kill-child=SIGTERM", "--mount-proc", \
"perl", "-e", "$SIG{INT}=''; $SIG{TERM}=''; exec @ARGV;", "--", \
"/init" ]
One of the key benefits of using s6-overlay was the ability to keep the container stateless. Instead of copying the gcloud keyfile into the container, I passed it as an environment variable. The s6-overlay setup script then used this variable to authenticate the gcloud CLI. This approach not only simplified the container setup but also improved security by avoiding hardcoding sensitive credentials into the container image.
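Here's a sketch of what that setup script can look like. The variable name GCLOUD_KEYFILE_JSON and the temporary path are my own choices for illustration, not necessarily the ones from the original setup; the assumption is that the key's JSON content is set as a Fly.io secret and arrives in the container as an environment variable.

```shell
#!/bin/sh
# Sketch of the gcloud one-shot setup script: materialize the service
# account key from an environment variable instead of baking it into
# the image. GCLOUD_KEYFILE_JSON is an assumed variable name.
KEY_FILE=/tmp/gcloud-key.json

umask 077                                         # key readable by this user only
printf '%s' "${GCLOUD_KEYFILE_JSON:-}" > "$KEY_FILE"

if command -v gcloud >/dev/null 2>&1; then
  gcloud auth activate-service-account --key-file="$KEY_FILE"
fi

rm -f "$KEY_FILE"                                 # no key file left on disk
```

Because activate-service-account copies the credentials into gcloud's own configuration directory, the temporary key file can be deleted immediately after authentication, so no sensitive file persists at a well-known path.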
Here’s a high-level overview of the final setup:
Service Account Keyfile: Stored as an environment variable in Fly.io.
s6-overlay: Used to orchestrate the processes and manage dependencies.
Migration Script: Run as a one-shot task before the Elixir app starts.
gcloud Setup: Run as a one-shot task to authenticate and generate the access token.
Elixir App: The main application, which starts after the migration task completes; the gcloud setup task runs once the app is up.
The folder structure for these services follows s6-overlay's s6-rc layout.
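As a sketch, a typical layout for the three services might look like this. The paths follow s6-overlay v3 conventions; the service names and the Elixir release commands in the comments are assumptions for illustration.

```
/etc/s6-overlay/s6-rc.d/
├── migrate/
│   ├── type              # "oneshot"
│   └── up                # e.g. /app/bin/my_app eval "MyApp.Release.migrate"
├── app/
│   ├── type              # "longrun"
│   ├── run               # execs the Elixir release, e.g. /app/bin/my_app start
│   └── dependencies.d/
│       └── migrate
├── gcloud-setup/
│   ├── type              # "oneshot"
│   ├── up                # authenticates gcloud from the env keyfile
│   └── dependencies.d/
│       └── app
└── user/
    └── contents.d/
        ├── app
        └── gcloud-setup
```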
Integrating Google Vertex API into an Elixir app deployed on Fly.io was a challenging but rewarding experience. By leveraging s6-overlay for process orchestration and adapting to Fly.io’s unique environment, I was able to create a robust, stateless deployment pipeline. If you’re facing similar challenges, I hope this post provides some inspiration and guidance.