“Traditional” deployment patterns separate the application from its infrastructure: the operations team builds out the physical infrastructure, while the application development team builds and releases versions of the software. These two steps happen independently; the built application is then deployed onto a running server, often with great ceremony and a long list of sign-offs.
Lambda deployments turn this model on its head: when you create a Lambda, you must provide the code it will run. This is arguably the “cloud native” way to deploy, and is similar to building a Docker image or a “pre-baked” Amazon Machine Image.
But unlike traditional monolithic applications, Lambda-based applications tend to be composed of multiple Lambdas and supporting resources. The code for these Lambdas evolves at different rates, and it’s all too easy for team members to “step on” each other when deploying an update.
This post looks at several Lambda deployment patterns that can help you coordinate work in a team. Each of them has pros and cons, and only you can decide which is best for your team.
The New Traditional
In this approach, you deploy your infrastructure with a dummy Lambda body, and then use the aws lambda update-function-code CLI command to deploy your actual function code. I’ve used this approach in several of the examples for this site.
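The deploy step can be a thin wrapper around the AWS CLI. Here’s a minimal Python sketch that builds the update-function-code invocation; the function name and bundle path are placeholders for whatever your application actually uses.

```python
def deploy_code(function_name: str, zip_path: str) -> list[str]:
    """Build the AWS CLI invocation that replaces a function's code.

    The function name and zip path are illustrative; substitute your own.
    Pass the returned list to subprocess.run(..., check=True) to deploy
    (this requires AWS credentials with lambda:UpdateFunctionCode).
    """
    return [
        "aws", "lambda", "update-function-code",
        "--function-name", function_name,
        "--zip-file", f"fileb://{zip_path}",  # fileb:// sends raw binary
    ]
```

Because the script never touches Terraform or CloudFormation state, a developer running it can only affect their own function’s code.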
There’s one infrastructure (Terraform / CloudFormation) definition for the entire application. This is particularly important when you have resources such as SQS queues or RDS databases that exist independently of any of the Lambdas.
By using the command-line to deploy their code, developers don’t have to worry about accidentally damaging infrastructure, including other developers’ Lambdas.
Build tools, and the CI/CD pipelines that run them, are generally designed to produce (and optionally deploy) a single artifact.
Creating a “dummy” Lambda function can be surprisingly difficult. CloudFormation lets you specify an inline function for NodeJS or Python, but not for any compiled language (confusingly, the configuration property is named ZipFile, but its value is neither ZIPped nor a file). To use a compiled language with CloudFormation, you’ll need to store the dummy implementation on S3. With Terraform, you have more flexibility, because you can provide a local file.
You also have to craft the “dummy” Lambda to behave appropriately. For example, if you’re implementing a web-app endpoint you’ll want to return a 404 or 501 status. If you’re reading from a data source such as Kinesis or SQS, you need to decide whether the Lambda should indicate success (which means that the events will be consumed), or throw an exception (which will prevent the events from being consumed, but may cause them to end up on a dead-letter queue).
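For the web-app case, the dummy body can be a one-function stub. This sketch assumes an API Gateway proxy integration (the statusCode/headers/body response shape); adjust for whatever event source you actually use.

```python
# Placeholder handler: it satisfies the infrastructure deployment, but
# tells callers the real code hasn't been deployed here yet.
def handler(event, context):
    return {
        "statusCode": 501,  # "Not Implemented" -- real code deploys separately
        "headers": {"Content-Type": "text/plain"},
        "body": "Not implemented",
    }
```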
Replicating your deployment means running the infrastructure scripts and the individual deployment scripts.
Micro-deployments
In this approach, each Lambda has its own build/deploy script, and a full application deployment requires running all of those individual scripts. This is the approach embodied in Amazon’s Serverless Application Model (SAM), but it can also be implemented easily with Terraform or CloudFormation.
Each developer is completely responsible for both infrastructure and code. The work that one developer does will not affect anyone working on a separate part of the application.
Unwieldy in larger applications, often requiring that all developers coordinate a major release. This is especially an issue when a new version of a Lambda expects or emits data that doesn’t match what its collaborators produce or consume.
These scripts may “drift” over time, as one set of developers adopts practices that others don’t.
This approach also limits your ability to share resources such as security groups or IAM policies.
There are always components — databases being the most obvious — that don’t “belong” to any single Lambda. These require their own infrastructure scripts, which can impact all team members if they introduce unexpected changes.
S3 Repository
In this approach, you separate build from deployment: build scripts upload your Lambda deployment bundles to S3, and deployment scripts reference those bundles. The bundles are named using a scheme that incorporates the application name, component, and version. This approach is similar to that used for deploying Docker images, and appears to be (based on tooling) the one that the Lambda development team had in mind.
All Lambdas must be built and deployed to S3 before the infrastructure can be deployed (although, see below, partial deployments are possible).
The artifacts remain available until you delete them from the bucket. This means that you can easily replicate the deployment.
Requires a consistent naming strategy. This is, to be honest, a solved problem: the “group / artifact / version” scheme of Maven would work, as would the “bundle / version” scheme from Docker. How that translates into an S3 location is up to you, but I intentionally used slashes in the previous sentence.
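To make that concrete, here’s one possible key-generation function following the group/artifact/version idea. The layout (and the example application and component names) are assumptions, not a standard; the only real requirement is that your deploy scripts can reconstruct the key deterministically.

```python
def artifact_key(application: str, component: str, version: str) -> str:
    """One possible S3 key layout for deployment bundles, loosely
    modeled on Maven's group/artifact/version convention."""
    return f"{application}/{component}/{version}/{component}-{version}.zip"
```

So a build of version 1.4.0 of a hypothetical “dispatcher” Lambda in an “order-service” application would land at order-service/dispatcher/1.4.0/dispatcher-1.4.0.zip, and the infrastructure script would reference exactly that key.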
Requires a dedicated S3 bucket, and some thought about how it should be configured. Do you enable object lock or versioning, to avoid losing artifacts? Do you permit users to overwrite artifacts? The latter is extremely useful during development, but something you want to avoid for production releases. You should also give thought to who has permission to write to the bucket: all of your developers, or just the CI/CD pipeline?
Not to belabor the point, but S3 is a really clunky way to manage artifacts, because the rules for accessing the bucket are entirely in the hands of users. Repository managers, including Amazon’s CodeArtifact, already provide a solution to this problem but can’t be used to deploy Lambdas. I can only hope that the tooling (CloudFormation and Terraform) evolves to use them.
At present, you must keep your S3 bucket in the same region as the deployed Lambda. For a multi-region deployment, you need to replicate the bucket.
The shared infrastructure script means that developers can still step on each other, changing non-Lambda components that other Lambdas depend upon.
The guarantee that all Lambdas must be built and available before the deployment can run doesn’t hold for Terraform: it will update each Lambda whose deployment bundle exists in S3, then fail when it discovers one whose bundle is missing, leaving the deployment partially applied. CloudFormation is better in this respect: it will roll back the entire update if it can’t find a deployment bundle.
Whichever approach you choose, use an Alias
However you deploy your Lambdas, you don’t want to blindly overwrite whatever is already running. Well, maybe you’re OK with that in development, but definitely not in production. One way to avoid mistakes is a Lambda Alias.
By default, Lambda functions aren’t versioned. More correctly, they are always at the “LATEST” version, and when you update the function’s code, the new deployment becomes “LATEST” and the previous deployment is lost. However, you can publish the function, creating a numbered version that will be retained until you explicitly delete it.
All well and good, but what about aliases? The answer is that an alias refers to a specific Lambda version, and anything that refers to a Lambda function directly can refer to an alias instead. Publishing a new version of the function doesn’t change the alias, so anything that references the alias still uses the old function. When you’re ready, you update the alias to point to the new version.
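A release under this scheme is two CLI calls: publish the current code as a numbered version, then repoint the alias at it. This sketch builds those invocations; the function and alias names are placeholders, and in a real script you would parse the Version field from publish-version’s JSON output rather than the "NEW_VERSION" stand-in used here.

```python
def publish_and_promote(function_name: str, alias: str) -> list[list[str]]:
    """Build the two AWS CLI calls for a versioned release.

    "NEW_VERSION" is a placeholder: capture the Version field from the
    publish-version output and substitute it before running the second
    command (e.g. via subprocess.run with captured stdout).
    """
    publish = [
        "aws", "lambda", "publish-version",
        "--function-name", function_name,
    ]
    promote = [
        "aws", "lambda", "update-alias",
        "--function-name", function_name,
        "--name", alias,
        "--function-version", "NEW_VERSION",
    ]
    return [publish, promote]
```

Until the second command runs, everything that references the alias keeps invoking the old version, which is exactly the safety property you want.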
As I said at the top, there’s not one right answer. I’m currently working on a project that started out as one big deployment, was broken into micro-deployments, and I think may evolve into using an S3 repository. The goal should be to find something that works for all of your developers, or at least is least-disliked by all of your developers (because finding two developers that agree on deployment practices is next to impossible).