AWS Step Functions blue-green deployment with Serverless Framework

By Daniel Aniszkiewicz · 5 March, 2022

Introduction

AWS Step Functions is a very powerful tool for orchestrating services in the AWS cloud. It makes complex workflows much easier to run, especially when a workflow has many steps. It can also wait, either for a given number of seconds, minutes, or hours, or until a specific date (for at most one year).

With a single Lambda function this would not be possible, because its maximum duration is 15 minutes.

But today, we will focus on blue-green deployment with Step Functions.

General content

To be on the same page, let's compare Step Functions to OOP. The image below illustrates it well:


[Image: Step Functions compared to OOP]

Let us focus more on the execution part. Let's assume that we have a simple state machine in AWS Step Functions with a simple workflow: a Wait state followed by a Lambda function.
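As a minimal sketch, a workflow like this might be declared in `serverless.yml` roughly as follows, assuming the serverless-step-functions plugin. The names `hello`, `myStateMachine`, `WaitABit`, and `RunLambda`, the handler path, and the 60-second wait are placeholders for illustration only.

```yaml
# serverless.yml (fragment; service and provider sections omitted)
plugins:
  - serverless-step-functions

functions:
  hello:                      # hypothetical Lambda function
    handler: handler.hello

stepFunctions:
  stateMachines:
    myStateMachine:           # hypothetical state machine name
      definition:
        StartAt: WaitABit
        States:
          WaitABit:
            Type: Wait
            Seconds: 60       # a Wait state can also take a Timestamp
            Next: RunLambda
          RunLambda:
            Type: Task
            Resource:
              Fn::GetAtt: [hello, Arn]   # ARN of the function above
            End: true
```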


[Image: the simple state machine]


This works in production. However, we would like to add more steps to the workflow above, for example a Choice state that, depending on the result of the Lambda, moves on to the next step, invokes some other AWS service, or passes some additional parameters.


If we deploy such changes while executions of the state machine are still in progress, it is highly likely that those executions will fail, e.g. because something changed in the Lambda or some parameter is missing. We need to ensure that existing executions always run against the correct versions of our code.
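To make this concrete, here is a rough sketch of what an extended definition could look like. Only the `definition` part of the state machine is shown. The `$.status` field, the `DONE` value, and all state names are hypothetical, and `hello` stands for a Lambda function declared under `functions`.

```yaml
# Extended definition (illustrative only)
definition:
  StartAt: WaitABit
  States:
    WaitABit:
      Type: Wait
      Seconds: 60
      Next: RunLambda
    RunLambda:
      Type: Task
      Resource:
        Fn::GetAtt: [hello, Arn]
      Next: CheckResult
    CheckResult:
      Type: Choice
      Choices:
        # branch on a field assumed to be present in the Lambda output
        - Variable: "$.status"
          StringEquals: DONE
          Next: Finish
      Default: AddParameters
    AddParameters:
      Type: Pass
      Result:
        retried: true         # extra parameters passed along
      End: true
    Finish:
      Type: Succeed
```

The new Choice state expects the Lambda to return a `status` field, so the function code and the workflow now have to change together.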


Remember - State machine executions are immutable and Lambda versions are immutable!

To avoid broken invocations in the future, we need to reference the exact versions of the functions.


If you use the Serverless Framework to create state machines with the Step Functions plugin, this is very easy to do. You simply need to add the following to the definition of the state machine:
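The snippet is not shown at this point, so what follows is a minimal sketch rather than the original code. It assumes the serverless-step-functions plugin and its `useExactVersion` option. The state machine name `myStateMachine` and the function `hello` are placeholders.

```yaml
# serverless.yml (fragment)
stepFunctions:
  stateMachines:
    myStateMachine:            # hypothetical name
      useExactVersion: true    # reference exact Lambda versions
      definition:
        StartAt: RunLambda
        States:
          RunLambda:
            Type: Task
            Resource:
              Fn::GetAtt: [hello, Arn]   # hypothetical function
            End: true
```

With this option set, the state machine should point at a specific published version of each function instead of whatever code is latest, so executions that are already running keep calling the version they started with.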




And done!


Summary

As you can see, it is very simple to add blue-green deployment to Step Functions if you use the Serverless Framework with the Step Functions plugin.


If we didn't do this, changing state machine definitions in AWS Step Functions would cause problems for existing executions of our state machine. From now on we can sleep better at night :).