...using Step Function Variables and JSONata
Paginated results exist everywhere, whether you are fetching data from an API or iterating through database queries. Traditionally, it has been very difficult to deal with paginated results natively within your AWS Step Function workflows. The primary reason is that there was no out-of-the-box way to keep the state of the results and then append to it. That all changed when AWS announced support for Step Function variables and JSONata. This made many people (myself included) very happy.
Introduction
The code for this article can be found here:
https://github.com/Crockwell-Solutions/cdk-stepfunctions-iterator
This CDK / TypeScript project gets you up and running with all the resources you need to iterate through some example records in DynamoDb using Step Functions. Also included is a seeding function to populate your DynamoDb table with 100 sample records to iterate through.
It is worth noting that, at the time of writing this blog, the CDK does not support JSONata-based Step Functions using the native fromChainable method. JSONata is supported in CDK when using ASL and the fromFile method.
Step Function Variables and JSONata
No more "Wait, where did my data go?" moments with ResultPath, or hours spent manipulating data in the Data Flow Simulator only to conclude that it is just not possible to add data to the top level of the JSON object with OutputPath. JSONata support has completely eliminated the need for these fields. It is just Arguments (input), Output and, optionally, Variables (set with Assign, see below). JSONata is a query and transformation language for JSON data. More information is available at the official guide and a cool simulator.
When a string in the value of a Step Function field, a JSON object field, or a JSON array element is surrounded by {% %} characters, that string will be evaluated as JSONata. It is that simple. JSONata supports basic operations such as selecting attributes and complex operations such as sorting and grouping.
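As a rough illustration of the {% %} rule, the sketch below writes a JSONata-mode Pass state as a TypeScript object. The input shape (a customerId and an orders array with total fields), the state itself and the isJsonataExpression helper are all invented for this example and are not part of the example project.

```typescript
// Hypothetical Pass state in JSONata mode. The input shape is assumed to be
// { customerId: string, orders: [{ total: number }, ...] }.
const summariseOrders = {
  Type: "Pass",
  QueryLanguage: "JSONata",
  Output: {
    // Selecting an attribute from the state input
    customerId: "{% $states.input.customerId %}",
    // Aggregating with built-in JSONata functions
    orderCount: "{% $count($states.input.orders) %}",
    orderValue: "{% $sum($states.input.orders.total) %}",
    // Sorting with the JSONata order-by operator
    ordersByTotal: "{% $states.input.orders^(total) %}",
    // No delimiters, so this value is passed through as a literal string
    source: "order-service",
  },
  End: true,
};

// Hypothetical helper mirroring the rule: a string counts as JSONata only
// when it starts with {% and ends with %}.
function isJsonataExpression(value: string): boolean {
  return (
    value.length >= 4 &&
    value.slice(0, 2) === "{%" &&
    value.slice(-2) === "%}"
  );
}
```

Everything between the delimiters is ordinary JSONata, so an expression can be tried out in a JSONata simulator before it goes into a state definition.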
Step Function variables complement JSONata support by allowing variables to be assigned for the whole Step Function rather than just for a single task. This allows you to set and read Step Function Variables across tasks.
We will use both of these new bits of functionality within our Step Function Iterator.
The Architecture
We are starting with a simple Step Function that will query DynamoDb. If there is a nextToken provided in the response, it will continue to query and append to the Step Function Variable $results. Before Step Function Variables, a workflow this simple was not possible.
Although we are focusing on DynamoDb as an example, this pattern applies equally to other paginated systems, such as APIs that have a limit and page option.
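To make the limit and page variant concrete, here is a minimal plain TypeScript sketch of the same fetch-and-append loop. The Page shape, the fetchAll function and the in-memory makeFakeApi stand-in are hypothetical names made up for this example. A real HTTP client would be asynchronous; the sketch is kept synchronous so the loop stays easy to follow.

```typescript
// One page of results from a hypothetical limit/page API.
interface Page<T> {
  items: T[];
  hasMore: boolean;
}

// Hypothetical page fetcher: 1-based page number plus page size.
type FetchPage<T> = (page: number, limit: number) => Page<T>;

// Keep requesting pages and appending to `results` until no pages remain.
function fetchAll<T>(fetchPage: FetchPage<T>, limit: number): T[] {
  let results: T[] = [];
  let page = 1;
  while (true) {
    const response = fetchPage(page, limit);
    // Plays the same role as $append($results, ...) in the Step Function
    results = results.concat(response.items);
    if (!response.hasMore) {
      return results;
    }
    page += 1;
  }
}

// In-memory stand-in for a real API holding `total` records.
function makeFakeApi(total: number) {
  const records: { id: number }[] = [];
  for (let i = 1; i <= total; i++) {
    records.push({ id: i });
  }
  let calls = 0;
  const fetchPage: FetchPage<{ id: number }> = (page, limit) => {
    calls += 1;
    const start = (page - 1) * limit;
    return {
      items: records.slice(start, start + limit),
      hasMore: start + limit < total,
    };
  };
  return { fetchPage, callCount: () => calls };
}
```

For example, fetchAll(makeFakeApi(100).fetchPage, 30) collects all 100 records over four calls: three full pages of 30 and a final page of 10.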
The key elements of the Step Function are the instantiation of the Step Function Variable in the Query DynamoDb Table task:
"Assign": {
"results": "{% $states.result.Items %}"
}
$states.result is the output from the task, used within a JSONata expression indicated by the {% %} encapsulation. We can then append to this array when we fetch the next page of results like this:
"Assign": {
"results": "{% $append($results, $states.result.Items) %}"
}
Hooray for JSONata support!
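Putting the two Assign blocks into context, an iterator definition could look roughly like the sketch below, written as a TypeScript object that can be serialised to an ASL file. This is not the definition from the repo. The table name, the state names other than Query DynamoDb Table, the lastKey variable and the use of DynamoDb's LastEvaluatedKey and ExclusiveStartKey as the pagination token are assumptions for illustration. The null fallback is there because, as far as I understand it, a JSONata expression that evaluates to undefined fails the state.

```typescript
// Shared query arguments. Table name and key attribute are assumptions.
const queryArguments = {
  TableName: "ExampleTable", // hypothetical table name
  KeyConditionExpression: "PK = :pk",
  ExpressionAttributeValues: { ":pk": { S: "ItemGroup" } },
  Limit: 30,
};

// Store the pagination key, or null when the last page has been reached.
const nextKeyExpression =
  "{% $exists($states.result.LastEvaluatedKey) ? $states.result.LastEvaluatedKey : null %}";

const iteratorDefinition = {
  QueryLanguage: "JSONata",
  StartAt: "Query DynamoDb Table",
  States: {
    "Query DynamoDb Table": {
      Type: "Task",
      Resource: "arn:aws:states:::aws-sdk:dynamodb:query",
      Arguments: queryArguments,
      Assign: {
        // First page: create the variable
        results: "{% $states.result.Items %}",
        lastKey: nextKeyExpression,
      },
      Next: "More Results?",
    },
    "More Results?": {
      Type: "Choice",
      // Loop while the previous query returned a pagination key
      Choices: [{ Condition: "{% $lastKey != null %}", Next: "Query Next Page" }],
      Default: "Return Results",
    },
    "Query Next Page": {
      Type: "Task",
      Resource: "arn:aws:states:::aws-sdk:dynamodb:query",
      Arguments: { ...queryArguments, ExclusiveStartKey: "{% $lastKey %}" },
      Assign: {
        // Later pages: append to the existing variable
        results: "{% $append($results, $states.result.Items) %}",
        lastKey: nextKeyExpression,
      },
      Next: "More Results?",
    },
    "Return Results": {
      Type: "Pass",
      Output: { results: "{% $results %}" },
      End: true,
    },
  },
};

// Serialise to JSON, for example to write a definition file for fromFile.
const iteratorAsl = JSON.stringify(iteratorDefinition, null, 2);
```

Keeping the first query and the follow-up query as separate states means the first can create the variable while the second appends to it, which matches the two Assign blocks shown above.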
Getting Started
To get started with the example project, clone the repo and deploy following the steps in the README. When you invoke the SeedDynamoDbFunction, it will create 100 records in the DynamoDb table that we can iterate through.
If we wanted to query for the set of results where the PK is ItemGroup, this could lead to paginated results (depending on the total size of the response).
You can see from the output that the Step Function Variable results now contains an array with the 100 DynamoDb results. The results were fetched across 4 queries, where the limit of 30 was applied to each query. Winner!
Next Steps
This simple pattern can be used to iterate results from alternative data sources. Prior to Step Function Variables and JSONata support, there was either the need for very complex task management, or more likely, you simply invoked a Lambda function to handle the logic. Now with native support, this is an effective way to get a complete result set in your Step Function workflow.
About Me
I'm Ian, an AWS Serverless Specialist and AWS Certified Cloud Architect based in the UK. I work as an independent consultant, having worked across multiple sectors, with a passion for Aviation.
Let's connect on LinkedIn, or find out more about my work at Crockwell Solutions.
Top comments (1)
Be aware that the total size of all variables within a single Assign field cannot exceed 256KB.