Sidra Saleem for SUDO Consultants

Posted on • Originally published at sudoconsultants.com

Building Robust Serverless Workflows with AWS Step Functions: Long-Running Order Fulfillment with Retries and Human Approvals

Serverless computing lets developers focus on writing code rather than managing infrastructure. Stateful, long-running workflows are harder to build in that model, because individual functions are short-lived and keep no state between invocations. AWS Step Functions addresses this by holding the workflow state in a managed state machine. In this article, we will build a long-running order fulfillment workflow with Step Functions and Lambda functions that use the AWS SDK, and add retries, error handling, and a human approval step.

1. Introduction to AWS Step Functions

AWS Step Functions is a fully managed service that makes it easy to coordinate the components of distributed applications and microservices using visual workflows. Step Functions allows you to design and run workflows that integrate with AWS services, such as Lambda, SNS, SQS, and more, as well as custom applications using the AWS SDK.

Step Functions uses state machines to define workflows. A state machine is a collection of states that can perform work, make decisions, and control the flow of the workflow. Each state in the state machine can perform a specific task, such as invoking a Lambda function, waiting for a certain amount of time, or making a decision based on input data.
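To make the idea of a state machine concrete, the following minimal sketch treats a definition as plain data and follows its Next pointers from StartAt. The two-state definition and the linearPath helper are illustrative assumptions, not part of Step Functions, and the helper only handles a straight-line workflow (no Choice or Parallel states).

```javascript
// Hypothetical two-step definition in the shape Step Functions expects.
const definition = {
  StartAt: 'ValidateOrder',
  States: {
    ValidateOrder: { Type: 'Task', Next: 'ShipOrder' },
    ShipOrder: { Type: 'Task', End: true },
  },
};

// Follow Next pointers from StartAt until a state ends the workflow.
function linearPath(def) {
  const path = [];
  const seen = new Set();
  let name = def.StartAt;
  while (name !== undefined) {
    if (seen.has(name)) throw new Error(`Cycle detected at state ${name}`);
    const state = def.States[name];
    if (!state) throw new Error(`Unknown state ${name}`);
    seen.add(name);
    path.push(name);
    // A state with "End": true (or no Next at all) stops the walk.
    name = state.End ? undefined : state.Next;
  }
  return path;
}

console.log(linearPath(definition).join(' -> '));
```

Running the same helper over a larger definition is a quick way to confirm the order of states before creating the state machine.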

2. Overview of the Order Fulfillment Workflow

The order fulfillment workflow we will design is a long-running process that involves several steps:

  1. Order Received: The workflow starts when an order is received.
  2. Validate Order: The order is validated to ensure all required information is present.
  3. Process Payment: The payment for the order is processed.
  4. Check Inventory: The inventory is checked to ensure the ordered items are in stock.
  5. Human Approval: If the order exceeds a certain amount, it requires human approval.
  6. Ship Order: The order is shipped to the customer.
  7. Send Confirmation: A confirmation email is sent to the customer.

Throughout the workflow, we will implement retries and error handling to ensure the process is robust and can handle failures gracefully.
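The conditional approval in step 5 maps naturally onto a Choice state. The sketch below builds one as a JavaScript object; the $.order.total field, the threshold, and the target state names are assumptions for illustration, since the sample input used later in this article carries no order total. The nextState helper is only a local stand-in for how the rule would be evaluated.

```javascript
// Build a Choice state that routes large orders to approval.
// "$.order.total" and the state names are assumed for this sketch.
function approvalChoice(threshold) {
  return {
    Type: 'Choice',
    Choices: [
      {
        Variable: '$.order.total',
        NumericGreaterThan: threshold,
        Next: 'HumanApproval',
      },
    ],
    Default: 'ShipOrder',
  };
}

// Local approximation of the rule, useful for quick checks only.
function nextState(choice, input) {
  const rule = choice.Choices.find(
    (c) => input.order.total > c.NumericGreaterThan
  );
  return rule ? rule.Next : choice.Default;
}

const choice = approvalChoice(1000);
console.log(JSON.stringify(choice, null, 2));
console.log(nextState(choice, { order: { total: 1500 } }));
```

Orders at or below the threshold fall through to the Default branch and skip the approval step.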

3. Setting Up the AWS Environment

Before we start designing the workflow, we need to set up the AWS environment. This includes creating an IAM role for Step Functions, setting up Lambda functions, and configuring other AWS services that will be used in the workflow.

3.1 Creating an IAM Role for Step Functions

Step Functions needs permissions to invoke AWS services, such as Lambda, SNS, and others. We will create an IAM role with the necessary permissions.

AWS Console Steps:

  1. Open the IAM console.
  2. In the navigation pane, choose Roles, then choose Create role.
  3. Choose Step Functions as the service that will use this role.
  4. Attach the following policies:
    • AWSLambda_FullAccess
    • AmazonSNSFullAccess
    • AmazonDynamoDBFullAccess
  5. Review the role and give it a name, such as StepFunctionsExecutionRole.
  6. Choose Create role.

CLI Steps:

aws iam create-role --role-name StepFunctionsExecutionRole --assume-role-policy-document file://trust-policy.json
aws iam attach-role-policy --role-name StepFunctionsExecutionRole --policy-arn arn:aws:iam::aws:policy/AWSLambda_FullAccess
aws iam attach-role-policy --role-name StepFunctionsExecutionRole --policy-arn arn:aws:iam::aws:policy/AmazonSNSFullAccess
aws iam attach-role-policy --role-name StepFunctionsExecutionRole --policy-arn arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess

The trust-policy.json file should contain:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "states.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
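The managed FullAccess policies grant far more than this workflow uses: the state machine itself only invokes the five Lambda functions, while SNS and DynamoDB are called from inside those functions under their own execution role. A narrower policy for the Step Functions role could be generated as sketched below. The helper name invokePolicy is hypothetical, and the region and account ID are the placeholder values used elsewhere in this article.

```javascript
const FUNCTIONS = [
  'ValidateOrder',
  'ProcessPayment',
  'CheckInventory',
  'ShipOrder',
  'SendConfirmation',
];

// Build an IAM policy document that allows invoking only the named functions.
function invokePolicy(region, accountId, names) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: 'lambda:InvokeFunction',
        Resource: names.map(
          (name) => `arn:aws:lambda:${region}:${accountId}:function:${name}`
        ),
      },
    ],
  };
}

console.log(
  JSON.stringify(invokePolicy('us-east-1', '123456789012', FUNCTIONS), null, 2)
);
```

The printed document can be saved to a file and attached as an inline policy, for example with aws iam put-role-policy, in place of the three managed policies.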

3.2 Setting Up Lambda Functions

We will create several Lambda functions that will be used in the workflow:

  1. ValidateOrder: Validates the order details.
  2. ProcessPayment: Processes the payment for the order.
  3. CheckInventory: Checks the inventory for the ordered items.
  4. ShipOrder: Ships the order to the customer.
  5. SendConfirmation: Sends a confirmation email to the customer.

AWS Console Steps:

  1. Open the Lambda console.
  2. Choose Create function.
  3. Choose Author from scratch.
  4. Give the function a name, such as ValidateOrder.
  5. Choose Create function.
  6. Repeat the steps for the other functions.

CLI Steps:

aws lambda create-function --function-name ValidateOrder --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://validate-order.zip
aws lambda create-function --function-name ProcessPayment --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://process-payment.zip
aws lambda create-function --function-name CheckInventory --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://check-inventory.zip
aws lambda create-function --function-name ShipOrder --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://ship-order.zip
aws lambda create-function --function-name SendConfirmation --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://send-confirmation.zip

3.3 Configuring Other AWS Services

We will also need to configure other AWS services, such as SNS for sending confirmation emails and DynamoDB for storing order information.

AWS Console Steps:

  1. Open the SNS console.
  2. Choose Create topic.
  3. Give the topic a name, such as OrderConfirmation.
  4. Choose Create topic.
  5. Open the DynamoDB console.
  6. Choose Create table.
  7. Give the table a name, such as Orders.
  8. Set the primary key to OrderId.
  9. Choose Create table.

CLI Steps:

aws sns create-topic --name OrderConfirmation
aws dynamodb create-table --table-name Orders --attribute-definitions AttributeName=OrderId,AttributeType=S --key-schema AttributeName=OrderId,KeyType=HASH --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

4. Designing the Step Functions State Machine

Now that the AWS environment is set up, we can design the Step Functions state machine for the order fulfillment workflow.

4.1 Defining the State Machine

The state machine will consist of the following states:

  1. OrderReceived: The starting state of the workflow.
  2. ValidateOrder: Invokes the ValidateOrder Lambda function.
  3. ProcessPayment: Invokes the ProcessPayment Lambda function.
  4. CheckInventory: Invokes the CheckInventory Lambda function.
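  5. HumanApproval: A Wait state that pauses the workflow for a fixed period (300 seconds) as a placeholder for the approval step. A Wait state resumes on its own and does not check whether anyone approved the order.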
  6. ShipOrder: Invokes the ShipOrder Lambda function.
  7. SendConfirmation: Invokes the SendConfirmation Lambda function.
  8. OrderCompleted: The final state of the workflow.

4.2 Creating the State Machine

AWS Console Steps:

  1. Open the Step Functions console.
  2. Choose Create state machine.
  3. Choose Design your workflow visually.
  4. Drag and drop the states onto the canvas and connect them in the correct order.
  5. Configure each state to invoke the corresponding Lambda function.
  6. Configure the HumanApproval state as a Wait state with a duration of 300 seconds.
  7. Review the state machine and give it a name, such as OrderFulfillmentWorkflow.
  8. Choose Create state machine.

CLI Steps:

aws stepfunctions create-state-machine --name OrderFulfillmentWorkflow --definition file://state-machine-definition.json --role-arn arn:aws:iam::123456789012:role/StepFunctionsExecutionRole

The state-machine-definition.json file should contain the definition below. Each state sets ResultPath so that its result is stored under its own key and the original order input is passed on to the next state. Without ResultPath, a state's result replaces its input, and the later functions would no longer find event.order.

{
  "Comment": "Order Fulfillment Workflow",
  "StartAt": "OrderReceived",
  "States": {
    "OrderReceived": {
      "Type": "Pass",
      "Result": "Order Received",
      "ResultPath": "$.status",
      "Next": "ValidateOrder"
    },
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ValidateOrder",
      "ResultPath": "$.validation",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "ResultPath": "$.payment",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:CheckInventory",
      "ResultPath": "$.inventory",
      "Next": "HumanApproval"
    },
    "HumanApproval": {
      "Type": "Wait",
      "Seconds": 300,
      "Next": "ShipOrder"
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ShipOrder",
      "ResultPath": "$.shipping",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:SendConfirmation",
      "End": true
    }
  }
}

5. Implementing the Workflow with AWS SDK Integrations

In this section, we will implement the Lambda functions that the workflow invokes. Each function uses the AWS SDK for JavaScript (v2, the aws-sdk package) on Node.js to call AWS services such as DynamoDB and SNS.

5.1 Implementing the ValidateOrder Lambda Function

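The ValidateOrder Lambda function validates the order details and stores the order in the Orders table. If the order is invalid, it throws an error so that Step Functions marks the task as failed.

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();

// Named error type: Step Functions matches Retry and Catch rules on the error name.
class InvalidOrderError extends Error {
  constructor(message) {
    super(message);
    this.name = 'InvalidOrderError';
  }
}

exports.handler = async (event) => {
  const order = event.order;

  // Throwing (instead of returning a 400-style object) fails the task,
  // so an invalid order never reaches the payment step.
  if (!order || !order.orderId || !order.customerId ||
      !Array.isArray(order.items) || order.items.length === 0) {
    throw new InvalidOrderError('Invalid order details');
  }

  const params = {
    TableName: 'Orders',
    // The Orders table is keyed on OrderId (capital O), so set it explicitly.
    Item: { ...order, OrderId: order.orderId },
  };

  // A DynamoDB failure rejects this promise, which also fails the task.
  await dynamoDb.put(params).promise();
  return { message: 'Order validated successfully' };
};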
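The validation rules are easier to test when they live in a pure function that does not touch DynamoDB. The following sketch shows one way to do that; findOrderProblems is a hypothetical helper, and the per-item checks go slightly beyond what the handler above verifies.

```javascript
// Return a list of human-readable problems; an empty list means the order is valid.
function findOrderProblems(order) {
  if (!order || typeof order !== 'object') return ['order is missing'];

  const problems = [];
  if (!order.orderId) problems.push('orderId is required');
  if (!order.customerId) problems.push('customerId is required');

  if (!Array.isArray(order.items) || order.items.length === 0) {
    problems.push('items must be a non-empty array');
  } else {
    order.items.forEach((item, i) => {
      if (!item.itemId) problems.push(`items[${i}].itemId is required`);
      if (!Number.isInteger(item.quantity) || item.quantity < 1) {
        problems.push(`items[${i}].quantity must be a positive integer`);
      }
    });
  }
  return problems;
}

const sample = {
  orderId: '12345',
  customerId: '67890',
  items: [{ itemId: 'item1', quantity: 2 }],
};
console.log(findOrderProblems(sample));
```

A handler could call this helper first and throw a single error that lists every problem, which tends to make failed executions easier to diagnose.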

5.2 Implementing the ProcessPayment Lambda Function

The ProcessPayment Lambda function will process the payment for the order.

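// Named error type: Step Functions matches Retry and Catch rules on the error name.
class PaymentFailedError extends Error {
  constructor(message) {
    super(message);
    this.name = 'PaymentFailedError';
  }
}

exports.handler = async (event) => {
  // A real implementation would charge the customer identified in event.order.

  // Simulate payment processing
  const paymentStatus = Math.random() > 0.5 ? 'SUCCESS' : 'FAILURE';

  if (paymentStatus !== 'SUCCESS') {
    // Throwing fails the task. Returning an error object would count as a
    // success, and a Retry rule on the state would never run.
    throw new PaymentFailedError('Payment processing failed');
  }

  return { message: 'Payment processed successfully' };
};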

5.3 Implementing the CheckInventory Lambda Function

The CheckInventory Lambda function will check the inventory for the ordered items.

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB.DocumentClient();

// Named error type so a Catch rule can single out stock shortages.
class InsufficientInventoryError extends Error {
  constructor(message) {
    super(message);
    this.name = 'InsufficientInventoryError';
  }
}

exports.handler = async (event) => {
  const order = event.order;

  // Check every line item, not only the first one.
  for (const item of order.items) {
    const params = {
      // Assumes an Inventory table keyed on itemId exists;
      // Section 3.3 creates only the Orders table.
      TableName: 'Inventory',
      Key: { itemId: item.itemId },
    };

    // A DynamoDB error rejects here and fails the task.
    const data = await dynamoDb.get(params).promise();

    // data.Item is undefined when the item is not in the table.
    if (!data.Item || data.Item.quantity < item.quantity) {
      throw new InsufficientInventoryError(
        `Insufficient inventory for item ${item.itemId}`
      );
    }
  }

  return { message: 'Inventory check successful' };
};

5.4 Implementing the ShipOrder Lambda Function

The ShipOrder Lambda function will ship the order to the customer.

// Named error type: Step Functions matches Retry and Catch rules on the error name.
class ShippingFailedError extends Error {
  constructor(message) {
    super(message);
    this.name = 'ShippingFailedError';
  }
}

exports.handler = async (event) => {
  // A real implementation would hand event.order to a shipping provider.

  // Simulate shipping process
  const shippingStatus = Math.random() > 0.5 ? 'SUCCESS' : 'FAILURE';

  if (shippingStatus !== 'SUCCESS') {
    // Throwing fails the task so the failure is visible to the state machine.
    throw new ShippingFailedError('Order shipping failed');
  }

  return { message: 'Order shipped successfully' };
};

5.5 Implementing the SendConfirmation Lambda Function

The SendConfirmation Lambda function will send a confirmation email to the customer.

const AWS = require('aws-sdk');
const sns = new AWS.SNS();

exports.handler = async (event) => {
  const order = event.order;

  const params = {
    Message: `Your order ${order.orderId} has been shipped.`,
    TopicArn: 'arn:aws:sns:us-east-1:123456789012:OrderConfirmation',
  };

  // If publishing fails, the rejected promise fails the task, so the error
  // shows up in the execution history instead of a returned status object.
  await sns.publish(params).promise();
  return { message: 'Confirmation email sent successfully' };
};

6. Adding Retries and Error Handling

In a long-running workflow, it's important to handle errors and retries gracefully. AWS Step Functions allows you to define retry and catch mechanisms for each state.

6.1 Adding Retries to the ProcessPayment State

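We will add retries to the ProcessPayment state to handle transient failures. A Retry rule applies only when the task fails, that is, when the Lambda function throws an error or times out. A function that returns an object describing an error still counts as a success and is not retried.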

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Edit the ProcessPayment state.
  4. Add a retry configuration with the following parameters:
    • Error Equals: States.ALL
    • Interval Seconds: 5
    • Max Attempts: 3
    • Backoff Rate: 2
  5. Save the changes.

CLI Steps:

Update the state-machine-definition.json file to include the retry configuration:

{
  "Comment": "Order Fulfillment Workflow",
  "StartAt": "OrderReceived",
  "States": {
    "OrderReceived": {
      "Type": "Pass",
      "Result": "Order Received",
      "ResultPath": "$.status",
      "Next": "ValidateOrder"
    },
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ValidateOrder",
      "ResultPath": "$.validation",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "Retry": [
        {
          "ErrorEquals": ["States.ALL"],
          "IntervalSeconds": 5,
          "MaxAttempts": 3,
          "BackoffRate": 2
        }
      ],
      "ResultPath": "$.payment",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:CheckInventory",
      "ResultPath": "$.inventory",
      "Next": "HumanApproval"
    },
    "HumanApproval": {
      "Type": "Wait",
      "Seconds": 300,
      "Next": "ShipOrder"
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ShipOrder",
      "ResultPath": "$.shipping",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:SendConfirmation",
      "End": true
    }
  }
}
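To see what this Retry rule means in practice, the sketch below computes the wait before each retry: the first retry waits IntervalSeconds, and each later wait is multiplied by BackoffRate. The retryDelays helper is hypothetical and deliberately ignores the optional MaxDelaySeconds and JitterStrategy fields.

```javascript
// Compute the delay (in seconds) before each retry attempt.
// Defaults mirror the Amazon States Language defaults for a retrier.
function retryDelays({ IntervalSeconds = 1, MaxAttempts = 3, BackoffRate = 2.0 }) {
  const delays = [];
  for (let attempt = 0; attempt < MaxAttempts; attempt++) {
    delays.push(IntervalSeconds * Math.pow(BackoffRate, attempt));
  }
  return delays;
}

const delays = retryDelays({ IntervalSeconds: 5, MaxAttempts: 3, BackoffRate: 2 });
console.log(delays); // [ 5, 10, 20 ]
console.log(delays.reduce((sum, d) => sum + d, 0)); // 35
```

With these settings ProcessPayment runs at most four times (the first attempt plus three retries) and spends up to 35 seconds waiting between attempts before the state fails.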

6.2 Adding Error Handling to the CheckInventory State

We will add error handling to the CheckInventory state to handle insufficient inventory errors.

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Edit the CheckInventory state.
  4. Add a catch configuration with the following parameters:
    • Error Equals: States.ALL
    • Next: OrderFailed
  5. Save the changes.

CLI Steps:

Update the state-machine-definition.json file to include the catch configuration:

{
  "Comment": "Order Fulfillment Workflow",
  "StartAt": "OrderReceived",
  "States": {
    "OrderReceived": {
      "Type": "Pass",
      "Result": "Order Received",
      "ResultPath": "$.status",
      "Next": "ValidateOrder"
    },
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ValidateOrder",
      "ResultPath": "$.validation",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "Retry": [
        {
          "ErrorEquals": ["States.ALL"],
          "IntervalSeconds": 5,
          "MaxAttempts": 3,
          "BackoffRate": 2
        }
      ],
      "ResultPath": "$.payment",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:CheckInventory",
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "Next": "OrderFailed"
        }
      ],
      "ResultPath": "$.inventory",
      "Next": "HumanApproval"
    },
    "HumanApproval": {
      "Type": "Wait",
      "Seconds": 300,
      "Next": "ShipOrder"
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ShipOrder",
      "ResultPath": "$.shipping",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:SendConfirmation",
      "End": true
    },
    "OrderFailed": {
      "Type": "Fail",
      "Cause": "Order processing failed"
    }
  }
}
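Catch rules are evaluated in order, and the first rule whose ErrorEquals list matches the error name wins. The sketch below imitates that lookup for a configuration that treats a stock shortage differently from every other error. It is a simplification under stated assumptions: the error name InsufficientInventoryError and the NotifyBackorder state are hypothetical, and only exact names and States.ALL are handled.

```javascript
// Hypothetical Catch configuration: a specific rule first, the wildcard last.
const catchers = [
  { ErrorEquals: ['InsufficientInventoryError'], Next: 'NotifyBackorder' },
  { ErrorEquals: ['States.ALL'], Next: 'OrderFailed' },
];

// Return the Next state of the first catcher that matches, or null if none does.
function matchCatcher(rules, errorName) {
  const hit = rules.find(
    (rule) =>
      rule.ErrorEquals.includes(errorName) ||
      rule.ErrorEquals.includes('States.ALL')
  );
  return hit ? hit.Next : null;
}

console.log(matchCatcher(catchers, 'InsufficientInventoryError'));
console.log(matchCatcher(catchers, 'TypeError'));
```

For a Node.js Lambda function, the error name that Step Functions sees is the name property of the thrown error, so a function has to set that property for a specific rule like the first one to match.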

7. Incorporating Human Approval Steps

In some cases, orders may require human approval before they can be shipped. We will incorporate a human approval step into the workflow.

7.1 Adding a Human Approval State

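The HumanApproval state pauses the workflow before the order is shipped. As a Wait state it delays for 300 seconds and then continues regardless of any decision; Section 7.2 describes how to gate it on an actual approval.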

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Add a new state called HumanApproval.
  4. Set the state type to Wait.
  5. Configure the wait time to 300 seconds (5 minutes).
  6. Connect the CheckInventory state to the HumanApproval state.
  7. Connect the HumanApproval state to the ShipOrder state.
  8. Save the changes.

CLI Steps:

Update the state-machine-definition.json file to include the HumanApproval state:

{
  "Comment": "Order Fulfillment Workflow",
  "StartAt": "OrderReceived",
  "States": {
    "OrderReceived": {
      "Type": "Pass",
      "Result": "Order Received",
      "ResultPath": "$.status",
      "Next": "ValidateOrder"
    },
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ValidateOrder",
      "ResultPath": "$.validation",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ProcessPayment",
      "Retry": [
        {
          "ErrorEquals": ["States.ALL"],
          "IntervalSeconds": 5,
          "MaxAttempts": 3,
          "BackoffRate": 2
        }
      ],
      "ResultPath": "$.payment",
      "Next": "CheckInventory"
    },
    "CheckInventory": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:CheckInventory",
      "Catch": [
        {
          "ErrorEquals": ["States.ALL"],
          "Next": "OrderFailed"
        }
      ],
      "ResultPath": "$.inventory",
      "Next": "HumanApproval"
    },
    "HumanApproval": {
      "Type": "Wait",
      "Seconds": 300,
      "Next": "ShipOrder"
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ShipOrder",
      "ResultPath": "$.shipping",
      "Next": "SendConfirmation"
    },
    "SendConfirmation": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:SendConfirmation",
      "End": true
    },
    "OrderFailed": {
      "Type": "Fail",
      "Cause": "Order processing failed"
    }
  }
}

7.2 Implementing Human Approval

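To implement human approval, we can use Amazon SNS to send a notification to a human approver. For the approver's decision to control the workflow, the HumanApproval state must be a Task state that uses the callback pattern (.waitForTaskToken): Step Functions pauses the state and issues a task token, and the workflow resumes only when something, such as a Lambda function, calls SendTaskSuccess or SendTaskFailure with that token. A Wait state, by contrast, resumes on its own when its timer expires.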

AWS Console Steps:

  1. Open the SNS console.
  2. Create a new topic called OrderApproval.
  3. Subscribe an email address to the topic.
  4. Open the Lambda console.
  5. Create a new Lambda function called ApproveOrder.
  6. Add code to the Lambda function to send an SNS notification to the OrderApproval topic.
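  7. Change the HumanApproval state to a Task state that invokes ApproveOrder through the arn:aws:states:::lambda:invoke.waitForTaskToken resource and passes the task token ($$.Task.Token) in the payload, so that the workflow pauses until the token is returned.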

CLI Steps:

aws sns create-topic --name OrderApproval
aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:OrderApproval --protocol email --notification-endpoint approver@example.com
aws lambda create-function --function-name ApproveOrder --runtime nodejs14.x --handler index.handler --role arn:aws:iam::123456789012:role/lambda-execution-role --zip-file fileb://approve-order.zip

The approve-order.zip file should contain the following code:

const AWS = require('aws-sdk');
const sns = new AWS.SNS();

exports.handler = async (event) => {
  const order = event.order;

  // event.taskToken is present only if the state passes $$.Task.Token in its
  // payload (callback pattern); the approver needs it to resume the workflow.
  const tokenLine = event.taskToken ? `\nTask token: ${event.taskToken}` : '';

  const params = {
    Message: `Please approve or reject the order ${order.orderId}.${tokenLine}`,
    TopicArn: 'arn:aws:sns:us-east-1:123456789012:OrderApproval',
  };

  // A publish failure rejects the promise, which fails the task instead of
  // hiding the error inside a returned status object.
  await sns.publish(params).promise();
  return { message: 'Approval request sent successfully' };
};
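The other half of the approval flow is the function that resumes the workflow once the approver has decided. The following is a minimal sketch of such a function, assuming the HumanApproval state uses the task-token callback pattern. The event shape (taskToken, approved, reason) and the error name OrderRejected are assumptions for illustration. The Step Functions client is passed in so that the logic can be exercised without AWS access; in Lambda it would be an SDK v2 AWS.StepFunctions instance.

```javascript
// Build a handler that reports the approver's decision back to Step Functions.
// `stepFunctions` is expected to behave like the SDK v2 AWS.StepFunctions client.
function makeDecisionHandler(stepFunctions) {
  return async (event) => {
    const { taskToken, approved, reason } = event; // hypothetical event shape
    if (!taskToken) throw new Error('taskToken is required');

    if (approved) {
      // Resumes the paused state; `output` must be a JSON string.
      await stepFunctions
        .sendTaskSuccess({
          taskToken,
          output: JSON.stringify({ approved: true }),
        })
        .promise();
      return { resumed: 'success' };
    }

    // Fails the paused state, so a Catch rule (or the execution) can react.
    await stepFunctions
      .sendTaskFailure({
        taskToken,
        error: 'OrderRejected',
        cause: reason || 'Rejected by approver',
      })
      .promise();
    return { resumed: 'failure' };
  };
}

// In Lambda this would be wired up roughly as:
//   const AWS = require('aws-sdk');
//   exports.handler = makeDecisionHandler(new AWS.StepFunctions());
```

How the approver triggers this function (an API Gateway link in the email, a small internal tool, or a manual invocation) is left open here, and the function's execution role needs permission to call states:SendTaskSuccess and states:SendTaskFailure.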

8. Testing and Debugging the Workflow

Once the workflow is designed and implemented, it's important to test and debug it to ensure it works as expected.

8.1 Testing the Workflow

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Choose Start execution.
  4. Enter the following input:
{
  "order": {
    "orderId": "12345",
    "customerId": "67890",
    "items": [
      {
        "itemId": "item1",
        "quantity": 2
      }
    ]
  }
}
  5. Choose Start execution.
  6. Monitor the execution to ensure it completes successfully.

CLI Steps:

aws stepfunctions start-execution --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:OrderFulfillmentWorkflow --input file://input.json

The input.json file should contain:

{
  "order": {
    "orderId": "12345",
    "customerId": "67890",
    "items": [
      {
        "itemId": "item1",
        "quantity": 2
      }
    ]
  }
}
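Because start-execution returns immediately, a test script usually has to poll until the execution reaches a final status. The sketch below shows one way to do that with the SDK's describeExecution call. The waitForExecution helper and its options are hypothetical, and the client is injected so the polling logic can be tried without AWS access.

```javascript
// Statuses after which an execution will not change any more.
const TERMINAL = new Set(['SUCCEEDED', 'FAILED', 'TIMED_OUT', 'ABORTED']);

// Poll describeExecution until the execution finishes or we give up.
// `stepFunctions` is expected to behave like the SDK v2 AWS.StepFunctions client.
async function waitForExecution(
  stepFunctions,
  executionArn,
  { intervalMs = 2000, maxPolls = 30 } = {}
) {
  for (let poll = 0; poll < maxPolls; poll++) {
    const { status } = await stepFunctions
      .describeExecution({ executionArn })
      .promise();
    if (TERMINAL.has(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Execution still running after ${maxPolls} polls`);
}

// Typical wiring (the execution ARN comes from the start-execution output):
//   const AWS = require('aws-sdk');
//   const status = await waitForExecution(new AWS.StepFunctions(), executionArn);
```

With the 300-second HumanApproval wait in place, the defaults above (30 polls, 2 seconds apart) are too short, so a test would need a larger maxPolls or intervalMs.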

8.2 Debugging the Workflow

If the workflow fails, you can use the Step Functions console to debug the issue. The console provides detailed logs for each state, including input, output, and error messages.

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Choose the failed execution.
  4. Review the execution history to identify the failed state.
  5. Check the input, output, and error messages for the failed state.
  6. Make the necessary changes to the state machine or Lambda functions and retry the execution.
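The same investigation can be scripted against the execution history, which can be fetched with aws stepfunctions get-execution-history. The sketch below scans a list of history events for the first failure and reports which state was active at that point. The findFailedState helper is hypothetical, and the sample events are abbreviated, hand-written stand-ins for real history entries.

```javascript
// Return the state that was active when the first failure event occurred.
function findFailedState(events) {
  let current = null;
  for (const event of events) {
    if (event.type.endsWith('StateEntered')) {
      current = event.stateEnteredEventDetails.name;
    } else if (event.type.endsWith('Failed') || event.type.endsWith('TimedOut')) {
      return { state: current, eventType: event.type };
    }
  }
  return null; // no failure found
}

// Abbreviated, illustrative events in the order Step Functions records them.
const sampleEvents = [
  { type: 'ExecutionStarted' },
  { type: 'TaskStateEntered', stateEnteredEventDetails: { name: 'ValidateOrder' } },
  { type: 'TaskStateExited' },
  { type: 'TaskStateEntered', stateEnteredEventDetails: { name: 'CheckInventory' } },
  { type: 'LambdaFunctionFailed' },
  { type: 'ExecutionFailed' },
];

console.log(findFailedState(sampleEvents));
```

In real history entries, the failure event also carries the error name and cause in a details field, which is usually the next thing to inspect.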

9. Deploying and Monitoring the Workflow

Once the workflow is tested and debugged, it can be deployed to production. AWS Step Functions provides built-in monitoring and logging capabilities to help you monitor the workflow in real-time.

9.1 Deploying the Workflow

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Choose Edit, apply the final definition, and choose Save.
  4. Start a new execution to confirm that it runs against the saved definition.

CLI Steps:

aws stepfunctions update-state-machine --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:OrderFulfillmentWorkflow --definition file://state-machine-definition.json

9.2 Monitoring the Workflow

AWS Console Steps:

  1. Open the Step Functions console.
  2. Select the OrderFulfillmentWorkflow state machine.
  3. Choose Monitoring.
  4. Review the metrics and logs to monitor the workflow in real-time.

CLI Steps:

aws cloudwatch get-metric-statistics --namespace AWS/States --metric-name ExecutionsStarted --dimensions Name=StateMachineArn,Value=arn:aws:states:us-east-1:123456789012:stateMachine:OrderFulfillmentWorkflow --start-time 2023-10-01T00:00:00Z --end-time 2023-10-02T00:00:00Z --period 3600 --statistics Sum
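The command returns a list of hourly datapoints rather than a single number. As a small sketch of how the output might be summarized, the helpers below add up the Sum values and derive a failure rate from two such responses, one for ExecutionsStarted and one for ExecutionsFailed. The helper names and the sample numbers are made up for illustration.

```javascript
// Add up the Sum field of every datapoint in a get-metric-statistics response.
function sumDatapoints(response) {
  return response.Datapoints.reduce((total, point) => total + point.Sum, 0);
}

// Fraction of started executions that failed in the same time window.
function failureRate(startedResponse, failedResponse) {
  const started = sumDatapoints(startedResponse);
  return started === 0 ? 0 : sumDatapoints(failedResponse) / started;
}

// Made-up sample responses, trimmed to the fields used here.
const started = { Datapoints: [{ Sum: 6 }, { Sum: 4 }] };
const failed = { Datapoints: [{ Sum: 2 }] };

console.log(failureRate(started, failed)); // 0.2
```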

10. Conclusion

In this article, we have walked through the process of building a stateful serverless application using AWS Step Functions and SDK integrations. We designed a long-running order fulfillment workflow that includes retries and human approval steps. By leveraging AWS Step Functions, we were able to create a robust and scalable workflow that can handle complex business logic and integrate with various AWS services.

Building stateful serverless applications can be challenging, but with the right tools and techniques, it is possible to create powerful and efficient workflows. AWS Step Functions provides a flexible and easy-to-use platform for designing and running stateful workflows, making it an ideal choice for building serverless applications.

By following the steps outlined in this article, you can design and implement your own stateful serverless workflows using AWS Step Functions and SDK integrations. Whether you are building an order fulfillment system, a data processing pipeline, or any other type of workflow, AWS Step Functions can help you achieve your goals with ease.
