Are you looking to stress your applications and isolate faults, so that you can address them before they show up in production?
AWS launched Fault Injection Simulator (FIS), its fully managed chaos engineering service, at re:Invent 2020, and it became generally available in March 2021.
I remember how, upon hearing about this release, flashbacks of early-morning callouts, Friday afterno-- :) ... came flooding in.
This was and remains an exciting release in its own right, but for me personally, as someone who spent years doing production support, it is especially exciting.
AWS FIS allows customers to run controlled experiments in order to stress test their applications and validate resiliency - and ultimately practice with failure scenarios before they happen in production.
Running experiments with AWS FIS is a two-phase process: first you create the experiment template, and then you start the experiment.
You can create the experiment template in the FIS console or with the AWS Command Line Interface (CLI). It is also possible to use Systems Manager documents to inject faults into your EC2 instances.
In this article, we will look at how to create experiment templates using the AWS CLI.
What services can you set as targets in an AWS FIS experiment?
Each AWS FIS experiment targets a specific set of AWS resources and performs a set of actions on them.
At the time of writing, the following services are supported as targets for AWS FIS experiments:
- Amazon Elastic Compute Cloud (EC2),
- Amazon Elastic Container Service (ECS),
- Amazon Elastic Kubernetes Service (EKS), and
- Amazon Relational Database Service (RDS)
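To make the relationship between targets and actions concrete, here is a minimal sketch of an experiment template built as a Python dictionary and printed as JSON. The field names follow the AWS FIS template format as I understand it; the tag, the target and action names, and the role ARN are hypothetical placeholders, not values from my demo.

```python
import json

# Hypothetical values: the tag, the names, and the role ARN are placeholders.
template = {
    "description": "Stop one tagged EC2 instance, then restart it",
    "targets": {
        # A named group of resources the experiment may touch.
        "taggedInstances": {
            "resourceType": "aws:ec2:instance",
            "resourceTags": {"Purpose": "chaos-ready"},
            "selectionMode": "COUNT(1)",
        }
    },
    "actions": {
        # An action refers to a target group by its name.
        "stopInstances": {
            "actionId": "aws:ec2:stop-instances",
            "parameters": {"startInstancesAfterDuration": "PT2M"},
            "targets": {"Instances": "taggedInstances"},
        }
    },
    "stopConditions": [{"source": "none"}],
    "roleArn": "arn:aws:iam::111122223333:role/my-fis-role",
}


def unresolved_targets(tpl):
    """Return target names that actions reference but the template does not define."""
    defined = set(tpl["targets"])
    referenced = {
        name
        for action in tpl["actions"].values()
        for name in action.get("targets", {}).values()
    }
    return sorted(referenced - defined)


print(json.dumps(template, indent=2))
print("Unresolved targets:", unresolved_targets(template))
```

The small `unresolved_targets` helper is only a local sanity check that every action points at a target defined in the same template; it is not part of AWS FIS.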
Prerequisites for running an AWS FIS experiment
The basic principles and guidelines to follow before running your AWS FIS experiment are outlined in the user guide, so I won't detail them here. At a high level, however, the following steps are prerequisites for running an AWS FIS experiment:
- Identify the target deployment for the experiment
- Review the application architecture
- Define steady-state behavior
- Form a hypothesis
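One way to make the last two steps testable is to write the hypothesis down as a check against the steady-state value. The sketch below is purely illustrative: the `Hypothesis` class, the metric name, and the numbers are assumptions of mine, not something AWS FIS provides.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A hypothetical record of what 'normal' looks like for one metric."""
    metric: str
    steady_state: float  # value observed before the experiment
    tolerance: float     # allowed relative deviation, e.g. 0.10 for 10%

    def holds(self, observed: float) -> bool:
        # The hypothesis holds if the observed value stays within tolerance.
        return abs(observed - self.steady_state) <= self.tolerance * self.steady_state


# Example: p99 latency should stay within 10% of 200 ms during the fault.
h = Hypothesis(metric="p99_latency_ms", steady_state=200.0, tolerance=0.10)
print(h.holds(215.0))
print(h.holds(260.0))
```

Writing the expectation down before the experiment keeps the result from being interpreted after the fact.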
1) IAM permissions for a successful AWS FIS experiment
To use AWS FIS, the following Identity & Access Management (IAM) permissions need to be in place:
- Permissions for the IAM users and roles that will work with AWS FIS
- Permissions for AWS FIS that allow it to run experiments on your behalf
- Service-linked roles in AWS FIS (taken care of by AWS FIS - nothing for you to do here)
As you may already be aware, policies in AWS define permissions. When you create a permissions policy to restrict access to a resource, you can choose an identity-based policy or a resource-based policy.
Following the definition above, a successful AWS FIS experiment requires identity-based policies.
These identity-based policies will be attached to the IAM user and the role that will work with AWS FIS.
The policies are detailed in the IAM permissions set up guide; the diagrams below show how I defined them in my own account:
1.1) IAM user that will work with AWS FIS
Let's first take a look at the user that will be creating the template.
1.2) User permissions
The user illustrated in 1.1) above requires the permissions described in Step 1 of the IAM permissions set up guide. The video below shows these permissions attached to the user:
1.3) Role permissions
Step 2 of the IAM permissions set up guide details how to set up the IAM role for the AWS FIS service. The role grants the AWS FIS service permission to perform actions on your behalf. The IAM policy for the IAM role must grant permission to modify the resources that you specify as targets in your experiment template.
For my demo, I used the policy as it stands in the set up guide. As a good practice, however, follow the standard security advice of granting least privilege. You can do this by specifying resource Amazon Resource Names (ARNs) or tags in your policy.
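As an illustration of scoping by tags, here is a minimal sketch of a role policy that only allows stopping and starting EC2 instances that carry a particular tag. The tag `Purpose=chaos-ready` and the statement ID are hypothetical, and the actions you need depend on the faults your template injects; this is not the policy from the set up guide.

```python
import json

# Assumption: experiment targets carry the (hypothetical) tag Purpose=chaos-ready.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowStopStartOnTaggedInstances",
            "Effect": "Allow",
            "Action": ["ec2:StopInstances", "ec2:StartInstances"],
            # Limit the statement to EC2 instances rather than all resources.
            "Resource": "arn:aws:ec2:*:*:instance/*",
            # Only instances with the matching tag are covered.
            "Condition": {
                "StringEquals": {"aws:ResourceTag/Purpose": "chaos-ready"}
            },
        }
    ],
}

print(json.dumps(scoped_policy, indent=2))
```

With a policy like this, an experiment that accidentally selects an untagged instance would be denied by IAM instead of disrupting it.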
1.4) Trust Relationships
Lastly, the IAM role must have a trust relationship that allows the AWS FIS service to assume the role.
Again, I used the trust relationship exactly as illustrated in the set up guide.
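For reference, a trust relationship of this kind generally looks like the sketch below, which names the AWS FIS service principal `fis.amazonaws.com` as the entity allowed to assume the role. Treat it as an approximation: the version in the set up guide is authoritative and may include additional conditions.

```python
import json

# Trust policy letting the AWS FIS service assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "fis.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```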
2) Create the experiment template
You are done - almost! Your permissions are all set, and you are now ready to create your experiment template.
As previously mentioned, we will be using the AWS CLI to create the experiment template. AWS has provided examples of how you can construct various templates in JavaScript Object Notation (JSON). You can copy and adapt these examples to get started with creating your own templates.
In addition to these, you can also check my GitHub repository for the JSON format that I used in the demo illustrated here.
Create the template by running the following command in the AWS CLI, substituting "input-file" with the actual name of your file:
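As a sketch of what that invocation looks like, the snippet below assembles the command as an argument list and prints it. It assumes the `create-experiment-template` subcommand reading the template through `--cli-input-json` with a `fileb://` path, which is the form I recall from the AWS FIS user guide; `input-file` is the placeholder to replace, and the helper function name is my own.

```python
import shlex


def create_template_command(input_file):
    """Build the AWS CLI arguments that create an FIS experiment template.

    Assumes the template JSON is stored in `input_file`.
    """
    return [
        "aws", "fis", "create-experiment-template",
        "--cli-input-json", f"fileb://{input_file}",
    ]


# Print the command so it can be copied into a terminal.
print(shlex.join(create_template_command("input-file")))
```

To run it from Python instead of a terminal, the same list could be passed to `subprocess.run(..., check=True)`, provided the AWS CLI is installed and configured with the user from section 1.1.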
A successfully created experiment template can be viewed - and started - in the AWS FIS console.
Conclusion
I look forward to your feedback, and to hearing about the experiments you create and how you stress test your applications for better resiliency.
Here is the service launch session video from re:Invent 2020, presented by Adrian Hornsby, which covers AWS FIS in detail.
References
- AWS Fault Injection Simulator – Use Controlled Experiments to Boost Resilience (https://aws.amazon.com/blogs/aws/aws-fault-injection-simulator-use-controlled-experiments-to-boost-resilience/)
- Photo by Girl with red hat on Unsplash