Alexy Grabov for AWS Community Builders

Posted on Jun 28 • Originally published at alexy-grabov.Medium

AWS Resource Names Validation and Generation

#aws #validation #python #boto

Have you ever wondered how you can validate AWS resource definitions (names, ARNs, patterns) at runtime? Well, if you have, you probably know that you can’t.

In this blog post we’ll cover the current solutions and their limitations, and introduce a new open-source package that can actually perform those validations for you automatically. It can also generate strings matching said patterns for your testing and mocking needs.

Workflows Without Validation or Generation

Let me tell you a short story and see if it sounds familiar to you.

You make changes to your CDK code, you deploy. A CloudFormation template is synthesized, which takes a good minute. Then, AWS starts creating the resources you requested. It runs for a few minutes — and fails. Your Lambda function name is too long.

This, of course, is not the only workflow that might be affected by validations that are performed too late. Imagine your code receives a string during a business flow, either as user input or from another application it interacts with. It’s supposed to represent an ARN of a resource, but it doesn’t. Your code tries to “access” this resource using a boto3 client, and fails. Now, you need to debug. Is this resource really missing? Was it deleted? Did AWS fail to find it? Did the boto3 client expect to receive it in a different format?
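To make the debugging question concrete, here is a minimal sketch of a shape check you could run before handing a string to boto3. It relies only on the general ARN layout (arn:partition:service:region:account-id:resource). The regular expression, the `ParsedArn` tuple and the `parse_arn` helper are my own illustrative assumptions, deliberately loose, and not an official AWS pattern.

```python
import re
from typing import NamedTuple, Optional


class ParsedArn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


# Loose, assumed pattern for the general ARN layout; region and account may be empty
# (as in S3 bucket ARNs). It checks the shape only, not service-specific rules.
_ARN_RE = re.compile(
    r"arn:(?P<partition>aws[a-zA-Z-]*):(?P<service>[a-z0-9-]+):"
    r"(?P<region>[a-z0-9-]*):(?P<account_id>(?:\d{12}|aws)?):(?P<resource>.+)"
)


def parse_arn(value: str) -> Optional[ParsedArn]:
    """Return the ARN components, or None if the string is not ARN-shaped."""
    match = _ARN_RE.fullmatch(value)
    if match is None:
        return None
    return ParsedArn(**match.groupdict())
```

If `parse_arn` returns None, the problem is the input format and there is no point in asking AWS whether the resource exists.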

Let's also consider testing. Often when you need to test your logic flows, you might need to somehow get your hands on “real” AWS resource ARNs or paths. You might need them to check your internal validations and error flows, or just use them as return values to your mocks. What most developers do in those cases is just go to their development AWS environment, find the suitable resource — and copy names or parameters to their test code.

Resource Schemas and Constraints Sources

AWS is usually really good with documentation, and this case is no different. You can actually find CloudFormation schemas publicly published, but I really doubt anyone actually reads those.

A more convenient way would be to search for any constraints or validation patterns in docs.aws.amazon.com, and indeed the API reference lists them, for example in the request body of Lambda’s CreateFunction operation.

It’s not bad, as far as documentation goes. But who has the time or patience to read it, or search for it every time it’s required?
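For comparison, this is roughly what hand-rolling such a check looks like once you have dug the constraints out of the docs. It is a sketch only: the 64-character limit and the allowed character set for a bare function name (not a full ARN) are assumptions you should verify against the current documentation, and `lambda_name_problems` is a hypothetical helper name.

```python
import re
from typing import List

# Assumed constraints for a bare Lambda function name (not a full ARN);
# verify them against the current AWS documentation before relying on them.
MAX_NAME_LENGTH = 64
NAME_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def lambda_name_problems(name: str) -> List[str]:
    """Return human-readable problems; an empty list means the name looks valid."""
    problems: List[str] = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is {len(name)} characters, limit is {MAX_NAME_LENGTH}")
    if name and NAME_PATTERN.fullmatch(name) is None:
        problems.append("name contains characters other than letters, digits, '-' and '_'")
    return problems
```

Multiply this by every resource type you touch, and keeping the constants in sync with the docs by hand gets old quickly.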

Current Solutions

In software engineering, you are rarely the first to hit a certain issue. Someone has usually dealt with this exact thing already, there are probably 10 threads on StackOverflow and Reddit discussing it, and other engineers have suggested solutions. Just pick a solution you like and copy-paste.

In this case, unfortunately, I was unable to find a suitable solution to some of the problems I was facing.

First, AWS itself has recently acknowledged my pain and has baked some of the validation functionality into AWS CloudFormation.

AWS CloudFormation improves its deployment experience to validate customer stack operation upfront for invalid resource property errors.

{
  "StackId": "arn:aws:cloudformation:us-west-2:123456789012:stack/MyStack/50d6e750-5a71-11e6-afc7-50d5ca9f1234",
  "EventId": "6ba1a560-5a71-11e6-bf4a-500c28168c4b",
  "StackName": "MyStack",
  "LogicalResourceId": "MyS3Bucket",
  "ResourceType": "AWS::S3::Bucket",
  "Timestamp": "2024-03-14T19:57:18.129Z",
  "ResourceStatus": "CREATE_FAILED",
  "ResourceStatusReason": "Property validation failure: [unexpected property PropertyName1]",
  "ResourceProperties": {
    "BucketName": "my-bucket",
    "PropertyName1": "invalid-value",
    "AccessControl": "PublicRead"
  }
}

What this means is that your deployments will fail much sooner: CloudFormation validates the stack operation upfront and reports invalid resource properties before it creates any actual resources on AWS, instead of failing during the later deployment stage. Great news!

If you like your validation a little bit more hard-core, you might want to look at AWS CloudFormation Linter. Remember all those CloudFormation schemas nobody reads? Well, those guys have read them (or, at least, parsed them) and have created a linter based on them.

It can run as a standalone linter on your CloudFormation templates before you deploy them, and it also supports custom rules. This allows you to add custom validations for conventions that might be specific to your organization.

Very cool, but still does not solve our runtime and testing challenges.

Runtime Resource Property Validator and Generator

To solve our runtime validation and testing problems, we need a solution that runs on demand, and not only in the linting or deployment stages.

Exactly for those purposes, I have developed the aws_resource_validator package.

It contains auto-generated classes from this botocore dataclasses repository (special shout-out to fellow AWS Community Builder Michael Kirchner who made me aware of it). Each class represents an AWS service. Within each service there are resources, which can be accessed using CamelCase and snake_case names. Each resource has some informative fields about its own limitations and expected pattern. It can also validate itself, and generate a string conforming to all validations and patterns for your testing needs.

So a typical resource looks like this:

lambda
  - name (or Name)
    - .validate()
    - .generate()
    - pattern
    - min_length
    - max_length
  - arn (or Arn)
    - .validate()
    - .generate()
    - pattern
    - min_length
    - max_length

And a usage example might look like this: (note the mixed usage of camel & snake cases)

from aws_resource_validator.class_definitions import Acm, class_registry

# Use a type hint so that your IDE can resolve `class_registry` entries to their full class definitions
acm: Acm = class_registry.Acm

print(acm.Arn.pattern)
print(acm.Arn.type)
print(acm.arn.validate("example-arn"))
print(acm.Arn.generate())

It’s that simple!

All you have to do is install the package from PyPI and you’re good to go:

pip install aws_resource_validator

By the way, those auto-generated classes I’ve mentioned? They are pretty cool. They are derived from the JSON files in the botocore repository: their members are generated and then written, as Python code, into a file. It’s actually Python code that writes Python code. No AI, but still makes you wonder when you’re going to be replaced, right?
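If "Python code that writes Python code" sounds abstract, the idea can be sketched in a few lines. The `SERVICE_MODEL` dict below is a made-up fragment shaped like the "shapes" section of a botocore service model, and `render_class` is a hypothetical helper; the real generator in the repository may work quite differently.

```python
# Made-up fragment shaped like the "shapes" section of a botocore service model.
SERVICE_MODEL = {
    "shapes": {
        "Arn": {"type": "string", "min": 20, "max": 2048, "pattern": r"arn:[\w+=/,.@-]+:acm:.*"},
        "Token": {"type": "string", "min": 1, "max": 32, "pattern": r"\w+"},
        "Count": {"type": "integer"},
    }
}


def render_class(service: str, model: dict) -> str:
    """Render Python source for a class holding the string shapes of one service."""
    lines = [f"class {service}:", f'    """Generated from the {service} model."""']
    for shape_name, shape in sorted(model["shapes"].items()):
        if shape.get("type") != "string":
            continue  # only string shapes carry pattern and length constraints here
        lines.append(f"    {shape_name} = {{")
        lines.append(f"        'pattern': {shape.get('pattern')!r},")
        lines.append(f"        'min_length': {shape.get('min')!r},")
        lines.append(f"        'max_length': {shape.get('max')!r},")
        lines.append("    }")
    return "\n".join(lines) + "\n"


source = render_class("Acm", SERVICE_MODEL)
namespace: dict = {}
exec(source, namespace)  # a real generator would write `source` to a .py file instead
Acm = namespace["Acm"]
```

Using `repr()` (the `!r` conversions) when emitting values keeps regex backslashes intact in the generated source.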

Anyway, if you want to check out the logic behind this package, you can check out my GitHub repository — it’s open source. Special thanks to my beautiful wife and talented Principal DevOps engineer Yafit Tupman who added GitHub Action based pipelines and overall repository productization, so that you can be sure each release is tested and uploaded to PyPI automatically.

Feel free to open bugs, report issues or contribute code to this project if you find it useful or interesting.

A Common Use Case Example

Let’s take a quick look at a concrete usage example. We will consider a typical Lambda function where a 3rd party service passes items for it to process. For the purposes of the example, we will assume those items represent ARNs of IoT Job templates.

from http import HTTPStatus
from typing import Any, Dict, List

from aws_lambda_context import LambdaContext
from aws_resource_validator.class_definitions import JobTemplateArn, class_registry

def handler(event: Dict[str, Any], context: LambdaContext) -> Dict[str, Any]:
    arn_list: List[str] = event['body']  # we would parse and validate the event first, of course
    job_template_arn: JobTemplateArn = class_registry.JobTemplateArn
    # Reject the whole request before any AWS call if at least one ARN is malformed
    if not all(job_template_arn.validate(template_arn) for template_arn in arn_list):
        return {'statusCode': HTTPStatus.BAD_REQUEST, 'headers': {'Content-Type': 'application/json'}, 'body': 'Your ARNs are invalid'}
    # some boto3 calls here
    return {'statusCode': HTTPStatus.OK, 'headers': {'Content-Type': 'application/json'}, 'body': None}

As you can see, not only did we avoid performing any boto3 calls (where we would have discovered the error), we can also report back to the calling service to inform it of an issue in its code.
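If you want to be nicer to the calling service, you can tell it exactly which items were rejected. The sketch below assumes the validator is passed in as a plain callable (in the handler above, that would be the `validate` method of the registry entry); `build_response` and the response field names are hypothetical.

```python
import json
from http import HTTPStatus
from typing import Any, Callable, Dict, List


def build_response(arns: List[str], is_valid: Callable[[str], bool]) -> Dict[str, Any]:
    """Build an API Gateway style response that lists every rejected ARN."""
    invalid = [arn for arn in arns if not is_valid(arn)]
    headers = {'Content-Type': 'application/json'}
    if invalid:
        body = json.dumps({'message': 'Invalid ARNs', 'invalid_arns': invalid})
        return {'statusCode': int(HTTPStatus.BAD_REQUEST), 'headers': headers, 'body': body}
    # All items passed validation; the boto3 calls would happen before this return
    return {'statusCode': int(HTTPStatus.OK), 'headers': headers, 'body': json.dumps({'accepted': len(arns)})}
```

Passing the validator as a parameter also makes this function trivial to unit test with a stand-in check.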

Now, let’s look at a usage example inside a test. We will test the same handler code.

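from http import HTTPStatus
from typing import List

from aws_resource_validator.class_definitions import JobTemplateArn, class_registry

# `handler` is the function from the previous example; import it from wherever your Lambda code lives

def test_handler():
    job_template_arn: JobTemplateArn = class_registry.JobTemplateArn
    # generate 10 valid IoT job template ARNs
    arns_list: List[str] = [job_template_arn.generate() for _ in range(10)]
    event = {'body': arns_list}
    # the handler always returns a dict, so check the status code explicitly
    assert handler(event, None)['statusCode'] == HTTPStatus.OK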

With just 2 lines of code we have generated a list of 10 valid ARNs that represent IoT job templates to be used in our test. No mocks necessary!

Recap

We have discussed validators for AWS resources naming conventions and why they are important. We have also mentioned the different times when said validators can be run — at code runtime, during the linting stage or during CloudFormation template generation.

We also briefly mentioned the usability of generated resource names during testing and how the aws_resource_validator package can help you with that as well.

Thanks for reading and I hope you learned something new :)
