Math
[Solved] AWS Resource limit exceeded

TL;DR:

While deploying an OpenSearch domain with logging via CDK, I hit a “Resource limit exceeded” error because CloudWatch Logs allows only 10 resource policies per Region per account, and my deployment tried to create one more. To fix this, I created the CloudWatch log groups in CDK and passed their ARNs to a Lambda function. In the Lambda, I used the AWS SDK's PutResourcePolicy call (put_resource_policy in boto3) to update an existing policy, with OpenSearch as the principal and the log group ARNs as resources. I also set suppressLogsResourcePolicy: true on the OpenSearch domain to stop CDK from creating a resource policy automatically. This avoided the limit and gave me full control over the policies.


Introduction

While deploying my CDK stack for OpenSearch with logging enabled, I encountered this error:

Received response status [FAILED] from custom resource. Message returned: Resource limit exceeded.

This error occurs when a deployment tries to create a CloudWatch Logs resource policy and the account already has the maximum of 10 resource policies in that Region. Finding a solution wasn't straightforward, since few resources address this specific issue. Here is how I resolved it.
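To check how close an account is to the limit, the existing resource policies can be listed first. The following is a minimal sketch, not part of my stack: the helper name list_resource_policy_names is made up for illustration, and it takes the boto3 logs client as a parameter so it can be tried without AWS credentials.

```python
def list_resource_policy_names(logs_client):
    """Return the names of all CloudWatch Logs resource policies visible to the client."""
    names = []
    kwargs = {}
    while True:
        # describe_resource_policies is paginated, so follow nextToken until it runs out.
        page = logs_client.describe_resource_policies(**kwargs)
        names.extend(p["policyName"] for p in page.get("resourcePolicies", []))
        token = page.get("nextToken")
        if not token:
            return names
        kwargs["nextToken"] = token


# Usage (needs AWS credentials, so it is left commented out):
#   import boto3
#   names = list_resource_policy_names(boto3.client("logs"))
#   print(len(names), "resource policies in this Region:", names)
```

If the list already holds 10 names, any construct that creates another policy will likely fail with the error above.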

P.S. This approach works for updating an existing resource policy for any service and is NOT specific to OpenSearch. For other services, just skip the last step.


The Solution

Step 1: Create the Required Log Groups

First, I created the necessary CloudWatch log groups with explicit names and a removal policy:

this.opensearchAppLogGroup = new logs.LogGroup(this, `${props.id}-opensearch-app-loggroup`, {
    logGroupName: `/aws/opensearch/app`,
    removalPolicy: props.domainRemovalPolicy,
});

this.opensearchSlowIndexLogGroup = new logs.LogGroup(this, `${props.id}-opensearch-slowIndex-loggroup`, {
    logGroupName: `/aws/opensearch/slow-index`,
    removalPolicy: props.domainRemovalPolicy,
});

this.opensearchSlowSearchLogGroup = new logs.LogGroup(this, `${props.id}-opensearch-slowSearch-loggroup`, {
    logGroupName: `/aws/opensearch/slow-search`,
    removalPolicy: props.domainRemovalPolicy,
});

Step 2: Set Up Lambda to Update the Existing Resource Policy

Calling the AWS SDK from a Lambda deployed by CDK was a game changer for me. I wrote a Lambda function that looks up the existing CloudWatch Logs resource policy, works out which log group ARNs are missing from it, and updates or creates a statement allowing OpenSearch (es.amazonaws.com) to write to those log groups. It then applies the updated policy with put_resource_policy so that OpenSearch has the correct permissions.

import json
import boto3
import os

cloudwatch_logs = boto3.client('logs')

def handler(event, context):
    policy_name = os.environ.get('POLICY_NAME')
    # LOG_GROUP_ARN carries a comma-separated list of log group ARNs.
    raw_arns = os.environ.get('LOG_GROUP_ARN') or ''
    new_resources = [arn.strip() for arn in raw_arns.split(',') if arn.strip()]

    try:
        response = cloudwatch_logs.describe_resource_policies()
        # Check if the specified policy exists
        existing_policy = next(
            (policy for policy in response.get('resourcePolicies', [])
             if policy.get('policyName') == policy_name), 
            None
        )

        if existing_policy:
            policy_document = json.loads(existing_policy['policyDocument'])
        else:
            print('Policy not found. Creating a new policy.')
            policy_document = {
                "Version": "2012-10-17",
                "Statement": []
            }

        new_resources_str = [str(resource) for resource in new_resources]

        existing_policy_document_str = json.dumps(policy_document)

        resources_to_add = []

        for new_resource in new_resources_str:
            if new_resource not in existing_policy_document_str:
                print(f"New resource {new_resource} not found, adding it.")
                resources_to_add.append(new_resource)

        if resources_to_add:
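            # Find an existing statement whose principal is the OpenSearch service.
            # Principal can be a plain string (e.g. "*"), so check the type before calling .get().
            es_statement = next(
                (stmt for stmt in policy_document['Statement']
                 if isinstance(stmt.get('Principal'), dict)
                 and stmt['Principal'].get('Service') == 'es.amazonaws.com'),
                None
            )

            if es_statement:
                # Resource may be a single string; normalise it to a list before adding ARNs.
                existing_resources = es_statement.get('Resource', [])
                if isinstance(existing_resources, str):
                    existing_resources = [existing_resources]
                es_statement['Resource'] = existing_resources + resources_to_add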
            else:
                es_statement = {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "es.amazonaws.com"
                    },
                    "Action": [
                        "logs:CreateLogStream",
                        "logs:PutLogEvents"
                    ],
                    "Resource": resources_to_add
                }
                policy_document['Statement'].append(es_statement)

        else:
            print('No new resources to add.')
        update_params = {
            'policyName': policy_name,
            'policyDocument': json.dumps(policy_document),
        }

        update_response = cloudwatch_logs.put_resource_policy(**update_params)

        return {'status': 'SUCCESS'}

    except Exception as e:
        print('Error occurred while updating policy:', str(e))
        return {'status': 'FAILURE', 'error': str(e)}
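The merge step can also be written as a pure function, which makes it easy to test without calling AWS. This is a sketch under my own assumptions, not the code I deployed: the name merge_log_group_arns is hypothetical, and unlike the Lambda above it compares ARNs exactly within the OpenSearch statement instead of searching the whole policy text for a substring.

```python
import copy

OPENSEARCH_PRINCIPAL = "es.amazonaws.com"


def merge_log_group_arns(policy_document, arns):
    """Return a copy of policy_document whose OpenSearch statement covers every ARN in arns."""
    doc = copy.deepcopy(policy_document)  # leave the caller's document untouched
    statements = doc.setdefault("Statement", [])

    # Look for a statement that already grants access to the OpenSearch service.
    es_statement = next(
        (s for s in statements
         if isinstance(s.get("Principal"), dict)
         and s["Principal"].get("Service") == OPENSEARCH_PRINCIPAL),
        None,
    )
    if es_statement is None:
        es_statement = {
            "Effect": "Allow",
            "Principal": {"Service": OPENSEARCH_PRINCIPAL},
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": [],
        }
        statements.append(es_statement)

    # Resource may be a single string in an existing policy; work with a list.
    resources = es_statement.get("Resource", [])
    if isinstance(resources, str):
        resources = [resources]
    for arn in arns:
        if arn not in resources:  # exact match, so similar ARNs are not confused
            resources.append(arn)
    es_statement["Resource"] = resources
    return doc
```

The exact comparison matters when one ARN is a prefix of another (for example a log group ARN with and without a trailing ":*"), a case where a substring check would treat the shorter ARN as already present.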

Step 3: Create a Custom Resource

After creating my Lambda function, I needed to trigger it at the right time during stack deployment. A custom resource does this by invoking the Lambda as part of the deployment.
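As a rough sketch of what the Lambda side of a custom resource can look like, here is a handler shape that assumes the function is wired through the CDK Provider framework (that wiring is an assumption and is not shown in this post). The apply_policy parameter and the physical ID string are hypothetical names used only for illustration.

```python
def on_event(event, context, apply_policy=None):
    """Handle a custom resource lifecycle event (Create, Update or Delete)."""
    request_type = event["RequestType"]

    if request_type in ("Create", "Update") and apply_policy is not None:
        # Let exceptions propagate: under the Provider framework a raised error,
        # not a returned status field, is what marks the resource as FAILED.
        apply_policy()

    # On Delete nothing is changed, so a policy shared with other stacks stays in place.
    # Reusing the same PhysicalResourceId keeps an update from looking like a replacement.
    return {
        "PhysicalResourceId": event.get(
            "PhysicalResourceId", "opensearch-logs-resource-policy"
        )
    }
```

In this shape, the policy update from Step 2 would be passed in (or called directly) as apply_policy, and a failed update would surface as a failed deployment instead of being swallowed.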

Step 4: Update OpenSearch Domain Configuration

Finally, I configured OpenSearch to use the log groups and disabled automatic resource policy creation:

logging: {
    appLogEnabled: true,
    appLogGroup: opensearchAppLogGroup,
    slowIndexLogEnabled: true,
    slowIndexLogGroup: opensearchSlowIndexLogGroup,
    slowSearchLogEnabled: true,
    slowSearchLogGroup: opensearchSlowSearchLogGroup,
},
suppressLogsResourcePolicy: true,  

Gotchas and Lessons Learned

What Didn't Work

I tried importing the log groups with logs.LogGroup.fromLogGroupName() and calling addToResourcePolicy on them, but that gave me the same resource limit error. Apparently, you can't modify resources outside your stack this way (thanks, GitHub issue #6548!).

Mistakes along the way

Initially, I created both the log groups and the Lambda function within the same construct, while the OpenSearch domain lived in a different construct that received the log groups as props. This caused problems because the log groups resolved to the correct log group name inside their own construct but to a different value when referenced from the OpenSearch construct, so the Lambda failed to update the policy correctly. The fix was to make sure the log groups were referenced by their concrete names. This insight came from reviewing the CDK implementation (see log-group.ts in the references), whose documentation says:

"Returns an environment-sensitive token that should be used for the resource's 'name' attribute (e.g., bucket.bucketName). Normally, this token will resolve to nameAttr, but if the resource is referenced across environments, it will be resolved to this.physicalName, which will be a concrete name."


Final Thoughts

While this solution worked for me, I’m still relatively new to CDK, so there might be better approaches out there. I just wanted to document my findings in one place, hoping that it might help someone, even in the smallest way. The GitHub issues I referenced were incredibly helpful in providing context and guiding me toward a solution. I honestly wouldn’t have been able to find a resolution without those discussions. If you have a more elegant solution or if I’ve made any mistakes in my understanding, I’d truly appreciate hearing your thoughts!

References

https://github.com/aws/aws-cdk/pull/28707

https://github.com/aws/aws-cdk/issues/23637

https://github.com/aws/aws-cdk/issues/6548

https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_logs.LogGroup.html#static-fromwbrlogwbrgroupwbrarnscope-id-loggrouparn

https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-logs/lib/log-group.ts

https://github.com/aws/aws-cdk/issues/20567

Top comments (1)

sooraj js

Very helpful, it saved me days of work.
Thanks a lot.