In one of my previous posts, Building a serverless connected BBQ as SaaS - Part 4 - AuthZ, I touched on Authentication and Authorization with distributed PEPs (Policy Enforcement Points) and a centralized PDP (Policy Decision Point). In this post I will dig a bit deeper and expand on that setup. I'll explore how these concepts work in practice, the benefits they offer, and how we can leverage them in our serverless architecture using AWS Lambda, API Gateway, and Cognito User Pools.
Additionally, I’ll talk about the Role-Based Access Control (RBAC) model, how to implement it using Cognito Groups and DynamoDB, and how caching can boost the performance of our authorization system.
The entire setup, with detailed deployment instructions, and all the code can be found on Serverless Handbook PEP and PDP.
Let's start with a short recap.
Authentication (AuthN) vs. Authorization (AuthZ)
It’s crucial to distinguish between Authentication and Authorization. The two terms often get mixed up, and I have had to explain the difference on many occasions, but they serve very different purposes.
Authentication (AuthN)
Authentication is all about verifying identity. It answers the question, Who are you?
When a user logs into our application, authentication ensures that they are who they claim to be. This could involve something as simple as a username and password or more complex multi-factor authentication (MFA).
Authorization (AuthZ)
Once a user’s identity is authenticated, authorization kicks in. This process answers the question, What can you do?
Authorization determines what resources, data, and actions a user is permitted to access based on their roles and permissions.
What are PEP and PDP?
Before diving into how to implement them in AWS, it’s essential to understand the roles that PEP and PDP play in authorization.
PEP (Policy Enforcement Point)
In simple terms, the PEP is the gatekeeper. It is the point in our system where access control decisions are enforced. When a user attempts to access a protected resource, the PEP is the component responsible for allowing or denying the request based on the user’s permissions.
In our case, the Lambda Authorizer in API Gateway acts as the PEP. The Lambda Authorizer intercepts every incoming API request, validates the JWT token (typically from Cognito User Pool or any identity provider), and forwards the user’s information (claims) to the PDP for authorization evaluation.
The PEP ensures that the JWT is valid: it checks its expiration, verifies its signature, and validates claims (like aud, iss, and sub). It then passes the claims to the PDP for a final decision on whether the user is authorized to access the requested resource.
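The claim checks described above can be sketched in plain Python. This is a minimal illustration that assumes the token has already been decoded and its signature verified elsewhere; the function name validate_claims, the issuer URL, and the client ID are made up for the example.

```python
import time


def validate_claims(claims, issuer, audience, now=None, leeway=0):
    """Return a list of problems found in already-decoded JWT claims.

    Signature verification is out of scope here; this only illustrates
    the exp / iss / aud / sub checks.
    """
    now = time.time() if now is None else now
    problems = []

    # exp is a Unix timestamp; an optional leeway tolerates small clock skew
    exp = claims.get("exp")
    if exp is None or now > exp + leeway:
        problems.append("token expired or missing exp")

    if claims.get("iss") != issuer:
        problems.append("unexpected issuer")

    # aud may be a single string or a list of audiences
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("unexpected audience")

    if not claims.get("sub"):
        problems.append("missing subject")

    return problems


# Made-up claims for demonstration only
claims = {
    "sub": "user-123",
    "iss": "https://issuer.example.com/pool",
    "aud": "example-client-id",
    "exp": 2_000,
}
print(validate_claims(claims, "https://issuer.example.com/pool", "example-client-id", now=1_000))
print(validate_claims(claims, "https://issuer.example.com/pool", "other-client", now=3_000))
```

Returning a list of problems rather than raising on the first one makes it easier to log why a token was rejected. In a real authorizer a JWT library would normally do these checks together with the signature verification.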
PDP (Policy Decision Point)
The PDP is where the authorization logic resides. Once the PEP checks the JWT and ensures that the token is valid, the PDP determines whether the user is allowed to access the requested resource based on their roles, permissions, or policies.
The PDP is a separate implementation, a separate microservice. In our case it is a Lambda function that performs the actual authorization decision. It checks the user’s roles, which are stored in the cognito:groups claim in the JWT (from Cognito), and compares them against the permissions required to access a specific resource, stored in a data store. In our case we'll use DynamoDB.
The PDP validates if the user's role (like Admin, User, or Manager) has the necessary permissions to access the resource (e.g., GET /admin, POST /profile). The PDP can also incorporate additional business logic, such as checking time-based access or geo-fencing.
Benefits of using PEP and PDP in Authorization
Implementing distributed PEPs and a centralized PDP offers several benefits, especially as our application scales.
Separation of Concerns: By splitting the concerns of enforcement (PEP) and decision (PDP), we gain cleaner, more maintainable code. The Lambda Authorizer (PEP) is focused purely on validation and enforcement, while the PDP is dedicated to policy evaluation.
Reduced Latency: By placing PEPs close to where decisions need to be enforced, we can reduce latency. With a caching strategy it can be reduced even more.
Management: With a centralized PDP, all of our authorization logic is centralized in one location. This makes it easier to manage and update policies as our requirements evolve. Whether it’s modifying roles or adding new permission sets, having a central PDP reduces the overhead of updating policies in multiple places.
Consistency and Compliance: Every request is evaluated against the same set of policies, ensuring consistent decision-making across our system.
Scalability: Both the PEP and PDP components scale independently based on demand. If our system needs to handle a larger volume of requests, API Gateway and Lambda can scale automatically. Additionally, the PDP can be optimized for performance by implementing caching.
Flexibility: A PDP allows us to adapt the authorization model to our needs. If our requirements change (for example, moving to attribute-based access control (ABAC) or introducing a more granular permission system), we can easily modify the PDP to accommodate these changes without affecting other parts of the system.
Using PEP and PDP in AWS with Serverless Architecture
In AWS, the PEP and PDP integration fits perfectly with serverless components like Lambda and API Gateway.
PEP - API Gateway Lambda Authorizer
When a client sends a request to our API Gateway endpoint, the Lambda Authorizer (PEP) intercepts the request before it hits our backend service. Our implementation will perform several key steps.
JWT Validation: It decodes the JWT, validates the signature, and checks if the token has expired.
Forwarding Claims: After verifying the token, the Lambda Authorizer forwards the claims (such as sub, groups, and role) to the PDP for further authorization checks. In our solution we will actually forward the entire JWT.
To reduce the number of calls to both our PEP and our PDP, we can use the authorization cache that exists in API Gateway.
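Before any validation can happen, the PEP has to find the token in the request. The sketch below shows one way to do that. It assumes the token arrives in the Authorization header, with or without a Bearer prefix; the helper name extract_bearer_token is made up for this example.

```python
def extract_bearer_token(headers):
    """Return the token from the Authorization header, or None if absent.

    Header names are matched case-insensitively because clients may send
    "Authorization" or "authorization".
    """
    for name, value in (headers or {}).items():
        if name.lower() != "authorization":
            continue
        value = (value or "").strip()
        # Drop the optional "Bearer " scheme prefix
        if value.lower().startswith("bearer "):
            value = value[len("bearer "):].strip()
        return value or None
    return None


print(extract_bearer_token({"Authorization": "Bearer abc.def.ghi"}))
print(extract_bearer_token({"authorization": "abc.def.ghi"}))
print(extract_bearer_token({}))
```

Matching the header name case-insensitively avoids a silent 401 when a client sends a differently cased header than the one the code looks up.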
PDP - Authorization logic Lambda function
The PDP is, in our case, implemented as a separate Lambda function. It receives the entire JWT, or the claims, and performs the authorization logic in several steps.
- Check the user’s role (using the cognito:groups claim from Cognito).
- Query a DynamoDB table that contains role-to-permission mappings (e.g., which roles have access to which API endpoints).
- Evaluate whether the user’s role matches the required permissions for the requested resource or API endpoint.
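The steps above can be sketched without any AWS dependencies. In this illustration the PERMISSIONS dictionary is a hypothetical in-memory stand-in for the DynamoDB table, and the function decide is a made-up name. Unlike a PDP that only looks at a single role, this variation evaluates every group the user belongs to and lets an explicit Deny win, which is an assumption of this sketch and not necessarily how the deployed solution behaves.

```python
# Hypothetical in-memory stand-in for the role-to-permission table
PERMISSIONS = {
    ("Admin", "GET /unicorn"): "Allow",
    ("Test", "POST /unicorn"): "Allow",
    ("Developer", "DELETE /unicorn"): "Deny",
}


def decide(groups, action, resource, permissions=PERMISSIONS):
    """Return "Allow" only if some role allows the request and none denies it."""
    key = f"{action} {resource}"
    effects = {permissions.get((role, key)) for role in groups}
    if "Deny" in effects:
        return "Deny"  # an explicit deny always wins
    if "Allow" in effects:
        return "Allow"
    return "Deny"  # default deny when no rule matches


print(decide(["Admin"], "GET", "/unicorn"))
print(decide(["Test"], "GET", "/unicorn"))
print(decide(["Admin", "Developer"], "DELETE", "/unicorn"))
```

Defaulting to Deny when no rule matches keeps the decision safe if a permission row is missing or misspelled.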
ID Token vs Access Token
As we implement the PEP and PDP workflow, it’s essential to understand the difference between ID Tokens and Access Tokens, as both are often used in authorization workflows.
ID Token
The ID Token is primarily used for authentication and contains information about who the user is. It contains claims about the identity of the authenticated user, such as name, email, and phone_number.
Access Token
The Access Token is used to grant the user access to protected resources, that is, authorization. It contains information about the user’s permissions, such as the scopes they have been granted, which define what the user can do (e.g., read:profile, write:profile). A Cognito access token does not include the aud claim; the app client ID is carried in the client_id claim instead.
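To see the difference in practice, the payload of a token can be inspected. The sketch below decodes the payload without verifying the signature, so it is for inspection only and must not be used for access decisions. Cognito marks its tokens with a token_use claim ("id" or "access"); the tokens built here are unsigned and made up for the demonstration, and the helper names are hypothetical.

```python
import base64
import json


def peek_claims(token):
    """Decode the JWT payload WITHOUT verifying the signature (inspection only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def token_kind(claims):
    """Return the value of token_use, or "unknown" if the claim is missing."""
    return claims.get("token_use", "unknown")


def _fake_token(claims):
    """Build an unsigned, made-up token for demonstration purposes."""
    def enc(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}."


id_token = _fake_token(
    {"token_use": "id", "aud": "example-client-id", "email": "a@example.com"}
)
access_token = _fake_token(
    {"token_use": "access", "client_id": "example-client-id", "scope": "openid profile"}
)

print(token_kind(peek_claims(id_token)))
print(token_kind(peek_claims(access_token)))
```

A validator that insists on an aud claim will accept the ID token and reject the access token, so it is worth deciding up front which token the PEP and PDP expect.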
Token customization in Cognito
Previously, the Pre token generation Lambda trigger could only customize the ID Token, and therefore the ID Token was often used for authorization as well. With the introduction of the V2 trigger event in Cognito User Pools we can customize both the ID and the Access Token.
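As a rough sketch of what such a trigger could look like, the handler below adds one claim to each token. The response field names follow the V2 event format as I understand it and should be verified against the Cognito documentation; the claim names tenant and role and their values are made up for illustration and are not part of this solution.

```python
def handler(event, context):
    """Pre token generation trigger sketch for the V2 event format."""
    # Groups the user belongs to, as passed in by Cognito
    groups = (
        event.get("request", {})
        .get("groupConfiguration", {})
        .get("groupsToOverride")
        or []
    )

    # "tenant" and "role" are hypothetical claim names for this example
    event["response"] = {
        "claimsAndScopeOverrideDetails": {
            "idTokenGeneration": {
                "claimsToAddOrOverride": {"tenant": "example-tenant"},
            },
            "accessTokenGeneration": {
                "claimsToAddOrOverride": {"role": groups[0] if groups else "none"},
            },
        }
    }
    return event


sample_event = {
    "request": {"groupConfiguration": {"groupsToOverride": ["Admin"]}},
    "response": {},
}
result = handler(sample_event, None)
print(result["response"]["claimsAndScopeOverrideDetails"]["accessTokenGeneration"])
```

The trigger returns the whole event with the response section filled in, which is the usual contract for Cognito Lambda triggers.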
Implementing PEP and PDP
With that introduction completed, let's dig into implementing a PEP and PDP with RBAC. Our PEP will be the Lambda Authorizer in API Gateway and our PDP will be a separate Lambda function. The PDP will use Cognito Groups and DynamoDB for the RBAC authorization logic.
Architecture Overview
Just as a reminder, the entire code and all of the architecture can be found on Serverless Handbook PEP and PDP.
In this solution we will implement our PEP using a Lambda Authorizer in API Gateway. The PDP in this case will also be implemented using a Lambda function. We will assign users a Role using Cognito Groups and we keep a Role - Permission mapping in DynamoDB.
To better understand the flow during an API call: the client calls API Gateway with its JWT, the Lambda Authorizer (PEP) validates the token and invokes the PDP Lambda function, the PDP looks up the role's permissions in DynamoDB and returns its decision, and the PEP returns the resulting policy to API Gateway.
As described, we will not use an API Gateway in front of our PDP. Instead our PEP will invoke the PDP Lambda function directly. There are pros and cons with this approach of course.
On the pro side we have lower latency, as a direct Lambda invocation is often faster than an API call, and lower cost, as we don't have to pay for the API Gateway invocation. On the downside, we create a tighter coupling, and changing the PDP implementation might get harder. We would also need to implement a separate cache in the PDP, whereas with an API Gateway in front we could rely on the API Gateway cache.
However, the approach you choose needs to be decided case by case; there is no golden rule for exactly how to implement this.
Deploy authentication and Cognito
The first thing we will do is to deploy and set up Cognito and the resources needed for login. We will set up the Cognito User Pool, configure the managed login, and create a simple website that will handle the callbacks from Cognito and display our JWTs. For simplicity it will just be a static HTML page served from CloudFront and some Lambda@Edge functions. I will use the setup that I have described in this blog post, so for a deep dive I recommend that you read that.
So as a first step deploy the Lambda@Edge, CloudFront distribution, and SSL certificate from Serverless Handbook PEP and PDP.
Next, let's deploy and setup Cognito. We will create the UserPool, a client, login style, etc.
AWSTemplateFormatVersion: "2010-09-09"
Transform: "AWS::Serverless-2016-10-31"
Description: Creates the User Pool and Client used for Authentication
Parameters:
  ApplicationName:
    Type: String
    Description: The application that owns this setup.
  DomainName:
    Type: String
    Description: The domain name to use for cloudfront
  HostedAuthDomainPrefix:
    Type: String
    Description: The domain prefix to use for the UserPool hosted UI <HostedAuthDomainPrefix>.auth.[region].amazoncognito.com
Resources:
  UserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      UsernameConfiguration:
        CaseSensitive: false
      AutoVerifiedAttributes:
        - email
      UserPoolName: !Sub ${ApplicationName}-user-pool
      Schema:
        - Name: email
          AttributeDataType: String
          Mutable: false
          Required: true
        - Name: name
          AttributeDataType: String
          Mutable: true
          Required: true
  UserPoolClient:
    Type: AWS::Cognito::UserPoolClient
    Properties:
      UserPoolId: !Ref UserPool
      GenerateSecret: True
      AllowedOAuthFlowsUserPoolClient: true
      CallbackURLs:
        - !Sub https://${DomainName}/signin
      AllowedOAuthFlows:
        - code
        - implicit
      AllowedOAuthScopes:
        - phone
        - email
        - openid
        - profile
      SupportedIdentityProviders:
        - COGNITO
  HostedUserPoolDomain:
    Type: AWS::Cognito::UserPoolDomain
    Properties:
      Domain: !Ref HostedAuthDomainPrefix
      ManagedLoginVersion: 2
      UserPoolId: !Ref UserPool
  ManagedLoginStyle:
    Type: AWS::Cognito::ManagedLoginBranding
    Properties:
      ClientId: !Ref UserPoolClient
      UserPoolId: !Ref UserPool
      UseCognitoProvidedValues: true
  UserPoolIdParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: !Sub /${ApplicationName}/userPoolId
      Type: String
      Value: !Ref UserPool
      Description: SSM Parameter for the User Pool Id
      Tags:
        ApplicationName: !Ref ApplicationName
  UserPoolHostedUiParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: !Sub /${ApplicationName}/userPoolHostedUi
      Type: String
      Value: !Sub https://${HostedAuthDomainPrefix}.auth.${AWS::Region}.amazoncognito.com/login?client_id=${UserPoolClient}&response_type=code&scope=email+openid+phone+profile&redirect_uri=https://${DomainName}/signin
      Description: SSM Parameter for the User Pool Hosted UI
      Tags:
        ApplicationName: !Ref ApplicationName
Outputs:
  CognitoUserPoolJwksUri:
    Value: !Sub https://cognito-idp.${AWS::Region}.amazonaws.com/${UserPool}/.well-known/jwks.json
    Description: The UserPool jwks uri
    Export:
      Name: !Sub ${AWS::StackName}:jwks-url
  CognitoUserPoolID:
    Value: !Ref UserPool
    Description: The UserPool ID
  CognitoAppClientID:
    Value: !Ref UserPoolClient
    Description: The app client
    Export:
      Name: !Sub ${AWS::StackName}:app-audience
  CognitoUrl:
    Description: The url
    Value: !GetAtt UserPool.ProviderURL
  CognitoHostedUI:
    Value: !Sub https://${HostedAuthDomainPrefix}.auth.${AWS::Region}.amazoncognito.com/login?client_id=${UserPoolClient}&response_type=code&scope=email+openid+phone+profile&redirect_uri=https://${DomainName}/signin
    Description: The hosted UI URL
With this deployment done we can move over to the Console and create groups that users can be added to. I will create three groups, Admin, Developer, and Test. Click on Create Group and give it a name. The group name represents the Role that the user will have and determines what permissions they will get, more on that setup further down.
We can then create some users and assign them to one of the groups.
To test this setup we can navigate to the webpage deployed with the CloudFront distribution and inspect the JWTs stored in the cookies.
If we copy the access token and decode it, I use jwt.io, we can see that my user has the claim cognito:groups that our PEP and PDP will use later for permissions.
Setup and deploy PDP
Next we can deploy our Authorization service, our PDP, responsible for making permission decisions.
The logic will be implemented in a Lambda function. To manage role-based permissions, we will create a DynamoDB table that stores the permissions for each role. Each permission defines what resources the role can access. This could be a specific API endpoint and HTTP method, but is of course not limited to that. We'll model the table data as follows:
PK (Partition Key): The Role (e.g., Admin, User).
SK (Sort Key): The action and resource combined, for example HTTP method and endpoint, e.g. GET /unicorn.
Action: The action, e.g. GET, PUT, WRITE, READ, LIST.
Resource: The resource, for example the endpoint /unicorn.
Effect: The effect, Allow or Deny.
Description: A description of the permission.
| PK | SK | Action | Resource | Effect | Description |
|---|---|---|---|---|---|
| Admin | GET /unicorn | GET | /unicorn | Allow | Admin can access all unicorns |
| Test | POST /unicorn | POST | /unicorn | Allow | Test can post on unicorns |
| Developer | DELETE /unicorn | DELETE | /unicorn | Deny | Developer cannot delete a unicorn |
This allows us to efficiently look up permissions for each role using a simple DynamoDB query.
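A small helper can keep the PK and SK format consistent when seeding the table. This is a sketch under the data model described above; the function name build_permission_item is made up for the example, and writing the items to DynamoDB is left out.

```python
def build_permission_item(role, action, resource, effect, description=""):
    """Build one item for the role-permission table.

    The sort key joins action and resource so a single key lookup
    answers "may this role perform this action on this resource?".
    """
    if effect not in ("Allow", "Deny"):
        raise ValueError("effect must be 'Allow' or 'Deny'")
    action = action.upper()
    return {
        "PK": role,
        "SK": f"{action} {resource}",
        "Action": action,
        "Resource": resource,
        "Effect": effect,
        "Description": description,
    }


items = [
    build_permission_item("Admin", "GET", "/unicorn", "Allow", "Admin can access all unicorns"),
    build_permission_item("Test", "post", "/unicorn", "Allow", "Test can post on unicorns"),
    build_permission_item("Developer", "DELETE", "/unicorn", "Deny", "Developer cannot delete a unicorn"),
]
for item in items:
    print(item["PK"], "|", item["SK"], "|", item["Effect"])
```

Each resulting dictionary could then be written with the boto3 Table resource, for example table.put_item(Item=item).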
AWSTemplateFormatVersion: "2010-09-09"
Transform: "AWS::Serverless-2016-10-31"
Description: Connected BBQ Application Tenant Service
Parameters:
  ApplicationName:
    Type: String
    Description: Name of owning application
  UserManagementStackName:
    Type: String
    Description: The name of the stack that contains the user management part, e.g the Cognito UserPool
Globals:
  Function:
    Timeout: 30
    MemorySize: 2048
    Architectures:
      - arm64
    Runtime: python3.12
Resources:
  PermissionsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName:
        Fn::Sub: ${ApplicationName}-pdp-role-permission-map
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: PK
          AttributeType: S
        - AttributeName: SK
          AttributeType: S
      KeySchema:
        - AttributeName: PK
          KeyType: HASH
        - AttributeName: SK
          KeyType: RANGE
  LambdaPDPFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Lambda/AuthZ
      Handler: authz.handler
      Policies:
        - DynamoDBReadPolicy:
            TableName: !Ref PermissionsTable
      Environment:
        Variables:
          JWKS_URL:
            Fn::ImportValue: !Sub ${UserManagementStackName}:jwks-url
          AUDIENCE:
            Fn::ImportValue: !Sub ${UserManagementStackName}:app-audience
          PERMISSIONS_TABLE:
            !Ref PermissionsTable
Outputs:
  PDPLambdaArn:
    Value: !GetAtt LambdaPDPFunction.Arn
    Description: The ARN of the PDP Lambda Function
    Export:
      Name: !Sub ${AWS::StackName}:pdp-lambda-arn
  PDPLambdaName:
    Value: !Ref LambdaPDPFunction
    Description: The Name of the PDP Lambda Function
    Export:
      Name: !Sub ${AWS::StackName}:pdp-lambda-name
Role Authorization logic
The PDP Lambda will decode the JWT, retrieve the role from the cognito:groups claim, and query the DynamoDB table to check if the role has permission to access the requested resource.
import os
import json
import jwt
import boto3
from jwt import PyJWKClient
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["PERMISSIONS_TABLE"])
JWKS_URL = os.environ["JWKS_URL"]
AUDIENCE = os.environ["AUDIENCE"]
# Created once per execution environment so fetched signing keys can be reused
jwks_client = PyJWKClient(JWKS_URL)


def handler(event, context):
    # The PEP invokes this function directly, so the event is the payload itself
    jwt_token = event["jwt_token"]
    resource = event["resource"]
    action = event["action"]
    return check_authorization(jwt_token, action, resource)


def check_authorization(jwt_token, action, resource):
    # Fallback principal in case the token cannot be decoded at all
    principal = "unknown"
    try:
        signing_key = jwks_client.get_signing_key_from_jwt(jwt_token)
        # Passing audience makes PyJWT require a matching aud claim in the token
        decoded_token = jwt.decode(
            jwt_token,
            signing_key.key,
            algorithms=["RS256"],
            audience=AUDIENCE,
        )
        principal = decoded_token["sub"]

        # Only the first group in the claim is used as the role
        groups = decoded_token.get("cognito:groups") or []
        role = groups[0] if groups else None
        if not role:
            raise Exception("Unauthorized: Role not found in the token")

        if validate_permission(role, action, resource):
            response_body = generate_access(principal, "Allow", action, resource)
            return {
                "statusCode": 200,
                "body": json.dumps(response_body),
                "headers": {"Content-Type": "application/json"},
            }
    except Exception as e:
        print(f"Authorization error: {str(e)}")

    # Default deny: reached on any error or when no Allow rule matched
    response_body = generate_access(principal, "Deny", action, resource)
    return {
        "statusCode": 403,
        "body": json.dumps(response_body),
        "headers": {"Content-Type": "application/json"},
    }


def validate_permission(role, action, resource):
    print(f"validate_permission Role: {role}, Action: {action}, Resource: {resource}")
    try:
        # PK is the role and SK is "<action> <resource>", e.g. "GET /unicorn"
        response = table.query(
            KeyConditionExpression="PK = :role AND SK = :endpoint",
            ExpressionAttributeValues={
                ":role": role,
                ":endpoint": f"{action} {resource}",
            },
        )
        items = response["Items"]
        return bool(items) and items[0]["Effect"] == "Allow"
    except ClientError as e:
        print(f"Error querying DynamoDB: {e}")
        return False


def generate_access(principal, effect, action, resource):
    auth_response = {
        "principalId": principal,
        "effect": effect,
        "action": action,
        "resource": resource,
    }
    return auth_response
Deploy API and PEP
Now we can deploy our API and PEP, Lambda Authorizer.
AWSTemplateFormatVersion: "2010-09-09"
Transform: "AWS::Serverless-2016-10-31"
Description: Create the API for self service certificate management
Parameters:
  ApplicationName:
    Type: String
    Description: Name of owning application
  UserManagementStackName:
    Type: String
    Description: The name of the stack that contains the user management part, e.g the Cognito UserPool
  PDPStackName:
    Type: String
    Description: The name of the stack that contains the PDP service
Globals:
  Function:
    Timeout: 30
    MemorySize: 2048
    Runtime: python3.12
Resources:
  LambdaGetUnicorn:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Lambda/API/GetUnicorn
      Handler: handler.handler
      Events:
        GetUnicorns:
          Type: Api
          Properties:
            Path: /unicorn
            Method: get
            RestApiId: !Ref UnicornApi
  UnicornApi:
    Type: AWS::Serverless::Api
    Properties:
      Description: API for creating and managing Unicorns
      Name: !Sub ${ApplicationName}-api
      StageName: prod
      OpenApiVersion: '3.0.1'
      AlwaysDeploy: true
      EndpointConfiguration: REGIONAL
      Cors:
        AllowMethods: "'GET,PUT,POST,DELETE,OPTIONS'"
        AllowHeaders: "'Content-Type,Authorization,X-Amz-Date,X-Api-Key,X-Amz-Security-Token'"
        AllowOrigin: "'*'"
      Auth:
        AddDefaultAuthorizerToCorsPreflight: false
        Authorizers:
          LambdaRequestAuthorizer:
            FunctionArn: !GetAtt LambdaApiAuthorizer.Arn
            FunctionPayloadType: REQUEST
            Identity:
              Headers:
                - Authorization
              ReauthorizeEvery: 600
        DefaultAuthorizer: LambdaRequestAuthorizer
  LambdaApiAuthorizer:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Lambda/Authorizer/
      Handler: auth.handler
      Policies:
        - LambdaInvokePolicy:
            FunctionName:
              Fn::ImportValue: !Sub ${PDPStackName}:pdp-lambda-name
      Environment:
        Variables:
          JWKS_URL:
            Fn::ImportValue: !Sub ${UserManagementStackName}:jwks-url
          AUDIENCE:
            Fn::ImportValue: !Sub ${UserManagementStackName}:app-audience
          PDP_AUTHZ_ENDPOINT:
            Fn::ImportValue: !Sub ${PDPStackName}:pdp-lambda-name
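We set our PEP as the default authorizer; that way it will be added to each resource and method. To reduce the number of calls to our PDP, the authorization cache in API Gateway is used with a TTL of 600 seconds. Keep in mind that the cache key is the Authorization header, so a cached policy is reused for every request carrying the same token, regardless of path and method. A policy that only covers the method ARN of the first request can therefore lead to unexpected denies on other endpoints while the cache entry is valid.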
PEP Authorization logic
The PEP Lambda Authorizer will decode the JWT, check the validity, and then call the PDP for a final permission decision.
import os
import json
import jwt
import boto3
from jwt import PyJWKClient

lambda_client = boto3.client("lambda")
# Created once per execution environment so fetched signing keys can be reused
jwks_client = PyJWKClient(os.environ["JWKS_URL"])


def handler(event, context):
    path = event["path"]
    method = event["httpMethod"]
    # Log the request line only; the full event would include the bearer token
    print(f"Authorizing {method} {path}")

    # Header names arrive as sent by the client, so match them case-insensitively
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    token = headers.get("authorization", "")
    if not token:
        # API Gateway maps this exact message to a 401 response
        raise Exception("Unauthorized")
    token = token.replace("Bearer ", "")

    # Fallback principal in case the token cannot be decoded at all
    principal = "unknown"
    try:
        signing_key = jwks_client.get_signing_key_from_jwt(token)
        # Passing audience makes PyJWT require a matching aud claim in the token
        decoded_token = jwt.decode(
            token,
            signing_key.key,
            algorithms=["RS256"],
            audience=os.environ["AUDIENCE"],
        )
        principal = decoded_token["sub"]

        data = {
            "jwt_token": token,
            "resource": path,
            "action": method,
        }
        response = lambda_client.invoke(
            FunctionName=os.environ["PDP_AUTHZ_ENDPOINT"],
            InvocationType="RequestResponse",
            Payload=json.dumps(data),
        )
        response_payload = json.loads(response["Payload"].read())
        body = json.loads(response_payload["body"])
        # Anything other than an explicit Allow from the PDP is treated as Deny
        effect = "Allow" if body.get("effect") == "Allow" else "Deny"
        return generate_policy(principal, effect, event["methodArn"])
    except Exception as e:
        print(f"Authorization error: {str(e)}")
        return generate_policy(principal, "Deny", event["methodArn"])


def generate_policy(principal_id, effect, resource):
    auth_response = {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": resource}
            ],
        },
    }
    return auth_response
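When the API Gateway authorization cache is enabled, the policy returned for one request is reused for later requests with the same Authorization header, whatever their path and method. One way to make the policy safe to cache is to let it cover every endpoint the role has a rule for. The sketch below illustrates that idea. It assumes the PDP is extended to return all permission rows for the role, which the PDP shown in this post does not do, and the function name build_policy and the example ARN are made up.

```python
def build_policy(principal_id, method_arn, permissions):
    """Build one policy covering every endpoint the role has a rule for.

    permissions: list of dicts with "Action", "Resource" and "Effect" keys,
    shaped like the rows of the role-permission table.
    """
    # methodArn looks like arn:aws:execute-api:region:account:api-id/stage/METHOD/path
    arn_prefix, stage = method_arn.split("/")[:2]

    statements = [
        {
            "Action": "execute-api:Invoke",
            "Effect": item["Effect"],
            "Resource": f"{arn_prefix}/{stage}/{item['Action']}{item['Resource']}",
        }
        for item in permissions
    ]
    if not statements:
        # No rules at all for this role: deny the whole stage
        statements = [
            {
                "Action": "execute-api:Invoke",
                "Effect": "Deny",
                "Resource": f"{arn_prefix}/{stage}/*/*",
            }
        ]
    return {
        "principalId": principal_id,
        "policyDocument": {"Version": "2012-10-17", "Statement": statements},
    }


example_arn = "arn:aws:execute-api:eu-west-1:123456789012:abc123/prod/GET/unicorn"
policy = build_policy(
    "user-123",
    example_arn,
    [
        {"Action": "GET", "Resource": "/unicorn", "Effect": "Allow"},
        {"Action": "DELETE", "Resource": "/unicorn", "Effect": "Deny"},
    ],
)
for statement in policy["policyDocument"]["Statement"]:
    print(statement["Effect"], statement["Resource"])
```

Endpoints that are not listed in any statement are implicitly denied, so the cached policy gives the same answer no matter which request populated the cache. The alternative is to keep the single-method policy and disable the cache, at the price of calling the PDP on every request.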
Importance of caching
Caching is important in optimizing our authorization flow. By reducing calls to the PDP and speeding up decision-making, caching helps improve the overall performance, scalability, and cost-efficiency of our application.
Reduce Latency: By caching role and permission data, the PEP avoids repeated calls to our PDP, leading to faster response times and lower latency for each request.
Decrease PDP Load: Caching minimizes the number of calls made to our PDP, reducing the risk of hitting rate limits or throttling.
Improve Scalability: With fewer requests hitting our PDP, our architecture can scale more efficiently.
Lower Costs: Caching reduces the need for repeated PDP invocations, which directly lowers Lambda invocation costs.
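Besides the API Gateway authorization cache, a small in-memory cache inside a Lambda function can hold permission lookups between invocations of a warm execution environment. The sketch below shows the idea with a time-to-live per entry. The class TTLCache and the loader function are made up for this example; in the PDP the loader would be the DynamoDB lookup.

```python
import time


class TTLCache:
    """Tiny in-memory cache where each entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get_or_load(self, key, loader):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, skip the loader
        value = loader(key)
        self._entries[key] = (now + self.ttl, value)
        return value


# Stand-in for the permission lookup; counts how often it is really called
lookups = []


def load_effect(key):
    lookups.append(key)
    return "Allow"


cache = TTLCache(ttl_seconds=60)
cache.get_or_load(("Admin", "GET /unicorn"), load_effect)
cache.get_or_load(("Admin", "GET /unicorn"), load_effect)
print(len(lookups))
```

The TTL is a trade-off: a longer value saves more lookups, but a changed permission takes up to that long to take effect.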
Summary and conclusion
Implementing PEP and PDP in our authorization flow offers a highly scalable, flexible, and secure way to control access to resources. By leveraging AWS Lambda and API Gateway, we can build a serverless authorization system that separates authentication and authorization concerns, scales with demand, and simplifies policy management.
With the addition of Role-Based Access Control and DynamoDB for storing permissions, combined with authorization caching in API Gateway for enhanced performance, we can create an authorization solution that fits both current and future needs.
Understanding the difference between ID Tokens and Access Tokens ensures that our system uses each appropriately, helping us build a more secure and efficient authorization system.
Happy coding, and stay secure!
Source Code
The entire setup, with detailed deployment instructions, and all the code can be found on Serverless Handbook PEP and PDP
Final Words
Don't forget to follow me on LinkedIn and X for more content, and read the rest of my blogs.
As Werner says! Now Go Build!