Deep Dive on AWS Clean Rooms with Integration to AWS Glue

“I went through the AWS documentation to take a deep dive on AWS Clean Rooms and its integration with AWS Glue, and to understand how AWS Clean Rooms works for querying, analysis, and machine learning modeling. AWS Clean Rooms pricing depends on the SQL compute cost and the type of analysis rule.”

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all of the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months. AWS Glue provides both visual and code-based interfaces to make data integration easier. Users can easily find and access data using the AWS Glue Data Catalog.

AWS Clean Rooms helps you and your partners analyze and collaborate on your collective datasets to gain new insights without revealing underlying data to one another. You can use AWS Clean Rooms, a secure collaboration workspace, to create your own clean rooms in minutes, and start analyzing your collective datasets with just a few steps. You can choose the partners with whom you want to collaborate, select their datasets and configure restrictions for participants.

In this post, you will learn how AWS Clean Rooms works and how to integrate it with AWS Glue. I have created an S3 bucket, a Glue database, and a Glue table, which I then integrate with AWS Clean Rooms.

Prerequisites

You’ll need an Amazon Simple Storage Service (Amazon S3) bucket for this post. Getting started with Amazon Simple Storage Service provides instructions on how to create a bucket.

You’ll need an AWS Glue database and table for this post. Getting started with AWS Glue provides instructions on how to create a Glue database and table.

Architecture Overview

The architecture diagram shows the overall deployment architecture and the data flow across Amazon S3, AWS Glue, AWS Clean Rooms, and Amazon CloudWatch.

Solution overview

The blog post consists of the following phases:

  1. Create a Member Collaboration in AWS Clean Rooms
  2. Integrate the Glue Database and Table with AWS Clean Rooms and Associate It with the Collaboration
  3. Review Query Analysis Output in the Collaboration

I have already created an S3 bucket, a Glue database, and a Glue table for this walkthrough.
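For readers who prefer to script this setup, here is a minimal sketch of registering a Glue database and a CSV-backed table with boto3. The database, table, bucket, and column names are hypothetical placeholders, not the ones used in this post, and the CSV file format is an assumption.

```python
def build_glue_table_input(table_name, bucket, prefix, columns):
    """Build the TableInput dict for glue.create_table (CSV files in S3 assumed)."""
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "csv"},
        "StorageDescriptor": {
            "Columns": [{"Name": name, "Type": col_type} for name, col_type in columns],
            "Location": f"s3://{bucket}/{prefix}",
            "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                "Parameters": {"field.delim": ","},
            },
        },
    }


def create_glue_resources(database_name, table_input):
    """Create the database and table. Needs AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above works without AWS access

    glue = boto3.client("glue")
    glue.create_database(DatabaseInput={"Name": database_name})
    glue.create_table(DatabaseName=database_name, TableInput=table_input)


# Hypothetical names and columns, for illustration only.
table_input = build_glue_table_input(
    "example_table",
    "example-bucket",
    "data/",
    [("customer_id", "string"), ("region", "string"), ("amount", "double")],
)
```

Calling `create_glue_resources("example_db", table_input)` with valid credentials would create both resources; the builder is kept separate so the request can be inspected first.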

Phase 1: Create a Member Collaboration in AWS Clean Rooms

  1. Open the AWS Clean Rooms console and create a collaboration with the required parameters: a name, the members to add (yourself and other AWS accounts), the member abilities, who pays for queries, query logging, and any optional settings. Then configure the membership: query logging settings, the destination where query results are stored, and the result format. Review the details and create the collaboration.
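The same console choices can be expressed as boto3 requests. This is a sketch under assumptions, not the exact configuration used in this post: the account ID, display names, bucket, and the decision to give the creator both query and result abilities are all hypothetical.

```python
def build_collaboration_request(name, description, creator_name, partner_account_id, partner_name):
    """Parameters for cleanrooms.create_collaboration."""
    return {
        "name": name,
        "description": description,
        "creatorDisplayName": creator_name,
        # Assumption: the creator runs queries and receives the results.
        "creatorMemberAbilities": ["CAN_QUERY", "CAN_RECEIVE_RESULTS"],
        # Assumption: the creator pays for query compute.
        "creatorPaymentConfiguration": {"queryCompute": {"isResponsible": True}},
        "members": [
            {
                "accountId": partner_account_id,
                "displayName": partner_name,
                "memberAbilities": [],
            }
        ],
        "queryLogStatus": "ENABLED",
    }


def build_membership_request(collaboration_id, results_bucket, key_prefix):
    """Parameters for cleanrooms.create_membership, including the results destination."""
    return {
        "collaborationIdentifier": collaboration_id,
        "queryLogStatus": "ENABLED",
        "defaultResultConfiguration": {
            "outputConfiguration": {
                "s3": {"resultFormat": "CSV", "bucket": results_bucket, "keyPrefix": key_prefix}
            }
        },
    }


def create_collaboration_and_membership(collab_request, results_bucket, key_prefix):
    """Create both resources. Needs AWS credentials, so it is not called here."""
    import boto3

    client = boto3.client("cleanrooms")
    collaboration = client.create_collaboration(**collab_request)["collaboration"]
    membership = client.create_membership(
        **build_membership_request(collaboration["id"], results_bucket, key_prefix)
    )["membership"]
    return collaboration["id"], membership["id"]


# Hypothetical values, for illustration only.
collab_request = build_collaboration_request(
    "example-collaboration", "Demo collaboration", "Creator", "111122223333", "Partner"
)
```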

Phase 2: Integrate the Glue Database and Table with AWS Clean Rooms and Associate It with the Collaboration

  1. Configure a new table from the Glue database and table created earlier. Choose the schema and the allowed columns, and specify a name for the new configured table. Then configure the analysis rule with parameters such as the rule type, aggregate functions, join controls, and so on. Specify the query results controls, then review and configure the rule. Finally, associate the configured table with the collaboration.
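As a rough illustration of what these console steps configure, the sketch below builds the three boto3 requests involved: a configured table, an aggregation analysis rule, and the association with the collaboration membership. The column names, the choice of an aggregation rule with `SUM`, the minimum threshold of 2, and the IAM role ARN are assumptions made for the example.

```python
def build_configured_table_request(name, glue_database, glue_table, allowed_columns):
    """Parameters for cleanrooms.create_configured_table."""
    return {
        "name": name,
        "tableReference": {"glue": {"databaseName": glue_database, "tableName": glue_table}},
        "allowedColumns": allowed_columns,
        "analysisMethod": "DIRECT_QUERY",
    }


def build_aggregation_rule(configured_table_id, aggregate_column, join_column, dimension_column):
    """Parameters for cleanrooms.create_configured_table_analysis_rule."""
    return {
        "configuredTableIdentifier": configured_table_id,
        "analysisRuleType": "AGGREGATION",
        "analysisRulePolicy": {
            "v1": {
                "aggregation": {
                    "aggregateColumns": [{"columnNames": [aggregate_column], "function": "SUM"}],
                    "joinColumns": [join_column],
                    "dimensionColumns": [dimension_column],
                    "scalarFunctions": [],
                    # Query results control: each output row must cover at
                    # least 2 distinct join-column values (assumed threshold).
                    "outputConstraints": [
                        {"columnName": join_column, "minimum": 2, "type": "COUNT_DISTINCT"}
                    ],
                }
            }
        },
    }


def build_association_request(name, membership_id, configured_table_id, role_arn):
    """Parameters for cleanrooms.create_configured_table_association."""
    return {
        "name": name,
        "membershipIdentifier": membership_id,
        "configuredTableIdentifier": configured_table_id,
        "roleArn": role_arn,
    }


# Hypothetical names, for illustration only.
table_request = build_configured_table_request(
    "example_configured_table", "example_db", "example_table",
    ["customer_id", "region", "amount"],
)
rule_request = build_aggregation_rule("ct-placeholder", "amount", "customer_id", "region")
```

Each dict can be passed to the matching `boto3.client("cleanrooms")` method with `**request`; the real configured table ID comes from the `create_configured_table` response.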

Phase 3: Review Query Analysis Output in the Collaboration

[Screenshots: running a query in the collaboration and viewing the results]
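Queries can also be submitted programmatically. The sketch below starts a protected query and polls until it reaches a terminal status. It takes the client as a parameter so it can be tried with a stub; the SQL text, table name, and bucket in the usage note are hypothetical, and the polling interval is an arbitrary choice.

```python
import time

# Statuses after which the query will not change state again.
TERMINAL_STATUSES = {"SUCCESS", "FAILED", "CANCELLED", "TIMED_OUT"}


def run_protected_query(client, membership_id, sql, bucket, key_prefix,
                        poll_seconds=5.0, max_polls=60):
    """Start a SQL protected query and poll its status.

    `client` is expected to be boto3.client("cleanrooms") or an object
    exposing the same two methods.
    """
    started = client.start_protected_query(
        type="SQL",
        membershipIdentifier=membership_id,
        sqlParameters={"queryString": sql},
        resultConfiguration={
            "outputConfiguration": {
                "s3": {"resultFormat": "CSV", "bucket": bucket, "keyPrefix": key_prefix}
            }
        },
    )
    query_id = started["protectedQuery"]["id"]

    for _ in range(max_polls):
        status = client.get_protected_query(
            membershipIdentifier=membership_id,
            protectedQueryIdentifier=query_id,
        )["protectedQuery"]["status"]
        if status in TERMINAL_STATUSES:
            return query_id, status
        time.sleep(poll_seconds)

    # Local sentinel (not an AWS status): polling stopped before the query finished.
    return query_id, "POLL_LIMIT_REACHED"
```

A call might look like `run_protected_query(boto3.client("cleanrooms"), membership_id, "SELECT region, SUM(amount) FROM example_table GROUP BY region", "example-bucket", "results/")`, where the query must satisfy the analysis rule configured on the table.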

Clean-up

Delete the S3 bucket, the Glue database and table, the AWS Clean Rooms resources, and the CloudWatch log group.
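If the resources were created by script, the clean-up can be scripted as well. The sketch below only builds an ordered plan of boto3 delete calls; the ordering (dependent resources first) is my assumption, and every identifier in the example is a hypothetical placeholder. Note that `delete_bucket` succeeds only on an empty bucket, so objects have to be removed beforehand.

```python
def build_cleanup_plan(ids):
    """Return (service, operation, kwargs) tuples in an assumed safe deletion order."""
    return [
        ("cleanrooms", "delete_configured_table_association", {
            "configuredTableAssociationIdentifier": ids["association_id"],
            "membershipIdentifier": ids["membership_id"],
        }),
        ("cleanrooms", "delete_configured_table", {
            "configuredTableIdentifier": ids["configured_table_id"],
        }),
        ("cleanrooms", "delete_membership", {"membershipIdentifier": ids["membership_id"]}),
        ("cleanrooms", "delete_collaboration", {
            "collaborationIdentifier": ids["collaboration_id"],
        }),
        ("glue", "delete_table", {"DatabaseName": ids["glue_database"], "Name": ids["glue_table"]}),
        ("glue", "delete_database", {"Name": ids["glue_database"]}),
        ("logs", "delete_log_group", {"logGroupName": ids["log_group"]}),
        # The bucket must already be empty when this call runs.
        ("s3", "delete_bucket", {"Bucket": ids["bucket"]}),
    ]


def run_cleanup(plan):
    """Execute the plan. Needs AWS credentials, so it is not called here."""
    import boto3

    clients = {}
    for service, operation, kwargs in plan:
        if service not in clients:
            clients[service] = boto3.client(service)
        getattr(clients[service], operation)(**kwargs)


# Hypothetical identifiers, for illustration only.
plan = build_cleanup_plan({
    "association_id": "assoc-placeholder",
    "membership_id": "membership-placeholder",
    "configured_table_id": "table-placeholder",
    "collaboration_id": "collab-placeholder",
    "glue_database": "example_db",
    "glue_table": "example_table",
    "log_group": "example-log-group",
    "bucket": "example-bucket",
})
```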

Pricing

I reviewed the pricing and the estimated cost of this example.

Cost of Data Transfer = $0.0

Cost of CloudWatch = $0.0

Cost of Glue = $0.0

Cost of S3 = $0.02

Cost of AWS Clean Rooms (SQL compute at $2.00 per CRPU-hour) = $0.0

Total Cost = $0.02
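The total is a plain sum of the line items. As a small worked sketch, the snippet below reproduces that sum and shows how the Clean Rooms line would scale with usage at the $2.00 per CRPU-hour rate quoted above. The `crpu_hours` input in the last line is a hypothetical figure, not a measurement from this example.

```python
from decimal import Decimal

CRPU_HOUR_RATE = Decimal("2.00")  # rate quoted in this post


def clean_rooms_cost(crpu_hours):
    """SQL compute cost for a given number of CRPU-hours."""
    return Decimal(crpu_hours) * CRPU_HOUR_RATE


# Line items from the estimate above, in USD.
line_items = {
    "Data Transfer": Decimal("0.0"),
    "CloudWatch": Decimal("0.0"),
    "Glue": Decimal("0.0"),
    "S3": Decimal("0.02"),
    "AWS Clean Rooms": Decimal("0.0"),
}

total = sum(line_items.values(), Decimal("0"))
print(f"Total cost: ${total}")

# Hypothetical usage: half a CRPU-hour of SQL compute.
example_cost = clean_rooms_cost("0.5")
```

Using `Decimal` rather than floats keeps small currency amounts exact.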

Summary

In this post, I showed how to take a deep dive on AWS Clean Rooms and integrate it with AWS Glue.

For more details on AWS Clean Rooms, check out Get started with AWS Clean Rooms and open the AWS Clean Rooms console. To learn more, read the AWS Clean Rooms documentation.

Thanks for reading!

Connect with me: LinkedIn