DEV Community

k.goto for AWS Heroes


How to Easily Delete Amazon S3 Tables

Amazon S3 Tables

Amazon S3 Tables was announced during the AWS re:Invent 2024 keynote.

Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support, and the easiest way to store tabular data at scale. S3 Tables are specifically optimized for analytics workloads, resulting in up to 3x faster query throughput and up to 10x higher transactions per second compared to self-managed tables.

Announcing Amazon S3 Tables – Fully managed Apache Iceberg tables optimized for analytics workloads

New Amazon S3 Tables: Storage optimized for analytics workloads

Please refer to the references above for details.


S3 Bucket Types

With the introduction of S3 Tables, a new type of bucket called "Table Buckets" has been added to S3.

Currently, S3 has the following types of buckets:

  • General Purpose Buckets
    • Traditional S3 buckets
  • Directory Buckets
    • Buckets used with S3 Express One Zone
  • Table Buckets
    • Buckets used with the newly announced "S3 Tables"
    • Store tables in Apache Iceberg format
    • Can be queried through integrations with Amazon Athena, Amazon EMR, Apache Spark, etc.
    • S3 Tables itself is generally available, but as of February 2, 2025, the integration features are in preview

Components of S3 Tables

S3 Tables consists of the following components:

  • Table Buckets
  • Namespaces
  • Tables

These form a hierarchical structure of "Table Bucket / Namespace / Table".

As of February 2, 2025, there are default service quotas per AWS account:

  • Table Buckets
    • Maximum of 10 table buckets per account (per region)
  • Namespaces
    • Maximum of 10,000 namespaces per table bucket
  • Tables
    • Maximum of 10,000 tables per table bucket

For more details, please refer to the reference.


Deletion Rules for S3 Tables (Table Buckets)

To delete a Table Bucket in S3 Tables, the following rules apply:

  • You must first delete all namespaces in the target table bucket
  • To delete a namespace, you must first delete all tables within that namespace

In other words, deleting a table bucket requires the following sequence:

  1. Retrieve all namespaces in the target table bucket
  2. Retrieve all tables belonging to each namespace
  3. Delete all tables in each namespace
  4. Delete all namespaces in the target table bucket
  5. Delete the target table bucket
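
The sequence above can be sketched in Python as follows. This is only an illustration of the ordering, not cls3's actual implementation. The method and parameter names imitate boto3's "s3tables" client and should be treated as assumptions to verify against the SDK documentation. InMemoryTablesClient is a hypothetical stand-in so the sketch runs without AWS access, and pagination is omitted.

```python
class InMemoryTablesClient:
    """Hypothetical stand-in for an S3 Tables client (no AWS access needed).

    Method and parameter names mimic boto3's "s3tables" client; they are
    assumptions here, so check the SDK documentation before real use.
    """

    def __init__(self, namespaces):
        # Maps namespace name -> list of table names.
        self.namespaces = {ns: list(tables) for ns, tables in namespaces.items()}
        self.bucket_exists = True
        self.calls = []  # Records every delete call in order.

    def list_namespaces(self, tableBucketARN):
        return {"namespaces": [{"namespace": [ns]} for ns in self.namespaces]}

    def list_tables(self, tableBucketARN, namespace):
        return {"tables": [{"name": t} for t in self.namespaces[namespace]]}

    def delete_table(self, tableBucketARN, namespace, name):
        self.namespaces[namespace].remove(name)
        self.calls.append(("delete_table", namespace, name))

    def delete_namespace(self, tableBucketARN, namespace):
        # Mirrors the rule that a namespace must be empty before deletion.
        if self.namespaces[namespace]:
            raise RuntimeError(f"namespace {namespace} is not empty")
        del self.namespaces[namespace]
        self.calls.append(("delete_namespace", namespace))

    def delete_table_bucket(self, tableBucketARN):
        # Mirrors the rule that a table bucket must have no namespaces left.
        if self.namespaces:
            raise RuntimeError("table bucket is not empty")
        self.bucket_exists = False
        self.calls.append(("delete_table_bucket", tableBucketARN))


def delete_table_bucket(client, bucket_arn):
    """Delete all tables, then all namespaces, then the table bucket."""
    # Step 1: retrieve all namespaces in the table bucket.
    response = client.list_namespaces(tableBucketARN=bucket_arn)
    namespaces = [entry["namespace"][0] for entry in response["namespaces"]]

    # Steps 2-3: retrieve and delete every table in each namespace.
    for namespace in namespaces:
        tables = client.list_tables(tableBucketARN=bucket_arn, namespace=namespace)
        for table in tables["tables"]:
            client.delete_table(
                tableBucketARN=bucket_arn, namespace=namespace, name=table["name"]
            )

    # Step 4: every namespace is now empty and can be deleted.
    for namespace in namespaces:
        client.delete_namespace(tableBucketARN=bucket_arn, namespace=namespace)

    # Step 5: the table bucket is now empty and can be deleted.
    client.delete_table_bucket(tableBucketARN=bucket_arn)
```

Calling delete_table_bucket(client, arn) with a stand-in such as InMemoryTablesClient({"sales": ["orders"]}) records the delete calls in client.calls. Deleting in any other order makes the stand-in raise, which is the same constraint the rules above describe.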

I've made it possible to do this with a single command.


cls3

cls3 is an open-source CLI tool I developed for deleting S3 buckets.

The GitHub repository is here. (Stars would be appreciated!)


Previous Features

Please refer to the blog and GitHub mentioned above for details.

Specifically, you can:

  • Delete or empty buckets
  • Achieve fast bucket deletion through parallel object deletion
    • Automatically retry when throttling errors occur during bulk deletions
  • Search and select multiple buckets in interactive mode
  • Delete multiple buckets across regions at once
  • Delete buckets with versioning enabled
  • Delete only old versions of objects
  • Delete Directory Buckets for S3 Express One Zone
  • Use cls3 in GitHub Actions

Installation

cls3 can be installed in several ways:

  • Homebrew
  • One-liner script
    • Linux, Darwin (macOS) and Windows
  • aqua
  • asdf
  • GitHub Releases
    • Direct download from browser
  • Git clone + install
    • For Go language developers

For detailed installation instructions, please refer to the installation section in the README.


Table Buckets Mode

cls3 v0.23.0 adds the ability to delete S3 Tables' Table Buckets.

In cls3, you can either specify bucket names directly with the CLI option (-b) or select buckets interactively with interactive mode (-i).

This release introduces Table Buckets Mode, which allows you to easily delete S3 Tables' Table Buckets as shown below (using interactive mode as an example):

(Image: cls3 demo in interactive mode)

The "-t | --tableBucketsMode" option deletes the "contents" of a Table Bucket, meaning it deletes all namespaces and tables in the target Table Bucket.

When combined with the "-f | --forceMode" option, it will delete the entire Table Bucket along with all namespaces and tables.
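
The difference between the two modes can be modeled with a short Python function. This is a hypothetical illustration of the behavior described above, not cls3 code: the function name and the dictionary layout are assumptions, and it only lists deletion targets without deleting anything.

```python
def table_buckets_mode_targets(bucket, namespaces, force_mode):
    """Return what would be deleted for one Table Bucket in Table Buckets Mode.

    namespaces maps each namespace name to its list of table names.
    """
    targets = []
    # -t | --tableBucketsMode: all tables and all namespaces are deleted.
    for namespace, tables in namespaces.items():
        targets += [f"table {bucket}/{namespace}/{table}" for table in tables]
    targets += [f"namespace {bucket}/{namespace}" for namespace in namespaces]
    if force_mode:
        # -f | --forceMode: the Table Bucket itself is deleted as well.
        targets.append(f"table bucket {bucket}")
    return targets
```

With force_mode=False the Table Bucket itself never appears in the result; with force_mode=True it is always the last entry, after its contents.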


Important Notes

The S3 Tables API uses different endpoints from the existing S3 API.

Whether because the S3 Tables API is new or by design, parallel deletion operations can hit TooManyRequests errors or internal errors that require retries.

Therefore, cls3 keeps the degree of parallelism low for Table Buckets and automatically retries when these errors occur.

*Note: If errors still occur, I plan to further reduce the degree of parallelism in future releases.
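
The general idea of combining low parallelism with automatic retries can be sketched in Python as follows. This is not cls3's implementation: ThrottledError is a hypothetical stand-in for a retryable error, and the worker count, attempt limit, and backoff delays are illustrative assumptions.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable error such as throttling."""


def with_retries(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying retryable failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # Give up after the final attempt.
            # The base delay doubles on each attempt; jitter spreads out retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))


def delete_all(names, delete_one, max_workers=2, **retry_options):
    """Delete items with a deliberately small worker pool, retrying each one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(with_retries, lambda n=name: delete_one(n), **retry_options)
            for name in names
        ]
        for future in futures:
            future.result()  # Re-raises if an item failed on every attempt.
```

Keeping max_workers small reduces the chance of throttling in the first place, and the backoff gives the service time to recover when throttling does occur.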
