DEV Community

Data Tech Bridge
Data Tech Bridge

Posted on

Amazon RDS - Cheat Sheet

AWS RDS Service Cheat Sheet for AWS Certified Data Engineer - Associate (DEA-C01)

Core Concepts and Building Blocks

Amazon Relational Database Service (RDS) is a managed relational database service that makes it easier to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, resizable capacity for industry-standard relational databases and manages common database administration tasks.

Key components:

  • DB Instances: Primary building block of RDS
  • DB Engines: Supported database software (MySQL, PostgreSQL, Oracle, SQL Server, MariaDB, Aurora)
  • DB Parameter Groups: Configuration settings for database engines
  • Option Groups: Additional features for specific DB engines
  • Subnet Groups: Collection of subnets for DB instance placement in VPC
  • Security Groups: Control network access to DB instances
  • Snapshots & Backups: Point-in-time recovery mechanisms
  • Multi-AZ Deployments: High availability configuration
  • Read Replicas: Performance scaling for read operations

Mind Map: AWS RDS Components

Amazon RDS
├── DB Instances
│   ├── Instance Classes (T3, M5, R5, etc.)
│   ├── Storage Types (GP2, GP3, io1, io2)
│   └── Configurations (CPU, Memory, Storage)
├── Database Engines
│   ├── MySQL
│   ├── PostgreSQL
│   ├── MariaDB
│   ├── Oracle
│   ├── SQL Server
│   └── Aurora (MySQL/PostgreSQL compatible)
├── High Availability
│   ├── Multi-AZ Deployments
│   └── Automatic Failover
├── Scaling Options
│   ├── Vertical Scaling (Instance Resize)
│   ├── Storage Scaling
│   └── Read Replicas
├── Security
│   ├── VPC & Security Groups
│   ├── IAM Authentication
│   ├── Encryption (At-rest & In-transit)
│   └── SSL Connections
├── Monitoring & Management
│   ├── CloudWatch Metrics
│   ├── Enhanced Monitoring
│   └── Performance Insights
└── Backup & Recovery
    ├── Automated Backups
    ├── Manual Snapshots
    └── Point-in-Time Recovery
Enter fullscreen mode Exit fullscreen mode

Key Points for AWS RDS

  1. RDS supports six database engines: MySQL, PostgreSQL, MariaDB, Oracle, Microsoft SQL Server, and Amazon Aurora.

  2. DB instances are the basic building blocks of RDS, consisting of isolated database environments in the cloud.

  3. RDS offers three storage types: General Purpose SSD (gp2/gp3), Provisioned IOPS SSD (io1/io2), and Magnetic (standard, legacy).

  4. General Purpose SSD (gp3) provides baseline performance of 3,000 IOPS and 125 MiB/s throughput, with ability to provision up to 16,000 IOPS and 1,000 MiB/s throughput.

  5. General Purpose SSD (gp2) provides baseline 3 IOPS per GB with a minimum of 100 IOPS and can burst up to 3,000 IOPS.

  6. Provisioned IOPS SSD (io1/io2) allows you to specify IOPS rate when creating or modifying a DB instance, with maximum ratio of 50:1 between IOPS and storage in GiB.

  7. Maximum storage size for RDS instances is 64 TiB for most engines (16 TiB for SQL Server).

  8. Maximum provisioned IOPS is 256,000 for io2 Block Express (supported only on specific instance types).

  9. Multi-AZ deployments provide enhanced availability and durability by automatically provisioning and maintaining a synchronous standby replica in a different AZ.

  10. During a planned or unplanned outage of your DB instance, Amazon RDS automatically switches to the standby replica (typically 60-120 seconds failover time).

  11. Read replicas allow you to create read-only copies of your database to offload read traffic from the primary DB instance.

  12. You can create up to 15 read replicas per source DB instance.

  13. Read replicas can be promoted to standalone DB instances (breaking replication).

  14. Cross-region read replicas are supported, allowing for global distribution of read workloads.

  15. RDS supports automated backups with point-in-time recovery (PITR) for up to 35 days.

  16. Manual DB snapshots can be retained indefinitely until explicitly deleted.

  17. RDS instances can be encrypted at rest using AWS KMS keys, but encryption cannot be added to an existing unencrypted instance (must create new encrypted instance).

  18. All engines support encryption of data in transit using SSL/TLS.

  19. RDS supports IAM database authentication for MySQL and PostgreSQL, allowing token-based authentication.

  20. Parameter groups allow you to manage database engine configuration with parameters that apply to all databases on the instance.

  21. Option groups let you enable and configure additional features for specific database engines.

  22. RDS instances can be placed in a VPC for network isolation, with security groups controlling access.

  23. Enhanced Monitoring provides OS-level metrics for your DB instance with granularity as low as 1 second.

  24. Performance Insights helps you monitor database load and analyze performance issues.

  25. RDS supports storage autoscaling, automatically increasing storage when free space is low.

  26. Maximum database size varies by engine: MySQL/MariaDB/PostgreSQL/Oracle (64 TiB), SQL Server (16 TiB).

  27. RDS instances can be accessed only through the endpoint DNS name, not through IP addresses.

  28. Each DB instance has a weekly maintenance window for system updates.

  29. You can modify most DB instance settings without downtime, but some changes require an instance reboot.

  30. RDS Proxy provides connection pooling and reduces connection management overhead for applications.

  31. RDS supports major and minor version upgrades for all database engines.

  32. Blue/Green deployments allow for safe database updates with minimal downtime.

  33. RDS can automatically scale storage when approaching capacity limits (if enabled).

  34. RDS supports resource-level permissions through IAM policies.

  35. RDS instances can be tagged for cost allocation and organization.

  36. RDS supports reserved instances for significant cost savings on long-term commitments.

  37. RDS instances can be stopped for up to 7 days (storage costs still apply).

  38. RDS supports integration with AWS Secrets Manager for credential management.

  39. RDS can publish logs to CloudWatch Logs for centralized log management.

  40. RDS supports CloudWatch Events for DB instance state changes.

Database Engine Comparison

Feature MySQL PostgreSQL MariaDB Oracle SQL Server Aurora
Max DB Size 64 TiB 64 TiB 64 TiB 64 TiB 16 TiB 128 TiB
Max Tables Unlimited Unlimited Unlimited Unlimited Limited by storage Unlimited
IAM Auth Yes Yes No No No Yes
Encryption Yes Yes Yes Yes Yes Yes
Read Replicas 15 15 15 5 5 15
Multi-AZ Yes Yes Yes Yes Yes Built-in
Engine Versions 5.7, 8.0 10-15 10.3-10.6 12c, 19c 2014-2019 MySQL 5.7/8.0, PostgreSQL 11-15
Global Database No No No No No Yes
Serverless No No No No No Yes (v1, v2)
Auto Scaling Storage only Storage only Storage only Storage only Storage only Compute & Storage

Instance Types and Use Cases

Instance Family Use Case Key Characteristics
T3/T4g Dev/Test, Small Production Burstable performance, low cost
M5/M6g/M6i General Purpose Balanced compute, memory, and network
R5/R6g/R6i Memory-Optimized High memory-to-vCPU ratio for memory-intensive workloads
X2g Memory-Optimized Highest memory-to-vCPU ratio for in-memory databases
C5/C6g/C6i Compute-Optimized High performance for compute-bound workloads
z1d High Frequency High frequency (4.0 GHz) for latency-sensitive workloads

Storage Types and Performance

Storage Type Use Case IOPS Throughput Cost
General Purpose SSD (gp3) Most workloads 3,000 baseline, up to 16,000 125 MiB/s baseline, up to 1,000 MiB/s $$
General Purpose SSD (gp2) Dev/Test, small-to-medium DBs 3 IOPS/GB (min 100, max 16,000) Limited by IOPS $$
Provisioned IOPS SSD (io1) I/O-intensive workloads Up to 64,000 Limited by IOPS $$$
Provisioned IOPS SSD (io2) Critical, I/O-intensive workloads Up to 64,000 (256,000 for Block Express) Limited by IOPS $$$
Magnetic (standard) Legacy applications N/A N/A $

Example Storage Calculation

For a database requiring 5,000 IOPS with 500 GB storage:

  1. gp2: 500 GB × 3 IOPS/GB = 1,500 baseline IOPS (insufficient)
  2. gp3: 3,000 baseline IOPS + 2,000 additional IOPS = 5,000 IOPS
  3. io1/io2: Directly provision 5,000 IOPS

Cost comparison (example, prices vary by region):

  • gp3: $0.10/GB-month + $0.10/provisioned IOPS-month above 3,000 = $50 + $0.20 = $50.20/month
  • io1: $0.125/GB-month + $0.10/provisioned IOPS-month = $62.50 + $500 = $562.50/month

RDS Features and Details

Feature Description Limits/Notes
DB Instances Primary building block of RDS Up to 40 per region (default, can be increased)
Storage SSD options for different performance needs 20 GB to 64 TiB (16 TiB for SQL Server)
Multi-AZ Synchronous standby replica in different AZ 60-120 second failover time
Read Replicas Asynchronous replicas for read scaling Up to 15 per source instance
Automated Backups Daily full backup with transaction logs 0-35 day retention period
Manual Snapshots User-initiated backups Retained until explicitly deleted
Point-in-Time Recovery Restore to any point within backup retention period 5-minute recovery point objective (RPO)
Encryption KMS-based encryption for data at rest Cannot encrypt existing unencrypted instance
IAM Authentication Token-based authentication MySQL and PostgreSQL only
Enhanced Monitoring OS-level metrics 1, 5, 10, 15, 30, or 60-second intervals
Performance Insights Database performance analysis 7 days free, extended retention available
Event Notifications SNS notifications for RDS events Up to 20 subscriptions per region
Parameter Groups Engine configuration settings Engine-specific parameters
Option Groups Additional database features Engine-specific options
Storage Autoscaling Automatic storage increases Must enable, 10% free space threshold
RDS Proxy Connection pooling service Reduces connection management overhead
Blue/Green Deployments Database update staging environment Minimal downtime for major updates
Reserved Instances Discounted pricing for 1-3 year commitments Up to 60% cost savings
Global Databases Cross-region replication Aurora only, 1-second typical replication lag
Serverless Automatic scaling based on workload Aurora only (v1, v2)

RDS Limits

  1. Maximum DB instances per region: 40 (default, can be increased)
  2. Maximum storage per DB instance: 64 TiB (16 TiB for SQL Server)
  3. Maximum IOPS with io1/io2: 256,000 (with io2 Block Express)
  4. Maximum read replicas per source instance: 15
  5. Maximum backup retention period: 35 days
  6. Maximum parameter groups per region: 50
  7. Maximum option groups per region: 20
  8. Maximum event subscriptions per region: 20
  9. Maximum DB security groups per region: 25
  10. Maximum rules per DB security group: 20
  11. Maximum DB subnet groups per region: 50
  12. Maximum subnets per DB subnet group: 20
  13. Maximum VPC security groups per DB instance: 5

Performance Considerations

  1. Choose the appropriate instance class based on workload requirements (memory, CPU, network).
  2. Use Provisioned IOPS (io1/io2) for I/O-intensive workloads requiring consistent performance.
  3. For cost-effective performance, use gp3 with ability to independently configure IOPS and throughput.
  4. Monitor storage IOPS utilization and increase provisioned IOPS if consistently above 90%.
  5. Use read replicas to offload read traffic from the primary instance.
  6. Consider Aurora for workloads requiring high throughput and low latency.
  7. Optimize database queries and indexes to improve performance.
  8. Use appropriate parameter group settings for your workload (buffer sizes, connection limits).
  9. Monitor connection count to avoid hitting max_connections limit.
  10. Use RDS Proxy to manage database connections efficiently for applications with many short-lived connections.
  11. Enable Performance Insights to identify database load bottlenecks.
  12. Use Enhanced Monitoring to identify OS-level resource constraints.
  13. Consider placement in the same AZ as your application for lower latency (if not using Multi-AZ).
  14. For write-heavy workloads, scale vertically by choosing a larger instance class.
  15. For read-heavy workloads, scale horizontally using read replicas.
  16. Implement connection pooling at the application level to reduce connection overhead.
  17. Use appropriate transaction isolation levels for your workload.

Data Ingestion and Migration

  1. AWS Database Migration Service (DMS) can be used for homogeneous and heterogeneous migrations to RDS.
  2. Native database tools (mysqldump, pg_dump) can be used for smaller databases.
  3. For large databases, consider using physical backups and AWS Snowball for initial data load.
  4. RDS supports importing data from Amazon S3 for MySQL and PostgreSQL.
  5. Use AWS SCT (Schema Conversion Tool) for heterogeneous migrations.
  6. DMS supports continuous data replication with minimal downtime.
  7. For high-volume data ingestion, consider batch processing to avoid performance impact.
  8. Implement throttling in your application to control the rate of writes to the database.
  9. For bulk data loads, consider temporarily disabling foreign key constraints and indexes.
  10. RDS MySQL supports LOAD DATA FROM S3 for efficient data loading.
  11. RDS PostgreSQL supports the COPY command for efficient data loading.
  12. When migrating to RDS, ensure network bandwidth is sufficient for data transfer.
  13. For replayable data ingestion pipelines, use change data capture (CDC) with DMS.
  14. Consider using Aurora parallel query for faster data loading and analysis.
  15. Implement idempotent data ingestion processes to handle retries safely.

Monitoring RDS with CloudWatch

Metric Description Threshold Recommendation
CPUUtilization Percentage of CPU utilization <80% sustained
DatabaseConnections Number of client connections <80% of max_connections
FreeableMemory Amount of available RAM >20% of total memory
FreeStorageSpace Available storage space >20% of allocated storage
ReadIOPS/WriteIOPS I/O operations per second <90% of provisioned IOPS
ReadLatency/WriteLatency Average time for I/O operations <20ms for SSD, <100ms for magnetic
DiskQueueDepth Number of outstanding I/O requests <10 for most workloads
ReplicaLag How far behind read replica is from primary Depends on RPO, typically <30s
SwapUsage Amount of swap space used Should be near zero
BinLogDiskUsage Binary log space used (MySQL/MariaDB) Monitor for growth trends
NetworkReceiveThroughput/NetworkTransmitThroughput Network traffic to/from instance Monitor for growth trends
  1. Set up CloudWatch alarms for critical metrics like CPUUtilization, FreeStorageSpace, and DatabaseConnections.
  2. Use Enhanced Monitoring for deeper insights into OS processes and resource utilization.
  3. Performance Insights provides a Performance Management dashboard with key metrics like DB load and top SQL queries.
  4. Create custom CloudWatch dashboards combining RDS metrics with application metrics for end-to-end monitoring.
  5. Set up RDS Event subscriptions for operational notifications.

High Availability and Disaster Recovery

  1. Multi-AZ deployments provide high availability within a region with automatic failover.
  2. Read replicas can be promoted to primary in case of regional failure (manual process).
  3. Cross-region read replicas provide disaster recovery capabilities across regions.
  4. Aurora Global Database provides cross-region replication with fast failover for disaster recovery.
  5. RDS automated backups and manual snapshots can be copied across regions for disaster recovery.
  6. Recovery Time Objective (RTO) for Multi-AZ failover is typically 60-120 seconds.
  7. Recovery Point Objective (RPO) for automated backups is typically 5 minutes or less.
  8. Test failover procedures regularly to ensure they work as expected.
  9. Document recovery procedures for different failure scenarios.
  10. Implement application-level retry logic to handle temporary database unavailability.

Security Best Practices

  1. Use VPC security groups to restrict access to RDS instances.
  2. Enable encryption at rest for all production databases.
  3. Use SSL/TLS for all database connections.
  4. Implement IAM database authentication where supported.
  5. Rotate database credentials regularly.
  6. Use AWS Secrets Manager to manage and rotate database credentials.
  7. Enable audit logging for database activity.
  8. Use AWS Config and Security Hub to monitor RDS security compliance.
  9. Implement least privilege access for database users.
  10. Regularly patch database engines by upgrading to the latest minor version.

Cost Optimization

  1. Use Reserved Instances for predictable workloads to save up to 60%.
  2. Stop non-production instances when not in use (storage costs still apply).
  3. Right-size instances based on CloudWatch metrics and Performance Insights.
  4. Use Aurora Serverless for variable or unpredictable workloads.
  5. Monitor and optimize storage usage to avoid unnecessary costs.
  6. Use gp3 storage for better price-performance ratio compared to io1/io2.
  7. Consider Multi-AZ only for production environments requiring high availability.
  8. Use read replicas strategically only where needed for performance.
  9. Set appropriate backup retention periods based on compliance requirements.
  10. Use Storage Auto Scaling to avoid over-provisioning storage.

Implementing Throttling and Overcoming Rate Limits

  1. RDS has service quotas like maximum DB instances per region (40 by default) that can be increased via AWS Support.
  2. Implement connection pooling to manage database connections efficiently and avoid hitting max_connections limit.
  3. Use exponential backoff and jitter in applications when encountering throttling or connection errors.
  4. For bulk operations, implement batch processing with controlled throughput.
  5. Monitor API call rates to avoid hitting AWS API throttling limits.
  6. Use RDS Proxy to manage connection pools and reduce connection churn.
  7. Implement circuit breakers in applications to prevent cascading failures during database overload.
  8. For write-heavy workloads, consider sharding data across multiple RDS instances.
  9. Use caching solutions like ElastiCache to reduce database load.
  10. Implement rate limiting at the application layer to control database access.

Throughput and Latency Characteristics

  1. RDS MySQL/PostgreSQL can typically handle thousands of transactions per second depending on instance size and configuration.
  2. Aurora can provide up to 5x the throughput of standard MySQL and 3x the throughput of standard PostgreSQL.
  3. Network latency between application and database is typically <1ms within same AZ, 1-2ms across AZs, and 50-100ms across regions.
  4. Read replicas add replication lag (typically milliseconds to seconds) but can significantly increase read throughput.
  5. Storage throughput varies by storage type: gp3 (125-1,000 MiB/s), io1/io2 (limited by IOPS and instance type).
  6. Transaction commit latency is higher in Multi-AZ deployments due to synchronous replication.
  7. Aurora Global Database typically has <1 second replication lag between regions.
  8. RDS Proxy adds minimal latency (typically <1ms) while providing connection pooling benefits.
  9. Larger instance types generally provide better network performance and higher EBS-optimized throughput.
  10. I/O-intensive operations like full table scans can saturate storage throughput and increase latency for other operations.

Replayability of Data Ingestion Pipelines

  1. Use AWS DMS with CDC (Change Data Capture) for continuous replication with the ability to restart from checkpoints.
  2. Implement idempotent data processing to safely handle retries and duplicates.
  3. Use Amazon MSK (Managed Streaming for Kafka) or Kinesis as a buffer for database writes with replay capability.
  4. Store raw data in S3 before processing to enable reprocessing if needed.
  5. Implement a dead letter queue (DLQ) for failed database operations to retry later.
  6. Use transaction IDs or sequence numbers to track progress and enable resumption.
  7. Implement checkpointing in data pipelines to restart from known good states.
  8. Consider using AWS Glue for ETL processes with built-in retry mechanisms.
  9. Use AWS Step Functions to orchestrate complex data ingestion workflows with error handling and retry logic.
  10. Implement proper logging and monitoring to quickly identify and resolve data ingestion issues.

RDS Aurora Specific Features

  1. Aurora provides up to 15 low-latency read replicas with typical lag under 100ms.
  2. Aurora storage automatically scales in 10GB increments up to 128TB.
  3. Aurora replicates data across 6 storage nodes in 3 AZs for high durability.
  4. Aurora can automatically recover from physical storage failures.
  5. Aurora Global Database supports cross-region replication with typical lag under 1 second.
  6. Aurora Serverless v2 automatically scales capacity based on workload, from 0.5 to 128 ACUs (Aurora Capacity Units).
  7. Aurora supports zero-downtime patching for most maintenance operations.
  8. Aurora parallel query can improve analytical query performance by pushing processing to the storage layer.
  9. Aurora supports backtrack, allowing you to "rewind" a database to a previous point in time without restoring from backup.
  10. Aurora supports fast cloning using copy-on-write protocol, making database clones very storage-efficient.

Open Source Components in RDS

RDS Engine Open Source Version Key Differences in RDS Implementation
MySQL Community Edition Limited root access, automated management, some plugins disabled
PostgreSQL Community Edition Limited superuser access, automated management, some extensions pre-installed
MariaDB Community Edition Limited root access, automated management, some plugins disabled
  1. RDS MySQL is based on the community edition but restricts certain features that require file system access.
  2. RDS PostgreSQL supports many but not all PostgreSQL extensions (e.g., no custom C functions).
  3. RDS MariaDB follows similar restrictions as MySQL regarding root access and filesystem access.
  4. Aurora MySQL is wire-compatible with MySQL but uses a completely different storage architecture.
  5. Aurora PostgreSQL is wire-compatible with PostgreSQL but uses Aurora's distributed storage system.
  6. RDS provides automated minor version upgrades for all open source engines.
  7. RDS parameter groups allow configuration of most but not all engine parameters.
  8. RDS manages log rotation and backup management differently than self-managed open source databases.
  9. RDS provides enhanced monitoring and metrics beyond what's available in open source versions.
  10. RDS restricts direct OS access that would be available in self-managed open source databases.

Advanced Features and Integrations

  1. RDS supports integration with AWS Lambda for custom automation via event notifications.
  2. RDS can publish logs to CloudWatch Logs for centralized log management and analysis.
  3. RDS integrates with AWS Backup for cross-service backup management.
  4. RDS supports tagging for cost allocation and resource organization.
  5. RDS can be integrated with AWS Directory Service for AD authentication (SQL Server and Oracle).
  6. RDS PostgreSQL supports logical replication starting with version 10.
  7. RDS MySQL supports GTID-based replication starting with version 5.7.
  8. RDS Oracle supports Oracle Data Guard for managed disaster recovery.
  9. RDS SQL Server supports native backup and restore to/from S3.
  10. RDS PostgreSQL supports Foreign Data Wrappers for connecting to external data sources.

Best Practices for Data Engineering

  1. Design database schemas with partitioning for large tables to improve query performance and manageability.
  2. Implement appropriate indexing strategies based on query patterns.
  3. Use connection pooling at the application layer to efficiently manage database connections.
  4. Implement a caching strategy using ElastiCache to reduce database load for frequently accessed data.
  5. Design for horizontal scaling using read replicas for read-heavy workloads.
  6. Implement database sharding for write-heavy workloads that exceed single-instance capacity.
  7. Use appropriate transaction isolation levels based on application requirements.
  8. Implement proper error handling and retry logic in applications accessing RDS.
  9. Regularly analyze and optimize slow queries using Performance Insights.
  10. Implement a comprehensive monitoring strategy covering database performance, availability, and security.
  11. Use AWS Glue Data Catalog to make RDS data discoverable for analytics.
  12. Consider using Aurora for mission-critical applications requiring high throughput and availability.
  13. Implement proper backup and disaster recovery procedures based on RPO/RTO requirements.
  14. Use blue/green deployments for major version upgrades to minimize risk and downtime.
  15. Implement a database change management process using infrastructure as code (e.g., AWS CloudFormation).
  16. Consider data lifecycle management, archiving old data to S3 for cost optimization.
  17. Use AWS DMS for continuous replication to data warehouses or data lakes for analytics.
  18. Implement proper security controls including encryption, network isolation, and access management.
  19. Consider multi-region architectures for global applications requiring low latency and high availability.
  20. Regularly test recovery procedures to ensure they meet business requirements.

Top comments (0)