Vijay Dilip Bhosale

AWS Data Migration Considerations - Part IV

Migrating relational databases to an AWS S3 data lake using AWS DMS - Part IV:

This is the fourth blog in a multi-part series on considerations and observations during the migration of relational databases to an AWS S3 data lake using the AWS DMS service. Relational databases such as Oracle, MS SQL Server, MySQL, PostgreSQL, and others were migrated to AWS.

Part I in this series covered considerations such as:

  • Modified date not populated properly at the source.
  • Enabling supplemental logging for Oracle as the source.
  • Network bandwidth issues between source and target databases.

This blog covers additional considerations for addressing the network bandwidth issue (#3) mentioned above.

As discussed in the first blog, the following CloudWatch metrics were verified to identify whether the delay was at the Oracle source or the S3 target:

  • CDCLatencySource
  • CDCLatencyTarget
  • CDCIncomingChanges
  • NetworkReceiveThroughput

It was clear from these metrics that the issue was related to network bandwidth between the source (Oracle) and target (AWS S3), and it showed up in the CDCLatencySource metric.
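As a quick illustration, these latency metrics can also be pulled programmatically. Below is a minimal boto3 sketch, assuming the standard AWS/DMS namespace with ReplicationInstanceIdentifier and ReplicationTaskIdentifier dimensions; the identifier values and region are hypothetical placeholders:

```python
import boto3
from datetime import datetime, timedelta, timezone

# Hypothetical identifiers; replace with your own instance/task names.
INSTANCE_ID = "my-replication-instance"
TASK_ID = "my-cdc-task"

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Average CDCLatencySource over the last 6 hours, in 5-minute buckets.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DMS",
    MetricName="CDCLatencySource",
    Dimensions=[
        {"Name": "ReplicationInstanceIdentifier", "Value": INSTANCE_ID},
        {"Name": "ReplicationTaskIdentifier", "Value": TASK_ID},
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=6),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Average"],
)

# A rising CDCLatencySource with a flat CDCLatencyTarget points at
# the source side (e.g., network bandwidth), as observed here.
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1), "seconds")
```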

The following additional points were considered, and changes were made to the DMS task configuration to resolve the network bandwidth issue:

Single or few ongoing changes (CDC) tasks for a given source:

AWS DMS tasks for the Oracle tables were configured as follows:

  • Separate full load task for each large table.
  • Separate CDC task for each large table.
  • One full load plus CDC task per schema, with small and medium tables grouped together.

The full load tasks executed fine, capturing a one-time copy of the source data and loading it into the target. Each CDC task, however, reads the changes for committed transactions from the redo and archived logs (with Oracle as the source here), and network bandwidth is consumed to fetch these logs into the DMS replication instance for further processing.

Network bandwidth issues were most evident around midnight during end-of-day processing and at month-end, when archived log generation rose to around 50-60 GB per hour compared to 10-15 GB per hour at other times.

Also, the volume of Oracle redo and archived logs shipped to the DMS replication instance for CDC processing is multiplied by the number of CDC tasks configured, because each task reads the logs independently.

For example, with 5 CDC tasks for the Oracle tables and 50 GB of archived logs generated, 50 * 5 = 250 GB in total would be shipped to the replication instance for processing the changes.

Hence, considering the network bandwidth issue, it was decided to configure a single CDC task (or very few) to avoid shipping multiplied copies of the archived logs to the replication instance. The CDC tasks were therefore reconfigured as:

  • Single or very few CDC tasks for all the tables across schemas

With a single CDC task, only 50 GB of archived logs would be shipped, compared to 50 * 5 = 250 GB for the same set of tables spread across 5 CDC tasks.
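The scaling above is simple multiplication; a tiny sketch using the same hypothetical numbers:

```python
def shipped_log_volume_gb(archived_log_gb: float, cdc_task_count: int) -> float:
    """Each CDC task independently fetches the full log stream,
    so shipped volume scales linearly with the task count."""
    return archived_log_gb * cdc_task_count

# 50 GB of archived logs generated at the source:
print(shipped_log_volume_gb(50, 5))  # 250.0 GB with 5 CDC tasks
print(shipped_log_volume_gb(50, 1))  # 50.0 GB with a single CDC task
```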


Changes in memory configuration for DMS CDC tasks:

MemoryKeepTime - Sets the maximum time, in seconds, that each transaction can stay in memory before being written to disk. The default value is 60.

MemoryLimitTotal - Sets the maximum size, in MB, that all transactions can occupy in memory before being written to disk. The default value is 1024.

After consulting the AWS team, it was agreed that increasing both of these values would help the CDC tasks manage memory more efficiently by keeping transactions in memory longer. Hence, the properties were changed as follows:

MemoryKeepTime - 600
MemoryLimitTotal - 10240
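
Both properties live under the ChangeProcessingTuning section of the DMS task settings JSON. Below is a minimal boto3 sketch of applying the new values; the task ARN is a hypothetical placeholder, a task generally must be stopped before its settings can be modified, and it is worth confirming in the DMS docs whether a partial settings document is merged with the existing settings:

```python
import json
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Keep transactions in memory longer before spilling to disk.
# MemoryKeepTime default is 60 (seconds); MemoryLimitTotal default is 1024 (MB).
settings = {
    "ChangeProcessingTuning": {
        "MemoryKeepTime": 600,
        "MemoryLimitTotal": 10240,
    }
}

# Hypothetical ARN; stop the task before modifying its settings.
dms.modify_replication_task(
    ReplicationTaskArn="arn:aws:dms:us-east-1:123456789012:task:EXAMPLE",
    ReplicationTaskSettings=json.dumps(settings),
)
```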

The exact values to set for these properties may differ on a case-by-case basis, and it is necessary to consult the AWS team before changing these values.

Other points checked while addressing performance issues for the DMS tasks:

  • Source database configuration, verifying CPU and memory against the overall load on the source database during CDC processing.
  • Replication instance configuration (CPU and memory) for processing the incoming changes.
  • Archived log generation (size in GB) at the source.
  • Network logs for the bandwidth used between the Oracle source and the AWS cloud.
  • LOB support for the tables: the number of LOB columns in a table and the LOB option chosen for migration (Limited, Full, or Inline) can affect data migration performance.
  • Binary Reader instead of LogMiner for reading Oracle log files, for better CDC performance (see the sketch after this list).
  • Minimum required supplemental logging, to avoid capturing unnecessary detail in the archived logs.
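
For the Binary Reader point above, the switch is typically made through extra connection attributes on the Oracle source endpoint. A minimal boto3 sketch follows; the endpoint ARN is a hypothetical placeholder, and the attribute spelling (useLogminerReader=N;useBfile=Y is the commonly documented form) should be verified against the current DMS documentation:

```python
import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Commonly documented attributes for enabling Binary Reader on an
# Oracle source endpoint; verify against current DMS docs.
dms.modify_endpoint(
    EndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:EXAMPLE",
    ExtraConnectionAttributes="useLogminerReader=N;useBfile=Y",
)
```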

Conclusion:
In this blog, the main focus was on performance optimization considerations for the network bandwidth issue faced during CDC tasks while migrating Oracle tables.
