Database migration is one of the most critical and common aspects of cloud migration activities that DevOps engineers and cloud experts encounter on a regular basis. DB migration can also be one of the more complex problems to solve in such cases, since in spite of its frequent occurrence, a straightforward solution is not always readily available for it, due to a varying number of use cases and requirements for each customer application. For these cases, DevOps engineers need to think outside the box, be innovative, and develop a custom solution to fulfill the criteria for their specific use case.
In a similar occurrence, a recent development effort for one of our clients called for the migration of DynamoDB tables between two AWS accounts. Considering the extensive catalog of services and functionalities offered by AWS, one would assume they might have provided an inherent functionality to export the table backups to another AWS region or AWS account, similar to what is currently possible for RDS. However, I soon found out disappointingly that this is not possible. So I began my research for migrating DynamoDB data across AWS accounts.
I assumed that it would be easy to implement as you can probably create a small script to fetch all the data from a source table and make queries to add the data to a destination table. After hours of scouring Google search results, GitHub repositories, and Stack Overflow threads, I was unable to come across an appropriate solution that would work for my use case. The ones that I found struggled to handle tables with a large amount of data. In my case, I was dealing with a specific table that had approximately 200,000 items.
The AWS recommended solution for migrating data in this scenario contained a two step process where data was first exported to an S3 bucket. From this S3 bucket, the data could be either copied over or exported to another S3 bucket in the destination AWS account with the necessary permissions configured. The data could then be imported from that S3 bucket to the destination DynamoDB table to complete the migration process. In theory, these steps seem simple and straightforward right up to the point where you figure out that AWS has not provided any easy way to import data from an S3 bucket to DynamoDB. They do provide a way to export data from a DynamoDB table to an S3 bucket, but in order to import the data, the recommended approach is to use AWS Data Pipeline.
AWS Data Pipeline is a service that sets up an automated pipeline that can be run manually or on a schedule and the pipeline itself utilizes an EMR cluster to perform data migration and data transformation steps. The problem with this approach is that it is not easy to set up and it would definitely incur extra costs on the AWS account as it would deploy some resources in the EMR cluster which are going to be charged for the amount of time they are up and running.
Nevertheless, even with the already provided template to import DynamoDB data from S3, I was not able to setup AWS DataPipeline successfully nor could I get the logs to work in order to figure out what was wrong. At this point, I started looking into alternatives since it seemed that this solution would require more effort and time to make it work.
A few suggested solutions involving custom Python scripts and Node modules simply fetched the data from a table and added all the entries to another table. This solution did not require the use of any additional AWS resources. So far so good. This seemed like a promising lead. However, as I proceeded with this solution I realized that it started to struggle at scale, with the migration time increasing for tables with more than 200,000 entries. It took around 3-4 hours to transfer 50% of the table entries, which was definitely not ideal. I needed a more optimized solution.
I finally decided to write a script of my own that utilized the asynchronous nature of NodeJS to achieve the desired functionality. The approach I used was to first fetch all items from the table using multiple scan calls until all of the table entries are fetched. I then proceeded to use the BatchWriteItem call to add items to the table; this call imposes a max limit of 25 items at a time. To cater for this limit, I divided the table entries into batches of 25 items and executed the BatchWriteItem call for each batch in an asynchronous manner so that the script does not wait for the response of one batch call to send another one. This approach greatly reduced the execution time of the script. It transferred the data from the table with 200,000 entries within 6-7 minutes, instead of hours.
The next problem I faced was that this BatchWriteItem approach was not certain to process all items and according to the documentation, it sometimes returned a list of unprocessed items. For these unprocessed items, I had to send a request again, which was also done asynchronously. The script would retry batch write calls on all the unprocessed items and wait for all the calls to be completed before checking if there are still some remaining unprocessed items. This process was repeated until all items were processed successfully and all entries from the source DynamoDB table were migrated to the destination table in a different AWS account. In between each retry call, an exponential backoff algorithm was implemented, a recommendation from AWS documentation. The algorithm works by introducing a small delay between each retry call and it would double the delay time after each retry. For example, if we start with a one second delay after the initial attempt at retrying the batch calls, then before the second attempt, there will be a two second delay, and in the next retry, there will be a four second delay, and so on.
For better understanding, the diagram below shows the complete workflow of the script:
In order to save fellow developers out there facing a similar problem, we have decided to open source the code for this interesting, unique, and highly efficient solution. The script has been developed by Xgrid in collaboration with our partner company, copebit AG. The code for the script is available in this GitHub repository.
We are planning to further optimize this script in the future and will also publish an NPM package for this solution so that it is modular and simple enough to be used by anyone. Further enhancements will be adding a CLI tool within the NPM package to make it even easier to consume.
Our team regularly faces interesting scenarios in our day to day activities while developing custom solutions and applications for our customers on AWS and other cloud environments. We plan on writing a series of blogs on other similar solutions that we have developed and are open sourcing in the future to share our exciting experiences and insights with other developers so that they can take advantage of these tools or solutions and enhance them for their own use cases and requirements.
Xgrid is an IT services and cloud consulting company that has been working in the areas of Test Automation, Continuous Integration/Delivery, Workflows and Process Automation, Custom Application and Tool Development (with end-to-end UX design and UI development); all tied to private, public and hybrid clouds since it was founded in 2012. Xgrid specializes in the above verticals and provides best-in-class cloud migration, software development, automation, user experience design, graphical interface development and product support services for our customers’ custom-built solutions.
For more details on our expertise, you can visit our website.
About copebit AG
copebit AG is an innovative and dynamic Swiss IT company that focuses on Cloud Consulting and Engineering. Besides their requirement to always master the latest Cloud portfolio, copebit also offers project management from the classic way to the versatile world of agile methods.
For more details on their expertise, you can visit their website.