Using AWS Lambda to clean DynamoDB of redundant data

I’d like to show off some code that I’ve written to remove redundant data from the DynamoDB tables behind my app. I’m using AWS Lambda to run this cleaning script partly because I wanted to try it out, and partly because I wanted to optimize performance. Done client-side, this “cleaning” would have resulted in a lot of calls to the DynamoDB API, which in turn would take up a lot of resources and worsen performance.

Lambda is a fully managed service where, in this case, the only thing I provide is the code. It’s quite nice not having to worry about setting up environments when only smaller tasks are executed. There are a few ways to trigger a Lambda function, among them events from other AWS services (such as S3 or DynamoDB), manual triggers from clients, and scheduled events. I’m planning on using scheduled events to execute this “cleaning” function maybe once a day, which seems like a good start for my case considering I really don’t have many users yet. The code can be written in Node.js, Java, Go, C# or Python, and depending on the chosen language it either has to be uploaded or can be written directly in AWS’s web editor. I decided to go with Node.js since I’ve recently been learning vanilla JavaScript and I’m hungry for more.
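To make the trigger part a bit more concrete, here is a minimal sketch of what a Node.js handler for a scheduled invocation could look like. It is not my actual function: `cleanRedundantData` is a hypothetical placeholder for the real cleaning logic, and the check on `source` and `detail-type` assumes the event has the shape AWS uses for scheduled events.

```javascript
// Hypothetical placeholder: a real version would query DynamoDB and delete items.
async function cleanRedundantData() {
  return { deleted: 0 };
}

// Lambda invokes the exported handler with the event that triggered it.
async function handler(event) {
  // Assumption: scheduled invocations carry these two fields.
  const isScheduled =
    event.source === 'aws.events' && event['detail-type'] === 'Scheduled Event';

  const result = await cleanRedundantData();
  return { trigger: isScheduled ? 'schedule' : 'manual', deleted: result.deleted };
}

exports.handler = handler;
```

Since the handler is just a function taking an event object, it can be called locally with a hand-written event to try it out before setting up the schedule.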

One thing I ran into early on was using promises in Node.js to make sure asynchronous tasks are handled properly. Since I’m new to Node.js, the concept was alien to me and it took a while to wrap my head around the process. My solution requires two separate requests against two different DynamoDB tables. The results from these requests are then compared to determine which items should be deleted. The hard part was keeping the code readable, and my second attempt at the problem significantly improved my code’s readability.
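The comparison step itself can be sketched as a plain function, independent of how the data is fetched. The table contents and the field names `id` and `ownerId` below are made up for illustration, and the rule "an item is redundant when its owner no longer exists in the other table" is an assumption, not necessarily the rule my app uses.

```javascript
// Returns the items whose owner is missing from the other table's results.
function findRedundantItems(items, owners) {
  const knownOwnerIds = new Set(owners.map((owner) => owner.id));
  return items.filter((item) => !knownOwnerIds.has(item.ownerId));
}

// Hypothetical results from the two table requests.
const owners = [{ id: 'u1' }, { id: 'u2' }];
const items = [
  { id: 'a', ownerId: 'u1' },
  { id: 'b', ownerId: 'u3' }, // u3 does not exist, so this item is redundant
  { id: 'c', ownerId: 'u2' },
];

console.log(findRedundantItems(items, owners)); // [ { id: 'b', ownerId: 'u3' } ]
```

Building a `Set` of the known ids first keeps the comparison to a single pass over each result, which matters more as the tables grow.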

Above is my first solution, which nests a promise inside a promise. I think it is hard to read and follow: the outer promise encapsulates the inner one, and the log statements clutter the code. I wrote it this way because I needed to be sure all data was fetched before processing it. If only there were a way to encapsulate multiple promises in a single block of code…
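The original listing is not reproduced in this copy of the post, so the following is only a rough sketch of the nesting pattern described, not the code I actually wrote. `fetchOwners` and `fetchItems` are hypothetical stand-ins for the two DynamoDB requests and simply resolve with made-up data.

```javascript
// Hypothetical stand-ins for the two DynamoDB requests.
function fetchOwners() {
  return Promise.resolve([{ id: 'u1' }]);
}
function fetchItems() {
  return Promise.resolve([
    { id: 'a', ownerId: 'u1' },
    { id: 'b', ownerId: 'u2' },
  ]);
}

function findRedundantNested() {
  return fetchOwners()
    .then((owners) => {
      console.log('Fetched owners');
      // The second request only starts after the first has finished,
      // and its handlers sit one level deeper.
      return fetchItems()
        .then((items) => {
          console.log('Fetched items');
          return items.filter(
            (item) => !owners.some((owner) => owner.id === item.ownerId)
          );
        })
        .catch((err) => {
          console.log('Could not fetch items', err);
          throw err;
        });
    })
    .catch((err) => {
      // Also runs for errors rethrown by the inner catch.
      console.log('Cleaning aborted', err);
      throw err;
    });
}
```

Besides the extra indentation, note that the two requests run one after the other here even though neither depends on the other’s result.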

Enter Promise.all, a function that takes a list of promises and returns a single promise that is fulfilled only once all of them have been fulfilled. By using it I get either all the results or the first error from the given promises, and it only creates one level of scope instead of two. In my opinion this makes the code easy to read and follow. To improve the code further I deleted some redundant logging statements and used the ternary operator to decide what message to log. With these simple refactoring steps I have hopefully spared myself some future headaches when debugging.
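As a hedged illustration of the flattened approach (again a sketch, not my original listing), the two requests can be handed to Promise.all together. `fetchOwners` and `fetchItems` are hypothetical stand-ins for the DynamoDB requests, and the log messages are invented.

```javascript
// Hypothetical stand-ins for the two DynamoDB requests.
function fetchOwners() {
  return Promise.resolve([{ id: 'u1' }]);
}
function fetchItems() {
  return Promise.resolve([
    { id: 'a', ownerId: 'u1' },
    { id: 'b', ownerId: 'u2' },
  ]);
}

function findRedundantFlat() {
  // Both requests start right away; the handler runs once both have fulfilled.
  return Promise.all([fetchOwners(), fetchItems()])
    .then(([owners, items]) => {
      const redundant = items.filter(
        (item) => !owners.some((owner) => owner.id === item.ownerId)
      );
      // A ternary picks the message instead of an if/else with two log calls.
      console.log(
        redundant.length > 0
          ? `Found ${redundant.length} redundant item(s)`
          : 'Nothing to clean'
      );
      return redundant;
    })
    .catch((err) => {
      // Promise.all rejects with the first error from any of the requests.
      console.log('Fetching failed', err);
      throw err;
    });
}
```

The results arrive in the same order as the promises were passed in, which is what makes the array destructuring in the handler safe.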

Doing this little experiment opened my eyes to what Lambda functions may be useful for. Things such as processing data before inserting it into a database, cleaning up redundant resources and other minor utility tasks that may require a lot of processing power can easily be done outside of the main app code. I also think this may help keep code clean and easier to debug, since the code will be separated into smaller modules, each executing in its own environment. In the future I will almost certainly run into situations where a Lambda function might be useful.
