The tricky part is organising the architecture properly: splitting logic into multiple functions and transferring data between them. Splitting makes sense especially if one part of the logic is invoked more often than another, or if one part is more complicated and takes more time or memory to compute. Another reason could be that you want a concurrency limit on one function, for example to avoid sending too many requests to an external API or DB (throttling issues).
Keep in mind that Lambda initialisation can take some time (though not as much as with compiled languages), so it might not be necessary to split logic into separate Lambdas when the computation takes less than 1 s.
Know the difference between sync and async Lambda
You can invoke Lambda synchronously or asynchronously. If your logic is simple and you expect a direct response from your function, invoke Lambda synchronously (InvocationType = RequestResponse). For example, Lambda calls via API Gateway or Elastic Load Balancing (Application Load Balancer) are synchronous, because the caller expects a response.
For more complicated logic it's recommended to split it between functions and invoke them asynchronously (InvocationType = Event). If you would like to call one Lambda from another synchronously, keep in mind that you may pay twice: the calling Lambda is billed for the time it spends waiting, and the called Lambda is billed separately for its own run time.
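To make the two invocation types concrete, here is a minimal sketch that only builds the parameter object for the Lambda Invoke API. The helper `buildInvokeParams`, its `waitForResult` flag and the function names are assumptions made up for illustration; only `FunctionName`, `InvocationType` and `Payload` come from the API itself.

```javascript
// Build the parameter object for the Lambda Invoke API.
// `waitForResult` is a hypothetical flag introduced for this sketch.
function buildInvokeParams(functionName, payload, waitForResult) {
  return {
    FunctionName: functionName,
    // RequestResponse = synchronous, Event = asynchronous (fire and forget)
    InvocationType: waitForResult ? 'RequestResponse' : 'Event',
    Payload: Buffer.from(JSON.stringify(payload)),
  };
}

// Hypothetical function names, for illustration only.
const syncParams = buildInvokeParams('price-calculator', { orderId: 42 }, true);
const asyncParams = buildInvokeParams('email-sender', { orderId: 42 }, false);

console.log(syncParams.InvocationType);  // RequestResponse
console.log(asyncParams.InvocationType); // Event
```

In a real function these objects would be handed to the SDK, for example to `InvokeCommand` from `@aws-sdk/client-lambda`. With `Event` the call returns as soon as Lambda has queued the event, so the caller is not billed for waiting.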
An example of async / event-driven architecture: one Lambda sends a message to a topic (SNS) or an event bus (EventBridge), or updates a record in a database, and another Lambda is subscribed to that specific event or stream. Communication in that case is event driven: no Lambda sits waiting for a response, as it would in the sync model.
You can orchestrate logic using AWS Step Functions, which makes an event-driven application easier to maintain. If you would like to orchestrate an async workflow but need a sync response once the workflow has finished (e.g. for API Gateway), consider Synchronous Express Workflows. For some applications AWS Step Functions might be too costly because of the double-billing issue, so use it carefully. Sometimes it might be better to stick with EventBridge or Lambda destinations instead.
Async Lambda has its own internal event queue (don't confuse it with SQS), so you can configure retries for events that fail to be processed (max. 2 retries), and the queue can keep events for max. 6 hours, e.g. when the function doesn't have enough capacity to handle all incoming requests (throttling errors). Keep in mind that the event queue is eventually consistent (an event can sometimes be delivered more than once). If you expect throttling issues, consider putting an SQS queue in front of the function (instead of relying on the internal event queue) to make sure that all events are delivered.
Do not forget about handling errors: you can set an SQS queue or SNS topic as a dead-letter queue or as an on-failure destination. In the first case you get access to all discarded events; in the second case you get the events together with the error responses.
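The retry, retention and on-failure settings described above correspond to parameters of Lambda's PutFunctionEventInvokeConfig API. The sketch below only assembles and validates such a parameter object. The helper name `buildAsyncInvokeConfig`, the function name and the ARN are hypothetical; the 60-second lower bound on event age is the API minimum.

```javascript
// Limits described in the text above: at most 2 retries, events kept for at most 6 hours.
const MAX_RETRIES = 2;
const MAX_EVENT_AGE_SECONDS = 6 * 60 * 60; // 21600
const MIN_EVENT_AGE_SECONDS = 60;          // lower bound accepted by the API

// Hypothetical helper: returns parameters for PutFunctionEventInvokeConfig.
function buildAsyncInvokeConfig(
  functionName,
  onFailureArn,
  retries = MAX_RETRIES,
  maxAgeSeconds = MAX_EVENT_AGE_SECONDS
) {
  if (!Number.isInteger(retries) || retries < 0 || retries > MAX_RETRIES) {
    throw new RangeError(`retries must be an integer between 0 and ${MAX_RETRIES}`);
  }
  if (maxAgeSeconds < MIN_EVENT_AGE_SECONDS || maxAgeSeconds > MAX_EVENT_AGE_SECONDS) {
    throw new RangeError('maxAgeSeconds is outside the allowed range');
  }
  return {
    FunctionName: functionName,
    MaximumRetryAttempts: retries,
    MaximumEventAgeInSeconds: maxAgeSeconds,
    // An SQS queue or SNS topic that receives the failed event plus error details.
    DestinationConfig: { OnFailure: { Destination: onFailureArn } },
  };
}

// Hypothetical function name and queue ARN, for illustration only.
const invokeConfig = buildAsyncInvokeConfig(
  'email-sender',
  'arn:aws:sqs:eu-west-1:123456789012:failed-events'
);
console.log(invokeConfig.MaximumRetryAttempts, invokeConfig.MaximumEventAgeInSeconds);
```

In practice the same values are usually set through infrastructure-as-code rather than an SDK call, but the parameter names stay the same.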
Think about multiple ways of triggering Lambda
You can trigger Lambdas with events emitted from different sources. The list of possible event sources is long.
The most popular ways to trigger Lambda are:
- SNS: pub/sub (push) functionality; one or many Lambdas send messages to a topic, to which other Lambdas/services can be subscribed;
- SQS: queues (pull) are an ideal solution when you expect throttling problems, for example if your traffic is very dynamic and you don't want to lose any messages, or if you want to optimize Lambda autoscaling (60 additional instances per minute to a maximum of 1,000 concurrent invocations);
- EventBridge (formerly CloudWatch Events): used for more complex event management, where you can filter by event patterns, subscribe to scheduled jobs (cron), 3rd party emitters and more, such as communication between accounts;
- Kinesis: dedicated to streaming or data-driven applications;
- S3: you can trigger Lambda based on changes in an S3 bucket, e.g. one service uploads a file and another service performs some operation on it;
- DynamoDB: Lambda can read records from a DB stream, so you can react each time the data changes.
By default SNS and SQS don't guarantee that messages will be delivered in the same order as they were published. Occasionally a message may be delivered more than once, so you might need a deduplication mechanism. If you would like to prevent duplicate messages from being delivered and need the order to be guaranteed, think about SNS FIFO or SQS FIFO.
Get some inspiration from others
There are many different ways to deal with events in AWS, so I encourage you to read about how others build their async architecture. Great sources of patterns and solutions are AWS Solutions Reference Architectures and cdkpatterns.com.
Understand Lambda execution to improve speed
If you are a Node.js developer, cold starts (increased invocation latency) might not be your main issue. However, if your Lambda does a lot of work soon after invocation (e.g. connecting to a DB or retrieving data from SSM), you might want to speed that up, especially if it could improve UX.
A good idea is to cache some data in memory, so that it can be reused by later invocations within the same execution environment (runtime). You can play with Middy middleware (cache, ssm packages) or memoize some pure functions, e.g. with Memoizee.
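As a hand-rolled illustration of the in-memory caching idea (not how Middy or Memoizee implement it), the sketch below keeps values in a module-scope `Map` with a time-to-live. The `cached` helper, `loadApiKey` and the 60-second TTL are assumptions for this example.

```javascript
// Module scope survives between invocations handled by the same execution environment.
const cache = new Map();

// Return the cached value for `key`, or call `loader` and remember its result for `ttlMs`.
async function cached(key, ttlMs, loader, now = Date.now()) {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > now) return hit.value;
  const value = await loader();
  cache.set(key, { value, expiresAt: now + ttlMs });
  return value;
}

// Hypothetical slow lookup standing in for an SSM or database call.
let loads = 0;
async function loadApiKey() {
  loads += 1;
  return 'secret-' + loads;
}

// In a real function this would be exported, e.g. `exports.handler = handler`.
const handler = async () => {
  const apiKey = await cached('apiKey', 60000, loadApiKey);
  return { apiKey, loads };
};
```

Two invocations in the same warm environment would call `loadApiKey` only once; a new execution environment starts with an empty cache.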
Think about provisioned concurrency (the requested number of execution environments is always kept ready to respond to your function's invocations), but be careful: it might be too costly in some cases. If your traffic is very dynamic, you can manage your provisioned concurrency with Application Auto Scaling.
Learn about the Lambda execution environment lifecycle. It might be a good idea to run some logic in the Init phase, outside the handler, so that it runs right after the runtime starts (e.g. when provisioned concurrency prepares the environment), before the first invocation. You can put synchronous code that takes up to 10 seconds to compute into the Init phase. Async code in Node.js (such as loading SSM parameters with the SDK) might not finish in the Init phase (it will be frozen and will resume in the first invocation). If you would like to finish loading data (e.g. from SSM or a DB) in the Init phase, try Python.
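One way to cope with async work that may not finish during Init is to start it outside the handler and await the same promise inside the handler. This is a minimal sketch; `loadConfig` and the `tableName` value are hypothetical stand-ins for an SSM or DB lookup.

```javascript
// Hypothetical loader standing in for an SSM or database call.
let initLoads = 0;
async function loadConfig() {
  initLoads += 1;
  return { tableName: 'orders' };
}

// Init phase: runs once per execution environment, outside the handler.
// The promise is only started here; it may still be pending when Init ends.
const configPromise = loadConfig();

// In a real function this would be exported, e.g. `exports.handler = handler`.
const handler = async (event) => {
  // Awaiting here guarantees the config is ready, whether the promise settled
  // during Init or only resumes in the first invocation.
  const config = await configPromise;
  return { tableName: config.tableName, initLoads };
};
```

The load is triggered once per execution environment, and every invocation reuses the same promise. A rejected promise would stay rejected for the lifetime of the environment, so real code would probably want some retry logic around it.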
The problem of async code in Node.js is not limited to the Init phase. If you run async code in Lambda, make sure that it has finished before you return the response / resolve the handler. The best way is to always keep async logic in promises with async/await instead of callbacks. The callback pattern (such as using setTimeout) can lead to execution leaks (a problem where code ends up running in a different invocation than the one it originated in).
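The difference between a leaking callback and awaited async work can be sketched as follows. The handlers, the `log` array and `writeAudit` are made up for this illustration; the 10 ms delay stands in for any I/O.

```javascript
// Promise-based delay, so the handler can await it.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const log = [];

// Hypothetical side effect, e.g. writing an audit record.
async function writeAudit(requestId) {
  await sleep(10);
  log.push(`audit:${requestId}`);
}

// Risky: the handler resolves before the timer callback has run, so the
// callback may fire during a later invocation (or never).
const leakyHandler = async (event) => {
  setTimeout(() => log.push(`audit:${event.requestId}`), 10);
  return { ok: true };
};

// Safer: the handler resolves only after the async work has finished.
const safeHandler = async (event) => {
  await writeAudit(event.requestId);
  return { ok: true };
};
```

When `safeHandler` resolves, the audit entry is already in `log`. When `leakyHandler` resolves, the entry is still missing, and in Lambda the environment may be frozen before the timer fires.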
Use Lambda layers
You might be in a situation where one service shares a lot of code with others. In Node.js you can have some common node_modules, and it might not be a good idea to include all of them in each bundle or to deploy them to each Lambda separately. Think about deploying the common code as an AWS Lambda layer. It could also be useful if some part of your Lambda code is heavy and you would like to avoid redeploying it every time you change something.