Luke Mattfeld, Software Engineer II
AWS Lambdas are a fantastic technology that allows developers to run a bit of code in one of many languages, and quickly get it deployed and integrated into either new or existing cloud infrastructure. Being serverless, they save money when workloads are sporadic or bursty.
Even though Lambdas are not designed for computationally intensive tasks, performance matters, both for application responsiveness and because Lambda invocations are billed by time. The less performant the code, the longer it runs and the more it costs. These costs scale with the number of invocations: if your customer base doubles, so do the potential savings that better performance could bring.
But the cost of the service itself is only one of many things to consider. A potentially more significant cost is the time developers spend building and maintaining the service. The time to find a tricky bug or train a new team member on a unique technology can add up quickly. If iteration is slow or the development experience is poor, developers will be discouraged from writing performant code.
This blog post will explore Lambda runtimes in terms of this balance between compute performance and development efficiency. We’ll first look at the available runtimes, their strengths, and their performance numbers under various workloads. Then, we’ll examine the development experience of each runtime by looking at its tools, community, and developer feedback. Finally, we will weigh these results and provide recommendations for Lambda runtime usage.
Runtimes
AWS provides several natively supported runtimes, plus the ability to build a custom Lambda runtime for any programming language. Here, we will test most of the supported runtimes and one custom runtime (for Rust).
The following is a quick summary of the runtimes that we’ll cover:
Runtime | Features |
---|---|
Java | |
Python | |
NodeJS | |
.NET (C#) | |
Go | |
Rust | |
THE NUMBERS
Tests
To get some hard numbers on the performance of each runtime, we will build on the work of Aleksandr-Filichkin, who set up a CloudFormation-based project to test this very thing. Aleksandr’s original tests used JMeter; for flexibility, we will run the same tests with a simple Go script. We will test two different workloads: one that resembles a typical Lambda workload and one that is more compute-heavy. During these tests, we will capture two metrics: Lambda cold-start times for various memory allocations and average request duration.
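To make the two metrics concrete, here is a minimal sketch of such a measurement loop. It is not the Go script used for this post: `measure` and `make_fake_lambda` are hypothetical names, the stand-in function replaces a real HTTP call to a deployed Lambda, and treating the first request as the cold start is a simplifying assumption.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(invoke: Callable[[], object], requests: int = 50) -> Dict[str, float]:
    """Time sequential calls and report the first call separately from the rest."""
    if requests < 2:
        raise ValueError("need at least two requests to separate cold and warm calls")
    durations_ms: List[float] = []
    for _ in range(requests):
        start = time.perf_counter()
        invoke()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        # Assumption: the first request hits a cold environment.
        "cold_start_ms": durations_ms[0],
        "warm_avg_ms": statistics.mean(durations_ms[1:]),
    }


def make_fake_lambda(cold_s: float = 0.05, warm_s: float = 0.001) -> Callable[[], str]:
    """Hypothetical stand-in for a deployed Lambda: slow once, then fast."""
    state = {"warm": False}

    def invoke() -> str:
        time.sleep(warm_s if state["warm"] else cold_s)
        state["warm"] = True
        return "ok"

    return invoke


if __name__ == "__main__":
    print(measure(make_fake_lambda()))
```

In a real run, `invoke` would send the test payload to the Lambda’s endpoint, and the whole loop would be repeated per runtime and per memory setting.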
Test Case #1
I ran the same DynamoDB test as the original work using this script. This first test sends an object to the Lambda repeatedly. The Lambda must decode the JSON object and save it to DynamoDB. It then JSON-encodes the object and sends it back. This is meant to simulate a more typical workload for Lambda, albeit a computationally lightweight one. The results are as follows:
Overall, the trend is that runtimes that require more memory take longer to start up. Java is the slowest across the board, which is not surprising as the JVM must be started before any code can run. When deployed with 128 MB of RAM, the Java Lambda could not start up at all. Rust and Go, both compiled languages, are consistently quite fast. Python, an interpreted language, holds its own and places second or third across the board.
We can see here that after the first invocation, the time goes down. The lambda doesn’t have to spend any time setting up the environment. In three separate runs, we see that Java, C#, and NodeJS have the biggest changes. Java, especially, can speed up execution to compete with Rust and Go, which are consistently the fastest. Another thing to note is that once the lambdas are warmed up, NodeJS has, on average, the worst response times.
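For reference, the work each Lambda performs in test case #1 can be sketched in a few lines. This is an illustrative Python version, not the code from the benchmark project: the table name `books`, the `save_handler` name, and the API Gateway style `event["body"]` shape are assumptions, and the payload is assumed to contain only strings and integers (boto3 rejects Python floats).

```python
import functools
import json
from typing import Any, Dict, Optional

TABLE_NAME = "books"  # hypothetical table name


@functools.lru_cache(maxsize=None)
def get_table() -> Any:
    """Create the DynamoDB table handle once and reuse it across warm invocations."""
    import boto3  # imported lazily so the sketch can be exercised without AWS access

    return boto3.resource("dynamodb").Table(TABLE_NAME)


def save_handler(event: Dict[str, Any], context: Any, table: Optional[Any] = None) -> Dict[str, Any]:
    """Decode the JSON body, store it in DynamoDB, and echo it back as JSON."""
    item = json.loads(event["body"])
    if table is None:
        table = get_table()
    table.put_item(Item=item)
    return {"statusCode": 200, "body": json.dumps(item)}
```

Caching the table handle mirrors why warm invocations are faster: setup work done during the first call does not have to be repeated. The optional `table` argument exists only so the handler can be tried locally with a stub.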
Test Case #2
In the second test case, we introduce a more computationally intensive task: sorting. The process begins with sending a large JSON array to the Lambda function. Upon receiving the array, the Lambda decodes it and sorts its elements. Once sorted, the Lambda sends the resulting array back, and the correctness of the sorting is verified. We only ran this test on the 256 MB Lambda, because the test has low memory requirements and our primary focus was compute time rather than runtime startup time. In addition, early testing with larger Lambda instances did not lead to significantly different results. The results for test case #2 are as follows:
Here, we see similar relative cold-start results to test case #1, with Python as the outlier. Within the standard group, we again see that runtimes with larger startup costs have a longer initial response time. Java is the second slowest, at around 2.5 seconds, and Go and Rust take the lead with sub-second times. Python stands out with a spectacular 9+ second response time. However, we know from the previous test that this is due to neither the setup time nor the time to decode the JSON payload. This considerable delay comes from Python’s poor performance on this compute-heavy task. We will see this continue in the following result.
Overall, we see results that align with the average duration for test case #1. Java quickly improves and jumps to second place, and Rust and Go stay consistently fast. Also, as seen in the cold-start test, Python is nearly an order of magnitude slower than the rest.
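The sort workload and the client-side correctness check from test case #2 can be sketched as follows. The names `sort_handler`, `is_sorted`, and `verify` are hypothetical, the `event["body"]` shape is an assumption, and the benchmark project’s actual implementation may differ.

```python
import json
from collections import Counter
from typing import Any, Dict, List


def sort_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Decode a JSON array, sort it in place, and return it as JSON."""
    values = json.loads(event["body"])
    values.sort()
    return {"statusCode": 200, "body": json.dumps(values)}


def is_sorted(values: List[float]) -> bool:
    """True when every element is less than or equal to its successor."""
    return all(a <= b for a, b in zip(values, values[1:]))


def verify(original: List[float], response_body: str) -> bool:
    """Check that the response is ordered and holds exactly the original elements."""
    returned = json.loads(response_body)
    return is_sorted(returned) and Counter(returned) == Counter(original)
```

Comparing element counts as well as order guards against a response that is sorted but has dropped or duplicated values.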
Test Summary
What can we learn from these tests? A few patterns emerge. Java struggles with low memory and is slow to start up, but it can be one of the fastest once it gets going. NodeJS and C# have consistent, though underwhelming, performance. Go and Rust are consistently quick, with Rust almost always taking first place.
Cost
We can calculate an estimated cost per runtime for different workloads based on these two tests. While the dollar amounts calculated may not be directly applicable, the relative difference in price between the runtimes is insightful. Amazon states that Lambda cold starts happen on only 1% of invocations. So, for these calculations, we weight the cold-start time at 1% and the average invocation time at 99%. For simplicity, we will only calculate the cost for a 256 MB Lambda per 1 million requests. The results are as follows:
Red - Highest Cost
Green - Lowest Cost
Blue - Second-lowest cost
We see that for the lighter DynamoDB workload, the cost favors runtimes that can start up quickly. The time to parse and re-serialize the JSON is not significantly different between the runtimes for such a small object. So, the startup cost contributes more to the overall invocation time.
We see a few changes of place for the more computationally heavy workload. The startup cost is negligible since the time to sort now becomes the most significant portion of the invocation time. So, the cost of a given runtime will depend greatly on the usage pattern for that lambda.
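The weighting used in this section can be written out as a short calculation. The sketch below is illustrative only: the per-GB-second price is an assumed x86 on-demand rate that should be checked against current AWS pricing, the function names are hypothetical, and the durations in the usage example are made-up placeholders rather than measurements from this post.

```python
# Assumption: x86 on-demand compute price in USD per GB-second; verify before relying on it.
PRICE_PER_GB_SECOND = 0.0000166667


def weighted_duration_ms(cold_ms: float, warm_ms: float, cold_fraction: float = 0.01) -> float:
    """Blend cold-start and warm durations using the assumed cold-start share."""
    return cold_fraction * cold_ms + (1.0 - cold_fraction) * warm_ms


def compute_cost_per_million(
    cold_ms: float,
    warm_ms: float,
    memory_mb: int = 256,
    cold_fraction: float = 0.01,
) -> float:
    """Estimated compute cost in USD for one million invocations."""
    duration_s = weighted_duration_ms(cold_ms, warm_ms, cold_fraction) / 1000.0
    gb_seconds = duration_s * (memory_mb / 1024.0) * 1_000_000
    return gb_seconds * PRICE_PER_GB_SECOND


if __name__ == "__main__":
    # Placeholder durations (cold_ms, warm_ms) for two hypothetical runtimes.
    for name, (cold, warm) in {"runtime_a": (1000.0, 100.0), "runtime_b": (300.0, 20.0)}.items():
        print(name, round(compute_cost_per_million(cold, warm), 4))
```

The flat per-request fee is left out because it is the same for every runtime and so does not change the relative ranking.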
Energy, Time, and Memory
Before we delve into the developer experience aspect, let’s look at one more measure of performance: efficiency. While researching this blog post, I stumbled upon a fascinating paper titled “Energy Efficiency Across Programming Languages,” authored by a research group based in Portugal. This study provides valuable insights into the efficiency of languages, measuring them against several critical factors such as energy consumption, execution time, and memory usage.
Modified table from Energy Efficiency Across Programming Languages
With runtimes examined in this blog highlighted
This table presents results that correlate with the results of our performance tests, supporting the researchers’ observations. As highlighted in the study’s conclusion, “Although the most energy efficient language in each benchmark is almost always the fastest one, the fact is that there is no language which is consistently better than the others… The situation on which a language is going to be used is a core aspect to determine if that language is the most energy-efficient option.” So, while the fastest languages are usually the most efficient, it will again depend on the use case and constraints of the environment it’s run in.
THE EXPERIENCE
We will now delve into the user experience aspect of these languages. Recognizing that this area may be subject to personal bias and heated debates, we aim to maintain an objective perspective by basing our exploration on tangible data. We’ll first analyze the size, growth, and usage of these languages and their affiliated libraries. Then, we will scrutinize the findings from usage and desirability surveys gathered from a large number of developers. From this, we’ll glimpse what developers are using, which runtimes have the greatest support, and what developers want to use in the future.
Language Usage
For information about language usage, we will examine results from GitHub’s State of Open Source from 2022 and Stack Overflow’s Developer Survey 2023. The data gathered from these sources has been modified to either show or highlight the relevant information. Please see the sources for the complete dataset.
The following graph is from the Popular Technology section of the Stack Overflow survey. It shows the percentage of developers who have used each language substantially over the last year.
The following graph is from “The top programming languages” section of GitHub’s State of Open Source. It shows a time-based ranking of language contributions on GitHub from 2014-2022.
Unsurprisingly, the top 4 languages match closely in both results (after removing Bash and SQL from the Stack Overflow results). JavaScript, Java, and Python have been widely used for many years. TypeScript, a superset of JavaScript, has grown quickly in the past few years to compete - partly due to the widespread use of JavaScript itself and partly because it brings much-needed features for large and cross-stack applications. With the rise of JavaScript on the backend (in Lambdas, for example), TypeScript has made itself ever more useful.
Here, GitHub lists the top ten growing languages. This gives us insight into what languages are gaining in popularity and support. We see TypeScript’s popularity again and the recent growth of Rust and Go.
This graph helps measure the desire of developers to work with languages while trying to calculate how much of the desire is just hype. The blue dot indicates what percentage of all developers are interested in using the given language in the future. The red dot represents the percentage of developers who currently use the given language and want to continue using it. So, the farther apart the dots, the more likely the desire to use the language also translates to enjoyment in using the language.
We see that most of the languages in our study are well-loved. JavaScript is highly used, but only a little over half of developers currently using it want to continue. Java, the big outlier, has less than half of its developers wanting to continue.
Libraries
To get some data on the number of libraries for each language, we will use modulecounts.com. According to their website, “Data is collected by scraping the relevant websites once a day via a cron job and then stored in a Postgresql database for later retrieval. Growth rates are calculated by averaging data over the last week.” The results for the past year are as follows:
Please note: the data presented here was collected in early July 2023. An apparent anomaly can be observed in the library count for npm, indicated by the substantial surge followed by a sharp decrease. This fluctuation is likely due to a reporting error because the value resumes its approximate placement in line with the prior growth trajectory. As such, the average growth figure may not be accurate. Data captured a few months earlier suggested an average growth rate of 1757/day.
The graph represents the number of library packages per programming language and reveals a significant gap. Node.js is the fastest-growing and has the most packages - a testament to its large community.
While having numerous library packages has its advantages, more isn’t always better. As Node.js’s infamous ‘left-pad’ library debacle shows, some of Node’s most-used packages maybe shouldn’t exist. Quality and relevance must factor into the decision-making process, ensuring developers avoid unnecessary bloat in their projects.
It’s also worth noting Go’s absence from this graph. Go differs from the rest by relying on public GitHub repositories instead of a traditional package registry. This makes it hard to measure the number of available packages, so it is not included.
Summary
From all of these developer experience metrics, we came up with the following tabular summary:
Language | Support | Usage | Popularity |
---|---|---|---|
NodeJS | Great support | Widely used | TypeScript is well loved and growing |
Java | Great support | Widely used | Less liked by its developers, losing popularity |
Python | Great support | Widely used | More people want to use it |
C# | Good support | More Domain-specific | Liked more than Java |
Go | Fair support | Moderately used | Pretty well loved and growing |
Rust | Fair support | Not widely used | Very loved and growing fast |
Conclusion
With all of this information, we can make a few educated recommendations that should be helpful when designing new AWS infrastructure.
When it comes to selecting an optimal language for AWS Lambda, NodeJS and Go are the most efficient choices most of the time. NodeJS, due to its wide usage and excellent library support, is particularly helpful when there’s a need to share types and stay consistent with a front-end platform. Although Go is not as widely used, its speed, ease of learning, and native Lambda support make it a robust choice for ecosystem-agnostic endpoints that need consistent performance, data processing tasks (such as queue-style workloads), and scripted tasks.
On the other hand, Java, though widely used and relatively performant, tends to be too bulky for the context of Lambdas. Its considerable cold-start times make it a poor fit for intermittent workloads. Rust, despite its speed, consistency, and security benefits, can be tricky to adopt because of its unique design, which makes it less appealing for Lambdas. Lastly, while Python is popular for its ease of use and solid library support, it falls short in performance, particularly for computation-heavy tasks. In most cases, what Python can accomplish can be done just as effectively using Go.
Overall, every language has unique strengths and weaknesses, and it’s essential to make the selection based on the larger context of the project and its specific requirements.
Thanks for the insightful article! Have you tried using Lambda SnapStart for Java to reduce cold starts and improve startup performance? It is said to improve performance by up to 10x at no extra cost.