Recently, I created a toy benchmark in C# / .NET Core (on Linux) that spawns one million async tasks, to test memory overhead and scalability. Shortly thereafter, a kind stranger sent a PR with a Go version. I thought his results were very interesting and decided to run them myself.
Each of the one million tasks runs an infinite loop that does nothing but increment a shared counter variable (an atomic Int64) and sleep for one second. The test ends after ~60 million hits are observed, which should therefore take ~60 seconds. (It’s approximate because a master thread polls the score only every tenth of a second, and even that timing drifts if your CPUs are saturated. This test should not be run with fully saturated CPUs.)
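To make the shape of the workload concrete, here is a minimal Go sketch of the same idea (my own names, not the benchmark repo’s code; the constants are scaled down in `main` so it finishes in a few seconds instead of sixty):

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// runBenchmark spawns n tasks that each increment a shared atomic
// counter once per second, forever. The master loop polls the score
// every tenth of a second until target hits are observed, then
// returns the final count.
func runBenchmark(n int, target int64) int64 {
	var counter int64
	for i := 0; i < n; i++ {
		go func() {
			for { // infinite loop, as in the benchmark
				atomic.AddInt64(&counter, 1)
				time.Sleep(time.Second)
			}
		}()
	}
	for atomic.LoadInt64(&counter) < target {
		time.Sleep(100 * time.Millisecond)
	}
	return atomic.LoadInt64(&counter)
}

func main() {
	// Scaled down for illustration: 1,000 tasks and a 3,000-hit
	// target (~3 s) instead of 1,000,000 tasks and 60M hits (~60 s).
	fmt.Println("hits:", runBenchmark(1000, 3000))
}
```

The goroutines are deliberately leaked when `main` returns, just as the real test simply exits once the target is reached.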
The test uses /usr/bin/time to record total CPU time, wall time, and maximum memory usage. In addition, the test programs periodically print /proc/self/status so we can compute average memory usage and see what the garbage collector is doing. I ran the tests on a dual-core Pentium (no hyper-threading), and did separate runs using 1, 2, and 4 worker threads.
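For reference, pulling the resident set size out of /proc/self/status is just a line scan for the `VmRSS:` field (Linux only; this helper is my own sketch, not the benchmark’s code):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// vmRSS returns the value of the "VmRSS:" line from /proc/self/status,
// e.g. "1234 kB", or "" if the file or field is unavailable (non-Linux).
func vmRSS() string {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return ""
	}
	defer f.Close()
	s := bufio.NewScanner(f)
	for s.Scan() {
		if strings.HasPrefix(s.Text(), "VmRSS:") {
			return strings.TrimSpace(strings.TrimPrefix(s.Text(), "VmRSS:"))
		}
	}
	return ""
}

func main() {
	fmt.Println("resident set size:", vmRSS())
}
```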
| Name  | UserCPU (s) | SysCPU (s) | AvgRSS (KB) | MaxRSS (KB) | Wall (m:ss) |
|-------|------------:|-----------:|------------:|------------:|------------:|
| c#_1t | 92.79       | 5.40       | 710,581     | 992,608     | 1:30        |
| c#_2t | 163.08      | 5.91       | 807,874     | 1,057,916   | 2:08        |
| c#_4t | 171.37      | 6.05       | 925,995     | 1,381,076   | 2:11        |
| go_1t | 53.34       | 0.88       | 2,639,375   | 2,639,740   | 1:04        |
| go_2t | 96.52       | 3.04       | 2,605,048   | 2,612,364   | 1:03        |
| go_4t | 88.39       | 3.93       | 2,607,513   | 2,613,348   | 1:02        |
- This is a micro-benchmark. Your application is not a micro-benchmark.
- Benchmarks can be gamed. Benchmarks are hard to do correctly.
- Be mindful of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.”
- This code is not designed for high-core-count machines, because it uses a single (not sharded) atomic counter.
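For the curious, a sharded counter (the alternative the last bullet alludes to) spreads increments across independent cache lines so many cores don’t contend on one atomic word. A hedged sketch, assuming a 64-byte cache line:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// shard holds one counter padded out to 64 bytes (a common cache-line
// size) so adjacent shards never share a line (no false sharing).
type shard struct {
	n int64
	_ [56]byte // 64 - 8 bytes of padding
}

type shardedCounter struct {
	shards []shard
}

func newShardedCounter(n int) *shardedCounter {
	return &shardedCounter{shards: make([]shard, n)}
}

// Add increments the shard picked by the caller's worker ID, so
// different workers usually touch different cache lines.
func (c *shardedCounter) Add(worker int, delta int64) {
	atomic.AddInt64(&c.shards[worker%len(c.shards)].n, delta)
}

// Sum folds all shards into one total. It is only approximate while
// writers are still running, which is fine for a polled score.
func (c *shardedCounter) Sum() int64 {
	var total int64
	for i := range c.shards {
		total += atomic.LoadInt64(&c.shards[i].n)
	}
	return total
}

func main() {
	c := newShardedCounter(8)
	var wg sync.WaitGroup
	for w := 0; w < 8; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				c.Add(w, 1)
			}
		}(w)
	}
	wg.Wait()
	fmt.Println(c.Sum()) // 8000
}
```

The trade-off is that reading the score now costs one load per shard, which is why a polled, approximate `Sum` suits this kind of test.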
As you can see, Go uses less CPU but more memory. Still, I must admit I’m intrigued by how similar they are on this micro-benchmark.
Let’s keep this in perspective: the workload only incremented an atomic counter to 60 million. A single thread can do that in a tight loop about 150x faster. Concurrency is not throughput. An application developer should be thinking creatively about how to solve problems 10x-100x faster by decomposing the problem and analyzing the problem-domain data (and probably using batching).
And to be clear, there are implementation differences. C# requires you to know the “color” of every function (async vs. sync). In Go there are only “async” functions, so there is no confusion. On the other hand, C# can directly interoperate with C/C++ and with futures/callbacks from other languages.
I started programming in Logo on an Apple II and QuickBasic on an IBM XT. For the last 5 years I’ve been disillusioned by Python (for anything larger than small scripts) and C++ (for secure network services), but haven’t found an adequate replacement. The last time I used anything from Microsoft was in 2001.
However, I’ve very much enjoyed my recent experiences with C# on Linux. I ported a Python project to it and it felt very natural. I am optimistic that programming can feel enjoyable again, something I haven’t felt for a while. I welcome more competition among open-source, cross-platform, safe, fast, and productive programming languages.
I am in between jobs right now and enjoying the respite; that’s why I have time to play around with fun stuff like this :)
I recently ported my wife’s handmade cards site from Python to C# (1000 LOC) and am very happy with the result. Razor templates are nice.
- Source Code on github. I used dotnet 2.2.103 and go 1.11.5, on Ubuntu 18.04.
- I hope I didn’t mess up as badly as this hilarious benchmark failure
- I tried Tiered Compilation but experienced a slight regression. In any case, this is not a good benchmark for measuring small changes like that. This is a “ballpark” test with some margin of error.
- You also might be interested in my “500K socket connections in C#” test on github.
- Another person’s higher-level test from 2017 with similar results
- A good post: What Color is your Function