Hi, I am Okada (@ocadaruma), a member of the LINE Ads Platform team. I've been interested in Go's GC (garbage collection, or the garbage collector) for a while, enough that I decided to write a post about it. Go is a programming language developed by Google that supports garbage collection as well as concurrency through channels. Many companies, including Google, use Go, and LINE also uses Go to develop tools and services.
Using Go, you can easily develop low-latency applications. Go's GC is much simpler than those of other language runtimes. As of Go 1.10, the garbage collector for Go is a Concurrent Mark & Sweep (CMS) collector that does not compact and is not generational. This is nothing like the JVM.
It is a concurrent mark and sweep that uses a write barrier. It is non-generational and non-compacting. – mgc.go
Here is a comparison of Java GC and Go GC. Compared to Java, Go GC looked a bit too simple to me, so I decided to dig into it and see how it works.
| | Java (Java8 HotSpot VM) | Go (1.10) |
| --- | --- | --- |
| Collector | Several collectors (Serial, Parallel, CMS, G1) | CMS |
| Compaction | Compacts | Does not compact |
| Generational GC | Generational GC | Non-generational GC |
| Tuning parameters | Depends on the collector; multiple parameters available | GOGC |
Garbage collection can either be non-moving or moving.
Non-moving garbage collectors do not relocate objects within the heap. CMS, the collector Go uses, is non-moving. Generally, if you repeatedly allocate and deallocate memory under a non-moving collector, you end up with heap fragmentation, and thus degraded allocation performance; of course, this depends on how the memory allocator is implemented.
Moving garbage collectors compact the heap by relocating live objects to one end of it. An example of a moving garbage collector is the copying GC used in the HotSpot VM.
Compaction has the following merits:
- Avoiding fragmentation
- Being able to implement a high-performance memory allocator through bump allocation: since live objects are packed at one end of the heap, the free space is a single contiguous region, and allocating new memory is just a matter of incrementing a pointer.
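Bump allocation can be sketched in a few lines of Go. This is a hypothetical toy model, not how any real allocator is implemented, but it shows why allocation is so cheap when the free space is one contiguous region:

```go
package main

import "fmt"

// bumpAllocator is a toy model: the heap is a byte slice and next marks
// the first free byte. Compaction is what keeps the free space contiguous,
// which is what makes this scheme possible.
type bumpAllocator struct {
	heap []byte
	next int // offset of the first free byte
}

// alloc hands out the next size bytes by simply advancing the pointer.
func (a *bumpAllocator) alloc(size int) ([]byte, bool) {
	if a.next+size > len(a.heap) {
		return nil, false // out of space; a real collector would GC/compact here
	}
	block := a.heap[a.next : a.next+size]
	a.next += size // the "bump": one addition per allocation
	return block, true
}

func main() {
	a := &bumpAllocator{heap: make([]byte, 1024)}
	b1, _ := a.alloc(16)
	b2, _ := a.alloc(32)
	fmt.Println(len(b1), len(b2), a.next) // 16 32 48
}
```

Every allocation is a bounds check plus one addition, which is why moving collectors can afford very fast allocators.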
Why Go did not opt for compaction
Rick Hudson of Google shared the following in his keynote, Getting To Go, at the International Symposium on Memory Management (ISMM).
- In 2014, they originally planned to do a read barrier free concurrent copying GC.
- Due to lack of time (they were in the middle of rewriting the runtime from C to Go; see Changes to the runtime), they decided to opt for CMS.
- They adopted a TCMalloc-based memory allocator, which solved fragmentation and kept allocation fast.
For more information on Go memory allocation, see the code comments for the runtime.
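The core TCMalloc idea can be illustrated with a short, hypothetical sketch (the size classes below are made up for illustration; Go's real classes and logic live in the runtime): requests are rounded up to a fixed set of size classes, and each class is served from its own free list, so freed blocks are reused at exactly the right size and external fragmentation is avoided even without compaction.

```go
package main

import "fmt"

// sizeClasses is a hypothetical, simplified set of allocation size classes.
// Go's runtime uses many more classes with carefully chosen boundaries.
var sizeClasses = []int{8, 16, 32, 48, 64, 80, 96, 128}

// sizeClassFor rounds a request up to the smallest class that fits.
// Each class would be backed by its own free list of equally sized blocks.
func sizeClassFor(n int) int {
	for _, c := range sizeClasses {
		if n <= c {
			return c
		}
	}
	return -1 // a "large" allocation, handled outside the size-class lists
}

func main() {
	fmt.Println(sizeClassFor(1), sizeClassFor(33), sizeClassFor(100)) // 8 48 128
}
```

Because every block in a class is the same size, a freed block can always satisfy the next request of that class, which is what keeps fragmentation in check.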
The purpose of generational GC is to optimize collection by dividing heap objects into generations by age (how many GC cycles they have survived). The generational hypothesis states that in many applications, most new objects die young. The following strategy, based on that hypothesis, optimizes GC because it eliminates the need to scan old objects on every cycle.
- Collect garbage more frequently from young space (Minor GC).
- Promote objects that have survived several GC cycles by relocating them to a space where garbage is collected less frequently (Major GC).
All collectors for Java8 HotSpot VM implement generational GC.
A drawback of generational GC is that it imposes overhead on the application even while garbage collection is not running. Let's look at an example with minor GC.
If we scan the roots only for pointers into the young generation and then collect unreachable objects, a young object referenced only by an old one, like
obj2 in our diagram, gets collected unexpectedly. But if we scan the whole heap, including old objects, to keep young ones alive, generational GC loses its point. So an extra step is added: whenever a reference is written or overwritten, the runtime records any reference from an old object to a young one. This additional step is called a write barrier. Using generational GC presumably brings benefits that outweigh this drawback (the write barrier overhead).
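The mechanism above can be sketched as a toy model in Go. This is hypothetical illustration code, not Go's actual runtime: every pointer store goes through `writeRef`, and old-to-young edges are recorded in a remembered set that a minor GC would treat as extra roots, so an object like obj2 is not collected by mistake.

```go
package main

import "fmt"

// object is a toy heap object for this sketch.
type object struct {
	id    int
	old   bool // true once promoted to the old generation
	child *object
}

// rememberedSet records old objects that point into the young generation.
// A minor GC scans these in addition to the roots.
var rememberedSet = map[*object]bool{}

// writeRef is the write barrier: every reference store goes through it,
// which is exactly the always-on overhead the text describes.
func writeRef(src, dst *object) {
	src.child = dst
	if src.old && dst != nil && !dst.old {
		rememberedSet[src] = true // old -> young edge: remember src
	}
}

func main() {
	obj1 := &object{id: 1, old: true} // like the old object in the diagram
	obj2 := &object{id: 2}            // young, like obj2 in the diagram
	writeRef(obj1, obj2)
	fmt.Println(len(rememberedSet)) // 1: obj1 must be scanned in the next minor GC
}
```

The cost is visible even here: every store pays an extra branch, whether or not a collection ever runs.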
Why not generational GC?
As we've seen earlier, generational garbage collection requires a write barrier for recording pointers between generations. Going back to Getting To Go, Rick Hudson's keynote, we can see that they did consider generational GC, but gave up on it because of the write barrier overhead.
The write barrier was fast but it simply wasn't fast enough
With Go, the compiler performs strong escape analysis, and programmers can, when needed, arrange for objects not to be allocated on the heap. Short-lived objects are therefore often allocated on the stack instead of the heap, which means they need no GC at all. Overall, Go stands to gain less from generational GC than other runtimes do. There are Go libraries famous for their speed that are equally famous for zero allocation. Still, the costly task remains of scanning long-lived objects over and over in every GC cycle. Google's Ian Lance Taylor commented on this in the golang-nuts thread "Why golang garbage-collector not implement Generational and Compact gc?" as follows:
That is a good point. Go's current GC is clearly doing extra work, but it's doing it in parallel with other work, so on a system with spare CPU capacity Go is making a reasonable choice. But see https://golang.org/issue/17969.
Well, in my opinion, there is still a chance that generational GC will be implemented someday.
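Coming back to the escape-analysis point above, you can observe it directly with a small program. The exact diagnostics vary by compiler version, but building this with `go build -gcflags=-m` typically reports that the value in `leak` moves to the heap while the one in `sum` does not:

```go
package main

import "fmt"

type point struct{ x, y int }

// sum keeps p inside the function, so escape analysis lets the compiler
// place it on the stack: no GC work at all for this object.
func sum() int {
	p := point{1, 2}
	return p.x + p.y
}

// leak returns a pointer to p, so p outlives the call and escapes:
// the compiler is forced to allocate it on the heap (reported by -gcflags=-m,
// typically as "moved to heap: p").
func leak() *point {
	p := point{3, 4}
	return &p
}

func main() {
	fmt.Println(sum(), leak().x) // 3 3
}
```

Because so many short-lived objects never reach the heap in the first place, the young generation that a generational collector would reclaim is smaller in Go than in most runtimes.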
By studying Go's garbage collection, I came to understand the background of the current design of Go GC and how it overcomes its weaknesses. Go is evolving fast; if you are interested in Go, it is worth keeping your eyes on it (Go 1.11 was released in August 2018, when I wrote this article).