Visualizing Your Ruby Heap


In a previous post, I wrote a bit about how Ruby objects are laid out in memory. Today we’ll use that information to write a program that will allow us to take a Ruby heap dump and visualize the layout and fragmentation of that heap.

Ruby Object Layout Recap

Just as a recap, Ruby objects are fixed width. That is, every Ruby object is the same size: 40 bytes. Objects are not really allocated with malloc, but instead they are placed inside of pages.

A Ruby process has many pages. Pages have many objects.

Which Page Does This Object Belong To?

Objects are allocated into a page. Each page is 2 ^ 14 bytes. In other words, Ruby objects aren’t allocated one at a time, but the GC allocates one page (also known as an “arena”), and when a new Ruby object is requested, it is placed inside that page.

Pages aren’t exactly 2 ^ 14 bytes. When we allocate a page, we want that page to be aligned with Operating System memory pages, so the total malloc’d size needs to be slightly less than a multiple of 4 KB (which is the typical OS page size). Since malloc keeps some bookkeeping overhead of its own for each allocation, we have to subtract some amount from the requested size so that the Ruby page, together with that overhead, fits on contiguous OS pages. The padding size we use is sizeof(size_t) * 5. So the size of a page is actually (2 ^ 14) - (sizeof(size_t) * 5).

Each page has a header that contains some information about the page. The size of that header is sizeof(void *).

This means the maximum number of Ruby objects that can be stored on a Ruby page is ((2 ^ 14) - (sizeof(size_t) * 5) - sizeof(void *)) / 40.
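To make the formula concrete, here is the same arithmetic with numbers plugged in. This is only a sketch that assumes a 64-bit platform, where sizeof(size_t) and sizeof(void *) are both 8 bytes; the variable names are just for illustration.

```ruby
# Assumption: 64-bit platform, so size_t and pointers are 8 bytes each.
size_t_bytes  = 8
pointer_bytes = 8
slot_bytes    = 40

page_bytes   = (1 << 14) - size_t_bytes * 5  # 16384 - 40 = 16344
usable_bytes = page_bytes - pointer_bytes     # 16344 - 8  = 16336
max_slots    = usable_bytes / slot_bytes      # 408
leftover     = usable_bytes % slot_bytes      # 16 bytes that fit no object

puts "page size: #{page_bytes} bytes"
puts "max slots: #{max_slots} (#{leftover} bytes left over)"
```

Under that assumption a page holds at most 408 objects, with 16 bytes to spare.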

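Because every page starts on an address divisible by 2 ^ 14 and is never larger than 2 ^ 14 bytes, the bottom 14 bits of a Ruby object address are just that object’s offset within its page. If we mask off those bottom 14 bits (remember page sizes are 2 ^ 14, in other words 1 left shifted 14), what remains is the address of the page that object lives on. That bitmask is ~0 << 14.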

In ASCII art, say we have a Ruby object address 0x7fcc6c845108. In binary:

11111111100110001101100100001000101000100001000
^---------- page address --------^- object id ^

“object id” in the above chart isn’t the traditional Ruby object id, it’s just the part of the bits that represent that individual object on the page. The entire address is considered the traditional “object id”.
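As a quick check of the chart, we can split that same address into its two parts with plain integer math. This is a minimal sketch; the local variable names are mine and aren’t anything Ruby itself defines.

```ruby
address     = 0x7fcc6c845108
offset_mask = (1 << 14) - 1          # bottom 14 bits set

page_address = address & ~offset_mask  # clear the bottom 14 bits
offset       = address & offset_mask   # keep only the bottom 14 bits

printf("page address: %#x\n", page_address)             # 0x7fcc6c844000
printf("offset:       %#x (%d bytes)\n", offset, offset) # 0x1108 (4360 bytes)
```

Adding the two parts back together gives the original address.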

Let’s extract these numbers to some Ruby code:

require 'fiddle'

SIZEOF_HEAP_PAGE_HEADER_STRUCT = Fiddle::SIZEOF_VOIDP
SIZEOF_RVALUE = 40

HEAP_PAGE_ALIGN_LOG = 14
HEAP_PAGE_ALIGN = 1 << HEAP_PAGE_ALIGN_LOG
# Bottom 14 bits set: selects an object's offset inside its page
HEAP_PAGE_ALIGN_MASK = ~(~0 << HEAP_PAGE_ALIGN_LOG)
REQUIRED_SIZE_BY_MALLOC = Fiddle::SIZEOF_SIZE_T * 5
HEAP_PAGE_SIZE = HEAP_PAGE_ALIGN - REQUIRED_SIZE_BY_MALLOC
HEAP_PAGE_OBJ_LIMIT = (HEAP_PAGE_SIZE - SIZEOF_HEAP_PAGE_HEADER_STRUCT) / SIZEOF_RVALUE

I mentioned this a little earlier, but I will be explicit here: Ruby pages are allocated with aligned mallocs. In other words, when a Ruby page is allocated it’s allocated on an address that is divisible by 2 ^ 14, and the size of the page is slightly smaller than 2 ^ 14.

Let’s write a function that, given an object address, returns the address of the page where that object was placed:

def page_address_from_object_address object_address
  # Clear the bottom 14 bits, leaving only the page address
  object_address & ~HEAP_PAGE_ALIGN_MASK
end

Now let’s print the page addresses for 3 object addresses:

p page_address_from_object_address(0x7fcc6c8367e8) # => 140515970596864
p page_address_from_object_address(0x7fcc6c836838) # => 140515970596864
p page_address_from_object_address(0x7fcc6c847b88) # => 140515970662400

We can see from the output that the first two objects live on the same page, but the third object lives on a different page.

How many objects are on this page?

Ruby objects are also aligned, but they are aligned inside the existing page. They are aligned at 40 bytes (which is also the size of the object). That means that every Ruby object address is guaranteed to be divisible by 40 (this statement isn’t true for non heap allocated objects like numbers).
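We can sanity check that claim against the three object addresses from the earlier example. This is just a small sketch using those same addresses; each one should leave a remainder of zero.

```ruby
slot_size = 40
addresses = [0x7fcc6c8367e8, 0x7fcc6c836838, 0x7fcc6c847b88]

addresses.each do |address|
  # Every heap object address should be a multiple of the slot size
  printf("%#x %% %d = %d\n", address, slot_size, address % slot_size)
end
```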

Ruby objects are never allocated individually, but they are placed inside a page that has been allocated. The pages are aligned on 2 ^ 14, but not every number divisible by 2 ^ 14 is also evenly divisible by 40, so the first slot on a page often has to be pushed forward to the next address that is divisible by 40. That padding eats into the page’s space, which means some pages store more objects than others: a page that needs little or no padding will store one more object than a page that needs a lot.

Let’s write a function that, given a page address, calculates the number of objects it can store as well as where those objects are placed, and returns an object that represents the page information.

Page = Struct.new :address, :obj_start_address, :obj_count

def page_info page_address
  limit = HEAP_PAGE_OBJ_LIMIT
  # Slots start right after the page header
  obj_start_address = page_address + SIZEOF_HEAP_PAGE_HEADER_STRUCT

  if obj_start_address % SIZEOF_RVALUE != 0
    # Skip ahead to the next address divisible by 40, then recompute
    # how many objects fit in the space that remains
    delta = SIZEOF_RVALUE - (obj_start_address % SIZEOF_RVALUE)
    obj_start_address += delta
    limit = (HEAP_PAGE_SIZE - (obj_start_address - page_address)) / SIZEOF_RVALUE
  end

  Page.new page_address, obj_start_address, limit
end

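Now that we can get information about the page where the object is stored, let’s examine page information for the object addresses we were working with in the previous example.

# Output shown is for a 64-bit platform
page_address = page_address_from_object_address(0x7fcc6c8367e8)
p page_info(page_address)
# => #<struct Page address=140515970596864, obj_start_address=140515970596880, obj_count=408>

page_address = page_address_from_object_address(0x7fcc6c836838)
p page_info(page_address)
# => #<struct Page address=140515970596864, obj_start_address=140515970596880, obj_count=408>

page_address = page_address_from_object_address(0x7fcc6c847b88)
p page_info(page_address)
# => #<struct Page address=140515970662400, obj_start_address=140515970662440, obj_count=407>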

The first two objects live on the same page, and that page can store 408 objects. The third object lives on a different page, and that page can only store 407 objects.
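To see where the 408 versus 407 split comes from, we can work out the capacity for every possible value of the page address modulo 40. Since 2 ^ 14 is a multiple of 8, that remainder can only be 0, 8, 16, 24, or 32. This is a sketch that hard codes 64-bit sizes (an 8 byte header and 8 byte size_t) rather than asking Fiddle, and the variable names are only illustrative.

```ruby
# Assumption: 64-bit platform (8 byte page header, 8 byte size_t).
slot_size   = 40
header_size = 8
page_size   = (1 << 14) - 8 * 5   # 16344 bytes

capacities = [0, 8, 16, 24, 32].map do |remainder|
  first_slot = remainder + header_size
  # Bytes skipped so that the first slot lands on a multiple of 40
  padding = (slot_size - first_slot % slot_size) % slot_size
  [remainder, (page_size - header_size - padding) / slot_size]
end

capacities.each do |remainder, capacity|
  puts "page address % 40 == #{remainder}: #{capacity} slots"
end
```

Under those assumptions, remainders of 0 and 8 need 32 and 24 bytes of padding and end up with 407 slots, while remainders of 16, 24, and 32 need 16 bytes or less and keep all 408.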

It may not seem like it, but we now have all of the key pieces of information we need in order to visualize the contents of our heap.

Data Acquisition

In order to visualize a heap, we actually need a heap to visualize. I will use ObjectSpace to dump the heap to a JSON file, and we’ll use the code from above along with a JSON parser and ChunkyPNG to generate a graph.

Here is our test program:

require 'objspace'

x = 100000.times.map { Object.new }
GC.start
File.open('heap.json', 'w') { |f|
  ObjectSpace.dump_all(output: f)
}

All it does is allocate a bunch of objects, GC, then dump the heap to a JSON file called heap.json. Each line in the JSON document is one object in the Ruby heap.
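To get a feel for what one of those lines contains, we can dump a single object and parse the result. This sketch uses ObjectSpace.dump, which returns the JSON for one object as a string, and only looks at the two keys the rest of this post relies on ("type" and "address"); the exact set of other keys may vary between Ruby versions.

```ruby
require 'objspace'
require 'json'

# Dump one freshly allocated object to a JSON string and parse it
line   = ObjectSpace.dump(Object.new)
record = JSON.parse(line)

puts record["type"]                # the kind of slot, e.g. OBJECT
puts record["address"]             # hex string for the object's address
puts record["address"].to_i(16)    # the same address as an Integer
```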

Now let’s write a program to process the JSON file. What we’ll do is change the Page class so that it can keep track of objects that are living on that page, then iterate over the JSON document and add each object to its respective page.

require 'json'

# This definition replaces the earlier `Page = Struct.new(...)`
class Page < Struct.new(:address, :obj_start_address, :obj_count)
  def initialize address, obj_start_address, obj_count
    super
    @live_objects = []
  end

  def add_object address
    @live_objects << address
  end
end

pages = {}

File.open("heap.json") do |f|
  f.each_line do |line|
    object = JSON.load line

    # ROOT entries describe GC roots and have no address of their own
    if object["type"] != "ROOT"
      address = object["address"].to_i(16)
      page_address = page_address_from_object_address(address)
      page = pages[page_address] ||= page_info(page_address)
      page.add_object address
    end
  end
end

Visualizing the Heap

So far our processing program has divided the objects up by which pages they belong to. Now it’s time to turn that data into a visualization of the heap. Unfortunately, we have one slight problem: the heap dump only gave us information about live objects in the system. How can we visualize empty spaces in the heap?

We have a few bits of information we can use to figure out where the empty spots are in our heap. First, we know that object addresses are divisible by 40. Second, we know which address is the first address for storage (Page#obj_start_address). Third, we know how many objects a page can store (Page#obj_count). So if we start at obj_start_address, and increment by SIZEOF_RVALUE, we should either find that we read that address from the JSON file, or not. If we read the address from the JSON file, we know it’s a live object. If not, then it’s an empty slot.
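Here is that idea on a toy page, before we wire it into the real Page class. The start address, capacity, and live addresses below are made up purely for illustration.

```ruby
slot_size = 40
start     = 1000               # hypothetical first slot address
capacity  = 5                  # hypothetical number of slots on the page
live      = [1000, 1080, 1120] # addresses we "read from the dump"

# Walk every possible slot address and check whether we saw it
slots = capacity.times.map do |i|
  expected = start + i * slot_size
  live.include?(expected) ? :full : :empty
end

puts slots.join(" ")   # full empty full full empty
```

The slots at 1040 and 1160 never showed up in the dump, so they are reported as empty.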

So let’s add a method to the Page object that iterates over the possible object addresses on the page, and yields :full if there is an object, or :empty if there is no object:

# Reopen Page without a superclass; repeating Struct.new here would
# raise a "superclass mismatch" error
class Page
  def each_slot
    return enum_for(:each_slot) unless block_given?

    objs = @live_objects.sort
    obj_count.times do |i|
      expected = obj_start_address + (i * SIZEOF_RVALUE)
      if objs.first == expected
        objs.shift
        yield :full
      else
        yield :empty
      end
    end
  end
end

Now, from page to page, we’re able to differentiate empty slots from full slots. Let’s use ChunkyPNG to generate a PNG file where each 2 pixel wide column in the PNG represents one page, and each 2 x 2 pixel square in each column represents an object. We’ll color the square red if the slot is full, but just leave it transparent if it’s empty:

require 'chunky_png'

pages = pages.values

height = HEAP_PAGE_OBJ_LIMIT * 2
width = pages.size * 2

png = ChunkyPNG::Image.new(width, height, ChunkyPNG::Color::TRANSPARENT)
red = ChunkyPNG::Color.rgba(255, 0, 0, 255)

pages.each_with_index do |page, i|
  i = i * 2
  page.each_slot.with_index do |slot, j|
    if slot == :full
      j = j * 2
      # Fill the 2 x 2 square for this slot
      png[i, j] = red
      png[i + 1, j] = red
      png[i, j + 1] = red
      png[i + 1, j + 1] = red
    end
  end
end

png.save('heap.png', :interlace => true)

After running this, we should have a file output called heap.png. Here’s the one I generated:

[Image: filled heap]

This one doesn’t look as neat because we filled the heap up. Let’s try dumping the heap from a relatively empty process and see what it looks like:

$ ruby -robjspace -e'File.open("heap.json", "wb") { |t| ObjectSpace.dump_all(output: t) }'

If I process this heap, the output looks like this:

[Image: empty heap]

Ok that’s the end. Thank you!

You can find the full code listing from this post here.

<3<3<3<3<3