BRIN Index for PostgreSQL: Don’t Forget the Benefits


PostgreSQLBRIN Index was introduced in PostgreSQL 9.5, but many users postponed the usage of it in their design and development just because it was “new”. But now we understand that it has stood the test-of-time! It is time to reconsider BRIN if you have not done it yet. I often see users who forget there is a provision to select the type of Index by specifying USING clause when creating an index.

BRIN Index is a revolutionary idea in indexing first proposed by PostgreSQL contributor Alvaro Herrera. BRIN stands for “Block Range INdex”. A block range is a group of pages adjacent to each other, where summary information about all those pages is stored in Index.  For example, Datatypes like integers – dates where sort order is linear – can be stored as min and max value in the range. Other database systems including Oracle announced similar features later. BRIN index often gives similar gains as Partitioning a table.

BRIN usage will return all the tuples in all the pages in the particular range. So the index is lossy and extra work is needed to further filter out records. So while one might say that is not good, there are a few advantages.

  1. Since only summary information about a range of pages is stored, BRIN indexes are usually very small compared to B-Tree indexes. So if we want to squeeze the working set of data to shared_buffer, this is a great help.
  2. Lossiness of BRIN can be controlled by specifying pages per range (discussed in a later section)
  3. Offloads the summarization work to vacuum or autovacuum. So the overhead of index maintenance on transaction / DML operation is minimal.

Putting BRIN into a test

Let’s take a simple example to examine the benefits of BRIN index by creating a simple table.

Now let’s Insert some data into this table.

Please note that values in id column and date columns keep increasing for new records, which is common for transaction records. This is an important property which BRIN make use off.

A query at this stage may have to do a full scan of the table and it will be quite expensive.

As we can see above, PostgreSQL employs 2 workers and does the scan in parallel, where it takes 1766.454 ms. In a nutshell, it is heavy on the system and takes a good amount of time.

As usual, our tendency is to create an index on the filtering column. (B-Tree by default)

Now let’s see how the previous SELECT query works:

Obviously, B-Tree is lossless, and it can give a tremendous boost to the performance and efficiency of SELECT queries, especially when Index is freshly created. But a B-Tree index has following side effects:

  1. There will be a reasonable penalty for DMLs
  2. Index size can go big and consume a good amount of shared_buffers
  3. The random page seeks, especially when index pages are not cached, can become expensive.

It will be interesting to check the Index to table size ratio of this fresh index.

So we have B-Tree index 171MB for a 1.3GB table. So Index to table size ratio is 0.13, for this is a fresh index.

But this index-table ratio can keep deteriorating over time as index undergoes continuous updates. Index ratio crossing 0.5 is common in many production environments. As the ratio becomes bad, the efficiency of the index goes bad and it starts occupying more shared buffers.

Things get much worse if the table grows to a bigger size (hundreds of GB or TB) as Index also grows in the same ratio. The impact of B-Tree index on DMLs is heavy – sometimes I measured up to 30% overhead especially for bulk loads.

Now let’s see what happens if we replace a B-Tree index with a BRIN index.

The very first observation is that the impact of Index on the same bulk insert operation is measured to be 3% which is within my noise/error limits. 30% overhead Vs 3% overhead.

If we consider the size of the newly created BRIN index:

As we can see it is just 64kB! 171MB of B-Tree index Vs 64 kb of BRIN index.

So far BRIN wins my heart. Now its time to look at how much query performance improvement it can bring in.

As I expected, it is not that efficient as a fully cached B-Tree index. However, performance improvement from 1766.454 ms to 87.703 ms means approximately 20 times better! That’s with a single worker consuming fewer resources. With just a 64kb Overhead, the cost-benefit of BRIN is very positive.

Storage and maintenance

As we saw the BRIN index uses much less space compared to normal B-Tree index, but the lossy index can reduce the effectiveness of Index. Luckily this can be adjusted using the storage parameter pages_per_range. BRIN stores entries for a range of pages in the corresponding table. The larger the range of pages, the smaller the index, and it gets lossier.

As we already discussed, part of the index maintenance during DML is offloaded to vacuum. In fact, there are 2 cases.

  1. Pages in the table which are already summarized are updated
  2. New pages which are not part of the summary.

In the first case, summary in BRIN is updated straight away along with DML. But if new pages are not summarized already, it will be done by VACUUM or AUTOVACUUM.

There are 2 functions provided for this. A high-level, function call which can be executed against the index like:

This summarizes all ranges that are not currently summarized. The return value indicates the number of new page range summaries that were inserted into the index. This function operates on the top of a lower-level function brin_summarize_range which accepts the range of pages.

A key point to note is that auto-summarization is off by default, which we can enable by storage parameter auto summarize:

In this case, automatic summarization will be executed by autovacuum as insertions occur.

Limitation:

BRIN indexes are efficient if the ordering of the key values follows the organization of blocks in the storage layer. In the simplest case, this could require the physical ordering of the table, which is often the creation order of the rows within it, to match the key’s order. Keys on generated sequence numbers or created data are best candidates for BRIN index.

Discuss on Hacker News