One of the main reasons people use the Citus extension for Postgres is to distribute the data in Postgres tables across multiple nodes. Citus does this by splitting the original Postgres table into multiple smaller tables and putting these smaller tables on different nodes. The process of splitting bigger tables into smaller ones is called sharding—and these smaller Postgres tables are called “shards”. Citus then allows you to query the shards as if they were still a single Postgres table.
Yes, that’s right, we have open sourced the shard rebalancer! The Citus 10 shard rebalancer gives you an easy way to rebalance shards across your cluster and helps you avoid data hotspots over time. Let’s dig into the what and the how.
Why use the shard rebalancer?
When Citus initially splits a Postgres table into multiple shards, Citus distributes the shards across the nodes in your cluster (unless of course you’re running Citus on a single node.) This is done to divide the workload across the different nodes. Over time, both the traffic to the cluster and the amount of data stored in your database usually increase. At some point, you might want to add some more nodes to the cluster, to lighten the workload on each individual node. This is called “scaling out.”
There’s one problem though: When you first add nodes to an existing Citus cluster, there’s no data on them yet. This is because all the shards are still on the old nodes. So, all these new nodes will just be doing nothing.
This is where shard rebalancing comes in. Shard rebalancing ensures that the shards are distributed fairly across all nodes. The Citus shard rebalancer does this by moving shards from one server to another.
To rebalance shards after adding a new node, you can use the rebalance_table_shards function:
Different rebalancing strategies
By default, Citus divides the shards across the nodes in such a way that every node has the same number of shards. This approach works fine for a lot of workloads. However, when shards have significantly different sizes, this can lead to one node having much more data than another.
One common scenario where you end up with differently-sized shards is with multi-tenant SaaS applications. Most SaaS apps use the customer_id as the sharding key (in our docs we call this the distribution column.) If you work on a SaaS app, sometimes a few of your customers have more activity and store a lot more data than the rest. Thus, the shards containing the data for these big customers are probably much larger than the other shards in your Citus cluster.
Luckily Citus can use different strategies when rebalancing shards. By default, Citus uses the simple
by_shard_count strategy, but in a multi-tenant SaaS scenario like this, you might want to use the
by_disk_size strategy instead.
To rebalance the shards such that the size of the shards is taken into account, you can use the following SQL query:
SELECT rebalance_table_shards(rebalance_strategy := 'by_disk_size');
By default, Citus comes with the two rebalance strategies we covered (
by_shard_count). You can also add your own rebalance strategies in case these two don’t match what’s needed for your workload. Our Citus docs have various examples of different rebalance strategies. With help from these examples, you can create your own rebalance strategies to:
- Isolate a shard to a specific node. For instance, a shard containing an important customer that you’ve acquired dedicated hardware for.
- Divide shards based on number of queries going to the shard. This can be useful if you are bottlenecked on CPU and you want to distribute the queries more evenly across servers. Often balancing by number of queries would result in a balance similar to
by_disk_size, since more queries usually means more data. This isn’t the case for all workloads though, so in those cases it can be better to create your own strategy based on the number of queries.
- Make the rebalancer aware of difference in capacity between nodes. This can be useful if half of your nodes have 1 TB disks and the other half have 2 TB disks. In this case you probably want twice the amount of shards on the nodes with 2 TB disks.
Shrinking your cluster
There’s one more use case for rebalancing shards that Citus 10 supports by default: You realise your workload could be handled with fewer nodes and you’d like to save some money on server costs. Of course, you won’t want to lose access to any of the data. So before physically turning off any nodes, you will want to move those shards to the servers you plan to keep.
Citus also supports this “scaling in” using citus_drain_node:
SELECT citus_drain_node('10.0.0.1', 5432);
Understanding what the shard rebalancer does
The way Citus does shard rebalancing consists of two phases. In the first phase Citus generates a plan. This plan contains the moves that are needed to divide the shards fairly across the nodes.
Just like you can use
EXPLAIN to understand what a PostgreSQL query will do without executing the query, the same can be done for the Citus shard rebalancer using
get_rebalance_table_shards_plan, with this query:
Or if you want to see what happens if you use a different rebalance strategy, you can see plans for alternate rebalance strategies too:
SELECT get_rebalance_table_shards_plan( rebalance_strategy := 'by_disk_size' );
Then during the second phase of rebalancing, Citus moves the shards one-by-one, according to the generated plan. You can still read from a shard while it is being moved, but writing to it is blocked (meanwhile, writing to other shards can continue normally). If you are using Hyperscale (Citus), you can also continue writing to the shard that’s being moved and that’s because we are able to use some extra tricks in Azure Database for PostgreSQL.
If you want to move a shard to a specific node yourself, instead of according to the rebalance plan, you can do that using the citus_move_shard_placement function. There’s a downside to manually moving shards though: Running
rebalance_table_shards could undo this manual change without you realising it. To avoid this, it’s recommended to create your own rebalance strategy. This way you can make the shard rebalancer aware that you want to have a shard on a specific node.
An important part of your Postgres toolbox for scaling out
You won’t need to use the shard rebalancer in the beginning when you’re getting started with Citus to scale out Postgres. But it’s good to know that the Citus shard rebalancer is there for you. Because at some point as your application grows, you will likely want to rebalance.
Sometimes people talk about things like database hygiene or Postgres toolboxes. Well, the Citus shard rebalancer should be part of your toolbox because it ensures your Citus cluster continues to perform well over time. And now that—big news!—we’ve open sourced the shard rebalancer, Citus 10 gives you a very easy way to scale and grow your Citus cluster.