How to design consistent hashing

To achieve horizontal scaling, it is important to distribute requests/data efficiently and evenly across servers. Consistent hashing is a commonly used technique to achieve this goal.

The rehashing problem

If you have N cache servers, a common way to balance the load is to use the following hash method:

serverIndex = hash(key) % N, where N is the size of the server pool.
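
As an illustration, here is a minimal sketch of this modulo-based routing in Python; the server names and keys are made up for the example, and SHA-1 simply stands in for any hash function:

    import hashlib

    def stable_hash(key: str) -> int:
        # Any deterministic hash function works; SHA-1 is used here for illustration.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    servers = ["server0", "server1", "server2", "server3"]  # N = 4

    def server_index(key: str, n: int) -> int:
        return stable_hash(key) % n

    for key in ["key0", "key1", "key2", "key3"]:
        print(key, "->", servers[server_index(key, len(servers))])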

To fetch the server where a key is stored, we perform the modular operation hash(key) % 4 (assuming a pool of 4 servers). For instance, hash(key0) % 4 = 1 means a client must contact server 1 to fetch the cached data.

This approach works well when the size of the server pool is fixed, and the data distribution is even. However, problems arise when new servers are added, or existing servers are removed. For example, if server 1 goes offline, the size of the server pool becomes 3. Using the same hash function, we get the same hash value for a key. But applying modular operation gives us different server indexes because the number of servers is reduced by 1.
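
A rough sketch of how severe the redistribution is: the snippet below (with hypothetical keys) counts how many keys map to a different server index when the pool shrinks from 4 to 3 servers; typically around three quarters of the keys move.

    import hashlib

    def stable_hash(key: str) -> int:
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    keys = [f"key{i}" for i in range(10000)]  # hypothetical keys
    moved = sum(1 for k in keys if stable_hash(k) % 4 != stable_hash(k) % 3)
    print(f"{moved / len(keys):.0%} of keys map to a different server")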

Most keys are redistributed, not just the ones originally stored in the offline server (server 1). This means that when server 1 goes offline, most cache clients will connect to the wrong servers to fetch data. This causes a storm of cache misses. Consistent hashing is an effective technique to mitigate this problem.

Consistent hashing

Quoted from Wikipedia: "Consistent hashing is a special kind of hashing such that when a hash table is re-sized and consistent hashing is used, only k/n keys need to be remapped on average, where k is the number of keys, and n is the number of slots."

Hash space and hash ring

Now that we understand the definition of consistent hashing, let us find out how it works. Assume SHA-1 is used as the hash function f, and the output range of the hash function is: x0, x1, x2, x3, …, xn. In cryptography, SHA-1’s hash space goes from 0 to 2^160 - 1.

That means x0 corresponds to 0, xn corresponds to 2^160 - 1, and all the other hash values in the middle fall between 0 and 2^160 - 1.

image
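
A minimal sketch of mapping a value onto this hash space, using hashlib from Python's standard library:

    import hashlib

    RING_SIZE = 2 ** 160  # SHA-1 hash space: 0 to 2^160 - 1

    def ring_position(value: str) -> int:
        # The SHA-1 digest, interpreted as an integer, falls in [0, 2^160 - 1].
        return int(hashlib.sha1(value.encode()).hexdigest(), 16)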

By connecting both ends, we get a hash ring as shown in the following figure:

image

Hash servers

Using the same hash function f, we map servers onto the ring based on their IP addresses or names.

image
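
One possible way to keep track of server positions, assuming the ring_position helper sketched earlier; the IP addresses are made up for illustration:

    from bisect import insort

    # Sorted list of (position, server) pairs representing the ring.
    ring = []
    for server in ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]:
        insort(ring, (ring_position(server), server))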

Hash keys

One thing worth mentioning is that the hash function used here is different from the one in “the rehashing problem,” and there is no modular operation. As shown in the following figure, 4 cache keys (key0, key1, key2, and key3) are hashed onto the hash ring:

image

Server lookup

To determine which server a key is stored on, we go clockwise from the key position on the ring until a server is found.

image
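
Continuing the same sketch, the clockwise walk can be implemented as a binary search over the sorted ring: find the first server at or after the key's position, and wrap around to the first server if the end of the ring is passed.

    from bisect import bisect_left

    def lookup(ring, key: str) -> str:
        pos = ring_position(key)
        # First server whose position is >= the key's position.
        idx = bisect_left(ring, (pos,))
        if idx == len(ring):
            idx = 0  # wrap around the ring
        return ring[idx][1]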

Add a server

Using the logic described above, adding a new server only requires redistribution of a fraction of the keys. Let us take a closer look at the logic. Before server 4 is added, key0 is stored on server 0. Now, key0 will be stored on server 4 because server 4 is the first server it encounters by going clockwise from key0’s position on the ring. The other keys are not redistributed under the consistent hashing algorithm.

image
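
In the sketch above, adding a server is just an insertion into the sorted ring; only keys between the new server and its predecessor (going counterclockwise) change owners on the next lookup.

    def add_server(ring, server: str) -> None:
        # Keys that fall between the new server's predecessor and the new
        # server are picked up by it; all other keys keep their owners.
        insort(ring, (ring_position(server), server))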

Remove a server

When a server is removed, only a small fraction of keys require redistribution with consistent hashing. When server 1 is removed, only key1 must be remapped to server 2. The rest of the keys are unaffected.

image
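
Likewise, removing a server from the sketch is a single deletion; the keys it owned are picked up by the next server clockwise on the following lookup.

    def remove_server(ring, server: str) -> None:
        # Keys previously owned by this server now resolve to the next
        # server clockwise; all other keys are unaffected.
        ring.remove((ring_position(server), server))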

The basic steps are:
  • Map servers and keys onto the ring using a uniformly distributed hash function.
  • To find out which server a key is mapped to, go clockwise from the key position until the first server on the ring is found.

Two problems are identified with this approach. First, it is impossible to keep the same size of partitions on the ring for all servers, considering that a server can be added or removed. A partition is the hash space between adjacent servers. It is possible that the size of the partitions on the ring assigned to each server is very small or fairly large. Second, it is possible to have a non-uniform key distribution on the ring, so some servers end up holding most of the keys while others hold few or none.

For example, if s1 is removed, s2’s partition (highlighted with the bidirectional arrows) becomes twice as large as the partitions of s0 and s3.

image
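
To make the imbalance concrete, a rough sketch that computes each server's partition size (the hash-space distance back to its predecessor) from the sorted ring used above:

    RING_SIZE = 2 ** 160

    def partition_sizes(ring):
        # Each server owns the arc from its predecessor (exclusive) to itself (inclusive).
        sizes = {}
        for i, (pos, server) in enumerate(ring):
            prev_pos = ring[i - 1][0]  # wraps to the last server when i == 0
            sizes[server] = (pos - prev_pos) % RING_SIZE
        return sizes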

Virtual nodes

A virtual node refers to the real node, and each server is represented by multiple virtual nodes on the ring. In the following figure, both server 0 and server 1 have 3 virtual nodes. The number 3 is chosen arbitrarily; in real-world systems, the number of virtual nodes is much larger.
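
A minimal sketch of virtual nodes on top of the same ring: each physical server is hashed onto the ring several times under derived names; the suffix scheme and the count of 3 are assumptions for illustration.

    VIRTUAL_NODES = 3  # deliberately small here; real systems use far more

    def add_server_with_virtual_nodes(ring, server: str) -> None:
        # Each physical server appears VIRTUAL_NODES times on the ring, e.g.
        # "10.0.0.1#0", "10.0.0.1#1", "10.0.0.1#2"; lookups still return
        # the physical server name because it is stored as the value.
        for i in range(VIRTUAL_NODES):
            insort(ring, (ring_position(f"{server}#{i}"), server))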