A thread-safe generic support map library

Original link: https://colobu.com/2022/09/04/a-thread-safe-and-generic-supported-map/

orcaman/concurrent-map is a very efficient thread-safe map library, as its documentation says, the standard library sync.Map is more suitable for append-only scenarios, or scenarios with less write-heavy reads, if For multi-read and multi-write scenarios, concurrent-map may be more advantageous. It reduces the granularity of locks through sharding, thereby improving performance.

At the beginning of this year, this library was transformed and started tosupport generics , but unfortunately, it only supports Value generics, and its key can only be of string type, which limits its application scenarios.

I forked this project and created a new thread-safe library smallnest/safemap that supports both Key and Value generics, which can be applied to more sh use cases.

I can understand why the key of concurrent-map does not support generics, because it is not easy to implement. Because for a generic Key, we are not very good at calculating its hash value. Either convert it into a string and then calculate the hash value with low performance, or provide a hash function variable so that users can implement the hash algorithm by themselves, which is inconvenient for users to use.

I recently saw a new project alphadose/haxmap , which is also a thread-safe map. Based on the Harris lock-free list algorithm, referring to its Key hash calculation method, it implements patching the concurrent-map so that the Key and Value also supports generics, which led to the smallnest/safemap library.

Of course, in the process of copying the hash algorithm of haxmap, it was also found that its hash algorithm had problems with some types, and repaired it.

You can use safemap to handle concurrent read and write scenarios, and you can use safemap to learn how to transform an existing project with generic support.

Map Key/Value also supports generic transformation

The concurrent-map is defined as follows:


1
2
3
4
5
6

type ConcurrentMap[V any] []*ConcurrentMapShared[V]
type ConcurrentMapShared[V any] struct {
items map [ string ]V
sync.RWMutex // Read Write mutex, guards access to internal map.
}

As you can see, it only defines the V any type constraint, which is used as the type of Value, and its Key type can only be of string type: items map[string]V .
The advantage is that it can only implement a specific hash algorithm for the string type, regardless of various key types:


1
2
3
4
5
6
7
8
9
10

func fnv32(key string ) uint32 {
hash := uint32 (2166136261 )
const prime32 = uint32 (16777619 )
keyLength := len (key)
for i := 0 ; i < keyLength; i++ {
hash *= prime32
hash ^= uint32 (key[i])
}
return hash
}

Why not use the built-in map hash algorithm to calculate hash values ​​for all comparable types? This issue has also been discussed in #21195 , but in the end it’s gone. Although Go 1.19 officially exposed the internal memhash algorithm, for the scenario in this article, we expect to expose a generic Hash algorithm.

Anyway, let’s assume that there is a way, we implement a hash algorithm for generics ourselves. Then we need to transform ConcurrentMap now.

I first named it SafeMap , which is shorter and supports both Key and Value type constraints:


1
2
3
4
5
6
7
8
9

type SafeMap[K comparable, V any] struct {
shared []*SafeMapShared[K, V]
hasher func (K) uintptr
}
type SafeMapShared[K comparable, V any] struct {
items map [K]V
sync.RWMutex
}

A generic hasher is defined here: hasher func(K) uintptr , which is used to calculate the hash value of a generic Key. Let’s implement it below.

Then there is the place where the repair program is marked in red. That is, where the original hard-coded string, we have to replace it with K, for example:


1
2
3
4
5
6
7
8
9
10
11
12
13

func (m *SafeMap[K, V]) GetShard(key K) *SafeMapShared[K, V] {
k := uint (m.hasher(key))
return m.shared[ uint (k)% uint (ShardCount)]
}
// Sets the given value under the specified key.
func (m *SafeMap[K, V]) Set(key K, value V) {
// Get map shard.
shard := m.GetShard(key)
shard.Lock()
shard.items[key] = value
shard.Unlock()
}

Receiver is going to be modified because it currently has two type constraints ( K , V ). Where Key string is involved, use Key K to replace.

The hash calculation for this uses the hasher method. Next, let’s take a look at our default hasher algorithm.

hasher

The difficulty in implementing this hasher method is to support generics. If it is a specific type, such as int64, string, we can directly use its underlying value to calculate the hash value. If it is interface{} , it is easy to handle, and the type switch can also be used for convenient model processing. If it is a generic type, it is executed at compile time, and you cannot use type switch directly, so what should we do?

First, we determine the type of Key when initializing, which can be achieved by reflection (first convert the Key to interface{} , and then use type switch ), and then return a specific hasher algorithm according to different types, so that the hasher is actually initialized When it is determined, it does not affect the performance of subsequent Set and Get methods.

Here is the snippet that generates the hasher:


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
twenty one
twenty two
twenty three
twenty four
25
26
27
28
29
30
31
32
33
34
35
36
37
38

var (
// qword hasher, key size -> 8 bytes
qwordHasher = func (key uint64 ) uintptr {
k1 := key * prime2
k1 = bits.RotateLeft64(k1, 31 )
k1 *= prime1
h := (prime5 + 8 ) ^ k1
h = bits.RotateLeft64(h, 27 )*prime1 + prime4
h ^= h >> 33
h*= prime2
h ^= h >> 29
h*= prime3
h ^= h >> 32
return uintptr (h)
}
float64Hasher = func (key float64 ) uintptr {
k := *(* uint64 )(unsafe.Pointer(&key))
h := prime5 + 4
h ^= uint64 (k) * prime1
h = bits.RotateLeft64(h, 23 )*prime2 + prime3
h ^= h >> 33
h*= prime2
h ^= h >> 29
h*= prime3
h ^= h >> 32
return uintptr (h)
}
)
func genHasher[K comparable]() func (K) uintptr {
var hasher func (K) uintptr
switch any(* new (K)).( type ) {
case int64 , uint64 :
hasher = *(* func (K) uintptr )(unsafe.Pointer(&qwordHasher))
case float64 :
hasher = *(* func (K) uintptr )(unsafe.Pointer(&float64Hasher))

First convert the generic value to interface{} , so that you can type switch: switch any(*new(K)).(type) { .

If it is int64, uint64 type, we can use the predefined qwordHasher qwordHasher, but its type is func(key uint64) uintptr For int64 type, you can use unsafe to force it.

In this way, we have overcome the most difficult point. Various types can be handled independently, and we only need to further define the corresponding hasher for different types. You can use various hash algorithms to achieve this.

In this way, the safemap is transformed. How about the performance, I suggest you actually test it, and it’s best to test it according to your actual scenario. Some basic benchmark tests are provided in the project.

In addition, I also recommend you to pay attention to alphadose/haxmap , it will perform better in my initial test, but I still need to wait for it to develop and become more stable.

This article is reprinted from: https://colobu.com/2022/09/04/a-thread-safe-and-generic-supported-map/
This site is for inclusion only, and the copyright belongs to the original author.

Leave a Comment