RmlUI’s style cache

Original link: https://blog.codingnow.com/2022/07/rmlui_style_cache.html

The GameUI of our game engine uses RmlUI . My idea is to use mature CSS to describe the presentation of the UI, and use the method of web front-end development to make the UI of the game.

But I don’t want to embed too complicated web rendering engine, and the needs of the game will be different, so I chose the lightweight RmlUI. Also, to be consistent with the game’s development language, we use Lua instead of javascript to control the CSS.

In the process of using RmlUI, we tried to be consistent with the upstream as much as possible at the beginning, fixed a lot of bugs, and merged them into the upstream. Later, we found that we have many requirements that are different from upstream, and we need to make some drastic changes, so we fork a branch of our own.

The two most important changes are: First, Lua Binding is fully implemented, because the original implementation is very inefficient and complex, and many interface designs are unreasonable. Making large-scale interface changes must break forward compatibility, which was the main motivation for our fork.

Second, the typesetting module implemented by RmlUI was abandoned and replaced with Yoga maintained by Facebook.

After migrating to our own branch, we have successively refactored the code of RmlUI.

Recently, an original bug of RmlUI was found. Sometimes after modifying a css node attribute, it will affect the attribute of another irrelevant node. After careful study, it is judged that this is caused by its inherent design error.

We know that CSS is very flexible, and the properties of each node can be affected by many data sources. Deriving a property on a node would be a fairly expensive operation if we didn’t do any optimizations. So, some caching has to be done on it.

However, there are some design problems in the attribute caching algorithm of RmlUI, which is based on the C++ object pointer of the node as the cache index. This is bound to be affected by the deletion and rebuilding of objects. (Because the value of the memory pointer may be repeated) and the implementation of this part of the code is very complicated, I think it will be better to redesign and implement after thinking.


First define the problem:

The attribute table of each node is actually an array of kv pairs. Where k is an enum, the range is not large (less than 100 kinds) and will not repeat; v is an object. And this object must be serializable from text (because the source data is written in text).

These property sheets are composed of several property sheets. There are two ways of composition, inheritance from a predefined type and inheritance from the parent node of the DOM. The latter will be affected by whether the attribute can be inherited or not, which attributes can be inherited from the parent node is determined from the beginning and will not change.

The final attribute table of each node is synthesized from several attribute tables in a specific order.

If we think of the attribute table as an object, then the composition table A and table B can be regarded as A * B, where * is the composition operation. The table on each node can be seen as the result of a continuous synthesis operation of a string of data tables A*B*C. ….

If we want to do caching, we should cache the results of any two tables A * B to improve performance.


We can divide the attribute table into two categories, one is the element data table, and the other is the composite result of the two tables. For the former type of table, its lifetime is maintained by the user, while the latter can be completely managed by the cache module. Because once the cache is invalidated, users can find a way to rebuild them.

I use 64bit ID to refer to the attribute table (odd numbers are used to refer to the data table, and even numbers are used to refer to the composite table), if the data table is released, or the composite table disappears from the cache, the ID will not be recycled, so, we It’s easy to know which IDs are still valid and which are no longer valid.

As for the content of the attribute table, you can do an extra layer of Cache and Interning them. The attribute table with the same content actually points to the same piece of data.

For composite tables, it is only necessary to remember which two tables are referenced at the beginning, and the composite operation can be deferred until the final evaluation. If there is a source table change, the previous synthesis result needs to be cleaned up.


When I realized the above ideas, I specifically thought about how to design the internal data structure better. One of the interesting ones is the structure of the Cache of the composite table.

All composite tables are managed by the Cache, and the user does not complicate their lifetime. That is, if you compute A * B = C , then the lifetime of C is not managed by itself, but does (weak) reference. Since it is Cache management, there are times when it will fail. Because the table on each node is the result of the combined action of a series of tables, when the result is invalid, the user can simply do the synthesis process again. This process may be very long, and we need to cache intermediate variables as much as possible.

The composite table should first be managed by an LRU queue. I directly used a fixed-size (4095) array, which was linked internally with a doubly linked list. In this way, for any job application, you can easily adjust it to the front of the queue.

Then, we need two ways to quickly index objects in the LRU queue. Index from ID, and from a combination of both IDs. That is, if A * B = C is computed, then from (A, B) can be quickly indexed to the already created object C .

Each index table I use an 8191 size array as a hash table. It’s twice the size of the Cache, so there should be almost no collisions. But collisions are theoretically unavoidable. In terms of resolving collision conflicts, my design is quite special, neither an open hash table nor a closed hash table.

I use 64bit as a slot size. Because the original data is 4095, which is 12 bits, this slot can put down 5 index values, that is, theoretically, it can collide up to 5 times. If a hash value does collide 5 times (very rare), then you need to traverse the entire LRU queue to see if there is a 6th value.

Attached code: https://github.com/cloudwu/stylecache

This article is reprinted from: https://blog.codingnow.com/2022/07/rmlui_style_cache.html
This site is for inclusion only, and the copyright belongs to the original author.

Leave a Comment

Your email address will not be published.