Optimizing React Rendering through Virtualization

October 4th, 2016 | By Juho Vepsäläinen | 5 min read

Why optimize React rendering? Even though React is performant out of the box, sometimes you need to tune it.

The most common trick is to implement the shouldComponentUpdate lifecycle method so that React can skip rendering based on a custom check. This is convenient when equality checks against the data are cheap (e.g. when you use a library that provides immutable data structures).
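As a minimal sketch of such a check, the helper below compares props by reference. The name shallowDiffers is made up for illustration and is not part of React. It is only cheap and reliable when the data is treated as immutable, so that a changed value always gets a new reference.

```javascript
// Hypothetical helper: returns true when any prop changed by reference.
function shallowDiffers(prevProps, nextProps) {
  const prevKeys = Object.keys(prevProps);
  const nextKeys = Object.keys(nextProps);

  if (prevKeys.length !== nextKeys.length) {
    return true;
  }

  return nextKeys.some(
    key =>
      !Object.prototype.hasOwnProperty.call(prevProps, key) ||
      prevProps[key] !== nextProps[key]
  );
}

// Inside a React class component this could be used as:
//
//   shouldComponentUpdate(nextProps) {
//     return shallowDiffers(this.props, nextProps);
//   }

const row = { id: 1, name: 'Demo' };

console.log(shallowDiffers({ row }, { row })); // false, same reference
console.log(shallowDiffers({ row }, { row: { ...row } })); // true, new reference
```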

Sometimes this isn’t enough. Consider rendering thousands of lines of tabular data. It can quickly become a heavy operation even if you have nice checks. That is when you must be more clever and implement a context-specific optimization.

Given tabular data fits well within a viewport (think CSS overflow), it is possible to render only the visible rows and skip rendering the rest. This technique – virtualization – is worth understanding in detail as it gives you greater insight into React.

Basic Idea of Virtualization

The biggest constraint of virtualization is that you need a viewport. You could treat even the browser window as such. Often, however, you set up something more limited.

A simple way to achieve this is to fix the width and height of a container and set the overflow: auto property on it through CSS. The data will then be displayed within the container and scroll inside it.

The question is, how to figure out what to render?

The naïve case is simple. Assuming we are rendering table rows, we will render initially only as many as fit within the viewport. Let’s say the viewport is 400 pixels tall and our rows each take 40 pixels. Based on that we can say we can render up to ten rows at first.
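The arithmetic can be written down as a tiny function. This is only a sketch, and rowsInViewport is a name chosen here for illustration. It rounds up so that a partially visible last row still gets rendered.

```javascript
// How many fixed-height rows are needed to fill the viewport.
function rowsInViewport(viewportHeight, rowHeight) {
  return Math.ceil(viewportHeight / rowHeight);
}

console.log(rowsInViewport(400, 40)); // 10
```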

Problem of Scrolling

The problems begin once the user starts to scroll. We can capture scrolling-related information using an onScroll handler. Within it, we can read the current vertical scroll position through e.target.scrollTop.

Let’s say the user scrolled 50 pixels vertically. With 40-pixel rows, quick math tells us we should skip rendering one row (50 - 40 * 1 = 10) and start rendering from the second, which is already scrolled 10 pixels past the top of the viewport. We should also offset the content by the height of the skipped row, 40 pixels. One way to handle the offset is to render an extra row at the beginning and set its height accordingly.

There is also a similar problem related to the rows at the end. Since we’ll want to skip rendering rows there too, we’ll need to perform similar math. This time we’ll figure it out based on the viewport height while taking offset into account. To get the scrollbar height right, we can render an extra row using this information.

Based on these calculations and a bit of extra work we can figure out the following things:

Extra padding at the beginning – extra row to make the scrollbar look right.
Which rows to render (the start index and the number of rows) – these should be sliced from the whole data set.
Extra padding at the end – extra row to make the scrollbar look right.
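The three values above could be computed along these lines. This is a sketch under the fixed row height assumption, not the code of any particular library. The function and field names are hypothetical, and the one extra row is there to cover a partially visible row at the bottom.

```javascript
// Sketch: every row is assumed to share the same fixed height.
function calculateVisibleRows({ scrollTop, viewportHeight, rowHeight, rowCount }) {
  // Rows that have scrolled completely out of view at the top
  const startIndex = Math.min(Math.floor(scrollTop / rowHeight), rowCount);
  // Enough rows to cover the viewport, plus one for a partially visible row
  const visibleCount = Math.ceil(viewportHeight / rowHeight) + 1;
  const endIndex = Math.min(startIndex + visibleCount, rowCount);

  return {
    startIndex,
    endIndex, // exclusive, suitable for Array.prototype.slice
    startPadding: startIndex * rowHeight,
    endPadding: (rowCount - endIndex) * rowHeight
  };
}

// In a component, scrollTop would come from the onScroll handler
// (e.target.scrollTop). Here it is hard-coded for the demo.
const data = Array.from({ length: 1000 }, (_, i) => ({ id: i }));
const result = calculateVisibleRows({
  scrollTop: 120,
  viewportHeight: 400,
  rowHeight: 40,
  rowCount: data.length
});
const visibleRows = data.slice(result.startIndex, result.endIndex);

console.log(result, visibleRows.length);
```

The two paddings plus the heights of the rendered rows always add up to the height of the full table, which is what keeps the scrollbar looking right.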

Problem of Indices

This isn’t everything. There are a couple of gotchas in the scheme. If you have styled your content with even/odd styling through CSS, you will quickly see that something is wrong. A basic algorithm will lead to flickering as you scroll.

The flickering has to do with the fact that we are always rendering from “zero” without taking actual indexing into account. Fortunately, this is an easy problem to fix. Once you know where you are slicing the data from, you can check whether the starting index is even or not. If it’s even, render an extra row at the start. It can have zero height.

This little cheat makes the rendering result stable and gets rid of flickering.
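To illustrate the cheat, the sketch below builds the list of row keys to render. It assumes a start padding row is always rendered first, as described earlier, and all names are hypothetical. The isOddChild helper mimics how CSS :nth-child(odd) counts positions, starting from one.

```javascript
// Sketch: returns the keys of the rows to render, in order.
// Assumes a start padding row is always rendered first.
function rowsToRender(startIndex, endIndex) {
  const rows = ['start-padding'];

  // With the padding row in front, an even starting index would land on an
  // even nth-child position. A zero-height row shifts it back to odd.
  if (startIndex % 2 === 0) {
    rows.push('parity-fix');
  }

  for (let i = startIndex; i < endIndex; i++) {
    rows.push(`row-${i}`);
  }

  rows.push('end-padding');

  return rows;
}

// CSS nth-child positions are 1-based
function isOddChild(rows, key) {
  return (rows.indexOf(key) + 1) % 2 === 1;
}

console.log(rowsToRender(2, 5));
console.log(rowsToRender(3, 6));
```

With this in place, a data row with an even index always sits at an odd child position, just as it would in a table rendered without virtualization.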

Problem of Heights

In the example above we made it easier for ourselves by assuming that row height is set to some specific value. This is the most common way to approach it and is often enough. What if, however, you allowed arbitrary row heights?

This is where things get interesting. It is still possible to implement virtualization, but there is an extra step to perform – measuring.

Idea of Measuring

One way to solve this problem is to handle it through a React feature known as context and callbacks to update the height information at a higher level where the logic exists. Depending on your design, regular props could work as well, or you could use a state management solution for this part. There is no one right way.

The idea is that once the componentDidMount or componentDidUpdate lifecycle method of a row gets called, you’ll trigger the callback with the row height information. You can capture this using a ref and the offsetHeight property of the row element.

You may also want to pass id information related to the row to the callback. This allows you to tell the measurements apart. It’s better to use the actual row id over the index as the latter won’t yield predictable results.
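The receiving end of such a callback could look roughly like the sketch below. createHeightStore and its methods are made-up names, and how the callback reaches the rows (context, props, or a state manager) is left open.

```javascript
// Hypothetical store living next to the virtualization logic.
function createHeightStore() {
  const heights = new Map();

  return {
    // A row would call this from componentDidMount/componentDidUpdate,
    // passing its id and the offsetHeight read through a ref.
    report(rowId, height) {
      const changed = heights.get(rowId) !== height;

      heights.set(rowId, height);

      // Lets the caller skip recalculation when nothing changed
      return changed;
    },
    get(rowId) {
      return heights.get(rowId);
    },
    size() {
      return heights.size;
    }
  };
}

const store = createHeightStore();

console.log(store.report('row-a', 40)); // true, first measurement
console.log(store.report('row-a', 40)); // false, nothing changed
console.log(store.report('row-b', 72)); // true
```

Keying the measurements by row id rather than index means they stay valid when the data is sorted or filtered.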

Getting Initial Measurements

Beyond measuring, you’ll need to use the measurement data to figure out the same things we did above. The math is a notch harder, but the ideas are the same. The biggest difficulty is actually handling the initial render. You don’t have any measurement data there and you need to get it somehow.

I ended up handling this problem by implementing componentDidMount and componentDidUpdate lifecycle methods at the logic level. It is important to note that if you use an inline CSS solution such as Radium, the initial measurement data might be invalid! Such solutions take a while to apply their styling and as a result, React’s lifecycle methods might not work as you expect.

To give you a rough idea of how I solved this, I ended up with the code below:

...
class VirtualizedBody extends React.Component {
    ...
    componentDidMount() {
        // Trigger measuring initially
        this.checkMeasurements();
    }
    componentDidUpdate() {
        // Called due to forceUpdate -> cycle continues if
        // there are no measurements yet
        this.checkMeasurements();
    }
    ...
    checkMeasurements() {
        if (this.initialMeasurement) {
            const rows = calculateRows(...);

            if (!rows) {
                return;
            }

            // Refresh the rows to trigger measurement
            setTimeout(() => {
                this.forceUpdate(
                    () => {
                        // Recalculate rows upon completion
                        this.setState(
                            rows,
                            /* If calculateRows returned something,
                            (not {}), finish*/
                            () => this.initialMeasurement = false
                        );
                    }
                );
            }, 100); // Try again in a while
        }
    }
}

export default VirtualizedBody;

It might not be the most beautiful solution, but it operates within the given constraints. Most importantly it allows us to capture initial measurement data that in turn can be extrapolated to give us average row height. The more data we render, the better average we can get.

In a scheme like this, we will most often be dealing with incomplete data. In practice, having something that’s good enough is valuable. Especially if you are rendering thousands of rows, small inaccuracies won’t affect scrollbar rendering much.
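The extrapolation itself is simple. As a rough sketch, with estimateTotalHeight and fallbackHeight being names invented here, the total height can be estimated from whatever has been measured so far:

```javascript
// Sketch: estimate the full table height from a partial set of measurements.
function estimateTotalHeight(measuredHeights, rowCount, fallbackHeight) {
  // Nothing measured yet, so fall back to a guess
  if (measuredHeights.length === 0) {
    return rowCount * fallbackHeight;
  }

  const sum = measuredHeights.reduce((total, height) => total + height, 0);
  const averageHeight = sum / measuredHeights.length;

  return averageHeight * rowCount;
}

console.log(estimateTotalHeight([40, 60, 50], 1000, 40)); // 50000
console.log(estimateTotalHeight([], 1000, 40)); // 40000
```

The estimate then drives the padding rows in the same way the fixed row height did earlier.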

The most accurate result can be gained by measuring each row, but that in turn would eat our performance. It would most likely be possible to implement more sophisticated measuring to gather the data in the background, but so far a rough approximation such as the one above has proven to be enough.

Conclusion

Even though virtualization doesn’t feel like a hard problem on paper, it can be difficult to handle when you get to the browser. This is particularly true if you loosen the constraints and allow arbitrary row heights. You can turn a relatively simple problem into a much harder one this way.

From the user's point of view not having to specify height can be nice even if it’s trickier to implement.

The greatest advantage of virtualization is that it cuts down the amount of rendering substantially.

Often viewports are somewhat limited. Instead of thousands of rows, you will end up rendering tens of rows in the worst case. That is a massive difference even if you skip performing shouldComponentUpdate per row. The need for that simply disappears.

You can see this particular approach in action in my table component, Reactabular. See react-virtualized for another take on the topic.
