High Performance Browser Applications

Article status: Work in progress. Still gathering and formulating the raw thoughts I've accrued since mid-2018.

There is a certain class of browser applications for which high performance is a primary concern. Oftentimes these are applications that make heavy use of data visualization, geospatial visualization, media processing, or other resource intensive tasks. They're not particularly concerned with bundle size or power consumption or network usage because they're intended for desktop users who prioritize efficiency and scale. Not only that, loading large data payloads is just a fact of life if you want to explore and analyze tens of millions of rows of data in a browser. Examples of these apps might be a dashboard for exploring all the Uber trips in a major city over time, or Figma, which runs a full-on professional design tool in the browser.

These types of applications make some very different trade-offs than your typical web application, and they require some special considerations and techniques. For example, just because loading 10mb of data is acceptable in your scenario doesn't mean you can send 10mb of JSON and treat it the same way you would a 15kb chunk. Passing more than 2 or 3 megabytes of data to JSON.parse() on the main thread is enough to spike your browser's memory and cause significant jank, and substantially larger payloads can lock up or even crash the tab.
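One common mitigation is to keep both the download and the parse off the main thread entirely by doing them inside a Web Worker. The sketch below is a minimal illustration of that pattern; the file names, the bundler-style worker URL, and the /data/trips.json endpoint are all hypothetical placeholders.

```ts
// parse-worker.ts — runs inside a dedicated Web Worker, off the main thread.
// Doing the fetch and the JSON.parse here keeps the UI responsive; only the
// finished result crosses back to the page via postMessage.
self.onmessage = async (event: MessageEvent<string>) => {
  const url = event.data;
  const response = await fetch(url);
  const text = await response.text(); // large download, off the main thread
  const rows = JSON.parse(text);      // the expensive parse, also off the main thread
  self.postMessage({ rowCount: rows.length, rows });
};
```

```ts
// main.ts — the application thread just asks the worker for the data.
const worker = new Worker(new URL("./parse-worker.ts", import.meta.url), {
  type: "module",
});

worker.onmessage = (event: MessageEvent<{ rowCount: number; rows: unknown[] }>) => {
  console.log(`Parsed ${event.data.rowCount} rows without blocking the UI`);
};

worker.postMessage("/data/trips.json"); // hypothetical endpoint
```

Even then, postMessage copies the parsed result via structured clone, so for truly large datasets it's usually better to send back typed arrays or ArrayBuffers, which can be transferred to the page without a copy.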

This article aims to document and discuss the special considerations and techniques required when writing applications like the ones described above. I’ve long struggled to describe this type of application and distinguish it from “normal” browser applications. For now, the best term I have come up with is high performance browser applications.

WebAssembly is the future

Before addressing the current state of the art we should acknowledge that, over time, the use of WebAssembly in performance critical applications will increase.

The introduction of WebAssembly represents an enormous, fundamental change in what can be done in a browser. In addition to the potential for order-of-magnitude performance improvements, it also opens the door to using countless existing libraries right in the browser: codecs and algorithms written and painstakingly tuned for a myriad of different industries and use cases, all (theoretically) portable to the most ubiquitous and accessible runtime container in existence.

That said, its massive power is thanks to its specialized, performance-focused runtime. That runtime doesn't run JavaScript, so using WebAssembly is a much bigger undertaking than just installing a library from npm or writing your code in a certain way. (There is AssemblyScript, which "compiles a strict subset of TypeScript to WebAssembly", but that's still a big leap that puts the JavaScript ecosystem mostly out of reach.)
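For a sense of what that undertaking looks like from the JavaScript side, here is a minimal sketch of loading and calling a compiled module. The heavy-math.wasm file and its sumOfSquares export are hypothetical; the compiled binary itself has to come from some other toolchain entirely (C, C++, Rust, AssemblyScript, and so on).

```ts
// Load a compiled WebAssembly module and call one of its exports.
// Assumes this runs in an ES module, where top-level await is allowed.
const { instance } = await WebAssembly.instantiateStreaming(
  fetch("/heavy-math.wasm"), // hypothetical module built by another toolchain
  {} // imports the module expects from the host, if any
);

// Exported functions show up as plain callables on instance.exports.
const sumOfSquares = instance.exports.sumOfSquares as (n: number) => number;
console.log(sumOfSquares(1_000_000)); // plain numbers cross the JS/Wasm boundary cheaply
```

Numbers move across that boundary easily; strings, arrays, and objects have to be marshalled through the module's linear memory, which is a big part of why this is a much bigger lift than pulling a package from npm.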

To be fair, some companies like Autodesk, Figma, and Google are already using WebAssembly. It exists in the present, but it's far from mainstream. As it stands today WebAssembly is probably a bridge too far for most teams. I can't wait to use it, but it's not quite general purpose yet.

What do we mean by High Performance?

With that qualification out of the way, let's take a moment to provide some context for terminology. High performance computing (HPC) is the term for the part of the tech industry where workloads just this side of the theoretical run up against the realities of modern hardware and software architecture.

Specialized supercomputers aside, companies like Intel, AMD, and Nvidia have been leading the way while chasing Moore's Law for decades at this point. Hardware from these companies and their contemporaries is limited only by the capabilities of their manufacturing processes, which are undoubtedly some of the most advanced in the world. In fact, hardware designs are increasingly running up against challenges posed by the laws of physics themselves: transistors can now be made so small and packed so closely together that keeping their electromagnetic fields from interfering with one another is a primary limiting factor.

Quantum computers and other attempts to push past the limits of electrons traveling through wires are still confined to elite laboratories and R&D projects.

Today, the most practical and common way to maximize computing power is clustering. Whether it's CPUs, GPUs, cores, or entire servers, the concept of horizontal scaling is pretty straightforward: more hardware, bigger hardware, and more deployments all provide more computing power, more or less by definition.

Of course, clustering only works if the tools are designed to take advantage of the cluster's horizontal scale. If the work cannot be parallelized, or broken up, across the cluster, then the cluster provides no benefit. The power comes from being able to do lots of small jobs simultaneously instead of one huge job or a few really big ones. As it turns out, one of the primary reasons GPUs are so fast these days is that they're massively parallel.

Massively parallel you say?

Yep. I don't fully grok the hardware architecture of GPUs, but this article has some great visual aids and metaphors. The idea of many, many small pipes versus one or a few big ones especially rings true once you've written your first pixel shader.[^pixelshader]

At some point, people realized this massively parallel model could be used not just to push bigger, brighter, and more realistic displays, but also to perform resource intensive tasks at a level of performance previously not achievable without supercomputers. General purpose GPU programming, or GPGPU, is sort of the umbrella term for this movement. GPUs and GPU clusters are now standard issue for ML, data science, and other workloads that far exceed the capabilities of your typical MacBook Pro.

High Performance JavaScript Applications

Personally, I've been using GPU programming and other techniques to do data visualization and geospatial visualization at a scale that's out of reach with DOM-based techniques. I spent many years mastering D3 and even wrote a book about it, but it wasn't intended for very large datasets. With D3's most common output format, SVG, your practical element limit is somewhere around 2,000.

Contrast this with a technology like WebGL, which fully utilizes the GPU and can handle millions of points out of the box, with a fraction of the memory footprint and very little work left on the main thread that's running the rest of your app.
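To make that concrete, here is a bare-bones sketch of the WebGL path: a million points packed into a single typed array, uploaded to the GPU in one buffer, and drawn with one draw call. The #chart canvas id and the random data are placeholders, and shader compile/link error checking is omitted for brevity.

```ts
// Draw one million points with WebGL in a single draw call.
const canvas = document.querySelector<HTMLCanvasElement>("#chart")!;
const gl = canvas.getContext("webgl")!;

// Minimal shaders: position each point in clip space, paint it a flat color.
const vertexSource = `
  attribute vec2 position;
  void main() {
    gl_Position = vec4(position, 0.0, 1.0);
    gl_PointSize = 2.0;
  }
`;
const fragmentSource = `
  precision mediump float;
  void main() {
    gl_FragColor = vec4(0.1, 0.5, 0.9, 1.0);
  }
`;

function compile(type: number, source: string): WebGLShader {
  const shader = gl.createShader(type)!;
  gl.shaderSource(shader, source);
  gl.compileShader(shader); // error checking omitted for brevity
  return shader;
}

const program = gl.createProgram()!;
gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSource));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSource));
gl.linkProgram(program);
gl.useProgram(program);

// The entire "dataset": one flat typed array of (x, y) pairs in clip space.
const pointCount = 1_000_000;
const positions = new Float32Array(pointCount * 2);
for (let i = 0; i < positions.length; i++) {
  positions[i] = Math.random() * 2 - 1; // random placeholder data
}

// Upload once to GPU memory and wire it to the shader's position attribute.
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);

const positionLocation = gl.getAttribLocation(program, "position");
gl.enableVertexAttribArray(positionLocation);
gl.vertexAttribPointer(positionLocation, 2, gl.FLOAT, false, 0, 0);

gl.clearColor(0, 0, 0, 1);
gl.clear(gl.COLOR_BUFFER_BIT);
gl.drawArrays(gl.POINTS, 0, pointCount); // all million points, one call
```

The whole dataset here is about 8 MB of flat, typed memory (one million points times two 32-bit floats) and a single draw call, versus a million DOM nodes for the browser to build, style, and lay out.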

[^pixelshader]: As the name implies, a pixel shader is run for each pixel in the drawing area, every frame. That means drawing a 3840 x 2160 4K scene at 60 FPS will execute your pixel shader 497,664,000 times every second. I don't care what kind of hardware you're on, doing that many operations one after another in under one second would be nearly impossible. It has to be massively parallelized, and that's where GPUs shine, because that's exactly what they were built for. GPUs have been architected and perfected for decades now to maximize how many millions of operations can be performed at the same time.