Log and power scales

As we saw in D3 in 5 Days, scales are a fundamental part of D3. For the sake of simplicity we focused on linear scales, but the rabbit hole goes much deeper. Linear scales belong to a subset of scales called continuous scales. Per the documentation:
Continuous scales map a continuous, quantitative input domain to a continuous output range.
Umm… OK. What does that mean?
In this case continuous just means not pre-defined: an input can be any number within the domain rather than one of a fixed set of values. Even though we might define a scale's domain or range with concrete bounds like [0, 100], thanks to floating point math and decimals the number of possible values within those bounds is essentially infinite.
The documentation also tells us "a continuous scale is not constructed directly; instead, try a linear, power, log, identity, time or sequential color scale". We've already covered linear scales so let's look at the next two types.
Power scales
Let's once again start with the documentation.
Power scales are similar to linear scales, except an exponential transform is applied to the input domain value before the output range value is computed.
OK that's not too hard to parse. Given an input value, a power scale will raise that value to a specified exponent, or power, before translating the result to an output value.
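const powScale = d3.scalePow()
  .exponent(2)
  .domain([0, 10]) // domain and range assumed for illustration
  .range([0, 100]);

powScale(2); // 4
powScale(3); // 9
powScale(5); // 25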
The key in that example is .exponent(2). If you were to omit that line, or change it to its default value of .exponent(1), you'd have a regular linear scale.
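To make the exponent's role concrete, here is a rough hand-rolled sketch of the math a power scale performs. This is not D3's implementation: makePowScale is a hypothetical helper, and the domain and range values are assumptions picked for illustration.

```javascript
// Hand-rolled sketch of a power scale; not D3's implementation.
// makePowScale is a hypothetical helper and assumes non-negative inputs.
function makePowScale(exponent, [d0, d1], [r0, r1]) {
  const pow = (x) => Math.pow(x, exponent);
  return (x) => {
    // Position of x within the transformed domain, from 0 to 1.
    const t = (pow(x) - pow(d0)) / (pow(d1) - pow(d0));
    // Linear interpolation into the output range.
    return r0 + t * (r1 - r0);
  };
}

const squared = makePowScale(2, [0, 10], [0, 100]);
const linear = makePowScale(1, [0, 10], [0, 100]);

console.log(squared(5)); // 25
console.log(linear(5)); // 50
```

With an exponent of 1 the transform does nothing, which is why the second scale behaves like a plain linear scale.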
In the real world power scales are pretty rare, save one particular use case. When you create a bubble chart, which is a scatter plot whose circles are sized by a data value, you'll often use a power scale to determine the radius of your circles. This is because the area of a circle does not scale proportionally to its radius.
This is due to the fact that a circle's area is calculated as πr², or in code Math.PI * (radius * radius). That means a circle with a radius of 5 has an area of 78.54, while a radius of 10 creates an area of 314.16.
So using the data to set the radius directly would mean that scaling your data by a factor of two results in the visual representation scaling by a factor of four. That's no good. Can you guess what might help? 😆
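If you'd like to check the arithmetic yourself, a few lines of plain JavaScript will do. The circleArea helper is just a name made up for this sketch.

```javascript
// circleArea is a hypothetical helper for this sketch.
const circleArea = (radius) => Math.PI * radius * radius;

const small = circleArea(5);
const large = circleArea(10);

console.log(small.toFixed(2)); // "78.54"
console.log(large.toFixed(2)); // "314.16"

// Doubling the radius quadruples the area.
console.log(large / small); // 4
```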
Accurately sizing circles
Since the area formula squares the radius, you actually want to get the square root of your input value before transforming it. In terms of exponents, to get the square root of a number you raise it to the power of 0.5, so the code to create your scale would look like d3.scalePow().exponent(0.5).
However, this is so common that D3 actually provides a convenience function. d3.scaleSqrt() is a shorthand for d3.scalePow().exponent(0.5), and generally what you'll encounter in examples in the wild.
With all the details out of the way let's see what this looks like in practice. Given the data points [5, 10, 15, 20, 25] you get the following output depending on the kind of scale you use.
As you can see the difference is quite pronounced. The code that produced this image is available here if you'd like to experiment with it.
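For a numeric view of the same comparison, here is a small sketch that computes the radii by hand instead of using D3. The maxValue and maxRadius values are assumptions for illustration, not the settings used in the linked example.

```javascript
// Illustrative only: radii computed by hand rather than with D3.
// maxValue and maxRadius are assumptions, not values from the linked example.
const data = [5, 10, 15, 20, 25];
const maxValue = 25;
const maxRadius = 50;

// Linear: radius proportional to the value.
const linearRadius = (v) => (v / maxValue) * maxRadius;
// Square root: radius proportional to the square root of the value.
const sqrtRadius = (v) => Math.sqrt(v / maxValue) * maxRadius;

const area = (r) => Math.PI * r * r;

for (const v of data) {
  // Each circle's area as a share of the largest circle's area.
  const linearShare = area(linearRadius(v)) / area(linearRadius(maxValue));
  const sqrtShare = area(sqrtRadius(v)) / area(sqrtRadius(maxValue));
  console.log(v, linearShare.toFixed(2), sqrtShare.toFixed(2));
}
```

The value 5 is one fifth of the maximum. With the linear radius its circle covers only about 4% of the largest circle's area, while the square root radius gives it about 20%, which matches the data.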
Log scales
Whereas power scales apply an exponential transform, log scales apply a logarithmic transform to input values before computing the output value. It's been approximately twelve centuries since I took a math class so I'll just grab the details directly from the docs for this one:
The mapping to the range value y can be expressed as a function of the domain value x:
y = m * log(x) + b
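One way to read that formula: m and b are whatever constants make the ends of the domain land on the ends of the range. The sketch below solves for them by hand. makeLogScale is a hypothetical helper, not part of D3, and the domain and range are made-up values.

```javascript
// Hypothetical helper: solves y = m * log(x) + b for m and b so that the
// domain endpoints land on the range endpoints. Assumes a positive domain.
function makeLogScale([d0, d1], [r0, r1]) {
  const m = (r1 - r0) / (Math.log(d1) - Math.log(d0));
  const b = r0 - m * Math.log(d0);
  return (x) => m * Math.log(x) + b;
}

const logScale = makeLogScale([1, 1000], [0, 300]);

// Rounded to hide floating point noise.
console.log(Math.round(logScale(1))); // 0
console.log(Math.round(logScale(10))); // 100
console.log(Math.round(logScale(100))); // 200
console.log(Math.round(logScale(1000))); // 300
```

Each tenfold jump in the input moves the output by the same amount. The base of the logarithm doesn't matter here, since changing it only rescales m.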
In practical terms, log scales come in handy when some of the items in your data are exponentially larger than other items. For example, imagine you were encoding bank balances as bar widths. If the minimum balance in your dataset is $10 and the maximum is $100 million, a linear scale just isn't going to cut it. Compare the output of a linear scale and a log scale when mapping these values to an output range of [10, 600].
const data = [
  10,
  100,
  10000,
  1000000, // 1 million
  100000000 // 100 million
];
The difference is quite significant once again. Making any meaningful comparisons is virtually impossible using the linear scale. The bar representing 1 million isn't even twice the size of the bar for 10! You'd obviously need corresponding axis labels but the log scale does a much better job of enabling visual examination of the data.
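If you want to see the numbers behind those bars, this sketch computes the widths with plain math rather than D3. linearWidth and logWidth are hypothetical helpers, and the domain is assumed to run from the smallest balance to the largest.

```javascript
// Plain-math sketch; linearWidth and logWidth are hypothetical helpers.
// Assumes the domain runs from the smallest value to the largest.
const data = [10, 100, 10000, 1000000, 100000000];
const [min, max] = [10, 100000000];
const [rangeMin, rangeMax] = [10, 600];

// Linear: position is the value's share of the raw domain.
const linearWidth = (v) =>
  rangeMin + ((v - min) / (max - min)) * (rangeMax - rangeMin);

// Log: position is the value's share of the domain measured in powers of ten.
const logWidth = (v) =>
  rangeMin +
  ((Math.log10(v) - Math.log10(min)) / (Math.log10(max) - Math.log10(min))) *
    (rangeMax - rangeMin);

for (const v of data) {
  console.log(v, linearWidth(v).toFixed(1), logWidth(v).toFixed(1));
}
```

For the 1 million balance the linear width comes out at roughly 15.9 pixels against 10 for the smallest bar, while the log width is about 431.4.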
You can of course see and experiment with the code for this example too.
Until next time!