D3.js is a great and in my opinion very elegant library for building visualizations in a web
browser. Unfortunately some concepts are not that easy to understand from its documentation, especially when it comes to
animathing graphs. I made that experience myself and when I tried to teach these concepts to my students using the
official resources.
I am sure that ObservableHQ is a great way to tinker and learn more about
certain features, but its nature of being inspired by a spreadsheet that automatically updates makes it hard to
understand how these features have to be combined in a real world application.
Therefore I am going to try to explain my (and maybe also other's) troubles when learning D3.js by using a quite simple
example. This won't cover many different features of D3.js, but it will explain the parts I feel are the hardest to
understand. This guide adds some pieces to the introduction available on the D3.js homepage that
would have made learning D3.js easier for me.
The below examples assume that the d3
variable and a <svg>
tag is in scope. This can be done by e.g. using a script
tag as described in the D3.js documentation.
<svg />
<script src="https://d3js.org/d3.v5.js"></script>
Building a simple graph
Basically D3.js allows you to generate different markup (e.g. HTML or SVG) based on data. Let's have a look at an
example, which will visualize the numbers of an array as differently sized circles.
const data = [10, 30, 20];
d3.select('svg').selectAll('circle')
.data(data)
.join(
(enter) => enter
.append('circle')
.style('fill', 'red')
.attr('r', (d) => d)
.attr('cx', (d, i) => (i + 1) * 50)
.attr('cy', 50)
)
The data
variables holds the data we want to visualize. In order to do so, we need the
d3.select
function, which returns a
D3.js selection. Such a selection acts as a wrapper for the DOM, which focuses on
mass manipulation using data. The d3.select
function returns a selection with the first element matching its passed
CSS selector, therefore the only <svg>
tag in our HTML document. D3.js makes use of a
fluent interface enabling us to use the just returned
selection and make another function call selectAll
. This will return
all cirlce
elements being a descendant of the
first <svg>
tag in our document. This feels a bit weird, because this results in an empty selection, but we'll get to
that in a second.
So using the d3.select('svg').selectAll('circle')
call, we have an empty selection, because no circle exists within
the <svg>
tag yet. Now we can assign our data
variable using the
data
function. This will give us a data selection knowing which
elements we have to add to our visualization. And this is where it starts to get interesting.
Such a data selection offers a join
method, taking up to three
arguments. In the above example only one of the arguments has been used, being a function taking the enter
selection.
This selection holds all the values from our data set (10
, 30
and 20
) currently not being tied to a SVG circle
element. Since no elements exist yet, it's our job to create them. This is what the
append
method does. We called it with the 'circle'
argument,
which means there will be three circle
elements once the call is finished, one for every missing data point.
Afterwards the style
and
attr
calls are used on these circles, in order to style them in
some way. Checkout the circle
documenation on MDN
in order to learn about the available properties. These functions are not only taking simple values; they also accept
functions as arguments. If a function gets passed, this function will be called with the value (the d
variable) and
the index (the i
variable) of the current data point. This way we can set the radius r
to the passed value, and
move the circles to the right using the index to calculate the cx
attribute.
With these few lines of code we have already created three red circles with a radius of 10, 30 and 20 pixels.
Updating a current selection
Until now we only have a static representation of three values as circles. That's not really interesting so far; we
could have simply written a few lines of SVG ourselves to achieve the same result. The popularity of D3.js comes from
its more advanced features. Let's see how we can change the data when a new data set arrives. At the beginning I was
wondering how this works, since it was not evident from the documentation. Basically the trick is to call the exact
same code again with a different data set. D3.js will then find out which elements have already existed, which ones
have to be added, and if any of the elements have to be removed. The only thing we have to do (besides calling that
code once more) is to tell D3.js how it should handle these different sets of elements.
We do that by passing more function parameters to the join
method. The third parameter handles the elements that are
about to be removed, and the second one handles the elements that have already existed before based on the selectAll
call, which explains why it has been there in the above code from the beginning. We will put the entire D3.js code from
above into a separate function, which will be called again when new data arrives. So the selectAll
will only return an
empty set on its very first call, but the consecutive calls of it will return the old circle
elements.
The following code shows that concept and assume that you have a <button>
tag somewhere in your HTML, that will load
the next data set when it is clicked (no out-of-bounds check for brevity):
const data = [
[10, 30, 20, 40],
[20, 20, 20],
[40, 10, 20, 10, 20],
[20, 10, 20],
];
let currentIndex = 0;
document.getElementsByTagName('button')[0].addEventListener(
'click',
() => update(++currentIndex)
);
function update(index) {
d3.select('svg').selectAll('circle')
.data(data[index])
.join(
(enter) => enter.append('circle'),
(update) => update,
(exit) => exit.remove(),
)
.style('fill', 'red')
.attr('r', (d) => d)
.attr('cx', (d, i) => (i + 1) * 50)
.attr('cy', 50);
}
update(0);
The data
variable now contains an array of arrays, whereby the inner array is interpreted as a single data set. We
have a index variable that starts with 0
, and each click on the first button of the document will call the update
function and increase the value of the index. The update
function contains the D3.js specific code, but also uses the
index
parameter to access the current data set. As explained before the join
method is used with 3 callbacks:
- The
enter
callback is called for data points not being currently represented in the visualization. In here theappend
method is used to add acircle
SVG element for each new data point. - The
update
callback is called for data points that were already represented in the visualization. At the moment we don't do anything special with these elements, so we are just returning that set. - The
exit
callback is called for elements in the visualization, that do not have a data point in the new data set anymore. This example uses theremove
method to delete these elements immediately.
In case you are familar with React, you can think of D3.js doing a similar job: You just tell
it what should happen on each case, and D3.js figures out for you when to call which function.
Since except for appending new circles for the enter
set we often want to handle the enter
and update
set in the
same way, the join
method will merge these two sets and return the combination of both. So the return value of the
join
method can be used to set styles and other attributes to both of these sets (we could also have added the
subsequent method calls to directly in the enter
and update
callback, which is usually done when the handling
between those sets differs). That part of the code has just been copied from the first example.
Add transitions for smoother animations
The previous example changed the visualization on every button click, but it still does not really feel like a
sophisticated visualization. D3.js also comes with support for transitions, which will make the animations a lot
smoother. The most important method to achieve this is called
transition
, and returns an object similar to a D3.js
selection. After calling transition
you can use the attr
and style
methods as described previously, but instead
of being immediately applied, these changes will be animated. In order to control the speed of the animation, this
selection-like object also contains the duration
method.
For delaying the animation the delay
method is available.
Both of these functions take a parameter being interpreted as milliseconds.
There is a very important side note: A transition object, despite its similarity, is not a D3.js selection. If a
transition instead of a selection is passed, D3.js might throw errors. For this reason the
call
method exists. The call
method will execute the passed
function, but instead of returning the value the passed function returns, it will always return the selection on which
the call
method was executed. This way the method chain can be broken into pieces, without creating separate
variables bloating our code. See the following example, which just shows a new body for the update
function above:
d3.select('svg').selectAll('circle')
.data(data[index])
.join(
(enter) => enter.append('circle')
.attr('r', 0)
.attr('cy', 50),
(update) => update,
(exit) => exit.call((exit) => exit
.transition()
.duration(500)
.attr('r', 0)
.remove()
)
)
.transition()
.duration(500)
.attr('r', (d) => d)
.attr('cx', (d, i) => (i + 1) * 50)
This is almost the same code as above, but three crucial changes were made in order to animate the transitions between
the data sets:
-
The
transition
method is called after thejoin
method. This way we tell the merge of theenter
andupdate
set of thejoin
method that the subsequentattr
calls should be animated, with a duration of 500 ms. -
The
enter
set of thejoin
method gets ar
andcy
value immediately. These values are the start values before the animation. Setting thecy
immediately and leave it untouched will cause the circles to move only on the x-axis. -
The
exit
set of thejoin
method uses also thetransition
andduration
calls. Theattr
method is called again after thetransition
method, which will make the cirlce shrink until it is not visible anymore. Theremove
of the transition will remove the element after the animation has finished (in contrast to theremove
method of the standard selection, which would remove the element immediately). Although not necessary for theexit
set (because it will not be further manipulated), thecall
method has been used, in order to return a proper selection and not a transition from the third callback.
Identify data points by using keys
This already makes a pretty slick visualization! But another important piece is missing: The circles currently do not
have any identity. That means if more numbers than previously have been passed the difference is being passed to the
enter
set, and if less number are passed the difference is passed to the exit
set. But in many cases this is not
enough. Imagine the visualization of an election: There might be two new parties and one party doesn't exist anymore,
so there should be some elements in the enter
and in the exit
set. In order to make this possible, we have to be
able to identify the data somehow. And this is why the key function in the data
method has been introduced.
We are going to use the above code to show the result of different elections. I know that a bar diagram would be the
better approach to this visualization, but we are going to use the circles again, so please bear with me. In order to
make the circle identifiable by D3.js, and also allow it to move the existing circles instead of just making them
disappear and appear again, we are going to make use of the key
function passed as second parameter to the data
method. We are also not using plain numbers as data anymore, but objects consisting of a fill
and a value
property.
The value
replaces the previous number, and the fill
property describes the color of the circle and acts as the
identifier of the circle. Let's have a look at the code:
const data = [
[
{fill: 'green', value: 30},
{fill: 'red', value: 18},
{fill: 'blue', value: 9},
],
[
{fill: 'black', value: 35},
{fill: 'red', value: 27},
{fill: 'green', value: 18},
{fill: 'pink', value: 3},
],
[
{fill: 'red', value: 30},
{fill: 'green', value: 15},
{fill: 'black', value: 14},
{fill: 'pink', value: 6},
],
];
let currentIndex = 0;
document.getElementsByTagName('button')[0].addEventListener(
'click',
() => update(++currentIndex)
);
function update(index) {
d3.select('svg').selectAll('circle')
.data(data[index], (d) => d.fill)
.join(
(enter) => enter.append('circle')
.style('fill', (d) => d.fill)
.attr('r', 0)
.attr('cy', 50),
(update) => update,
(exit) => exit.call((exit) => exit
.transition()
.duration(500)
.attr('r', 0)
.remove()
)
)
.transition()
.duration(500)
.attr('r', (d) => d.value)
.attr('cx', (d, i) => (i + 1) * 50)
}
update(0);
It again looks very similar to what we've had before, except for a few minor differences I want to highlight:
-
The
data
variable has changed as described previously. It is an array of array, whereby the inner array contains objects with afill
color and avalue
. The result of an election might be represented in a similar way. -
The
data
function gets a second argument passed, which is called thekey
function. We return thefill
value of thedata
variable, which will be used as the identifier of the circle. -
In the
transition
after thejoin
call theattr
call forr
has to be adapted, because the data point is an object instead of a number now.
Now D3.js is able to correctly assign the circles to the enter
, update
and exit
sets. When clicking through the
different data points, the circles should now be keeping their color and move around. If this is not done correctly,
the circles will change their color instead of moving the correct location. That might easily defeat the purpose of
such a visualization, because it what happened to the data is not clear to the viewer this way.
Conclusion
Hopefully the above points help others to understand D3.js faster than I did. Finally I want to say that it might be
easier in many cases to use a simpler charting library, but these libraries are not that powerful, and you might hit
dead ends quite soon. Once you have understood D3.js, you are also pretty fast at developing simple bar and line
chart, while keeping the possibility to turn your chart into something much more sophisticated if required. And as an
additional bonus: Visualizing data is fun!
Top comments (0)