A couple of weeks ago at work we were faced with the question of how best to represent hierarchical structures, i.e., parent-child relationships. I had already been thinking about this for a while, and when a colleague approached me wanting to create such visualizations, we finally decided to invest some time in working out how best to implement this [1].

As a first step, we searched for common visualizations of such relationships and fairly quickly arrived at so-called ‘node-tree diagrams’ (or simply ‘tree diagrams’). Eventually, the trees should also allow filtering on certain characteristics, so that only the parts of the tree matching the user's filter are shown (we set this aside for the moment and focussed on the core of the problem).

Because we had worked with Tableau a couple of times recently, our first choice was Tableau rather than implementing tree diagrams in R or Python.

While dedicated packages/libraries in R (ggplot2, plotly) or Python (matplotlib, seaborn, ggplot, bokeh, gleam) [2] are widely used in the world of data science, Tableau (with almost no coding involved) is used more in the business world. At the same time, software developers form something of a third group, distinct from classical data scientists or business analysts, who also need to visualize data at some point in their work. In this domain, JavaScript libraries like Chart.js or D3.js are the better-known and more frequently used tools.

More recently, however, these groups have started to overlap. Software developers who want to dive into data science are picking up tools like R or Python, while data scientists are stepping outside their usual toolset and making more and more use of the JavaScript libraries mentioned above.

It was D3.js in particular that caught my attention, even though it would require some investment in learning the syntax. I saw a huge benefit in learning it in order to create incredibly beautiful and concise visualizations. In fact, on his homepage, Mike Bostock has gathered a huge set of example visualizations that give a fairly good impression of what is possible with D3.js and show that learning it is well worth the investment.

In this post I will try to describe as clearly as possible how to generate a tree diagram using the JavaScript library D3.js.

I came across this topic in the first place because, while diving into Tableau, we had discovered that it is possible to create node-tree diagrams there. In particular, Jeffrey Shaffer pioneered the generation of such node-tree diagrams in Tableau.
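To make the idea of parent-child relationships a bit more concrete before turning to D3.js itself, here is a minimal sketch in plain JavaScript of how a flat list of parent-child records could be turned into a nested tree. The records, the field names `id` and `parentId`, and the helper `buildTree` are made up for illustration and are not taken from our actual data. The nested `{ name, children }` shape is chosen because `d3.hierarchy` reads a `children` array by default; D3 also offers `d3.stratify` for converting flat records directly, so this is only one possible route.

```javascript
// Hypothetical flat parent-child records (illustrative only).
const records = [
  { id: "root", parentId: null },
  { id: "a", parentId: "root" },
  { id: "b", parentId: "root" },
  { id: "a1", parentId: "a" },
];

// Hypothetical helper: turn flat records into a nested { name, children } tree.
function buildTree(rows) {
  // First pass: create one node object per record, keyed by id.
  const nodes = new Map();
  for (const row of rows) {
    nodes.set(row.id, { name: row.id, children: [] });
  }

  // Second pass: attach every node to its parent; the node without a parent is the root.
  let root = null;
  for (const row of rows) {
    const node = nodes.get(row.id);
    if (row.parentId === null) {
      if (root !== null) throw new Error("more than one root");
      root = node;
    } else {
      const parent = nodes.get(row.parentId);
      if (!parent) throw new Error(`unknown parent: ${row.parentId}`);
      parent.children.push(node);
    }
  }
  if (root === null) throw new Error("no root found");
  return root;
}

const tree = buildTree(records);
console.log(JSON.stringify(tree, null, 2));
```

The two passes mean the records do not need to be sorted so that parents come before their children.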

  1. Note that the walkthrough below is independent of the actual context of the parent-child relationships.

  2. The packages/libraries listed here are just a small selection of those available in both languages.