Why I Built Cirrus.js


I’m a D3.js fan. I’ve had the opportunity to use it to build several charts libraries for data-oriented companies, including the Datameer dashboard, Plotly Micropolar, Boundary Firespray and Radio-Canada Moby. Many charts libraries are built on top of D3. Here is a list of more than 35.

Today at Planet OS we are launching the newest library – Cirrus.js. So why am I building this one?


One of the goals we have at Planet OS is to provide a single interface for the discovery of disparate datasets. Having the ability to effectively visualize a variety of data types from multiple domains is an essential capability for making sense of these data and requires a charts library that can scale and adapt as our platform matures.

Jerome Cukier wrote an excellent blog post (as always) to remind us why we love D3 and how we sometimes use it as a convenience without questioning our needs and evaluating alternatives. Given the visualization challenges we plan on tackling at Planet OS, it seemed like a nice opportunity to ask myself why I’m using D3 and what I want that is not trivial with D3.

Decoupling from the DOM

How do you use D3 with Canvas? The community is always finding creative ways to use D3. Here are just a few examples.

What about WebGL? VML (using Raphael, for IE7+)? There’s always a solution. But what I wanted was a simple abstraction, a multi-target renderer for SVG, Canvas and WebGL. An abstract rendering engine is just not in the scope of D3 core. So I studied:

But when I stepped back it appeared that my needs were very simple: just a common API to draw lines and basic shapes. So writing my own was just a few lines of code.

Then I wanted a pure HTML axis component (e.g. for easier text formatting, or for working on platforms without SVG support), and the D3 one needs SVG. It was easy, since my personal challenge with Moby was to make it work in pure HTML. Even the bubble charts are made of divs with rounded corners, and lines are divs with CSS rotations. The HTML axis component still needs the DOM, but it would be easy to use my simple multi-target renderer instead if needed.

But the thing I really wanted was to bring the architecture closer to a datavis pipeline.

A Datavis Pipeline

The best graphics libraries and frameworks (like ggplot and D3) are inspired by The Grammar of Graphics and other conceptual frameworks to formalize how to go from data space to pixel space. Here is a simplified (conveniently naive) version I’m using in Cirrus.js: data, scale, layout, attribute, component, renderer, interaction. I’m sharing this minimal charts library as an example of separation of concerns that I’m confident can scale well.

Data

Let’s start with data and leave aside considerations about collecting, hosting and loading, which I prefer to handle outside of the charts library. I like to use an array of objects format.

var data1 = [
    {name: 'Sensor1', values: [
        {x: 'Fri May 01 2015 13:00:00 GMT-0400 (EDT)', y: 1},
        {x: 'Sat May 02 2015 13:00:00 GMT-0400 (EDT)', y: 2},
        {x: 'Mon May 04 2015 13:00:00 GMT-0400 (EDT)', y: 3},
        {x: 'Tue May 05 2015 13:00:00 GMT-0400 (EDT)', y: 8}
    ]},
    {name: 'Sensor2', values: [
        {x: 'Fri May 01 2015 13:00:00 GMT-0400 (EDT)', y: 1},
        {x: 'Sat May 02 2015 13:00:00 GMT-0400 (EDT)', y: 2},
        {x: 'Mon May 04 2015 13:00:00 GMT-0400 (EDT)', y: 3},
        {x: 'Tue May 05 2015 13:00:00 GMT-0400 (EDT)', y: 8}
    ]}
];

Having x and y as keys already implies that the values will be displayed on the x and y axes. If you don’t like it, you can use other keys and let the chart know using keyX and keyY. I also like to have data transform functions to bring raw datasets to this common format. This will all be in the data.js file, where there’s only a data validator for now. Other functions would convert formats or data types, transform, compute aggregates, derive datasets, etc.
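To illustrate the kind of transform and validation that could live in data.js, here is a minimal sketch. The function names normalizeData and isValid are hypothetical, chosen for this example; they are not the actual Cirrus.js API, and only the keyX/keyY idea comes from the text above.

```javascript
// Hypothetical helper (an assumption, not Cirrus.js code): remap custom keys
// to the common {x, y} format described above.
function normalizeData(series, keyX, keyY) {
    return series.map(function (serie) {
        return {
            name: serie.name,
            values: serie.values.map(function (d) {
                return {x: d[keyX], y: d[keyY]};
            })
        };
    });
}

// Hypothetical minimal validator: every series needs a name and
// an array of values that all carry both x and y.
function isValid(series) {
    return Array.isArray(series) && series.every(function (serie) {
        return typeof serie.name === 'string' &&
            Array.isArray(serie.values) &&
            serie.values.every(function (d) {
                return d.x !== undefined && d.y !== undefined;
            });
    });
}

var raw = [
    {name: 'Sensor1', values: [
        {time: '2015-05-01', temp: 1},
        {time: '2015-05-02', temp: 2}
    ]}
];
var normalized = normalizeData(raw, 'time', 'temp');
```

Keeping the remapping outside the chart means everything downstream in the pipeline only ever sees one data shape.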

Scale

Once we have a dataset in a consistent format, we can define the scales we need. I see two uses for scales: transforming data (like projecting to log scale) and projecting to pixel space. Wrapping a module to handle all D3 scale types can be a bit tedious, especially when you want to automatically configure the right scale. I want my charts to handle time, numerical and categorical data. This work will go in the scale.js file.
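As a rough sketch of what "automatically configuring the right scale" could mean, the example below guesses a scale type from the values and builds a plain-JavaScript linear projection. Both inferScaleType and linearScale are hypothetical names, and the linear function is only a stand-in for what a D3 scale would normally provide. It also assumes date strings have already been converted to Date objects in the data step.

```javascript
// Hypothetical helper (an assumption): pick a scale type from the values
// of one dimension. Dates must already be Date objects.
function inferScaleType(values) {
    var allNumbers = values.every(function (v) {
        return typeof v === 'number' && isFinite(v);
    });
    if (allNumbers) { return 'linear'; }
    var allDates = values.every(function (v) {
        return v instanceof Date;
    });
    if (allDates) { return 'time'; }
    return 'ordinal';
}

// Stand-in for a linear scale: maps the data domain to a pixel range.
function linearScale(domain, range) {
    var d0 = domain[0], d1 = domain[1];
    var r0 = range[0], r1 = range[1];
    return function (v) {
        return r0 + (v - d0) / (d1 - d0) * (r1 - r0);
    };
}

var scaleY = linearScale([0, 8], [0, 400]);
var pixelY = scaleY(2);
```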

Layout

Layout is D3’s name for generators that calculate graphical space values from the data. This graphical space doesn’t necessarily have to be pixel space. It could be the index, some normalized values, etc. One thing I haven’t totally mastered in the D3 tools is how each layout is slightly different in how it handles data (see this excellent discussion). BTW, the best book about layouts IMO is D3.js in Action. I’m a bit lazy, so I rolled my own, the way I was doing it before I used D3. That way, I have a common layout for simple/stacked/percent bar charts and for simple/multiple/area line charts. I also extend the concept of layout to axes, legends and other components. All of this will go in layout.js. One important distinction I’m trying to make is that a layout is nothing without an attributes manager that will convert this graphical space to graphical attributes the way the renderer wants them.
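Here is a minimal sketch of a hand-rolled layout that serves both stacked and percent bar charts. The names stackLayout and percentLayout are hypothetical and this is not the layout.js code; it assumes all series have the same length, are aligned by index, and have positive totals.

```javascript
// Hypothetical stacked layout (an assumption): each value gets the running
// total below it (y0) and above it (y1), still in data space, not pixels.
function stackLayout(series) {
    var totals = []; // running total per x index
    return series.map(function (serie) {
        return {
            name: serie.name,
            values: serie.values.map(function (d, i) {
                var y0 = totals[i] || 0;
                totals[i] = y0 + d.y;
                return {x: d.x, y: d.y, y0: y0, y1: totals[i]};
            })
        };
    });
}

// Percent layout: the same stack, normalized by the column total so that
// the top of the last series is always 1.
function percentLayout(series) {
    var stacked = stackLayout(series);
    var last = stacked[stacked.length - 1];
    return stacked.map(function (serie) {
        return {
            name: serie.name,
            values: serie.values.map(function (d, i) {
                var total = last.values[i].y1;
                return {x: d.x, y: d.y, y0: d.y0 / total, y1: d.y1 / total};
            })
        };
    });
}

var series = [
    {name: 'Sensor1', values: [{x: 'a', y: 1}, {x: 'b', y: 2}]},
    {name: 'Sensor2', values: [{x: 'a', y: 3}, {x: 'b', y: 2}]}
];
var stacked = stackLayout(series);
var percent = percentLayout(series);
```

Note that the output stays in a normalized graphical space; converting it to pixels is left to the scales and the attributes step.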

Attributes

Graphical attributes (height/width, x/y, color, etc.) are represented here in pixel space, directly usable by the renderer. The file attribute.js has an attribute generator for each graphical element. Its responsibility is to resolve layouts according to the configuration and to the naming convention expected by the renderer. For example, the layout has the x and y coordinates of each point of a line chart, but the attributes generator takes care of converting this information to a path representing the area shapes.
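The line-to-area example can be sketched as follows. The generators linePath and areaPath are hypothetical names for illustration, not the functions in attribute.js; they assume the points are already in pixel space and produce standard SVG path data.

```javascript
// Hypothetical attribute generator (an assumption): turn pixel-space points
// into SVG path data for a line.
function linePath(points) {
    return points.map(function (p, i) {
        return (i === 0 ? 'M' : 'L') + p.x + ',' + p.y;
    }).join('');
}

// Same points, closed down to a baseline to form an area shape.
function areaPath(points, baselineY) {
    if (points.length === 0) { return ''; }
    var first = points[0];
    var last = points[points.length - 1];
    return linePath(points) +
        'L' + last.x + ',' + baselineY +
        'L' + first.x + ',' + baselineY + 'Z';
}

var points = [{x: 0, y: 10}, {x: 50, y: 5}];
var line = linePath(points);
var area = areaPath(points, 100);
```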

Components

Recently, some very nice projects, like Plottable.js and D4, have started treating chart building as assembling a set of components. I really like the approach. In my case, components in component.js are responsible for assembling these graphical parts from the generated attributes.

Renderer

The abstract renderer in renderer.js, very minimal for now, is an adapter to the rendering engines I want to use, abstracting SVG and Canvas under a common API. I had 2D WebGL at some point, but after benchmarking it, I decided I needed a better strategy to make the most out of it.
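To make the adapter idea concrete, here is a minimal sketch of two targets behind one API. The names svgRenderer, canvasRenderer and drawBar are hypothetical and this is not the renderer.js code. The SVG target simply collects markup strings, and the Canvas target expects an object with the standard 2D context methods (beginPath, moveTo, lineTo, stroke, fillRect).

```javascript
// Hypothetical SVG target (an assumption): appends markup strings to `out`.
function svgRenderer(out) {
    return {
        line: function (a) {
            out.push('<line x1="' + a.x1 + '" y1="' + a.y1 +
                '" x2="' + a.x2 + '" y2="' + a.y2 +
                '" stroke="' + a.stroke + '"/>');
        },
        rect: function (a) {
            out.push('<rect x="' + a.x + '" y="' + a.y +
                '" width="' + a.width + '" height="' + a.height +
                '" fill="' + a.fill + '"/>');
        }
    };
}

// Hypothetical Canvas target: draws on a 2D context (or any object
// exposing the same methods).
function canvasRenderer(ctx) {
    return {
        line: function (a) {
            ctx.strokeStyle = a.stroke;
            ctx.beginPath();
            ctx.moveTo(a.x1, a.y1);
            ctx.lineTo(a.x2, a.y2);
            ctx.stroke();
        },
        rect: function (a) {
            ctx.fillStyle = a.fill;
            ctx.fillRect(a.x, a.y, a.width, a.height);
        }
    };
}

// Chart code only talks to the common API, whatever the target is.
function drawBar(renderer) {
    renderer.rect({x: 0, y: 0, width: 10, height: 5, fill: 'red'});
    renderer.line({x1: 0, y1: 5, x2: 10, y2: 5, stroke: 'black'});
}
```

In a browser, the Canvas target would be created with something like canvasRenderer(canvas.getContext('2d')), and the same drawBar call would work unchanged.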

Interaction

I always found interaction a difficult piece to implement. D3 solves it elegantly by using “behaviors”. Conceptually, it would be really nice to have a “Grammar of Interaction”, so we can describe user interactions as easily as we can describe graphics. The interaction.js file has basic tooltips. Its responsibility is to bind to events and route them to the right component.
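The "bind and route" responsibility can be sketched as a tiny event router. The name createRouter and the tooltip object below are hypothetical, not the interaction.js code; in a real chart, route would be called from DOM event listeners.

```javascript
// Hypothetical event router (an assumption): components register for an
// event type, and incoming events are forwarded to the matching handlers.
function createRouter() {
    var routes = {}; // event type -> list of component handlers
    return {
        on: function (type, handler) {
            (routes[type] = routes[type] || []).push(handler);
            return this; // allow chaining
        },
        route: function (type, payload) {
            (routes[type] || []).forEach(function (handler) {
                handler(payload);
            });
        }
    };
}

// A stand-in tooltip component that just records what it would show.
var shown = [];
var tooltip = {
    show: function (d) { shown.push(d.label); }
};

var router = createRouter().on('hover', tooltip.show);
router.route('hover', {label: 'Sensor1: 8'});
router.route('click', {label: 'ignored'}); // no handler registered
```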

Core, Utils, Automation

Some files don’t fit well in the conceptual pipeline, but are needed to support the implementation. One example is automatic.js, where each config element marked with ‘auto’ is resolved (like inferring the size of the chart from the container, the number of ticks from label sizes, etc.). The utils file has functions to export to PNG, for example. One important file is core.js, which bootstraps the work and exposes the API. I like to have a single file showing the config object, external and internal, and the chart methods. This is also where the pipeline is assembled.
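As an illustration of resolving ‘auto’ values, here is a minimal sketch. The function resolveAuto and the config keys tickCount and labelWidth are hypothetical, not the automatic.js implementation; the container is assumed to expose clientWidth and clientHeight like a DOM element.

```javascript
// Hypothetical resolver (an assumption): replace every 'auto' value in the
// config with something inferred from the container or from other settings.
function resolveAuto(config, container) {
    var resolved = {};
    Object.keys(config).forEach(function (key) {
        resolved[key] = config[key];
    });
    if (resolved.width === 'auto') { resolved.width = container.clientWidth; }
    if (resolved.height === 'auto') { resolved.height = container.clientHeight; }
    if (resolved.tickCount === 'auto') {
        // As many ticks as labels fit side by side, but at least two.
        resolved.tickCount = Math.max(2, Math.floor(resolved.width / resolved.labelWidth));
    }
    return resolved;
}

var resolvedConfig = resolveAuto(
    {width: 'auto', height: 200, tickCount: 'auto', labelWidth: 50},
    {clientWidth: 400, clientHeight: 300}
);
```

Explicit values (like the height of 200 here) are left untouched, so ‘auto’ stays an opt-in default rather than an override.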

I like when the conceptual and technical metaphors are in phase. Here is a pipeline design pattern used with this datavis pipeline framework, taken from Firespray:

var pipeline = fy.utils.pipeline(
    fy.setupContainers,
    fy.setupScales,
    fy.setupBrush,
    fy.setupAxisY,
    fy.setupAxisX,
    fy.setupHovering,
    fy.setupStripes,
    fy.setupGeometries
);

Of course, it’s not as clean as the conceptual framework and you have to mess with states, internal wiring and so on. But having this pipeline at the core of the architecture (file names, namespace, patterns) makes it easy for me to grow this minimal core while keeping separation of concerns in mind.
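For readers unfamiliar with the pattern, a pipeline utility can be sketched in a few lines. This is a generic function-composition sketch under the assumption that each step takes a state object and returns it; it is not Firespray’s actual fy.utils.pipeline, and the two toy steps are made up for the example.

```javascript
// Generic pipeline sketch (an assumption, not Firespray's code): returns a
// function that threads a state object through each step in order.
function pipeline() {
    var steps = Array.prototype.slice.call(arguments);
    return function (initialState) {
        return steps.reduce(function (state, step) {
            return step(state);
        }, initialState);
    };
}

// Toy steps; real ones would set up containers, axes, brushes, etc.
function setupScales(state) {
    state.scaleY = function (v) {
        return state.height - v / state.maxY * state.height;
    };
    return state;
}

function setupGeometries(state) {
    state.points = state.values.map(function (v) {
        return state.scaleY(v);
    });
    return state;
}

var render = pipeline(setupScales, setupGeometries);
var result = render({height: 100, maxY: 8, values: [1, 2, 8]});
```

Because later steps depend on what earlier ones put in the state (here, setupGeometries needs scaleY), the order of the arguments documents the data flow of the chart.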

A Modular D3

The D3 ecosystem is amazing, with tons of plugins, development tools, examples, meetups, repos, documentation, galleries, blogs, books, Twitter news feeds, help forums and IRC. So what’s the next revolution in the D3 world? I don’t know, but we already have hints that it will come from more modularity. My personal hope is to have a better plugin ecosystem and more ownership of tiny parts of D3 core and tools, especially components, layouts and data transformers.

Working on Cirrus.js charts helped me to focus on the datavis pipeline architecture instead of simply “wrapping charts” or exploring the best design patterns to solve an architectural problem. I’m glad to open source it today, not to add another charts library to the already crowded list, but to share an example of my own way of implementing the datavis pipeline and, hopefully, to be able to extract some modules and participate in my humble way in this modular revolution.

So the Planet OS team is pleased to share Cirrus.js to fuel the discussion on implementing the datavis pipeline. Feel free to reach out to me through the usual channels, especially to let me know how you are reflecting the datavis pipeline in your own architecture.

Chris Viau

@d3visualization

Planet OS Data Visualization Engineer

