Static image of the final chart.  The full data is available below.

January 24, 2010

Popular car colors by country

I found a series of graphics showing car color popularity and rebuilt them in several stages. My goals were to increase readability, data density, and aesthetic appeal. I also created a more accessible web-based version.

The originals

Each year DuPont publishes reports on global car color popularity. Their report for 2009 contains ten graphics, each showing a bar chart for a different region. Each graphic shows either ten or eleven colored bars, drawn as cars, with text labels for the color and popularity.

  • Original image from DuPont for the world. The full data is available below.
  • Original image from DuPont for Brazil. The full data is available below.
  • Original image from DuPont for China. The full data is available below.
  • Original image from DuPont for Europe. The full data is available below.
  • Original image from DuPont for India. The full data is available below.
  • Original image from DuPont for Japan. The full data is available below.
  • Original image from DuPont for Mexico. The full data is available below.
  • Original image from DuPont for North America. The full data is available below.
  • Original image from DuPont for Russia. The full data is available below.
  • Original image from DuPont for South Korea. The full data is available below.

What’s right?

Even without reading the text, you could guess that these images are about car color popularity; in this way they succeed. In nearly every other respect this series of graphics is an ideal starting point for a lesson in what not to do.

What’s wrong?

The graphics are deceptive

This is the unforgivable sin of data graphics: to deceive the eye into perceiving incorrect information. Even if your graphic does not illuminate, at the very least it should do no harm.

No consistent scale is evident. For example, in the chart for world popularity, 8% red should be twice the width of 4% brown.

Small image showing that the bars for red and brown are of very similar size, even though red should be twice as wide.

The scale also changes between graphics: Brazil’s 33% silver bar is the same width as the world’s 25% silver bar.

Small image showing that Brazil's 33% bar and the world's 25% bar are the same width.
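Both problems come down to the lack of one shared scale. A minimal Python sketch of the fix, assuming a hypothetical `bar_width` helper and an arbitrary factor of 5 pixels per percentage point (neither comes from DuPont's graphics):

```python
def bar_width(percent, px_per_point=5.0):
    """Width in pixels of one bar.

    The same px_per_point must be used for every bar in every chart;
    the default of 5.0 is an arbitrary value chosen for this sketch.
    """
    return percent * px_per_point


# Within one chart: 8% red against 4% brown.
red, brown = bar_width(8), bar_width(4)
print(red / brown)  # ratio of the two widths

# Across charts: Brazil's 33% silver against the world's 25% silver.
brazil_silver, world_silver = bar_width(33), bar_width(25)
print(brazil_silver, world_silver)
```

With a single factor, width ratios equal data ratios, both inside one chart and between charts, so the two silver bars can no longer come out the same size.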

Also, the race car images greatly distort our perceptions. Take a quick look at the three images below. Of the two blue bar graphs, which looks more like the original in the center?

A recreation of one original image as a simple bar graph, assuming that the decorative cars are a part of the data.
Reduced version of original image.
A recreation of one original image as a simple bar graph, assuming that the decorative cars are not part of the data.

Even to me, moments after creating these images, the chart on the left most closely resembles the original. Yet we are deceived; the cars are not part of the data. Even taking into account the inconsistent scale, the blue chart on the right resembles the data much more closely.

Comparing countries is difficult

That 18% of new cars in Russia are green is trivia; that Russians appear to like green at least six times as much as buyers in any other country is interesting information. Often it is in comparison that data becomes interesting, and ease of comparison is a hallmark of good data visualization.
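The comparison can be checked against the figures in the data table later in this article. A small Python sketch; the helper name `lead_over_runner_up` is made up for illustration, and the aggregate World row is left out because it is not a country:

```python
# Green share (percent of new cars) per region, copied from the data table.
GREEN = {
    "Brazil": 2, "China": 1, "Europe": 2, "India": 1, "Japan": 1,
    "Mexico": 1, "N. America": 3, "Russia": 18, "S. Korea": 1,
}


def lead_over_runner_up(shares):
    """Return (leader, runner_up, ratio of their shares)."""
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    (top, top_value), (second, second_value) = ranked[0], ranked[1]
    return top, second, top_value / second_value


print(lead_over_runner_up(GREEN))  # → ('Russia', 'N. America', 6.0)
```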

In this case, comparisons are made difficult by the use of multiple graphics and the inconsistencies between those graphics. Using another graphical page to display another dimension of data can be very effective, but small multiples work best when the graphics are clear at small sizes and changes from one to the other are instantly visible. Neither is the case here.

Ironically, color is poorly used

If graphics about car color get one thing right, it should be color. Instead, each color is polluted by two gradients, one within the car and one in the car’s exhaust plume. These gradients are pointless and harmful, adding noise without aiding understanding.

There are many inconsistencies

For instance:

  • Numbers are sometimes presented as integers and sometimes as decimals.
  • Labels switch sides from left to right as the design runs out of room.
  • Colors are listed in different orders on each chart.

We should not worship consistency for its own sake, but if our audience is to make effective comparative judgements they must have an anchor point. Non-data changes should be minimized or eliminated so that changes in data are visible.

The source data is hard to get

Nowhere on DuPont’s site can we find a simple presentation of all the data in plain text. Providing the source data is the best insurance against total communication failure. If your graphic doesn’t speak as well as you’d like or is laden with too many of your own assumptions, at the very least your audience can read the source data and form their own conclusions.

The data

The first step to improvement is to get that data. Here it is, manually extracted from DuPont’s graphics.

Popular car colors by country, 2009
Region      Silver  Black  White  Gray  Blue  Red  Brown  Green  Yellow  Other  Orange
Brazil          33     25     10    14     3    9      3      2       1      1       -
China           36     23     12    10     6    9      -      1       2      1       1
Europe          27     20     10    18    10    6      4      2       1      2       -
India           26      6     23     5    11   16      5      1       6      1       -
Japan           23     23     28    10     8    3      3      1       1      1       -
Mexico          15     18     24    14    13   11      2      1       2      1       -
N. America      17     17     18    13    12   12      6      3       2      1       -
Russia          23     17      8     5    16   11      2     18       1      -       -
S. Korea        39     29     14     5     3    4      3      1       -      2       1
World           25     23     16    13     9    8      4      1       1      1       -

Angled lines mean Other

Already this is a vast improvement. All the data is presented at once so it’s easy to see relationships. It’s a more efficient use of space and no graphics are wasted. This is an excellent test for data graphics: if your graphic is less legible than a text table, consider a redesign.

But there are too many numbers here for quick comprehension. This is the perfect use case for a data graphic.
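Because the figures were transcribed by hand, it may be worth keeping them in a form a program can check before charting anything. A minimal Python sketch, with the values copied from the table above; the names `SHARES` and `row_total` are illustrative, and `None` stands for a "-" in the table:

```python
COLORS = ["Silver", "Black", "White", "Gray", "Blue", "Red",
          "Brown", "Green", "Yellow", "Other", "Orange"]

# Percent of new cars per color, in the column order of COLORS.
SHARES = {
    "Brazil":     [33, 25, 10, 14, 3, 9, 3, 2, 1, 1, None],
    "China":      [36, 23, 12, 10, 6, 9, None, 1, 2, 1, 1],
    "Europe":     [27, 20, 10, 18, 10, 6, 4, 2, 1, 2, None],
    "India":      [26, 6, 23, 5, 11, 16, 5, 1, 6, 1, None],
    "Japan":      [23, 23, 28, 10, 8, 3, 3, 1, 1, 1, None],
    "Mexico":     [15, 18, 24, 14, 13, 11, 2, 1, 2, 1, None],
    "N. America": [17, 17, 18, 13, 12, 12, 6, 3, 2, 1, None],
    "Russia":     [23, 17, 8, 5, 16, 11, 2, 18, 1, None, None],
    "S. Korea":   [39, 29, 14, 5, 3, 4, 3, 1, None, 2, 1],
    "World":      [25, 23, 16, 13, 9, 8, 4, 1, 1, 1, None],
}


def row_total(values):
    """Sum one row, skipping colors with no reported figure."""
    return sum(v for v in values if v is not None)


TOTALS = {region: row_total(values) for region, values in SHARES.items()}
for region, total in TOTALS.items():
    print(f"{region:<11}{total:>4}")
```

Every row comes to 100 or 101. A total one point over would be consistent with rounding in the source figures, though that is a guess; a row far from 100 would point to a transcription error.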

The refactor

It’s not difficult to improve one of these graphics. I built this to address my concerns with the display of a single country. It maintains as many elements from the original as possible, including the car shape and the oblique type (presumably used in the original because it looked speedy).

A refactor of the original world graphic, removing most of the clutter.  Each bar is represented by an outline of a car, but with no gradients or other noise.

Other improvements to this graphic are possible, like a horizontal scale or better typography. Creating a full series for all countries would raise additional problems, like a consistent scale across countries, but we have bigger fish to fry.

One new graphic to rule them all

Improving the presentation of a single country is interesting and worthwhile, but it doesn’t address a fundamental problem with the original: the difficulty of comparing countries. That can be solved by building one graphic for all of the data.

In creating a new representation the first challenge is scale. The numbers range from 1% to 39%. The display of 1% must be legible, but if it’s too big the display of 39% (at 39 times its size) will make the chart very large. If I include text labels, the size of the text “1%” will dictate the size of the rest of the chart. I’ll try first without text labels for the colors.
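The trade-off can be put in numbers. A rough Python sketch, assuming a hypothetical `scale_for` helper; the 20-pixel figure for a legible "1%" label is a guess made for illustration, not a measurement:

```python
def scale_for(min_px, smallest=1, largest=39, row_total=100):
    """Sizes that follow from making the smallest value min_px wide.

    smallest and largest are the extremes of the data (1% and 39%);
    row_total treats one country's row as 100 percentage points.
    """
    px_per_point = min_px / smallest
    return {
        "px_per_point": px_per_point,
        "largest_bar_px": largest * px_per_point,
        "row_px": row_total * px_per_point,
    }


# 1 px: the bare minimum. 4 px: a small block. 20 px: room for a "1%" label.
for min_px in (1, 4, 20):
    print(min_px, scale_for(min_px))
```

Under these assumptions, leaving room for a text label on the smallest bar pushes the 39% bar to 780 pixels and a full row to 2000, which is why dropping the labels is the first thing to try.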

I must include the country labels, though, and they are all much wider than they are tall; a graph that displays countries as rows will use space most efficiently. I’ll order the countries alphabetically and the colors by worldwide popularity, the same as the text table above.

A bar graph that shows all colors for all countries as colored blocks.

This is a bar graph made in Numbers. Any spreadsheet software should allow you to make a similar graph in just a few minutes once you have the data. This chart displays all the data in a fraction of the space. It nicely highlights both relationships and outliers, like India’s apparent affection for yellow.

Communicating with color

This is an improvement, but it has problems. It looks very noisy, in part due to the discontiguous river of white through the middle. The areas for “other” get lost at the edges. Red and brown and gray are of similar luminosity, and placing them so near each other makes for difficult reading.

Most important, it has no real voice. Ordering countries alphabetically and colors by popularity are both easy, obvious choices, but they don’t help the graphic communicate. Part of a designer’s job is interpretation. Can I find meaning in these numbers, meaning outside the default order, and help readers find it as well?

For me, the most interesting thing about this data is not the individual colors or countries, but that 100 years after Henry Ford’s famous adage the most popular colors are not colors at all, but shades of black. Is there a way to tell this story?

A bar graph that shows all colors for all countries as colored blocks.

I’ve used white to separate colors from grays and placed the least popular colors at the inside of the chart next to white, so they won’t get lost at the edges. I moved the most visible color, red, to the edge where it’s least distracting, and placed next to it the most popular saturated color: blue. Lastly, I ordered the countries by popularity of saturated colors.

This version is much improved. There are far fewer disconnected regions of color and thus less visual noise. Immediately it’s clear that the graphic tells a story: some countries like colorful cars better than others.
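The country ordering can be reproduced with a short sort. A Python sketch, assuming that "saturated colors" means every column other than white, black, gray, and silver; the figures are copied from the data table, the names are illustrative, and the World row is left out because it is an aggregate rather than a country:

```python
# Columns: red, blue, green, yellow, brown, other, orange.
# A "-" in the source table is entered as 0.
CHROMATIC = {
    "Brazil":     [9, 3, 2, 1, 3, 1, 0],
    "China":      [9, 6, 1, 2, 0, 1, 1],
    "Europe":     [6, 10, 2, 1, 4, 2, 0],
    "India":      [16, 11, 1, 6, 5, 1, 0],
    "Japan":      [3, 8, 1, 1, 3, 1, 0],
    "Mexico":     [11, 13, 1, 2, 2, 1, 0],
    "N. America": [12, 12, 3, 2, 6, 1, 0],
    "Russia":     [11, 16, 18, 1, 2, 0, 0],
    "S. Korea":   [4, 3, 1, 0, 3, 2, 1],
}


def by_colorfulness(shares):
    """Regions sorted from the largest chromatic share to the smallest."""
    return sorted(shares, key=lambda region: sum(shares[region]), reverse=True)


ORDER = by_colorfulness(CHROMATIC)
print(ORDER)
```

Under that assumption Russia comes first with 48% and South Korea last with 14%.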

How small is too small?

A tiny version of the previous image.

As an exercise, how small can this get before it’s useless? Shrinking the display of 1% to one pixel gives a chart 100 pixels wide.

This does feel very cramped, but look at how much information we still see! South Korea really likes silver and doesn’t care much for color; Russia likes colors, particularly green. We lose details, though, and small numbers are very hard to read.

The best of both worlds?

Now I have two refactors: a plain text table and a graphic chart. It should be possible to use CSS to combine these. The basics are simple enough: styling the table, tbody, tr, and td tags like blocks instead of table elements, and specifying a percentage width for each table cell in the markup.
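One way the markup for a single row might be generated is sketched below in Python. It is not the markup used on this page: the `chart` class name, the per-color class names, the stylesheet in `CHART_CSS`, and the choice to normalise widths by the row total are all assumptions made for illustration.

```python
from html import escape

# Column order and one row copied from the data; None marks a "-".
COLORS = ["Red", "Blue", "Green", "Yellow", "Brown", "Other", "Orange",
          "White", "Black", "Gray", "Silver"]
RUSSIA = [11, 16, 18, 1, 2, None, None, 8, 17, 5, 23]


def row_markup(region, values):
    """Build one <tr> whose cells carry percentage widths inline."""
    total = sum(v for v in values if v is not None)
    cells = []
    for color, value in zip(COLORS, values):
        if value is None:
            continue  # no figure reported, so no cell
        # Assumption: divide by the row total so the widths add up to
        # 100% even when the rounded figures sum to 101.
        width = 100 * value / total
        cells.append(
            f'<td class="{color.lower()}" style="width:{width:.2f}%">{value}</td>'
        )
    return f"<tr><th>{escape(region)}</th>{''.join(cells)}</tr>"


# Hypothetical stylesheet that turns the table into stacked bars.
CHART_CSS = """
table.chart, table.chart tbody, table.chart tr, table.chart th { display: block; }
table.chart tr { overflow: hidden; }  /* contain the floated cells */
table.chart td { float: left; }
"""

print(row_markup("Russia", RUSSIA))
```

Keeping the widths in the markup and the layout rules in a class means the same HTML can read as a plain table when the class is absent and as a chart when it is present.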

The tabs below change only one CSS class; the HTML content of the table remains the same.

Popular car colors by country, 2009
Region      Red  Blue  Green  Yellow  Brown  Other  Orange  White  Black  Gray  Silver
Russia       11    16     18       1      2      -       -      8     17     5      23
India        16    11      1       6      5      1       -     23      6     5      26
N. America   12    12      3       2      6      1       -     18     17    13      17
Mexico       11    13      1       2      2      1       -     24     18    14      15
Europe        6    10      2       1      4      2       -     10     20    18      27
China         9     6      1       2      -      1       1     12     23    10      36
Brazil        9     3      2       1      3      1       -     10     25    14      33
Japan         3     8      1       1      3      1       -     28     23    10      23
S. Korea      4     3      1       -      3      2       1     14     29     5      39
World         8     9      1       1      4      1       -     16     23    13      25

Angled lines mean Other

This makes me happy, but it can still be improved. What I like least is that, for a single color, the differences between countries are not always easy to compare (yellow, for example). Since this is the web we have an interactive canvas, so I’ve added hover effects. If you move your mouse over any color bar, all other bars of that color are highlighted. I’ve tried to make this as discoverable as possible by triggering part of it with the tabs; I don’t consider it an ideal solution and hope to revisit it in the future.

The point

This isn’t just a clever exercise. DuPont is a for-profit company. They paid someone to create these graphics and branded them with the DuPont logo; clearly they’re hoping for a return on their investment. If they created first-class data art, more publications would use it. Instead, the originals are so poor that some feel compelled to redraw them or selectively sample them in order to communicate with their readers. This is a shame and a waste.