Common Data Visualization Mistakes

A beginner's guide to avoiding mistakes and not being fooled by others' errors

Anastasia Komissarova
Analytics Vidhya

As an information designer, I try to take data visualization seriously, and I usually spend a lot of time finding a way to visualize the numbers that is accurate and best supports the story. Sometimes I still get it wrong. But the more I learn, and the more I analyze and compare my visualizations, the more I notice common pitfalls that are actually quite easy to avoid.
So in this article, I’d love to chat about the most common (and probably the most dangerous) data visualization mistakes.

I group these mistakes into two categories: functional and communication.

Functional mistakes are the worst, as they produce a misleading representation of the data and make it very hard or even impossible to get a fair insight.

Communication mistakes are not as dangerous, but they add complications that make a visualization confusing. In other words, it takes the viewer more time to get to the main idea of the project.

Let’s take a closer look at each of these types.

FUNCTIONAL ERRORS

Omitting the baseline or truncating the scale

The visualization in question illustrates the distribution of marine plastics contaminating the world's oceans. The point was to show the disparity between different sources of plastic. The designer probably thought that, since plastic pellets exceed the other sources by so much that we can’t even read the numbers for groups such as marine coatings or personal care products, there would be no harm in truncating the bar while keeping the absolute values on the Y-axis.
But this chart destroys the real proportions between the bars and visually exaggerates the size of some of them.

Be very careful when playing with the axis, whether you truncate it or omit the baseline, because doing so can destroy the accurate representation of the data.
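To put a number on this distortion, here is a minimal sketch that compares how long two bars look with how large the values really are. The function names and all the numbers are my own assumptions for illustration; they are not taken from the marine plastics chart.

```python
def drawn_length(value, baseline=0.0, cap=None):
    """Length of a bar as drawn: clipped at `cap`, measured from `baseline`."""
    top = value if cap is None else min(value, cap)
    return max(top - baseline, 0.0)


def drawn_ratio(a, b, baseline=0.0, cap=None):
    """How many times longer bar `a` looks than bar `b`."""
    shorter = drawn_length(b, baseline, cap)
    if shorter == 0:
        raise ValueError("bar b has no visible length with this baseline")
    return drawn_length(a, baseline, cap) / shorter


# Hypothetical values, chosen only to illustrate the effect.
honest = drawn_ratio(100, 80)                        # axis starts at 0
raised_baseline = drawn_ratio(100, 80, baseline=70)  # axis starts at 70
clipped_bar = drawn_ratio(100, 20, cap=40)           # tall bar cut off at 40

print(f"true ratio 100:80        -> {honest:.2f}")
print(f"drawn with baseline at 70 -> {raised_baseline:.2f}")
print(f"true ratio 100:20 is 5.00, drawn with cut-off bar -> {clipped_bar:.2f}")
```

In this made-up case, raising the baseline makes a 1.25x difference look like a 3x difference, and cutting off the tall bar makes a 5x difference look like 2x. If the drawn ratio and the true ratio disagree, the chart is telling a different story than the data.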

Manipulating data by cherry-picking scale

This visualization of temperature changes from 1998 to 2012 appears to show that, contrary to common belief, the climate isn’t getting any warmer. But if you zoom out, you will see that this claim is far from reality.
Sometimes people subconsciously see what they expect to see, but sometimes people manipulate data to support an attractive or beneficial point of view. Cherry-picking is a common practice in science denial and a common instrument of propaganda. In this chart, deliberately choosing a convenient period creates an artificial “pause” even though there is an ongoing warming trend.
Knowing this, you will not only avoid these errors in your own projects but also avoid being fooled by others.
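The effect of the chosen window is easy to reproduce. The sketch below fits a straight line to a full series and then to a window that starts at an unusually high point. The series is synthetic: I made up a steady rise plus one spike in 1998, so these are not real temperature records, and `ols_slope` is just a helper name I picked.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Synthetic data (an assumption for illustration, not measurements):
# a steady rise of 0.015 units per year with one unusually high year.
years = list(range(1983, 2013))
values = [0.015 * (year - 1983) for year in years]
values[years.index(1998)] += 0.5

# Trend over the whole series versus a window that starts at the spike.
full_slope = ols_slope(years, values)
start = years.index(1998)
window_slope = ols_slope(years[start:], values[start:])

print(f"full series 1983-2012: {full_slope:.4f} per year")
print(f"window 1998-2012:      {window_slope:.4f} per year")
```

With these made-up numbers, the full series rises by about 0.015 per year, while the window that begins at the spike rises by only about 0.0025 per year. The underlying trend never changed; only the starting point did.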

COMMUNICATION ERRORS

Confusing choice of color

Icons, fonts, and color schemes all carry connotations that affect their perception.
This is a visualization of the regional parity prices for each state, in which the shade of green indicates the variance from the median across the US. Darker green means higher prices, and paler green means lower prices.

The main idea of the article for which this visualization was created is to show cheaper places to live while working remotely. In this case, high prices are a warning sign. Since green is usually perceived as the color of tranquility, health, or nature, the chosen palette makes this chart confusing.
I should say that in other cases, for example when the message treats a high price as an advantage, this color scheme could work, but not in the case presented here.
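Because the chart encodes distance from a median, one possible alternative is a diverging scale with a neutral midpoint, so that "above" and "below" get different hues. The sketch below is only an illustration under my own assumptions: the three colors, the function names, and the sample values are arbitrary, and this is not the palette used in either chart discussed here.

```python
# Arbitrary example colors (assumptions, not a recommendation):
LOW = (1, 133, 113)     # teal for values below the median
MID = (245, 245, 245)   # near-white for values at the median
HIGH = (166, 97, 26)    # brown for values above the median


def mix(c1, c2, t):
    """Linear blend between two RGB colors, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def diverging_color(value, median, max_dev):
    """Map a value to a color by its signed distance from the median."""
    t = (value - median) / max_dev
    t = max(-1.0, min(1.0, t))  # clamp outliers to the end colors
    if t < 0:
        return to_hex(mix(MID, LOW, -t))
    return to_hex(mix(MID, HIGH, t))


# Hypothetical price index values around a median of 100.
for price in (80, 90, 100, 110, 120):
    print(price, diverging_color(price, median=100, max_dev=20))
```

The midpoint color carries the meaning "typical", and the two end colors carry direction. Which hue should signal "expensive" still depends on the message and the audience.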

In my data visualization, I tried to avoid the cliché of red as the only warning color and chose the following color scheme: it seems fresh but still communicates the same idea.

Too many… colors, details… just too many

This is my work about the Cereal Renaissance, made for one of the #MakeoverMonday challenges. At first I liked it: the variety of colors, how they combine, and the overall design. Now, however, I see that there are too many details that distract from the main question stated in the article. Technically this chart isn’t wrong, but to me it now looks unfinished and could be improved.

This is a good example of an intermediate stage. It lets us examine the share and position of the other products by gathering them into a new category named ‘other products’, and it brings the viewer’s attention to the group of items with the greatest market share, more or less comparable to that of the cereals.
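The grouping step can be sketched in a few lines. Everything below is an assumption for illustration: the function name `group_small`, the `keep` parameter, and the product names and shares are invented and are not the data from my cereal chart.

```python
def group_small(shares, keep=3, other_label="Other products"):
    """Keep the `keep` largest categories and sum the rest into one bucket."""
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    top, rest = ranked[:keep], ranked[keep:]
    grouped = dict(top)
    if rest:
        grouped[other_label] = sum(value for _, value in rest)
    return grouped


# Hypothetical market shares in percent, invented for this example.
shares = {
    "Cereal": 30,
    "Yogurt": 25,
    "Eggs": 20,
    "Snack bars": 10,
    "Toast": 8,
    "Smoothies": 4,
    "Pastries": 3,
}

grouped = group_small(shares, keep=3)
for name, value in grouped.items():
    print(f"{name}: {value}%")
```

The total stays the same, but the chart now needs four colors instead of seven. How many categories to keep is a judgment call that depends on the question the chart is supposed to answer.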

Wrong audience

There are tons of ways to visualize data, with a variety of tools and different levels of interactivity and complexity. You can use your computer or even draw your viz by hand. Every approach has a right to exist as long as you know which one is appropriate for the chosen audience and the chosen goals.

To create a stunning viz, you should explore not only the data you have but also the intended audience. There is a huge difference between an atlas for children and one for grown-ups, or between infographics for the general public and those for experts. For example, although it’s commonly accepted that 0 is a starting point, this isn’t exactly true for some systems and professional fields.

Be well informed about what your message is and who your target audience is.

Of course, this is just the tip of the iceberg, and as a rapidly developing field, data visualization leaves lots of room for best practices still to come. But this is a good start for avoiding the most common mistakes and recognizing them in the work of others, developing not only your data viz skills but also your critical thinking and data literacy.

Anastasia Komissarova
Analytics Vidhya

Information designer. Thoughts on design and data analysis. Expository writing. Twitter @anakomissarof