Making sense of facts, numbers, and measurements is a form of art: the art of data visualization. There is a lot of data buried in a sea of noise. To turn your numbers into knowledge, your job is not only to separate the data from the noise, but also to present it the right way.
Many of us come from the "PowerPoint generation," and this is where the roots of our understanding of data visualization and presentation lie. Unfortunately, that foundation is far from good, and I am as guilty as anyone. And if you think I'm too cynical about this, don't take only my word for it.
– Mark Goetz
To avoid common pitfalls in your presentations, it wouldn’t hurt to review the basics of data visualization. In this article, I’ll try to undo some of the damage by sharing some of the best practices for data visualization and representation and, hopefully, save some kittens in the process.
Data Visualization Best Practices
There are four basic presentation types that you can use to present your data: comparison, composition, distribution, and relationship.
Unless you are a statistician or a data analyst, you are most likely using only the two most commonly used types of data analysis: comparison or composition.
Selecting the Right Chart
To determine which chart is best suited for each of those presentation types, first you must answer a few questions:
Bar charts are good for comparisons, while line charts work better for trends. Scatter plot charts are good for relationships and distributions, but pie charts should be used only for simple compositions, never for comparisons or distributions. There is a chart selection diagram created by Dr. Andrew Abela that should help you pick the right chart for your data type. (You can download the PDF version here: Chart Selection diagram.) Let’s dig in and review the most commonly used chart types, some examples, and the dos and don’ts for each chart type.
Tables
Tables are essentially the source for all the charts. They are best used for comparison, composition, or relationship analysis when there are only a few variables and data points. It would not make much sense to create a chart if the data can be easily interpreted from the table. Use tables when:
Use charts when the data presentation:
For example, if you want to show the rate of change, like a sudden drop in temperature, it is best to use a chart that shows the slope of a line, because rate of change is not easily grasped from a table.
Column Charts
The column chart is probably the most used chart type. This chart is best used to compare different values when specific values are important, and it is expected that users will look up and compare individual values between columns. With column charts you could compare values for different categories or compare value changes over a period of time for a single category.
Best practices for column charts
Column Histograms
A histogram is a common variation of the column chart, used to present the distribution and relationships of a single variable over a set of categories. A good example of a histogram would be a distribution of grades on a school exam or the sizes of pumpkins, divided by size group, at a pumpkin festival.
Stacked Column Charts
Use stacked column charts to show a composition. Do not use too many composition items (not more than three or four) and make sure the composing parts are relatively similar in size. It can get messy very quickly.
Before moving to the next chart type, I wanted to show you a good example of how to improve the effectiveness of your column chart by simplifying it.
Credit: Joey Cherdarchuk
Bar Charts
Bar charts are essentially horizontal column charts. If you have long category names, it is best to use bar charts because they give more space for long text. You should also use bar charts, instead of column charts, when the number of categories is greater than seven (but not more than fifteen) or for displaying a set with negative numbers.
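The column-versus-bar rules of thumb above can be sketched as a small Python helper. This is only a rough illustration, not a definitive rule: the function name `column_or_bar`, the 12-character cutoff for a "long" category name, and the label returned for more than fifteen categories are all assumptions made for this example. The article itself only gives the seven and fifteen category limits, long names, and negative numbers as criteria.

```python
def column_or_bar(categories, values, long_name_chars=12):
    """Suggest 'column' or 'bar' using the rules of thumb from the text.

    long_name_chars is a hypothetical cutoff for what counts as a long
    category name; the article does not give a number.
    """
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")

    n = len(categories)
    if n > 15:
        # The text caps bar charts at fifteen categories. What to do beyond
        # that is not specified, so this label is an assumption.
        return "too many categories"

    has_long_names = any(len(name) > long_name_chars for name in categories)
    has_negatives = any(v < 0 for v in values)

    # More than seven categories, long names, or negative values favor bars.
    if n > 7 or has_long_names or has_negatives:
        return "bar"
    return "column"


print(column_or_bar(["Q1", "Q2", "Q3", "Q4"], [10, 12, 9, 14]))
print(column_or_bar(["Q1", "Q2", "Q3", "Q4"], [10, -3, 9, 14]))
```

The first call suggests a column chart (few short categories, no negatives), while the second suggests a bar chart because of the negative value.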
Bar Histogram Charts
Just like column charts, bar charts can be used to present histograms.
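To make the histogram idea concrete, here is a minimal Python sketch that groups exam grades into bins and prints them as a text-only bar histogram. The grades, the bin edges, and the function name `bin_counts` are made up for illustration; any charting tool would do the binning for you, so treat this as a way to see what a histogram counts rather than as a recommended implementation.

```python
def bin_counts(values, bin_edges):
    """Count how many values fall into each bin.

    Bins are half-open [lo, hi), except the last one, which also includes
    its upper edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            is_last = i == len(counts) - 1
            if lo <= v < hi or (is_last and v == hi):
                counts[i] += 1
                break
    return counts


# Hypothetical exam grades and grade groups.
grades = [52, 67, 71, 74, 78, 81, 85, 88, 93, 100]
edges = [50, 60, 70, 80, 90, 100]

counts = bin_counts(grades, edges)
print(counts)  # [1, 1, 3, 3, 2]

# One horizontal bar per grade group, as in a bar histogram.
for i, count in enumerate(counts):
    print(f"{edges[i]}-{edges[i + 1]}: {'#' * count}")
```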
Stacked Bar Charts
I’m not quite sure about a good application of stacked bar charts, except when there are only a few variables and composition parts, and the emphasis is on composition, not comparison. Stacked bars are not good for comparison or relationship analysis. The only common baseline is along the left axis of the chart, so you can only reliably compare values in the first series and for the sum of all series.
Line Charts
Who doesn’t know line charts? We used to draw those on blackboards in school. Line charts are among the most frequently used chart types. Use lines when you have a continuous data set. These are best suited for trend-based visualizations of data over a period of time, when the number of data points is very high (more than 20). With line charts, the emphasis is on the continuation or the flow of the values (a trend), but there is still some support for single value comparisons, using data markers (only with fewer than 20 data points). A line chart is also a good alternative to column charts when the chart is small.
Timeline Charts
The timeline chart is a variation of line charts. Obviously, any line chart that shows values over a period of time is a timeline chart. The only difference is in functionality: most timeline charts will let you zoom in and out and compress or stretch the time axis to see more details or overall trends. The most common examples of a timeline chart might be:
The Dos and Don’ts for Line Charts
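The data-marker guidance for line charts (markers only with fewer than 20 data points, lines at their best with more than 20) can be expressed as a tiny Python helper. This is a rough sketch: the function name `line_chart_options` and the keys of the returned dictionary are hypothetical, and treating exactly 20 points as "no markers" is an assumption, since the text does not cover that case.

```python
def line_chart_options(n_points, marker_limit=20):
    """Return display hints for a line chart with n_points data points."""
    if n_points < 2:
        raise ValueError("a line needs at least two data points")
    return {
        # Markers help readers look up single values, but only on sparse lines.
        "show_markers": n_points < marker_limit,
        # More than 20 points is the range the text calls best suited to lines.
        "many_points": n_points > marker_limit,
    }


print(line_chart_options(12))
print(line_chart_options(365))
```

A monthly series of 12 points would get markers, while a daily series of 365 points would be drawn as a plain line.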
Area Charts
An area chart is essentially a line chart, good for trends and some comparisons. Area charts fill the area below the line, so the best use for this type of chart is for presenting cumulative value changes over time, like item stock, number of employees, or a savings account. Do not use area charts to present fluctuating values, like the stock market or price changes.
Stacked Area
Stacked area charts are best used to show changes in composition over time. A good example would be the changes of market share among top players or revenue shares by product line over a period of time. Stacked area charts might be colorful and fun, but you should use them with caution, because they can quickly become a mess. Don’t use them if you need an exact comparison and don’t stack together more than three to five categories.
Pie Charts and Donut Charts
Who doesn’t love pies or donuts, right? Not in data visualization, though. These charts are among the most frequently used and also misused charts. The one above is a good example of a terrible, useless pie chart: too many components, very similar values. A pie chart typically represents numbers in percentages, used to visualize a part-to-whole relationship or a composition. Pie charts are not meant to compare individual sections to each other or to represent exact values (you should use a bar chart for that). When possible, avoid pie charts and donuts. The human mind thinks linearly but, when it comes to angles and areas, most of us can’t judge them well.
Stacked Donut Charts
I would not recommend using stacked donut charts at all! I mean, like, never! You might think that you could use a stacked donut to present composition, while allowing some comparison (with an emphasis on composition), but it would perform badly for both. Use stacked column charts instead.
Here’s a good example of how to use a pie chart effectively.
Credit: Joey Cherdarchuk
The Dos and Don’ts for Pie Charts
For those of you who still feel sentimental about the old PowerPoint pie charts and want to keep using them, there are some things to keep in mind.
Scatter Charts
Scatter charts are primarily used for correlation and distribution analysis. They are good for showing the relationship between two different variables where one correlates to another (or doesn’t). Scatter charts can also show the data distribution or clustering trends and help you spot anomalies or outliers. A good example of a scatter chart would be a chart showing marketing spending vs. revenue.
Bubble Charts
A bubble chart is a great option if you need to add another dimension to a scatter plot chart. Scatter plots compare two values, but you can add bubble size as the third variable and thus enable comparison. If the bubbles are very similar in size, use labels. We could in fact add a fourth variable by color-grading those bubbles or displaying them as pie charts, but that’s probably too much. A good example of a bubble chart would be a graph showing marketing expenditures vs. revenue vs. profit. A standard scatter plot might show a positive correlation between marketing costs and revenue (obviously), while a bubble chart could reveal that an increase in marketing costs is eating into profits.
Use Scatter and Bubble charts to:
Map Charts
Map charts are good for giving your numbers a geographical context to quickly spot best and worst performing areas, trends, and outliers. If you have any kind of location data like coordinates, country names, state names or abbreviations, or addresses, you can plot related data on a map. Maps won’t be very good for comparing exact values, because map charts are usually color scaled and humans are quite bad at distinguishing shades of colors. Sometimes it’s better to use overlay bubbles or numbers if you need to convey exact numbers or enable comparison. A good example would be website visitors by country, state, or city, or product sales by state, region or city.
But don’t use maps for absolutely everything that has a geographical dimension. Today, almost any data has a geographical dimension, but it doesn’t mean that you should display it on a map.
When to use map charts?
Gantt Charts
The chart type now known as the Gantt chart was first devised by Karol Adamiecki in 1896. But the name comes from Henry Gantt, who independently developed this bar chart type much later, in the 1910s. Gantt charts are good for planning and scheduling projects. Gantt charts are essentially project maps, illustrating what needs to be done, in what order, and by what deadline. You can visualize the total time a project should take, the resources involved, as well as the order and dependencies of tasks. But project planning is not the only application for a Gantt chart. It can also be used in rental businesses, displaying a list of items for rent (cars, rooms, apartments) and their rental periods. To display a Gantt chart, you would typically need, at least, a start date and an end date. For more advanced Gantt charts, you’d enter a completion percentage and/or a dependency on another task.
Gauge Charts
Gauge charts are good for displaying KPIs (Key Performance Indicators). They typically display a single key value, comparing it to a color-coded performance level indicator, typically showing green for “good” and red for “trouble.” A dashboard would be the most obvious place to use gauge charts. There, all the KPIs will be in one place and will give a quick “health check” for your project or company. Gauges are a great choice to:
The bad side of gauge charts is that they take up a lot of space and typically only show a single point of data. If there are many gauge charts compared against a single performance scale, a column chart with threshold indicators would be a more effective and compact option.
Multi Axes Charts
There are times when a simple chart just cannot tell the whole story. If you want to show relationships and compare variables on vastly different scales, the best option might be to have multiple axes. A multi-axes chart will let you plot data using two or more y-axes and one shared x-axis. But it comes at a cost. That is, the charts are much more difficult to read and understand. Multi-axes charts might be good for presenting common trends, correlations (or the lack thereof) and the relationships between several data sets. But multi-axes charts are not good for exact comparisons (because of different scales) and you should not use this type if you need to show exact values. Use multi-axes charts if you want to:
Data Visualization Do’s and Don’ts – A General Conclusion
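As a rough recap, the chart recommendations made throughout this article can be collected into a simple lookup. The sketch below is only an illustration: the goal labels, the `CHART_FOR_GOAL` table, and the `suggest_chart` function are hypothetical names chosen for this example, and real chart selection depends on the data and the audience far more than a lookup table can capture.

```python
# Goal labels are hypothetical; the chart suggestions follow the article.
CHART_FOR_GOAL = {
    "comparison": "bar or column chart",
    "trend": "line chart",
    "relationship": "scatter chart",
    "distribution": "scatter chart or histogram",
    "simple composition": "pie chart",
    "composition over time": "stacked area chart",
    "geography": "map chart",
    "schedule": "Gantt chart",
    "kpi": "gauge chart",
}


def suggest_chart(goal):
    """Look up a suggested chart type for a presentation goal."""
    key = goal.strip().lower()
    if key not in CHART_FOR_GOAL:
        known = ", ".join(sorted(CHART_FOR_GOAL))
        raise KeyError(f"unknown goal {goal!r}; expected one of: {known}")
    return CHART_FOR_GOAL[key]


print(suggest_chart("Trend"))       # line chart
print(suggest_chart("comparison"))  # bar or column chart
```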