This page gives an overview of how to present data in a way that is informative, effective and even honest.
Generally, thematic maps provide the best visual presentation for comparing data across geographies that are connected. For a small number of discrete, i.e. non-connected, areas, other forms of presentation may be more appropriate.
Many people profess difficulty in understanding or “reading” charts, and not everyone agrees on the best graphical approach for a given set of data. Even so, charts can provide a quick way to see the stories behind the numbers. They have the greatest value when the chart type is well matched to the data.
Household living arrangements data from the 2016 Census of Population are used to demonstrate eight different ways of presenting the same information. The geographies shown are those highlighted in United Way’s Poverty Solutions work. HRM is Halifax Regional Municipality. The five neighbourhoods and communities shown are all within HRM.
1. Tabular form showing the numbers for each item:
2. Tabular form showing the percentage of the total for each item:
3. Clustered Column – number of households for all geographies
The very large numbers for Canada compared to the other geographies render this chart unusable.
4. Clustered Column – number of households for five comparable communities
These communities range in size from 1,000 to 7,000 households. Because the sizes are comparable, each column shows the relative number of households for each community. For three of the communities, the dark brown bar (people living alone) is much larger than the other bars.
5. Stacked Bar – number of households for five comparable communities
This chart is similar to the one above, but is less cluttered and easier to read. It highlights the share of each type of living arrangement within the context of the total number of households in each community. Neither chart 4 nor chart 5 shows the contrast with Canada, Nova Scotia and HRM.
6. Clustered Column – percentage of households for all geographies
Comparing percentages of each type of living arrangement allows a few quick facts to be pulled out. Overall, Canada, Nova Scotia and HRM have close to the same mix of living arrangements, while three of the communities have a disproportionate share of people living alone. The busyness of the chart, i.e. too much to look at, reduces its effectiveness.
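The conversion from household counts to percentages that makes this kind of comparison possible can be sketched as follows. This is only an illustration: the function name `percent_shares` is hypothetical, and the counts are made-up placeholders, not census figures.

```python
def percent_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute shares of a zero total")
    return {name: 100.0 * value / total for name, value in counts.items()}


# Made-up placeholder counts (NOT census data) for two areas of very
# different size. The raw counts are hard to compare; the shares are not.
large_area = {"families": 600_000, "living alone": 300_000, "other": 100_000}
small_area = {"families": 400, "living alone": 500, "other": 100}

for label, counts in (("large area", large_area), ("small area", small_area)):
    shares = percent_shares(counts)
    print(label, {name: round(share, 1) for name, share in shares.items()})
```

Because every area is rescaled to a total of 100, a very large geography and a small neighbourhood can share one axis.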
7. Pie Charts – percentage of households for all geographies
Another way to compare percentages is through the use of pie charts. This works well for two or three geographic areas but is a bit overwhelming for all eight as shown below.
8. Stacked Bar – percentage of households for all geographies
This chart is similar to chart 5 above, but since percentages are used, all eight geographic areas can be shown in an effective way that identifies differences at a glance. For Dartmouth North, the shares of people living alone and of married couple families are dramatically different from those for Canada, Nova Scotia and HRM. For the Preston Area, female lone parent families make up a clearly higher percentage. By referring to the tabular data above, a story with numbers can be told that draws on these visual images.
For a single neighbourhood, an effective highlight is a story line that gives the percentage differences in comparison to a standard, e.g. Dartmouth North compared to HRM, supported by a pie chart for each. For example:
Household living arrangements were quite different for Dartmouth North with people living in families in 43% of households compared to 65% for HRM. People lived alone in 49% of the households compared to 29% for HRM. Female lone-parent family households were 12% compared to 9% for HRM.
The charts in Changes in Population show the use of population pyramids as another example of graphical presentation.
While the above approaches demonstrate geographic comparisons, they also work with any two-dimensional indicators, for example educational levels by gender or by age group.
There are many other chart types for presenting data. Line charts are generally appropriate for time series, especially for highlighting a trend or plotting several variables together.
As with thematic maps, there is no one right way to present your data visually. It is a case of choosing a method that shows your data quickly, fairly and honestly. If the story doesn’t pop, then either there isn’t one, or a different graphical presentation is needed.
The Distortion Factor
There are many ways to present data correctly. There are also ways to misrepresent or distort the presentation of data. One of the primary purposes of using charts is to provide a quick visual image of the data. The views expressed in the next few paragraphs are those of Dennis Pilkey and are likely not shared by people at such places as Statistics Canada and the Toronto Stock Exchange.
The following two graphs are examples that exaggerate or distort time series changes. They were taken from Statistics Canada’s The Daily, April 6, 2018.
The above chart shows Canada’s unemployment rate for a five-year period. The chart has a vertical axis that starts at 5.5 and goes to 8.0. This is typical of how labour force data are presented. Small changes in this rate can have a large effect on the stock market when it is announced each month. The following chart shows the same data with a vertical axis that has a zero origin.
In the second chart, changes appear to be less volatile over the five-year time frame. The apparent changes depicted in the first chart are exaggerated by a factor of a little over three: the full range from zero to 8.0 is 3.2 times the displayed range of 2.5. The data in both charts are the same, just presented differently. The chart for the number of people employed in Canada, from the same publication, has a vertical axis that starts at 17,400 and goes to 18,800.
The visual impression is that of tremendous growth in the number of people employed over the five-year period depicted. Using the same data and a zero origin gives a very different visual impression.
This chart shows the reality of modest, steady growth. The apparent change in the published chart is exaggerated more than 13 times: the full range from zero to 18,800 is about 13.4 times the displayed range of 1,400.
Graphic reporting of daily stock market changes can show a very volatile situation for very small changes because of the use of non-zero origins. The chart for the S&P/TSX Composite index for April 27, 2018 showed the index with a distortion factor of over 52 times for 5 days, 8.25 times for 1 year and 3.4 times for 5 years. Of course, the sheer size of the stock market means small changes have major impacts on overall wealth. The distortion factor is calculated as the full range for the variable (from a zero origin) divided by the displayed range.
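The calculation described above can be sketched in a few lines. This is a minimal illustration rather than an official method: the function name `distortion_factor` is hypothetical, and it assumes the full range runs from zero to the top of the displayed axis, which matches the two Statistics Canada charts discussed earlier.

```python
def distortion_factor(axis_min, axis_max):
    """Full range from a zero origin divided by the displayed range.

    Assumes the full range is measured from zero to the top of the
    displayed axis (axis_max).
    """
    displayed_range = axis_max - axis_min
    if displayed_range <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    return axis_max / displayed_range


# Axis limits quoted in the text for the two charts from The Daily.
unemployment = distortion_factor(5.5, 8.0)
employment = distortion_factor(17_400, 18_800)

print(f"Unemployment rate chart: {unemployment:.1f} times")
print(f"Employment chart: {employment:.1f} times")
```

A chart with a zero origin has a factor of exactly 1, so the factor can be read as how many times the visual change is magnified relative to a zero-based chart.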
One of the reasons given for non-zero charts is that the changes in an indicator are not obvious. If the changes are not obvious, that may be the story. There are ways to present the data that give a correct visual impression. For example, the following chart shows employment growth of almost 1 million jobs from March 2013 to March 2018.
This chart has exactly the same shape as the employment chart in The Daily shown above. It is missing one piece of information that is included in the chart in The Daily. We can easily fix that in the story that accompanies the chart: since March 2013, Canada has seen the addition of 994 thousand jobs, bringing the overall number of people employed to 18.6 million, an increase of 5.6%.
This is a continuation of my long-running rant about non-zero origins. Most people look at the picture and get the visual impression intended by the presenter of the data. It is unlikely that this practice will end because of my strongly held views. Zero-based origins for charts and fair representation of data continue to be an important part of the work we do.
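The percentage in that sentence can be checked from the two figures quoted in it. The sketch below assumes the rounded values from the text (18.6 million employed, 994 thousand jobs added), so the baseline it derives is approximate; the function name `growth_summary` is hypothetical.

```python
def growth_summary(current_level, jobs_added):
    """Return the implied starting level and the percentage increase."""
    baseline = current_level - jobs_added
    if baseline <= 0:
        raise ValueError("jobs_added must be smaller than current_level")
    percent_change = 100.0 * jobs_added / baseline
    return baseline, percent_change


# Figures quoted in the text, in thousands of people (18.6 million is rounded).
baseline, percent_change = growth_summary(current_level=18_600, jobs_added=994)

print(f"Implied March 2013 level: about {baseline:,} thousand")
print(f"Increase: {percent_change:.1f}%")
```

Stating the level and the percentage alongside a change-from-baseline chart restores the context that the zero-origin version would otherwise have to carry visually.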
A picture can be worth a thousand words, but the correct picture with a few words is worth much more.