
Interpretation of an experiment to test the effects of different formats of communication of indirect uncertainty around the COVID-19 R number on a public audience

Publication type: Interpretation
Language: English
Licence: CC BY 4.0

This study set out to investigate the effects of communicating indirect uncertainty (the level of consensus between different estimates) in combination with direct uncertainty (the statistical range around a point estimate), using a pair of experiments within a single online survey.

The first experiment presented people with a text form of the information (using terms such as ‘consensus’ or ‘likely to be’ to indicate the quality of the evidence for the range stated, as used by the scientific advisory groups to the UK government). The second experiment presented people with a graphical format illustrating the degree of consensus between different groups’ predicted ranges (as also used by the scientific advisory groups). The measures in each experiment were the same, and were designed to discover whether a general public UK audience:

  • understood the gist of the communication

  • understood the specific consensus range for R being communicated

  • were affected, in their interpretation of the consensus range, by any of the different ways of communicating the deeper uncertainties about that range.

Experiment 1

Participants were randomly assigned to one of five conditions to read statements about the R number range:

Control: “The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between X and Y.”

(no information about indirect uncertainty)

Stated consensus: “The SPI-M group report that this week their consensus is that the overall reproduction number, R, of Covid-19 is between X and Y.”

(a format used often by government communications during the pandemic using ‘consensus’ as a short way of communicating low indirect uncertainty)

Highly likely: “The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is highly likely to be between X and Y.”

(a format used often by government communications during the pandemic which does not explicitly communicate indirect uncertainty but uses the term ‘likely’ to imply it)

Little consensus: “The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between X and Y. They state that there is little consensus on this.”

(a format constructed for this experiment to explicitly communicate high indirect uncertainty)

High consensus: “The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between X and Y. They state that there is high consensus on this.”

(a format constructed for this experiment to explicitly communicate low indirect uncertainty)

Each participant rated three statements in a randomised order, where the X-Y range was: 1.4-1.6 (Covid-19 increasing), 0.9-1.1 (Covid-19 stable), and 0.4-0.6 (Covid-19 decreasing).
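The design described above (one of five statement formats per participant, with the three ranges shown in a randomised order) can be illustrated with a minimal sketch. This is not the survey software actually used, which is not described here; the function name `build_trial`, the condition keys and the range labels are hypothetical, and only the statement wording and the X-Y ranges are taken from the text.

```python
import random

# Statement wording as quoted above; {x} and {y} stand for the range limits.
# The dictionary keys are illustrative labels, not names used by the authors.
TEMPLATES = {
    "control": "The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between {x} and {y}.",
    "stated_consensus": "The SPI-M group report that this week their consensus is that the overall reproduction number, R, of Covid-19 is between {x} and {y}.",
    "highly_likely": "The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is highly likely to be between {x} and {y}.",
    "little_consensus": "The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between {x} and {y}. They state that there is little consensus on this.",
    "high_consensus": "The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between {x} and {y}. They state that there is high consensus on this.",
}

# The three X-Y ranges each participant rated.
RANGES = {
    "increasing": (1.4, 1.6),
    "stable": (0.9, 1.1),
    "decreasing": (0.4, 0.6),
}


def build_trial(rng):
    """Assign one format at random and present the three ranges in random order."""
    condition = rng.choice(sorted(TEMPLATES))
    order = list(RANGES)
    rng.shuffle(order)
    statements = [
        (label, TEMPLATES[condition].format(x=RANGES[label][0], y=RANGES[label][1]))
        for label in order
    ]
    return condition, statements


if __name__ == "__main__":
    condition, statements = build_trial(random.Random())
    print(condition)
    for label, text in statements:
        print(label, "->", text)
```

Each call yields a between-participants factor (the format) and a within-participants factor (the range), which is the structure the analyses of Experiment 1 rely on.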

The results suggest that participants understood the gist of what the text was trying to communicate about R generally, perceiving the risk to be higher when the range estimated for R was higher. There was no significant difference between the different indirect uncertainty formats in people’s responses to the risk posed by Covid-19. See Figure 1.

Figure 1: Perceived riskiness of the COVID-19 situation for each of the three ranges presented (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

This was also seen in their answers to how likely they thought it was that Covid-19 was increasing (this time there were differences between the indirect uncertainty formats). See Figure 2.

Figure 2: Perceived likelihood that the pandemic is currently increasing, given each range for R and each format of indirect uncertainty presentation (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

The results also suggest that participants understood the specific consensus range for R being communicated, given that they said that they would be relatively unsurprised at R falling at the midpoint of the range compared with it falling outside the range. See Figure 3.

Figure 3: Level of surprise if the true value of R turned out to be at the midpoint, above, or below the range described (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

The differences in levels of surprise across the different values of R are also interesting. They suggest an expectation that a range estimating a high value of R is more likely to be overly pessimistic, and a range estimating a low value of R more likely to be overly optimistic. An analysis of these findings taking into account demographic data and other covariates would be interesting.

Finally, there were indeed differences between the different formats of communication of the indirect uncertainty around R.

The differences between the formats in how surprised people would be if the true value of R were above or below the consensus range show that, as we might expect, when higher levels of indirect uncertainty are communicated, people essentially widen the range of values that they expect R to take (widening the range stated). See Figure 4.

Figure 4: Level of surprise if the true value of R turned out to be above or below the range described, by format of indirect uncertainty (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

They also suggest that the control format (“The SPI-M group report that this week the overall reproduction number, R, of Covid-19 is between X and Y.”) implies to the audience that there is high consensus on the number. ‘Highly likely’ has the same effect, and simply stating that there is ‘consensus’ in most cases does the same (only in the case of the question about the likelihood of R falling exactly on the upper limit do we see it convey slightly less than the control, ‘high consensus’ or ‘highly likely’). This is in line with previous findings, that when no indirect uncertainty is communicated, a lay audience assume that evidence is of high quality (1–3). The fact that a statement of consensus (or high consensus) acts as a cue of the quality of underlying evidence or the trustworthiness of the stated range is not surprising given previous research on the effects of such statements of scientific consensus in fields such as vaccination or climate change (e.g. (4,5)). See Figure 5.

Figure 5: Perceived likelihood of the true value of R falling at exactly the upper limit of the range described, by format of indirect uncertainty (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

This is mostly confirmed by the results of asking participants directly how certain they thought the R number was: there was no difference between the effects of the control statement, saying there was ‘consensus’, and saying it was ‘highly likely’ that R fell within the range given. However, stating overtly that there was ‘high consensus’ or ‘low consensus’ had a stronger effect (although again, as in previous experiments by Schneider et al., ‘high consensus’ was not significantly different from the control condition (1,2)). See Figure 6.

Figure 6: Perceived certainty of R, by format of indirect uncertainty (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

When it came to how trustworthy and how informative people found the information, the results mirrored those for perceived certainty (suggesting that they interpreted the question to be asking about the trustworthiness of the R value rather than the statement itself). See Figure 7.

Figure 7: Perceived trustworthiness and informativeness of the information, by format of indirect uncertainty (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

There was no difference in perceived trustworthiness between the different levels of R communicated. There was also no difference between the formats in the amount of effort people reported having to put into reading them.

As an interesting aside, however, people did feel more informed when the R value was not around 1. See Figure 8.

Figure 8: How informed people felt after reading the information, by range of R portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

Experiment 2

This experiment required participants to draw conclusions from a graph, rather than simply read a statement. They were asked the same questions as in Experiment 1.

Participants were randomly assigned to one of three conditions to see graphs depicting the R number range. The control condition showed a single bar illustrating a point estimate and a symmetrical range with error bars. The two uncertainty conditions illustrated several point estimates with error bars (all showing the same direct uncertainty interval), representing the ranges output by different modelling groups. In one condition there was considerable consensus between the different ranges (‘low uncertainty’) and in the other, less consensus (‘high uncertainty’). See Figure 9.

Figure 9: Example of the three levels of indirect uncertainty (Quality of Evidence) in the three arms of Experiment 2. Top left, no indirect uncertainty shown (control). Top right, high indirect uncertainty (low Quality of Evidence/consensus). Bottom left, low indirect uncertainty (high Quality of Evidence/consensus).

Again, participants understood the gist of what the graph was trying to communicate about R, perceiving the risk to be higher when R was higher, and Covid-19 more likely to be increasing when R was portrayed as above 1. See Figure 10. There was no effect of indirect uncertainty format on these results. This contrasts with the work of Padilla et al., who found that showing the outputs of an ensemble model of COVID-19 forecasts as multiple lines on a line graph led to higher perceptions of the risk compared to a single range (6). However, the individual model predictions in our graphs were not as different from each other (particularly at the top end) as in their experiment.

Figure 10: The perceived riskiness of the COVID-19 situation (left) and perceived likelihood that COVID-19 is increasing (right), given the consensus range of R portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

They also understood the specific consensus range communicated, because they said they would be more surprised if the true value of R turned out to be outside the range than in the middle of it. See Figure 11.

Figure 11: The level of surprise people would feel if the true value of R turned out to fall at the midpoint (left), above (middle) or below (right) the consensus range portrayed, by format of indirect uncertainty (Quality of Evidence), and (for the surprise if above the range) by range presented (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

These results also confirmed that, as in Experiment 1, there were differences between the different formats of communication of the indirect uncertainty around the R consensus range, particularly in anticipating the upper boundaries of likelihood. As in Experiment 1, these results neatly show that participants were able to interpret the indirect uncertainty cue in the graph. When there was low indirect uncertainty (high quality of evidence: the ranges calculated by independent groups were close to each other), they had more confidence in the consensus range. They would be more surprised if the true value fell outside it, and less surprised if it fell at the midpoint, than when there was high indirect uncertainty (low quality of evidence: the ranges calculated by independent groups were far from each other). Visual cues of consensus have previously been shown to be understood and interpreted by lay audiences (e.g. (7)), but these have generally been relatively simple representations (e.g. a pie chart showing the proportion of experts that agree with a statement).

The participants given no indirect uncertainty cues fell in between those given a high and those given a low uncertainty cue (but closer to assuming low uncertainty). This was confirmed when we asked them directly about the perceived certainty of the value of R. See Figure 12.

Figure 12: Perceived certainty of R, by level of indirect uncertainty (Quality of Evidence) portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

When asked how likely they thought it was that R fell on the upper limit of the range portrayed, the higher levels of indirect uncertainty appeared to make this feel more likely. See Figure 13.

Figure 13: Perceived likelihood of the true value of R falling exactly at the upper limit of the consensus range shown, by level of indirect uncertainty (Quality of Evidence) portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

As in Experiment 1, participants’ perceptions of trustworthiness and informedness mirrored their perceptions of certainty, which probably means that they interpreted the question to mean the trustworthiness of the value of R, rather than of the information as a whole. See Figure 14.

Figure 14: Perceived trustworthiness and informativeness of the graph, by level of indirect uncertainty (Quality of Evidence) portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

Also interestingly, there was no difference in the ratings of effort required to interpret the different uncertainty formats (despite the considerable extra information in the two uncertainty formats), but there was more effort required in reading the graphs where R was above 1 rather than below it (in line with the finding of Experiment 1). See Figure 15.

Figure 15: Amount of effort it took to understand the information in the graph, by range of R portrayed (shapes show overall distribution of responses, dot is mean and lines indicate standard deviation). Brackets and asterisks indicate a significant pairwise difference between groups (*p < .05, **p < .01, ***p < .001) – see Analysis for details.

Comparing Experiment 1 and Experiment 2

Finally, because we used the same ranges and the same questions, we were able to compare the effects of text versus graphs in communicating the uncertainty around R in exploratory extra analyses.

These seem to suggest that overall there is not much difference between the graph and the statement in communicating the risk of Covid-19, although the statement that there is ‘low consensus’ appears to communicate a higher level of indirect uncertainty than the degree illustrated in our ‘low Quality of Evidence’ graph. The ‘high consensus’ statement appears to communicate the same level of certainty as our ‘high Quality of Evidence’ graph. This is useful to know in terms of translating from graph to language.

These findings are borne out again in the results for our questions about surprise.

The graphs illustrating the high indirect uncertainty (low quality of evidence/consensus) were seen as more trustworthy than a verbal statement of low consensus. The effort required to interpret the graphs, however, was understandably higher than that required to read the statements.


Funders

This Interpretation has the following sources of funding:

Conflict of interest

This Interpretation does not have any specified conflicts of interest.