Examples and Solutions for Potentially Misleading Representations, COVID
Since the general public uses this site as a primary resource to keep up with the pandemic, there should be a strong obligation towards clarity and towards removing bias. Right now, while I have no reason to believe any data is inaccurate, it is easy for someone who isn't used to reading data, doesn't read disclaimers, or has biases they are trying to confirm, to mistakenly use correct data to draw false conclusions. I'll point out each issue, and then suggest a few ways it could be fixed.
Issue 1: Misleading shape and incomplete information posted as if it were complete on both the Confirmed Cases and Deaths graphs.
While there are some disclaimers explaining this, it is easy for someone to be confused when the daily cases reported are consistently far higher than the data shown on the graphs for the most recent several days. It's easy for someone to glance at the graph and see only its shape. The mound shape with a downturn at the end could cause them to assume that we are recovering, when really the data that shows that shape is just incomplete.
Solutions for issue 1: Show, in some clear way (counter, separate bar, etc.), the number of cases and deaths that are confirmed but have yet to be added to the graph. If bars showing incomplete data must be added before they are finished, perhaps display them in a greyed-out color so the public can see at a glance that they are still in progress. Otherwise, simply do not post recent bars if they are known to be wildly different from the actual finished numbers.
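To make the greyed-out idea concrete, here is a rough Python sketch. The 7-day lag, the dates, and the color names are all my own assumptions for illustration; I don't know how long the real reporting delay is or how the graphs are built.

```python
from datetime import date, timedelta

# Assumption: counts for the most recent LAG_DAYS days are still incomplete.
# The value 7 is hypothetical, not taken from the site.
LAG_DAYS = 7


def bar_color(day, as_of, lag_days=LAG_DAYS):
    """Return 'grey' for days still inside the reporting lag, else 'blue'."""
    return "grey" if (as_of - day) < timedelta(days=lag_days) else "blue"


# Hypothetical "today" and the ten most recent days, oldest first.
as_of = date(2020, 7, 15)
days = [as_of - timedelta(days=n) for n in range(9, -1, -1)]
colors = [bar_color(d, as_of) for d in days]

for d, c in zip(days, colors):
    print(d.isoformat(), c)
```

With a rule like this, the downturn at the end of the graph would be drawn in grey, so its shape would be harder to mistake for a finished trend.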
Issue 2: Comparing number of reported cases and deaths, or percentage of total reported cases and deaths, among vastly unequal population sizes when measuring demographics.
This could cause people to make claims like “Wow, young people are doing almost all of the spreading!” That bracket is showing a range of 25 ages, and a much larger percentage of the population than the other brackets. Similar incorrect claims could be made about the Race/Ethnicity graphs, since nothing shown takes into account the relative population size of each group.
Solutions for issue 2: Show the percentage of the population for each group off to the side, or as part of the graphical representation you choose.
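A small Python sketch of the comparison I have in mind follows. The group names and every number in it are made up for illustration; they are not real case or population figures.

```python
# Made-up numbers for illustration only; not real case or population data.
groups = {
    "Group A": {"cases": 600, "population": 60_000},
    "Group B": {"cases": 300, "population": 20_000},
    "Group C": {"cases": 100, "population": 20_000},
}


def summarize(groups):
    """For each group, compute its share of cases, its share of the
    population, and its cases per 100,000 people."""
    total_cases = sum(g["cases"] for g in groups.values())
    total_pop = sum(g["population"] for g in groups.values())
    summary = {}
    for name, g in groups.items():
        summary[name] = {
            "case_share_pct": 100 * g["cases"] / total_cases,
            "population_share_pct": 100 * g["population"] / total_pop,
            "cases_per_100k": 100_000 * g["cases"] / g["population"],
        }
    return summary


summary = summarize(groups)
for name, row in summary.items():
    print(name, row)
```

In this invented example Group A has the most cases by far, but only because it is the biggest group; Group B has the highest rate per 100,000 people. Showing the population share next to the case share makes that visible.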
Issue 3: Hospital data shown as an average or overall percentage.
Whenever anyone takes an overall percentage or an average, they are posing a hypothetical. “What would this look like if we evened everything out?” A lot of times it makes sense to do that. Other times it can be misleading. Someone could look at the red vs the gray on these charts and conclude that since there is still plenty of gray, no hospitals are in trouble. Even if half of the total beds are still available, some hospitals or some areas could easily still be full, or completely overwhelmed due to lack of other resources.
Solutions for issue 3: Show, maybe in addition to the charts already included, how many hospitals say they are prepared to take on a significant number of new COVID patients (maybe 5 or 10), and how many couldn’t.
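Here is a rough Python sketch of how an overall percentage can hide full hospitals. The hospitals and bed counts are invented, and I am using "available beds" as a stand-in for whether a hospital says it can take new patients, which is my own simplification.

```python
# Made-up numbers for illustration only; these are not real hospitals.
hospitals = [
    {"name": "Hospital A", "beds": 100, "available": 90},
    {"name": "Hospital B", "beds": 100, "available": 80},
    {"name": "Hospital C", "beds": 50, "available": 0},
    {"name": "Hospital D", "beds": 50, "available": 2},
]

# Assumed threshold, following the "maybe 5 or 10" suggestion above.
MIN_NEW_PATIENTS = 5

# The overall figure pools every bed as if patients could be moved freely.
total_beds = sum(h["beds"] for h in hospitals)
total_available = sum(h["available"] for h in hospitals)
overall_pct_available = 100 * total_available / total_beds

# The per-hospital view counts who could actually accept new patients.
can_accept = [h["name"] for h in hospitals if h["available"] >= MIN_NEW_PATIENTS]
cannot_accept = [h["name"] for h in hospitals if h["available"] < MIN_NEW_PATIENTS]

print(f"Overall beds available: {overall_pct_available:.1f}%")
print(f"Hospitals that can accept {MIN_NEW_PATIENTS}+ patients: {len(can_accept)}")
print(f"Hospitals that cannot: {len(cannot_accept)}")
```

In this example more than half of all beds are open, yet half of the hospitals could not take five new patients.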
Issue 4: Averaging/combining antibody testing with virus testing doesn’t have significant meaning.
Like I mentioned, all averages are hypotheticals. This percentage combines 2 different tests that measure 2 different things. The relative size of the number of each type of test also changes quite a bit over time. Comparing apples and oranges, or (more accurately) combining people that are currently eating apples and oranges with those that have ever eaten apples and oranges, doesn’t make sense and could mislead people into believing the percentage of antibody tests is higher than it is, or that the percentage of active virus tests is lower than it is.
Solution for Issue 4: Either remove it, or add a disclaimer showing why it’s included and what meaning is intended by it.
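To show why the combined percentage is hard to interpret, here is a small Python sketch with invented test counts. In it, the positive rate of each test type stays exactly the same from one week to the next, and only the mix of tests changes.

```python
# Made-up counts for illustration only; not real testing data.
weeks = {
    "week 1": {
        "virus": {"tests": 1000, "positive": 100},
        "antibody": {"tests": 1000, "positive": 20},
    },
    "week 2": {
        "virus": {"tests": 500, "positive": 50},
        "antibody": {"tests": 1500, "positive": 30},
    },
}


def pct_positive(tests, positive):
    """Percent of tests that came back positive."""
    return 100 * positive / tests


def combined_pct(week):
    """Percent positive when both test types are pooled together."""
    tests = sum(kind["tests"] for kind in week.values())
    positive = sum(kind["positive"] for kind in week.values())
    return pct_positive(tests, positive)


for label, week in weeks.items():
    virus = pct_positive(**week["virus"])
    antibody = pct_positive(**week["antibody"])
    print(f"{label}: virus {virus:.1f}%, antibody {antibody:.1f}%, "
          f"combined {combined_pct(week):.1f}%")
```

The combined figure falls even though neither underlying rate moved, so a change in the pooled percentage says as much about which tests were run as about the virus.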
Issue 5: It could be made visually easier to see trends on any graph with erratic daily data.
Solution to issue 5: Include a line showing the 3-day or 7-day average to smooth out the shape and show the trends more clearly on any graph that tends towards erratic daily data.
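For reference, a trailing average like this takes only a few lines. This is a minimal Python sketch with made-up daily counts; the function name and the choice of a trailing (rather than centered) window are my own assumptions.

```python
def trailing_average(values, window=7):
    """Average of each run of `window` consecutive values.

    The first result covers days 1..window, the next covers days
    2..window+1, and so on, so the output has len(values) - window + 1
    entries (or none if there are fewer than `window` values).
    """
    return [
        sum(values[i - window:i]) / window
        for i in range(window, len(values) + 1)
    ]


# Made-up daily counts for illustration only.
daily_counts = [1, 2, 3, 4, 5, 6, 7, 8]
smoothed_7 = trailing_average(daily_counts, window=7)
smoothed_3 = trailing_average(daily_counts, window=3)
print(smoothed_7)
print(smoothed_3)
```

A 7-day window also evens out any weekday versus weekend reporting differences, which a 3-day window would not.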
Issue 6: Testing data does not show how changes in size of testing could influence the percentage of positives.
The number of tests clearly matters. If we tested only those hospitalized with symptoms, the percentage would be way higher. If everyone got magically tested tomorrow, it would be lower. I like that it’s shown, but I’d like to see a line graph where I can see both trending together. When the percentage decreased, that data was used as the star evidence for us to re-open. It was not easy to see that this percentage decrease coincided with the quantity of tests increasing.
Solution to issue 6: I’m imagining adding a line graph that has 2 sets of axes that shows how the 2 different measurements might trend together. There may be a better way…
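Until there is a graph, even putting the two numbers side by side makes the point. This Python sketch uses invented weekly totals; it only shows the arithmetic, not a chart.

```python
# Made-up weekly totals for illustration only; not real testing data.
weekly = [
    {"week": 1, "tests": 1000, "positive": 200},
    {"week": 2, "tests": 2000, "positive": 300},
    {"week": 3, "tests": 4000, "positive": 400},
]

# Add the percent positive to each row so both measurements sit together.
for row in weekly:
    row["pct_positive"] = 100 * row["positive"] / row["tests"]

print("week  tests  positive  pct_positive")
for row in weekly:
    print(f"{row['week']:>4}  {row['tests']:>5}  {row['positive']:>8}  "
          f"{row['pct_positive']:>11.1f}%")
```

In this example the percent positive is cut in half over three weeks while the number of positive results doubles. Seeing the test count alongside the percentage is what lets a reader notice that.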
Issue 7: Projections are being made, and none of them are acknowledged or included.
Choosing to omit projections has the appearance of denying the work of professionals trying to provide an idea of what is in store for us. If I wanted to know the weather, I wouldn’t be satisfied with only current and past weather. I’d want to know the forecast, even if it includes a certain amount of uncertainty.
Solution to issue 7: List any known projections and provide links for people to check them.
Thank you for taking the time to read this. I hope these issues make sense. I hope that going forward, even if these issues are not solved in the ways I’d like them to be, an effort is made to minimize bias and avoid misleading the public.