Understanding COVID-19 data: Comparing data across countries


This is the first of a series, presented by our partner SAS, that explores the role of data in understanding the COVID-19 pandemic. SAS is a pioneer in the data management and analytics field. (Check out other posts in the series on our Get Smart About COVID-19 Misinformation page.)

The COVID-19 pandemic has plunged us into a global public health crisis that has experts looking back 100 years for a comparison: the 1918 Spanish flu pandemic. However, a lot has changed since then. While our medical systems are significantly more robust, we are also more connected globally, which allows disease to spread rapidly in new ways. Something else is spreading rapidly as well, and it marks another huge shift since 1918: data.

Data is incredibly powerful. It gives us insight into the world around us and can help us quantify what’s happening so we can better understand a situation and make well-informed decisions. In recent years, as number-crunching technology has advanced, data has influenced almost every aspect of our lives: from the activity trackers we wear on our wrists to how social media platforms track our habits online. These developments have helped us verify — or refute — hunches, assumptions and generalizations about important issues based on quantifiable information.

Critical role during pandemic

This is especially critical now, as leaders make unprecedented decisions with potentially life-and-death consequences while the rest of us wait, wondering what comes next. At the same time, most of us are not experts at interpreting the public health data that describes different components of the crisis, which is presented through graphs, charts and statistics updated multiple times a day. How can we make the most sense of this data? How can we decide what to believe, what to dismiss and what might be causing unnecessary fear?

At SAS, we work with data every day. While we can’t answer health questions about the pandemic, we can help identify trends in the COVID-19 data being presented and point out possible problems with how it could be interpreted. In this series we will look at three main areas that deserve particularly careful attention: comparing data across countries; comparing data across time; and case fatality rate vs. mortality rate vs. risk of dying. Let’s begin:

Comparing COVID-19 data across countries

As residents of the United States, one piece of information that we might be interested in is how the disease is progressing in other countries, particularly those where infections first occurred and where leaders have implemented similar social distancing recommendations or community restrictions. We can use information from those nations to predict how the disease will progress here, how quickly that will happen, and when we might see light at the end of the tunnel.

But there are some big flaws in treating these as direct comparisons. While it’s helpful to try to learn lessons from other nations, we must remember that the data doesn’t tell the whole story of what is happening. For example, Italy has the highest case fatality rate of any nation. (The U.S. Centers for Disease Control and Prevention defines case fatality rate as “the proportion of persons with a particular condition (cases) who die from that condition. It is a measure of the severity of the condition.”)
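
To make the definition concrete, here is a minimal Python sketch of the calculation: deaths divided by confirmed cases. The function name and all of the numbers are hypothetical, chosen only for illustration, and are not real figures for any country. The sketch assumes two nations with the same number of deaths but different numbers of detected cases.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Return the proportion of confirmed cases that ended in death."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    if not 0 <= deaths <= confirmed_cases:
        raise ValueError("deaths must be between 0 and confirmed_cases")
    return deaths / confirmed_cases


# Hypothetical numbers, for illustration only (not real data).
# Both scenarios assume the same 100 deaths; only the number of
# cases that testing managed to confirm differs.
narrow_testing = case_fatality_rate(deaths=100, confirmed_cases=1_000)
broad_testing = case_fatality_rate(deaths=100, confirmed_cases=5_000)

print(f"Narrow testing: {narrow_testing:.1%}")
print(f"Broad testing:  {broad_testing:.1%}")
```

With these made-up inputs, the reported rate is 10% in one scenario and 2% in the other, even though the number of deaths is identical. The rate depends as much on how many cases are counted as on how deadly the disease is.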

Differing factors

In the case of Italy, we must ask: Is this because of the higher proportion of elderly adults in the population? Is it because of an overwhelmed hospital system? Is it a sign that fewer people with mild symptoms are requesting or receiving testing? The answer is probably some combination of those elements.

Here are some factors to consider:

  • Nations vary widely in the extent of diagnostic testing being conducted.
  • The rate of and criteria for testing greatly impact rates of detection.
  • Most testing can tell us only if a person is currently infected and not if that person had been infected and since recovered.

Those are just some of the factors that may contribute to the discrepancies we see in the chart below.

We see wide variation in the extent of testing among nations and, therefore, in the rate of positive tests. Countries with a high rate of positive results relative to the number of tests conducted are likely applying stricter testing criteria. Countries with a higher proportion of negative results are likely testing more widely and, as a consequence, might also detect a higher percentage of actual cases.
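
The rate of positive tests is simple arithmetic: positive results divided by total tests conducted. The short Python sketch below shows the calculation for two made-up countries. The names "Country A" and "Country B", the function name and every count are assumptions for illustration only, not reported data.

```python
def positivity_rate(positive_tests: int, total_tests: int) -> float:
    """Return the share of conducted tests that came back positive."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= positive_tests <= total_tests:
        raise ValueError("positive_tests must be between 0 and total_tests")
    return positive_tests / total_tests


# Hypothetical (positive, total) test counts, for illustration only.
test_counts = {
    "Country A": (4_000, 10_000),    # few tests, many of them positive
    "Country B": (2_000, 100_000),   # many tests, few of them positive
}

rates = {
    name: positivity_rate(positive, total)
    for name, (positive, total) in test_counts.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} of tests came back positive")
```

In this invented example, Country A reports 40% positive and Country B reports 2%. A gap like that may say more about who is being tested than about how widespread the disease is.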

One of the big issues in trying to compare COVID-19 data among nations is that it’s impossible to know, or control, all the factors that go into the numbers reported. When scientists design experiments to thoroughly understand a phenomenon, they control the conditions and influences as carefully as possible. But the data surrounding the novel coronavirus isn’t coming from careful experiments; it’s coming from the real and messy world, where we can’t control or even measure all the factors that go into different national responses or testing practices, and where those factors keep changing at varying rates.

About SAS: Through innovative analytics software and services, SAS helps customers around the world transform data into intelligence.
