21st of April 2020
A series on understanding the spread of the SARS-CoV-2 virus and the death rates of its disease, COVID-19. Data, Statistics and Modelling.
We know that data on COVID-19 is incomplete and its statistics biased. What we do not know is how biased they are. By comparing the most comparable statistics, we can learn what we know and what we do not know about this epidemic. However, let us bear in mind a couple of important facts: (i) the COVID-19 data is not representative of populations, while the seasonal flu data is; (ii) the COVID-19 data covers a period of only four months, while estimates for the seasonal flu are yearly. Apart from that, both diseases are caused by a virus and have similar symptoms. What we do not know is whether the infection rates and case-fatality rates (the share of deaths among those infected) are similar as well.
Below are graphs of the WHO data on COVID-19 cases and deaths for the 88 most affected countries, excluding Africa. This data is compared to the estimates of seasonal flu related cases and deaths. Please read the disclaimer below before judging the graphs, which compare countries on the following two statistics:
1. (COVID-19 related deaths / population of a country) * 100, compared to the estimates of seasonal flu related deaths for the USA and the World
2. (COVID-19 positive tested cases / population of a country) * 100, compared to the population-based seasonal flu infection rates in the USA
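Both statistics above are simple percentages of a country's population. A minimal sketch in Python, assuming the counts and the population are available as plain numbers; the function name and the figures are hypothetical placeholders for illustration, not values from the WHO dataset:

```python
def percent_of_population(count, population):
    """Return count as a percentage of population: (count / population) * 100."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100


# Hypothetical placeholder figures, NOT real data.
example = {"deaths": 500, "positive_cases": 10_000, "population": 1_000_000}

# Statistic 1: COVID-19 related deaths as a percentage of the population.
death_pct = percent_of_population(example["deaths"], example["population"])
# Statistic 2: positive tested cases as a percentage of the population.
case_pct = percent_of_population(example["positive_cases"], example["population"])

print(f"deaths: {death_pct:.3f}% of population")
print(f"positive tests: {case_pct:.3f}% of population")
```

Note that the second statistic divides by the whole population, not by the number of people tested, so it depends directly on how much a country tests.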
For more on comparing COVID-19 data to seasonal flu data, read here: Can we compare it?
These countries are labelled as the most affected based on data collected by the World Health Organisation (WHO) on COVID-19. Countries use different testing protocols in terms of who they test and how many people they test. The number of people tested is not proportional to the size of a population, which biases comparisons across countries (some countries test more than others). This likely depends on resources and the availability of tests. Sufficient data are not available to account for these biases.
It may be safe to assume that the EU countries follow the same testing protocols, so comparisons across the EU countries come with less uncertainty.
The number of deaths related to COVID-19 is likely under-reported; for example, not every person who died during that period was tested for the presence of the virus.
On the other hand, seasonal flu is well studied and understood, and its infection rate estimates are reliable. For the virus that causes COVID-19, no study has yet been done that reliably estimates this statistic.
A testing strategy that includes only those who experience symptoms of the disease, and excludes those who have had no symptoms or only mild symptoms, does not give an accurate view of the state of this epidemic. The graphs below demonstrate a consequence of deriving information from such incomplete data.
Incomplete data produce biased insights. For example, measured infection rates for COVID-19 are only a fraction of the seasonal flu infection rates, whereas the number of COVID-19 deaths relative to the population of a country is, for some countries, much higher than for seasonal flu. Keep in mind that the seasonal flu ratio is a yearly estimate, representative of a population, while the COVID-19 ratio covers only the last four months. If this data were considered complete, we would conclude that the virus is not very infectious but far more deadly than the flu. However, this is not true. These insights are biased because the available data is incomplete.
According to existing research on COVID-19, the most affected population group is the elderly, and a high number of deaths is also related to health-care systems being overwhelmed. However, the age structure does not differ much across the EU countries, although Italy is indeed the oldest European nation.
Another important fact to account for when putting the information in the graphs below into perspective is the severity of lockdown policies across countries. The well-known countries with the mildest (partial) lockdown policies are Sweden, the Netherlands, Switzerland and Singapore, with the latter being the most advanced in using data, science and technology to guide policies for controlling this epidemic.
Below is the list of all the countries included in the dataset from which the two graphs are derived. If you do not find a country of your choice on the graphs, it is because its values are too small to be among the 38 highest values for COVID-19 data. This does not mean that other countries are less affected. They might simply have fewer resources to test for the disease. As a result, the number of recorded infections is lower, and consequently so is the number of deaths related to COVID-19. More data is needed to evaluate this claim.
Albania, Algeria, Argentina, Armenia, Australia, Austria, Azerbaijan, Bangladesh, Belarus, Belgium, Bhutan, Bosnia and Herzegovina, Brazil, Brunei, Bulgaria, Cambodia, Canada, Chile, China, Colombia, Costa Rica, Croatia, Cuba, Cyprus, Czech, Denmark, Ecuador, Egypt, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, Iceland, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Japan, Kazakhstan, Korea, Kyrgyzstan, Latvia, Lebanon, Liechtenstein, Lithuania, Luxembourg, Malaysia, Maldives, Malta, Mexico, Moldova, Monaco, Mongolia, Montenegro, Nepal, Netherlands, New Zealand, North Macedonia, Norway, Panama, Peru, Philippines, Poland, Portugal, Romania, Russia, San Marino, Serbia, Singapore, Slovakia, Slovenia, Spain, Sri Lanka, Sweden, Switzerland, Thailand, Turkey, Ukraine, United Kingdom, USA, Uzbekistan, Vietnam.
‘Total’ represents all the cases in the world.
Use the above information to understand the uncertainties in the information shown by the two graphs below.
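To illustrate how a cut-off at the 38 highest values leaves other countries off the graphs, here is a minimal sketch in Python. The function name, country labels and values are hypothetical, and this is not necessarily how the graphs were produced:

```python
def top_n(values, n=38):
    """Return the n (name, value) pairs with the highest values, largest first."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical placeholder percentages, NOT real data.
rates = {"Country A": 0.02, "Country B": 0.30, "Country C": 0.11, "Country D": 0.05}

# With a cut-off of 2, only the two highest values make it onto the graph.
shown = top_n(rates, n=2)
hidden = [name for name in rates if name not in dict(shown)]

print("shown:", shown)
print("not shown:", hidden)
```

A country that is not shown has a low recorded value, which may reflect little testing as much as little disease.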
Warning: The graphs do not reflect the true state of the COVID-19 epidemic, but only the part on which data is available. More on the missing COVID-19 data can be found here: Data as a guide to balance societal trade-offs in COVID-19 epidemic and here: What data and statistics can and cannot reveal about COVID-19 disease.
Please read the above disclaimer to learn about the data used and its unquantifiable uncertainty. For questions and comments, get in touch via info@tarastats.com.
ADDITIONAL INFORMATION
New York has a younger population than San Marino.
San Marino has a more inclusive health care system than New York.
Saying that someone has died because of COVID-19 can be interpreted in different ways. Was COVID-19 the single cause contributing to the death of a person, or were there also other causes, for example chronic diseases, weakened immunity, air pollution, smoking, alcohol use, etc.?
We should ask: would the person still be alive if COVID-19 had not affected them? Or would another disease have ended that person's life at exactly the same time as COVID-19 did? These are challenging questions that are difficult to answer with precision.
Countries may use different guidelines when reporting COVID-19 deaths. It is clear that if a person was not tested before death, it is difficult to claim that the person died because of COVID-19 (unless testing was performed on dead bodies, which should be recorded if done). Therefore, countries that test more will also record more COVID-19 related deaths.
The above graphs make it look as if the developed world is the most affected, but such an interpretation is misleading, because it is also the developed world that has the most resources and tests most vigorously.
Tarastats Statistical Consultancy | Kampinkuja 2, 00100 Helsinki, Finland | www.tarastats.com | info@tarastats.com
Tarastats Statistical Consultancy © 2025