A couple of days ago, I read John Ioannidis' opinion piece in Stat News on Covid-19, as well as tweets raging about, or defending, it. I quickly posted a rebuttal. A large part of the opinion piece centered on his estimates of the potential risk of dying from the coronavirus disease, Covid-19. But no sources were provided and the estimates were complicated, so checking them was going to need more time. I've had a chance to do that now - helped along by people on Twitter who responded to my post (thank you everyone!).

Let's start with some context. Influenza pandemics have a case fatality rate (risk of dying if you're diagnosed) of under 0.1%. That's fewer than 1 in 1,000 people. [Update] However, that's based on CDC modeling in the US, and is itself a contestable estimate. The 1918 flu pandemic that killed over 50 million people – and maybe as many as 100 million – had a case fatality rate of over 2.5% (source). This massive toll was because the 1918 virus was a new one that was easily transmitted, so it spread far and wide in people with no immunity. Seasonal flu kills a lot of people, sadly, but fortunately only a few percent of the population generally get infected. With Covid-19, it's a new virus (SARS-CoV-2), and it can spread quickly. That means that even a case fatality rate (CFR) that sounds low could have devastating consequences. (There's a great explanation of this in an interview with epidemiologist Steve Goodman: you can read a summary here, or listen to it here.)

Ioannidis reckoned this out:

If we assume that case fatality rate among individuals infected by SARS-CoV-2 is 0.3% in the general population — a mid-range guess from my Diamond Princess analysis — and that 1% of the U.S. population gets infected (about 3.3 million people), this would translate to about 10,000 deaths. This sounds like a huge number, but it is buried within the noise of the estimate of deaths from “influenza-like illness.” If we had not known about a new virus out there, and had not checked individuals with PCR tests, the number of total deaths due to “influenza-like illness” would not seem unusual this year. At most, we might have casually noted that flu this season seems to be a bit worse than average. The media coverage would have been less than for an NBA game between the two most indifferent teams.

There are 2 key components here: (a) an average CFR for the population in the US of 0.3%, and (b) an infection rate of 1% of the US population. I couldn't find anything in Ioannidis' opinion piece that explains how he decided on an infection rate of only 1% for this disease. Infectious diseases epidemiologist and microbiologist, Marc Lipsitch, called the idea of an "exponentially growing uncontrolled pathogen" that infects only 1% of people "wishful fantasy". We'll come back to what other people are suggesting later.
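To make the arithmetic behind that 10,000 figure explicit, here is a minimal sketch in Python. The population of 330 million is an assumption, inferred from Ioannidis' "about 3.3 million people" for 1%. The function name and the extra infection rates in the loop are hypothetical, chosen only to show how sensitive the result is to component (b); they are not forecasts.

```python
US_POPULATION = 330_000_000  # assumption, inferred from "1% ... (about 3.3 million people)"


def projected_deaths(infection_rate, fatality_rate, population=US_POPULATION):
    """Deaths = population x proportion infected x proportion of infected who die."""
    return population * infection_rate * fatality_rate


# Ioannidis' scenario: 1% infected, 0.3% fatality rate.
ioannidis_scenario = projected_deaths(0.01, 0.003)
print(f"1% infected at 0.3%: about {ioannidis_scenario:,.0f} deaths")

# Hypothetical infection rates (illustration only), same 0.3% fatality rate.
for rate in (0.05, 0.10, 0.20):
    deaths = projected_deaths(rate, 0.003)
    print(f"{rate:.0%} infected at 0.3%: about {deaths:,.0f} deaths")
```

The first line comes out at about 9,900, which Ioannidis rounds to 10,000. Because the result is a simple product, doubling the assumed infection rate doubles the projected deaths, so the 1% assumption carries as much weight as the CFR.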

Ioannidis does, at least partly, describe a chain of steps he took to get to that CFR of 0.3%, though. That trail starts with the cruise ship, the Diamond Princess, that ended up quarantined off the coast of Japan. This is why he chose that starting point:

The one situation where an entire, closed population was tested was the Diamond Princess cruise ship and its quarantine passengers. The case fatality rate there was 1.0%, but this was a largely elderly population, in which the death rate from Covid-19 is much higher.

He doesn't say what data source he used. Let's dig into this.

Timothy Russell and colleagues have posted a preprint about this outbreak, with details of their calculations, and that's what I'll use here, unless I state otherwise.

The Diamond Princess outbreak started because 1 passenger had the virus when he boarded the ship on 20 January. There were 3,711 passengers and crew, and at some point, some attempts were made to stem the outbreak on board, and the whole ship ended up quarantined.

By 20 February, 634 people had been diagnosed, and that was the data used for Russell & co's calculations. According to the tally being kept on Wikipedia, by now 712 people on board have been confirmed to be infected, of whom 7 or 8 have died. [Update 17 April: 14 have now died, bringing the current mortality rate to 2.0%.] According to Worldometers, of those 712, a substantial portion – 158 – have yet to recover. [Update 17 April: 7 are still listed as critical, but that may include one of the people who have since died.] As Russell and colleagues write, "the ratio of reported deaths to reported cases to date, will underestimate the true CFR because the outcome (recovery or death) is not known for all cases". It's just too soon to know the final outcomes for the unlucky people on that cruise.
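As a quick check on those figures, here is the naive calculation (reported deaths divided by confirmed cases) as a short Python sketch. The function name is made up for illustration, and the numbers are the ones quoted above from the Wikipedia tally. It deliberately ignores the problem Russell and colleagues point out: people who have not yet recovered are counted as if they will all survive.

```python
def naive_cfr(deaths, confirmed_cases):
    """Reported deaths / reported cases. Too low while outcomes are still pending."""
    return deaths / confirmed_cases


CONFIRMED = 712  # confirmed infections, per the Wikipedia tally quoted above

for deaths in (7, 8, 14):  # 7 or 8 at first; 14 as of the 17 April update
    print(f"{deaths} deaths / {CONFIRMED} cases = {naive_cfr(deaths, CONFIRMED):.1%}")
```

With 7 deaths this gives the 1.0% that Ioannidis cites, and with 14 deaths it gives the 2.0% in the update: the same group of cases, counted a few weeks apart.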

What's more, although Ioannidis wrote that the "entire, closed population was tested", that's not the case. Only 3,063 people were tested, so we don't know for sure how many people were infected. That means there are substantial unknowns to left and right, and it's only possible to calculate the infection fatality rate (IFR) with estimates of those missing numbers. The IFR is what can tell us what the chances of dying are if we are infected. The CFR in this data tells us what the chances might be if we're among a similar group diagnosed under the set of testing circumstances on that ship.
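The difference between the two rates comes down to the denominator, which a small Python sketch can show. The deaths and confirmed cases are the 17 April figures mentioned above. The numbers of undiagnosed infections are purely hypothetical, since nobody knows the true figure, and this is not the method Russell and colleagues used.

```python
ON_BOARD = 3711  # passengers and crew
TESTED = 3063    # people tested, per the Russell preprint

untested = ON_BOARD - TESTED
print(f"Tested: {TESTED / ON_BOARD:.1%} of those on board; not tested: {untested}")


def fatality_rate(deaths, diagnosed, undiagnosed=0):
    """CFR when undiagnosed is 0; an IFR-style rate once undiagnosed infections are added."""
    return deaths / (diagnosed + undiagnosed)


DEATHS, DIAGNOSED = 14, 712

# Hypothetical counts of infections that testing missed (illustration only).
for undiagnosed in (0, 100, 300):
    rate = fatality_rate(DEATHS, DIAGNOSED, undiagnosed)
    print(f"{undiagnosed} undiagnosed infections: {rate:.2%}")
```

The more infections that went undetected, the lower the IFR falls relative to the CFR, which is why the IFR can only be estimated here rather than read off the ship's data.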

The Russell preprint tackles the problem of not knowing how many of the people who have not yet recovered might still die. For their CFR, they take into account what we know about the time from hospitalization to death in the Wuhan outbreak. They also estimated an IFR.

This is what Ioannidis said:

The case fatality rate there was 1.0%, but this was a largely elderly population, in which the death rate from Covid-19 is much higher. Projecting the Diamond Princess mortality rate onto the age structure of the U.S. population, the death rate among people infected with Covid-19 would be 0.125%. But since this estimate is based on extremely thin data — there were just seven deaths among the 700 infected passengers and crew — the real death rate could stretch from five times lower (0.025%) to five times higher (0.625%).

This is how his CFR calculation compares to Russell and co's CFR and IFR:


Rate          Ioannidis                Russell
CFR (range)   1.0% (0.025 to 0.625)    2.3% (0.75 to 5.3)
IFR (range)   n.a.                     1.2% (0.38 to 2.7)

We don't know how he arrived at the extrapolation of 0.125% for the US population. It's lower than the 1.0%, though, because the passenger/crew mix was older (an average age of 58) than the general population. Because of the unknowns, Ioannidis then assumed:

reasonable estimates for the case fatality ratio in the general U.S. population vary from 0.05% to 1%. 

And then he chose 0.3%, mid-range between 0.05% and 1%.

In the Russell preprint, the raw CFR for China was 3.7% – quite a lot more. Which brings us to one of the other key issues about using the Diamond Princess as an anchor: the health care available to the people with symptoms on the ship isn't going to reflect what would happen in an uncontrolled outbreak in a much larger community. And the tourists and crew would be healthier than the general population (a point acknowledged by Ioannidis); the passengers would also be likely to have above-average incomes compared with the US population. The cheapest berth on the Diamond Princess is about US$100 a day.

On the other hand, Marc Lipsitch wrote in his Stat News response to Ioannidis' piece, "In Wuhan, at the peak of the epidemic there, critical cases were so numerous that, if scaled up to the size of the U.S. population, they would have filled every intensive care bed in this country". 

Where does this leave us? Ioannidis' 1.0% CFR is too low for the Diamond Princess, and it's not clear how the conditions are applicable to the US population: the 0.3% CFR seems likely to be too low, too. The 1% rate of people infected, however, is likely to be very wide of the mark. It's far lower than the infection rate on the Diamond Princess (which was more like 20%), and it is much lower than occurs in an influenza pandemic, too, where the community almost always has some protection via exposure or vaccination.

Neil Ferguson and colleagues from Imperial College estimated that without any control measures, 81% of the populations of the UK and USA could be infected in the pandemic, with over 4% requiring hospitalization. And if the health systems had enough hospital and intensive care beds available, at a 1% CFR they predict "approximately 510,000 deaths in GB and 2.2 million in the US". Ioannidis' 10,000 deaths in the US scenario doesn't seem to be a credible possibility without extraordinary success in limiting the spread of the virus and timely access to the best possible care for everyone with severe disease.

So far, we've only looked at a single ship and China. We're seeing very different experiences elsewhere as this pandemic unfolds. Comparing rates in each place is problematic, given the different rates of testing, different health care available, and different practices for identifying Covid-19 as the cause of death.

Nevertheless, the Centre for Evidence-Based Medicine (CEBM) in Oxford has done a meta-analysis of the data from Worldometers, and they are updating it daily. (I've written a blog post on understanding data in meta-analyses, if you would like an explainer – although it focuses on meta-analysis in clinical trials.) First off, some countries aren't included (I'm not sure why). The meta-analyses below have not been widely scrutinized or peer reviewed, so take that into account – and that applies to my post, too. And the mostly low rates of testing and other differences mean we don't have truly comparable data everywhere. This is a far more complicated question than it might seem, and we are only at the beginning of people analyzing what is happening globally in enough detail.

In the first meta-analysis (screenshot here), the predictive interval calculated was 0.68% to 6.61%. However, others have pointed out that a different meta-analytic technique would be more appropriate for this data set because the data aren't comparable enough. Jean-Paul Salameh re-calculated using 2 types of model, and that shows what a difference analytic choices make. With a random effects model, the CFR was 1.98% (1.34 to 2.9); with a fixed effects model, he arrived at a CFR of 4.17% (4.08 to 4.26). And keep in mind, too, that the CFR is only for known cases: few places have enough testing to give you a good idea of how many people are actually infected, especially when they don't have symptoms.
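To show why the choice of model matters so much, here is a minimal Python sketch of one common textbook approach: inverse-variance pooling of logit-transformed proportions, with the DerSimonian-Laird method for the random effects model. This is an assumption about technique, not a reconstruction of what CEBM or Jean-Paul Salameh actually ran, and the (deaths, cases) pairs are invented for illustration; they are not the Worldometers data.

```python
import math

# Hypothetical (deaths, cases) pairs for four places. NOT real data.
studies = [(30, 1000), (400, 8000), (12, 900), (150, 2500)]


def logit_and_variance(deaths, cases):
    """Log-odds of death and its approximate variance for one data set."""
    survivors = cases - deaths
    return math.log(deaths / survivors), 1 / deaths + 1 / survivors


def inverse_logit(x):
    """Convert log-odds back to a proportion."""
    return 1 / (1 + math.exp(-x))


def pool(studies):
    """Return (fixed effect CFR, random effects CFR, between-study variance tau^2)."""
    ys, vs = zip(*(logit_and_variance(d, n) for d, n in studies))

    # Fixed effect: weight by inverse variance, so big data sets dominate.
    w = [1 / v for v in vs]
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)

    # Random effects: adding tau^2 to each variance evens out the weights.
    w_re = [1 / (v + tau2) for v in vs]
    random_effects = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)

    return inverse_logit(fixed), inverse_logit(random_effects), tau2


fixed_cfr, random_cfr, tau2 = pool(studies)
print(f"Fixed effect CFR:   {fixed_cfr:.2%}")
print(f"Random effects CFR: {random_cfr:.2%}")
print(f"tau^2 (between-study variance): {tau2:.3f}")
```

In a fixed effect model, the places with the most cases dominate the pooled rate. A random effects model assumes the underlying rate genuinely differs from place to place, so smaller data sets get relatively more weight. When the data sets disagree as much as these do, the two answers can be far apart.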

We have to monitor and plan, but with so many moving parts when extensive efforts are made to limit the spread of disease, it's too soon to expect a precise, predictive estimate that won't change much. Be careful of any "hot take" that is radically different to what the pandemic experts are telling us. There is one prediction about this Covid-19 pandemic that I am prepared to make with absolute certainty: in the coming weeks and months, you will meet too much false precision!



Postscripts to this postscript:

1. Ioannidis published an editorial in a journal which I had not read when I wrote this. That editorial expands on most of the arguments he made in his opinion piece in Stat News, and adds discussion of several issues. It does not repeat several of the claims that have been so highly criticized. His position on the likely CFR of Covid-19 in that editorial includes this statement:

The most complete data come from Diamond Princess passengers, with CFR=1% observed in an elderly cohort; thus, CFR may be much lower than 1% in the general population; probably higher than seasonal flu (CFR=0.1%), but not much so.

2. The University of Padua tested all the residents of a small town in Italy after its first 2 infections were found, but I have only seen sketchy details and a variety of numbers are out there. This report says 2,778 people were tested in Vò. Testing the whole town found 66 people were infected (about 2.4%). All of these people were isolated, according to one of the researchers who was interviewed, including some who had no symptoms, and a subsequent test found this had stopped the outbreak.

3. Ioannidis has done an interview in a podcast, with the Sunday Times tech correspondent, Danny Fortson, which I haven't listened to.


Hilda Bastian

Originally posted 21 March 2020

Last updated 4 May 2020


With many thanks to the people who asked questions and raised issues in response to my post. And particular thanks to James Heilman, for helping out with information about the Diamond Princess outbreak and more importantly, for his massive role in keeping the Wikipedia page on Covid-19 on course, visited by hundreds of thousands of people a day in English alone (plus the other related pages).

[Updates 21 March] The first version of this post incorrectly referred to the predictive interval in the CEBM meta-analysis as a confidence interval - thank you to Jesper Kivelä for alerting me to my error. Additional correction: I noticed Ioannidis did in fact refer to worst case scenarios by others, so I deleted this sentence from the original, "This was, in effect, his best case scenario, and the only scenario he provided". And another: corrected misspelling of Neil Ferguson's name - thanks to Robert McMullen for alerting me to the error.

Added the postscript on the new editorial - thank you to @ExerciseBiology for alerting me to it. And the postscript on the town of Vò: thank you to Harlan Campbell for alerting me to that. Then the podcast: thank you to @jobhunter50 for alerting me to that. 

[Update 22 March] Corrected error about fatality rate of influenza pandemics - the original said less than 1%, instead of less than 0.1% (which is under 1 in 1,000). Thank you to James A. Smith for alerting me to this.

[Update 25 March] Updated numbers for the Diamond Princess outbreak.

[Update 17 April] Updated numbers for the Diamond Princess outbreak.

[Update 4 May] Added note that the data on flu mortality are modeled, not actual, data.