Sunday, April 05, 2020

Coronavirus: the uniformity error

An earlier post on this blog, dated 3/24/20 and titled "Coronavirus:predictions," estimated that the total number of coronavirus infections in the U.S. would be about 400,000 and the number of deaths would be about 5,800. These predictions were not very good, and subsequent events demonstrate one reason why. The predictions were made at a time when the rate of change of observed cases in Washington state was 0.146, while that in New York state was 0.214. At the time of this post, however, 49% of the observed cases in the United States are in New York and New Jersey. Obviously, New York and New Jersey do not contain 49% of the U.S. population. One assumption that went into the erroneous prediction of 3/24/20 was that the trajectory of infections in New York would follow that in California and Washington state. In other words, the prediction contained an erroneous assumption of uniformity. If, for example, the current number of total cases in New York were proportional to that in California, the number of cases in New York would be 7,024 instead of the observed 122,031.
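The size of that gap can be made concrete with a little arithmetic. The sketch below is only an illustration, and it assumes that "proportional" means proportional by population (equal cases per capita). The function names are hypothetical. The only figures used are the two New York numbers quoted above; no population or California case counts are assumed.

```python
def expected_cases_under_uniformity(reference_cases, reference_population, target_population):
    """Cases the target region would have if its per-capita case rate
    matched the reference region's (the uniformity assumption)."""
    if reference_population <= 0:
        raise ValueError("reference_population must be positive")
    return reference_cases * target_population / reference_population


def excess_ratio(observed_cases, expected_cases):
    """How many times larger the observed count is than the uniform expectation."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return observed_cases / expected_cases


# Figures quoted in the post for New York: observed vs. expected under uniformity.
ratio = excess_ratio(122_031, 7_024)
print(f"Observed cases are about {ratio:.1f} times the uniform expectation")
```

With the numbers in this post, the observed count is roughly 17 times what the uniformity assumption would predict.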

This error of uniformity is perhaps the most common and significant impediment to predicting the course of coronavirus spread, and it takes many forms. It is, for example, erroneous to assume that everyone is equally susceptible to infection, or that the virus will spread at the same rate among different locations and among different populations. It is an error to try to "extrapolate" from one population to another, or from a small population to a large one. The wilder estimates, those that predicted millions of deaths in the United States or hundreds of thousands in the United Kingdom, were based largely on this error. This error propagates throughout epidemiologic models and unfortunately leads decision-makers to policies that are just as erroneous as the faulty assumption that underlies them.

This error is compounded by another, equally prevalent fallacy: that differences in data between two places, say Italy and Sweden, or South Korea and Spain, are accounted for by one or two factors, e.g. testing, demographics, or "not taking the virus seriously." This second folly depends on the first. It is assumed that there is uniformity between Italy and South Korea such that if the Italians had done precisely as the South Koreans did, they would have had precisely the same outcomes. Simple observation demonstrates that this is foolish, and that a more relevant factor is being overlooked.

Another related error is extrapolating from a report of a number of people being infected by seemingly trivial exposures to the notion that this is representative of how the infection spreads.

There are a number of observations that appear to be paradoxes that refute the erroneous assumption of uniformity, among them:

- the disproportionate number of cases in New York and New Jersey;
- the relatively high, but absolutely low, number of infections in San Marino and on the Diamond Princess;
- the anecdotes of "super spreaders," where a large number of people appear to have been infected by relatively trivial exposures;
- the relatively low prevalence of infection in the Italian town of Vò, and the currently low rate of positive antibody tests in San Miguel County, Colorado.

A reasonable hypothesis regarding how the virus spreads and how best to model it must account for these observations. One such hypothesis has been presented in previous posts and will be expanded upon subsequently.
