Thursday, April 09, 2020

Coronavirus: Data

There is probably more about the coronavirus that we do not know than that we do. Our knowledge is constrained by the available evidence, which consists primarily of reports of daily new cases, deaths, and the total number infected. No reasonable person expects any of these data to be accurate in themselves. There are too many sources of uncertainty, too many variables that affect the ability to characterize the extent of the contagion with precision at any point. The best we can do is to assume that the general trends and tendencies of the reported data reflect similar trends and tendencies in the wider world. Even this is challenging.

The data are affected not only by changes in the course of infection, but also by constant changes in the availability and use of testing, by suspected manipulation and misreporting of data for non-epidemiologic ends, by changes in procedures for attributing particular symptoms or deaths to coronavirus without detailed investigation, and by the lack of standardization from place to place or from time to time within the same place.

A paradox arises because we rely on models to provide surrogate data for the things we cannot readily observe, but these models depend on the data that are available, and these are of questionable consistency. As a consequence, the models upon which we rely for planning and policy are likewise of questionable consistency.

A main source of uncertainty is the attribution of symptoms or deaths to coronavirus in the absence of testing. There was a dramatic increase in the number of reported cases in China on February 12, 2020, when the criteria for diagnosis changed from a laboratory-verified standard to a "clinical" one. We would expect such a change to increase the number of reported cases, and also that the new number would include people who are not in fact infected with coronavirus. This does not improve our knowledge of the state of the contagion, because other sources of uncertainty remain: people who are neither tested for the virus nor symptomatic may nonetheless have it. Changing the criteria has the primary effect of increasing the uncertainty, rather than increasing the accuracy of assessment.

A similar principle applies to attributing deaths to the virus. Doing so without a laboratory-confirmed diagnosis, and instead allowing "suspected" cases to be counted, is likely to overestimate the true number of coronavirus-related deaths. Changing the criteria creates uncertainty, and we are likely much better off with a system that yields an inaccurate count but a reliable profile of the trends and volatility of observed cases. Changing the criteria by which deaths are attributed to coronavirus taints both the absolute numbers and the inferences that may be drawn from them.
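The point about consistency can be illustrated with a small numerical sketch. Everything in it is assumed for illustration: the case counts are invented, the 20% daily growth, the 40% and 70% detection fractions, and the helper name growth_ratios are all hypothetical, and none of it is drawn from real reporting. It shows only that a constant undercount leaves day-over-day growth intact, while a mid-stream change in criteria produces a jump that has nothing to do with the spread itself.

```python
def growth_ratios(counts):
    """Day-over-day ratios: the trend signal, independent of absolute level."""
    return [later / earlier for earlier, later in zip(counts, counts[1:])]

# Hypothetical "true" daily cases growing 20% per day (invented numbers).
true_cases = [100 * 1.2 ** day for day in range(8)]

# Consistent criteria: an assumed 40% of true cases are detected every day.
consistent = [0.4 * c for c in true_cases]

# Changed criteria: the assumed detection fraction jumps from 40% to 70% on day 4.
changed = [(0.4 if day < 4 else 0.7) * c for day, c in enumerate(true_cases)]

true_r = growth_ratios(true_cases)
consistent_r = growth_ratios(consistent)
changed_r = growth_ratios(changed)

print("true:      ", [round(r, 2) for r in true_r])
print("consistent:", [round(r, 2) for r in consistent_r])
print("changed:   ", [round(r, 2) for r in changed_r])
```

Under these assumptions the consistent series undercounts every day yet reports the same ratio as the true series, whereas the changed series shows an apparent surge on the day the criteria shift (1.2 multiplied by 0.7/0.4) and the ordinary ratio on every other day.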

Attributing deaths to coronavirus in the setting of pre-existing conditions is inherently subjective. This is especially so in cases attended by significant pre-infection debility and frailty. It would be helpful, though not practical, to attribute deaths to coronavirus only if the person's life expectancy would have been at least a year in the absence of the virus. It used to be a principle of criminal law that an injury inflicted by a defendant was not a cause of death if the victim survived the assault by more than one year. Assigning causation is always fraught with error and uncertainty, and changing the criteria for doing so in the middle of a pandemic would seem to serve little purpose. This, again, assumes that the information of true interest is the pattern and trends observable in the data, which reflect the course of the spread even if the absolute values either overestimate or underestimate the true circumstance.

These uncertainties are obvious in the constant state of dispute regarding the mortality of the virus, the number of asymptomatic carriers, and the R0. They are exacerbated by misguided changes in assessment and attribution criteria. The errors associated with accepting subjective determinations of infection are illustrated by the data published by the Colorado Department of Health. These data reveal that, of all tests administered, no more than 20% are positive. It should be assumed that people are tested because they have symptoms consistent with the infection, or are at an elevated risk of contracting it. We may assume that of those tested based on symptoms, a certain number would have been diagnosed with coronavirus under the "clinical" criteria adopted in China, and thus the number of cases would be falsely increased. Countering this, a number of actually infected patients, symptomatic or not, would not be tested, either for logistical reasons or for reasons personal to the individual. These omissions would result in under-assessment of the true circumstance.

The upshot of all of this is that it is more important to have a consistent method of diagnosing cases and attributing deaths, both geographically and over time, than to be constantly tweaking criteria in an ultimately pointless attempt to make sure all cases are counted. At a certain point the experts will be tempted to alter the criteria of measurement to conform to the models, rather than interpreting the data as they are to improve those models.
