Tuesday, March 31, 2020

Modeling coronavirus: duration of infectivity.

Simple curve-fitting to previous data can be used to try to predict future values of total daily cases, deaths and so forth, but this process is wholly independent of plausible disease-spread mechanisms and gives little insight into which control strategies might be most useful. As mentioned in the previous post, two variables determine the shape of the coronavirus infection curve and enable us to make at least some ballpark predictions. One of these is at least partially modifiable by human intervention; the other is not, being a function of the virus and the resulting disease. The first variable is related to the ease with which the virus is transmitted, and was previously denoted by the ratio D/N, where D is the number of daily cases and N is the number of total cases. This only works for simple models, however, and a more general approach to this variable is needed. The second variable reflects the fact that, in the real world, each person infected with the coronavirus is only capable of transmitting the infection for a certain amount of time, which can be represented by an average over a large number of infections.

We model the daily change in infections by multiplying the ratio representing ease of transmission by the total active cases. The total active cases can be computed by subtracting the total number of cases T days previously from the current number of total cases, where T is the average number of days that an infected person is capable of transmitting the disease. This number should be approximately fixed, and can be inferred from actual data. From this we can also compute the daily rate of change in the number of active cases (the current day's active cases minus the previous day's active cases, with the difference divided by the previous day's active cases). When we do this, we get a curve that looks like the following:
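The update rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet used for Figure 1: the function name simulate, the seed of one initial case, and the example values of r and T are all hypothetical.

```python
def simulate(r_values, T, initial_cases=1.0):
    """Sketch of the model: new cases = r * previous day's active cases.

    r_values      -- one transmission ratio per simulated day (assumed input)
    T             -- average number of days a case stays infectious
    initial_cases -- hypothetical seed; day 0 starts with this many cases
    Returns three lists indexed by day: total cases, active cases, daily new cases.
    """
    total = [initial_cases]
    active = [initial_cases]
    daily = [initial_cases]
    for r in r_values:
        day = len(total)                      # index of the day being computed
        new_cases = r * active[-1]            # ratio times previous day's active cases
        total.append(total[-1] + new_cases)
        # Cases counted T days ago are assumed to be no longer infectious.
        past_total = total[day - T] if day - T >= 0 else 0.0
        active.append(total[day] - past_total)
        daily.append(new_cases)
    return total, active, daily


def active_growth_rate(active):
    """Daily fractional change in active cases, as defined in the text."""
    return [
        (today - yesterday) / yesterday
        for yesterday, today in zip(active, active[1:])
        if yesterday > 0
    ]


if __name__ == "__main__":
    # Hypothetical inputs: constant r of 0.25 for 30 days, T of 14 days.
    total, active, daily = simulate([0.25] * 30, T=14)
    print(active_growth_rate(active)[:5])
```

Before day T nobody has left the active pool, so active cases equal total cases and the growth is purely exponential; after day T the two series separate.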

Figure 1.

We can compare the general shape of the curve with the actual daily new cases curve from South Korea:

Figure 2.

Figure 1 was generated simply from basic principles, i.e. that the initial spread follows an exponential profile, that the rate of growth depends on the number of presumed active cases rather than total cases, and that the length of time that the average infected person can transmit the disease is limited.

If we look at the daily values for D/N and for the factor that we multiply by the number of active cases, we notice that the two diverge the longer the epidemic continues. This is because D/N is computed using the total number of cases, and makes no allowance for the fact that the earliest cases are no longer capable of transmitting the virus. For this reason, we will replace the basic D/N, which works for purely exponential growth and is valid early in the epidemic, with a more general variable, r. Despite the change in variable and computation, r still represents the ease with which the virus is transmitted and can be modified by interventions such as social distancing, wearing masks, aggressive hand washing, etc.
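The divergence between D/N and r can be illustrated with a short Python sketch that computes both from a series of cumulative case counts. The function name transmission_ratios is hypothetical, and using the previous day's totals as the denominators is an assumption made here so that the two ratios coincide during the purely exponential phase; the previous post may have defined N slightly differently.

```python
def transmission_ratios(total, T):
    """Return (day, D/N, r) for each day, given cumulative totals and T.

    Assumptions (not from the original post): N is the previous day's
    total, and active cases are the previous day's total minus the total
    T days before that.
    """
    rows = []
    for t in range(1, len(total)):
        D = total[t] - total[t - 1]            # daily new cases
        N = total[t - 1]                       # total cases so far
        past = total[t - 1 - T] if t - 1 - T >= 0 else 0.0
        active = N - past                      # cases still able to transmit
        if N <= 0 or active <= 0:
            continue                           # ratios are undefined here
        rows.append((t, D / N, D / active))
    return rows


if __name__ == "__main__":
    # Hypothetical totals generated with r = 0.5 and T = 3.
    totals = [1.0, 1.5, 2.25, 3.375, 4.5625]
    for day, d_over_n, r in transmission_ratios(totals, T=3):
        print(day, round(d_over_n, 3), round(r, 3))
```

In this made-up series the two ratios agree for the first three days; on day 4, once the earliest case has dropped out of the active pool, D/N falls below r even though r itself has not changed.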

We can make two immediate observations. First, the number of daily cases will peak when the rate of change of the active cases crosses over the rate of change of r. Second, there is a predictable tail, which reflects the fact that an equilibrium is initially reached between the number of daily new cases (the product of r and the previous day's active cases) and the rate at which previously infected persons become unable to transmit the virus.
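One way to read these two observations in symbols is sketched here. The notation is introduced only for this illustration and is not from the original post: C_t is total cases, A_t = C_t - C_{t-T} is active cases, D_t is daily new cases, and r_t is the transmission ratio on day t.

\[
D_t = r_t A_{t-1}
\quad\Longrightarrow\quad
\frac{D_{t+1}}{D_t} = \frac{r_{t+1}}{r_t}\cdot\frac{A_t}{A_{t-1}} = (1+\rho_t)(1+g_t),
\]
\[
\rho_t = \frac{r_{t+1}-r_t}{r_t}, \qquad g_t = \frac{A_t - A_{t-1}}{A_{t-1}} .
\]

Daily cases stop growing when (1+\rho_t)(1+g_t) = 1, which for small daily rates is approximately g_t = -\rho_t: the fractional growth in active cases is exactly offset by the fractional decline in r. For the tail, the change in active cases is

\[
A_t - A_{t-1} = (C_t - C_{t-1}) - (C_{t-T} - C_{t-1-T}) = D_t - D_{t-T},
\]

so the active pool holds steady when the day's new cases equal the number of cases that became infected T days earlier and are now dropping out.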

Now that we have these basics, we can fit these curves to actual data and explore several questions. What would happen if r increased at certain points in the epidemic, representing a relaxing of disease control efforts? What would be the effect of shortening the length of time a person remains infectious, i.e. what if we had a cure? And what inferences can we make about r from the data obtained from other countries and states?
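The first two of these questions can be explored with a small Python sketch. Everything numeric here is a hypothetical assumption chosen for illustration, not fitted to any data: the declining schedule for r, the doubling of r on day 60, and the values of T. The function names are also made up for this example.

```python
def daily_cases(r_schedule, T, days, initial_cases=1.0):
    """Daily new cases for days 1..days, given a function day -> r."""
    total = [initial_cases]
    active = initial_cases
    daily = []
    for day in range(1, days + 1):
        new_cases = r_schedule(day) * active   # r times previous day's active cases
        total.append(total[-1] + new_cases)
        past_total = total[day - T] if day >= T else 0.0
        active = total[day] - past_total       # drop cases older than T days
        daily.append(new_cases)
    return daily


def declining_r(day):
    # Hypothetical: control measures shrink r by 5% per day.
    return 0.3 * 0.95 ** day


def relaxed_r(day):
    # Hypothetical: controls are relaxed on day 60 and r doubles.
    r = declining_r(day)
    return 2 * r if day >= 60 else r


if __name__ == "__main__":
    baseline = daily_cases(declining_r, T=14, days=120)
    relaxed = daily_cases(relaxed_r, T=14, days=120)
    shorter = daily_cases(declining_r, T=7, days=120)   # "what if we had a cure"
    for name, series in [("baseline", baseline), ("relaxed", relaxed), ("shorter T", shorter)]:
        peak_day = series.index(max(series)) + 1
        print(name, "peak day:", peak_day, "cumulative new cases:", round(sum(series), 1))
```

Comparing the three runs side by side shows how sensitive the curve is to each assumption; with real data, the schedule for r would be inferred rather than assumed.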

The simulation that produced Figure 1 was implemented in a Microsoft spreadsheet, and would take about 30 seconds to code in Excel. The details will be demonstrated in a subsequent post.
