Wednesday, April 8, 2020

Fun With Graphs

After all these serious posts on my COVID-19 blog, I just have to get a little less serious, so I'll take some cues from my imaginary friend Sheldon. Unfortunately, my lovely wife declined to play the role of Amy in a video, so this will have to be another long blog post instead.

I will put the blame on a guy who deals with the mathematics of auto warranties (yes, apparently this is real!). He decided to apply the same math ideas to the COVID-19 epidemic, found that they seem to apply surprisingly well, and published a little paper about it. Here's one of his figures:
He is a car guy, so this graph is about Michigan. It shows that on a plot that uses a logarithmic scale for both axes, a straight line can be fitted to the data. That reminded me of multiple papers that talked about transitions from exponential growth to power-law growth... but let's not go there!

It's more fun to play with graphs (try it!), so I downloaded the data from the COVID Tracking Project and plugged them into OpenOffice. Here's a graph of confirmed cases (in blue) and total deaths (in red):
Sure enough, the deaths data seem to fit a straight line very well once we get past 200 or so. Since death counts trail case numbers by about 2 weeks, the straight line shows the trajectory the US was on before governors in most states issued "stay-at-home" orders in the week of March 23 (day 1 in the graph is March 4). If the US had stayed on this trajectory, the graph predicts that the total number of deaths would have exceeded 1 million before the end of May. But fortunately, the blue curve indicates that the various measures have slowed down the growth of the epidemic.
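
For anyone who would rather script this than click through a spreadsheet: a straight line on a log-log plot corresponds to a power law, count ≈ a · day^b, so the fit and the extrapolation can be sketched in a few lines of Python. This is only an illustration. The data below are synthetic numbers generated from an assumed power law, not the COVID Tracking Project figures, and the function names `fit_power_law` and `extrapolate` are made up for this example.

```python
import numpy as np


def fit_power_law(days, counts):
    """Fit counts ~ a * days**b with a least-squares line in log-log space."""
    slope, intercept = np.polyfit(np.log10(days), np.log10(counts), 1)
    return 10.0 ** intercept, slope  # (a, b)


def extrapolate(a, b, day):
    """Evaluate the fitted power law at a later day number."""
    return a * day ** b


# Synthetic placeholder data (NOT real case or death counts): an exact
# power law with an assumed prefactor and exponent.
days = np.arange(20, 36)
counts = 0.001 * days ** 4.5

a, b = fit_power_law(days, counts)
print(f"fitted exponent b = {b:.2f}")
print(f"extrapolated value on day 60: {extrapolate(a, b, 60):.0f}")
```

With real data, the fit would be restricted to the part of the curve that actually looks straight (here, deaths above 200 or so), and the extrapolated value is only as good as the assumption that nothing changes.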

If we reflect the effect that the current measures are having by using the slope of the confirmed cases line to project deaths, we get this graph:
Here, we get about 200,000 total deaths sometime in June. Still a pretty scary number.

Maybe we can make curves with nicer results if we go back to a linear scale for the x-axis. Let's see:
Now we can see that the curves are getting flatter. That's what we want! How about drawing a straight line through the blue (confirmed cases) curve?

We can see exponential growth (a straight line) for the first 10 days or so, and then an acceleration in the growth of new cases. This happened to coincide with a rapid increase in tests being done: the number of COVID-19 tests done per day in the US increased more than 20-fold from 3/10 to 3/17, from 634 to 13,203.
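
As a quick sanity check on the testing numbers quoted above, here is a small Python sketch that computes the fold increase and the doubling time it would imply. The doubling time rests on an assumption of mine, namely that daily testing grew at a constant exponential rate between the two dates; the variable names are just for illustration.

```python
import math

# Daily US test counts quoted in the text.
tests_mar10 = 634
tests_mar17 = 13203
days_elapsed = 7  # March 10 to March 17

# Fold increase over the week.
fold = tests_mar17 / tests_mar10

# Doubling time, ASSUMING steady exponential growth between the two dates.
doubling_time = days_elapsed * math.log(2) / math.log(fold)

print(f"fold increase: {fold:.1f}")
print(f"implied doubling time: {doubling_time:.1f} days")
```

The fold increase comes out a little above 20, and under the steady-growth assumption the number of daily tests doubled roughly every 1.6 days.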

We know that the time between onset of symptoms and death from COVID-19 is about 19 days. There are also delays of several days between first symptoms and hospital admission, and another few days to get the test results. Taking these delays into account, we can assume that the delay between case confirmation and death is about 10 days. To extrapolate the number of deaths in the near future, we can simply grab the last 10 days of the cases curve, and copy them to the end of the deaths curve:
From this, we get an expected number of about 40,000 total deaths in the US 10 days from now. This number does not take increased mortality from hospital overloading into account. The actual number of deaths is likely to be higher, since hospitals in COVID-19 hot-spots like New York are already filled to capacity. But otherwise, this projection is quite solid, since it is based on the trend in actual reported case numbers.
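
The copy-and-paste trick can also be written down as arithmetic. On a logarithmic y-axis, sliding a piece of curve up or down is the same as multiplying it by a constant, so attaching the last 10 days of the cases curve to the end of the deaths curve amounts to scaling those case numbers so that they start at today's death count. The Python sketch below is my reading of that procedure, not a formula taken from any model; `project_deaths` is a hypothetical helper, and the example numbers are synthetic.

```python
def project_deaths(cases, deaths_today, lag=10):
    """Project cumulative deaths for the next `lag` days.

    cases: cumulative confirmed cases, one value per day, last entry = today.
    deaths_today: cumulative deaths as of today.
    Assumes deaths follow confirmed cases with a fixed delay of `lag` days
    and a constant ratio (no hospital overloading, no change in testing).
    """
    if len(cases) < lag + 1:
        raise ValueError("need at least lag + 1 days of case data")
    segment = cases[-(lag + 1):]       # from `lag` days ago through today
    scale = deaths_today / segment[0]  # vertical shift on a log axis
    return [scale * c for c in segment[1:]]


# Synthetic placeholder numbers (NOT real data): cases growing 20% per day.
cases = [100 * 1.2 ** i for i in range(15)]
projection = project_deaths(cases, deaths_today=50)
print(f"projected deaths in 10 days: {projection[-1]:.0f}")
```

The result inherits every weakness of the assumption: if the ratio of deaths to confirmed cases shifts, for example because testing expands or hospitals overflow, the projection shifts with it.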

For the copying of the curve in the last picture, I used GIMP to create a new layer, and used the "Darken" option so that the gridlines would not be obscured. Since I was already in GIMP, I could not resist the temptation to draw in some lines by hand:

This is a true "eyeball estimation", completely unscientific. But it does assume some further drops in the number of new cases per day. It leads to a prediction of 200,000 total COVID-19 deaths 5 weeks from now. Again, this number does not include hospital overloading. Also note that this is not the overall total number of deaths, since additional deaths would occur after the time frame covered in the graph.

But what if we could really get our act together and reduce new infections even faster? Let's see:
The green lines in the graph above assume a rapid decrease in daily new cases. If that can be achieved, the total number of deaths could be reduced to about 80,000 for the 70-day window in the graph.

The difference between the purple and green lines illustrates what the potential effect of better social distancing and additional measures to prevent COVID-19 transmissions could be. Admittedly, both lines were drawn by hand in a very unscientific way. But if you take a close look at the published computer models, they all have to do the same thing in the end: guess what the effect of interventions will be. In the best case, models may be based on measuring how much interventions have reduced "contacts" - but even then, the models still have to assume what the relation between "contacts" and disease transmission is. Typically, these two are regarded as the same, which completely ignores non-contact transmission modes, and therefore is simply wrong.