Friday, April 24, 2020

How Texas Beat COVID-19

Perhaps this post should go on my COVID-19 science blog, but it's more observational than scientific. It's also about Texas, and that's still the last place where I windsurfed, more than a month ago. Texas still reports a relatively low number of COVID-19 infections: 22,393 as of 4/23.

How did Texas keep COVID-19 case numbers down? Let's have a look at the COVID-19 case map:

Darker means more cases. Interestingly, you can make out the boundaries of Texas' right neighbor, Louisiana: things are darker in Louisiana. Well, everyone knows that Louisiana had a lot of COVID-19 cases after Mardi Gras, right? That shows clearly on the map of cases per million inhabitants in each state:


There's an interactive version of this map where you can see case numbers here.

The interesting thing is that COVID-19 seems to stop at the state line between Louisiana and Texas. The counties on the Louisiana side are much darker, indicating much higher case numbers than the counties in Texas. Did we discover a "disinfecting effect of state lines"? Well, certain orange-blonde people might conclude that, but I highly doubt that's the answer.

One more map to look at: the number of COVID-19 tests done in each state (interactive version here):

Looks very similar. If you use the interactive version, you can see that Louisiana did almost 4x as many tests per 1000 inhabitants as Texas. "But they had to because they had more cases!", you say? Looking at the first chart, it is clear that it is the other way around: Texas reports fewer cases than Louisiana because it does fewer tests.

Perhaps this does not come as a surprise, considering that the lt. governor of Texas has stated that "there are more important things than living", and that "Texas should not be shut down because a small percentage of the population is dying". Just test fewer people for COVID-19, and you have fewer cases! You'll also have fewer COVID-19 deaths, because that requires a COVID-19 diagnosis. Genius!

Looking back at the map on top, you can see that it is also easy to make out the top (northern) border between Louisiana and Arkansas. That's another state that has not done many tests. Again, we see a sharp drop in infections right across the state line.

Without a doubt, Texas and Arkansas have lower reported COVID-19 numbers because they do less testing. For a quick comparison, let's look at New York: the state has done almost 700,000 tests, and has 263,460 confirmed cases: about 1.3% of the population is confirmed positive. But a couple of different studies have concluded that about ten times more people in NY have been exposed to the virus. Even with a lot of testing, the tests have under-estimated the number of infected people tenfold.

Now some "smart" people may conclude that this is "good news", because it means that COVID-19 is less deadly. Well, not really. Scientific studies had already concluded that the actual "infection fatality ratio" is about 0.4% to 1%, with 0.6% being a typical value. And that's roughly what we get for New York, too, when we assume 10-fold underreporting of cases.

So, what does that mean for Texas? Texas has 29 million inhabitants. If about two thirds of them were to get infected, that would be roughly 20 million cases. With a fatality rate of 0.6%, that's 120,000 COVID-19 deaths for Texas alone. But with perhaps 1 in 20 cases getting tested, the officially reported number of COVID-19 deaths in Texas can be kept down to 6,000 - less than New York! Eventually, some statistician may discover that the annual death rate for Texas increased by more than 50% during that time - but that will probably be years from now, since it takes a very long time for most death statistics to become publicly available.
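For anyone who wants to play with these numbers, here is a minimal sketch of the back-of-the-envelope arithmetic in Python. The function name and its parameters are made up for illustration; the inputs are the rounded figures from the paragraph above (20 million infections, 0.6% fatality, 1 in 20 cases tested), and the assumption that reported deaths scale directly with the share of cases tested is the same simplification the text makes.

```python
def project_deaths(infections, fatality_ratio, tested_share):
    """Return (actual deaths, officially reported deaths).

    Assumes a death is only counted if the case was tested,
    so reported deaths scale with the share of cases tested.
    """
    actual = infections * fatality_ratio
    reported = actual * tested_share
    return actual, reported


population = 29_000_000               # Texas inhabitants
infected = population * 2 / 3         # about 19.3 million
rounded_infected = 20_000_000         # the text rounds this to 20 million

actual, reported = project_deaths(rounded_infected, 0.006, 1 / 20)
print(f"infected (unrounded): {infected:,.0f}")
print(f"actual deaths:        {actual:,.0f}")
print(f"reported deaths:      {reported:,.0f}")
```

Changing the tested share or the fatality ratio shows how strongly the "official" number depends on how much testing is done.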

Wednesday, April 8, 2020

Fun With Graphs

After all these serious posts on my COVID-19 blog, I just have to get a little less serious, so I'll take some cues from my imaginary friend Sheldon. Unfortunately, my lovely wife declined to play the role of Amy in a video, so this will have to be another long blog post instead.

I will put the blame on a guy who deals with the mathematics of auto warranties (yes, apparently this is real!). He decided to apply the same math ideas to the COVID-19 epidemic, found that they seem to apply surprisingly well, and published a little paper about it. Here's one of his figures:
Since he is a car guy, his graph is about Michigan. It shows that on a plot that uses a logarithmic scale for both axes, a straight line can be fitted to the data. That reminded me of multiple papers that talked about transitions from exponential growth to power-law growth... but let's not go there!
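A straight line on a log-log plot means the data follow a power law, count = a * day^b, where the slope of the line is the exponent b. Here is a small sketch of how such a line can be fitted with ordinary least squares on the logarithms. The function name is hypothetical, and the data are synthetic numbers generated from a known power law, not the Michigan figures from the paper.

```python
import math


def fit_power_law(days, counts):
    """Fit count = a * day**b by least squares on log(day), log(count)."""
    xs = [math.log(d) for d in days]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope on the log-log plot
    a = math.exp(mean_y - b * mean_x)     # intercept, converted back
    return a, b


# Synthetic example: counts that grow exactly like 5 * day^3
days = list(range(1, 31))
counts = [5.0 * d ** 3 for d in days]

a, b = fit_power_law(days, counts)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With real data the points will scatter around the line, and the fit only makes sense for the range of days where the plot actually looks straight.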

It's more fun to play with graphs (try it!), so I downloaded the data from the COVID tracking project and plugged them into OpenOffice. Here's a graph of confirmed cases (in blue) and total deaths (in red):
Sure enough, the deaths data seem to fit a straight line very well once we get past 200 or so. Since death counts trail case numbers by about 2 weeks, the straight line shows the trajectory the US was on before governors in most states issued "stay-at-home" orders in the week of March 23 (day 1 in the graph is March 4). If the US had stayed on this trajectory, the graph predicts that the total number of deaths would have exceeded 1 million before the end of May. But fortunately, the blue curve indicates that the various measures have slowed down the growth of the epidemic.

If we reflect the effect that the current measures are having by using the slope of the confirmed cases line to project deaths, we get this graph:
Here, we get about 200,000 total deaths sometime in June. Still a pretty scary number.

Maybe we can make curves with nicer results if we go back to a linear scale for the x-axis. Let's see:
Now we can see that the curves are getting flatter. That's what we want! How about drawing a straight line through the blue (confirmed cases) curve?

We can see exponential growth (a straight line) for the first 10 days or so, and then an acceleration in the growth of new cases. This happened to coincide with a rapid increase in tests being done: the number of COVID-19 tests done per day in the US increased more than 20-fold from 3/10 to 3/17, from 634 to 13,203.
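To get a feel for how fast testing ramped up, here is a quick sketch that turns the two numbers above into a fold increase and a doubling time. The function names are made up, and the doubling time assumes that daily tests grew at a constant exponential rate over those 7 days, which is a simplification.

```python
import math


def fold_increase(start, end):
    """How many times larger 'end' is than 'start'."""
    return end / start


def doubling_time(start, end, days):
    """Doubling time in days, assuming constant exponential growth."""
    return days * math.log(2) / math.log(end / start)


tests_march_10 = 634
tests_march_17 = 13_203

fold = fold_increase(tests_march_10, tests_march_17)
t2 = doubling_time(tests_march_10, tests_march_17, 7)
print(f"{fold:.1f}-fold increase, doubling about every {t2:.1f} days")
```

When testing doubles that quickly, a jump in confirmed cases says as much about the tests as about the epidemic.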

We know that the time between onset of symptoms and death from COVID-19 is about 19 days. There are also delays of several days between first symptoms and hospital admission, and another few days to get the test results. Taking these delays into account, we can assume that the delay between case confirmation and death is about 10 days. To extrapolate the number of deaths in the near future, we can simply grab the last 10 days of the cases curve, and copy them to the end of the deaths curve:
From this, we get an expected number of about 40,000 total deaths in the US 10 days from now. This number does not take increased mortality from hospital overloading into account. The actual number of deaths is likely to be higher, since hospitals in COVID-19 hot-spots like New York are already filled to capacity. But otherwise, this projection is quite solid, since it is based on the trend in actual reported case numbers.
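The copy-and-paste trick can also be written down in a few lines of code. This is only a sketch of one way to read it: on a logarithmic y-axis, pasting a piece of the cases curve onto the end of the deaths curve means that deaths grow by the same ratios over the next 10 days as cases did over the last 10 days. The function name is hypothetical, and the numbers in the example are made up for illustration, not actual US data.

```python
def project_deaths_by_lag(cases, deaths, lag=10):
    """Extend the deaths series by 'lag' days.

    Assumes deaths follow cases with a delay of 'lag' days, so the
    growth ratios of the last 'lag' days of cumulative cases are
    applied to the latest cumulative death count.
    """
    if len(cases) <= lag:
        raise ValueError("need more than 'lag' days of case data")
    base = cases[-lag - 1]                # cases just before the copied window
    return [deaths[-1] * c / base for c in cases[-lag:]]


# Made-up cumulative series, for illustration only
cases = [100 * 2 ** i for i in range(12)]   # doubling every day
deaths = [1, 2, 3]

print(project_deaths_by_lag(cases, deaths, lag=2))
```

Like the picture, this says nothing about hospital overloading; it only carries the recent trend in cases forward into the deaths curve.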

For the copying of the curve in the last picture, I used GIMP to create a new layer, and used the "Darken" option so that the gridlines would not be obscured. Since I was already in GIMP, I could not resist the temptation to draw in some lines by hand:

This is a true "eyeball estimation", completely unscientific. But it does assume some further drops in the number of new cases per day. It leads to a prediction of 200,000 total COVID-19 deaths 5 weeks from now. Again, this number does not include hospital overloading. Also note that this is not the overall total number of deaths, since additional deaths would occur after the time frame covered in the graph.

But what if we could really get our act together and reduce new infections even faster? Let's see:
The green lines in the graph above assume a rapid decrease in daily new cases. If that can be achieved, the total number of deaths could be reduced to about 80,000 for the 70-day window in the graph.

The difference between the purple and green lines illustrates what the potential effect of better social distancing and additional measures to prevent COVID-19 transmissions could be. Sure enough, both lines were drawn by hand in a very unscientific way. But if you take a close look at the published computer models, they all have to do the same thing in the end: guess what the effect of interventions will be. In the best case, models may be based on measuring how much interventions have reduced "contacts" - but even then, the models still have to assume what the relation between "contacts" and disease transmission is. Typically, these two are regarded as the same, which completely ignores non-contact transmission modes, and therefore is simply wrong.