Possibly obligatory disclaimer: I am not an epidemiologist, nor do I have experience in studying diseases. I am, however, a data scientist (in training and, to some degree, by profession). Also, the views expressed here are not meant to cast blame on any specific organization.
Coronavirus cases have grown substantially in Lubbock over the last few weeks. This left me curious as to how we were actually doing in comparison to the metrics outlined by the CDC. For reopening, the CDC has called for a 14-day period that exhibits one of two things (found here: https://www.whitehouse.gov/openingamerica/):
- A downward trajectory of documented cases within a 14-day period, OR
- A downward trajectory of positive tests as a percentage of total tests within a 14-day period.
Thankfully, the City of Lubbock makes it relatively easy to answer these two questions, as they have a good chunk of data available here: https://ci.lubbock.tx.us/departments/health-department/about-us/coronavirus-disease-2019-covid-19.
Data Collection and Methodology
We need three main pieces of data for this analysis: the number of tests by day, the number of new cases by day, and the number of active cases by day. This last metric is harder to obtain, so I will use the day-to-day growth in new cases in its place. More on that later.
The City of Lubbock chart of new cases is labeled by day and has data going back to mid-March (Source: https://ci.lubbock.tx.us/storage/images/fS70ABdJWoyQUovoGjebtZxcZaaAPzuWfbXLJRlo.pdf). This is relatively easy to add to an Excel sheet. The graph showing tests by date is less helpful, as the bars do not have value labels (Source: https://ci.lubbock.tx.us/storage/images/YepkYjc2DnjQiDnn0bTMWx1fYz9mx8t5x97O142W.pdf). To come up with values, I measured the height of each bar in millimeters and compared it with the overall height of the chart area, which I measured to be 114 mm. The y-axis grows by 200 at each step, the steps seem to be evenly spaced, and the axis maxes out at 1400. That means each millimeter of bar height represents about 12 tests (1400 / 114 ≈ 12.3). This report only has data going back to 4/6, so that will be the start date for this analysis.
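For anyone who would rather script the conversion than do it in a spreadsheet, here is a minimal sketch. It assumes the y-axis starts at 0 and that the 114 mm measurement spans the full 0 to 1400 range. The function name and the 40 mm example bar are made up for illustration and are not values from the actual chart.

```python
# Chart geometry taken from the measurements described above.
# Assumption: the y-axis starts at 0 and 114 mm spans the whole 0-1400 range.
CHART_HEIGHT_MM = 114.0
Y_AXIS_MAX = 1400.0

TESTS_PER_MM = Y_AXIS_MAX / CHART_HEIGHT_MM  # roughly 12.28 tests per mm


def estimate_tests(bar_height_mm, error_mm=1.0):
    """Return (low, estimate, high) test counts for a measured bar height.

    The low and high values assume the measurement could be off by
    error_mm in either direction.
    """
    low = max(bar_height_mm - error_mm, 0.0) * TESTS_PER_MM
    estimate = bar_height_mm * TESTS_PER_MM
    high = (bar_height_mm + error_mm) * TESTS_PER_MM
    return round(low), round(estimate), round(high)


# Hypothetical bar that measures 40 mm tall (not a real data point).
print(estimate_tests(40.0))
```

A one-millimeter measurement error moves the estimate by about 12 tests in either direction, which is where the error bounds on the test counts come from.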
To determine the growth of new cases, we can take the count of new cases from one day and subtract the count from the previous day. As an example, April 6 had 26 new cases, while April 7 had only 11. That is a reduction of 15 new cases. If you kept a count of how many times the new case count decreased, you would have the number of days on which the rate of new cases being added was slowing down. The CDC metric is specifically asking for active cases, which I cannot properly get from this data. However, showing that the virus is spreading more slowly would likely also correlate with active cases being reduced, as it may give recoveries a chance to happen more quickly than new cases are being added.
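The day-over-day subtraction can be sketched in a few lines of Python. Only the first two values (26 and 11, for April 6 and April 7) come from the data described here. The remaining values and the function name are hypothetical, added only to make the example longer than one step.

```python
def daily_changes(values):
    """Return today's value minus yesterday's for each consecutive pair."""
    return [today - yesterday for yesterday, today in zip(values, values[1:])]


# April 6 and April 7 are from the post; the last two values are hypothetical.
new_cases = [26, 11, 14, 9]

changes = daily_changes(new_cases)
days_with_decline = sum(1 for change in changes if change < 0)

print(changes)            # change in new cases for each day after the first
print(days_with_decline)  # number of days where new cases fell
```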
The second CDC metric concerns the percentage of tests that come back positive. A decline here would mean one of two things: 1) the infection rate is slowing (again, allowing more time for patients to recover) or 2) testing capacity has increased (which could reflect the community’s ability to respond to the pandemic). For this, we can calculate the positive percentage by dividing the number of new cases by the number of tests for that day. We can then calculate the change by taking the current day’s percentage and subtracting the previous day’s. If yesterday’s percentage was greater, we know the percentage of positive tests is shrinking.
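As a rough sketch, the positive percentage and its day-to-day change look like this in Python. The case and test counts below are invented for illustration, and the function name is my own. The sketch returns None for a day with no tests rather than dividing by zero.

```python
def positive_rate(new_cases, tests):
    """Percentage of tests that were positive, or None if no tests were run."""
    if tests <= 0:
        return None
    return 100.0 * new_cases / tests


# Hypothetical counts for two consecutive days (not real Lubbock data).
yesterday = positive_rate(new_cases=26, tests=200)
today = positive_rate(new_cases=11, tests=220)

# Change in percentage points; a negative value means the rate is shrinking.
change = today - yesterday
print(yesterday, today, change)
```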
You can find my data here: https://drive.google.com/file/d/1KCf3ZKCbme09vDVTISoBGXzntdPjFdyq/view?usp=sharing.
Analysis
Looking at this, it is easy to see that it isn’t looking great. I accounted for some error in my calculations for both metrics: I allowed new cases to increase day-to-day by up to 5 and allowed the positive percentage rate to grow by up to 1 percentage point day-to-day. This is my attempt at accounting for a little bit of noise in the data, where the value mostly stayed the same. This tolerance isn’t specified in the CDC guideline, and there was possibly a better way for me to come up with these limits using standard deviations or something similar, but it’s what I did for a quick analysis.
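One way to read this check in code: a day "passes" if its day-to-day change is no larger than the tolerance, and the guideline is met if there are 14 passing days in a row. That second part is an assumption about how to interpret "a 14-day period", not something the guideline spells out. The function names and sample numbers are hypothetical.

```python
def longest_passing_run(changes, tolerance):
    """Length of the longest streak of days whose change is <= tolerance."""
    longest = current = 0
    for change in changes:
        if change <= tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a day above the tolerance breaks the streak
    return longest


def meets_window(changes, tolerance, window=14):
    """True if some run of `window` consecutive days stays within tolerance."""
    return longest_passing_run(changes, tolerance) >= window


# Hypothetical day-to-day changes in new cases, with a tolerance of 5.
sample_changes = [-15, 3, 6, -2, 5]
print(longest_passing_run(sample_changes, tolerance=5))
print(meets_window(sample_changes, tolerance=5))
```

The same two functions work for the positive percentage by passing the percentage-point changes and a tolerance of 1.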
Looking first at the CDC metric of 14 days showing a decline in cases, you can see that May was a pretty decent month. Lubbock added 155 new cases during the month of May, and only 4 days saw new cases increase by more than 5 over the previous day’s count, with each one almost immediately followed by a decline in new cases. This is a bit surprising, as the calls for Texas to reopen began around May 1. It’s a much bleaker story in June, and the last few weeks have been as horrendous as they seemed. While May averaged 5 new cases a day, June has seen an average of 22 new cases a day, with the last week averaging 73 cases per day. I mentioned before that I am technically counting new cases rather than active cases, which could be somewhat problematic. Since April 6, Lubbock has had 1252 new cases, 652 (about 52%) of which have been added in the last 2 weeks. So yes, active cases likely saw a decline, especially in May when growth was relatively slow. However, there is absolutely no way recoveries have been able to occur as quickly as new cases have been added.
The second metric called for a decline in the percentage of positive tests. We have also failed to meet this criterion. Late May gives us our best run in this data, as May 23-31 saw this percentage stay fairly steady. While the last week or so has remained relatively constant (with a growth or decline of 3 percentage points on average from the previous day), today saw 39.36% of all tests come up positive. Testing has grown a bit, with June seeing an average of 580 tests taken daily compared to May’s average of 340. Again, my test counts here are approximated from measurements of the graph, but they should be relatively close. My data file linked above includes error values assuming my measurement was off by a millimeter in either direction.
Conclusion
After looking into this, I think I can definitively say that Lubbock has not met and is not meeting the CDC guidelines for reopening. You could make a case that there were times in May when our case counts and testing would have lined up with these guidelines, but the recent explosion of cases in Lubbock does not meet those definitions.
To be clear, I am attempting to approach this from a purely data standpoint. The CDC released a pair of metrics that seem relatively well defined and the City of Lubbock has published data that could be used to calculate the city’s performance. After conducting that analysis, we do not meet the CDC’s guidelines for reopening.
Issues
I have mentioned that I did not have the data available to specifically test the first of the CDC’s criteria. However, I believe the rate of change in new cases is a sufficient analog for that measurement. The data is technically available in the form of images on Facebook, but I didn’t feel like scrolling through 2 months of Facebook posts. Additionally, my count of tests was taken by measuring the graph, which is hardly precise. I believe my measurements to be accurate to the millimeter, and I ran the positive percentage test on even my best-case scenario for tests and came up with the same results.
The other issue may be trust: trust in the CDC for providing metrics that are meaningful for public safety, and trust in the City of Lubbock for producing data that is reliable for this analysis. For the CDC recommendations, I tend to believe in long-standing entities that serve the public. I believe they have been a wonderful source of information throughout this pandemic, and I see no reason to question their authority. As for the City of Lubbock, I cannot point to any credentials that show why they should be trusted. One can argue that they benefit from not lying to the public, and the last week of data certainly makes them look bad and yet it is still published. I generally believe we can rely on this data, if only because it is the most comprehensive dataset we have available.
Other issues may be in my analysis. I did not offset new cases against tests. I assumed the count of tests would be the number of tests taken that day, while the count of new cases would be the number of those tests that came up positive. I have heard some say the count of new cases really represents tests conducted 1-2 days prior. Lacking that institutional knowledge, I took those two values to have occurred on the same day. One should be able to offset the data relatively easily if that is needed.
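If the offset does turn out to matter, shifting the data is a small change. The sketch below pairs each day’s new cases with the tests taken a given number of days earlier. The lag of 1 day, the function name, and the numbers are all assumptions for illustration; I do not know the actual reporting delay.

```python
def lagged_positive_rates(new_cases, tests, lag_days=1):
    """Positive percentage using tests taken `lag_days` before each case count.

    The first `lag_days` days are dropped because they have no earlier
    test count to pair with. Days with zero tests yield None.
    """
    rates = []
    for day in range(lag_days, len(new_cases)):
        tests_taken = tests[day - lag_days]
        if tests_taken > 0:
            rates.append(100.0 * new_cases[day] / tests_taken)
        else:
            rates.append(None)
    return rates


# Hypothetical daily counts (not real Lubbock data).
cases_by_day = [10, 20, 30]
tests_by_day = [100, 200, 400]

print(lagged_positive_rates(cases_by_day, tests_by_day, lag_days=0))  # same-day pairing
print(lagged_positive_rates(cases_by_day, tests_by_day, lag_days=1))  # one-day offset
```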
A consideration I didn’t mention here is hospitalizations in Lubbock. We saw a peak of about 30 in late April and early May, followed by a drop to around 9-12 (with occasional jumps to 15) from roughly May 21 – June 8 (source: https://ci.lubbock.tx.us/storage/images/NINtQrsxihx394E4GQ8CxwrOjeQPPnrewhHDNA43.pdf). However, with the spike in new cases, we have jumped up to an average of 20 for the last 6 days. I would assume this will rise again given the growth of new cases over the last few days, but we shall see.
Politics
If I may dive into politics, it is strange that we have a President who has suggested limiting tests would be a good thing (here, here, here). Testing is an important tool to let us know where we stand and where we’re headed. The same is true of data and information in general, and it is a shame that Trump has taken such an anti-intellectual stance. It is confusing that the GOP suggested for months that the young can go out and enter the world despite COVID-19 since they won’t be harmed, but these same people are now annoyed that younger people are doing just that. Gov. Abbott has allowed restaurants to open at 75% capacity, and even amusement parks can open at 50% capacity, while also citing frustration with the rise in cases seen across Texas. I’m not (necessarily) saying that we need to shut things down completely given the rise in coronavirus cases, but we definitely need to consider scaling operations back down to try to get closer to those metrics.