Clusters or networks of places – and not just individual sites – may play a critical super-spreading role.
When we think of super-spreader events, we tend to focus on places with a high concentration of susceptible people, such as assisted living facilities, detention centers, sports arenas, food processing plants, and other mass gatherings at a single site.
To better understand how the coronavirus can spread rapidly, we need to think more broadly about clusters or networks of places, and not just one place at a time. The idea is that infected individuals can move easily and rapidly across multiple places within the cluster or network. The component places are in turn linked together by close geographic proximity, or by an efficient transportation network.
Last spring, to take a salient example, South Korean authorities reported an outbreak of 34 cases after a 29-year-old patient visited five clubs and bars in the Itaewon district of Seoul from the night of May 1 to the early hours of the following morning. Case tracking eventually identified a total of 246 primary and secondary infections.
This super-spreader model underlies our analysis of the possible role of a cluster of off-campus bars in a recent outbreak at the University of Wisconsin-Madison, where nearly three thousand students tested positive for SARS-CoV-2 during September 2020. For a detailed exposition of our data sources, analytical methods and findings, see our National Bureau of Economic Research Working Paper 28132.
Green, Red, Purple and Yellow
Figure 1 shows a screenshot of a section of the university campus map, focusing on the eastern end of the campus. The superimposed solid lines mark the external boundaries between the census tracts, while the dashed lines mark the internal boundaries between the four census block groups within census tract 16.06.
The pair of green buildings within census block group 16.06-4 represent the two on-campus residence halls, Sellery and Witte, which were subject to a lockdown when approximately 20 percent of their residents became infected.
The pair of red buildings further to the south within census block group 16.06-3 represent two other residence halls, Ogg and Smith, that were not overrun with infections and not subject to quarantine.
The solid purple circles mark the locations of a cluster of 20 nearby off-campus bars, located mostly in census tracts 16.03 and 16.04.
The yellow circles show a comparison group of 68 coffee houses and inexpensive or medium-priced restaurants located in the same area. These venues are closer to Sellery and Witte than to Ogg and Smith. The most remote bar was only a 13-minute walk from Sellery.
Case Control Study
These four graphic elements – the green pair of residence halls, the red pair of residence halls, the purple cluster of bars, and the yellow group of restaurants – form the basic components of a case control study.
In the typical case control study, researchers want to find out whether exposure to a particular toxin is associated with the development of a particular disease. To that end, the researchers identify two categories of individuals. Those who have come down with the disease are the cases, and those who did not get the disease are the controls. The researchers then determine the odds that someone from the cases was exposed to the toxin. Similarly, they determine the odds that someone from the controls was exposed to the toxin. The ratio of these two quantities is called the odds ratio, which is routinely reported in studies of this kind.
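The odds ratio described above can be sketched in a few lines of Python. The function name and the counts below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from our study or from any other.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among the cases divided by odds of exposure among the controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


# Hypothetical counts, for illustration only:
# 60 of 100 cases were exposed, versus 30 of 100 controls.
example = odds_ratio(60, 40, 30, 70)
print(round(example, 2))  # 3.5
```

An odds ratio of 1 would mean that exposure was equally common among cases and controls; values above 1 mean that exposure was more common among the cases.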
For example, in the classic case control study of smoking and lung cancer reported by Richard Doll and Bradford Hill in 1950, the cases were hospitalized patients with lung cancer and the controls were patients without lung cancer. The odds that a patient with lung cancer (one of the cases) was a smoker were 2.97 times the odds that a patient without lung cancer (one of the controls) was a smoker.
In our study, the cases were residents of Sellery and Witte, while the controls were residents of Ogg and Smith. While not everyone in Sellery and Witte tested positive for COVID-19, it’s enough to know that the rate was much higher in Sellery and Witte than in Ogg and Smith, so much higher that Sellery and Witte had to be put under quarantine.
In the classic Doll-Hill study, hospitalized patients were interviewed about their smoking habits. In our study, we used anonymized smartphone tracking data from the SafeGraph Patterns database to determine how many devices originating in Sellery-Witte (census block group 16.06-4) and Ogg-Smith (census block group 16.06-3) visited bars (the purple circles) or restaurants (the yellow circles). The odds that a Sellery-Witte resident (one of the cases) visited a bar were 2.95 times the odds that an Ogg-Smith resident did so. By contrast, the corresponding odds ratio for visiting the restaurants was only 1.55.
The SafeGraph data tell us only how many devices originating in a particular census block group visited a particular point of interest (in this study, a bar or a restaurant). The data aren’t broken down by individual smartphone, so we don’t know how many devices visited specific combinations of bars or restaurants. Still, we do know that the odds ratio for visiting a bar was almost twice the odds ratio for visiting a restaurant.
Observational Studies versus Experiments
Our case control study is an observational study. It is not an experiment in which subjects are randomly assigned to visit bars or restaurants to see who comes down with COVID-19. Accordingly, it is at least arguable that people who go to bars are less likely to wear masks, maintain social distancing, and take protective measures generally. The same criticism could be applied to a study of people who attended a political rally, a motorcycle rally, or a large wedding reception. Still, our comparison of residence halls – rather than individuals – tends to blunt this criticism.
There are numerous factors that go into a student’s decision to live in one residence hall versus another – whether the rooms are singles or doubles, whether there is more than one bathroom on a floor, whether the floors are mixed coed, whether the rooms have carpeting or air conditioning, and whether the student can cohabit with his or her friends, not to mention the price. It would be a stretch to argue that these factors readily correlate with a lack of protective behavior.
One could, of course, speculate about other possible explanatory factors. Sellery and Witte were regarded by some observers as party dorms, at least during prior semesters. This raises the possibility that some smartphone visits to the 20-bar cluster were to purchase alcoholic beverages to bring back to residence hall parties. But that wouldn’t negate the causal role of the bars in facilitating the parties.
Another possible explanation is that residents of Sellery and Witte partied at off-campus fraternities and sororities. But the smartphone tracking data from the SafeGraph Social Distancing database do not show a large number of visits from census block group 16.06-4 (where Sellery and Witte were located) to census block groups 16.04-1 and 16.04-4 (where the fraternities and sororities were concentrated).
A better explanation is that, with the preventive measures taken by the university, a substantial portion of the alleged partying was shifted off-campus. While some of the partying could have taken place in off-campus private residences, the smartphone tracking data tell us that a significant proportion was shifted to off-campus bars.
Sellery and Witte had more incoming freshmen than other residences. To the extent that Wisconsin’s legal drinking age of 21 was strictly observed, it would tend to reduce the rate at which residents of these halls visited local bars. Incoming students are required to enroll in an on-campus meal plan, which might help to explain why off-campus restaurant visitation appeared to be a less important predictor of the higher rate of infections at Sellery and Witte. On the other hand, Smith Residence Hall had its own Starbucks, which would tend to reverse that trend.
If there is any factor that clearly distinguishes Sellery and Witte from Ogg and Smith, it is that one pair of residence halls is nearer and the other pair is farther away. But that raises the question: Nearer to or farther from what? Our findings fit with the conclusion that the residents of Sellery and Witte suffered significantly more infections than the residents of Ogg and Smith because they were nearer to the 20-bar cluster, and not because they were also nearer to coffee houses, inexpensive and moderately priced restaurants, or some classrooms.
The Timing is Right
Figure 2 contains two concurrent plots covering each day from August 16 through October 11. The orange line, measured on the left vertical axis, shows the total number of daily visits by all devices, without restriction on origin, to any one of the eleven bars in the cluster within a 10-minute walk of Sellery. Since the SafeGraph Patterns database tracks the GPS pings emitted from a limited panel of devices, only the relative changes in visit counts have significance. Thus, during the week of August 16, there was an average of 176 daily visits among device-holders in the SafeGraph panel to the eleven bars. During August 29-30, the number peaked at 471, a 2.7-fold increase.
The blue line, measured on the right vertical axis in Figure 2, shows the daily number of positive COVID-19 cases reported by the Wisconsin Department of Health Services (WDHS) for census tract 16.06. (WDHS reported COVID-19 cases by census tract, but not by census block group.)
Comparison of the orange and blue data series shows a double-peaked surge in the volume of bar attendance, with the first peak occurring on August 29-30 and the second on September 5. Soon thereafter, the double-peaked surge in the volume of bar attendance was followed by a double-peaked surge in positive SARS-CoV-2 cases, with the first peak on September 10 and the second on September 14.
The timing is entirely consistent with a causal relationship between the two trends. The delay between the two curves in Figure 2 represents the latency period between initial infection and the subsequent onset of detectable disease.
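As a rough check on the delay between the two curves, the peak dates quoted above can be subtracted directly. This is only a sketch: representing the August 29-30 peak by its later day is our simplifying assumption, and the variable names are illustrative.

```python
from datetime import date

# Peak dates as quoted in the text (year 2020).
# Assumption: the August 29-30 bar peak is represented by its later day.
bar_peaks = [date(2020, 8, 30), date(2020, 9, 5)]
case_peaks = [date(2020, 9, 10), date(2020, 9, 14)]

# Days elapsed between each bar-attendance peak and the matching case peak.
lags = [(case - bar).days for bar, case in zip(bar_peaks, case_peaks)]
print(lags)  # [11, 9]
```

Counting from August 29 instead would lengthen the first lag by one day.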
In Figure 3, we have zoomed all the way out on our campus map. Now we can see the entire campus, along with all the surrounding census tracts. This time, we’ve annotated the campus map with blue bubbles indicating the number of positive COVID-19 cases in each surrounding census tract reported by WDHS during the two-month interval August 16 – October 16. The cumulative number of cases is proportional to the area (not the diameter) of each bubble. The four largest bubbles are: tract 16.04 (870 cases); tract 16.06 (726 cases); tract 16.03 (488 cases); and tract 11.01 (266 cases). These four census tracts comprised 76.7 percent of the 3,065 cases in campus-area census tracts compiled by WDHS during this 2-month interval.
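The share attributed to the four largest tracts, and the area-based bubble scaling, can be reproduced with a short calculation. The case counts come from the paragraph above; the helper `bubble_diameter` is a hypothetical illustration of area-proportional scaling, not the code used to draw Figure 3.

```python
import math

# Case counts for the four largest bubbles, as quoted in the text.
top_tracts = {"16.04": 870, "16.06": 726, "16.03": 488, "11.01": 266}
total_cases = 3065

share = sum(top_tracts.values()) / total_cases
print(f"{share:.1%}")  # 76.7%


def bubble_diameter(cases, reference_cases, reference_diameter=1.0):
    """Diameter that makes bubble AREA proportional to the case count."""
    return reference_diameter * math.sqrt(cases / reference_cases)


# Diameter of each bubble relative to the largest one (tract 16.04).
relative = {t: bubble_diameter(n, 870) for t, n in top_tracts.items()}
```

Because area grows with the square of the diameter, a tract with four times as many cases gets a bubble only twice as wide.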
The data in Figure 3 make clear that the cluster of 20 bars – concentrated in tracts 16.03 and 16.04 – was located directly within the geographic epicenter of the outbreak.
Census Tract Study
Figure 3 suggests another way of analyzing the data. Rather than comparing individual residence halls, we could treat individual census tracts as the units of analysis.
That’s exactly what we’ve done in Figure 4 below. Each lilac-colored point on the graph corresponds to a census tract. The vertical axis measures the number of positive coronavirus tests per 1,000 population in each census tract, while the horizontal axis measures the number of bar visits per 1,000 population by the census tract of origin of the bar visitor. Both axes are on a logarithmic scale so that, for example, the distance between 2 and 10 (that is, a 5-fold increase) is the same as the distance from 10 to 50.
With the exception of one outlier (tract 17.04), the points are aligned along the green fitted line. The slope of the fitted line was 0.87. Economists interpret this slope as an elasticity. For every 10-percent increase in bar visits per capita, the incidence of COVID-19 cases per capita rises an estimated 8.7 percent.
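To illustrate why the slope of a line fitted on log-log axes can be read as an elasticity, here is a minimal sketch using ordinary least squares on logged values. The data are synthetic, constructed to have an elasticity of exactly 0.87; they are not the census tract data, and `loglog_slope` is a hypothetical helper rather than the estimation code behind Figure 4.

```python
import math


def loglog_slope(x, y):
    """Ordinary least squares slope of log(y) regressed on log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx


# Synthetic data for illustration only: cases = 2 * visits ** 0.87.
visits = [50, 100, 200, 400, 800]
cases = [2.0 * v ** 0.87 for v in visits]

slope = loglog_slope(visits, cases)
print(round(slope, 2))  # 0.87

# Exact percentage change in cases implied by a 10 percent rise in visits.
pct = (1.10 ** slope - 1) * 100
print(round(pct, 1))
```

Multiplying the slope by 10 percent gives the 8.7 percent figure as a linear approximation; the exact change implied by a constant elasticity of 0.87 is slightly smaller, at about 8.6 percent.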
When we constructed the same graph for the per-capita number of visits to the group of 68 restaurants, the data did not fit anywhere near so tightly as in Figure 4. When we performed a multivariate regression, we again found a significant relation to bar visits (with an elasticity of 0.90) and no significant relation to restaurant visits. This finding was entirely consistent with the results of our case control study.
What Have We Learned
In contrast to studies of outbreaks that focus sharply on a single location, our study concentrated on a cluster of places, in this case a cluster of bars right in the geographic epicenter of the outbreak.
While we posited that a cluster or network of places can serve as a super-spreader, we did not have sufficient data in this study to map out the individual connections between places. We had some data on the movements of smartphone holders from one census block group to another, but we would need an even finer geographic grid to ascertain how the connections between bars within a census block group unfold. Still, it would be inappropriate to assume that, in the absence of hard data, each establishment in the 20-bar cluster had no more than an independent effect on the risk of coronavirus propagation.
Our findings should not be interpreted broadly to mean that restaurants are entirely free of risk while bars are the sole source of contagion. The narrower interpretation is that a specific, centrally located cluster of bars appeared to be a significantly greater vehicle for propagation of the virus than restaurants in a particular university-based outbreak. Neither do our findings point the finger at all bars. Among the 51 bars throughout the campus area, we focused sharply on a cluster of 20 bars at the geographic epicenter of the outbreak.
Future studies of college and university outbreaks need to concentrate harder on the dynamics of viral transmission, and not simply on how many cases ended up in dormitories, athletic teams, fraternities and sororities. Retrospective case-tracking needs to expand its scope to ask an infected individual not just whether he went to a bar, but also whether he went bar-hopping, whether his roommates also went to bars, and if so, to what bars on what nights.
We need to think more about the ways that multiple places within a cluster may act synergistically to enhance viral propagation. The outbreak in the Itaewon district of Seoul, cited above, points to a model where an index case moves from one place to another within the cluster. More generally, when there is high mobility between places, they may function effectively as one place. If an average of 20 patrons attended each of five interconnected bars on a single night, we would end up with an “event” with 100 attendees. When a student says to his friends, “Let’s go to bar A and if we can’t get in, we’ll just go to bar B,” we have a classic network externality where the mere availability of bar B enhances the demand for, and thus the transmission potential of, bar A.
Even more broadly, we need to think about systemic factors that influence viral propagation, and not simply the characteristics of individuals or the places they go to. The epidemic in Los Angeles County has been sustained in great part by intra-household transmission among multigenerational families. But the larger question is what public policies have enhanced or mitigated these housing conditions. The spread of coronavirus in New York City and other metropolises may have been enhanced by individuals of high mobility. But the larger question is what transportation networks carried them from one place to another.