Behind the scenes with Interactive Graphics Journalist Cedric Sam
By: Ryan Mannion and Cory Guillory
More than six months into the global pandemic, data about Covid-19 cases and deaths is as ubiquitous as ever. Last week, the team at Bloomberg Graphics News offered a new take on why location and boundaries need the right context in their article “Mapping Coronavirus: How Many Covid-19 Cases Are Near You”. Their 🔥 visualization is built in part with Mapbox APIs, so we took a moment to talk to Cedric Sam of the Bloomberg Graphics team (@BBGVisualData) to learn more about their journey to get this story out.
What inspired you to tell this story?
We began with the belief that Covid numbers by state, let alone nationally, were too broad and that county-level data were often too sparse. We looked into what would happen if we broadened aggregations beyond the administrative or metropolitan boundaries that we’re used to reading about to more fluid areas that reflect how people actually move around in their daily lives.
The result is an interactive visualization where readers can derive information that may tell a different story than what somebody is hearing on the local or even national news. They can input the location of their choice and then toggle between viewing daily deaths and new cases in the counties that are within a two-hour drive of that location. There’s a time slider at the bottom that starts on March 12 and goes through today, and then there’s a dynamic line chart sitting atop the map that compares the designated area against the rest of the country in cases and deaths.
Run us through some of the most interesting examples you found that demonstrate how important this type of location context is.
The interactive really lets you dig into connected areas. Take Ripley County in Indiana: it is a rural county of 28,000 inhabitants, but it is within a two-hour drive of 7.3 million people living in urban centers like Cincinnati, Indianapolis, and Louisville.
Another one is California. The Fresno area of the San Joaquin Valley and the region around San Francisco and Silicon Valley are right next to each other, but the former was hit a lot harder this summer.
Describe how you went about building this visualization.
We used the dataset of county-level Covid-19 cases and deaths from our friends @nytgraphics across town, which we broke into one-per-county chunks that update at least once a day.
To identify the most reliable driving distances between counties, we queried Mapbox’s Matrix API for every county or county-equivalent in the United States against something like 100 or more of their closest neighbors. Then, using Svelte and Mapbox GL JS, we created a map interface with a custom search box that lets the user select a specific place. Selecting a place fetches the historical data files, up to a few dozen of them, for the counties within a two-hour drive.
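As a rough illustration of what querying one county against about 100 neighbors can look like, here is a minimal sketch in Python that only builds the request URLs and makes no network calls. It is not the Bloomberg team's code. The endpoint, the `sources`, `destinations`, and `annotations` parameters, and the 25-coordinate limit for the driving profile come from Mapbox's Matrix API documentation; the function name `matrix_urls`, the sample coordinates, and the placeholder token are assumptions made for the example.

```python
from typing import Iterator, Sequence, Tuple

LonLat = Tuple[float, float]  # (longitude, latitude), the order the API expects

MATRIX_ENDPOINT = "https://api.mapbox.com/directions-matrix/v1/mapbox/driving"
MAX_COORDS_PER_REQUEST = 25  # documented per-request limit for the driving profile


def matrix_urls(origin: LonLat, neighbors: Sequence[LonLat], token: str) -> Iterator[str]:
    """Yield one Matrix API URL per batch of up to 24 neighbors (hypothetical helper)."""
    batch_size = MAX_COORDS_PER_REQUEST - 1  # coordinate 0 is always the origin
    for start in range(0, len(neighbors), batch_size):
        batch = [origin, *neighbors[start:start + batch_size]]
        coords = ";".join(f"{lon},{lat}" for lon, lat in batch)
        # Ask only for origin -> neighbor durations, not the full square matrix.
        destinations = ";".join(str(i) for i in range(1, len(batch)))
        yield (
            f"{MATRIX_ENDPOINT}/{coords}"
            f"?sources=0&destinations={destinations}"
            f"&annotations=duration&access_token={token}"
        )


# Illustrative use: one origin and 100 made-up neighbor points.
origin = (-85.26, 39.07)
neighbors = [(-85.0 + i * 0.01, 39.0) for i in range(100)]
urls = list(matrix_urls(origin, neighbors, "YOUR_MAPBOX_TOKEN"))
```

With 100 neighbors and 24 destinations per request, this produces five URLs per county. Each response would then carry a `durations` array with a single row of travel times in seconds for that batch.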
We did heavy geoprocessing on the base map using a variety of tools, including Dirty Reprojectors as well as Mapshaper, a tool from NYT Graphics Editor Matthew Bloch.
What challenges did you encounter with this article and how did you solve them?
Obtaining the driving distance for every single county was one of the more difficult technical parts of the project.
To determine adjacency between counties, we weren’t sure whether straight-line distances measured from each county’s centroid or from its external boundaries would be the best measure. We settled on one representative point per county, a representative city that is the county seat in 98% of counties, using a dataset from the DOT. We also determined that driving distance was a better metric of connectedness, even if it was harder to get (there doesn’t seem to be pre-calculated data out there).
Now, how were we going to measure connectedness for all 3,000 or so counties or county-equivalents in the U.S.? We originally looked at the Mapbox Isochrone API, but its maximum travel time was 60 minutes, which isn’t far enough for counties. So we looked at the driving-distance APIs out there and ended up choosing Mapbox’s Matrix API. Rather than measure driving distances from each of the 3,000+ counties to every one of the others, we did some pre-processing to reduce the number of API calls.
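The interview does not spell out what the pre-processing was. One plausible approach, shown here purely as an assumption, is to shortlist each county's nearest neighbors by straight-line (great-circle) distance and only request driving times for that shortlist. The function names, the value of `k`, and the sample coordinates below are all hypothetical, and the FIPS codes are included only for illustration.

```python
import math
from typing import Dict, List, Tuple

LonLat = Tuple[float, float]  # (longitude, latitude) in degrees
EARTH_RADIUS_KM = 6371.0


def haversine_km(a: LonLat, b: LonLat) -> float:
    """Great-circle distance between two points, in kilometers."""
    lon1, lat1 = math.radians(a[0]), math.radians(a[1])
    lon2, lat2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_neighbors(fips: str, points: Dict[str, LonLat], k: int = 100) -> List[str]:
    """Return the FIPS codes of the k counties closest to `fips`, nearest first."""
    origin = points[fips]
    ranked = sorted(
        (haversine_km(origin, point), other)
        for other, point in points.items()
        if other != fips
    )
    return [other for _, other in ranked[:k]]


# Approximate, illustrative representative points keyed by FIPS code.
points = {
    "18137": (-85.26, 39.07),   # Ripley County, Indiana
    "39061": (-84.51, 39.10),   # Cincinnati area
    "21111": (-85.76, 38.25),   # Louisville area
    "18097": (-86.16, 39.77),   # Indianapolis area
    "06037": (-118.24, 34.05),  # Los Angeles area
}
shortlist = nearest_neighbors("18137", points, k=3)
```

Restricting each county to a shortlist like this brings the work down from millions of county pairs to a few hundred thousand, at the cost of missing any county that is far as the crow flies but still drivable within the time limit.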
After getting the distance data, we parsed it into a compact structure that lists, for each county, the counties within 7,200 seconds (two hours) of driving, using their FIPS identifiers.
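The shape of that compact structure is not described, so the following is only a sketch of one way to build it: a dictionary mapping each county's FIPS code to the sorted list of FIPS codes reachable within the limit. The input format (a flat list of origin, destination, seconds rows), the function name, and the sample durations are assumptions. A duration of `None` stands in for the `null` the Matrix API returns when no route exists.

```python
from typing import Dict, Iterable, List, Optional, Tuple

TWO_HOURS_S = 7200  # the 7,200-second threshold mentioned in the interview

DurationRow = Tuple[str, str, Optional[float]]  # (origin FIPS, destination FIPS, seconds)


def reachable_counties(rows: Iterable[DurationRow],
                       limit_s: float = TWO_HOURS_S) -> Dict[str, List[str]]:
    """Map each origin FIPS to the destination FIPS codes within limit_s seconds."""
    reachable: Dict[str, List[str]] = {}
    for origin, destination, seconds in rows:
        bucket = reachable.setdefault(origin, [])  # keep origins with no neighbors too
        if seconds is None or origin == destination:
            continue  # unroutable pair, or the county itself
        if seconds <= limit_s:
            bucket.append(destination)
    return {origin: sorted(dests) for origin, dests in reachable.items()}


# Made-up durations for illustration only.
rows = [
    ("18137", "39061", 3900.0),
    ("18137", "18097", 5100.0),
    ("18137", "06037", 110000.0),  # far beyond two hours
    ("18137", "15001", None),      # no driving route
    ("39061", "18137", 3950.0),
]
lookup = reachable_counties(rows)
```

Keeping the values as plain lists of FIPS strings makes the result easy to serialize to JSON, so a front end can look up one county and fetch only the data files it needs.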
The Dirty Reprojectors technique mentioned above added another layer of complexity in handling user interactions. The point on the map that a reader interacts with does not correspond to a real latitude and longitude pair. Instead, the reprojected points sit around Null Island, the name for the point on the Earth’s surface where the prime meridian and the equator cross. So each user interaction (each map touch) was a query that had to determine which polygon of our Mapbox map layer was touched. Then a cascade of functions (involving the use of Turf to create a contour around the selected area) would visualize the selection on the map and in the line charts above the map.
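In a Mapbox GL JS map, finding the touched polygon is typically a call to `map.queryRenderedFeatures`. To show the underlying idea on its own, here is a small Python sketch that swaps in a classic ray-casting point-in-polygon test. It is not the team's code: it assumes each county is a single ring with no holes, and the two squares near (0, 0) merely stand in for reprojected county shapes.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # polygon outline as a list of (x, y) vertices


def point_in_ring(point: Point, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going in the +x direction crosses."""
    x, y = point
    inside = False
    j = len(ring) - 1  # index of the previous vertex
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        if (yi > y) != (yj > y):  # edge straddles the ray's y value
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def county_at(point: Point, counties: Dict[str, Ring]) -> Optional[str]:
    """Return the id of the first polygon containing the point, or None."""
    for county_id, ring in counties.items():
        if point_in_ring(point, ring):
            return county_id
    return None


# Hypothetical reprojected shapes clustered around Null Island (0, 0).
counties = {
    "A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
touched = county_at((1.5, 0.5), counties)
```

Because the lookup returns an identifier rather than coordinates, the rest of the pipeline can work from the county's id (a FIPS code, say) and never needs the touch point's reprojected position to be a real location.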
Did you have any unexpected learnings while putting this together?
Building this interactive let us start with high-level external observations and drill down to confirm that there was indeed a marked difference between adjacent regions. For example, we’d heard that for about a week, cases in northern Wisconsin had been going up. The visual shows that as of September 21, there were twice as many new cases per capita around Wausau as around Madison. Conversely, you can start with the area where you live, which might not be known for having a lot of cases per capita, and then see from the interactive that you live within close driving distance of an area that is seeing a rise in cases. I think that’s a very powerful feature of our project. We examined this more on Twitter.
Anything else we didn’t ask about that you’d like to share?
To encourage readers to explore and share what they find interesting, we update the hash at the end of the main URL as they interact with the graphic and search for places of interest. For Chicago, you could save and share this view, for instance. (Hint: get sharing!)
Thanks to Cedric and the entire Bloomberg Graphics team for working to give us the data we need to stay safe. Check out other Covid-19 related news stories built with Mapbox, and head over to our Matrix API documentation to learn more about calculating distances between thousands of points.
- Ryan Mannion - Manager, Technical Account Management - Mapbox | LinkedIn
- Cory Guillory - Enterprise Account Executive - Mapbox | LinkedIn
How Bloomberg News Pushed the Boundaries was originally published in maps for developers on Medium.