
Santa Rosa fire map: How I built it


By: Robin Kraft

Robin is a remote sensing expert with an environmental focus. He was a co-founder of Global Forest Watch, tracking tropical deforestation in near real time using satellite imagery. He led imagery analytics efforts at Planet, and is a product manager at the data science startup Domino Data Lab.

During a disaster, reliable information is unbearably rare. I learned this the hard way during the October wildfires in Northern California that threatened my hometown. As authorities, local news, and people on social media raced to share updates about where the fires were spreading, the inconsistent information they provided reflected the chaos and uncertainty on the ground. My family and friends needed a clearer picture.

🚨 There is another fire raging in Los Angeles right now — if DigitalGlobe and Planet release their data, you can use this guide to make your own map. 🚨

I spotted an opportunity to help when DigitalGlobe and Planet started releasing high-resolution satellite imagery of the region through their open data programs. The imagery covered most of the affected area and was fairly clear, but it was locked inside a few hundred gigabytes of GeoTIFF files — not much use if you escaped the fires with just your clothes and a smartphone. I set out to build something that more people could use.

You can check out the live map here.

I chose to build with Mapbox because I needed a platform powerful enough to process and host the imagery files, but also nimble enough to make a map that would load fast and be intuitive to use. In this post I’ll describe the simple, manual workflow that I used to publish an initial map of the fires as quickly as possible. I’ll then dive into optimizations I added to update the map and improve the user experience.

The easy way

The first time around, I downloaded a few images, stitched them together with gdal_merge.py, and uploaded the mosaic to Mapbox Studio. Once tiling was done, I added my new tileset to a new style based on Mapbox’s Satellite Streets template and clicked publish. This got me a fast, mobile-friendly map in less than an hour, from finding the imagery to texting the URL to a friend. When my friend asked why everything looked red, I squeezed a very brief explanation of false-color imagery into the map title. No coding, no styling — sensible defaults for the win!
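For reference, that first pass boils down to a single GDAL command before the Studio upload. Here's a minimal sketch, assuming a few downloaded GeoTIFFs sitting in the working directory (the scene names below are placeholders):

# stitch a handful of scenes into one GeoTIFF, then upload it through the Mapbox Studio UI
gdal_merge.py -o mosaic.tif scene_1.tif scene_2.tif scene_3.tif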

But this process wouldn’t scale. As more and more imagery became available and traffic to the map started creeping up, I realized I needed a better way to do updates.

The better way

In the end, I used largely the same tools for preprocessing: GDAL and Mapbox Studio. But I also used the Mapbox Uploads API and Amazon Web Services’ EC2 and S3. My code is available on GitHub here.

Working in Amazon’s cloud environment made a few things much better. First, DigitalGlobe and Mapbox operate on AWS already, and Planet is on Google Cloud — which has a fast pipe to AWS. Working within AWS would save me a lot of time on data transfer. Second, on my Ubuntu instance I used screen to keep processing going even if my laptop was off. I didn’t want to have to rely on a constant internet connection for a process that could potentially take an entire day to complete. You can check out this script to see how I installed my dependencies and prepped extra EBS storage.
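The screen part is just a couple of commands. Here's a generic sketch rather than the exact commands from the linked script:

screen -S gdal   # start a named session on the instance
# ...run the long GDAL jobs here, then detach with Ctrl-a d and close the laptop...
screen -r gdal   # reattach later to check on progress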

The other major improvement was the efficient use of GDAL. This was important as data volumes exploded once I expanded the area covered by the map and new imagery became available. Naively mosaicking and projecting hundreds of high-resolution, 1-gigabyte images together using gdal_merge.py was taking too long. Also, Mapbox has a 10 GB upload limit, so I couldn’t have uploaded the giant GeoTIFFs even if I’d wanted to.

Instead, I relied on virtual rasters (VRTs) and TIFF compression. VRTs are great because they are text files describing processing steps; they don’t actually apply the operations until you convert the VRT to a “proper” image file, like a GeoTIFF. This “lazy” processing technique simplified my workflow so that I didn’t have to manage multiple long-running processes or hundreds of files.

Step by step

My first operation was to create a single mosaic of all the files for a single day — which might be 50 files. Add the filenames to a text file, and you’re good to go!

ls *.tif > files.txt
gdalbuildvrt -input_file_list files.txt mosaic.vrt

“Mosaicking” finished in < 1 second! This isn’t a “real” mosaic yet, but GDAL treats it as one thanks to the VRT format.

Then, I wanted to save Mapbox some work and project the mosaic from its initial WGS84 CRS to Web Mercator — the standard map tile projection — per the docs. Otherwise, this would have to be done on Mapbox’s end while it created the tileset, and I was already seeing some timeout errors.

gdalwarp -s_srs EPSG:4326 -t_srs EPSG:3857 -of vrt -overwrite mosaic.vrt mosaic_webmerc.vrt

“Reprojecting” a massive “mosaic” in < 1 second? Excellent!

Finally, I had to create a real GeoTIFF I could upload to Mapbox. Only at this point do the mosaicking and reprojection actually occur. I again followed Mapbox’s suggestions for optimizing these images and threw in a few compression flags I learned about on Stack Overflow.

The most important optimization for the purposes of processing time and file size was using internal JPEG compression. JPEG compression is lossy, in that it throws away information in the image (mostly stuff that our eyes are less likely to notice). Normally for remote sensing that’s not ideal, but given the disaster, a bit of fuzziness at zoom level 22 for faster processing seemed like a reasonable tradeoff.

I also instructed GDAL to use a YCbCr photometric interpretation. I won’t pretend to fully understand the details, but it stores pixels as one luminance (brightness) channel plus two color channels, which JPEG can compress more efficiently than plain RGB. A 50 cm resolution mosaic covering about 800 square miles came to about 3 gigs — very manageable.

gdal_translate -co tiled=yes -co bigtiff=yes -co compress=jpeg -co photometric=ycbcr -co BLOCKXSIZE=256 -co BLOCKYSIZE=256 -of Gtiff mosaic_webmerc.vrt mosaic_final.tif

This step took 15 minutes or so, instead of many hours.
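Before uploading, it's worth a quick sanity check that the compression settings actually stuck. This isn't a step from the workflow above, just gdalinfo on the output:

# the image structure metadata should report JPEG compression and the overall raster size
gdalinfo mosaic_final.tif | grep -iE 'size|compression|interleave'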

Now I needed to get the GeoTIFF over to Mapbox for final processing. I wrote a simple uploader script using the Mapbox Python SDK to kick off the tiling process. The SDK handles authentication and request formatting for you: you just provide the data and the tileset name, upload the file, and start the tiling. Here’s the bare minimum:

from mapbox import Uploader

u = Uploader()  # reads your token from the MAPBOX_ACCESS_TOKEN environment variable
tileset = 'username.tileset_name'  # yes, you must include your username
url = u.stage(open('mosaic_final.tif', 'rb'))  # upload to staging happens here
u.create(url, tileset, name='mosaic_final')  # starts the tiling job; the name is just a display label

A better map experience

The standard “shareable” URL for a Mapbox map style is super convenient. But I learned three ways to optimize it and my map.

First, the default URL bypasses caching. This makes sense given that you typically use this to preview a map you’re working on. But it’s really bad if your map goes viral. If you’re satisfied using the default map, do yourself a favor and tweak the URL you share: change one of the URL parameters from “fresh=true” to “fresh=false”. Even with caching turned on, the map will update within a few minutes if you change something, and you won’t overload the servers.
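To make the shape of that change concrete, a shared style URL looks roughly like the line below; the exact parameters on your URL may differ, and the only edit is the fresh flag:

https://api.mapbox.com/styles/v1/<username>/<style_id>.html?fresh=false&access_token=<your_token>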

Second, the default view that appears when you share a map on social media is the extent of all the data. The Mapbox Satellite Streets layer is global; the default view covers Null Island and Africa.

You can control this in Studio with the Map Position tool: click “unlock”, browse to the default view you want, lock it again, and re-publish.

Third, the default map for a style is great, but there’s no address search or much else. In the case of the fire map, the content was what really mattered, and this default map got the majority of views. But I wanted a better user experience, so I needed to embed the map in my own website.

I used GitHub Pages, which can handle a lot of traffic to a static website. Embedding the map in a website gives you more freedom to add layers and other map controls. With the help of the Mapbox team and a few tutorials, I added address search and a comparison slider so that people could compare the fresh imagery to the standard Mapbox Satellite layer, which is older. It’s shocking to see what the devastated neighborhoods looked like before and after the fire.

Check out the source code for the embedded map page on GitHub. It also shows how to add Open Graph tags to control what appears when the map is shared on social media.

Maps matter

I’ve been working with satellite data and maps for a long time because I believe in the power of maps and geospatial data. But this disaster made me understand viscerally how powerful they can be. Using this imagery, I was fortunate to see my family home and wedding site intact.

Many others were not so fortunate. I was surprised that so much of the community found solace in this map, particularly since it put an intense loss on display for the world to see. But it turns out that it helped a lot of people move on. As one person wrote to me, “There are many homeowners in evacuated zones that have no idea whether their homes are still standing. This puts the power and knowledge back into their hands. It’s just awful waiting in suspense.”

My family, friends, and community are all grateful to Mapbox, DigitalGlobe, and Planet for their support during this disaster. Things like geocoded address searches, mobile-friendly maps, and rapid imagery updates are standard fare for those of us working in the geospatial world. But they make a huge difference for people in the midst of disaster.

There is now a fire raging in Los Angeles. Hopefully, DigitalGlobe and Planet will release their data ASAP, and then you can make the next viral fire map.

Read more about other disaster response organizations using our tools and explore our map design tutorials to inspire your next project.

Robin Kraft (@robinkraft) | Twitter


Santa Rosa fire map: How I built it was originally published in Points of interest on Medium.

