End-to-end feature extraction from aerial and satellite imagery
By: Daniel Hofmann & Bhargav Kowshik
Today, our data team is excited to introduce RoboSat — our open source, production-ready, end-to-end pipeline for feature extraction from aerial and satellite imagery. RoboSat streamlines the machine learning workflow, making it easier and faster to gather insights from our high-res imagery or your own. Use RoboSat to track deforestation, fires, and land use across the globe. Measure the impact of a natural disaster or humanitarian crisis. Or use RoboSat to validate changesets in OpenStreetMap in real time (see our guidelines for Mapbox Satellite + ML).
We have a long history of working with the OpenStreetMap community, ensuring the database is as complete and accurate as possible. The on-the-ground efforts of contributors across the world are what make OpenStreetMap such a powerful data source and community.
As the needs of our users and partners evolve, and the scale at which we must process data increases, we have to think about new and exciting ways to detect features automatically while also using traditional tooling to aid in the process. Learn more about RoboSat and contribute in the open repo on GitHub.
How RoboSat works
The RoboSat pipeline is categorized into three parts:
- Data preparation: automatically create a dataset for training feature extraction models
- Training and modeling: segmentation models for feature extraction from images
- Post-processing: turn segmentation results into clean and simple geometries
The data preparation tools make it easy for us to start creating a dataset for training feature extraction models. The dataset consists of aerial or satellite imagery and the corresponding masks for the features we want to extract. We provide convenient tools to automatically create these datasets, downloading aerial imagery from our Maps API and automatically generating masks from OpenStreetMap geometries — and RoboSat isn’t limited to those sources alone.
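The imagery and masks in such a dataset are organized in the standard XYZ ("slippy map") tiling scheme that map APIs serve. As a minimal illustration of how a geometry ends up in a particular tile (the function name here is ours, not part of RoboSat's API), this is the standard Web Mercator math for finding the tile that contains a WGS84 coordinate:

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) XYZ slippy-map tile containing a WGS84 coordinate."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# A coordinate in central Berlin at zoom 10
print(lonlat_to_tile(13.37771, 52.51628, 10))  # (550, 335)
```

Running this over the tiles covering each OpenStreetMap geometry of interest is, conceptually, how imagery download and mask generation line up with each other.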
The modeling tools help train fully convolutional neural nets for segmentation. We recommend using (potentially multiple) GPUs for these tools. We’re running RoboSat on AWS GPU instances and a GTX 1080 Ti GPU to keep our Berlin office warm during winter. Predicting with these trained models will result in segmentation probabilities and masks for each image.
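As a toy illustration of that prediction output (the function name and the 0.5 threshold are ours, not necessarily RoboSat's defaults), per-pixel foreground probabilities can be turned into a hard binary mask by thresholding:

```python
def probabilities_to_mask(probs, threshold=0.5):
    """Turn per-pixel foreground probabilities into a binary mask.

    `probs` is a 2D list of floats in [0, 1]. The 0.5 threshold is an
    illustrative default, not necessarily what RoboSat uses internally.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in probs]

# A tiny 2x2 "image" of segmentation probabilities
probs = [[0.9, 0.2],
         [0.4, 0.7]]
print(probabilities_to_mask(probs))  # [[1, 0], [0, 1]]
```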
The post-processing tools help clean up the segmentation model’s results. They are responsible for de-noising, simplifying geometries, transforming from pixels in geo-referenced image tiles to world coordinates (GeoJSON features), and properly handling tile boundaries.
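The pixel-to-world-coordinate step is the inverse of the Web Mercator tiling math. Here is a small sketch of that mapping (the function name and the 256-pixel tile size are our illustrative assumptions; RoboSat's own implementation may differ):

```python
import math

def tile_pixel_to_lonlat(x, y, zoom, px, py, tile_size=256):
    """Map a pixel inside XYZ tile (x, y) at `zoom` to a WGS84 (lon, lat) pair."""
    n = 2 ** zoom
    # Fractional tile coordinates of the pixel within the global tile grid
    fx = x + px / tile_size
    fy = y + py / tile_size
    lon = fx / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * fy / n))))
    return lon, lat

# The center pixel of the single zoom-0 tile is the origin of WGS84
print(tile_pixel_to_lonlat(0, 0, 0, 128, 128))  # (0.0, 0.0)
```

Applying this to every pixel of a segmentation mask, then merging polygons across adjacent tiles, is what turns raw model output into GeoJSON features.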
Here’s an example of the pipeline steps happening during prediction:
About the imagery
You can access our tiled imagery via our Maps API. Our high-resolution satellite layer is the ideal medium for feature extraction. Extractions can be performed free of charge (subject to map view & rate limits) for OpenStreetMap contributions, as well as general non-commercial purposes. Read the guidelines about free ML processing using our imagery. To learn more about commercial extractions, reach out to our team.
The humans behind RoboSat
Of course, none of this would have been possible without a remarkable team. And so, a big shout out to the entire team of engineers and scientists who fought hard to bring this to life and open source it for everyone.
It’s not hard to think of hundreds of creative use cases for RoboSat. Share your ideas with us, tweet @mapbox or comment in the GitHub repo. We’re interested in extracting everything from buildings, to streets, to parking lots — even lakes and rivers. We look forward to continuing to make RoboSat smarter and to keeping you updated on our progress.
If you’re interested in applying machine learning to mapping efforts, head to our careers page — we’re always looking to talk to passionate people.
Meet RoboSat 🤖 🛰 was originally published in Points of interest on Medium.