Geospatial data is in high demand for training machine learning models, but suitable geospatial datasets are not always easy to find.
That’s why we’ve done the tricky bit for you. We’ve searched high and low here at Twine to find the best geospatial datasets.
Are you ready?
Let’s dive in.
Here are our top picks for geospatial datasets:
Global Self-consistent, Hierarchical, High-resolution Geography Dataset (GSHHG)
This dataset contains high-resolution geography data, amalgamated from two databases: World Vector Shorelines (WVS) and CIA World Data Bank II (WDBII). The former is the basis for shorelines while the latter is the basis for lakes, although there are instances where differences in coastline representations necessitated adding WDBII islands to GSHHG.
GSHHG combines the older GSHHS shoreline database with WDBII rivers and borders, and is available in either ESRI shapefile format or a native binary format.
Global 1-km Consensus Land Cover Dataset
This dataset integrates multiple global land-cover products derived from remote sensing. It contains 12 data layers, each of which provides consensus information on the prevalence of one land-cover class. All data layers have a spatial extent from 90°N to 56°S and from 180°W to 180°E, and a spatial resolution of 30 arc-seconds per pixel (~1 km per pixel at the equator).
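To make those numbers concrete, here is a minimal sketch of the arithmetic behind the grid, using only the extent and resolution quoted above. The Earth radius and the spherical approximation are our own assumptions, not part of the dataset documentation, and the function name is purely illustrative.

```python
import math

# Figures taken from the dataset description above.
ARCSEC_PER_PIXEL = 30
LAT_NORTH, LAT_SOUTH = 90.0, -56.0
LON_WEST, LON_EAST = -180.0, 180.0

# Assumption: spherical Earth with a mean radius of 6371.0088 km.
EARTH_RADIUS_KM = 6371.0088

# 3600 arc-seconds per degree, so 30 arc-seconds gives 120 pixels per degree.
pixels_per_degree = 3600 // ARCSEC_PER_PIXEL

# Expected grid dimensions for the stated extent.
n_cols = round((LON_EAST - LON_WEST) * pixels_per_degree)    # 43200
n_rows = round((LAT_NORTH - LAT_SOUTH) * pixels_per_degree)  # 17520


def pixel_width_km(lat_deg):
    """Approximate east-west width of one pixel at the given latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_degree * math.cos(math.radians(lat_deg)) / pixels_per_degree


print(n_cols, n_rows)
print(round(pixel_width_km(0), 2))   # roughly 0.93 km at the equator
print(round(pixel_width_km(56), 2))  # narrower toward the southern edge
```

Under these assumptions, a pixel is a little under 1 km wide at the equator and shrinks with the cosine of latitude, which is worth remembering if you compute areas from pixel counts.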
Satellite Data for Air Quality Database
With support from NASA, the Holloway Group at SAGE has developed a set of user-friendly datasets to support the wider utilization of remote sensing data for air quality and health. This growing inventory of data includes:
- Shapefiles of NO2 air pollution from satellite for use in GIS platforms, including the EPA’s EJSCREEN platform for environmental justice
- 12 km x 12 km daily gridded data of NO2 air pollution from satellite for comparison with photochemical grid model output or other data sources
Greenhouse Gas Emissions on Croplands Dataset
This dataset provides global, crop-specific estimates (circa 2000) of GHG emissions and GHG intensity in high spatial detail, reporting the effects of rice paddy management, peatland draining, and nitrogen (N) fertilizer on CH4, CO2, and N2O emissions.
World Port Index Dataset
This dataset from the National Geospatial-Intelligence Agency lists approximately 3,700 ports across the world, along with their locations and the facilities they offer. It provides global maritime geospatial intelligence in support of national security objectives, including the safety of navigation, international obligations, and joint military operations.
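Because each entry carries a location, a common first step is a nearest-port lookup. The sketch below uses the standard haversine great-circle formula. The record layout (`name`, `lat`, `lon`) and the three sample ports are hypothetical placeholders, not actual World Port Index fields or entries, so adapt them to whatever columns your copy of the data uses.

```python
import math

# Assumption: spherical Earth with a mean radius of 6371.0088 km.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_port(ports, lat, lon):
    """Return the port record closest to (lat, lon)."""
    return min(ports, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


# Hypothetical records for illustration only; not real port data.
ports = [
    {"name": "Port A", "lat": 0.0, "lon": 0.0},
    {"name": "Port B", "lat": 0.0, "lon": 90.0},
    {"name": "Port C", "lat": 45.0, "lon": -30.0},
]

closest = nearest_port(ports, 10.0, 80.0)
print(closest["name"])  # Port B
```

A linear scan like this is fine for a few thousand ports. For repeated queries over larger point sets, a spatial index would be the more usual choice.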
To conclude, here are the top picks for the best geospatial datasets for your projects:
- Global Self-consistent, Hierarchical, High-resolution Geography Dataset (GSHHG)
- Global 1-km Consensus Land Cover Dataset
- Satellite Data for Air Quality Database
- Greenhouse Gas Emissions on Croplands Dataset
- World Port Index Dataset
We hope this list has helped you find a dataset for your project, or at least shown you the many options available.
If you want to learn more about how we could help build a custom dataset for your project, don’t hesitate to contact us!
Let us help you do the math – check our AI dataset project calculator.