Dynamic World, Near real-time global 10 m land use land cover mapping

1 Google, LLC, 1600 Amphitheatre Pkwy., Mountain View, CA, 94043, USA. 2 National Geographic Society, 1145 17th St NW, Washington, DC, 20036, USA. 3 World Resources Institute, 10 G St NE #800, Washington, DC, 20002, USA. 4 Department of Earth & Environment, Boston University, 685 Commonwealth Avenue, Boston, MA, 02215, USA. ✉ e-mail: c@google.com
Regularly updated global land use land cover (LULC) datasets provide the basis for understanding the status, trends, and pressures of human activity on carbon cycles, biodiversity, and other natural and anthropogenic processes [1-3]. Annual maps of global LULC have been developed by many groups. These maps include the National Aeronautics and Space Administration (NASA) MCD12Q1 500 m resolution dataset [4,5] (2001-2018), the European Space Agency (ESA) Climate Change Initiative (CCI) 300 m dataset [6] (1992-2018), and the Copernicus Global Land Service (CGLS) Land Cover 100 m dataset [7,8] (2015-2019). While widely used, many important LULC change processes are difficult or impossible to observe at a spatial resolution coarser than 100 m and an annual temporal resolution [9], such as emerging settlements and small-scale agriculture (prevalent in the developing world) and the early stages of deforestation and wetland/grassland conversion. Inability to resolve these processes introduces significant errors in our understanding of ecological dynamics and carbon budgets. Thus, there is a critical need for spatially explicit, moderate resolution (10-30 m/pixel) LULC products that are updated with greater temporal frequency.
Currently, almost all moderate resolution LULC products are available with only limited spatial and/or temporal coverage (e.g., USGS NLCD [10] and LCMAP [11]) or via proprietary and/or closed products (e.g., BaseVue [12], GlobeLand30 [13], GlobeLand10 [14]) that are generally not available to support monitoring, forecasting, and decision making in the public sphere. A noteworthy exception is the recent iMap 1.0 [15] series of products, available globally at a seasonal cadence with a 30 m resolution. Nonetheless, globally consistent, near real-time (NRT) mapping of LULC remains an ongoing challenge due to the tremendous computational and data storage requirements.
Simultaneous advances in large-scale cloud computing and machine learning algorithms in high-performance open source software frameworks (e.g., TensorFlow [16]), as well as increased access to satellite image collections through platforms such as Google Earth Engine [17], have opened new opportunities to create global LULC datasets at higher spatial resolutions and greater temporal cadence than ever before. In this paper, we introduce a new NRT LULC dataset produced using a deep-learning modeling approach. Our model, which was trained using a combination of hand-annotated imagery and unsupervised methods, is used to operationally generate NRT predictions of LULC class probabilities for new and historic Sentinel-2 imagery using cloud computing on Earth Engine and Google Cloud AI Platform. These products, which we refer to collectively as Dynamic World, are available as a continuously updating Earth Engine Image Collection that enables users to leverage both class probabilities and multi-temporal results to track LULC dynamics in NRT and to create custom products suited to their specific needs. We find that our model exhibits strong agreement with expert annotations for an unseen validation dataset and, though difficult to compare with existing products due to differences in temporal resolution and classification schemes, achieves better or comparable performance relative to other state-of-the-art global and regional products when compared against the same reference dataset.
The classification schema or "taxonomy" for Dynamic World, shown in Table 1, was determined after a review of global LULC maps, including the USGS Anderson classification system [18], the ESA Land Use and Coverage Area frame Survey (LUCAS) land cover modalities [19], the MapBiomas classification [20], and the GlobeLand30 land cover types [13]. The Dynamic World taxonomy maintains a close semblance to the land use classes presented in the IPCC Good Practice Guidance (forest land, grassland, cropland, wetland, settlement, and other) [21] to ensure easier application of the resulting data for estimating carbon stocks and greenhouse gas emissions. Unlike single-pixel labels, which are usually defined in terms of percent-cover thresholds, the Dynamic World taxonomy was applied using "dense" polygon-based annotations such that LULC labels are applied to areas of relatively homogeneous cover types with similar colors and textures.
Our modeling approach relies on semi-supervised deep learning and requires spatially dense (i.e., ideally wall-to-wall) annotations. To collect a diverse set of training and evaluation data, we divided the world into three regions: the Western Hemisphere (160°W to 20°W), Eastern Hemisphere-1 (20°W to 100°E), and Eastern Hemisphere-2 (100°E to 160°W). We further divided each region by the 14 RESOLVE Ecoregions biomes [22]. We collected a stratified sample of sites for each biome per region based on NASA MCD12Q1 land cover for 2017 [4]. Given the availability of higher-resolution LULC maps in the United States and Brazil, we used the NLCD 2016 [10] and MapBiomas 2017 [20] LULC products, respectively, in place of MODIS products for stratification in these two countries.
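The three longitude bands are straightforward to reproduce; the helper below assigns a sample longitude to its region, with the wrap-around of Eastern Hemisphere-2 across the antimeridian handled explicitly (the function name and signature are ours, for illustration).

```python
def sampling_region(lon: float) -> str:
    """Assign a longitude in [-180, 180) to one of the three sampling regions."""
    if -160 <= lon < -20:
        return 'Western Hemisphere'    # 160W to 20W
    if -20 <= lon < 100:
        return 'Eastern Hemisphere-1'  # 20W to 100E
    # 100E to 160W wraps across the antimeridian.
    return 'Eastern Hemisphere-2'
```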
At each sample location, we performed an initial selection of Sentinel-2 images from 2019 scenes based on image cloudiness metadata reported in the Sentinel-2 tile's QA60 band. We further filtered scenes to remove images with many masked pixels. We finally extracted individual tiles of 510 × 510 pixels centered on the sample sites from random dates in 2019. Tiles were sampled in the UTM projection of the source image, and we selected one tile corresponding to a single Sentinel-2 ID number and a single date.
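A minimal Earth Engine sketch of this kind of pre-filtering follows. The scene-level CLOUDY_PIXEL_PERCENTAGE property is used here as a stand-in for the QA60-derived cloudiness screening described above, and the 35% threshold is an illustrative assumption, not the paper's value.

```python
import ee

ee.Initialize()

site = ee.Geometry.Point(12.49, 41.89)  # placeholder sample location

# Candidate 2019 scenes over the site, screened by scene-level cloudiness
# metadata (threshold is illustrative, not the paper's).
scenes = (ee.ImageCollection('COPERNICUS/S2_SR')
          .filterBounds(site)
          .filterDate('2019-01-01', '2020-01-01')
          .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 35)))

# A 510 x 510 pixel tile at 10 m is a 5.1 km square centered on the site.
tile_region = site.buffer(2550).bounds()
```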
Further steps were then taken to obtain an "as balanced as possible" training dataset with respect to the LULC classifications from the respective LULC products. In particular, for each Dynamic World LULC category contained within a tile, the tile was labeled as high, medium, or low in that category. We then selected an approximately equal number of tiles with high, medium, or low category labels for each category.
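This balancing step amounts to sampling uniformly across the (category, level) strata; a small illustrative sketch (our own helper, not the paper's code) is below.

```python
import random
from collections import defaultdict

def balance_tiles(tiles, per_stratum, seed=0):
    """tiles: iterable of (tile_id, category, level) tuples, with level in
    {'high', 'medium', 'low'}. Returns up to per_stratum tiles per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for tile_id, category, level in tiles:
        strata[(category, level)].append(tile_id)
    selected = []
    for members in strata.values():
        rng.shuffle(members)
        selected.extend(members[:per_stratum])
    return selected
```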
To achieve a large dataset of labeled Sentinel-2 scenes, we worked with two groups of annotators. The first group included 25 annotators with previous photo-interpretation and/or remote sensing experience. The expert group labeled approximately 4,000 image tiles (Fig. 1a), which were then used to train and to measure the performance and accuracy of a second "non-expert" group of 45 additional annotators, who labeled a second set of approximately 20,000 image tiles (Fig. 1b). A final validation set of 409 image tiles was held back from the modeling effort and used for evaluation as described in the Technical Validation section. Each image tile in the validation set was annotated by three experts and one non-expert to facilitate cross-expert and expert/non-expert QA comparisons.
All Dynamic World annotators used the Labelbox platform [23], which provides a vector drawing tool to mark the boundaries of feature classes directly over the Sentinel-2 tile (Fig. 2). We instructed both expert and non-expert annotators to use dense markup instead of single-pixel labels, with a minimum mapping unit of 50 × 50 m (5 × 5 pixels). For water, trees, crops, built area, bare ground, snow & ice, and cloud, this was a fairly straightforward procedure at the Sentinel-2 10 m resolution, since these feature classes tend to appear in fairly homogeneous agglomerations. The shrub & scrub and flooded vegetation classes proved to be more challenging, as they tended not to appear as homogeneous features (e.g., a mix of vegetation types) and have variable appearance. Annotators used their best discretion in these situations based on the guidance provided in our training material (i.e., descriptions and examples in Table 1). In addition to the Sentinel-2 tile, annotators had access to a matching high-resolution satellite image via Google Maps and to ground photography via Google Street View from the image center point. We also provided the date and center point coordinates for each annotation. All annotators were asked to label at least 70% of a tile within 20 to 60 minutes and were allowed to skip some tiles to best balance their labeling accuracy with their efficiency.
We prepared Sentinel-2 imagery in a number of ways to accommodate both annotation and training workflows. An overview of the preprocessing workflow is shown in Fig. 3.
For training data collection, we used the Sentinel-2 Level-2A (L2A) product, which provides radiometrically calibrated surface reflectance (SR) processed using the Sen2Cor software package [24]. This advanced level of processing was advantageous for annotation, as it attempts to remove inter-scene variability due to solar distance, zenith angle, and atmospheric conditions. However, systematically produced Sentinel-2 SR products are currently only available from 2017 onwards. Therefore, for our modeling approach, we used the Level-1C (L1C) product, which has been generated since the beginning of the Sentinel-2 program in 2015. The L1C product represents Top-of-Atmosphere (TOA) reflectance measurements and is not subject to a change in processing algorithm in the future. We note that for any L2A image there is a corresponding L1C image, allowing us to directly map annotations performed using L2A imagery to the L1C imagery used in model training. All bands except B1, B8A, B9, and B10 were kept, with all bands bilinearly upsampled to 10 m for both training and inference.
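In Earth Engine terms, the band selection and upsampling step can be sketched as follows; the band list reflects the nine retained L1C bands, and the resampling call simply requests bilinear interpolation when pixels are regridded to 10 m.

```python
import ee

ee.Initialize()

# The nine Sentinel-2 L1C bands retained for modeling (B1, B8A, B9, B10 dropped).
KEPT_BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B11', 'B12']

def prepare_l1c(image: ee.Image) -> ee.Image:
    """Select the retained bands and request bilinear resampling, so the
    20 m and 60 m bands are interpolated when regridded to 10 m."""
    return image.select(KEPT_BANDS).resample('bilinear')

l1c = ee.ImageCollection('COPERNICUS/S2').map(prepare_l1c)
```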
In addition to our preliminary cloud filtering in training image selection, we adopted and applied a novel masking solution that combines several existing products and techniques. Our procedure is to first take the 10 m Sentinel-2 Cloud Probability (S2C) product available in Earth Engine [25] and join it to our working set of Sentinel-2 scenes such that each image is paired with the corresponding mask. We compute a cloud mask by thresholding S2C at a cloud probability of 65% to identify pixels that are likely obscured by cloud cover. We then apply the Cloud Displacement Index (CDI) algorithm [26] and threshold the result to produce a second cloud mask, which is intersected with the S2C mask to reduce errors of commission by removing bright non-cloud targets based on Sentinel-2 parallax effects. We finally intersect this sub-cirrus mask with a threshold on the Sentinel-2 cirrus band (B10), using the thresholding constants proposed for the CDI algorithm [26], and take a morphological opening of this as our cloudy pixel mask. This mask is computed at 20 m resolution.
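A hedged Earth Engine sketch of this combination follows. The S2_CLOUD_PROBABILITY collection and ee.Algorithms.Sentinel2.CDI are real Earth Engine assets and operations; the CDI cutoff of -0.5 and the B10 cirrus threshold are the constants proposed for the CDI algorithm, while the way the cirrus test is folded in (a union here) and the opening radius are our assumptions.

```python
import ee

ee.Initialize()

s2 = ee.ImageCollection('COPERNICUS/S2')
s2c = ee.ImageCollection('COPERNICUS/S2_CLOUD_PROBABILITY')

def cloud_mask(image: ee.Image) -> ee.Image:
    # Pair the scene with its 10 m cloud-probability image via system:index.
    prob = ee.Image(
        s2c.filter(ee.Filter.eq('system:index', image.get('system:index'))).first())
    s2c_cloud = prob.select('probability').gt(65)

    # Parallax-based Cloud Displacement Index; CDI < -0.5 flags true clouds
    # (threshold from the CDI paper).
    cdi_cloud = ee.Algorithms.Sentinel2.CDI(image).lt(-0.5)

    # Cirrus screen on band B10 (L1C DNs are reflectance x 10000, so 0.01 -> 100).
    cirrus = image.select('B10').gt(100)

    # Intersect S2C with CDI to remove bright non-cloud commission errors; the
    # union with cirrus and the opening radius are illustrative choices.
    mask = s2c_cloud.And(cdi_cloud).Or(cirrus)
    return mask.focalMin(1).focalMax(1).rename('cloud')
```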
Class 0: Water
Description:
• Water is present in the image.
• Contains little-to-no sparse vegetation, no rock outcrop, and no built-up features like docks.
• Does not include land that can be or has previously been covered by water.
Examples:
• Rivers
• Ponds & lakes
• Ocean
• Flooded salt pans

Class 1: Trees
Description:
• Any significant clustering of dense vegetation, typically with a closed or dense canopy.
• Taller and darker than surrounding vegetation (if surrounded by other vegetation).
Examples:
• Wooded vegetation
• Dense green shrubs
• Clusters of dense, tall vegetation within savannas
• Plantations such as apples, bananas, citrus, and rubber
• Swamp (dense/tall vegetation with no obvious water)
• Any mix of the above
• Any burned areas of the above

Class 2: Grass
Description:
• Open areas covered in homogeneous grasses with little to no taller vegetation.
• Other homogeneous areas of grass-like vegetation (blade-type leaves) that appear different from trees and shrubland.
• Wild cereals and grasses with no obvious human plotting (i.e., not a structured field).
Examples:
• Natural meadows and fields with sparse or no tree cover
• Open savanna with little to no tree cover
• Parks, golf courses, and human-manicured lawns, including large fields in urban settings such as soccer and baseball fields
• Tree cut-throughs for power lines, gas lines, etc.
• Pastures
• Reeds and marshes with no obvious flooding

Class 3: Flooded vegetation
Description:
• Areas of any type of vegetation with obvious intermixing of water.
• Do not assume an area is flooded if flooding is observed in another image.
Examples:
• Seasonally flooded areas that are a mix of grass/shrub/trees/bare ground
• Flooded mangroves
• Emergent vegetation

Class 4: Crops
Description:
• Human-planted/plotted cereals, grasses, and crops.
Examples:
• Corn, wheat, soy, etc.
• Hay and fallow plots of structured land

Class 5: Shrub & Scrub
Description:
• Mix of small clusters of plants or individual plants dispersed on a landscape that shows exposed soil and rock.
• Scrub-filled clearings within dense forests that are clearly not taller than trees; these appear grayer/browner due to less dense leaf cover.
Examples:
• Moderate to sparse cover of bushes, shrubs, and tufts of grass
• Savannas with very sparse grasses, trees, or other plants

Class 6: Built area
Description:
• Clusters of human-made structures or individual very large human-made structures.
• Contained industrial, commercial, and private buildings, and the associated parking lots.
• A mixture of residential buildings, streets, lawns, trees, isolated residential structures, or buildings surrounded by vegetative land covers.
• Major road and rail networks outside of the predominant residential areas.
• Large homogeneous impervious surfaces, including parking structures, large office buildings, and residential housing developments containing clusters of cul-de-sacs.
Examples:
• Clusters of houses, which can include small lawns or small patches of trees
• Dense villages, towns, and cityscape (buildings and roads together)
• Clusters of paved roads and large highways
• Asphalt and other human-made surfaces

Class 7: Bare ground
Description:
• Areas of rock or soil containing very sparse to no vegetation.
• Large areas of sand and deserts with little to no vegetation.
• Large individual or dense networks of dirt roads.
Examples:
• Exposed rock
• Exposed soil
• Desert and sand dunes
• Dry salt flats and salt pans
• Dried lake bottoms
• Mines
• Large empty lots in urban areas

Class 8: Snow & Ice
Description:
• Large homogeneous areas of thick snow or ice, typically only in mountain areas or at the highest latitudes.
• Large homogeneous areas of snowfall.
Examples:
• Glaciers
• Permanent snowpack
• Snowfall

Table 1. Dynamic World Land Use Land Cover (LULC) classification taxonomy. Definitions and examples were provided as part of annotator reference materials, along with descriptions of colors and patterns typically associated with each LULC type.
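For downstream work it is convenient to carry this taxonomy as a lookup keyed by class ID. The mapping below mirrors Table 1; the second tuple element is the probability band name used in the published collection, which is drawn from the released dataset rather than from Table 1 itself.

```python
# Dynamic World class IDs and names, mirroring Table 1. Band names are an
# assumption taken from the released GOOGLE/DYNAMICWORLD/V1 collection.
DW_TAXONOMY = {
    0: ('Water', 'water'),
    1: ('Trees', 'trees'),
    2: ('Grass', 'grass'),
    3: ('Flooded vegetation', 'flooded_vegetation'),
    4: ('Crops', 'crops'),
    5: ('Shrub & Scrub', 'shrub_and_scrub'),
    6: ('Built area', 'built'),
    7: ('Bare ground', 'bare'),
    8: ('Snow & Ice', 'snow_and_ice'),
}
```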

To remove cloud shadows, we extend the cloudy pixel mask 5 km in the direction opposite the solar azimuthal angle using the scene-level metadata "SOLAR_AZIMUTH_ANGLE" and a directional distance transform (DDT) operation in Earth Engine. The final cloud and shadow mask is resampled to 100 m to decrease both the data volume and processing time. The resulting mask is applied to Sentinel-2 images used for training and inference such that unmasked pixels represent observations that are likely to be cloud- and shadow-free.
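The directional distance transform is available on ee.Image; a sketch of the shadow projection step follows, assuming the cloud mask from the earlier sketch. The angle convention (90° minus the solar azimuth, as in Earth Engine's cloud-shadow tutorials) and the exact metadata key are assumptions here.

```python
import ee

def project_shadows(image: ee.Image, cloud: ee.Image) -> ee.Image:
    """Extend the cloud mask 5 km opposite the solar azimuth to catch shadows."""
    # Convert the compass azimuth (metadata key taken from the paper's text)
    # to the transform's angle convention, per Earth Engine's examples.
    shadow_azimuth = ee.Number(90).subtract(
        ee.Number(image.get('SOLAR_AZIMUTH_ANGLE')))
    # 5 km at 10 m/pixel = 500 pixels of directional distance.
    shadow = (cloud.directionalDistanceTransform(shadow_azimuth, 500)
              .select('distance').mask().rename('cloud'))
    combined = cloud.Or(shadow.unmask(0))
    # Resample the combined mask to 100 m to cut data volume and compute time.
    return combined.reproject(image.select('B2').projection().atScale(100))
```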
The distribution of Sentinel-2 reflectance values is highly compressed towards the low end of the sensor range, with the remainder mostly occupied by high-return phenomena like snow and ice, bare ground, and specular reflection. To combat this imbalance, we introduce a normalization scheme that better utilizes the useful range of Sentinel-2 reflectance values for each band. We first log-transform the raw reflectance values to …
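The excerpt breaks off here. As an illustration of the kind of per-band log scaling this paragraph describes, a minimal sketch is below; the log1p form and the 10,000 DN scale factor are our assumptions, not the paper's published constants.

```python
import numpy as np

def log_scale(dn: np.ndarray) -> np.ndarray:
    """Illustrative per-band log transform for Sentinel-2 L1C digital numbers.

    DNs are reflectance x 10000; log1p compresses the long bright tail
    (snow, bare ground, specular returns) while stretching the dark end
    where most observations sit. Constants are assumptions for illustration.
    """
    reflectance = dn.astype(np.float32) / 10000.0
    return np.log1p(reflectance)
```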
Fig. 1 Global distribution of annotated Sentinel-2 image tiles used for model training and periodic testing (neither including the 409 validation tiles). (a) 4,000 tiles interpreted by a group of 25 experts. (b) 20,000 tiles interpreted by a group of 45 non-experts. Hexagons represent approximately 58,500 km² areas, and shading corresponds to the count of annotated tile centroids per hexagon.
Fig. 2 Sentinel-2 tile and example reference annotation provided as part of interpreter training. This example was used to illustrate the Flooded vegetation class, which is distinguished by small "mottled" areas of water mixed with vegetation near a riverbed. Also note that some areas of the tile are left unlabeled.