
1 Summary

From the Halpern et al. (2012) supplemental information:

The ‘Lasting Special Places’ sub-goal focuses instead on those geographic locations that hold particular value for aesthetic, spiritual, cultural, recreational or existence reasons. This sub-goal is particularly hard to quantify. Ideally one would survey every community around the world to determine the top list of special places, and then assess how those locations are faring relative to a desired state (e.g., protected or well managed). The reality is that such lists do not exist. Instead, we assume areas that are protected represent these special places (i.e. the effort to protect them suggests they are important places).

Clearly this is an imperfect assumption but in many cases it will be true. Using lists of protected areas as the catalogue of special places then creates the problem of determining a reference condition. We do not know how many special places have yet to be protected, and so we end up having all identified special places also being protected. To solve this problem we make two important assumptions. First, we assume that all countries have roughly the same percentage of their coastal waters and coastline that qualify as lasting special places. In other words, they all have the same reference target (as a percentage of the total area). Second, we assume that the target reference level is 30% of area protected.

The model for this goal considers the inland coastal zone (up to 1 km inland) independently from, and equally weighted with, the offshore coastal zone (up to 3 nm offshore). The status for this goal is calculated as:

\[X_{LSP} = \frac{\left(\frac{Area_{P}}{Area_{P_{ref}}} + \frac{Area_{MPA}}{Area_{MPA_{ref}}}\right)}{2}\]

where:

  • \(Area_{P}\) = protected area within the inland 1 km buffer
  • \(Area_{P_{ref}}\) = reference value for inland protected area (30% of the total inland buffer area)
  • \(Area_{MPA}\) = marine protected area within the offshore 3 nm buffer
  • \(Area_{MPA_{ref}}\) = reference value for marine protected area (30% of the total offshore 3 nm buffer area)
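
As a quick worked example (illustrative numbers only), consider a region with 20% of its inland buffer and 15% of its offshore buffer protected:

area_ref <- 0.30                              # reference point: 30% of buffer area protected
ratio_inland   <- min(0.20 / area_ref, 1)     # inland ratio, capped at 1 -> 0.667
ratio_offshore <- min(0.15 / area_ref, 1)     # offshore ratio, capped at 1 -> 0.500
x_lsp <- (ratio_inland + ratio_offshore) / 2  # status -> 0.583

Each ratio is capped at 1, consistent with the status calculations later in this document.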

2 Updates from previous assessment

v2024 Using updated June 2024 data. Switched raster functions over to their terra equivalents (a few of the swaps are sketched below the updates list).

v2021 Using updated February 2021 data
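
For reference, the v2024 raster-to-terra migration mostly amounts to one-to-one function swaps; the pairs below are illustrative equivalents, not an exhaustive list of what changed in these scripts:

# raster::raster()      -> terra::rast()         # read or create rasters
# raster::rasterize()   -> terra::rasterize()    # burn polygons into a raster grid
# raster::crosstab()    -> terra::crosstab()     # tabulate combinations of layers
# raster::writeRaster() -> terra::writeRaster()  # write rasters to disk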


3 Data Source

Reference: IUCN and UNEP-WCMC (2023), The World Database on Protected Areas (WDPA) [On-line], May 2024. Cambridge, UK: UNEP-WCMC. Available at: www.protectedplanet.net.

Downloaded: May 24, 2024

Description: Shapefile of World Database on Protected Areas

Time range: 1800 - 2023; some protected areas do not have an associated “status year” and are reported as year 0.

Format: Shapefile

File location: Mazu:home/shares/ohi/git-annex/globalprep/_raw_data/wdpa_mpa/d2024/WDPA_Jun2024_Public_shp/


4 Setup

5 Methods

5.1 Filter and re-project WDPA polygons

The WDPA-MPA dataset comes as a shapefile or geodatabase in the WGS84 coordinate reference system.

  • For OHI we have chosen to count only protected areas with defined legal protection, so we apply a filter on the STATUS attribute that selects only STATUS == “Designated”.
    • According to the WDPA Manual: STATUS as “Designated” means: “Is recognized or dedicated through legal means. Implies specific binding commitment to conservation in the long term. Applicable to government and non-government sources.”
    • Other values for STATUS include “Proposed”, “Adopted”, “Inscribed”, “Not Reported”, and “Established”.
      • “Adopted” and “Inscribed” are World Heritage or Barcelona Convention sites; while these may seem important, they are generally also covered by overlapping “Designated” polygons, so they are protected through other designations.
  • In 2015, the USA began including polygons that represent marine management plans, in addition to more strictly defined protected areas. This information is contained in the “MANG_PLAN” field.
    • These programmatic management plans variously protect species, habitats, and (??), and can be MPA or non-MPA.
    • For OHI we have chosen to count only MPA programmatic management plans, omitting non-MPA programmatic management plans.
  • For ease of tallying areas, we convert the polygons to a Mollweide equal-area projection before rasterizing.

Once the polygons have been prepped, we rasterize the results to 500 m resolution.

This process is all done in the script 1_prep_wdpa_rast.Rmd; a minimal sketch of the sequence is below. Once it is complete, move on to computing zonal statistics.
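
The sketch below outlines the filter, re-project, and rasterize steps under stated assumptions: the file paths are hypothetical placeholders, and the exact string match on MANG_PLAN and the raster extent used in 1_prep_wdpa_rast.Rmd may differ.

library(sf)
library(terra)
library(dplyr)
library(stringr)

# hypothetical input path; the real paths are set in 1_prep_wdpa_rast.Rmd
wdpa_poly <- sf::st_read('path/to/WDPA_Jun2024_Public_shp/WDPA_poly.shp')

wdpa_filtered <- wdpa_poly %>%
  # keep only areas with defined legal protection
  filter(STATUS == 'Designated') %>%
  # drop US non-MPA programmatic management plans; keep NAs, since most polygons
  # have no management plan recorded (exact pattern is an assumption)
  filter(is.na(MANG_PLAN) | !str_detect(MANG_PLAN, 'Non-MPA Programmatic'))

# re-project to Mollweide equal-area, then rasterize to a 500 m grid,
# burning in the year of protection (earliest year where polygons overlap)
wdpa_vect <- terra::vect(sf::st_transform(wdpa_filtered, 'ESRI:54009'))
rast_template <- terra::rast(terra::ext(wdpa_vect), resolution = 500, crs = 'ESRI:54009')
rast_wdpa <- terra::rasterize(wdpa_vect, rast_template, field = 'STATUS_YR', fun = 'min')

terra::writeRaster(rast_wdpa, 'path/to/wdpa_2024_moll_500m.tif', overwrite = TRUE)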


5.2 Compute zonal statistics

Comparing the global WDPA raster to the 3 nautical mile offshore and 1 km inland region rasters, we can tally the protected area within each region and compare it to the total area of that region. Note that each cell is 500 m x 500 m, so its area is 0.25 km2; since we are simply calculating a ratio, the constant cancels out.

# list intermediate file paths
zonal_files <- c('zonal_3nm' =  file.path(dir_goal, 'int', 'zonal_stats_3nm.csv'),
                 'zonal_1km' =  file.path(dir_goal, 'int', 'zonal_stats_1km.csv'),
                 'zonal_eez' =  file.path(dir_goal, 'int', 'zonal_stats_eez.csv'))

# load raster created in 1_prep_wdpa_rast.Rmd
rast_wdpa <- terra::rast(file.path(dir_goal_anx, 'rast', 'wdpa_2024_moll_500m.tif'))

# point to 500 m rasters for 3 nautical mile coastal regions, and 1 km inland coastal regions.
dir_zones <- file.path(dir_anx, 'spatial/d2014/data/rgn_mol_raster_500m')

# list filepaths to raster files for LSP areas
rgn_rast_list <- c(
  'zonal_3nm' = file.path(dir_zones, 'rgn_offshore3nm_mol_500mcell.tif'),
  'zonal_1km' = file.path(dir_zones, 'rgn_inland1km_mol_500mcell.tif'),
  'zonal_eez' = file.path(dir_zones, 'rgn_eez_mol_500mcell.tif'))

### If this is the first pass through this data prep for the assessment, clear any stale files in `int` first.
### Filter out zonal stats files that already exist: files not yet created will be
### calculated below, existing ones are skipped (comment out to force recalculation)
zonal_files_to_run <- zonal_files[!file.exists(zonal_files)]
rgn_rast_list <- rgn_rast_list[!file.exists(zonal_files)]


### NOTE: terra::crosstab() returns this warning - does it affect the
### outcomes, or does the function coerce the correct outcome?
###   Warning message:
###   In FUN(X[[i]], ...) : integer overflow - use sum(as.numeric(.))
### zonal() wouldn't work here since we want to track the frequency of each
### year value within each rgn_id value.

# Back up the 500 m region rasters to a parallel '_old' directory

# Define the source and destination directories
source_dir <- "/home/shares/ohi/git-annex/globalprep/spatial/d2014/data/rgn_mol_raster_500m"
dest_dir   <- "/home/shares/ohi/git-annex/globalprep/spatial/d2014/data/rgn_mol_raster_500m_old"

# Create the destination directory if it doesn't already exist
if (!dir.exists(dest_dir)) {
  dir.create(dest_dir)
}

# List all files in the source directory and copy them to the destination
files <- list.files(source_dir, full.names = TRUE)
file.copy(files, dest_dir)

# Optionally, check that all files have been copied
files_dest <- list.files(dest_dir, full.names = TRUE)
all(file.path(dest_dir, basename(files)) %in% files_dest)



# Cross-tabulate the WDPA raster against a region raster, returning a dataframe
# of cell counts (n_cells) per region (rgn_id) per year of protection
lsp_crosstab <- function(rgn_rast_file, rast_values) { # rgn_rast_file <- rgn_rast_list[1]
  rgn_rast <- terra::rast(rgn_rast_file)
  rgn_rast <- terra::as.factor(rgn_rast)
  # rgn_rast <- rgn_rast + 0 ## fallback to ensure rgn_id values are correct: if
  # as.factor() doesn't work, comment out the as.factor() line and use this instead
  message('Cross tabulating ', rgn_rast_file)
  rast_df <- terra::crosstab(c(rast_values, rgn_rast), useNA = TRUE) %>%
    as.data.frame() %>%
    setNames(c('year', 'rgn_id', 'n_cells')) %>%
    mutate(year   = as.integer(as.character(year)),
           rgn_id = as.integer(as.character(rgn_id))) %>%
    arrange(rgn_id, year)

  return(rast_df)
}

# Process & save zonal statistics for each zone raster, timing each run;
# index by name rather than position, since the lists may have been filtered above

ptm <- proc.time()
x <- lsp_crosstab(rgn_rast_list['zonal_3nm'], rast_values = rast_wdpa) # ~35 minutes (3 nm)
cat('crosstab elapsed: ', (proc.time() - ptm)[3])

ptm <- proc.time()
y <- lsp_crosstab(rgn_rast_list['zonal_1km'], rast_values = rast_wdpa) # ~35 minutes (1 km)
cat('crosstab elapsed: ', (proc.time() - ptm)[3])

ptm <- proc.time()
z <- lsp_crosstab(rgn_rast_list['zonal_eez'], rast_values = rast_wdpa) # ~51 minutes (EEZ)
cat('crosstab elapsed: ', (proc.time() - ptm)[3])

## Save these files to the int folder
write_csv(x, zonal_files_to_run['zonal_3nm'])
write_csv(y, zonal_files_to_run['zonal_1km'])
write_csv(z, zonal_files_to_run['zonal_eez'])

5.2.1 2024 Issue with lsp layer: checking to ensure that the terra::rast() method works. See LSP Issue #286

Once the WDPA raster is cross-tabulated against the OHI region rasters (both 3 nm offshore and 1 km inland), we have the number of protected cells, identified by year of protection, within each region. NA year values indicate unprotected cells.
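
For orientation, each zonal stats dataframe has one row per region per year of protection; made-up rows might look like:

#  year rgn_id n_cells
#    NA      5   12035   <- unprotected cells in region 5
#  1975      5     210   <- cells first protected in 1975
#  1992      5    1480
#  ...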

5.2.2 Summary of zonal stats dataframes (3 nm offshore):

stats_3nm <- read_csv(zonal_files['zonal_3nm'])
print(summary(stats_3nm))

5.2.3 Summary of zonal stats dataframes (1 km inland):

stats_1km <- read_csv(zonal_files['zonal_1km'])
print(summary(stats_1km))

5.2.4 Summary of zonal stats dataframes (entire EEZ):

stats_eez <- read_csv(zonal_files['zonal_eez'])
print(summary(stats_eez))

5.3 Calculate protected area and total area by region

Grouping by rgn_id, the total number of cells per region is determined by summing cell counts across ALL years, including cells with year == NA (unprotected cells). The protected area for each year is then the cumulative sum of protected cells up to and including that year.

Since the cells are 500 m on a side, area is simply cell count × 0.25 km2 per cell.

Finally, we can calculate the status of a region for any given year as the ratio of protected to total area, normalized by the goal's target of 30% protected area; a toy example of the arithmetic follows below.
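
A toy example of the cell-count-to-status arithmetic (numbers are invented):

# suppose a region's 3 nm buffer holds 40,000 cells, 9,000 of them protected by 2023
n_cells_tot <- 40000
n_cells_cum <- 9000
a_tot_km2   <- n_cells_tot / 4                 # 0.25 km2 per cell -> 10,000 km2
a_prot_km2  <- n_cells_cum / 4                 # -> 2,250 km2
status      <- (a_prot_km2 / a_tot_km2) / 0.30 # 0.225 / 0.30 -> 0.75
status      <- min(status, 1)                  # capped at 1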

# read in LSP zone csvs created above
stats_3nm <- read_csv(zonal_files['zonal_3nm'])
stats_1km <- read_csv(zonal_files['zonal_1km'])
stats_eez <- read_csv(zonal_files['zonal_eez'])

# OHI core function: loads the region lookup tables into the environment,
# including rgns_eez (used below in calc_areas())
region_data()

# the WDPA data are published annually, so the max year should be the year
# before the OHI assessment year - currently 2023, checks out!
max_year <- max(c(stats_1km$year, stats_3nm$year, stats_eez$year), na.rm = TRUE)


### Determine total cells per region (n_cells_tot) and then a cumulative
### total of cells per region

# function to calculate total area and cumulative protected area by OHI region
calc_areas <- function(stats_df) {
  area_df <- stats_df %>%
    group_by(rgn_id) %>%
    mutate(n_cells_tot = sum(n_cells),
           a_tot_km2   = n_cells_tot / 4) %>%
    filter(!is.na(year) & !is.na(rgn_id)) %>%
    mutate(n_cells_cum = cumsum(n_cells),
           a_prot_km2  = n_cells_cum / 4) %>%
    complete(year = 2000:max_year) %>%
    fill(-year, .direction = 'down') %>% # fill while still grouped, so values don't leak across regions
    ungroup() %>%
    dplyr::select(-contains('cell')) %>%
    distinct() %>%
    left_join(rgns_eez, by = 'rgn_id') %>%
    dplyr::select(rgn_id:rgn_name)

  return(area_df)
}

# execute functions on stats data frames
prot_1km <- stats_1km %>% calc_areas()
prot_3nm <- stats_3nm %>% calc_areas()
prot_eez <- stats_eez %>% calc_areas()

# write results to int folder as csv
write_csv(prot_3nm, file.path(dir_goal, 'int', 'area_protected_3nm.csv'))
write_csv(prot_1km, file.path(dir_goal, 'int', 'area_protected_1km.csv'))
write_csv(prot_eez, file.path(dir_goal, 'int', 'area_protected_eez.csv'))

5.4 Write out layers

From the protected area files, write out the individual layers ready for the Toolbox™.

  • total area for offshore 3 nm and inland 1 km
  • protected area for offshore 3 nm and inland 1 km

# read in files and rename
prot_3nm <- read_csv(file.path(dir_goal, 'int', 'area_protected_3nm.csv')) %>%
  rename(area = a_tot_km2,
         a_prot_3nm = a_prot_km2)
prot_1km <- read_csv(file.path(dir_goal, 'int', 'area_protected_1km.csv')) %>%
  rename(area = a_tot_km2,
         a_prot_1km = a_prot_km2)

# function to write an LSP layer (rgn_id plus selected columns) to the output folder
write_lsp_layer <- function(df, layers, layername) {
  df1 <- df[ , c('rgn_id', layers)] %>%
    filter(rgn_id <= 250) %>%
    distinct()
  write_csv(df1, file.path(dir_goal, 'output', paste0(layername, '.csv')))
}

# write LSP output files
a_tot_3nm <- write_lsp_layer(prot_3nm, 'area', 'rgn_area_offshore3nm')
a_tot_1km <- write_lsp_layer(prot_1km, 'area', 'rgn_area_inland1km')

a_prot_3nm <- write_lsp_layer(prot_3nm, c('year', 'a_prot_3nm'), 'lsp_prot_area_offshore3nm')
a_prot_1km <- write_lsp_layer(prot_1km, c('year', 'a_prot_1km'), 'lsp_prot_area_inland1km')

Some goals require resilience layers calculated nearshore (3 nm) or across the entire EEZ.

area_ref <- 0.30 ### 30% of area protected = reference point

# read in regional data csvs from int
resil_3nm <- read_csv(file.path(dir_goal, 'int', 'area_protected_3nm.csv')) %>%
  mutate(resilience.score = (a_prot_km2 / a_tot_km2) / area_ref,
         resilience.score = ifelse(resilience.score > 1, 1, resilience.score))

resil_eez <- read_csv(file.path(dir_goal, 'int', 'area_protected_eez.csv')) %>%
  mutate(resilience.score = (a_prot_km2 / a_tot_km2) / area_ref,
         resilience.score = ifelse(resilience.score > 1, 1, resilience.score))
# ask Mel about using file.path versus here function

## Save resilience scores for 3 nm and EEZ data in output
tmp_3nm <- resil_3nm %>%
  dplyr::select(rgn_id, year, resilience.score)
write_csv(tmp_3nm, file.path(dir_goal, 'output', "mpa_3nm_resilience.csv"))

tmp_eez <- resil_eez %>%
  dplyr::select(rgn_id, year, resilience.score)
write_csv(tmp_eez, file.path(dir_goal, 'output', "mpa_eez_resilience.csv"))

6 Gapfill

There was no gapfilling for these data. We create gapfill files with values of 0 to indicate that no records were gapfilled.

library(dplyr)

# gapfill file for EEZ resilience
res_eez <- read.csv("output/mpa_eez_resilience.csv") %>%
  mutate(resilience.score = 0) %>%
  rename(gapfilled = resilience.score)

write.csv(res_eez, "output/mpa_eez_resilience_gf.csv", row.names = FALSE)

# gapfill file for 3 nm resilience
res_3nm <- read.csv("output/mpa_3nm_resilience.csv") %>%
  mutate(resilience.score = 0) %>%
  rename(gapfilled = resilience.score)

write.csv(res_3nm, "output/mpa_3nm_resilience_gf.csv", row.names = FALSE)

# gapfill file for inland 1 km protected area
inland <- read.csv("output/lsp_prot_area_inland1km.csv") %>%
  mutate(a_prot_1km = 0) %>%
  rename(gapfilled = a_prot_1km)

write.csv(inland, "output/lsp_prot_area_inland1km_gf.csv", row.names = FALSE)

# gapfill file for offshore 3 nm protected area
offshore <- read.csv("output/lsp_prot_area_offshore3nm.csv") %>%
  mutate(a_prot_3nm = 0) %>%
  rename(gapfilled = a_prot_3nm)

write.csv(offshore, "output/lsp_prot_area_offshore3nm_gf.csv", row.names = FALSE)

7 Data checking

Plot scores for the 2023 vs 2024 assessment years:

  • v2023 AY data goes through December 2022
  • v2024 AY data goes through December 2023 (downloaded on May 24, 2024)

library(ggplot2)
#library(plotly)

## Calculates this year and last year's coastal marine protected area ratio (CMPA/Ref-CMPA) for plotting
status_3nm_new <- read_csv(file.path(dir_goal, 'output', 'lsp_prot_area_offshore3nm.csv')) %>%
  full_join(read_csv(file.path(dir_goal, 'output', 'rgn_area_offshore3nm.csv')),
            by = 'rgn_id') %>%
  mutate(pct_prot_3nm_new = a_prot_3nm / area,
         status_3nm_new   = pct_prot_3nm_new / 0.3,
         status_3nm_new   = ifelse(status_3nm_new > 1, 1, status_3nm_new)) %>%
  filter(year == max(year)) %>%
  dplyr::select(rgn_id, pct_prot_3nm_new, status_3nm_new)

status_3nm_old <- read_csv(file.path(dir_goal, '../v2023/output', 'lsp_prot_area_offshore3nm.csv')) %>%
  full_join(read_csv(file.path(dir_goal, 'output', 'rgn_area_offshore3nm.csv')),
            by = 'rgn_id') %>%
  mutate(pct_prot_3nm_old = a_prot_3nm / area,
         status_3nm_old   = pct_prot_3nm_old / 0.3,
         status_3nm_old   = ifelse(status_3nm_old > 1, 1, status_3nm_old)) %>%
  filter(year == max(year)) %>%
  dplyr::select(rgn_id, pct_prot_3nm_old, status_3nm_old)

## Calculates this year and last year's coastline protected ratio (CP/Ref-CP) for plotting
status_1km_new <- read_csv(file.path(dir_goal, 'output', 'lsp_prot_area_inland1km.csv')) %>%
  full_join(read_csv(file.path(dir_goal, 'output', 'rgn_area_inland1km.csv')),
            by = 'rgn_id') %>%
  mutate(pct_prot_1km_new = a_prot_1km / area,
         status_1km_new   = pct_prot_1km_new / 0.3,
         status_1km_new   = ifelse(status_1km_new > 1, 1, status_1km_new)) %>%
  filter(year == max(year)) %>%
  dplyr::select(rgn_id, pct_prot_1km_new, status_1km_new)

status_1km_old <- read_csv(file.path(dir_goal, '../v2023/output', 'lsp_prot_area_inland1km.csv')) %>% 
  full_join(read_csv(file.path(dir_goal, 'output', 'rgn_area_inland1km.csv')),
            by = 'rgn_id') %>%
  mutate(pct_prot_1km_old = a_prot_1km / area,
         status_1km_old   = pct_prot_1km_old / 0.3,
         status_1km_old   = ifelse(status_1km_old > 1, 1, status_1km_old)) %>%
  filter(year == max(year)) %>%
  dplyr::select(rgn_id, pct_prot_1km_old, status_1km_old)

# Updated sequence to use pivot_longer() instead of gather()
lsp_new_old <- status_3nm_new %>%
  full_join(status_3nm_old, by = c('rgn_id')) %>%
  full_join(status_1km_new, by = c('rgn_id')) %>%
  full_join(status_1km_old, by = c('rgn_id')) %>%
  mutate(status_old = (status_3nm_old + status_1km_old) / 2,
         status_new = (status_3nm_new + status_1km_new) / 2) %>%
  pivot_longer(cols = contains('new'), names_to = 'rgn', values_to = 'score_new') %>%
  pivot_longer(cols = contains('old'), names_to = 'rgn_old', values_to = 'score_old') %>% 
  mutate(rgn = str_replace(rgn, '_new', ''),
         rgn_old = str_replace(rgn_old, '_old', ''),
         score_new = round(score_new, 3),
         score_old = round(score_old, 3)) %>%
  filter(rgn_id <= 250) %>%
  filter(rgn == rgn_old) %>%
  dplyr::select(-rgn_old) %>%
  left_join(rgns_eez, by = 'rgn_id') %>%
  dplyr::select(rgn_id:rgn_name)

# plot score change results
lsp_status_plot <- ggplot(lsp_new_old, 
                        aes(x = score_old, y = score_new, key = rgn_name)) +
  geom_point(alpha = .6) +
  theme(legend.position = 'none') +
  geom_abline(slope = 1, intercept = 0, color = 'red') +
  labs(x = 'LSP status v2023 (data through Jan 2022)',
       y = 'LSP status v2024 (data through Jun 2023)',
       title = 'Comparing LSP status: 2024 vs 2023') +
  facet_wrap( ~ rgn)

lsp_status_plot #got rid of ggplotly
# looks great!

# save plot to figs folder
ggsave(file.path(dir_goal, 'figs/plot_v2023_v2024.png'), 
       plot = lsp_status_plot, height = 4.5, width = 6)

# create data frame with scores that changed by more than 0.05
mjr_score_change <- lsp_new_old %>%
  mutate(diff = score_new - score_old) %>%
  filter(rgn == 'status' & abs(diff) > 0.05) %>%
  mutate(abs_diff = abs(diff))

# save major score changes to output folder as csv
# no data was available - ask Mel
write.csv(mjr_score_change, "output/major_changes_2024.csv", row.names = FALSE)