ohi logo
OHI Science | Citation policy

1 Summary

Spatial species range data and extinction risk data from IUCN is used to generate regional scores for the Species subgoal (part of the Biodiversity goal) and resilience layers.

Mean risk status per cell: Species ranges are converted to a global spatial raster of 10 km resolution.

The mean extinction risk for each cell, \(\bar{R}_{cell}\), is calculated by averaging the IUCN extinction risk of the species with ranges overlapping the cell.

Risk is a scaled value representing the species extinction risk category: * ‘LC’ = 0.0, ‘NT’ = 0.2, ‘VU’ = 0.4, ‘EN’ = 0.6, ‘CR’ = 0.8, ‘EX’ = 1.0

\[\bar{R}_{cell} = \frac{\displaystyle\sum_{species}(Risk)}{n_{spp}}\]

Mean risk status per region: The mean extinction risk for a region, \(\bar{R}_{SPP}\), is estimated by averaging the risk values of the raster cells falling within each OHI region, with each cell’s contribution weighted by the number of species in the cell.

Species goal model

The regional risk values are converted to species status scores by subtracting the risk values from 1 and rescaling so a risk value of $0.75 $ receives a score of zero.

From Halpern et al (2012):

The target for the Species sub-goal is to have all species at a risk status of Least Concern. We scaled the lower end of the biodiversity goal to be 0 when 75% species are extinct, a level comparable to the five documented mass extinctions and would constitute a catastrophic loss of biodiversity.

\[X_{SPP} = \frac{((1 - \bar{R}_{SPP}) - 0.25)}{(1 - 0.25)} * 100%\]

where:

  • \(X_{SPP}\) is Species goal status
  • \(\bar{R}_{SPP}\) is mean extinction risk for a region

Trend is calculated using time series of risk categories based on current and past assessments.

2 Updates from previous assessment

  • Trend is now calculated using the time series of risk categories from current and past IUCN assessments, and we use the trend to estimate previous scenario years of species status based on a linear model of change. Previously we used the most current scores for all previous scenario years.
  • We use only comprehensively assessed species groups (>90% of species have IUCN risk assessment)
  • We now use range maps from IUCN, whereas in the past we used Aquamap range maps when IUCN maps were unavailable. This is good news due to the challenges of aligning two different data sources.
  • We improve range maps using additional information about habitat, etc.
  • We incorporate regional IUCN scores for species.

3 Data Sources

IUCN:

  • Reference:
  • Description: Shapefiles containing polygons of assessed species ranges; each shapefile represents all assessed species within a comprehensively-assessed (i.e. >90% assessed) taxonomic group.
  • Native data resolution: NA
  • Time range: NA
  • Format: Shapefile

4 Methods

There are several steps that need to be taken to get to this point in the data prep.

Here is an overview of the organization of files and data that are run prior to this:

4.0.1 Code

4.0.1.1 Run this first! Setup directory: spp_risk_dists/_setup

In this directory are a sequence of files used to generate the bits and pieces that are later assembled into the rasters of biodiversity risk.

.Rmd files are sequenced with a prefix number (and letter) to indicate the order of operations. Briefly:

  1. Pull information from the IUCN Red List API to determine an overall species list, habitat information, and current risk (conservation status).
  2. Pull information from API on risk from regional assessments; also recode the regions according to Marine Ecoregions (Spalding et al, 2007) for later spatialization.
  3. Pull historical assessment information from API for possible trend analysis. Note that this did not make it into the final draft of the manuscript.
  4. Set up spatial layers in Gall-Peters, 100 km2 cells. Layers include:
    • cell ID (cells are sequentially numbered for combining with tabular data)
    • ocean area
    • marine protected area (classification, year of protection, proportion of protection)
    • Exclusive Economic Zones (EEZ) and FAO fishing regions
    • Marine Ecoregions of the World
    • bathymetry
    • NOTE: these layers are all saved in the spp_risk_dists/_spatial directory.
  5. Convert species range maps to rasters.
    • For maps provided directly by IUCN, aggregate into multispecies files based on family. There is some cleaning done at this stage to fix problematic extents and attributes.
    • From the list of all available maps, generate a master list of all mapped, assessed species for inclusion in the study.
    • Rasterize each species to a .csv that includes cell ID and presence. A .csv format was used for file size and ease of reading and binding into dataframes.
  6. Aggregate individual species ranges into larger taxonomic groups, and summarize key variables (mean risk, variance of risk, number of species, etc) by group.
    • Technically this is not necessary but makes it easier to quality check the process along the way, and supports mapping at the level of taxonomic group rather than the entire species list level.
    • This process is done twice: once for uniform weighting and once for range-rarity weighting. Resulting files are saved separately.

4.0.1.2 Then run this! Root directory: spp_risk_dists

At this level there are several scripts, prefixed 1x_biodiversity_maps, that collate the various taxonomic group level files (generated in setup part 6) and summarize to the global level.

  • Note each creates a specific aggregation - comprehensively assessed species vs all available species; uniform vs range-rarity weighting.
  • The rasters generated in these scripts are saved in the _output folder.

4.0.2 Data and output files

The spp_risk_dists/_data folder contains tabular data about IUCN species used throughout the processing of this analysis. These files are generated by scripts in the setup directory.

The spp_risk_dists/_spatial folder contains general spatial data generated and/or used in the setup scripts. These include:

  • rasters for cell ID, EEZ ID, marine ecoregion ID, ocean area, and bathymetry masks.
  • tabular data of region names and lookups for IUCN regional assessment to marine ecoregion.
  • tabular data of marine protected area level/year/coverage to cell ID.
  • shapefiles used for map plotting from Natural Earth.

The spp_risk_dists/_output folder contains the rasters of biodiversity risk, species richness, variance of risk, etc generated from the scripts in the base directory.


4.1 Compare all vs. comprehensively assessed species

In the past, we have used all species with IUCN risk assessments to calculate the species subgoal. However, some of Casey’s work suggests it is better to use the taxa groups that have been comprehensively assessed by IUCN (> 90% of species assessed). The general concern is that IUCN tends to oversample species in the Atlantic portion of the ocean, relative to other regions. This is indicated by the larger number of species with IUCN status in this region. However, the Atlantic falls in line with the other regions when looking at the comprehensively assessed species.

library(beyonce)
cols <- beyonce_palette(129, 100, type = "continuous")

n_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_comp.tif"))
log_n_comp <- log(n_comp)
plot(log_n_comp, col=cols)

n_all <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_all.tif"))
log_n_all <- log(n_all)
plot(log_n_all, col=cols)

prop_comp <- n_comp/n_all
plot(prop_comp, col=rev(cols))

4.2 SPP: Status

4.2.1 Status: Extract average species risk for each region

For each cell, we multiply the average species risk by the number of species in order to weight each cells contribution by the number of species. We sum these values for each region and calculate: (average species risk * number species)/number of species

mean_risk_comp <- raster::raster(here("globalprep/spp/v2018/_output/mean_risk_raster_comp.tif"))

n_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_comp.tif"))

regions_ohi <- raster::raster(here("globalprep/spp/v2018/_spatial/eez_rast.tif"))

risk_stack_comp <- stack(regions_ohi, mean_risk_comp, n_comp)
risk_vals_comp <- values(risk_stack_comp) %>%
  data.frame()
risk_vals_comp <- filter(risk_vals_comp, !is.na(eez_rast))
risk_vals_comp <- filter(risk_vals_comp, !is.na(mean_risk_raster_comp))

rgn_risk_comp <- risk_vals_comp %>%
  rowwise() %>%
  dplyr::mutate(risk_weight = mean_risk_raster_comp * n_spp_risk_raster_comp) %>%
  group_by(eez_rast) %>%
  summarize(rgn_risk_weight = sum(risk_weight),
            rgn_n_species = sum(n_spp_risk_raster_comp))

rgn_risk_comp <- rgn_risk_comp %>%
  dplyr::rowwise() %>%
  dplyr::mutate(mean_risk = rgn_risk_weight/rgn_n_species) %>%
  dplyr::select(rgn_id = eez_rast, mean_risk)

4.2.2 Status: estimate for previous years

We use the trend data to estimate risk values for previous years (vs. using the same values for all assessment years). The change in species status across years is based on a linear model.

Trend is calculated using the same method as the risk calculation. For each cell, we multiply the average species trend by the number of species in order to weight each cell’s contribution by the number of species in the cell. We sum these values for each OHI region and calculate for each region: (average species trend * number species)/number of species

trend_comp <- raster::raster(here("globalprep/spp/v2018/_output/trend_raster_comp.tif"))

n_trend_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_trend_raster_comp.tif"))

regions_ohi <- raster::raster(here("globalprep/spp/v2018/_spatial/eez_rast.tif"))

trend_stack_comp <- stack(regions_ohi, trend_comp, n_trend_comp)
trend_vals_comp <- values(trend_stack_comp) %>%
  data.frame()

trend_vals_comp <- filter(trend_vals_comp, !is.na(eez_rast))
trend_vals_comp <- filter(trend_vals_comp, !is.na(trend_raster_comp))

rgn_trend_comp <- trend_vals_comp %>%
  rowwise() %>%
  dplyr::mutate(trend_weight = trend_raster_comp * n_trend_raster_comp) %>%
  group_by(eez_rast) %>%
  summarize(rgn_trend_weight = sum(trend_weight),
            rgn_n_species = sum(n_trend_raster_comp)) %>%
  rename(rgn_id = eez_rast)

rgn_trend_comp <- rgn_trend_comp %>%
  dplyr::rowwise() %>%
  dplyr::mutate(mean_trend = rgn_trend_weight/rgn_n_species) %>%
  dplyr::select(rgn_id, mean_trend)

4.2.3 Status: Get yearly risk scores based on trend

We estimate previous risk for each region, using the trend data. We assume change in risk is linear.

assess_years <- 2012:2018
years <- expand.grid(rgn_id = unique(rgn_risk_comp$rgn_id), year=assess_years)

# this is what the trend will be multiplied by to get a risk estimate for each year:
year_multiplier <- data.frame(year=assess_years, multiplier = rev(0:(length(assess_years)-1))) 

rgn_risk_comp_yrs <- rgn_risk_comp %>%
  dplyr::left_join(rgn_trend_comp, by = "rgn_id") %>%
  dplyr::left_join(years, by = "rgn_id") %>%
  dplyr::left_join(year_multiplier, by="year") %>%
  dplyr::rowwise() %>%
  dplyr::mutate(mean_risk_per_year = mean_risk - mean_trend*multiplier) %>%
  dplyr::select(rgn_id, year, mean_risk = mean_risk_per_year)

4.2.4 Status: Converting regional mean risk to status

We rescale the data so a risk factor of 0.75 is equal to zero.

rgn_status <- rgn_risk_comp_yrs %>%
  mutate(spp_status = (0.75 - mean_risk)/0.75)

4.2.5 Status: Gapfill missing regions

Region 232 (Bosnia) does not have a value, which is not surprising because their coast is super small and results are erratic for this region. We gapfill with surrounding regions.

status_gf <- rgns_global %>%
  left_join(rgn_status) %>%
    dplyr::select(-mean_risk)
summary(status_gf)
##      rgn_id           year        spp_status    
##  Min.   :  1.0   Min.   :2012   Min.   :0.5852  
##  1st Qu.: 58.0   1st Qu.:2013   1st Qu.:0.7618  
##  Median :116.0   Median :2015   Median :0.7932  
##  Mean   :117.2   Mean   :2015   Mean   :0.7943  
##  3rd Qu.:176.0   3rd Qu.:2017   3rd Qu.:0.8365  
##  Max.   :250.0   Max.   :2018   Max.   :0.8961  
##                  NA's   :1      NA's   :1
filter(status_gf, is.na(spp_status))
##   rgn_id year spp_status
## 1    232   NA         NA
croatia <- filter(status_gf, rgn_id == 187)
mont <- filter(status_gf, rgn_id == 186) 

bosnia <- bind_rows(croatia, mont) %>%
  group_by(year) %>%
  summarize(spp_status = mean(spp_status)) %>%
  mutate(rgn_id = 232)

status_gf <- status_gf %>%
  filter(rgn_id !=232) %>%
  bind_rows(bosnia)

4.2.6 Status: Final formatting for ohi-global

status <- status_gf %>%
  dplyr::select(rgn_id, year, score = spp_status)
dim(status)  # 220*length(assess_years)
## [1] 1540    3
summary(status) # should be no NA values
##      rgn_id            year          score       
##  Min.   :  1.00   Min.   :2012   Min.   :0.5852  
##  1st Qu.: 58.75   1st Qu.:2013   1st Qu.:0.7613  
##  Median :116.50   Median :2015   Median :0.7928  
##  Mean   :117.64   Mean   :2015   Mean   :0.7940  
##  3rd Qu.:176.25   3rd Qu.:2017   3rd Qu.:0.8363  
##  Max.   :250.00   Max.   :2018   Max.   :0.8961
write.csv(status, here("globalprep/spp/v2018/output/sp_status_global.csv"), row.names=FALSE)  

4.2.7 Status: Compare to last year

old_spp <- read.csv(here("globalprep/spp_ico/v2017/output/spp_status_global.csv")) %>%
  mutate(year = 2018) %>%
  rename(old_score = score) %>%
  left_join(status) %>%
  rename(new_score = score)

plot(old_spp$old_score, old_spp$new_score)
abline(0,1, col="red")

old_spp_gather <- old_spp %>%
  dplyr::select(rgn_id, old_score, new_score) %>%
  tidyr::gather("assessment", "score", -1) %>%
  dplyr::filter(rgn_id <= 250)

ggplot(old_spp_gather, aes(y=assessment, x=score)) + 
  geom_density_ridges()

4.3 SPP: Trend

4.3.1 Trend: calculating

Getting proportional trend requires the status data (trend/status). Proportional trend is multiplied by 5 to get estimated change in five years.

# proportional trend requires status data
status <- read.csv(here("globalprep/spp/v2018/output/sp_status_global.csv")) %>%
  dplyr::filter(year==max(year)) %>%
  dplyr::select(rgn_id, score)

# Calculated in above section: Trend data
rgn_trend_score <- rgn_trend_comp %>%
  mutate(spp_trend_adj = -mean_trend/0.75) %>%  # puts in comparable units to status
  left_join(status, by="rgn_id") %>%
  dplyr::mutate(trend_score = spp_trend_adj/score * 5)

4.3.2 Trend: Gapfilling missing data

Check there are data for every region. Region 232 (Bosnia) does not have a value which is not surprising because their coast is super small and results are erratic for this region. We estimate this using the mean of the 2 surrounding regions.

trend <- rgns_global %>%
  left_join(rgn_trend_score) 

summary(trend)
##      rgn_id         mean_trend        spp_trend_adj           score       
##  Min.   :  1.00   Min.   :-0.001811   Min.   :-0.006276   Min.   :0.5852  
##  1st Qu.: 58.75   1st Qu.: 0.000495   1st Qu.:-0.002639   1st Qu.:0.7577  
##  Median :116.50   Median : 0.001416   Median :-0.001888   Median :0.7859  
##  Mean   :117.64   Mean   : 0.001263   Mean   :-0.001685   Mean   :0.7893  
##  3rd Qu.:176.25   3rd Qu.: 0.001980   3rd Qu.:-0.000660   3rd Qu.:0.8324  
##  Max.   :250.00   Max.   : 0.004707   Max.   : 0.002414   Max.   :0.8941  
##                   NA's   :1           NA's   :1           NA's   :1       
##   trend_score       
##  Min.   :-0.041357  
##  1st Qu.:-0.017813  
##  Median :-0.011790  
##  Mean   :-0.010939  
##  3rd Qu.:-0.004082  
##  Max.   : 0.013645  
##  NA's   :1
filter(trend, is.na(trend_score))
##   rgn_id mean_trend spp_trend_adj score trend_score
## 1    232         NA            NA    NA          NA
croatia <- filter(trend, rgn_id == 187)
mont <- filter(trend, rgn_id == 186)
bosnia <- mean(c(croatia$trend_score, mont$trend_score))

trend$trend_score[trend$rgn_id == 232] <- bosnia 

4.4 Trend: Final formatting for ohi-global

trend <- trend %>%
  dplyr::select(rgn_id, score = trend_score)
dim(trend) # should be 220
## [1] 220   2
summary(trend) # should be no NAs
##      rgn_id           score          
##  Min.   :  1.00   Min.   :-0.041357  
##  1st Qu.: 58.75   1st Qu.:-0.017792  
##  Median :116.50   Median :-0.011798  
##  Mean   :117.64   Mean   :-0.010960  
##  3rd Qu.:176.25   3rd Qu.:-0.004087  
##  Max.   :250.00   Max.   : 0.013645
write.csv(trend, here("globalprep/spp/v2018/output/sp_trend_global.csv"), row.names=FALSE)  

4.4.1 Trend: Compare to last year

We use a very different approach for calculating trend this year. Previously we used proxy data rather than actual change in IUCN status over time. Our previous method overestimated the magnitude of the trend. It is not surprising there is poor correlation with trend estimates in previous years, but it is reassuring the the values mainly fall in the same quadrant.

trend <- read.csv(here("globalprep/spp/v2018/output/sp_trend_global.csv"))  

old_spp <- read.csv(here("globalprep/spp_ico/v2017/output/spp_trend_global.csv")) %>%
  rename(old_score = score) %>%
  left_join(trend) 

plot(old_spp$old_score, old_spp$score, xlim=c(-0.35, 0.05))
abline(h=0, col="red")
abline(v=0, col="red")

4.4.2 Status/trend: save a record of gapfilling

status <- read.csv(here("globalprep/spp/v2018/output/sp_status_global.csv")) %>%
  mutate(gapfilled = ifelse(rgn_id == 232, 1, 0)) %>%
  mutate(method = ifelse(rgn_id == 232, "mean of neighbors", NA)) %>%
  dplyr::select(rgn_id, year, gapfilled, method)
write.csv(status, here("globalprep/spp/v2018/output/sp_status_global_gf.csv"), row.names=FALSE)          

trend <- read.csv(here("globalprep/spp/v2018/output/sp_trend_global.csv")) %>%
  mutate(gapfilled = ifelse(rgn_id == 232, 1, 0)) %>%
  mutate(method = ifelse(rgn_id == 232, "mean of neighbors", NA)) %>%
  dplyr::select(rgn_id, gapfilled, method)
write.csv(trend, here("globalprep/spp/v2018/output/sp_trend_global_gf.csv"), row.names=FALSE)          

4.5 Resilience data

We use species condition data as a resilience measure as well. We also calculate species condition at 3nm of shoreline, because for some goals, nearshore species condition is the relevant metric.

4.5.1 Resilience: Prepare rasters for 3nm extraction

We reproject the data to have higher resolution in order to more easily extract the data at the 3nm scale.
We modify the method a bit from above due to size of the rasters.

# relevant species files
mean_risk_comp <- raster::raster(here("globalprep/spp/v2018/_output/mean_risk_raster_comp.tif"))
n_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_comp.tif"))

risk_x_n <- mean_risk_comp*n_comp

# project rasters to moll
# saved in Mazu:spp/v2018
projectRaster(risk_x_n, rgns, method="ngb", over=TRUE, 
              filename=file.path(dir_M, "git-annex/globalprep/spp/v2018/int/risk_x_n_comp_mol.tif"),
              progress="text")
projectRaster(n_comp, rgns, method="ngb", over=TRUE, 
              filename=file.path(dir_M, "git-annex/globalprep/spp/v2018/int/n_comp_mol.tif"),
              progress="text")

4.5.2 Resilience: Extract data

Extract species risk data that corresponds to 3nm regions.

#3nm raster file
rgns <- raster(file.path(dir_M, "git-annex/globalprep/spatial/v2018/rgns_3nm_offshore_mol.tif"))
plot(rgns)

risk_x_n_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2018/int/risk_x_n_comp_mol.tif"))
n_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2018/int/n_comp_mol.tif"))

risk_stack <- stack(risk_x_n_mol, n_mol)

risk_df <- raster::zonal(risk_stack, rgns, fun='sum')

rgn_3nm_risk <- risk_df %>%
  data.frame() %>%
  rowwise() %>%
  dplyr::mutate(rgn_wt_risk = risk_x_n_comp_mol/n_comp_mol) %>%
  dplyr::select(rgn_id = zone, rgn_wt_risk)

4.5.3 Resilience: estimate for previous years

We use the trend data to estimate risk values for previous years (vs. using the same values for all assessment years). The change in species status across years is based on a linear model.

Trend is calculated using the same method as the risk calculation. For each cell, we multiply the average species trend by the number of species in order to weight each cell’s contribution by the number of species in the cell. We sum these values for each OHI region and calculate for each region: (average species trend * number species)/number of species

4.5.3.1 Resilience: Prepare rasters for 3nm extraction (trend to estimate previous years’ data)

We reproject the data to have higher resolution in order to more easily extract the data at the 3nm scale.

trend_comp <- raster::raster(here("globalprep/spp/v2018/_output/trend_raster_comp.tif"))

n_trend_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_trend_raster_comp.tif"))

trend_x_n <- trend_comp*n_trend_comp

# project rasters to moll
# saved in Mazu:spp/v2018
projectRaster(trend_x_n, rgns_3nm, method="ngb", over=TRUE, 
              filename=file.path(dir_M, "git-annex/globalprep/spp/v2018/int/trend_x_n_comp_mol.tif"),
              progress="text")
projectRaster(n_trend_comp, rgns_3nm, method="ngb", over=TRUE, 
              filename=file.path(dir_M, "git-annex/globalprep/spp/v2018/int/n_trend_comp_mol.tif"),
              progress="text")

4.5.4 Resilience: Extract trend data (used to estimate previous years’ data)

Extract species risk data that corresponds to 3nm regions.

#3nm raster file
rgns_3nm <- raster(file.path(dir_M, "git-annex/globalprep/spatial/v2018/rgns_3nm_offshore_mol.tif"))
plot(rgns_3nm)

trend_3nm_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2018/int/trend_x_n_comp_mol.tif"))
n_trend_3nm_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2018/int/n_trend_comp_mol.tif"))


trend_stack <- stack(trend_3nm_mol, n_trend_3nm_mol)

trend_df <- raster::zonal(trend_stack, rgns, fun='sum')

rgn_3nm_trend <- trend_df %>%
  data.frame() %>%
  rowwise() %>%
  dplyr::mutate(rgn_wt_trend = trend_x_n_comp_mol/n_trend_comp_mol) %>%
  dplyr::select(rgn_id = zone, rgn_wt_trend)

4.5.5 Resilience: Calculate yearly risk scores based on trend

We estimate previous risk for each region, using the trend data. We assume change in risk is linear.

assess_years <- 2012:2018
years <- expand.grid(rgn_id = unique(rgn_3nm_trend$rgn_id), year=assess_years)

# this is what the trend will be multiplied by to get a risk estimate for each year:
year_multiplier <- data.frame(year=assess_years, multiplier = rev(0:(length(assess_years)-1))) 

rgn_risk_3nm <- rgn_3nm_risk %>%
  left_join(rgn_3nm_trend, by = "rgn_id") %>%
  left_join(years, by = "rgn_id") %>%
  left_join(year_multiplier, by="year") %>%
  rowwise() %>%
  mutate(mean_risk_per_year = rgn_wt_risk - rgn_wt_trend*multiplier) %>%
  dplyr::select(rgn_id, year, mean_risk = mean_risk_per_year)

4.5.6 Resilience: Converting regional mean risk to status

We rescale the data so a risk factor of 0.75 is equal to zero.

rgn_3nm_res <- rgn_risk_3nm %>%
  mutate(spp_status = (0.75 - mean_risk)/0.75)

# quick check
hist(rgn_3nm_res$spp_status)

4.5.7 Resilience: Gapfill missing regions

Region 19 (Tuvalu) does not have a value. This is an island. We gapfill with the value from the entire eez.

res_gf <- rgns_global %>%
  left_join(rgn_3nm_res) %>%
    dplyr::select(-mean_risk)

summary(res_gf)
##      rgn_id            year        spp_status    
##  Min.   :  1.00   Min.   :2012   Min.   :0.7088  
##  1st Qu.: 58.75   1st Qu.:2013   1st Qu.:0.8086  
##  Median :116.50   Median :2015   Median :0.8608  
##  Mean   :117.64   Mean   :2015   Mean   :0.8478  
##  3rd Qu.:176.25   3rd Qu.:2017   3rd Qu.:0.8893  
##  Max.   :250.00   Max.   :2018   Max.   :0.9227  
##                                  NA's   :7
filter(res_gf, is.na(spp_status))
##   rgn_id year spp_status
## 1     19 2012        NaN
## 2     19 2013        NaN
## 3     19 2014        NaN
## 4     19 2015        NaN
## 5     19 2016        NaN
## 6     19 2017        NaN
## 7     19 2018        NaN
# get eez value:
eez_status <- read.csv(here("globalprep/spp/v2018/output/sp_status_global.csv")) %>%
  filter(rgn_id == 19) %>%
  rename(spp_status = score)

res_gf <- res_gf %>%
  filter(!is.na(spp_status)) %>%
  bind_rows(eez_status)


summary(res_gf)
##      rgn_id            year        spp_status    
##  Min.   :  1.00   Min.   :2012   Min.   :0.7088  
##  1st Qu.: 58.75   1st Qu.:2013   1st Qu.:0.8080  
##  Median :116.50   Median :2015   Median :0.8603  
##  Mean   :117.64   Mean   :2015   Mean   :0.8474  
##  3rd Qu.:176.25   3rd Qu.:2017   3rd Qu.:0.8893  
##  Max.   :250.00   Max.   :2018   Max.   :0.9227

4.5.8 Resilience: Final formatting for ohi-global

resilience <- res_gf %>%
  dplyr::select(rgn_id, year, score = spp_status)
dim(resilience)  # 220 * 7
## [1] 1540    3
summary(resilience) # should be no NA values
##      rgn_id            year          score       
##  Min.   :  1.00   Min.   :2012   Min.   :0.7088  
##  1st Qu.: 58.75   1st Qu.:2013   1st Qu.:0.8080  
##  Median :116.50   Median :2015   Median :0.8603  
##  Mean   :117.64   Mean   :2015   Mean   :0.8474  
##  3rd Qu.:176.25   3rd Qu.:2017   3rd Qu.:0.8893  
##  Max.   :250.00   Max.   :2018   Max.   :0.9227
write.csv(resilience, here("globalprep/spp/v2018/output/sp_status_3nm.csv"), row.names=FALSE)  

4.5.9 Resilience: save a record of gapfilling

res <- read.csv(here("globalprep/spp/v2018/output/sp_status_3nm.csv")) %>%
  mutate(gapfilled = ifelse(rgn_id == 19, 1, 0)) %>%
  mutate(method = ifelse(rgn_id == 19, "eez scale data used", NA)) %>%
  dplyr::select(rgn_id, year, gapfilled, method)
write.csv(res, here("globalprep/spp/v2018/output/sp_status_3nm_gf.csv"), row.names=FALSE)          

4.5.10 Resilience: Compare

Compared to the entire EEZ, most (but not all) coastal areas have lower species condition scores. The correlation with last year isn’t that great, but it is similar to what we observed in the EEZ data.

# compare to eez values
eez_status <- read.csv(here("globalprep/spp/v2018/output/sp_status_global.csv")) %>%
  filter(year == max(year)) %>%
  dplyr::select(rgn_id, eez_score = score)

status <- read.csv(here("globalprep/spp/v2018/output/sp_status_3nm.csv")) %>%
  dplyr::select(rgn_id, year, nm3_score = score) %>%
  filter(year == max(year)) %>%
  left_join(eez_status, by = "rgn_id")

plot(status$nm3_score, status$eez_score)
abline(0,1)

# compare to last year's values
status <- read.csv(here("globalprep/spp/v2018/output/sp_status_3nm.csv")) %>%
  filter(year == max(year))

old_spp <- read.csv(here("globalprep/spp_ico/v2017/output/spp_status_3nm.csv")) %>%
  mutate(year = 2018) %>%
  rename(old_score = score) %>%
  left_join(status) %>%
  rename(new_score = score)

plot(old_spp$old_score, old_spp$new_score)
abline(0,1, col="red")

old_spp_gather <- old_spp %>%
  dplyr::select(rgn_id, old_score, new_score) %>%
  tidyr::gather("assessment", "score", -1) %>%
  filter(rgn_id <= 250)

ggplot(old_spp_gather, aes(y=assessment, x=score)) + 
  geom_density_ridges()