We use the methods and code from Casey O’Hara (https://github.com/oharac/spp_risk_dists) to prepare the data for the species subgoal.
Reference:
Spatial species range data and extinction risk data from the IUCN are used to generate regional scores for the Species subgoal (part of the Biodiversity goal) and resilience layers.
Mean risk status per cell: Species ranges are converted to a global spatial raster of 10 km resolution.
The mean extinction risk for each cell, \(\bar{R}_{cell}\), is calculated by averaging the IUCN extinction risk of the species with ranges overlapping the cell.
Risk is a scaled value representing the species’ extinction risk category: ‘LC’ = 0.0, ‘NT’ = 0.2, ‘VU’ = 0.4, ‘EN’ = 0.6, ‘CR’ = 0.8, ‘EX’ = 1.0
\[\bar{R}_{cell} = \frac{\displaystyle\sum_{species}(Risk)}{n_{spp}}\]
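As a minimal sketch of this calculation (the cell contents here are hypothetical, not real data):

```r
# Risk values assigned to the IUCN extinction risk categories listed above
risk_vals <- c(LC = 0.0, NT = 0.2, VU = 0.4, EN = 0.6, CR = 0.8, EX = 1.0)

# Hypothetical cell overlapped by the ranges of four species
cell_spp <- c("LC", "LC", "VU", "EN")

# Mean risk for the cell: sum of species risk values divided by species count
mean_risk_cell <- sum(risk_vals[cell_spp]) / length(cell_spp)
mean_risk_cell # 0.25
```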
Mean risk status per region: The mean extinction risk for a region, \(\bar{R}_{SPP}\), is estimated by averaging the risk values of the raster cells falling within each OHI region, with each cell’s contribution weighted by the number of species in the cell.
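Restating this cell-weighted regional average in symbols:

\[\bar{R}_{SPP} = \frac{\displaystyle\sum_{cells}\left(\bar{R}_{cell} \times n_{spp,cell}\right)}{\displaystyle\sum_{cells} n_{spp,cell}}\]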
Species goal model
The regional risk values are converted to species status scores by subtracting the risk values from 1 and rescaling so a risk value of 0.75 receives a score of zero.
From Halpern et al (2012):
The target for the Species sub-goal is to have all species at a risk status of Least Concern. We scaled the lower end of the biodiversity goal to be 0 when 75% of species are extinct, a level comparable to the five documented mass extinctions that would constitute a catastrophic loss of biodiversity.
\[X_{SPP} = \frac{(1 - \bar{R}_{SPP}) - 0.25}{1 - 0.25} \times 100\%\]
where \(\bar{R}_{SPP}\) is the mean regional extinction risk calculated above.
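The status model can be sketched as a small R function:

```r
# Rescale mean regional risk to a 0-100 status score:
# risk = 0 (all species Least Concern) scores 100; risk = 0.75 scores 0
spp_status <- function(mean_risk) {
  ((1 - mean_risk) - 0.25) / (1 - 0.25) * 100
}

spp_status(0)    # 100: all species at Least Concern
spp_status(0.75) # 0: catastrophic-loss threshold
```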
Trend is calculated using time series of risk categories based on current and past assessments.
Added Casey O’Hara’s personal common.R functions into the common_fxns.R file located in the ‘_setup’ folder for this goal. Added an additional year of data.
IUCN:
Several steps need to be taken to get to this point in the data prep. Here is an overview of the organization of the files and data that are run prior to this:
The spp_risk_dists/_setup directory contains a sequence of files used to generate the bits and pieces that are later assembled into the rasters of biodiversity risk. The .Rmd files are sequenced with a prefix number (and letter) to indicate the order of operations.

At the base level of spp_risk_dists there are several scripts, prefixed 1x_biodiversity_maps, that collate the various taxonomic group level files (generated in setup part 6) and summarize to the global level, saving results to the _output folder.

The spp_risk_dists/_data folder contains tabular data about IUCN species used throughout the processing of this analysis. These files are generated by scripts in the setup directory.

The spp_risk_dists/_spatial folder contains general spatial data generated and/or used in the setup scripts.

The spp_risk_dists/_output folder contains the rasters of biodiversity risk, species richness, variance of risk, etc., generated from the scripts in the base directory.
In the past, we have used all species with IUCN risk assessments to calculate the species subgoal. However, some of Casey’s work suggests it is better to use the taxa groups that have been comprehensively assessed by IUCN (> 90% of species assessed). The general concern is that IUCN tends to oversample species in the Atlantic portion of the ocean, relative to other regions. This is indicated by the larger number of species with IUCN status in this region. However, the Atlantic falls in line with the other regions when looking at the comprehensively assessed species.
# This will be the same for 2019 since we only used comprehensively assessed taxa
library(beyonce)
cols <- beyonce_palette(129, 100, type = "continuous")
#2019
n_comp <- raster::raster(here("globalprep/spp/v2019/_output/n_spp_risk_raster_comp.tif"))
log_n_comp <- log(n_comp)
plot(log_n_comp, col=cols)
#2018
#n_comp <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_comp.tif"))
#log_n_comp <- log(n_comp)
#plot(log_n_comp, col=cols)
#2019
n_all <- raster::raster(here("globalprep/spp/v2019/_output/n_spp_risk_raster_all.tif"))
log_n_all <- log(n_all)
plot(log_n_all, col=cols)
#2018
#n_all <- raster::raster(here("globalprep/spp/v2018/_output/n_spp_risk_raster_all.tif"))
#log_n_all <- log(n_all)
#plot(log_n_all, col=cols)
prop_comp <- n_comp/n_all
plot(prop_comp, col=rev(cols))
# This largely makes sense. Damselfish and a few reptiles were classified as "non comprehensive", but they are actually comprehensively assessed.
For each cell, we multiply the average species risk by the number of species in order to weight each cell’s contribution by the number of species. We sum these weighted values within each region and divide by the region’s total species count: sum(average species risk × number of species) / sum(number of species).
mean_risk_comp <- raster::raster(here("globalprep/spp/v2019/_output/mean_risk_raster_comp.tif"))
n_comp <- raster::raster(here("globalprep/spp/v2019/_output/n_spp_risk_raster_comp.tif"))
regions_ohi <- raster::raster(here("globalprep/spp/v2019/_spatial/eez_rast.tif"))
risk_stack_comp <- stack(regions_ohi, mean_risk_comp, n_comp)
risk_vals_comp <- values(risk_stack_comp) %>%
data.frame()
risk_vals_comp <- filter(risk_vals_comp, !is.na(eez_rast))
risk_vals_comp <- filter(risk_vals_comp, !is.na(mean_risk_raster_comp))
rgn_risk_comp <- risk_vals_comp %>%
rowwise() %>%
dplyr::mutate(risk_weight = mean_risk_raster_comp * n_spp_risk_raster_comp) %>%
group_by(eez_rast) %>%
summarize(rgn_risk_weight = sum(risk_weight),
rgn_n_species = sum(n_spp_risk_raster_comp))
rgn_risk_comp <- rgn_risk_comp %>%
dplyr::rowwise() %>%
dplyr::mutate(mean_risk = rgn_risk_weight/rgn_n_species) %>%
dplyr::select(rgn_id = eez_rast, mean_risk)
We use the trend data to estimate risk values for previous years (vs. using the same values for all assessment years). The change in species status across years is based on a linear model.
Trend is calculated using the same method as the risk calculation. For each cell, we multiply the average species trend by the number of species in order to weight each cell’s contribution by the number of species in the cell. We sum these weighted values for each OHI region and divide by the region’s total species count: sum(average species trend × number of species) / sum(number of species).
trend_comp <- raster::raster(here("globalprep/spp/v2019/_output/trend_raster_comp.tif"))
n_trend_comp <- raster::raster(here("globalprep/spp/v2019/_output/n_trend_raster_comp.tif"))
regions_ohi <- raster::raster(here("globalprep/spp/v2019/_spatial/eez_rast.tif"))
trend_stack_comp <- stack(regions_ohi, trend_comp, n_trend_comp)
trend_vals_comp <- values(trend_stack_comp) %>%
data.frame()
trend_vals_comp <- filter(trend_vals_comp, !is.na(eez_rast))
trend_vals_comp <- filter(trend_vals_comp, !is.na(trend_raster_comp))
rgn_trend_comp <- trend_vals_comp %>%
rowwise() %>%
dplyr::mutate(trend_weight = trend_raster_comp * n_trend_raster_comp) %>%
group_by(eez_rast) %>%
summarize(rgn_trend_weight = sum(trend_weight),
rgn_n_species = sum(n_trend_raster_comp)) %>%
rename(rgn_id = eez_rast)
rgn_trend_comp <- rgn_trend_comp %>%
dplyr::rowwise() %>%
dplyr::mutate(mean_trend = rgn_trend_weight/rgn_n_species) %>%
dplyr::select(rgn_id, mean_trend)
We estimate previous risk for each region, using the trend data. We assume change in risk is linear.
assess_years <- 2012:2019
years <- expand.grid(rgn_id = unique(rgn_risk_comp$rgn_id), year=assess_years)
# this is what the trend will be multiplied by to get a risk estimate for each year:
year_multiplier <- data.frame(year=assess_years, multiplier = rev(0:(length(assess_years)-1)))
rgn_risk_comp_yrs <- rgn_risk_comp %>%
dplyr::left_join(rgn_trend_comp, by = "rgn_id") %>%
dplyr::left_join(years, by = "rgn_id") %>%
dplyr::left_join(year_multiplier, by="year") %>%
dplyr::rowwise() %>%
dplyr::mutate(mean_risk_per_year = mean_risk - mean_trend*multiplier) %>%
dplyr::select(rgn_id, year, mean_risk = mean_risk_per_year)
We rescale the data so a risk factor of 0.75 is equal to zero.
Region 232 (Bosnia) does not have a value, which is not surprising because its coast is very small and results are erratic for this region. We gapfill with surrounding regions.
## rgn_id year spp_status
## Min. : 1.0 Min. :2012 Min. :0.6171
## 1st Qu.: 58.0 1st Qu.:2014 1st Qu.:0.7697
## Median :116.0 Median :2016 Median :0.8055
## Mean :117.2 Mean :2016 Mean :0.8022
## 3rd Qu.:176.0 3rd Qu.:2017 3rd Qu.:0.8431
## Max. :250.0 Max. :2019 Max. :0.8964
## NA's :1 NA's :1
## rgn_id year spp_status
## 1 232 NA NA
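The neighbor gapfill can be sketched as follows; this is a minimal illustration in base R, and the neighbor region ids (100 and 101) are placeholders, not the actual ids used:

```r
# Toy status table: region 232 (Bosnia) is missing a value and is gapfilled
# with the mean of two hypothetical neighboring regions, ids 100 and 101 here
status_gf <- data.frame(
  rgn_id     = c(100, 101, 232),
  year       = 2019,
  spp_status = c(0.80, 0.84, NA)
)

neighbors <- c(100, 101)
fill_val  <- mean(status_gf$spp_status[status_gf$rgn_id %in% neighbors])

# Replace the missing value with the neighbor mean
status_gf$spp_status[status_gf$rgn_id == 232] <- fill_val
status_gf[status_gf$rgn_id == 232, ]
```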
status <- status_gf %>%
dplyr::select(rgn_id, year, score = spp_status)
dim(status) # 220*length(assess_years)
## [1] 1760 3
## rgn_id year score
## Min. : 1.00 Min. :2012 Min. :0.6171
## 1st Qu.: 58.75 1st Qu.:2014 1st Qu.:0.7693
## Median :116.50 Median :2016 Median :0.8051
## Mean :117.64 Mean :2016 Mean :0.8019
## 3rd Qu.:176.25 3rd Qu.:2017 3rd Qu.:0.8429
## Max. :250.00 Max. :2019 Max. :0.8964
status <- read_csv(here("globalprep/spp/v2019/output/sp_status_global.csv"))
old_spp <- read.csv(here("globalprep/spp/v2018/output/sp_status_global.csv")) %>%
filter(year == max(year)) %>%
rename(old_score = score) %>%
mutate(year = 2019) %>%
left_join(status) %>%
rename(new_score = score)
plot(old_spp$old_score, old_spp$new_score)
abline(0,1, col="red")
Getting proportional trend requires the status data (trend/status). Proportional trend is multiplied by 5 to get estimated change in five years.
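Written out, the trend score computed in the code below is (with status \(X_{SPP}\) on a 0–1 scale):

\[trend\_score = \frac{-\,\bar{T}_{SPP}/0.75}{X_{SPP}} \times 5\]

where \(\bar{T}_{SPP}\) is the mean regional trend in risk; dividing by 0.75 puts the change in risk on the same scale as status, and multiplying by 5 estimates the change over five years.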
# proportional trend requires status data
status <- read.csv(here("globalprep/spp/v2019/output/sp_status_global.csv")) %>%
dplyr::filter(year==max(year)) %>%
dplyr::select(rgn_id, score)
# Calculated in above section: Trend data
rgn_trend_score <- rgn_trend_comp %>%
mutate(spp_trend_adj = -mean_trend/0.75) %>% # puts in comparable units to status
left_join(status, by="rgn_id") %>%
dplyr::mutate(trend_score = spp_trend_adj/score * 5)
Check there are data for every region. Region 232 (Bosnia) does not have a value, which is not surprising because its coast is very small and results are erratic for this region. We estimate this using the mean of the two surrounding regions.
## rgn_id mean_trend spp_trend_adj
## Min. : 1.00 Min. :-0.0016678 Min. :-0.0036924
## 1st Qu.: 58.75 1st Qu.: 0.0003407 1st Qu.:-0.0020450
## Median :116.50 Median : 0.0009818 Median :-0.0013091
## Mean :117.64 Mean : 0.0008893 Mean :-0.0011858
## 3rd Qu.:176.25 3rd Qu.: 0.0015338 3rd Qu.:-0.0004542
## Max. :250.00 Max. : 0.0027693 Max. : 0.0022237
## NA's :1 NA's :1
## score trend_score
## Min. :0.6171 Min. :-0.026373
## 1st Qu.:0.7644 1st Qu.:-0.012764
## Median :0.8015 Median :-0.007905
## Mean :0.7981 Mean :-0.007624
## 3rd Qu.:0.8402 3rd Qu.:-0.002758
## Max. :0.8956 Max. : 0.012554
## NA's :1 NA's :1
## rgn_id mean_trend spp_trend_adj score trend_score
## 1 232 NA NA NA NA
## [1] 220 2
## rgn_id score
## Min. : 1.00 Min. :-0.026373
## 1st Qu.: 58.75 1st Qu.:-0.012748
## Median :116.50 Median :-0.007927
## Mean :117.64 Mean :-0.007647
## 3rd Qu.:176.25 3rd Qu.:-0.002763
## Max. :250.00 Max. : 0.012554
We use a very different approach for calculating trend this year. Previously we used proxy data rather than actual change in IUCN status over time. Our previous method overestimated the magnitude of the trend. It is not surprising there is poor correlation with trend estimates in previous years, but it is reassuring that the values mainly fall in the same quadrant.
trend <- read.csv(here("globalprep/spp/v2019/output/sp_trend_global.csv"))
old_spp <- read.csv(here("globalprep/spp/v2018/output/sp_trend_global.csv")) %>%
rename(old_score = score) %>%
left_join(trend)
plot(old_spp$old_score, old_spp$score)# xlim=c(-0.35, 0.05))
abline(h=0, col="red")
abline(v=0, col="red")
abline(0,1, col="red")
status <- read.csv(here("globalprep/spp/v2019/output/sp_status_global.csv")) %>%
mutate(gapfilled = ifelse(rgn_id == 232, 1, 0)) %>%
mutate(method = ifelse(rgn_id == 232, "mean of neighbors", NA)) %>%
dplyr::select(rgn_id, year, gapfilled, method)
write.csv(status, here("globalprep/spp/v2019/output/sp_status_global_gf.csv"), row.names=FALSE)
trend <- read.csv(here("globalprep/spp/v2019/output/sp_trend_global.csv")) %>%
mutate(gapfilled = ifelse(rgn_id == 232, 1, 0)) %>%
mutate(method = ifelse(rgn_id == 232, "mean of neighbors", NA)) %>%
dplyr::select(rgn_id, gapfilled, method)
write.csv(trend, here("globalprep/spp/v2019/output/sp_trend_global_gf.csv"), row.names=FALSE)
We use species condition data as a resilience measure as well. We also calculate species condition within 3nm of the shoreline because, for some goals, nearshore species condition is the relevant metric.
We reproject the data to have higher resolution in order to more easily extract the data at the 3nm scale.
We modify the method a bit from above due to size of the rasters.
#3nm raster file
rgns <- raster(file.path(dir_M, "git-annex/globalprep/spatial/v2018/rgns_3nm_offshore_mol.tif"))
plot(rgns)
# relevant species files
mean_risk_comp <- raster::raster(here("globalprep/spp/v2019/_output/mean_risk_raster_comp.tif"))
n_comp <- raster::raster(here("globalprep/spp/v2019/_output/n_spp_risk_raster_comp.tif"))
risk_x_n <- mean_risk_comp*n_comp
# project rasters to moll
# saved in Mazu:spp/v2019
projectRaster(risk_x_n, rgns, method="ngb", over=TRUE,
filename=file.path(dir_M, "git-annex/globalprep/spp/v2019/int/risk_x_n_comp_mol.tif"),
progress="text")
projectRaster(n_comp, rgns, method="ngb", over=TRUE,
filename=file.path(dir_M, "git-annex/globalprep/spp/v2019/int/n_comp_mol.tif"),
progress="text")
Extract species risk data that corresponds to 3nm regions.
#3nm raster file
rgns <- raster(file.path(dir_M, "git-annex/globalprep/spatial/v2018/rgns_3nm_offshore_mol.tif"))
plot(rgns)
risk_x_n_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2019/int/risk_x_n_comp_mol.tif"))
n_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2019/int/n_comp_mol.tif"))
risk_stack <- stack(risk_x_n_mol, n_mol)
risk_df <- raster::zonal(risk_stack, rgns, fun='sum')
rgn_3nm_risk <- risk_df %>%
data.frame() %>%
rowwise() %>%
dplyr::mutate(rgn_wt_risk = risk_x_n_comp_mol/n_comp_mol) %>%
dplyr::select(rgn_id = zone, rgn_wt_risk)
We use the trend data to estimate risk values for previous years (vs. using the same values for all assessment years). The change in species status across years is based on a linear model.
Trend is calculated using the same method as the risk calculation. For each cell, we multiply the average species trend by the number of species in order to weight each cell’s contribution by the number of species in the cell. We sum these weighted values for each OHI region and divide by the region’s total species count: sum(average species trend × number of species) / sum(number of species).
We reproject the data to have higher resolution in order to more easily extract the data at the 3nm scale.
trend_comp <- raster::raster(here("globalprep/spp/v2019/_output/trend_raster_comp.tif"))
n_trend_comp <- raster::raster(here("globalprep/spp/v2019/_output/n_trend_raster_comp.tif"))
trend_x_n <- trend_comp*n_trend_comp
# project rasters to moll
# saved in Mazu:spp/v2019
projectRaster(trend_x_n, rgns, method="ngb", over=TRUE,
filename=file.path(dir_M, "git-annex/globalprep/spp/v2019/int/trend_x_n_comp_mol.tif"),
progress="text")
projectRaster(n_trend_comp, rgns, method="ngb", over=TRUE,
filename=file.path(dir_M, "git-annex/globalprep/spp/v2019/int/n_trend_comp_mol.tif"),
progress="text")
Extract species risk data that corresponds to 3nm regions.
#3nm raster file
rgns_3nm <- raster(file.path(dir_M, "git-annex/globalprep/spatial/v2018/rgns_3nm_offshore_mol.tif"))
plot(rgns_3nm)
trend_3nm_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2019/int/trend_x_n_comp_mol.tif"))
n_trend_3nm_mol <- raster(file.path(dir_M, "git-annex/globalprep/spp/v2019/int/n_trend_comp_mol.tif"))
trend_stack <- stack(trend_3nm_mol, n_trend_3nm_mol)
trend_df <- raster::zonal(trend_stack, rgns_3nm, fun='sum')
rgn_3nm_trend <- trend_df %>%
data.frame() %>%
rowwise() %>%
dplyr::mutate(rgn_wt_trend = trend_x_n_comp_mol/n_trend_comp_mol) %>%
dplyr::select(rgn_id = zone, rgn_wt_trend)
We estimate previous risk for each region, using the trend data. We assume change in risk is linear.
assess_years <- 2012:2019
years <- expand.grid(rgn_id = unique(rgn_3nm_trend$rgn_id), year=assess_years)
# this is what the trend will be multiplied by to get a risk estimate for each year:
year_multiplier <- data.frame(year=assess_years, multiplier = rev(0:(length(assess_years)-1)))
rgn_risk_3nm <- rgn_3nm_risk %>%
left_join(rgn_3nm_trend, by = "rgn_id") %>%
left_join(years, by = "rgn_id") %>%
left_join(year_multiplier, by="year") %>%
rowwise() %>%
mutate(mean_risk_per_year = rgn_wt_risk - rgn_wt_trend*multiplier) %>%
dplyr::select(rgn_id, year, mean_risk = mean_risk_per_year)
We rescale the data so a risk factor of 0.75 is equal to zero.
Region 19 (Tuvalu), an island nation, does not have a value at the 3nm scale. We gapfill with the value from the entire EEZ.
## rgn_id year spp_status
## Min. : 1.00 Min. :2012 Min. :0.7223
## 1st Qu.: 58.75 1st Qu.:2014 1st Qu.:0.8174
## Median :116.50 Median :2016 Median :0.8671
## Mean :117.64 Mean :2016 Mean :0.8524
## 3rd Qu.:176.25 3rd Qu.:2017 3rd Qu.:0.8918
## Max. :250.00 Max. :2019 Max. :0.9217
## NA's :8
## rgn_id year spp_status
## 1 19 2012 NaN
## 2 19 2013 NaN
## 3 19 2014 NaN
## 4 19 2015 NaN
## 5 19 2016 NaN
## 6 19 2017 NaN
## 7 19 2018 NaN
## 8 19 2019 NaN
# get eez value:
eez_status <- read.csv(here("globalprep/spp/v2019/output/sp_status_global.csv")) %>%
filter(rgn_id == 19) %>%
rename(spp_status = score)
res_gf <- res_gf %>%
filter(!is.na(spp_status)) %>%
bind_rows(eez_status)
summary(res_gf)
## rgn_id year spp_status
## Min. : 1.00 Min. :2012 Min. :0.7223
## 1st Qu.: 58.75 1st Qu.:2014 1st Qu.:0.8171
## Median :116.50 Median :2016 Median :0.8665
## Mean :117.64 Mean :2016 Mean :0.8519
## 3rd Qu.:176.25 3rd Qu.:2017 3rd Qu.:0.8916
## Max. :250.00 Max. :2019 Max. :0.9217
## [1] 1760 3
## rgn_id year score
## Min. : 1.00 Min. :2012 Min. :0.7223
## 1st Qu.: 58.75 1st Qu.:2014 1st Qu.:0.8171
## Median :116.50 Median :2016 Median :0.8665
## Mean :117.64 Mean :2016 Mean :0.8519
## 3rd Qu.:176.25 3rd Qu.:2017 3rd Qu.:0.8916
## Max. :250.00 Max. :2019 Max. :0.9217
res <- read.csv(here("globalprep/spp/v2019/output/sp_status_3nm.csv")) %>%
mutate(gapfilled = ifelse(rgn_id == 19, 1, 0)) %>%
mutate(method = ifelse(rgn_id == 19, "eez scale data used", NA)) %>%
dplyr::select(rgn_id, year, gapfilled, method)
write.csv(res, here("globalprep/spp/v2019/output/sp_status_3nm_gf.csv"), row.names=FALSE)
Compared to the entire EEZ, most (but not all) coastal areas have lower species condition scores. The correlation with last year’s values is better than it was in the previous assessment.
# compare to eez values
eez_status <- read.csv(here("globalprep/spp/v2019/output/sp_status_global.csv")) %>%
filter(year == max(year)) %>%
dplyr::select(rgn_id, eez_score = score)
status <- read.csv(here("globalprep/spp/v2019/output/sp_status_3nm.csv")) %>%
dplyr::select(rgn_id, year, nm3_score = score) %>%
filter(year == max(year)) %>%
left_join(eez_status, by = "rgn_id")
plot(status$nm3_score, status$eez_score)
abline(0,1)
# compare to last year's values
status <- read.csv(here("globalprep/spp/v2019/output/sp_status_3nm.csv")) %>%
filter(year == max(year))
old_spp <- read.csv(here("globalprep/spp/v2018/output/sp_status_3nm.csv")) %>%
filter(year == max(year)) %>%
rename(old_score = score) %>%
mutate(year = 2019) %>%
left_join(status) %>%
rename(new_score = score)
plot(old_spp$old_score, old_spp$new_score)
abline(0,1, col="red")