ohi logo
OHI Science | Citation policy

REFERENCE RMD FILE

1 Summary: Iconic Species Subgoal (Sense of Place)

This script prepares scores (status and trend) for Iconic Species in each global coastal region. For each iconic marine species, the countries of occurrence are pulled from the IUCN API. Extinction risk categories for each species are pulled based on current and past assessments; by tracking the assessed extinction risk over time, we can understand the trends of extinction risk for iconic species directly rather than using the “population trend” method from prior OHI assessments.

The Iconic Species sub-goal model calculates a region’s status based upon an unweighted average of species health for all ‘iconic’ species found within each reporting region.

From Halpern et al (2012):

Iconic species are those that are relevant to local cultural identity through a species’ relationship to one or more of the following: 1) traditional activities such as fishing, hunting or commerce; 2) local ethnic or religious practices; 3) existence value; and 4) locally-recognized aesthetic value (e.g., touristic attractions/common subjects for art such as whales). Habitat-forming species are not included in this definition of iconic species, nor are species that are harvested solely for economic or utilitarian purposes (even though they may be iconic to a sector or individual). …

Ultimately, almost any species can be iconic to someone, and so the intent with this goal was to focus on those species widely seen as iconic within a country, and iconic from a cultural or existence value (rather than for a livelihoods or extractive reason). …

The reference point is to have the risk status of all assessed species as Least Concern (i.e., a goal score = 1.0)

The Status of this sub-goal (XICO) is then the % of iconic species in each threat category (as defined by the IUCN Red List), such that:

\[X_{ICO} = \frac{\displaystyle\sum_{category}S_{cat}*w_{cat}}{\displaystyle\sum_{category}S_{cat}}\]

where for each IUCN threat category:

ICO trend is calculated in a similar manner, but weightings are assigned according to IUCN population trend: ‘Decreasing’ = -0.5, ‘Stable’ = 0.0, ‘Increasing’ = +0.5.

2 Updates from previous assessment

Additional year of IUCN Red List data added. In 2019, we stopped using the regions list to merge IUCN and iconic species region IDs and switched to the name_2_rgn() funciton in ohicore. However, this function reports regions that don’t have a match in OHI region list, so we report certain reported regions at a higher spatial scale, based on the listed regions in the error message.

2.1 Future improvements?

  • Update list of iconic species… see issue #671 (not 2018 assessment, see previous assesment issues)

3 Data Sources

List of iconic species:

Species native country information:


4 Methods

4.1 Setup

Using the IUCN API, we accessed the full IUCN species list at http://apiv3.iucnredlist.org/api/v3/speciescount?token=. With some minor formatting, this list contains the following variables:

iucn_sid | kingdom | phylum | class | order | family | genus | sciname | population | category

4.4 Identify countries with extant ICO species populations

Filtering the complete IUCN species list to include only the identified Iconic Species, we then use the IUCN API to access the list of countries in which each species occurs, from http://apiv3.iucnredlist.org/api/v3/species/countries/id/?token=. The country list identifies whether the species’ presence in that country is “Extant”, “Extinct Post-1500”, or “Possibly Extinct”; the “Extinct Post-1500” presence will be used later to identify locally extinct populations.

For each of these species, use the IUCN API to gather a list of countries in which it is present.

Name to region function (in OHI core package) reports regions that don’t have a match in OHI region list. Here we report certain reported regions at a higher spatial scale, based on the listed regions in the error message.

# Report these regions at higher spatial resolution:
country_split_1 <- data.frame(country = "Bonaire, Sint Eustatius and Saba", region = c('Bonaire', 'Saba', 'Sint Eustatius'))

country_split_2 <- data.frame(country = "French Southern Territories", region = c('Glorioso Islands', 'Juan de Nova Island', 'Bassas da India', 'Ile Europa', 'Ile Tromelin', 'Crozet Islands', 'Amsterdam Island and Saint Paul Island', 'Kerguelen Islands'))

country_split_3 <- data.frame(country = "Saint Helena, Ascension and Tristan da Cunha", region = c('Ascension', 'Saint Helena', 'Tristan da Cunha'))

country_split_4 <- data.frame(country = "United States Minor Outlying Islands",region = c('Wake Island', 'Jarvis Island', 'Palmyra Atoll', 'Howland Island and Baker Island', 'Johnston Atoll'))

country_split <- rbind(country_split_1, country_split_2, country_split_3, country_split_4)
country_split_data <- country_split %>%
  left_join(ico_spp_countries) %>%
  select(-country) %>%
  rename(country = region)

# Join country split data with ico_spp_countries
ico_spp_countries <- rbind(ico_spp_countries, country_split_data)
ico_spp_countries <- ico_spp_countries %>% 
  mutate(country = ifelse(country == "Curaçao", "Curacao", country)) %>% 
  mutate(country = ifelse(country == "Côte d'Ivoire", "Ivory Coast", country)) %>% 
  mutate(country = ifelse(country == "Réunion", "Reunion", country)) %>% 
  mutate(country = ifelse(country == "Saint Martin (French part)", "Sint Maarten", country)) %>% 
  rename(iucn_sid = name) # changed this back to align with ico_list for next step

ico_spp_rgn_raw <- name_2_rgn(df_in = ico_spp_countries, 
                       fld_name='country',
                       flds_unique = 'iucn_sid')

# No match for Isle of Man (not reported in OHI), Palestine (disputed), Saint Barthélemy (not reported in OHI) or Macedonia (landlocked)

# Check error message to make sure the unmatched countries don't have an OHI region match, that there are no duplicates, and that the unmatched regions make sense (not disputed, landlocked, not reported). If there are duplicates or unmatched regions that are reported in OHI, need to go back to fix and rerun the code.


summary(ico_spp_rgn_raw) # Make sure there are no NAs!

write_csv(ico_spp_rgn_raw, file.path(dir_github, 'raw/ico_spp_rgn_raw.csv'))

Filter for globally iconic species

ico_spp_rgn_raw <- read_csv(file.path(dir_github, 'raw/ico_spp_rgn_raw.csv'))
ico_list <- read_csv(file.path(dir_github, 'int/ico_list_prepped.csv'))

ico_spp_rgn_prepped <- ico_spp_rgn_raw %>%
  left_join(ico_list, by = 'iucn_sid') 
# check dimensions here; expect it to expand because each species in the iconic global list will be linked to the country it's listed in in the IUCN data

## filter this for species who are global (so all instances are iconic) 
##  OR ico_rgn_id matches rgn_id (so locally iconic matches with location)
ico_spp_rgn_prepped <- ico_spp_rgn_prepped %>%
  filter(ico_gl == TRUE | ico_rgn_id == rgn_id) %>% 
  as.tibble()
# dimensions should shrink after running this line
# Adding as.tibble() simplified the structure, prevents ico_rgn_id from being coerced to logical structue


write_csv(ico_spp_rgn_prepped, here('globalprep/ico/v2019/int/ico_spp_rgn_prepped.csv'))
# This part of the code presented an issue where the ico_rgn_id column values being preserved in this csv, but these data were being lost where the csv is imported in the next code chunk. Tried converting the column to numeric before saving and updating the tidyverse, neither of which solved the problem.  . Read_csv is coercing ico_rgn_id column to logical instead of numeric/integer format, as it only looks at the first 1000 rows of data, all of which are NA for ico_rgn_id (where ico_gl = TRUE). Read.csv solves the problem, but the better solution was to hard code col_types into the read_csv function in the next code chunk.


4.5 Identify extinction risk from current and past assessments

We accessed the IUCN API to determine past IUCN assessments for each of the identified iconic species: http://apiv3.iucnredlist.org/api/v3/species/history/id/?token=.

Each assessment includes a year and an extinction risk, along with additional information on the assessment.

These raw assessments are cleaned up to standardize codes and categories, and category scores are assigned based on a scale from Least Concern = 1 to Extinct = 0, as per the Species OHI subgoal. Note that past assessments used different coding, especially prior to 1994; we reclassified older codes according to this chart:


\[\begin{eqnarray*} \textbf{New Category} & & \longleftarrow \textbf{Original Category/Description}\\ NT & & \longleftarrow \text{"Lower risk/Near threatened ($LR/NT$)"}\\ T & & \longleftarrow \text{"Threatened ($T$)" treat as "$EN$"}\\ VU & & \longleftarrow \text{"Vulnerable ($V$)"}\\ EN & & \longleftarrow \text{"Endangered ($E$)" }\\ LR/CD & & \longleftarrow \text{"Lower risk/Conservation dependent ($LR/CD$)" treat as between $VU$ and $NT$}\\ CR & & \longleftarrow \text{"Very rare and believed to be decreasing in numbers"}\\ T & & \longleftarrow \text{"Less rare but believed to be threatened-Requires watching"}\\ DD & & \longleftarrow \text{"Insufficiently known ($K$)"}\\ DD & & \longleftarrow \text{"Indeterminate ($I$)"}\\ DD & & \longleftarrow \text{"Status inadequately known-Survey required or data sought"}\\ NE & & \longleftarrow \text{"Not recognized ($NR$)"} \end{eqnarray*}\]


4.6 Clean up the time series

iucn_sid | year | code | category | sciname


Using tidyr::complete() and tidyr::fill(), we create a full time series for all species from the earliest assessment to the most recent year.

ico_assess <- read_csv(file.path(dir_github, 'int/ico_assess_clean.csv'))
ico_list <- read_csv(file.path(dir_github, 'int/ico_list_prepped.csv'))

# Fill in category score for missing years based on previous year's data:
ico_assess_full <- ico_assess %>%
  mutate(eval_yr = year) %>% 
  select(-sciname) %>%
  arrange(iucn_sid, year) %>%
  complete(year = full_seq(year, 1), nesting(iucn_sid)) %>%
  group_by(iucn_sid) %>%
  fill(cat, cat_txt, cat_score, eval_yr) %>% ## fills all the way to latest year
  ungroup()

# does it make sense that there are 4818 NAs here for cat_score and eval_yr? 

ico_spp_cat <- ico_list %>% 
  rename(cat_2016 = cat) %>%
  left_join(ico_assess_full, by = c('iucn_sid'))

## if no time series available, time series years will be NA.  Assign a list to those NAs, then unnest it to create observations for those years.
ico_spp_cat <- ico_spp_cat %>%
  mutate(year = ifelse(is.na(year), 
                       list(c(min(year, na.rm = TRUE):max(year, na.rm = TRUE))), 
                       year)) %>% 
  unnest(year)
# ico_rgn_id is classified as numeric 

## NAs will be filled backward in time by starting from the most recent non-NA.
## To do this, we'll swap any current-year NAs with the cat_score (meaning no
## time series fill), and fill upwards instead of downwards.
ico_spp_cat <- ico_spp_cat %>%
  left_join(pop_cat %>% 
              rename(cat_2016 = cat, cat_2016_score = cat_score), 
            by = 'cat_2016') %>%
  mutate(cat_score = ifelse(year == max(year, na.rm = TRUE) & is.na(cat), 
                            cat_2016_score, 
                            cat_score)) %>%
  arrange(iucn_sid, year) %>%
  group_by(iucn_sid) %>%
  fill(cat, cat_score, cat_txt, .direction = 'up') %>%
  ungroup() %>%
  distinct()
  
write_csv(ico_spp_cat, file.path(dir_github, 'int/ico_spp_cat.csv'))

4.8 Prep dataframe for toolbox; estimate status and trend

The toolbox wants rgn_id, species sciname, and extinction risk category for the basic calculations. Since some regions contain multiple subpops (or parent/subpop) we also include iucn_sid to differentiate. This information is included for each year, filtered back to the year 2000.

While the official calculations are performed in the toolbox, we perform the same basic calcs here to get a sense of the ICO status and trend ahead of time. Report and summarize estimate of regional iconic species status:

ico_spp_rgn_cat <- read_csv(file.path(dir_github, 'int/ico_spp_rgn_cat.csv'),
                            col_types = "dcdcdddccldc")


## Report out for toolbox format (rgn_id | sciname | category or popn_trend for each species within a region).
## Note: in toolbox, group_by(rgn_id, sciname) and then summarize(category = mean(category)) to
##   average any parent/subpop species listings before aggregating to overall average per region.

ico_status_raw <- ico_spp_rgn_cat %>%
  select(rgn_id, rgn_name, sciname, iucn_sid, cat, cat_score, year, eval_yr) %>%
  arrange(rgn_id, desc(year), sciname) %>%
  ungroup()

# Get a preview of status and trend:
ico_status_calc <- ico_status_raw %>%
  group_by(rgn_id, rgn_name, sciname, year) %>%
  filter(!is.na(cat_score)) %>% # remove any DDs (data deficient)
  summarize(cat_score = mean(cat_score)) %>% 
  group_by(rgn_id, rgn_name, year) %>%
  summarize(mean_cat = round(mean(cat_score), 5), 
            ico_status = (1 - mean_cat) * 100,
            n_spp = n()) %>%
  ungroup()

ico_trend <- data.frame()
for (i in 1993:max(ico_status_calc$year, na.rm = TRUE)) { # i <- 2013
  tmp_status <- ico_status_calc %>%
    filter(year <= i & year > (i - 10)) # trend based on 10-year average since assessments are sporadic
  tmp_trend <- tmp_status %>%
    group_by(rgn_id) %>%
    do(trend_lm = lm(ico_status ~ year, data = .)$coefficients[2]) %>%
    mutate(year = i,
           trend_lm  = as.numeric(trend_lm)/100, # status is 0 - 100; trend should be +1 to -1
           ico_trend = round(trend_lm * 5, 5)) %>% # trend prediction five years out
    ungroup()
  ico_trend <- ico_trend %>%
    bind_rows(tmp_trend)
}

ico_sum <- ico_status_raw %>%
  left_join(ico_status_calc, by = c('rgn_id', 'rgn_name', 'year')) %>%
  left_join(ico_trend, by = c('rgn_id', 'year'))

write_csv(ico_sum, file.path(dir_github, 'summary/ico_summary.csv'))
### Report out for finalized status and trend values per region

ico_status_raw1 <- ico_status_raw %>%
  dplyr::select(rgn_id, sciname, iucn_sid, year, eval_yr, category = cat)

# Create the files the toolbox will use:
write_csv(ico_status_raw1, file.path(dir_github, 'output/ico_spp_iucn_status.csv'))
write_csv(ico_status_calc, file.path(dir_github, 'output/ico_status_calc.csv'))
write_csv(ico_trend,       file.path(dir_github, 'output/ico_trend.csv'))

ico_status_raw1[duplicated(ico_status_raw1 ), ] 
### NOTE: if iucn_sid were removed, this would show duplicates due to subpops with same category.
table(ico_status_raw1$category)

4.8.1 Iconic Species full list (year == 2018)

4.10 Compare OHI 2019 vs OHI 2018 assessment results

Plot the estimated status and trends for 2016 from assessments v2017, v2018, and v2019.

ico_sum <- read_csv(file.path(dir_github, 'summary/ico_summary.csv'))

status_trend <- ico_sum %>% # status and trend in this year's assessment
  filter(year == max(year)-1) %>%  # max year of ico_sum is 2019; want to compare the most recent shared year (2018)
  select(rgn_id, ico_status, ico_trend, n_spp) %>% 
  distinct() # because removing iucn_sid, cat, cat_score, and species name, results in duplicates
  
ico_compare <- read_csv(here('globalprep/ico',previous_scen,'/summary/ico_summary.csv')) %>% 
  filter(year == max(year)) %>% # 2018 data is max of last year's data
  select(rgn_id, rgn_name, ico_status, ico_trend, n_spp) %>% 
  distinct() %>% 
  rename(previous_status = ico_status, previous_trend = ico_trend, previous_nspp = n_spp) %>% 
  left_join(status_trend, by = 'rgn_id') %>% 
  mutate(d_nspp = as.factor(n_spp - previous_nspp))
# one region has NAs 

compare_status_plot <- ggplot(ico_compare) + 
  geom_point(aes(previous_status, ico_status, color = d_nspp, label = rgn_id), alpha = 0.6) + 
    geom_abline(slope = 1, intercept = 0, color = 'red') +
  labs(x = sprintf('ICO status %s', previous_scen),
       y = sprintf('ICO status %s', scenario),
       title = 'ICO Status Comparison')
ggplotly(compare_status_plot)

trend_compare_plot <- ggplot(ico_compare) +
  geom_point(aes(previous_trend, ico_trend, color = d_nspp, label = rgn_id), alpha = .6) +
  geom_abline(slope = 1, intercept = 0, color = 'red') +
  labs(x = sprintf('ICO trend %s', previous_scen),
       y = sprintf('ICO trend %s', scenario),
       title = 'ICO Trend Comparison')
ggplotly(trend_compare_plot)

4.11 Data forensics - why are outliers occurring?

No need to run this section of the code; for data exploration purposes only.

Region ID Region Name Status Prev Status d_nspp Insights
3 Norfolk Island 72 80 3 - only kind of an outlier after all other outliers under the trendline were addressed; not sure this is still of concern
16 Australia 61.6 76.4 -22 Spp were being removed due to data not being preserved through import
52 Bahrain 57.5 46.7 2 Went from having 7 spp in 2018 to 9 in 2019 (added tiger shark (NT) and humpback whale (LC), other spp did not change status)
*59 Belgium 77.5 63.3 1 Similar situation to Netherlands; went from 4 spp in 2018 to 5 spp in 2019 by adding the humpback whale (LC)
73 Russia 54.4 73.8 -11 Spp were being removed due to data not being preserved through import
105 Bouvet Island (Norway) 40 70 1 Status jumped because this region had only one fin whale that was endangered, now it is vulnerable and they’ve added another species of least concern - source data issue!
175 Denmark 68.3 83.9 -12 Spp were being removed due to data not being preserved through import
176 Germany 62.5 83.3 -11 Spp were being removed due to data not being preserved through import
*177 Netherlands 78 67.5 1 In 2019, Netherlands went from 5 to 6 species by adding the humpback whale (status = least concern) and the fin whale went from being vulnerable to endangered = big jump in improvement in status?
192 Iraq 62 48 0 Went from 5 spp in 2018 to 6 in 2019. The fin whale (status=EN) was dropped from the list, tiger shark was added (NT) as well as another humpback whale subpopulation
213 Antarctica 70 80 -1 disregard; not included in OHI
215 Jordan 66.7 50 1 Went from having 2 spp in 2018 to 3 spp in 2019 (added humpback whale, other spp did not change status)
222 Sweden 60 82.4 -12 Spp were being removed due to data not being preserved through import