ohi logo
OHI Science | Citation policy

Summary

The commercial fishing pressure layers are created from spatialized catch by gear data provided by Watson (2018), and net primary production data from the Vertically Generalized Production Model (VGPM) as described in Behrenfeld and Falkowski (1997).

This script prepares and formats the IMAS Global Fisheries Catch raw data into intermediate data by combining Industrial (CatchInd_XXXX_XXXX) and Non-Industrial (CatchNInd_XXXX_XXXX) rds files with IndexInd.csv, IndexNInd.csv, and Cells (a sheet within Codes.xlsx) as well as a single file with all years.

We create raster files of total annual global catch from the Watson data and correct for system productivity by dividing catch by NPP: quick review.

Updates from previous assessment

  • Previously, the raw catch data was all in a single file. This year, we have to combine across five different data tables: catch data (CatchInd_XXXX_XXXX), two master index files split by non industrial and industrial with country name, species, etc (IndexInd/IndexNInd.csv), geospatial information (sheet name Cells in Codes.xlsx), taxon information (sheet name Taxa in Codes.xlsx), and country information (sheet name Country in Codes.xlsx).

Data Source

Reference: Watson, R. A. and Tidd, A. 2018. Mapping nearly a century and a half of global marine fishing: 1869–2015. Marine Policy, 93, pp. 171-177. (Paper URL)

Downloaded: July 3, 2019 from IMAS portal - click on download tab, step 3

Description: Global fisheries landings data per cell separated by Industrial versus Non-Industrial catch, IUU, and discards.

Native data resolution:

Time range: 1950 - 2015

Format: CSV format

Additional Information: Metadata, Supplementary Material


Setup

New Global Fisheries Raw Data

Download the following from the data_download.Rmd script:

  • Industrial Catch (1950 - 2015) - reported, iuu, and discard catch data for each cell location and unique identifier
  • Master Index Files (IndexInd/IndexNInd.csv) - information associated with the unique identifiers in the Industrial Catch data
  • Spatial Cells Reference (Codes.xlsx - sheet name is Cell) - contains geospatial information associated wtih the Industrial Catch data
  • Gear Reference (Codes.xlsx - sheet name is Gear) - contains information regarding how different fishing gear is classified in the index datasets. This will probably not be used for the fishing pressure layers.
  • Taxa Reference (Codes.xlsx - sheet name is Taxa) - contains information regarding how different taxa is classified in the index datasets.
  • Country Reference (Codes.xlsx - sheet name is Country) - contains informations regarding how different countries are labeled in the index datasets.

Explore Data

Read in Index files, Spatial cells, Taxa reference, Country reference, and a single 5-year Catch dataset.

Here we will need to wrangle a combination of the index files, Gear reference, Taxa reference, and Country reference to have columns ID, Year, CountryName, TaxonName, CommonName, IndReported, IndIUU, IndDiscards, NIndReported, NIndIUU, and NIndDiscards.

Test to see whether the values for IndReported, IndIUU, and IndDiscards in the Master Index File are sum totals of Abalones in the Mexico in year 1953.

Let’s do another test. Test to see whether the values for NIndReported, NIndIUU, and NIndDiscards in the Master Index File are sum totals of Sailfin sandfish in the Japan in year 1953.

Methods

Wrangle

Tidy Fisheries Files: 1. Combine the Master Index and Spatial Cells with the CatchInd and CatchNInd files. 2. Save each year of data as a separate file into: “globalprep/prs_fish/v2019/int/annual_catch”

Function for Combining & Saving Files: Create function to read in each file name, combine with Index and Cells data frame, and save each year of catch data into a separate file in mazu.

Industrial Catch Data

Create list of industrial catch file names and apply the combine_fis function to save each individual 5-year interval catch data.

Non-industrial Catch Data

Create list of non-industrial catch file names and apply the combine_fis function to save each individual 5-year interval catch data. Then, create a single file that has all years of data.

Create Annual Catch Rasters

We want a landings raster and a discards raster for Industrial (commericial) and Non-Industrial (artisanal) data per year.

Note: Annual catch per cell contains values in units of kg per km2. Since values in Reported, Discards, Landings, and IUU are in tonnes, they must be first converted to kg then divided by the Area column.

  1. Setup
  1. First get the template raster with a resolution of 0.5 degree cells, from the cells data.

The values of these cells are the Cell ID numbers.

  1. Create commercial and artisanal rasters
  • comm_landings_XXXX.tif
  • comm_discards_XXXX.tif
  • artisanal_landings_XXXX.tif
  • artisanal_discards_XXXX.tif

Catch data is in tonnes, so need to convert to kg then divide by OceanAreasqkm (km2) to get catch per area. Since Cell is the identifier for a specific 30-min spatial cell, we want to add up total landings and total discards grouped by Cell.

Note: Subset for data 2003 and onwards since NPP data starts at 2003. Since we will be taking 5-year averages, we will need years 1999-2003 to get the average value for year 2003.

## Specify years of data, file locations, raster output location
years = c(1999:2015)
annual_files <- list.files(paste0(rastFolder, "annual_catch"), full.names=TRUE)


## Specify a list of arguments - commercial or artisanal - for reading in data and saving raster file name
raw_suffx <- list(catchind = "CatchInd", catchnind = "CatchNInd")
rast_prefx <- list(comm = "comm", artisanal = "artisanal")

## Function for totalling landings/discards and saves raster files 
catch2raster <- function(raw_suffx,rast_prefx){
  
foreach(yr = years) %dopar% { 
  #yr = 2011
  
  ## find file path of the respective year of data
  yr <- as.character(yr)
  ## Select the respective year of industrial catch data
  dataname <- str_subset(annual_files, paste0(raw_suffx,"*.",yr))
  ## read in raw data
  raw_data <- readRDS(dataname)

  ## Total Landings per Cell
  landings <- raw_data %>%
    dplyr::mutate(Landings_CR = (Landings * 1000)/OceanAreasqkm) %>% # convert to catch rate (kg/km^2)
    dplyr::group_by(Cell) %>%
    dplyr::summarise(cell_catch = sum(Landings_CR, na.rm=TRUE)) %>% # usually no NAs, but just in case
    dplyr::ungroup() %>%
    data.frame()
 
  #length(unique(raw_data$Cell))
   
#test what is occurring with cells in v4.0
#landings_test <- landings %>%
#  dplyr::arrange(Cell) %>%
#  dplyr::mutate(lag = lag(Cell)) %>%
#  dplyr::mutate(difference = Cell - lag)

  ## Total Discards per Cell
  discards <- raw_data %>%
    dplyr::mutate(Discards_CR = (Discards * 1000)/OceanAreasqkm) %>% 
    dplyr::group_by(Cell) %>%
    dplyr::summarise(cell_catch = sum(Discards_CR)) %>% 
    dplyr::ungroup()
  
  ## Rasterize Catch: swapping template cell values with those in dataframe
 raster::subs(cells_raster, landings, by = "Cell", which = "cell_catch", subsWithNA=TRUE, filename = paste0(rastFolder, rast_prefx, '_landings/annual_catch_', yr ,'.tif'), overwrite=TRUE)

#rast_test <- raster::subs(cells_raster, landings, by = "Cell", which = "cell_catch", subsWithNA=TRUE)
#writeRaster(rast_test, filename = paste0(rastFolder, "artisanal", "_landings/annual_catch_", yr, ".tif"))
#plot(rast_test)

  raster::subs(cells_raster, discards, by = "Cell", which = "cell_catch", subsWithNA=TRUE, 
               filename = paste0(rastFolder, rast_prefx, '_discards/annual_catch_', yr ,'.tif'), overwrite=TRUE) 
  
  }
}

## Applies catch2raster function on commercial (industrial) then artisanal (non-industrial) files
create_rast <- map2(raw_suffx, rast_prefx, catch2raster)

Check Rasters against Watson Fig

Compare composite of newly created rasters with industrial landings figure created in Watson (2018). See Figure 2D, which maps industrial landings in tonnes between 2000 and 2015.

## Read in rasters for commercial landings between 2000 and 2015
commercial <- list.files(paste0(rastFolder,"comm_landings"),full.names = TRUE)
commercial <- commercial[!(str_detect(commercial, "mean_catch"))] # remove files with 'mean_catch' in file path
commercial <- commercial[!(str_detect(commercial, "corr"))] # remove files with 'corr' in file path


## Plot one year of data
catch <- raster(commercial[17])
plot(catch)
res(catch) # 0.5 degree cells


## Compare 2000-2015 industrial catch (log-transformed) to Watson figure
## create a raster stack from the input raster files (exclude 1999 which is commercial[1])
allRasters <- raster::stack(commercial[2:17])


## run the sum function on the raster stack - i.e. add (non-cumulatively) the rasters together
tot_Catch <- sum(allRasters)

## apply log to adjust for really large values
log_Catch <- raster::calc(tot_Catch, function(x){log(x+1)},
             filename = paste0(rastFolder,"datacheck/commLand_2000_2015_log.tif"), overwrite=TRUE)


## Plot & Save raster
png(filename="figs/log_IndCatch_2000_2015.png")
plot(log_Catch, col=cols, alpha=1)
dev.off()

## Check out some values or zoom in on an area
plot(log_Catch, col=cols, alpha=1)
click(log_Catch)
zoom(log_Catch)


## Read in rasters for artisanal landings between 2000 and 2015
artisanal <- list.files(paste0(rastFolder,"artisanal_landings"),full.names = TRUE)
artisanal <- artisanal[!(str_detect(artisanal, "mean_catch"))] # remove files with 'mean_catch' in file path
artisanal <- artisanal[!(str_detect(artisanal, "corr"))] # remove files with 'corr' in file path


## Plot one year of data
catch <- raster(artisanal[17])
plot(catch)
res(catch) # 0.5 degree cells


## Compare 2000-2015 industrial catch (log-transformed) to Watson figure
## create a raster stack from the input raster files (exclude 1999 which is commercial[1])
allRasters <- raster::stack(artisanal[2:17])


## run the sum function on the raster stack - i.e. add (non-cumulatively) the rasters together
tot_Catch <- sum(allRasters)

## apply log to adjust for really large values
log_Catch <- raster::calc(tot_Catch, function(x){log(x+1)},
             filename = paste0(rastFolder,"datacheck/artisanalLand_2000_2015_log.tif"), overwrite=TRUE)


## Plot & Save raster
png(filename="figs/log_NIndCatch_2000_2015.png")
plot(log_Catch, col=cols, alpha=1)
dev.off()

## Check out some values or zoom in on an area
plot(log_Catch, col=cols, alpha=1)
click(log_Catch)
zoom(log_Catch)

Standardize Fisheries Catch

Use net primary production (NPP) data to correct global fisheries catch for spatial differences in ecosystem impact

  • Prepare and gapfill NPP data in npp.Rmd; NOTE: We did not update the NPP data for OHI 2019 since the source data did not change.
  • Resample NPP and fish catch rasters to ocean raster
  • Correct fish catch with NPP (catch divided by NPP)

Read in gapfilled NPP rasters

Create function standardizing catch by NPP

  • read in a single year of catch data
  • read in the same year of npp data
  • transform npp raster to match the crs and res of the catch raster (coarser resolution will run faster)
  • correct fisheries catch with npp raster (the land values should turn back to NAs since they are NA in catch rasters)
  • save corrected fish catch rasters!!

Note: Avoid saving output rasters from projectRaster(), resample(), or other raster function into a variable (#45). Instead, specify a filepath in the argument filename if available, and call the raster back in using raster() to run the next raster function (e.g. overlay()). Sometimes over=TRUE is not enough to prevent the weird wrapping (see stackoverflow).

Create corrected catch rasters - apply catch_npp_fun function. Resolution and projection will be matched with NPP rasters (8 x 10 km resolution and Mollweide coordinate system).

Check corrected raster

Check if map plots outside of the global ellipse: - See example of issue in stackoverflow - If it is plotting weirdly, will eventually be fixed in the final output map, but may be confusing throughout the data processing steps

Calculate 5-Year Means

Calculate 5-year means for commercial and artisanal landings & discards. Set up foreach loop to start with the oldest year of data (2003) and end on the 5th most recent year of data (2011).

Check 5-year mean raster - make sure land values are NA not zero.

Reference Point

Find 99.99th quantile to use as reference point:

  • ref_clb is reference point for Commercial Low Bycatch
  • ref_chb is reference point for Commercial High Bycatch
  • ref_alb is reference point for Artisanal Low Bycatch
  • ref_ahb is reference point for Artisanal High Bycatch

Artisanal High Bycatch

Get artisanal discards values across all years 2003-2015.

Save your reference points to a data frame and save to file – may want to change pressures column to characters and ref_point column to numeric..

Note: all quantiles except for commercial high bycatch increased slightly from last year’s assessment (see v2018 prep).

Rescale (0-1) & Resample (1-km res)

Use reference points ref_clb on commercial landings data, ref_chb on commercial discards data, and ref_alb on artisanal landings data.

Rasters are currently in resolution 8.35 km x 10.3 km (same as NPP rasters) and in Mollweide coordinate system. Convert to ocean raster 1-km x 1-km resolution.

mean_CL <- list.files(paste0(rastFolder, 'comm_landings'), pattern = "mean_catch", full.names=TRUE); mean_CL <- mean_CL[!(str_detect(mean_CL, "1km"))]

mean_CD <- list.files(paste0(rastFolder, 'comm_discards'), pattern = "mean_catch", full.names=TRUE); mean_CD <- mean_CD[!(str_detect(mean_CD, "1km"))]

mean_AL <- list.files(paste0(rastFolder, 'artisanal_landings'), pattern = "mean_catch", full.names=TRUE); mean_AL <- mean_AL[!(str_detect(mean_AL, "1km"))]

mean_AD <- list.files(paste0(rastFolder, 'artisanal_discards'), pattern = "mean_catch", full.names=TRUE); mean_AD <- mean_AD[!(str_detect(mean_AD, "1km"))]


foreach (i = 2003:2011) %dopar%{ # i = 2011
  
  yrs <- c(i:(i+4))

  mean_CL[which(str_detect(mean_CL, pattern = paste0("_",i,"_")))] %>%
            raster() %>%
            calc(fun=function(x){ifelse(x > ref_clb, 1, x/ref_clb)}) %>%
            calc(fun=function(x){ifelse(x < 0, 0, x)}) %>%
            resample(ocean, method = 'ngb', filename = paste0(rastFolder, 'comm_landings/mean_catch_1km_', yrs[1], '_', yrs[5], '.tif'), overwrite = TRUE)


  mean_CD[which(str_detect(mean_CD, pattern = paste0("_",i,"_")))] %>%
            raster() %>%
            calc(fun=function(x){ifelse(x > ref_chb, 1, x/ref_chb)}) %>%
            calc(fun=function(x){ifelse(x < 0, 0, x)}) %>%
            resample(ocean, method = 'ngb', filename = paste0(rastFolder, 'comm_discards/mean_catch_1km_', yrs[1], '_', yrs[5], '.tif'), overwrite = TRUE)



  mean_AL[which(str_detect(mean_AL, pattern = paste0("_",i,"_")))] %>%
            raster() %>%
            calc(fun = function(x){ifelse(x > ref_alb, 1, x/ref_alb)}) %>%
            calc(fun = function(x){ifelse(x < 0, 0, x)}) %>%
            resample(ocean, method = 'ngb', filename = paste0(rastFolder, 'artisanal_landings/mean_catch_1km_', yrs[1], '_', yrs[5], '.tif'), overwrite = TRUE)

  mean_AD[which(str_detect(mean_AD, pattern = paste0("_",i,"_")))] %>%
            raster() %>%
            calc(fun = function(x){ifelse(x > ref_ahb, 1, x/ref_ahb)}) %>%
            calc(fun = function(x){ifelse(x < 0, 0, x)}) %>%
            resample(ocean, method = 'ngb', filename = paste0(rastFolder, 'artisanal_discards/mean_catch_1km_', yrs[1], '_', yrs[5], '.tif'), overwrite = TRUE)

}

Check out intermediate results

Mask with Ocean Raster

Note: Might be able to skip masking with ocean raster if there were no wrapping issues.

Check masked rasters!

Extract Data from Rasters

Summary: Combine all years of data for catch types low bycatch commercial, high bycatch commercial, and low bycatch artisanal and extract the average pressure data for each region. Do some tidying and save the three outputs.

Get OHI Raster/Zonal Data:

  • regions_eez_with_fao_ant.tif - This includes all the ocean regions (eez/fao/antarctica), but the raster cell values correspond to the rgn_ant_id in regions_2017_update. This file is most often used to extract pressure values for each region.
  • regionData.csv - has data for spatial id’s used in raster

Data check

Compare with previous year’s data

## Commercial Low Bycatch pressure 
new <- read.csv("output/comm_lb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

old_new <- read.csv("../v2018/output/comm_lb.csv") %>%
  filter(year == 2014) %>% 
  left_join(new, by = 'rgn_id')

plot(old_new$new_pressure_score, old_new$pressure_score)
abline(0, 1, col="red")

old_check1_comm_lb <- old %>%
  filter(is.na(pressure_score), new_pressure_score >=0 ) # 1 new regions w data
old_check2_comm_lb <- old %>%
  filter(pressure_score >= 0 , is.na(new_pressure_score)) # 0 regions now with NA that previously weren't

## Commercial High Bycatch pressure 
new <- read.csv("output/comm_hb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

old <- read.csv("../v2018/output/comm_hb.csv") %>%
  filter(year == 2014) %>% 
  left_join(new, by = 'rgn_id')

plot(old$new_pressure_score, old$pressure_score)
abline(0, 1, col="red")

old_check1_comm_hb <- old %>%
  filter(is.na(pressure_score), new_pressure_score >=0 ) # 1 new regions w data
old_check2_comm_hb <- old %>%
  filter(pressure_score >= 0 , is.na(new_pressure_score)) # 0 regions now with NA that previously weren't


## Artisanal Low Bycatch pressure 
new <- read.csv("output/art_lb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

old <- read.csv("../v2018/output/art_lb.csv") %>%
  filter(year == 2014) %>% 
  left_join(new, by = 'rgn_id')

plot(old$new_pressure_score, old$pressure_score)
abline(0, 1, col="red")

old_check1_art_lb <- old %>%
  filter(is.na(pressure_score), new_pressure_score >=0 ) # 10 new regions w data
old_check2_art_lb <- old %>%
  filter(pressure_score >= 0 , is.na(new_pressure_score)) # 2 regions now with NA that previously weren't

## Artisanal High Bycatch pressure was calculated using different data before
new <- read.csv("output/art_hb.csv") %>% 
  filter(year == 2014) %>% 
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

old <- read.csv("../v2018/output/art_hb.csv") %>% 
  dplyr::filter(year == 2014) %>% 
  left_join(new, by = 'rgn_id')

plot(old$new_pressure_score, old$pressure_score)
abline(0, 1, col="red")


old_check1_art_hb <- old %>%
  filter(is.na(pressure_score), new_pressure_score >=0 ) # 10 new regions w data
old_check2_art_hb <- old %>%
  filter(pressure_score >= 0 , is.na(new_pressure_score)) # 2 regions now with NA that previously weren't

old_check3_art_hb <- old %>%
  mutate(pressure_diff = pressure_score - new_pressure_score) # 35 regions with better scores, rest are worse. 

Check for variation in data years for 2019 AY

## Commercial Low Bycatch - 8/14/18 not much change from year to year within same assessment year
##                        - 8/14/19 not much change from year to year within same assessment year
new <- read.csv("output/comm_lb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

new2 <- read.csv("output/comm_lb.csv") %>%
    filter(year == 2013) %>%
  left_join(new, by = 'rgn_id')

plot(new$new_pressure_score, new2$pressure_score)
abline(0,1, col="red")


## Commercial High Bycatch - 8/14/18 not much variation from year to year within same assessment year
##                         - 8/14/19 not much variation from year to year within same assessment year
new <- read.csv("output/comm_hb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

new2 <- read.csv("output/comm_hb.csv") %>%
    filter(year == 2013) %>%
  left_join(new, by = 'rgn_id')

plot(new$new_pressure_score, new2$pressure_score)
abline(0,1, col="red")

## Artisanal Low Bycatch - 8/14/18 not much variation from year to year within same assessment year
##                       - 8/14/19 not much variation from year to year within same assessment year
new <- read.csv("output/art_lb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

new2 <- read.csv("output/art_lb.csv") %>%
    filter(year == 2013) %>%
  left_join(new, by = 'rgn_id')

plot(new$new_pressure_score, new2$pressure_score)
abline(0,1, col="red")

## Artisanal Low Bycatch - 8/14/18 not much variation from year to year within same assessment year
##                       - 8/14/19 not much variation from year to year within same assessment year
new <- read.csv("output/art_lb.csv") %>%
  filter(year == 2014) %>%
  dplyr::select(rgn_id, new_pressure_score = pressure_score) 

new2 <- read.csv("output/art_lb.csv") %>%
    filter(year == 2013) %>%
  left_join(new, by = 'rgn_id')

plot(new$new_pressure_score, new2$pressure_score)
abline(0,1, col="red")