This analysis converts FAO mariculture data into data used to calculate the OHI global mariculture status score. This also calculates the genetic escapee from mariculture pressure data.
Reference:
http://www.fao.org/fishery/statistics/software/fishstatj/en#downlApp Release date: March 2019 FAO Global Aquaculture Production Quantity 1950_2016 FAO metadata found here
Downloaded: 5/17/2019
Description: Quantity (tonnes) of mariculture for each country, species, year.
Time range: 1950-2017
Reference:
Mariculture Sustainability Index (Trujillo 2008)
Trujillo, Pablo. 2008. “Using a Mariculture Sustainability Index to Rank Countries’ Performance.” In Fisheries Centre Research Reports, edited by Jackie Alder and Daniel Pauly. University of British Columbia, Vancouver, Canada: Fisheries Centre Research Reports. https://circle.ubc.ca/handle/2429/40933.
Original data table is located in the Trujillo 2008 paper saved in the OHI Zotero database. Report starts on page 28 of the link provided in the reference.
Description:
Original MSI rescaled from 1-10 to 0-1 for mariculture taxa groups.
## [1] "/home/sgclawson/github/ohiprep_v2019/globalprep/mar/v2019"
knitr::opts_chunk$set(eval=FALSE)
## load libraries, set directories
library(ohicore) #devtools::install_github('ohi-science/ohicore@dev')
library(dplyr)
library(stringr)
library(tidyr)
library(tidyverse)
## Load FAO-specific user-defined functions
source('mar_fxs.R') # functions specific to mariculture dealing with compound countries
source('../../../workflow/R/fao_fxn.R') # function for cleaning FAO files
source('../../../workflow/R/common.R') # directory locations
## This file makes it easier to process data for the OHI global assessment
## by creating the following objects:
##
## * dir_M = identifies correct file path to Mazu (internal server) based on your operating system
## * mollCRS = the crs code for the mollweide coordinate reference system we use in the global assessment
## * regions_shape() = function to load global shapefile for land/eez/high seas/antarctica regions
## * ohi_rasters() = function to load two rasters: global eez regions and ocean region
## * region_data() = function to load 2 dataframes describing global regions
## * rgn_syns() = function to load dataframe of region synonyms (used to convert country names to OHI regions)
## * low_pop() = function to load dataframe of regions with low and no human population
## * UNgeorgn = function to load dataframe of UN geopolitical designations used to gapfill missing data
Mariculture production in tonnes.
Filter freshwater mariculture, make long format, and clean FAO codes.
mar <- mar %>%
rename(country = `Country (Country)`,
FAO_name = `Species (ASFIS species)`,
fao = `Aquaculture area (FAO major fishing area)`,
environment = `Environment (Environment)`)
table(mar$environment)
## include only marine environments
mar <- mar %>%
filter(environment %in% c("Brackishwater", "Marine"))
## convert to long format and clean FAO codes:
## for some reason, I can't provide the data range in gather programatically!
mar <- mar %>%
select(-Unit)
mar <- mar %>%
gather(key="year", value="value", num_range("",1950:2017)) %>%
fao_clean_data()
Update species name in the raw/species_list.csv
file with names in the mar
dataset (exclude non-food species). I simplified the species_list. I cut the “species” name columns because it wasn’t clear what this was trying to accomplish and created potential error.
## Commented out 'read.csv' lines are different versions of the dataset, representing the three different seaweed exclusion methods (exclude some seaweed, all seaweed, all seaweed nei). Use at the end for data checking.The first one is the one which is saved to the "output" folder.
mar_sp <- read.csv('raw/species_list.csv', stringsAsFactors=FALSE) %>%
select(FAO_name, exclude, alias, Taxon_code)
#mar_sp <- read.csv('raw/species_list.csv', stringsAsFactors=FALSE) %>%
# select(FAO_name, exclude_no_seaweed, alias, Taxon_code)
#mar_sp <- read.csv('raw/species_list.csv', stringsAsFactors=FALSE) %>%
# select(FAO_name, exclude_no_nei, alias, Taxon_code)
new.spp <- setdiff(mar$FAO_name, mar_sp$FAO_name)
new.spp # check: if dim has 0 rows it means all match
## if there is a list of species, hand check species_list.csv to see whether to keep (exclude seaweeds and species harvested for ornamental/medicinal), check if synonyms match Trujillo names.
## This is outlined in issue #8 and issue #81 for reference. You need to add the species output from this to species_list.csv and fill out the corresponding information. I.e. add FAO_name, exlude (whether to include in assessment or not. If they are only harvested for medicinal purposes, do not include), alias, and Taxon_code to species_list.csv.
## REMOVE SPECIES not relevant to mariculuture goal (i.e., non-food species)
mar <- mar %>% left_join(mar_sp, by="FAO_name")
## Filters out species that should be excluded from the MAR sub-goal, depending on which of the 3 scenarios was read in. Searches for column name "exclude", "exclude_no_seaweed", or "exclude_no_nei"
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
mar <- mar %>% filter(exclude==0)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
mar <- mar %>% filter(exclude_no_seaweed==0)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
mar <- mar %>% filter(exclude_no_nei==0)
}
## change names using species alias or FAO species name (global changes)
mar$species <- ifelse(!is.na(mar$alias), mar$alias, mar$FAO_name)
## sum production values for each group to account for duplicate rows after name change (remove NA values)
mar <- mar %>%
filter(!is.na(value)) %>%
group_by(country, fao, environment, species, year, Taxon_code) %>%
summarize(value = sum(value)) %>%
ungroup()
## eliminate country-species data with zero production throughout the time-series (1950-recent)
mar <- mar %>%
group_by(country, species) %>%
mutate(total_value = sum(value)) %>%
filter(total_value > 0) %>%
select(-total_value) %>%
ungroup()
# Divide mariculture from countries that we report as separate regions (assume equal production in all regions)
# Netherlands Antilles: Conch restoration among Aruba, Bonaire, Curacao
# Channel Islands: Jersey and Guernsey
# Bonaire/S.Eustatius/Saba
# Yugoslavia SFR: no longer a country after 1992
mar <- mar %>%
mutate(country = ifelse(country=="R\xe9union", "Reunion", country)) %>% # this one is hard to get right
mar_split() # function in mar_fxs.R
mar_rgn <- name_2_rgn(df_in = mar,
fld_name='country',
flds_unique=c('species', 'fao', 'environment', 'Taxon_code', 'year'))
## sum values of regions with multiple subregions
mar_rgn <- mar_rgn %>%
group_by(fao, environment, species, year, Taxon_code, rgn_id) %>%
summarize(value = sum(value)) %>%
ungroup()
Take a look at the tidied data for a single year and region
For some regions, a specific species can be altered so that it matches more general Trujillo sustainability data. In this case, I don’t want the name changes to be global because some regions may have more specific species data.
(Will explore this in the future, but will not implement this year).
# ## based on looking at the list, make a few name changes to match the regions Trujillo data
# # Chile name modification
# mar_rgn$species[mar_rgn$rgn_id==224 & mar_rgn$species == "Red abalone"] <- "Abalones nei"
# mar_rgn$species[mar_rgn$rgn_id==224 & mar_rgn$species == "Japanese abalone"] <- "Abalones nei"
#
# # China
# mar_rgn$species[mar_rgn$rgn_id==209 & mar_rgn$species == "Areolate grouper"] <- "Groupers nei"
# mar_rgn$species[mar_rgn$rgn_id==209 & mar_rgn$species == "Greasy grouper"] <- "Groupers nei"
# mar_rgn$species[mar_rgn$rgn_id==209 & mar_rgn$species == "Hong Kong grouper"] <- "Groupers nei"
# mar_rgn$species[mar_rgn$rgn_id==209 & mar_rgn$species == "Orange-spotted grouper"] <- "Groupers nei"
#
# # Honduras
# mar_rgn$species[mar_rgn$rgn_id==133 & mar_rgn$species == "Whiteleg shrimp"] <- "Penaeus shrimps nei"
#
# # Italy
# mar_rgn$species[mar_rgn$rgn_id==84 & mar_rgn$species == "Pacific cupped oyster"] <- "Cupped oysters nei"
#
# # New Zealand
# mar_rgn$species[mar_rgn$rgn_id==162 & mar_rgn$species == "Rainbow abalone"] <- "Abalones nei"
#
# # Pakistan
# mar_rgn$species[mar_rgn$rgn_id==53 & mar_rgn$species == "Penaeus shrimps nei"] <- "Marine crustaceans nei"
#
# # Philippines
# mar_rgn$species[mar_rgn$rgn_id==15 & mar_rgn$species == "Whiteleg shrimp"] <- "Penaeus shrimps nei"
#
# # Portugal
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Golden carpet shell"] <- "Marine molluscs nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Peppery furrow"] <- "Marine molluscs nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Solen razor clams nei"] <- "Razor clams nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Atlantic bluefin tuna"] <- "Marine fishes nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Meagre"] <- "Marine fishes nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Seabasses nei"] <- "Marine fishes nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "Soles nei"] <- "Marine fishes nei"
# mar_rgn$species[mar_rgn$rgn_id==183 & mar_rgn$species == "White seabream"] <- "Marine fishes nei"
#
# # Spain
# mar_rgn$species[mar_rgn$rgn_id==182 & mar_rgn$species == "Atlantic bluefin tuna"] <- "Tuna-like fishes nei"
#
# # Turkey
# mar_rgn$species[mar_rgn$rgn_id==76 & mar_rgn$species == "Atlantic bluefin tuna"] <- "Tuna-like fishes nei"
# mar_rgn$species[mar_rgn$rgn_id==76 & mar_rgn$species == "European seabass"] <- "Seabasses nei"
#
# ### sum values of regions with multiple subregions
# mar_rgn <- mar_rgn %>%
# group_by(fao, environment, species, year, Taxon_code, rgn_id) %>%
# summarize(value = sum(value)) %>%
# ungroup()
For example: Production of blue shrimp in Maine starts in 1983 – don’t include years before that.
Checked to make sure that there weren’t instances in which it made more sense to carry the previous year’s data forward as a method of gapilling. This didn’t seem to be the case.
## spread mar_rgn to create a value for every year-species-region (if missing it will be given a NA)
mar_rgn_spread <- spread(mar_rgn, year, value)
dim(mar_rgn_spread)
## turn data frame back into long format
mar_rgn_gf <- gather(mar_rgn_spread, "year", "value", num_range("",1950:2017)) %>%
arrange(rgn_id, species, year, Taxon_code, fao, environment)
## NA values are converted to zero.
mar_rgn_gf <- mar_rgn_gf %>%
mutate(year = as.numeric(as.character(year))) %>%
mutate(value_w_0 = ifelse(is.na(value), 0, value)) %>%
group_by(fao, environment, species, Taxon_code, rgn_id) %>%
mutate(cum_value = cumsum(value_w_0)) %>%
ungroup() %>%
filter(cum_value > 0) %>% # eliminates years before mariculture began
mutate(gap_0_fill = ifelse(is.na(value), "NA_to_zero", "0")) %>% # record gapfill
mutate(value = ifelse(is.na(value), 0, value)) %>% # finally, convert all NAs in original column to 0
select(-cum_value, -value_w_0)
See how may NA values were converted to 0
table(mar_rgn_gf$gap_0_fill)
## 3790 of these out of 27810+3790 cases had NA values converted to 0 - 2018 assessment
## 3763 of these out of 28694+4291 cases had NA values converted to 0 - 2019 assessment
Remove species-region-environment time series with less than four years of mariculture production > 0 tonnes (assume these are not established mariculture programs).
mar_rgn_gf = mar_rgn_gf %>%
group_by(rgn_id, species, fao, environment) %>%
mutate (not_0 = length(value[value>0])) %>% # length of vector of years greater than 0
filter (not_0>3) %>% # filter for groups that have at least four years of mariculture production
ungroup() %>%
select(rgn_id, species, fao, environment, year, value, Taxon_code, gap_0_fill)
Add a unique identifier per cultivated stock that describes each species, fao region, and environment grouping.
Used to estimate total mariculture yield per country.
Saves the appropriate Mariculture-FP file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above.
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(maric, 'output/MAR_FP_data.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(maric, 'test/MAR_FP_data_no_seaweed.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(maric, 'test/MAR_FP_data_no_nei.csv', row.names=FALSE)
}
These data describe the sustainability and genetic escapes for country/species combinations (and, in a couple cases, environment and fao region combinations). In cases where these data were not available for a specific county/species, we averaged the data across taxonomic groups to gapfill the missing data.
## these need to be re-added (get cut when associated with region ids)
sus_no_rgn <- filter(sus, is.na(country))
sus_rgn <- name_2_rgn(df_in = sus,
fld_name='country',
flds_unique=c('species_fao', 'fao', 'environment', 'species_Truj'))
sus_rgn <- bind_rows(sus_rgn, sus_no_rgn) %>%
unique()
Check that the non-matches between Trujillo sustainability FAO spp (sus_rgn) and the FAO mariculture species in the wrangled FAO Aquaculture Production data table (maric) are not due to spelling errors or slightly different names. We want to include as many species that have sustainability scores as possible
## Make sure same species are spelled the same in the two data tables (e.g. check that there are no extra spaces)
sort(setdiff(sus_rgn$species_fao, maric$species)) # species that are no longer have mariculture industry or are not included due to being freshwater or non-food
sort(setdiff(maric$species, sus_rgn$species_fao)) # FAO species with no Trujillo data - there will probably be a long list
## Hand check each of the species output here. I mainly check to make sure there aren't obvious cases when the species is the same, but the names are presented slightly differently in the lists. For example, some species might have an extra space in the name somewhere. If any of these obvious differences exist, we need to go in and change the names to match.
## Its not too surprising (unfortunately) that the FAO list has far more species than the sustainability list. But, I like to go over them to make sure that we are getting sustainability scores for as many species as we can.
Append sustainability score to the FAO mariculture data.
The following joins the sustainability scores to regions/species that have Trujillo data.
table(sus_rgn$match_type)
## join taxa specific to country/species/environment
c_sp_env = sus_rgn %>%
filter(match_type == 'c_sp_env') %>%
select(rgn_id, species=species_fao, environment, Sust_c_sp_env = Maric_sustainability)
mar_sus <- maric %>%
left_join(c_sp_env, by= c("species", "environment", "rgn_id"))
## join taxa specific to country/species/fao region
c_sp_fao = sus_rgn %>%
filter(match_type == 'c_sp_fao') %>%
select(rgn_id, species=species_fao, fao, Sust_c_sp_fao = Maric_sustainability)
mar_sus <- mar_sus %>%
left_join(c_sp_fao, by= c("species", "fao", "rgn_id"))
Take a look at the data thus far
head(data.frame(filter(mar_sus, rgn_id==218 & species == "Atlantic salmon")))
head(data.frame(filter(mar_sus, !is.na(Sust_c_sp_fao))))
## join taxa specific to country/species
c_sp = sus_rgn %>%
filter(match_type == 'c_sp') %>%
select(rgn_id, species=species_fao, Sust_c_sp = Maric_sustainability)
mar_sus <- mar_sus %>%
left_join(c_sp, by= c("species", "rgn_id"))
Now look at a summary after appending all the Trujillo data
Merge the three Trujillo type categories into a single sustainability score column in the following order:
For example, if Sust_c_sp_env is missing, use Sust_c_sp_fao and so on.
mar_sus = mar_sus %>%
mutate(Sust = ifelse(!is.na(Sust_c_sp_env), Sust_c_sp_env, Sust_c_sp_fao)) %>%
mutate(Sust = ifelse(is.na(Sust), Sust_c_sp, Sust)) %>%
select(-Sust_c_sp_env, -Sust_c_sp_fao, -Sust_c_sp)
This joins the sustainability data that is gapfilled either at the species level (average of specific species/genera across regions) or at a higher course taxonomic levels and documents which data are gapfilled and how.
## Select observations gapfilled at the species/genera level:
gf_sp_sus <- sus_rgn %>%
filter(gapfill != "actuals" & match_type == "species") %>%
select(species = species_fao, gapfill, Sust_gf_sp = Maric_sustainability)
## check that there are no duplicated species_fao
gf_sp_sus[duplicated(gf_sp_sus$species), ]
## Match gapfilling values by species
mar_sus_gf = mar_sus %>%
left_join(gf_sp_sus, by = 'species')
## Select observations gapfilled at the coarse taxon level:
gf_taxon_sus <- sus_rgn %>%
filter(gapfill != "actuals" & match_type == "taxon") %>%
select(Taxon_code=taxon, Sust_gf_taxon = Maric_sustainability)
## Match gapfilling values by species
mar_sus_gf = mar_sus_gf %>%
left_join(gf_taxon_sus, by = c('Taxon_code'))
Take a look at the wrangled data
Obtain a sustainability score for each record, and a book-keeping column of whether it’s actual or gap-filled
For missing sustainability scores:
mar_sus_final = mar_sus_gf %>%
mutate(gapfill = ifelse(!is.na(Sust), "none", gapfill)) %>%
mutate(Sust = ifelse(is.na(Sust), Sust_gf_sp, Sust)) %>% # gapfill with species level
mutate(gapfill = ifelse(is.na(Sust), "taxon_average", gapfill)) %>% # add in taxon gapfill record
mutate(Sust = ifelse(is.na(Sust), Sust_gf_taxon, Sust)) %>% # gapfill with taxon level
mutate(taxa_code = paste(species, species_code, sep="_")) %>%
select(rgn_id, species, species_code, taxa_code, taxa_group=Taxon_code, year, gapfill_sus = gapfill, gapfill_fao = gap_0_fill, tonnes=value, Sust)
## save mariculture harvest data
mar_harvest_tonnes = mar_sus_final %>%
select(rgn_id, taxa_code, taxa_group, year, tonnes)
anyDuplicated(mar_harvest_tonnes) # check for duplications
## Saves the appropriate Mariculture Harvest Tonnes file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above.
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(mar_harvest_tonnes, 'output/mar_harvest_tonnes.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(mar_harvest_tonnes, 'test/mar_harvest_tonnes_no_seaweed.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(mar_harvest_tonnes, 'test/mar_harvest_tonnes_no_nei.csv', row.names=F)
}
## save gapfill data for mariculture harvest
mar_harvest_tonnes_gf = mar_sus_final %>%
select(rgn_id, taxa_code, taxa_group, year, tonnes=gapfill_fao)
## Saves the appropriate Mariculture Harvest Gapfill file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(mar_harvest_tonnes_gf, 'output/mar_harvest_tonnes_gf.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(mar_harvest_tonnes_gf, 'test/mar_harvest_tonnes_gf_no_seaweed.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(mar_harvest_tonnes_gf, 'test/mar_harvest_tonnes_gf_no_nei.csv', row.names=F)
}
## save sustainability scores data for 2012
mar_sustainability_score = mar_sus_final %>%
mutate(year = 2012) %>% # Only 2012 sustainability scores exist (Trujillo)
select(rgn_id, year, taxa_code, sust_coeff = Sust) %>%
unique()
anyDuplicated(mar_sustainability_score)
## Saves the appropriate Sustainability Score file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(mar_sustainability_score, 'output/mar_sustainability.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(mar_sustainability_score, 'test/mar_sustainability_no_seaweed.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(mar_sustainability_score, 'test/mar_sustainability_no_nei.csv', row.names=F)
}
## save gapfill data for sustainability scores
mar_sustainability_score_gf = mar_sus_final %>%
select(rgn_id, taxa_code, sust_coeff = gapfill_sus) %>%
unique()
## Saves the appropriate Sustainability Score Gapfill file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(mar_sustainability_score_gf, 'output/mar_sustainability_gf.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(mar_sustainability_score_gf, 'test/mar_sustainability_gf_no_seaweed.csv', row.names=F)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(mar_sustainability_score_gf, 'test/mar_sustainability_gf_no_nei.csv', row.names=F)
}
Combine genetic escapes data to mariculture data.
## can eliminate the environment category because these have the same scores
esc = sus_rgn %>%
filter(!is.na(Genetic.escapees)) %>%
mutate(match_type = ifelse(match_type == "c_sp_env", "c_sp", match_type)) %>%
group_by(rgn_id, species=species_fao, fao, match_type, taxon, gapfill) %>%
summarize(Genetic.escapees = mean(Genetic.escapees)) %>%
ungroup()
## join taxa specific to country/species/fao
c_sp_fao = esc %>%
filter(match_type == 'c_sp_fao') %>%
select(rgn_id, species, fao, Esc_c_sp_fao = Genetic.escapees)
mar_esc <- maric %>%
left_join(c_sp_fao, by= c("species", "fao", "rgn_id"))
head(data.frame(filter(mar_esc, !is.na(Esc_c_sp_fao))))
## join taxa specific to country/species
c_sp = esc %>%
filter(match_type == 'c_sp') %>%
select(rgn_id, species, Esc_c_sp = Genetic.escapees) # fao is blank
mar_esc <- mar_esc %>%
left_join(c_sp, by= c("species", "rgn_id"))
Look at a summary of the wrangled data table
Merge the different match types (esc_c_sp_fao, esc_c_sp) into a single sustainability score
Join the sustainability data that is gapfilled either at the species level (average of specific species/genera across regions) or at a higher course taxonomic levels and documents which data are gapfilled and how.
## Select observations gapfilled at the species/genera level:
gf_species_esc <- esc %>%
filter(gapfill != "actuals" & match_type == "species") %>%
select(species, gapfill, Esc_gf_sp = Genetic.escapees)
## check that there are no duplicated species_fao
gf_species_esc[duplicated(gf_species_esc$species), ]
## Match gapfilling values by species
mar_esc_gf = mar_esc %>%
left_join(gf_species_esc, by = 'species')
## Select observations gapfilled at the coarse taxon level:
gf_taxon_sus <- esc %>%
filter(gapfill != "actuals" & match_type == "taxon") %>%
select(Taxon_code=taxon, Esc_gf_taxon = Genetic.escapees)
## Match gapfilling values by species
mar_esc_gf = mar_esc_gf %>%
left_join(gf_taxon_sus, by = c('Taxon_code'))
Take a look at wrangled data
Obtain a sustainability score for each record, and a book-keeping column of whether it’s actual or gap-filled.
For gapfill column
Esc_gf_sp
(gapfill for species-level already recorded)Esc_gf_sp
value, then record “taxon_average” as the gapfill methodEsc_gf_taxon
valuetonnes_esc = mar_esc_gf %>%
mutate(gapfill = ifelse(!is.na(Escapees), "none", gapfill)) %>%
mutate(Escapees = ifelse(is.na(Escapees), Esc_gf_sp, Escapees)) %>%
mutate(gapfill = ifelse(is.na(Escapees), "taxon_average", gapfill)) %>%
mutate(Escapees = ifelse(is.na(Escapees), Esc_gf_taxon, Escapees)) %>%
select(rgn_id, species, species_code, year, gapfill_escapees = gapfill, tonnes=value, Escapees)
**Take a look at the summary
Final formatting of the escapee data. This is used as a pressure layer.
For each region/year: take a weighted average of the genetic escape probability for each taxa based on tonnes mariculture
genEscapes <- tonnes_esc %>%
group_by(rgn_id, year) %>%
summarize(genEscapes = weighted.mean(Escapees, tonnes, na.rm=TRUE))
Obtain gapfill information from genEscapes
Gapfill values in output file are the proportion of data that is gapfilled.
# obtain corresponding gapfilling information for each region (average of gapfilled data, weighted by tonnes of mariculture).
genEscapes_gf <- tonnes_esc %>%
mutate(gapfill = ifelse(gapfill_escapees=="none", 0, 1)) %>%
group_by(rgn_id, year) %>%
summarize(genEscapes = weighted.mean(gapfill, tonnes, na.rm=TRUE)) %>%
ungroup() %>%
filter(year==2017) %>%
select(rgn_id, pressures.score=genEscapes) %>%
mutate(pressures.score=ifelse(pressures.score=="NaN", NA, pressures.score)) %>%
data.frame()
## Saves the appropriate Genetic Escapes Gapfill file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(genEscapes_gf, 'output/GenEsc_gf.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(genEscapes_gf, 'test/GenEsc_gf_no_seaweed.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(genEscapes_gf, 'test/GenEsc_gf_no_nei.csv', row.names=FALSE)
}
Obtain escapee data layers from genEscapes
## Saves the appropriate Genetic Escapes data file depending on whether we are excluding species based on best judgement (exclude), all seaweeds (exclude_no_seaweed), or just seaweeds nei (exclude_no_nei). See method changes above
if(sum(str_detect(names(mar_sp), "exclude$"))==1) {
write.csv(data, 'output/GenEsc.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_seaweed"))==1) {
write.csv(data, 'test/GenEsc_no_seaweed.csv', row.names=FALSE)
} else if (sum(str_detect(names(mar_sp), "exclude_no_nei"))==1) {
write.csv(data, 'test/GenEsc_no_nei.csv', row.names=FALSE)
}
Comparing this year’s data to previous year’s data. Expect small variation from year to year. Plot to view differences.
## Compare genetic escapes pressure scores for Russia; saw large changes between 2018 and 2017 assessment years. No big changes between 2019 and 2018 assessment years.
old <- read.csv("../v2018/output/GenEsc.csv") %>%
filter(rgn_id == 73) %>%
select(rgn_id, year, prs_score_old=pressure_score)
new <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(rgn_id == 73) %>%
select(rgn_id, year, pressure_score)
compare <- full_join(old, new, by=c("rgn_id","year")); View(compare)
plot(compare$prs_score_old, compare$pressure_score);abline(0,1,col="red")
## Compare yield data for Russia
mar_old <- read.csv("../v2018/output/MAR_FP_data.csv") %>%
filter(rgn_id == 73, year == 2016) %>%
select(rgn_id, species, fao, old_value = value)
mar_new <- read.csv("output/MAR_FP_data.csv") %>%
filter(rgn_id == 73, year == 2017) %>%
select(rgn_id, species, fao, value)
yield <- mar_old %>%
full_join(mar_new, by = c("rgn_id","species","fao")); View(yield)
## Compare genetic escapes pressure scores for Vietnam; no change in pressures
old <- read.csv("../v2018/output/GenEsc.csv") %>%
filter(rgn_id == 207) %>%
select(rgn_id, year, prs_score_old=pressure_score)
new <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(rgn_id == 207) %>%
select(rgn_id, year, pressure_score)
compare <- full_join(old, new, by=c("rgn_id","year")); View(compare)
plot(compare$prs_score_old, compare$pressure_score);abline(0,1,col="red")
## Compare yield data for Vietnam
mar_old <- read.csv("../v2018/output/MAR_FP_data.csv") %>%
filter(rgn_id == 207, year == 2016) %>%
select(rgn_id, species, fao, environment, old_value = value)
mar_new <- read.csv("output/MAR_FP_data.csv") %>%
filter(rgn_id == 207, year == 2017) %>%
select(rgn_id, species, fao, environment, value)
yield <- mar_old %>%
full_join(mar_new, by = c("rgn_id","species","fao", "environment")); View(yield)
## Compare genetic escapes pressure scores for Iceland; saw large changes between 2018 and 2017 assessment years, we do not see a big difference between 2019 and 2018 assessment years.
old <- read.csv("../v2018/output/GenEsc.csv") %>%
filter(rgn_id == 143) %>%
select(rgn_id, year, prs_score_old=pressure_score)
new <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(rgn_id == 143) %>%
select(rgn_id, year, pressure_score)
compare <- full_join(old, new, by=c("rgn_id","year")); View(compare)
plot(compare$prs_score_old, compare$pressure_score);abline(0,1,col="red")
## Compare yield data for Iceland. Atlantic Salmon increased by a lot between 2018 and 2019 assessment years.
mar_old <- read.csv("../v2018/output/MAR_FP_data.csv") %>%
filter(rgn_id == 143, year == 2016) %>%
select(rgn_id, species, fao, environment, old_value = value)
mar_new <- read.csv("output/MAR_FP_data.csv") %>%
filter(rgn_id == 143, year == 2017) %>%
select(rgn_id, species, fao, environment, value)
yield <- mar_old %>%
full_join(mar_new, by = c("rgn_id","species","fao", "environment")); View(yield)
## Compare genetic escapes pressure scores for Belize; saw large changes between 2018 and 2017 assessment years, we do not see big differences between 2019 and 2018 assessment years.
old <- read.csv("../v2018/output/GenEsc.csv") %>%
filter(rgn_id == 164) %>%
select(rgn_id, year, prs_score_old=pressure_score)
new <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(rgn_id == 164) %>%
select(rgn_id, year, pressure_score)
compare <- full_join(old, new, by=c("rgn_id","year")); View(compare)
plot(compare$prs_score_old, compare$pressure_score);abline(0,1,col="red")
## Compare yield data for Belize
mar_old <- read.csv("../v2018/output/MAR_FP_data.csv") %>%
filter(rgn_id == 164, year == 2016) %>%
select(rgn_id, species, fao, old_value = value)
mar_new <- read.csv("output/MAR_FP_data.csv") %>%
filter(rgn_id == 164, year == 2017) %>%
select(rgn_id, species, fao, value)
yield <- mar_old %>%
full_join(mar_new, by = c("rgn_id","species","fao")); View(yield)
## Compare genetic escapes pressure scores for France; saw large changes between 2018 and 2017 assessment years, we do not see big difference between 2019 and 2018 assessment years.
old <- read.csv("../v2018/output/GenEsc.csv") %>%
filter(rgn_id == 179) %>%
select(rgn_id, year, prs_score_old=pressure_score)
new <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(rgn_id == 179) %>%
select(rgn_id, year, pressure_score)
compare <- full_join(old, new, by=c("rgn_id","year")); View(compare)
plot(compare$prs_score_old, compare$pressure_score);abline(0,1,col="red")
## Compare yield data for France
mar_old <- read.csv("../v2018/output/MAR_FP_data.csv") %>%
filter(rgn_id == 179, year == 2016) %>%
select(rgn_id, species, fao, old_value = value)
mar_new <- read.csv("output/MAR_FP_data.csv") %>%
filter(rgn_id == 179, year == 2017) %>%
select(rgn_id, species, fao, value)
yield <- mar_old %>%
full_join(mar_new, by = c("rgn_id","species","fao")); View(yield)
plot(yield$old_value, yield$value);abline(0,1,col="red")
## Compare new/old Mariculture sustainability scores
tmp_old <- read.csv("../v2018/raw/Truj_label_sust.csv") %>%
filter(country == "Russian Federation") %>%
select(country, species_Truj, sust_old = Maric_sustainability, gen_old = Genetic.escapees)
tmp_new <- read.csv("../v2019/raw/Truj_label_sust.csv") %>%
filter(country == "Russian Federation") %>%
select(country, species_Truj, Maric_sustainability, Genetic.escapees)
test <- tmp_new %>%
full_join(tmp_old, by=c("country", "species_Truj"))
View(test)
plot(test$sust_old, test$Maric_sustainability);abline(0,1,col="red")
## Compare new/old genetic escapees scores
plot(test$gen_old, test$Genetic.escapees);abline(0,1,col="red")
Compare the three exclusion methods for seaweed species.
Method 1:Exclude some seaweed species (exclude
) Method 2 (original): Exclude all seaweed species (exclude_no_sw
) *Method 3: Exclude all seaweed nei species (exclude_no_nei
)
## Read in the three different versions of wrangled data
## Best Professional Judgement (BPJ)
exclude <- read.csv("../v2019/output/GenEsc.csv") %>%
filter(year == 2017) %>% # compare the most recent shared year between old and new
select(rgn_id, prs_score_BPJ=pressure_score)
## Exclude all seaweeds (no seaweed)
exclude_no_sw <- read.csv("../v2019/test/GenEsc_no_seaweed.csv") %>%
filter(year == 2017) %>% # compare the most recent shared year between old and new
select(rgn_id, prs_score_no_sw=pressure_score)
## Exclude only nei seaweeds (no nei)
exclude_no_nei <- read.csv("../v2019/test/GenEsc_no_nei.csv") %>%
filter(year == 2017) %>% # compare the most recent shared year between old and new
select(rgn_id, prs_score_no_nei=pressure_score)
## Compare No Seaweed to Best Professional Judgement
compare <- full_join(exclude, exclude_no_sw, by="rgn_id")
compare <- na.omit(compare)
## Compare genetic escapes pressure scores
plot(compare$prs_score_BPJ, compare$prs_score_no_sw); abline(0,1,col="red")
## Compare No Nei to No Seaweed (not much difference)
compare <- full_join(exclude_no_nei, exclude_no_sw, by="rgn_id")
compare <- na.omit(compare)
## Compare genetic escapes pressure scores
plot(compare$prs_score_no_nei, compare$prs_score_no_sw); abline(0,1,col="red")
## Compare No Nei to BPJ
compare <- full_join(exclude_no_nei, exclude, by="rgn_id")
compare <- na.omit(compare)
## Compare genetic escapes pressure scores
plot(compare$prs_score_BPJ, compare$prs_score_no_nei); abline(0,1,col="red")
[REFERENCE RMD FILE: https://raw.githubusercontent.com/OHI-Science/ohiprep_v2019/gh-pages/globalprep/mar/v2019/reference_point/CountryProductionEstimate.Rmd]
This analysis produces potential tonnes of aquaculture from growth potential estimates of finfish and bivalves. These aquaculture numbers will be used for the reference point for the mariculture subgoal.
Reference:
https://knb.ecoinformatics.org/view/doi:10.5063/F1CF9N69 Rebecca Gentry, Halley Froehlich, Dietmar Grimm, Peter Kareiva, Michael Parke, et al. SNAPP - Mapping the Global Potential for Marine Aquaculture. Knowledge Network for Biocomplexity. doi:10.5063/F1CF9N69.
Downloaded: 7/3/2019
Description: Growth Potential estimate raster for global cells
knitr::opts_chunk$set(eval=FALSE)
#######regression inputs from VBGF_Fish.r
# This is a heavily modified script from KNB:
# https://knb.ecoinformatics.org/view/doi:10.5063/F1CF9N69
# From this paper:
# https://www.nature.org/content/dam/tnc/nature/en/documents/Mapping_the_global_potential_for_marine_aquaculture.pdf
### libraries useful for data wrangling
library(dplyr)
library(tidyr)
library(tidyverse)
## libraries useful for spatial data
library(raster)
library(rgdal)
library(sf)
library(fasterize)
## data visualization
library(RColorBrewer)
library(ggplot2)
library(rasterVis)
library(maps)
## path management
library(here)
## some OHI files
source('http://ohi-science.org/ohiprep_v2019/workflow/R/common.R')
dir_git <- file.path(here(), "globalprep/mar/v2019")
## Variables needed to get from PHI to tonnes of production
## MRF: assumes 1 farm is 1km2 ##
# (seems easier to just units of km2, rather than farm)
F_estCoef <- c(7.6792, (-5.8198)) #from regression estimated in VBGF_Fish_Final.r
B_estCoef <- c(2.9959,(-1.6659)) #from regression estimate in VBGF_Bivalves.r
density <- 20 #juveniles per m3
cagesize <- 9000 #m3
cagesperfarm <- 24 #located atleast 1 km apart.. cagesperkm2
bivperfarm <- 130000000 #(100 longlines/km2) * 4 km * (100 bivalves seeded/0.0003 km) = 133333333 bivalve/km2
weight35cm <- 554.8 ## in grams see VBGF_Fish_Final. Paper reports 548 grams
## Global tiff file of PHI (Growth Potential) estimates
FishPhiALLConstraints <- raster(file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/FishPhiALLConstraints95LT2.tiff"))
plot(FishPhiALLConstraints)
FishPhiVector=getValues(FishPhiALLConstraints)
## Convert Phi raster to number of years it takes to grow a 35 cm fish
LogFishYears <- calc(FishPhiALLConstraints, fun=function(x){F_estCoef[1]+F_estCoef[2]*log(x)})
LogFishYears
plot(LogFishYears)
FishYears <- calc(LogFishYears, fun=function(x){exp(x)})
FishYears
plot(FishYears)
writeRaster(FishYears,file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/FishYearsbyCell.tif"), overwrite=TRUE)
FishYears <- raster(file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/FishYearsbyCell.tif"))
FishYearsVector=getValues(FishYears)
#OHI 2018 regions (original analysis used older regions file)
## call spatial file from sourced file
regions_shape()
OHIcountries <- regions %>%
filter(type_w_ant == "eez")
OHIcountries <- st_transform(OHIcountries, crs(FishPhiALLConstraints))
OHIcountries_raster <- fasterize(OHIcountries, FishPhiALLConstraints, field="rgn_id")
CountryVector=getValues(OHIcountries_raster)
### area of each cell (each cell is different given lat/long coordinate reference system)
areaPerCell <- area(FishPhiALLConstraints, weights=FALSE, na.rm=TRUE)
areaPerCellVector <- getValues(areaPerCell)
### Make a dataframe with raster values that includes cells: Country, area, Phi, and Years to Harvest
productionDF <- data.frame(CellID = 1:933120000,
Country = CountryVector,
AreaKM2 = areaPerCellVector,
PhiPrime = FishPhiVector,
YearsToHarvest = FishYearsVector)
head(productionDF)
summary(FishYearsVector)
summary(areaPerCellVector) ##they seem to match
## calculate production for each cell
productionDFFishCells <- productionDF %>%
filter(!is.na(YearsToHarvest)) %>%
mutate(F_yieldperfarmMT = (weight35cm * density * cagesize * cagesperfarm)/1000000) %>% # MRF: units yieldperkm2? 554.8 grams * 20 juv/m3 * 9000 m3 * 24 cages/km2 = grams/km2
mutate(F_yieldpercellperyear = (F_yieldperfarmMT/YearsToHarvest) * AreaKM2) %>%
arrange(YearsToHarvest) %>%
mutate(YieldCumSum = cumsum(F_yieldpercellperyear)) %>%
mutate(AreaCumSum = cumsum(AreaKM2))
write.csv(productionDFFishCells, file.path(dir_M, "git-annex/globalprep/mar/v2019/int/fish/productionDFFishCells.csv")) #save to mazu because so large. This is functionally a raster file.
productionDFFishCells <- read.csv(file.path(dir_M, "git-annex/globalprep/mar/v2019/int/fish/productionDFFishCells.csv"))
head(productionDFFishCells)
summary(productionDFFishCells)
str(productionDFFishCells)
##cumsum area is 11,402,629 km2 - # matches the paper
##cumprod is 15,950,000,000MT - # matches the paper
### how many of these cells are not in a country?
sum(is.na(productionDFFishCells$Country))
dim(productionDFFishCells)
## MRF: with new OHI regions: 544,569 are not in a country..probably a lot are in conflicted areas
## Calculate production if 1% of top production area is used:
productionByCountryFishDF <- productionDFFishCells %>%
filter(!is.na(Country)) %>%
dplyr::select(-YieldCumSum, -AreaCumSum, -X) %>%
arrange(YearsToHarvest) %>%
mutate(ID = Country) %>%
dplyr::arrange(ID) %>%
group_by(ID) %>%
mutate(CountryYieldCumSum = cumsum(F_yieldpercellperyear)) %>%
mutate(CountryAreaCumSum = cumsum(AreaKM2)) %>%
mutate(MaxProdPerCountry = max(CountryYieldCumSum)) %>%
mutate(OnePercentDevPerCountry = .01*max(CountryAreaCumSum)) #calculating 1 percent of area
write.csv(productionByCountryFishDF,file.path(dir_M, "git-annex/globalprep/mar/v2019/int/fish/FishProdByCountryByCell.csv"), row.names = FALSE) #save to mazu because so large. This is functionally a raster file.
productionByCountryFishDF <- read.csv(file.path(dir_M, "git-annex/globalprep/mar/v2019/int/fish/FishProdByCountryByCell.csv"))
## MRF: For each area identify amount of area that corresponds to 1% of production area,
## MRF: assume maximum production within the country for the 1% of area
CountryProdSummary <- productionByCountryFishDF %>%
dplyr::arrange(YearsToHarvest) %>%
dplyr::arrange(ID) %>%
group_by(ID) %>%
filter(CountryAreaCumSum <= OnePercentDevPerCountry) %>%
mutate(ProdPerCountryOnePercent = max(CountryYieldCumSum)) %>%
slice(1)
write.csv(CountryProdSummary, file.path(dir_git, "/int/FishProdByCountrySummary.csv"), row.names = FALSE) #save to github
CountryProdSummary <- read.csv(file.path(dir_git, "int/FishProdByCountrySummary.csv"))
# MRF: get fasted YearsToHarvest for each country
CountryProdSummaryNop <- productionByCountryFishDF %>%
dplyr::arrange(YearsToHarvest) %>%
dplyr::arrange(ID) %>%
group_by(ID) %>%
slice(1)
# Add country names
region_data()
CountryLabel <- rgns_eez %>%
dplyr::select(ID = rgn_id, rgn_name)
## Final data
## I think the relevant value we want for the reference point is in this table: ProdPerCountryOnePercent. Indonesia number for fish matches up with paper. > 24 million tonnes of fish if 1% of aquaculture potential developed.
CountryProdSummaryFAO <- CountryProdSummary %>%
ungroup %>%
dplyr::select(ID:ProdPerCountryOnePercent) %>%
full_join(CountryLabel, by= "ID")
write.csv(CountryProdSummaryFAO, file.path(dir_git, "/int/FishProdByCountrySummaryFAO.csv"), row.names = FALSE) #save to github
sum(CountryProdSummaryFAO$MaxProdPerCountry, na.rm = TRUE)
#15451023277 number matches paper. > 15 billion tonnes
##Now for Bivalves
BivalvePhiALLConstraints=raster(file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/BivalvePhiALLConstraints95LT1.tif"))
plot(BivalvePhiALLConstraints)
BivalvePhiVector <- getValues(BivalvePhiALLConstraints)
#OHI 2018 regions (original analysis used older regions file)
## call spatial file from sourced file
regions_shape()
OHIcountries <- regions %>%
filter(type_w_ant == "eez")
OHIcountries <- st_transform(OHIcountries, crs(FishPhiALLConstraints))
OHIcountries_raster <- fasterize(OHIcountries, FishPhiALLConstraints, field="rgn_id")
CountryVector <- getValues(OHIcountries_raster)
#make the value of each cell the years it takes to grow a 4 cm bivlave
LogBivalveYears <- calc(BivalvePhiALLConstraints, fun = function(x){B_estCoef[1] + B_estCoef[2]*(x)})
LogBivalveYears
plot(LogBivalveYears)
BivalveYears=calc(LogBivalveYears,fun=function(x){exp(x)})
BivalveYears
plot(BivalveYears)
writeRaster(BivalveYears,file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/BivalveYearsByCell.tif"), overwrite=TRUE)
BivalveYears <- raster(file.path(dir_M, "git-annex/globalprep/mar/v2019/Spatial_Data/NewLayersWOHypoxia/BivalveYearsByCell.tif"))
BivalveYearsVector <- getValues(BivalveYears)
###now load in area values for each cell
#areaPerCell=raster("Spatial_Data/MiddleFiles/AreaBivalveLT1.grd")
areaPerCell <- area(BivalvePhiALLConstraints, weights=FALSE, na.rm=TRUE)
areaPerCell
areaPerCellVector=getValues(areaPerCell)
productionDFBiv <- data.frame(CellID=1:933120000,
Country=CountryVector,
AreaKM2=areaPerCellVector,
PhiPrime=BivalvePhiVector,
YearsToHarvest=BivalveYearsVector)
head(productionDFBiv)
productionDFBivCells <- productionDFBiv %>%
filter(!is.na(YearsToHarvest)) %>%
mutate(B_yieldperfarmInd = bivperfarm) %>%
mutate(B_yieldpercellperyear = (B_yieldperfarmInd / YearsToHarvest) * AreaKM2) %>%
arrange(YearsToHarvest) %>%
mutate(YieldCumSum = cumsum(B_yieldpercellperyear)) %>%
mutate(AreaCumSum = cumsum(AreaKM2))
head(productionDFBivCells)
summary(productionDFBivCells)
str(productionDFBivCells)
write.csv(productionDFBivCells, file.path(dir_M, "/git-annex/globalprep/mar/v2019/int/productionDFBivCells.csv")) #save to mazu because so large. This is functionally a raster file.
productionDFBivCells <- read.csv(file.path(dir_M, "/git-annex/globalprep/mar/v2019/int/bivalve/productionDFBivCells.csv"))
productionByCountryBivDF <- productionDFBivCells %>%
filter(!is.na(Country)) %>%
dplyr::select(-YieldCumSum, -X) %>%
dplyr::select(-AreaCumSum) %>%
arrange(YearsToHarvest) %>%
mutate(ID = Country) %>%
dplyr::arrange(ID) %>%
group_by(ID) %>%
mutate(CountryYieldCumSum = cumsum(B_yieldpercellperyear)) %>%
mutate(CountryAreaCumSum = cumsum(AreaKM2)) %>%
mutate(MaxPhi = max(PhiPrime)) %>%
mutate(averagePhi = mean(PhiPrime)) %>%
mutate(averageWeightedPhi = sum(PhiPrime*AreaKM2)/(max(CountryAreaCumSum))) %>%
mutate(MaxDevPerCountry = max(CountryAreaCumSum)) %>%
mutate(MaxProdPerCountry = max(CountryYieldCumSum)) %>%
mutate(OnePercentDevPerCountry = .01*max(CountryAreaCumSum))
head(productionByCountryBivDF)
write.csv(productionByCountryBivDF,file.path(dir_M, "git-annex/globalprep/mar/v2019/int/bivalve/BivProdByCountryByCell.csv"), row.names = FALSE) #save to mazu because so large. This is functionally a raster file.
productionByCountryBivDF <- read.csv(file.path(dir_M, "git-annex/globalprep/mar/v2019/int/bivalve/BivProdByCountryByCell.csv"))
CountryProdSummary <- productionByCountryBivDF %>%
filter(CountryAreaCumSum <= OnePercentDevPerCountry) %>%
mutate(ProdPerCountryOnePercent = max(CountryYieldCumSum)) %>%
slice(1)
write.csv(CountryProdSummary,file.path(dir_git, "int/BivalveProdByCountrySummary.csv"), row.names = FALSE) #save to github
region_data()
CountryLabel <- rgns_eez %>%
dplyr::select(ID = rgn_id, rgn_name)
head(CountryLabel)
## I think the relevant value we want for the reference point is in this table: ProdPerCountryOnePercent. Indonesia number for bivalves is higher than the paper, however the paper does use the phrase "over 3.9*10^11 tonnes". Data says 4.7 * 10^11 million tonnes of bivalves if 1% of aquaculture potential developed, which is greater than 3.9 * 10^11 tonnes.
CountryProdSummaryFAO <- CountryProdSummary %>%
ungroup %>%
dplyr::select(ID:ProdPerCountryOnePercent) %>%
full_join(CountryLabel, by = "ID")
write.csv(CountryProdSummaryFAO, file.path(dir_git, "int/BivalveProdByCountrySummaryFAO.csv"), row.names = FALSE) #save to github
sum(CountryProdSummaryFAO$MaxDevPerCountry, na.rm = TRUE)
#1491404 km2. Matches paper ~1,500,000 km2
To compare potential vs harvest, we need to convert bivalve units to metric tonnes, they are currently in units of individual bivalves. Some figures:
Averaging these gives about 27.5g per piece.
aq_mass_per_pc <- 0.0275 * 1e-3 ### mass of bivalve piece in tonnes
pot_b <- read_csv(file.path(dir_git, 'int/BivalveProdByCountrySummaryFAO.csv')) %>%
mutate(potential_prod_one_percent_b = ProdPerCountryOnePercent * aq_mass_per_pc,
potential_prod_max_b = MaxProdPerCountry * aq_mass_per_pc,
aq_type = 'shellfish',
units = 'tonnes') %>%
dplyr::select("rgn_id" = "ID", rgn_name, potential_prod_one_percent_b, potential_prod_max_b)
pot_f <- read_csv(file.path(dir_git, 'int/FishProdByCountrySummaryFAO.csv')) %>%
mutate(aq_type = 'finfish',
units = 'tonnes') %>%
dplyr::select("rgn_id" = "ID", rgn_name, "potential_prod_one_percent_f" = "ProdPerCountryOnePercent", "potential_prod_max_f" = "MaxProdPerCountry")
pot_aq_int <- full_join(pot_b, pot_f) ## makes sense. 1% of potential AREA... not 1% of potential PRODUCTION
write_csv(pot_aq, file.path(dir_git, 'int/aq_potential_int.csv')) ##intermediate file that might come in handy in the future
DT::datatable(pot_aq_int)
pot_aq_final <- pot_aq_int %>%
mutate(potential_prod_one_percent_b = replace_na(potential_prod_one_percent_b, 0),
potential_prod_one_percent_f = replace_na(potential_prod_one_percent_f, 0)) %>%
mutate(potential_mar_tonnes = potential_prod_one_percent_b + potential_prod_one_percent_f) %>%
arrange(rgn_id) %>%
dplyr::select(rgn_id, potential_mar_tonnes) %>%
mutate(year = 2019)
write_csv(pot_aq_final, file.path(dir_git, 'output/aq_potential_final.csv'))
DT::datatable(pot_aq_final)
## make gapfilling dataset
pot_aq_final_gf <- pot_aq_final %>%
mutate(gapfilled = case_when(
potential_mar_tonnes == 0 ~ 1,
potential_mar_tonnes > 0 ~ 0
),
method = case_when(
potential_mar_tonnes == 0 ~ "missing regions given 0 value",
potential_mar_tonnes > 0 ~ ""
)) %>%
dplyr::select(rgn_id, gapfilled, method, year)
write_csv(pot_aq_final_gf, file.path(dir_git, "output/aq_potential_gf.csv"))