OHI Science | Citation policy


1 Summary

This data prep markdown calls a series of scripts to download and analyze sea ice data for the habitat subgoal and coastal protection goal. See data layers documentation for more information.

2 Updates from previous assessment

No changes to methods from the previous assessment. We now download the data from an HTTPS website rather than the defunct FTP site used previously. Instructions for downloading have been updated below.


3 Data Source

Reference:
Cavalieri, D.J., Parkinson, C.L., Gloersen, P. and Zwally, H.J. (1996). Sea ice concentrations from Nimbus-7 SMMR and DMSP SSM/I-SSMIS passive microwave data, 1979-2014. Boulder, Colorado, USA: NASA National Snow and Ice Data Center Distributed Active Archive Center (http://dx.doi.org/10.5067/8GQ8LZQVL0VL).

Downloaded: 2021 April 26

Description: Monthly sea ice extent data.

Data can be downloaded here: https://nsidc.org/data/nsidc-0051

3.1 Downloading the raw data

To download the raw data, go to https://nsidc.org/data/nsidc-0051 and search “nt_??????_” to access the monthly files.

This search works because all monthly files follow the format “nt_YYYYMM_SSS_vVV_R.bin”. Click “Order Files” and wait until your order is processed; this should only take a couple of minutes, and you will receive an email with instructions. Once processed, download the zip file, place it into the git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021 folder on Mazu, and extract it.

The zip file contains numbered folders, each holding two files: an .xml file and a .bin file.

We need to create “xml”, “north”, and “south” folders and place the .bin and .xml files within them.

To do this, we will work in the terminal (alternatively, you could do it manually, but that takes much longer; see the v2020 script).

First we need to ssh into Mazu. Open the terminal, type “ssh ”, and enter your password.

Now we need to cd into the correct folder on Mazu, which is where we have placed our sea ice raw data:

  • cd /home/shares/ohi/git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/5000001093042

Now move all of the .xml files to the xml folder that you have created:

  • mv **/*.xml /home/shares/ohi/git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/xml

And do the same with the n.bin and s.bin files:

  • mv **/*n.bin /home/shares/ohi/git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/north
  • mv **/*s.bin /home/shares/ohi/git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/south
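If you prefer to stay in R rather than the shell, the same reorganization can be sketched with base R file functions. This is only a sketch, not part of the official workflow: `raw_dir` is a placeholder for the extracted order directory for your assessment year.

```r
## Sketch (not part of the original workflow): sort the extracted NSIDC
## order into xml/north/south folders using base R. raw_dir is a
## placeholder for the extracted order directory on Mazu.
sort_seaice_files <- function(raw_dir) {
  dests <- c(xml = "\\.xml$", north = "n\\.bin$", south = "s\\.bin$")
  for (d in names(dests)) {
    dir.create(file.path(raw_dir, d), showWarnings = FALSE)
    files <- list.files(raw_dir, pattern = dests[[d]],
                        recursive = TRUE, full.names = TRUE)
    ## skip files already sitting in one of the destination folders
    files <- files[!basename(dirname(files)) %in% names(dests)]
    file.rename(files, file.path(raw_dir, d, basename(files)))
  }
  invisible(NULL)
}
```

Because the function is driven by filename patterns, it is safe to re-run: files already moved into a destination folder are skipped.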

NOTE: remember to change the assessment year in these paths to match the assessment you are conducting.

See hab_seaice/v2021/int folder for a screenshot of my terminal if needed.

The final folder structure contains the new “north”, “south”, and “xml” folders that you have created, alongside the numbered order folders.

For complete documentation and more information about data access, please see:

http://nsidc.org/data/nsidc-0051.html

If you wish to be notified of updates or corrections to these data, please register with NSIDC User Services by sending e-mail to: nsidc@nsidc.org

Identify yourself as a user of “Sea Ice Concentrations from Nimbus-7 SMMR and DMSP SSM/I-SSMIS Passive Microwave Data (NSIDC-0051).” Include your name, e-mail address, postal address, and telephone number.

If you have questions, please contact NSIDC User Services.

CONTACT INFORMATION:
User Services
National Snow and Ice Data Center
CIRES, 449 UCB
University of Colorado
Boulder, CO USA 80309-0449
Phone: +1 303-492-6199
Fax: +1 303-492-2468
E-mail: nsidc@nsidc.org

Time range: 1979-2020


4 Methods

4.1 Setup

Load all relevant libraries and establish/define parameters and commonly used pathnames. In the setup and file-pathname code chunks, manually change the scenario and data years to reflect the most recent data year (d) and the current assessment year (v).

## Install R packages where necessary
pkg <- c("raster", "fasterize", "sf", "sp", "rgdal", "fields") # "fields" for colors in Status_Trend.R
new.pkg <- pkg[!(pkg %in% installed.packages())]
if (length(new.pkg)){install.packages(new.pkg)}
if (!("ohicore" %in% installed.packages())){devtools::install_github("ohi-science/ohicore")}

## Load libraries, set directories
lapply(c(pkg, "ohicore"), require, character.only = TRUE)

## UPDATE THESE!
assessYear <- "v2021" # change to reflect assessment year
previous_yr <- "v2020" # previous assessment year
final.year <- 2020 # final year of data (all months)

library(here)
source("../../../workflow/R/common.R") # directory locations
## This file makes it easier to process data for the OHI global assessment
##  by creating the following objects:
## 
##  * dir_M = identifies correct file path to Mazu (internal server) based on your operating system
##  * mollCRS = the crs code for the mollweide coordinate reference system we use in the global assessment
##  * regions_shape() = function to load global shapefile for land/eez/high seas/antarctica regions
##  * ohi_rasters() = function to load two rasters: global eez regions and ocean region
##  * region_data() = function to load 2 dataframes describing global regions 
##  * rgn_syns() = function to load dataframe of region synonyms (used to convert country names to OHI regions)
##  * low_pop() = function to load dataframe of regions with low and no human population
##  * UNgeorgn = function to load dataframe of UN geopolitical designations used to gapfill missing data
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj
## = prefer_proj): Discarded datum Unknown based on WGS84 ellipsoid in Proj4
## definition

4.2 Location of Maps

These maps of the OHI regions were made by the PreparingSpatialFiles.R script. If there are changes to the OHI regions, the PreparingSpatialFiles.R script will need to be run. Additionally, it is critical to walk through the ObtainingData.R script if any of the spatial files have been modified (this saves the files as spatial points).

The original polygon files come from git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2015. However, we use the fasterize package to rasterize the polygons and save them to git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021.

maps <- file.path(dir_M, "git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021") # change year to current assessment

4.3 Establish Parameters

Establish: CRS, website to collect data, data selection parameters. Filename format for final monthly data is nt_YYYYMM_SSS_vVV_R.bin. Parameters will be used to scrape the data from the web in ObtainingData script.

pixel = 25000 # pixel dimension in meters for both x and y
prj.n = "+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 +k=1 +x_0=0 +y_0=0 +a=6378273 +b=6356889.449 +units=m +no_defs"
prj.s = "+proj=stere +lat_0=-90 +lat_ts=-70 +lon_0=0 +k=1 +x_0=0 +y_0=0 +a=6378273 +b=6356889.449 +units=m +no_defs"
prj.mol = "+proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"

## Filepath (fp), filepath for the final monthly data is nt_YYYYMM_SSS_vVV_R.bin
fp.n <- file.path(dir_M, "git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/north")
fp.s <- file.path(dir_M, "git-annex/globalprep/_raw_data/NSIDC_SeaIce/v2021/south")

poles = c("n","s")
years = c(1979:final.year) # full range of data
months = 1:12
n.pym = length(poles)*length(years)*length(months)
i.pym = 0
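As a quick illustration of the nt_YYYYMM_SSS_vVV_R.bin convention (these helpers are illustrative only and are not used by the prep scripts): the year/month stamp can be built with sprintf, while the sensor (SSS) and version (vVV) fields vary by satellite era and are easiest to match with a wildcard.

```r
## Illustrative helpers (not used by the prep scripts): build the
## "nt_YYYYMM" stamp and a pattern matching one pole's monthly file.
month_stamp <- function(year, month) sprintf("nt_%04d%02d", year, month)

monthly_pattern <- function(year, month, pole = c("n", "s")) {
  pole <- match.arg(pole)
  ## the sensor (SSS) and version (vVV) fields vary across the record,
  ## so match them with a wildcard, e.g. "nt_202001_f17_v1.1_n.bin"
  paste0(month_stamp(year, month), "_.*_", pole, "\\.bin$")
}
```

For example, `grepl(monthly_pattern(2020, 1, "n"), "nt_202001_f17_v1.1_n.bin")` is TRUE, while the same pattern rejects the corresponding southern-hemisphere file.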

4.4 Function 1

Collects the data for each month/year from Mazu and adds it to a raster stack that is saved in the tmp folder as n_rasters_points.rdata or s_rasters_points.rdata. If it doesn’t already exist, this function also converts the region shapefile into a raster points file. See the ObtainingData.R script for more details.

source("ObtainingData.R") ## UPDATE some file paths with the correct assessment year in ObtainingData.R

## should take ~30 minutes to completely run 
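Schematically, the parameters defined above (poles, years, months, n.pym, i.pym) imply an iteration of the following shape. This is only a sketch of what ObtainingData.R loops over, with `read_month()` standing in as a hypothetical placeholder for the real per-file download/processing step.

```r
## Schematic only: the pole/year/month iteration implied by the
## parameters above. read_month() is a hypothetical placeholder for the
## per-file processing done in ObtainingData.R.
process_all_months <- function(poles, years, months, read_month) {
  n.pym <- length(poles) * length(years) * length(months)
  i.pym <- 0
  for (p in poles) {
    for (yr in years) {
      for (mo in months) {
        i.pym <- i.pym + 1
        message(sprintf("%d of %d: pole %s, %d-%02d", i.pym, n.pym, p, yr, mo))
        read_month(p, yr, mo) # e.g. read the nt_YYYYMM_*_<pole>.bin file
      }
    }
  }
  i.pym # total month/pole combinations processed
}
```

With two poles and a 1979 to 2020 record, this works out to 2 * 42 * 12 = 1008 monthly files, which is why the full run takes roughly half an hour.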

4.5 Function 2

Using the .rdata files created by Function 1 via the ObtainingData.R script, this function calculates status and trend for shoreline ice and ice edge habitat. Data are saved in the intermediate (int) folder:

  • Habitat: n_IceEdgeHabitat.csv, s_IceEdgeHabitat.csv
  • Coastal Protection: n_IceShoreProtection.csv, s_IceShoreProtection.csv
ref.years <- 1979:2000
source("Status_Trend.R") # calculates status and trend for shoreline ice and ice edge habitat

4.6 Final calculations and organization

Read in the ice edge habitat and ice shore protection csv-format datasets, remove anomalous eez regions with minimal ice cover, and remove disputed regions. Bind these datasets and convert to units of km^2. Save sea ice health, trend, and extent data.

n_edge <- read.csv("int/n_IceEdgeHabitat_ref1979to2000.csv")
s_edge <- read.csv("int/s_IceEdgeHabitat_ref1979to2000.csv")
edge <- rbind(n_edge, s_edge)
edge  <- edge %>%
  dplyr::filter(Reference_avg1979to2000monthlypixels != 0) %>%
  dplyr::filter(!(rgn_id %in% c(59, 141, 219, 4, 172, 94))) %>%  # anomalous eez regions with very little ice cover
  dplyr::filter(!(rgn_id %in% c("248300","258510","258520","258600","258700"))) %>% # cut due to minimal ice (<200km2/yr - avg of months)
  dplyr::filter(rgn_nam != "DISPUTED") %>%
  dplyr::mutate(habitat="seaice_edge")

n_shore <- read.csv("int/n_IceShoreProtection_ref1979to2000.csv")
s_shore <- read.csv("int/s_IceShoreProtection_ref1979to2000.csv")
shore <- rbind(n_shore, s_shore)
shore <- shore %>%
  dplyr::filter(Reference_avg1979to2000monthlypixels != 0) %>%
  dplyr::filter(!(rgn_id %in% c(59, 89, 177, 178))) %>%  # anomalous eez regions with very little ice cover
  dplyr::filter(rgn_nam != "DISPUTED") %>%
  dplyr::mutate(habitat = "seaice_shoreline")

data <- rbind(edge, shore)
data  <- data %>%
  dplyr::mutate(km2 = Reference_avg1979to2000monthlypixels/12 * (pixel/1000)^2)

write.csv(data, "int/sea_ice.csv", row.names = FALSE)

## Health data
health <- data %>%
  dplyr::filter(rgn_typ == "eez") %>%
  dplyr::select(rgn_id, habitat, dplyr::starts_with("pctdevR")) %>%
  tidyr::gather("year", "health", -(1:2)) %>%
  dplyr::mutate(year = substring(year, 9, 12)) %>%
  dplyr::mutate(year = as.numeric(year)) %>%
  dplyr::mutate(health = ifelse(health > 1, 1, health))

write.csv(health, "output/hab_ice_health_eez.csv", row.names = FALSE) # save sea ice health data

## Trend data
trend <- data %>%
  dplyr::filter(rgn_typ == "eez") %>%
  dplyr::select(rgn_id, habitat, dplyr::starts_with("Trend")) %>%
  tidyr::gather("year", "trend", -(1:2)) %>%
  dplyr::mutate(year = substring(year, 13, 16)) %>%
  dplyr::mutate(year = as.numeric(year)) %>%
  dplyr::mutate(trend = trend * 5) %>%
  dplyr::mutate(trend = ifelse(trend > 1, 1, trend)) %>%
  dplyr::mutate(trend = ifelse(trend < (-1), -1, trend))

write.csv(trend, "output/hab_ice_trend_eez.csv", row.names = FALSE) # save sea ice trend data

## Sea ice extent data
extent <- data %>%
   dplyr::filter(rgn_typ == "eez") %>%
   dplyr::mutate(year = 2016) %>% # extent not updated each year (historic extent of the sea ice habitat); updated last 2016 bc of source methods
  dplyr::select(rgn_id, habitat, year, km2)

write.csv(extent, "output/hab_ice_extent_eez.csv", row.names = FALSE) # save sea ice extent data
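As a sanity check on the km^2 conversion in the code above (the pixel count here is hypothetical): dividing Reference_avg1979to2000monthlypixels by 12 averages over the months of a year, and each 25,000 m pixel covers (25000/1000)^2 = 625 km^2.

```r
## Worked example of the km2 conversion (hypothetical pixel count)
pixel <- 25000            # pixel edge length in meters (as defined above)
monthly_pixels <- 120     # hypothetical reference monthly pixel total
km2 <- monthly_pixels / 12 * (pixel / 1000)^2
km2 # 120/12 = 10 average pixels * 625 km2 each = 6250 km2
```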

4.7 Data Checks

  1. Compare to last year’s data. There should be a strong correlation between these data.
  2. Pair with region names for sanity check.
## Health comparison
health <- read.csv("output/hab_ice_health_eez.csv")
ice_health_eez <- read.csv(sprintf("../%s/output/hab_ice_health_eez.csv", previous_yr)) %>% # compare to last year's data
  dplyr::rename(health_prev_assess = health) %>%
  dplyr::left_join(health, by = c("rgn_id", "habitat", "year")) %>% 
  na.omit("health")
plot(ice_health_eez$health_prev_assess, ice_health_eez$health)
abline(0, 1, col="red")

### let's compare the data year 2020 to data year 2019

health_2020 <- health %>%
  filter(year == 2020)

health_2019 <- ice_health_eez %>%
  filter(year == 2019) %>%
  dplyr::select(-year, -health)

health_old_new <- left_join(health_2020, health_2019)

plot(health_old_new$health_prev_assess, health_old_new$health)
abline(0, 1, col="red") ## ok makes more sense to me now 

## Trend comparison
trend <- read.csv("output/hab_ice_trend_eez.csv")
ice_trend_eez <- read.csv(sprintf("../%s/output/hab_ice_trend_eez.csv", previous_yr)) %>% # compare to last year's data
  dplyr::rename(trend_prev_assess = trend) %>%
  dplyr::left_join(trend, by = c("rgn_id", "habitat", "year")) %>% 
  na.omit("trend")
plot(ice_trend_eez$trend_prev_assess, ice_trend_eez$trend)
abline(0, 1, col="red")

### let's compare the data year 2020 to data year 2019

trend_2020 <- trend %>%
  filter(year == 2020)

trend_2019 <- ice_trend_eez %>%
  filter(year == 2019) %>%
  dplyr::select(-year, -trend)

trend_old_new <- left_join(trend_2020, trend_2019)

plot(trend_old_new$trend_prev_assess, trend_old_new$trend)
abline(0, 1, col="red") ## ok makes more sense to me now 


## Extent comparison
extent <- read.csv("output/hab_ice_extent_eez.csv", stringsAsFactors = FALSE)
ice_extent_eez <- read.csv(sprintf("../%s/output/hab_ice_extent_eez.csv", previous_yr), stringsAsFactors = FALSE) %>% # compare to last year's data
  dplyr::rename(km2_prev_assess = km2)
plot(ice_extent_eez$km2_prev_assess, extent$km2)
abline(0, 1, col="red")

### let's compare the data year 2020 to data year 2019

extent_2020 <- extent 

extent_2019 <- ice_extent_eez

extent_old_new <- left_join(extent_2020, extent_2019)

plot(extent_old_new$km2_prev_assess, extent_old_new$km2)
abline(0, 1, col="red") # extent doesn't change at all, because we use the same year for extent every year



## Pair with region names
regions <- rgn_master %>%
  dplyr::select(rgn_id = rgn_id_2013, rgn_name = rgn_nam_2013) %>%
  unique()

## Make sure sea ice health across regions makes intuitive sense...
data <- read.csv("output/hab_ice_health_eez.csv") %>%
  dplyr::left_join(regions, by = "rgn_id") %>%
  dplyr::arrange(habitat, health)



#### Check largest score changes for v2021 ####

## Bouvet Island, rgn_id == 105, HAB score +15.15

# health 
bouv_health <- health_old_new %>%
  filter(rgn_id == 105) %>%
  mutate(diff = health - health_prev_assess) # +0.19 

# trend 
bouv_trend <- trend_old_new %>%
  filter(rgn_id == 105) %>%
  mutate(diff = trend - trend_prev_assess) # trend got worse 

## checked against previous conditions, and it looks like Bouvet Island and Jan Mayen (our two largest increases for 2021) have pretty volatile condition scores year after year, so I think this is ok. Nothing went wrong in the data processing on our part. 

4.8 Gapfill

There was no gapfilling for these data, but every layer needs a gapfill (gf) file even when no gapfilling occurred, so we create gapfill files with a value of 0 for every region.

## Health gapfill
hab_ice_health_gf <- read.csv("output/hab_ice_health_eez.csv")%>%
  dplyr::mutate(gapfilled = 0) %>% 
  dplyr::select(rgn_id, year, gapfilled)

write.csv(hab_ice_health_gf, "output/hab_ice_health_eez_gf.csv", row.names=FALSE) # save sea ice health gapfill file

## Extent gapfill
hab_ice_extent_gf <- read.csv("output/hab_ice_extent_eez.csv")%>%
  dplyr::mutate(gapfilled = 0) %>% 
  dplyr::select(rgn_id, year, gapfilled)

write.csv(hab_ice_extent_gf, "output/hab_ice_extent_eez_gf.csv", row.names=FALSE) # save sea ice extent gapfill file

## Trend gapfill
hab_ice_trend_gf <- read.csv("output/hab_ice_trend_eez.csv")%>%
  dplyr::mutate(gapfilled = 0) %>% 
  dplyr::select(rgn_id, year, gapfilled)

write.csv(hab_ice_trend_gf, "output/hab_ice_trend_eez_gf.csv", row.names=FALSE) # save sea ice trend gapfill file