Welcome to Open Data Science Training at the University of Queensland, hosted by The Faculty of Science and The Centre for Biodiversity and Conservation Science in Brisbane, Australia, from June 18-19, 2019.

Registration is now closed. The event is funded by The Faculty of Science Emerging Leaders Programme, and is free to attend. The course will be held in Steele Building Room 315.

Course Description

This training workshop will introduce you to open data science so you can work with data in an open, reproducible, and collaborative way. Open data science means that methods, data, and code are available so that others can access, reuse, and build from it without much fuss. Coding and using collaborative software helps you reproduce your analyses, be more efficient, and share and collaborate easily with others. Here you will learn a reproducible science workflow with R, RStudio, Git, and GitHub, as recently described in Lowndes et al. 2017, Nature Ecology & Evolution.

This workshop is going to be fun, because learning these open data science tools and practices is empowering! We will live-code as we teach, with you doing everything hands-on on your own computer as you learn. Our training materials that we use to teach are online and available for you as a reference. It can also be used as self-paced learning, or you can use it to teach an in-person workshop, as we have done with Software Carpentry.

This workshop is for you if:

Our 2-day workshop will give you exposure to the best available free resources today, hands-on experience coding with them on your own computer, and confidence to continue using them in your own research. And you’ll have a community of peers to continue coding with after the workshop.

We will cover:

No prior experience with R, RStudio, and GitHub are required.

Workshop schedule

Steele Building Room 315.

Google doc notes.

Day 1 Time Schedule Day 2 Time Schedule
9-10:30 Welcome and goals of the workshop (BH);
Open data science: motivation & mindset (JL)
9-10:30 Visualization (ggplot2) (MF)
tea break Centre for Biodiversity and Conservation Science tea break
seminar 11:00-12:00 Prof Ben Halpern (03-309 in the Steele Building) 10:45-12:15 Data wrangling: dplyr (JA)
lunch lunch
12:30-14:00 R, RStudio & RMarkdown (JA) 13:00-14:30 Collaborating with GitHub (JL)
tea break tea break
14:15-15:45 GitHub (MF) 14:45-16:15 Wrangling data in the wild: Tidy Coral (JL)
15:45-16:30 Data interlude (JA) 16:15-16:30 Closing; strategies for continued learning (JL)
Homework! install.packages("tidyverse") 17:00 Pizza party!


Before the training, please make sure you have done the following.

Read Lowndes et al. 2017, Nature Ecology & Evolution.

Download up-to-date versions of R, RStudio, and Git, and create a GitHub account:

  1. Download and install R: https://cloud.r-project.org
  2. Download and install RStudio: http://www.rstudio.com/download
  3. Download and install Git: https://git-scm.com/downloads Note! Successful install won’t create an application to open
  4. Create a GitHub account: https://github.com Note! Shorter usernames that identify you are best e.g. jafflerbach

Instructor Bios

Instructors are part of the Ocean Health Index team at the National Center for Ecological Analysis and Synthesis (NCEAS).

Julie Lowndes @juliesquid

Julie Lowndes is a marine ecologist, data scientist, and Mozilla Fellow at NCEAS. As founding director of Openscapes and science program lead of the Ocean Health Index, she works to increase the value and practice of environmental open data science. She is a co-founder of EcoDataScience and R-Ladies Santa Barbara and is a Carpentries instructor. She earned her PhD at Stanford University in 2012 studying drivers and impacts of Humboldt squid in a changing climate.

Jamie Afflerbach @jafflerbach

Jamie Afflerbach is a marine data scientist at NCEAS. She is lead analyst for the US Northeast Ocean Health Index assessment and one of many scientists working on the annual global OHI assessments. Her work includes assessing the global status of fisheries using data limited methods and synthesizing large spatial datasets to estimate and track human impacts on the global oceans. She is also an enthusiastic proponent of open science and has taught data science training workshops around the US while coordinating two local Santa Barbara data sicence groups, EcoDataScience and R-Ladies Santa Barbara.

Melanie Frazier

Melanie Frazier is a marine scientist at NCEAS. Prior to NCEAS, she worked at the U.S. Environmental Protection Agency developing ballast water discharge standards and statistical methods for improving indicators of estuarine health. She loves data and statistics because they can reveal hidden patterns and help us see beyond our biases. She finds it very satisfying to teach people new skills that will make their lives easier and improve their science!

Ben Halpern @BSHalpern

Ben Halpern is Director of NCEAS and professor at the Bren School of Environmental Science & Management at the University of California, Santa Barbara. Ben’s research interests are primarily in marine ecology and conservation planning but span a wide range of disciplines. He has led several broad research programs addressing different aspects of managing ocean ecosystems, including a global synthesis of how much and where marine protected areas (MPAs) meet conservation and fisheries objectives, development of new tools and the application of them to a global assessment mapping the cumulative impact of human activities on the ocean, and development and global application of the OHI.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.