EDS 213: Group Project Description

Group Project Goal

The goal of the group project is to practice reusing data in a synthesis project that merges different datasets to gain new insights, and to preserve your scientific products in a data repository.

Project Development

You will have to:

Define an environmental scientific question you would like to answer
Search DataONE, Google Dataset Search, and other data repositories (https://www.re3data.org/) to find data that could help answer this question
Develop a Data Management Plan for your project, including keeping track of the provenance (sources) of your data and who will be in charge of what.
Develop a scripted workflow to conduct your analysis. When possible, rely on APIs to programmatically download the necessary datasets
Preserve your data products on the KNB repository, including developing the necessary metadata to make them reusable
Document your code and make it available

The most important part of the project is to go through those steps. Having a conclusion or a specific answer to your scientific question at the end of the project is secondary. The evaluation will be focused on the steps followed and how reusable your work will be.
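
As a minimal sketch of the scripted-download and provenance steps above, the base R snippet below downloads a file only if it is not already on disk and returns a one-row provenance record for it. The function name fetch_dataset, the data/raw folder, the column names, and the example.org URL in the usage note are all hypothetical placeholders, not part of the course material; adapt them to the repository or API you actually use.

```r
# Sketch only: download a file once and record where it came from.
# `fetch_dataset`, its arguments, and the column names are illustrative choices.
fetch_dataset <- function(url, dest_dir = "data/raw",
                          downloader = utils::download.file) {
  # Create the destination folder if it does not exist yet
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(dest_dir, basename(url))

  # Skip the download when the file is already on disk
  if (!file.exists(dest)) {
    downloader(url, dest, mode = "wb")
  }

  # One provenance row per file: source, local copy, file date, checksum
  data.frame(
    source_url    = url,
    local_file    = dest,
    file_modified = format(file.mtime(dest), "%Y-%m-%d"),
    md5           = unname(tools::md5sum(dest)),
    stringsAsFactors = FALSE
  )
}
```

A possible use, assuming a vector of real download URLs replaces the placeholder one: provenance <- do.call(rbind, lapply("https://example.org/stream_chemistry.csv", fetch_dataset)), followed by write.csv(provenance, "data/provenance.csv", row.names = FALSE). The checksum makes it possible to tell later whether a source file has changed since you retrieved it.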

Project Presentation

Prepare a 6-slide / 10-minute presentation

You should address:

The question you are trying to answer
Data Management plan, including who did what
How you got the data (API, …) and merged it
Some cool results you got
Where your work is preserved:
- Data package on KNB including the data set(s) produced and the code
- GitHub Repository with a well-documented README

Project Example

As an example, take the LTER Luquillo stream chemistry dataset that we used for the group project in EDS 214.

An example project question based on these data could be: Do hurricanes have a significant impact on stream chemistry, and how variable is this response across the US? The datasets to merge could be the Luquillo data we used, combined with data from other LTER sites, such as Florida Coastal Everglades, or USGS stream data along hurricane tracks.

Actually, there was a recent LTER synthesis working group which looked at patterns in stream chemistry: https://lternet.edu/working-groups/stream-energy-nutrient-cycling/
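
The kind of merge described in this example could look roughly like the sketch below. All values, site labels, dates, and column names are made-up placeholders chosen only to show the mechanics (they are not real Luquillo, Everglades, or USGS measurements), and a real project would first have to harmonize units and variable names across sources.

```r
library(dplyr)

# Placeholder stream chemistry tables for two sites (NOT real data)
site_a <- data.frame(
  site    = "site_A",
  date    = as.Date(c("2020-06-01", "2020-07-01")),
  nitrate = c(0.10, 0.25)
)
site_b <- data.frame(
  site    = "site_B",
  date    = as.Date(c("2020-06-01", "2020-07-01")),
  nitrate = c(0.05, 0.08)
)

# Placeholder event table: one storm date per site (NOT real events)
storms <- data.frame(
  site       = c("site_A", "site_B"),
  storm_date = as.Date(c("2020-06-15", "2020-06-20"))
)

# Stack the site tables, attach the storm date, and label each sample
stream_chem <- bind_rows(site_a, site_b) %>%
  left_join(storms, by = "site") %>%
  mutate(period = if_else(date < storm_date, "before", "after"))

# Compare mean concentration before and after the storm, per site
period_means <- stream_chem %>%
  group_by(site, period) %>%
  summarise(mean_nitrate = mean(nitrate), .groups = "drop")
```

Keeping a site column in every table before stacking is what lets the merged data frame be traced back to its sources, which ties in with the provenance tracking asked for in the Data Management Plan.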

Group Projects

(in order of presentation)

Group 8: Clarissa Boyajian, Alex Vand, and Scout Leonard – How do population demographic factors impact lead exposure in Philadelphia?
Group 4: Cullen Molitor, Desik Somasundaram, Julia Parish, Ryan Munnikhuis – What is the effect of sea surface temperature on coral bleaching?
Group “San Clemmies”: Paloma Cartwright, Mia Forsline, Daniel Kerstan, and Wylie Hampson – What is the abundance and impact of zebra mussel populations in US freshwater lakes?
Group 1 / “Teamnado”: Peter Menzies, Shale Hunter, Alex Clippinger, and Charles Hendrickson – What are the effects of tornadoes on key water quality parameters?
Group 2: Jake Eisaguirre, Juliet Cohen, Grace Lewin, and Connor Flynn – How does wind speed affect sea surface temperature and chlorophyll in our local Santa Barbara Channel?
Group 7: Steven Cognac, Felicia Cruz, and Joe Decesaro – How have harmful algal blooms changed since 2011 at the Scripps Pier?
Group 3: Halina Do-Linh, Allie Cole, and Marie Rivers – How do MPAs affect sea otter populations?

Project Teams Random Suggestion

library(tidyverse)
library(DT)

# Set the random seed to make this reproducible
set.seed(123)

# Read the roster 
students <- read.csv("data/eds213_roster.csv")

# Randomly create the groups
students %>%
  slice(sample(1:n())) %>%   # randomly arrange the data frame
  group_by((row_number() - 1) %/% (n() / 7)) %>%  # create 7 groups
  nest() %>%                 # one nested data frame per group
  pull(data) %>%             # extract the list of group data frames
  bind_rows(.id = "Group") %>%  # group number as column
  datatable(options = list(pageLength = 25))   # display