Group Project Description

Group Project Goal

The goal of the group project is to practice reusing data in a synthesis project that merges different datasets to gain new insights, and to preserve your scientific products in a data repository.

Project Development

You will have to:

  1. Define an environmental scientific question you would like to answer
  2. Search DataONE, Google Dataset Search, and other data repositories (https://www.re3data.org/) to find data that could help answer this question
  3. Develop a Data Management Plan for your project, including how you will track the provenance (sources) of your data and who will be in charge of what
  4. Develop a scripted workflow to conduct your analysis. When possible, rely on APIs to programmatically download the necessary datasets
  5. Preserve your data products on the KNB repository, including developing the necessary metadata to make them reusable
  6. Document your code and make it available

The most important part of the project is to go through those steps. Having a conclusion or a specific answer to your scientific question at the end of the project is secondary. The evaluation will be focused on the steps followed and how reusable your work will be.
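
Steps 3 and 4 can be combined in a few lines of R. The sketch below is one possible approach, not a required implementation: `download_with_provenance()` and the log file name are hypothetical names introduced here for illustration. It downloads a file from a URL and appends the source, retrieval time, and checksum to a small provenance log, using base R only.

```r
# Hypothetical helper (not part of any package): download a file and
# append one row describing it to a CSV provenance log.
download_with_provenance <- function(url, dest, log_file = "provenance_log.csv") {
  # Create the destination folder if it does not exist yet
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)

  # mode = "wb" keeps binary files intact on Windows
  status <- download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  if (status != 0) stop("Download failed for: ", url)

  entry <- data.frame(
    source_url = url,
    local_file = dest,
    retrieved  = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"),
    md5        = unname(tools::md5sum(dest)),  # checksum to detect later changes
    stringsAsFactors = FALSE
  )

  # Write the header only when the log is created, then append rows
  new_log <- !file.exists(log_file)
  write.table(entry, log_file, sep = ",", qmethod = "double",
              row.names = FALSE, col.names = new_log, append = !new_log)
  invisible(entry)
}

# Example call (placeholder URL, replace with the actual data URL):
# download_with_provenance("https://example.org/data.csv", "data/raw/data.csv")
```

Keeping the log next to the raw data means the source of every file can be reported in the metadata you write for the repository.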

Project Presentation

Prepare a 6-slide, 10-minute presentation

You should address:

Project Example

I will take the example of the LTER Luquillo stream chemistry dataset that we used for the group project of EDS-214.

An example of a project question based on this data could be: Do hurricanes have a significant impact on stream chemistry, and how variable is this response across the US? The datasets to merge could be the one we used for Luquillo combined with data from other LTER sites, such as Florida Coastal Everglades, or with USGS stream data along hurricane tracks.

In fact, a recent LTER synthesis working group looked at patterns in stream chemistry: https://lternet.edu/working-groups/stream-energy-nutrient-cycling/
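
To make the merging step in this example concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the column names (`sample_date`, `nitrate`), the site codes, the event dates, and all values are invented placeholders, not real measurements or real hurricane dates. The point is only the pattern: stack the sites, attach an event date per site, and compare before and after.

```r
# Placeholder tables standing in for two downloaded datasets.
# Column names and values are invented for illustration only.
luquillo <- data.frame(
  sample_date = as.Date(c("2000-01-10", "2000-02-10", "2000-03-10", "2000-04-10")),
  nitrate     = c(0.10, 0.12, 0.30, 0.25)
)
everglades <- data.frame(
  sample_date = as.Date(c("2000-01-15", "2000-02-15", "2000-03-15", "2000-04-15")),
  nitrate     = c(0.05, 0.06, 0.09, 0.08)
)

# Hypothetical event dates, one per site (not real hurricanes)
events <- data.frame(
  site       = c("LUQ", "FCE"),
  event_date = as.Date(c("2000-02-20", "2000-03-01"))
)

# 1. Stack the sites, tagging each row with its origin
luquillo$site   <- "LUQ"
everglades$site <- "FCE"
chem <- rbind(luquillo, everglades)

# 2. Attach the event date of each site to its samples
chem <- merge(chem, events, by = "site")

# 3. Label each sample as taken before or after the event
chem$period <- ifelse(chem$sample_date < chem$event_date, "before", "after")

# 4. Mean concentration per site and period
summary_tbl <- aggregate(nitrate ~ site + period, data = chem, FUN = mean)
print(summary_tbl)
```

With real data, most of the effort usually goes into step 1: harmonizing column names, units, and date formats across sources before the tables can be stacked.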


Group Projects

(in order of presentation)


Project Teams Random Suggestion

library(tidyverse)
library(DT)

# Set the random seed to make this reproducible
set.seed(123)

# Read the roster 
students <- read.csv("data/eds213_roster.csv")

# Randomly create the groups
students %>%
  slice(sample(1:n())) %>%   # shuffle the rows of the data frame
  group_by(grp = (row_number() - 1) %/% (n() / 7)) %>%  # split the shuffled rows into 7 groups
  nest() %>% pull(data) %>% bind_rows(.id = "Group") %>%  # stack the groups, with the group number as a column
  datatable(options = list(pageLength = 25))   # display as an interactive table