Reproducible & Collaborative Workflows

Day 1

Time (PST)             Activity
10:00am - 10:50am      Introduction & Reproducible workflows (50 min)
10:50am - 11:00am      Break (10 min)
11:00am - 12:30pm      Interactive Session: Drawing board + reporting (60 min); git recap (30 min)
12:30pm - 1:30pm       Lunch
1:30pm - 2:00pm        Collaborative workflows: Fork and Branches (30 min)
2:00pm - 3:00pm        Interactive Session: Collaborating with GitHub (60 min)
3:00pm - 3:10pm        Break (10 min)
3:10pm - 4:00pm        git conflicts (50 min)

Morning

Introduction to EDS-214

A few words…

Lecture 1: Definitions and Concepts of reproducible workflows

Slide deck

Planning things

Don’t start implementing or coding without planning! Scientists write scripts to help them investigate scientific questions, so scripting should not drive our analysis and thinking. We strongly recommend that you take the time to plan all the steps needed to conduct your analysis. Developing such a scientific workflow will help you narrow down the tasks required to move your analysis forward.

From drawings to pseudocode

Materials

Cherubini et al., 2007

Interactive session 1: Develop your workflow skills 💪

Investigating the impacts of hurricanes on stream chemistry in Puerto Rico


Afternoon

Lecture 2: Coding together

Materials

Collaborating using GitHub

At the Terminal:

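# Identify yourself; this name and email are attached to every commit you make
git config --global user.name "Jane Doe"
git config --global user.email janedoe@example.com
# Cache your credentials in memory so you are not prompted on every push (timeout in seconds)
git config --global credential.helper 'cache --timeout=10000000'
# Check the resulting configuration
git config --list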

Setting GitHub token on Taylor

At the R Console:

# On your laptop
usethis::create_github_token() # This should open a web browser on GitHub

# On Taylor 
gitcreds::gitcreds_set()
usethis::git_sitrep()

Collaborating through forking

Materials

Collaborating through branches

Materials

Interactive session 2: Collaborative coding with GitHub

https://github.com/brunj7/eds214-handson-ghcollab

Bonus of the day: Git Therapy

git commit messages

Materials

GitHub conflicts

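The first thing to know is that git pull is actually a two-step process: git pull = git fetch + git merge. By default, git fetch downloads the new commits from the remote without touching your files, and git merge then integrates them into your current branch.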
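To see the two steps separately, here is a minimal sketch that runs entirely in a throwaway directory. Everything in it is an assumption made for illustration: a local bare repository (remote.git) stands in for GitHub, the clones alice and bob stand in for two collaborators, and data.csv holds made-up placeholder values.

```shell
set -e
# Work in a temporary directory so no real repository is touched.
demo=$(mktemp -d)
cd "$demo"

# Identity for the demo commits (same example values as in the git config step).
export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="janedoe@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="janedoe@example.com"

# Hypothetical "remote": a bare repository whose default branch is main.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# First collaborator creates the initial commit and pushes it.
git clone --quiet remote.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
echo "site,value" > data.csv          # placeholder content, not real data
git add data.csv
git commit --quiet -m "Add data file"
git push --quiet -u origin main
cd ..

# Second collaborator clones the repository.
git clone --quiet remote.git bob

# First collaborator pushes one more commit.
cd alice
echo "example-site,0" >> data.csv     # placeholder row
git commit --quiet -am "Add a placeholder row"
git push --quiet
cd ../bob

# Step 1: fetch downloads the new commit but leaves the local branch untouched.
before=$(git rev-parse HEAD)
git fetch --quiet origin
after_fetch=$(git rev-parse HEAD)
remote_tip=$(git rev-parse origin/main)

# Step 2: merge brings the fetched commit into the local branch.
git merge --quiet origin/main
after_merge=$(git rev-parse HEAD)

echo "local branch before fetch: $before"
echo "local branch after fetch:  $after_fetch"
echo "remote-tracking branch:    $remote_tip"
echo "local branch after merge:  $after_merge"
```

After the fetch, the local branch still points at the old commit while origin/main has moved ahead; only the merge updates the working files. Running git pull in bob would have done both steps at once.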

Second: you did nothing wrong! Git tries to merge files automatically. When the changes are far apart in the same file, Git figures it out on its own and merges them automatically. However, if the changes overlap, Git will call you to the rescue and ask you how best to merge the two versions.
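The sketch below reproduces an overlapping change on purpose, again in a throwaway directory. The repository name (conflict-demo), the branch name (colleague), and the file notes.txt are hypothetical and chosen only for illustration. Both branches edit the same line, so the merge stops and asks for help.

```shell
set -e
# Work in a temporary directory so no real repository is touched.
demo=$(mktemp -d)
cd "$demo"

export GIT_AUTHOR_NAME="Jane Doe" GIT_AUTHOR_EMAIL="janedoe@example.com"
export GIT_COMMITTER_NAME="Jane Doe" GIT_COMMITTER_EMAIL="janedoe@example.com"

git init --quiet conflict-demo
cd conflict-demo
git symbolic-ref HEAD refs/heads/main

# Shared starting point.
printf 'title: draft\n' > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# A colleague changes the title on their branch...
git checkout --quiet -b colleague
printf 'title: stream chemistry\n' > notes.txt
git commit --quiet -am "Colleague renames the title"

# ...while you change the very same line on main.
git checkout --quiet main
printf 'title: hurricane impacts\n' > notes.txt
git commit --quiet -am "Rename the title"

# The changes overlap, so git cannot merge them automatically.
if git merge colleague >/dev/null 2>&1; then
  merge_status="clean"
else
  merge_status="conflict"
fi

# Git lists the conflicted file and marks the competing versions inside it.
conflicted=$(git diff --name-only --diff-filter=U)
markers=$(grep -c '^<<<<<<<' notes.txt || true)
echo "merge status: $merge_status"
echo "conflicted file(s): $conflicted"
cat notes.txt

# Resolve by writing the version you want, then stage and commit to finish the merge.
printf 'title: hurricane impacts and stream chemistry\n' > notes.txt
git add notes.txt
git commit --quiet -m "Merge colleague: reconcile the title"
remaining=$(git diff --name-only --diff-filter=U)
```

The content between the <<<<<<< and >>>>>>> markers shows your version and the incoming version side by side. How to combine them is a human decision; here the two titles were simply joined.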

Materials


Further reading

Here are a few selected publications to help you learn more about these topics.

Data and scientific workflow management:

Open Science

Collaborative coding

Code Review

Branches

GitHub Workflow

Git using RStudio

Git mainly from the command line:

Undoing things




The original parts of this work are licensed under a Creative Commons Attribution 4.0 International License.

This website was made with Quarto by Posit.