git config --global user.name "Jane Doe"
git config --global user.email janedoe@example.com
git config --global credential.helper 'cache --timeout=10000000'
git config --list
Reproducible & Collaborative Workflows
Day 1
Time (PST) | Activity |
---|---|
10:00am - 10:50am | Introduction & Reproducible workflows (50 min) |
10:50am - 11:00am | Break (10 min) |
11:00am - 12:30pm | Interactive Session: Drawing board + reporting (60 min); git recap (30min) |
12:30pm - 1:30pm | Lunch |
1:30pm - 2:00pm | Collaborative workflows: Fork and Branches (30 min) |
2:00pm - 3:00pm | Interactive Session: Collaborating with GitHub (60 min) |
3:00pm - 3:10pm | Break (10 min) |
3:10pm - 4:00pm | git conflicts (50 min) |
Morning
Introduction to EDS-214
Lecture 1: Definitions and Concepts of reproducible workflows
Planing things
Don’t start implementing nor coding without planning! It is important to stress that scientists write scripts to help them to investigate scientific question(s). Therefore scripting should not drive our analysis and thinking. We strongly recommend you take the time to plan ahead all the steps you need to conduct your analysis. Developing such a scientific workflow will help you to narrow down the tasks that are needed to move forward your analysis.
From drawings to pseudocode
Interactive session 1: Develop your worklow skills 💪
Investigating the impacts of Hurricane on stream chemistry in Puerto Rico
Afternoon
Lecture 2: Coding together
Collaborating using GitHub
At the Terminal:
Setting GitHub token on Taylor
At the R Console:
# On your laptop
::create_github_token() # This should open a web browser on GitHub
usethis
# On Taylor
::gitcreds_set()
gitcreds::git_sitrep() usethis
Collaborating through forking
Collaborating through branches
Interactive session 2: Collaborative coding with GitHub
Bonus of the day: Git Therapy
git
commit messages
GitHub conflicts
First thing to know is that actually git pull
is a two step process: git pull = git fetch + git merge
Second: you did nothing wrong!! Git tries to merge files automatically. When the changes are on the same file far apart, git will figure it out on his own and do the merge automatically. However if changes are overlapping, git
will call you to the rescue and ask you how to best merge the two versions.
Further reading
Here are a few selected publications to help you to learn more about these topics.
Data and scientific workflow management:
- The Practice of Reproducible Research-Case Studies and Lessons from the Data-Intensive Sciences:
Kitzes, Justin, Daniel Turek, and Fatma Deniz. n.d. http://www.practicereproducibleresearch.org/ - Good enough practices in Scientific Computing:
https://doi.org/10.1371/journal.pcbi.1005510 - Script your analysis:
https://doi.org/10.1038/nj7638-563a - Principles for data analysis workflows:
https://doi.org/10.1371/journal.pcbi.1008770 - Benureau, F.C.Y., Rougier, N.P., 2018. Re-run, Repeat, Reproduce, Reuse, Replicate: Transforming Code into Scientific Contributions. Front. Neuroinform. 0. https://doi.org/10.3389/fninf.2017.00069
- Sandve, G.K., Nekrutenko, A., Taylor, J., Hovig, E., 2013. Ten Simple Rules for Reproducible Computational Research. PLOS Computational Biology 9, e1003285. https://doi.org/10.1371/journal.pcbi.1003285
- Some Simple Guidelines for Effective Data Management:
https://doi.org/10.1890/0012-9623-90.2.205 - Basic concepts of data management:
https://www.dataone.org/education-modules
Open Science
- The Tao of open science for ecology:
https://doi.org/10.1890/ES14-00402.1 - Challenges and Opportunities of Open Data in Ecology:
https://doi.org/10.1126/science.1197962
- Scientific computing: Code alert
https://doi.org/10.1038/nj7638-563a - Our path to better science in less time using open data science tools
https://doi.org/10.1038%2Fs41559-017-0160 - FAIR data guiding principles
https://doi.org/10.1038/sdata.2016.18 - Skills and Knowledge for Data-Intensive Environmental Research https://doi.org/10.1093/biosci/bix025
- Let go your data
https://doi.org/10.1038/s41563-019-0539-5
Collaborative coding
- A new grad’s guide to coding as a team - Atlassian: https://www.atlassian.com/blog/wp-content/uploads/HelloWorldEbook.pdf
- 10 tips for efficient programming: https://www.devx.com/enterprise/top-10-tips-for-efficient-team-coding.html
- Agile Manifesto: https://moodle2019-20.ua.es/moodle/pluginfile.php/2213/mod_resource/content/2/agile-manifesto.pdf
Code Review
- Small-Group Code Reviews For Education: https://cacm.acm.org/blogs/blog-cacm/175944-small-group-code-reviews-for-education/fulltext
Branches
- Interactive tutorial to learn more about git branches and more https://learngitbranching.js.org/
GitHub Workflow
- GitHub:
- guides on how to use GitHub: https://guides.github.com/
- GitHub from RStudio: http://r-pkgs.had.co.nz/git.html#git-pull
- Forking:
- Comparison of git repository host services: https://www.git-tower.com/blog/git-hosting-services-compared/
- Branches
- interactive tutorial on branches: http://learngitbranching.js.org/
- using git in a collaborative environment: https://www.atlassian.com/git/tutorials/comparing-workflows/centralized-workflow https://moodle2019-20.ua.es/moodle/pluginfile.php/2213/mod_resource/content/2/agile-manifesto.pdf
Git using RStudio
- Happy Git and GitHub for the useR:http://happygitwithr.com/
- R packages - Git and GitHub: http://r-pkgs.had.co.nz/git.html#git-init
Git mainly from the command line:
- Interactive git 101: https://try.github.io/
- Very good tutorial about git: https://www.atlassian.com/git/tutorials/what-is-version-control
- Git tutorial geared towards scientists: http://nyuccl.org/pages/gittutorial/
- Short intro to git basics: https://github.com/mbjones/gitbasics
- Git documentation about the basics: http://gitref.org/basic/
- Git documentation - the basics: https://git-scm.com/book/en/v2/Getting-Started-Git-Basics
- Git terminology: https://www.atlassian.com/git/glossary/terminology
- In trouble, guide to know what to do: http://justinhileman.info/article/git-pretty/git-pretty.png
- Want to undo something? https://github.com/blog/2019-how-to-undo-almost-anything-with-git
- Git terminology: https://www.atlassian.com/git/glossary/terminology
- 8 tips to work better with git: https://about.gitlab.com/2015/02/19/8-tips-to-help-you-work-better-with-git/
- GitPro book (2nd edition): https://git-scm.com/book/en/v2
Undoing things
- Help to decide how to undo your problem: http://justinhileman.info/article/git-pretty/git-pretty.png
- Undo almost everything with git https://blog.github.com/2015-06-08-how-to-undo-almost-anything-with-git/
- Difference between git reset soft, mixed and hard https://davidzych.com/difference-between-git-reset-soft-mixed-and-hard/
- Resetting, Checking Out & Reverting https://www.atlassian.com/git/tutorials/resetting-checking-out-and-reverting