Version control and code review
Objectives
Browse commits and branches of a Git repository.
Remember that commits are like snapshots of the repository at a certain point in time.
Know the difference between Git (something that tracks changes) and GitHub/GitLab (a platform to host Git repositories).
Instructor note
xx min teaching/discussion
Why do we need to keep track of versions?
Version control is an answer to the following questions (do you recognize some of them?):
“It broke … hopefully I have a working version somewhere?”
“Can you please send me the latest version?”
“Where is the latest version?”
“Which version are you using?”
“Which version have the authors used in the paper I am trying to reproduce?”
“Found a bug! How long has it been there?”
“I am sure it used to work. When did it change?”
“My laptop is gone. Is my thesis now gone?”
Features: roll-back, branching, merging, collaboration
Problem: Your code worked two days ago, but is giving an error now. You don’t know what you changed.
Problem: You and your colleague want to work on the same code at the same time.
Roll-back: you can always go back to a previous version and compare
Branching and merging:
Work on different ideas at the same time
Different people can work on the same code/project without interfering with each other
You can experiment with an idea and discard it if it turns out to be a bad idea

Image created using https://gopherize.me/ (inspiration).
Collaboration: review, compare, share, discuss
Reproducibility
Problem: Someone asks you about your results from 5 years ago. Can you get the same results now?
How do you indicate which version of your code you have used in your paper?
When you find a bug, how do you know precisely when it was introduced? (Are published results affected? Do you need to inform collaborators or users of your code?)
With version control we can “annotate” code (browse this example online):

Example of Git-annotated code, with code and history side by side.
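To make the idea of annotation more concrete, here is a minimal sketch of what an annotate ("blame") view computes: for every line of the newest version, the commit that last touched it. This is a toy model and not how Git itself is implemented. The function name `annotate`, the commit labels `c1` to `c3`, and the file contents are all made up for illustration, and the line matching relies on Python's `difflib` rather than Git's own diff algorithm.

```python
from difflib import SequenceMatcher


def annotate(history):
    """Attribute each line of the newest version to a commit.

    `history` is a non-empty list of (commit_id, lines) pairs, oldest first.
    Returns a list of (commit_id, line) pairs for the newest version.
    """
    first_id, previous = history[0]
    blame = [first_id] * len(previous)

    for commit_id, lines in history[1:]:
        new_blame = []
        matcher = SequenceMatcher(a=previous, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the commit they were attributed to.
                new_blame.extend(blame[i1:i2])
            elif tag in ("replace", "insert"):
                # New or modified lines are attributed to this commit.
                new_blame.extend([commit_id] * (j2 - j1))
            # "delete": the lines are gone, so nothing is carried forward.
        blame, previous = new_blame, lines

    return list(zip(blame, previous))


# Hypothetical three-commit history of one small file.
history = [
    ("c1", ["import math", "radius = 1", "print(radius)"]),
    ("c2", ["import math", "radius = 2", "print(radius)"]),
    ("c3", ["import math", "radius = 2",
            "area = math.pi * radius**2", "print(area)"]),
]

for commit_id, line in annotate(history):
    print(f"{commit_id}  {line}")
```

In a real repository the same question is answered by `git blame <file>` or by the "Blame" view on GitHub/GitLab, which also show author and date for each line.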
What we typically like to snapshot
Software (this is how it started but Git/GitHub can track a lot more)
Scripts
Documents (plain text files are much better suited than Word documents; this material is tracked using Git)
Manuscripts (Git is great for collaborating/sharing LaTeX or Quarto manuscripts)
Configuration files
Website sources
Data
Demonstration
Example repository: https://github.com/coderefinery/planets
Commits are like snapshots: if we break something, we can go back to a previous snapshot.
Commits carry metadata about changes: author, date, commit message, and a checksum.
Branches are like parallel universes where you can experiment with changes without affecting the default branch: https://github.com/coderefinery/planets/network (“Insights” -> “Network”)
With version control we can annotate code (example).
Collaboration: We can fork (make a copy on GitHub), clone (make a copy to our computer), review, compare, share, and discuss.
Code review: Others can suggest changes using pull requests or merge requests. These can be reviewed and discussed before they are merged. Conceptually, they are similar to “suggesting changes” in Google Docs.
Where to explore more
Exercises
Exercise Git-2: Contribute to the example repository
TODO: Have something in example repo that anyone could contribute to?
Fork the example repository: https://github.com/coderefinery/planets
Create a new branch in your fork and give it a descriptive name.
Make a modification on the new branch and create a new commit in the web interface.
The new branch and the new commit now exist only in your fork, not yet in the original repository.
If you would like to contribute your change back to the original repository, you can create a pull request (you are welcome to try). TODO: Full workflow with Issue and PR description?
TODO: To work on this exercise locally, the process would be: fork in the web interface, clone the fork to your computer, create a new branch, make changes on the branch, stage them (add), commit to the local branch, and push the new branch to the remote. You are then at the same stage as when working in the web interface.
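The local variant of this exercise can be sketched as a sequence of Git commands. The small Python helper below only assembles and prints them (a dry run), so nothing is cloned or pushed unless you run the commands yourself. The fork URL, directory, branch name, file name, and commit message are placeholders, and the helper `local_workflow` is hypothetical; `git switch --create` assumes a reasonably recent Git (2.23 or later).

```python
import shlex


def local_workflow(fork_url, directory, branch, path, message):
    """Return the Git commands for clone, branch, add, commit, and push."""
    git = ["git", "-C", directory]  # run Git inside the cloned directory
    return [
        ["git", "clone", fork_url, directory],
        git + ["switch", "--create", branch],
        # ... edit `path` in your editor at this point ...
        git + ["add", path],
        git + ["commit", "-m", message],
        git + ["push", "--set-upstream", "origin", branch],
    ]


# Placeholder values: replace them with your own fork and change.
commands = local_workflow(
    fork_url="https://github.com/your-username/planets.git",
    directory="planets",
    branch="describe-my-change",
    path="README.md",
    message="Describe what changed and why",
)

for command in commands:
    print(shlex.join(command))
```

To execute the steps instead of printing them, each command could be passed to `subprocess.run(command, check=True)`, with the file edit made between the `switch` and `add` steps.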
Exercise Git-3: Archaeology using Git annotate (“blame”)
Your goal is to find out precisely when this line was last modified (in which commit).
Solution
TODO: Describe how it can be found! It was this commit: https://github.com/coderefinery/planets/commit/56cf6fdfef6a516ee369034d7c67a20237abb368
Keypoints
TODO