Version Control with Git

Reproducible Book Club #2

Turkuler Ozgumus

2023-11-07

Outline

Chapter 4 of Building reproducible analytical pipelines with R

  • Version control
  • Git
  • GitHub
  • Use of Git and GitHub

Version control

  • Keeps track of changes to text files
  • It is possible to see
    • how a file changed,
    • who made the changes, and
    • when they were made.
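A minimal sketch of what "how, who and when" looks like in practice, using Git (the tool introduced in the next slides). Everything here is an assumption for illustration: the repository is a throwaway temporary folder, notes.txt is a made-up file, and "Demo Author" is a placeholder identity.

```shell
# Sketch only: a throwaway repository in a temporary folder.
# "Demo Author" and the e-mail address are placeholder values.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Author"
git config user.email "demo@example.com"

# first version of a hypothetical text file
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# second version of the same file
echo "second line" >> notes.txt
git commit -q -am "Extend notes"

# who made each change, when, and with which message
git --no-pager log --format="%an | %ad | %s"

# how the file changed in the latest commit
git --no-pager show --stat --oneline HEAD
```

The two commands at the end answer the three questions on this slide: the log lists author and date, and show reports which lines were added or removed.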

So what?

  • Easy collaboration
  • Safe backup
  • Trying out new features, then keeping or discarding them

Git

  • Git is a version control tool.
  • It must be installed on your computer before you can use it.

GitHub

  • GitHub is an online service for hosting project repositories.
  • You need an account to use it.
  • Pros:
    • Large community
    • Continuous integration via GitHub Actions
  • Cons:
    • Owned by Microsoft (privacy issues)
    • Not possible to self-host an instance of GitHub
  • There are alternatives: GitLab, Bitbucket…

Some notes!

  • Do not put Git repositories into folders synced by cloud services like Dropbox and OneDrive: the syncing can conflict with Git's own bookkeeping.
  • A public repository does not mean that everyone can change the files.

Installing Git

  • Open a terminal (Linux, macOS) or command prompt (Windows)
  • Run
which git

or

git --version
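The check above can also be wrapped in a small script. This is only a sketch for a terminal or Git Bash; the variable name git_found is made up for illustration. In the plain Windows command prompt, which is not available and where git plays the same role.

```shell
# command -v asks the shell where a program lives and works in any POSIX shell
if command -v git >/dev/null 2>&1; then
    git_found=yes
    echo "Git found: $(git --version)"
else
    git_found=no
    echo "Git is not installed yet"
fi
```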

Installing Git

Linux (Debian/Ubuntu):

sudo apt-get update
sudo apt-get install git

or https://git-scm.com/download/linux

macOS: https://git-scm.com/download/mac

Windows: https://git-scm.com/download/win

Opening a Github account

  • Open https://github.com

Git superbasics

  • Create a folder called housing
  • Save the scripts below in this folder
    • save_data.R: https://is.gd/7PhUjd
    • analysis.R: https://is.gd/qCJEbi

Git superbasics

  • In the terminal, go to the housing folder with
cd /path/to/folder/housing

or

  • Open the folder in the file explorer
    • Linux: right-click inside the folder and select “Open Terminal here” or similar
    • Windows: right-click inside the folder and select “Open Git Bash here” or similar
    • macOS: enable this option by following the search results for “open terminal at folder macOS”

Git superbasics

  • Use ls in the terminal/Git Bash to check the folder contents
ls
  • We will use the terminal/Git Bash for the rest

Git superbasics

git init

Git superbasics

git status

Git superbasics

git add

git add needs the file(s) to be specified; a dot (.) means everything in the current folder

git add .
git status
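To make the difference between naming a file and using the dot concrete, here is a small sketch in a throwaway folder. The two .R files are empty stand-ins for the real scripts, and git status --short is just a compact form of git status.

```shell
# Sketch only: a temporary folder with empty stand-in scripts
repo="$(mktemp -d)"
cd "$repo"
git init -q
touch save_data.R analysis.R

git add save_data.R       # stage one file only
git status --short        # save_data.R is staged (A), analysis.R is untracked (??)

git add .                 # stage everything in the current folder
git status --short        # both files are now staged (A)
```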

Git superbasics

git commit -m "Project start"
git status

Git superbasics

Make a change to analysis.R, for example by adding a comment

Git superbasics

git status
git add .
git commit -m "Added a comment to analysis.R"
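The change-and-commit cycle can be tried out safely in a temporary folder. This is a sketch under assumptions: the content of analysis.R is a made-up stand-in, the author identity is a placeholder, and git diff is an extra command not used in the slides that shows the exact lines that changed before they are staged.

```shell
# Sketch only: throwaway repository with a stand-in analysis.R
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Author"        # placeholder identity
git config user.email "demo@example.com"  # placeholder identity

printf 'x <- 1\n' > analysis.R            # not the real script
git add .
git commit -q -m "Project start"

# make a change: add a comment at the top of the file
printf '# a new comment\nx <- 1\n' > analysis.R

git status --short                        # analysis.R is listed as modified (M)
git --no-pager diff                       # the added line is marked with a +

git add .
git commit -q -m "Added a comment to analysis.R"
git --no-pager log --oneline              # two commits, newest first
```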

Git superbasics

how to recover a deleted file before committing the change

rm analysis.R
git status
git stash
ls
git stash drop
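The same stash steps, as a sketch that can be run end to end in a temporary folder without touching the housing project. The file content and the author identity are placeholders.

```shell
# Sketch only: throwaway repository with a stand-in analysis.R
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Author"        # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
printf 'x <- 1\n' > analysis.R
git add .
git commit -q -m "Project start"

rm analysis.R
git status --short    # analysis.R is listed as deleted (D)

git stash             # sets the deletion aside; the folder is back at the last commit
ls                    # analysis.R is there again
git stash drop        # throw away the stored deletion for good
```

Stashing works here because the deletion is an uncommitted change like any other: Git stores it away and restores the last committed state of the folder.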

Git superbasics

how to recover a deleted file after committing the change

rm analysis.R
git status
git add .
git commit -m "Removed analysis.R"
git status

Git superbasics

how to recover a deleted file after committing the change

git log
# reverts every commit made after ab43b4b1069cd98768; use a hash from your own git log
git revert ab43b4b1069cd98768..HEAD

Git superbasics

how to recover a deleted file after committing the change

git log

Git superbasics

how to recover a deleted file after committing the change

# undoes only this one commit; replace the hash with the one from your own git log
git revert 8e51867dc5
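Commit hashes differ in every repository, so here is a sketch that looks the hash up instead of typing it. It runs in a temporary folder; the file content, the identity and the variable name bad_commit are placeholders, and --no-edit simply accepts the default revert message instead of opening an editor.

```shell
# Sketch only: throwaway repository with a stand-in analysis.R
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo Author"        # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
printf 'x <- 1\n' > analysis.R
git add .
git commit -q -m "Project start"

# delete the file and commit the deletion
rm analysis.R
git add .
git commit -q -m "Removed analysis.R"

# the newest commit is the one that removed the file
bad_commit="$(git rev-parse --short HEAD)"

# create a new commit that undoes it
git revert --no-edit "$bad_commit"

ls                             # analysis.R is back
git --no-pager log --oneline   # three commits: start, removal, revert
```

Note that revert does not erase history: the removal stays in the log and a third commit undoes it.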

Git and Github

Push -> upload the changes made in our local repository to GitHub

Git and Github

  • Create a new repository with the following settings:
    • Repository name: housing
    • Public
    • No README file

Git and Github

git remote add origin git@github.com:turkulerc/housing.git
# if you would like to rename the branch from "master" to "main"
git branch -M main 
git push -u origin main
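The same three commands can be rehearsed offline. In this sketch a local bare repository plays the role of the GitHub repository, which is an assumption made purely so the example runs without an account or SSH key; the folder names, file content and identity are placeholders.

```shell
# Sketch only: a local bare repository stands in for GitHub
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # the pretend "GitHub" side

git init -q "$work/housing"               # the local project
cd "$work/housing"
git config user.name "Demo Author"        # placeholder identity
git config user.email "demo@example.com"  # placeholder identity
printf 'x <- 1\n' > analysis.R            # stand-in script
git add .
git commit -q -m "Project start"

# same steps as on the slide, with a local path instead of the GitHub address
git remote add origin "$work/remote.git"
git branch -M main
git push -q -u origin main

git remote -v                              # where origin points to
git --no-pager log --oneline origin/main   # what the remote now contains
```

With a real GitHub repository only the address after origin changes; the push itself additionally needs the SSH key set up in the following slides.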

Git and Github

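  • Generate a public/private RSA key pair
# -t rsa asks for an RSA key, so the files are named id_rsa and id_rsa.pub
ssh-keygen -t rsa

  • When asked where to save the key, leave this empty and press Enter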

Git and Github

  • When asked for a passphrase, leave this empty, too
  • Confirm the empty passphrase by pressing Enter again!

Git and Github

We need to copy the contents of the id_rsa.pub file to GitHub

  • Go to https://github.com
  • Go to your profile settings
  • Select SSH and GPG keys
  • Select New SSH key
  • Name the key, paste the contents of id_rsa.pub into the text box and click on “Add SSH key”

Git and Github

Back to the terminal/Git Bash

# use main instead of master if you renamed the branch earlier
git push -u origin master

Getting to know Github

https://github.com

Getting to know Github

Minimal, reproducible example (MRE)

  • The code needs to be self-contained
  • Packages required to run the code should be listed -> sessionInfo()
  • Provide code to create any object that the code snippet requires
  • The {reprex} package helps with preparing such examples

Some additional resources

  • https://happygitwithr.com/
  • https://ohshitgit.com/

THANKS!