Introduction

My typical git use consists of archiving file modifications for personal projects. Sometimes I share these projects with a few colleagues. We are then 2, 3 or a handful of people working concurrently on the same project, pushing and pulling changes several times per week or even per day.

Simple workflow for one user

We suppose someone has helped you initialise a git repository. Before you modify files, get the latest version of files from the remote repository:

git pull

Show which files have been modified in the folder under git control:

git status

Add a file to be committed

git add filename

Or add all files:

git add --all

Commit and describe changes

git commit -m "describe changes made to those files"

Now you can continue working and changing files. Repeat git add --all (or git add filename) followed by git commit -m "describe changes" each time you reach a point where you think it’s a good idea to archive your work (when a fix is implemented, or at the end of the morning). Before you leave your computer, push the content to the remote repository with:

git push

In case you work with other people on this remote repository, you might want to integrate their work into your repository before you push. If others have added new work to the remote repository, git will ask you to first do:

git pull

and then:

git push

This simple workflow works well for one user pulling from and pushing to a master branch on a remote repository.
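The whole cycle can be tried end to end in a throwaway repository. This is only a minimal sketch: the temporary directories, the file name notes.md, the branch name main and the demo identity are assumptions made for illustration, and a local bare repository stands in for the remote server.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stands in for the remote server
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main     # name the first branch "main"
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
git remote add origin "$work/remote.git"

echo "first note" > notes.md
git status --short                        # notes.md is listed as untracked
git add notes.md
git commit -q -m "Add first note"
git push -q -u origin main                # the first push also sets the upstream

echo "second note" >> notes.md
git add --all
git commit -q -m "Add second note"
git pull -q                               # integrate remote work before pushing
git push -q
git log --oneline origin/main             # both commits are now on the remote
```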

Sometimes the vi editor starts, for example to ask for a commit message. To exit the vi editor without saving, press ESC and then type:

:q!

Note: you might use a graphical git client with check boxes to add the files to be committed.

Workflow with branches

Initialise a new repository

git init

Change to the development branch

# If the branch doesn't exist, create it first.
git branch develop
git checkout develop

Work and modify files in the folder. At some point, when your work has reached a certain status, you might want to archive it before you move on to something else. The few commands below tell git to archive the work. All file modifications will be archived with a commit hash, which is a unique key identifying your changes. The most important part is entering a commit message: it should be sufficiently general to describe your changes with a high level of abstraction and sufficiently specific to be distinguished among all other changes later.

First show which files have been modified in the folder under git control:

git status

Add a file to be committed

git add filename
# Add all files
git add --all

Commit and describe changes

git commit -m "describe changes made to those files"

Now you can continue working and changing files. Repeat git add --all and git commit -m "describe changes" each time you reach a consistent change status.

At some point you will be ready to upload your content to a remote archive. Merge to the master branch

git checkout master
git merge develop

Push contents to the remote repository

git push

See below how to set-up the remote repository with git remote add origin ssh:repositoryurl and pushing content for the first time with git push -u origin master.

Check status again

git status

Go back to the development branch

git checkout develop

Continue to modify files in the repository.

Add and commit

Add patterns

Add modified markdown files to the next git commit

git add *.md

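The star * is a file glob pattern that matches any sequence of characters. Unquoted, it is expanded by the shell in the current directory before git sees it. Quoted ('*.md'), git itself matches it against paths, including paths in subdirectories.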
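The difference between a pattern expanded by the shell and a pattern matched by git can be checked in a scratch repository. A small sketch, where the directory docs and the two file names are made up for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir docs
echo "top" > README.md
echo "nested" > docs/guide.md

git add *.md                  # the shell expands this to README.md only
unquoted=$(git ls-files)      # files currently in the index
git add '*.md'                # git matches the pattern, docs/guide.md included
quoted=$(git ls-files)

echo "unquoted: $unquoted"
echo "quoted:"
echo "$quoted"
```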

Commit messages

Commit messages describe at a high level what has been changed in the code.

On the linux kernel repository commit messages start with a capital letter and do not have a dot at the end. This style is also what is written in the tidyverse style guide.

Example of a long commit message (written like a letter): torvalds/uemacs, “Stop using ‘short’ for line and allocation sizes”.

Extract:

“I really should just learn another editor, rather than continue to polish this turd.”

Add minor changes to the previous commit (git commit --amend):

git commit --amend

Autocompletion

git has an auto complete feature available in a script: git-completion.bash.

That script can be loaded in bashrc at the start of the shell. A debian package was made to simply load git-completion.bash for you:

sudo apt install bash-completion

Branching

To start work in a new branch:

git branch new_branch_name
git checkout new_branch_name

To compare a file between 2 branches:

git diff branch1 branch2 file_name

To compare the tip of 2 branches

git diff branch1..branch2

To merge changes back to the main branch:

git checkout main

git merge branch1

If there were conflicts, they will be presented in this way:

"The area where a pair of conflicting changes happened is marked with
markers <<<<<<<, =======, and
>>>>>>>. The part before the ======= is typically your
side, and the part afterwards is typically their side."

If I am on a detached head, it is recommended to create a temporary branch (stackoverflow).

git branch temp

git checkout temp

git add -A

git commit -m "description of changes"

git checkout main

git merge temp

Delete uncommitted changes in current working directory:

git checkout branch_name .

Branching strategy

For solo projects, I used to have this branching strategy:

I never work in the master branch, always in a development branch called develop. I merge develop in master, only before pushing modifications back to my remote archive. This simple branching strategy makes it much easier to deal with potential changes in the remote master branch.

But I have now changed to a simpler strategy where I work only on the master branch.

For collaborative projects, we use feature branches or also a dev and a main branch.

Create a branch

Create a branch and start working on it

git branch new-branch-name
git checkout new-branch-name

Push the new branch to the remote repo

git push --set-upstream origin new-branch-name
# Or, with the short option
git push -u origin new-branch-name

Move to a commit in history and create a branch at that commit

git checkout <commit-sha>
git switch -c new-branch-name

Delete a branch

I might need to delete a branch at some point:

git branch -d branch_name

Delete a remote branch

git push --delete origin branch_name
# Or, with the short option
git push -d origin branch_name

Check if a branch has been merged before deleting it

git merge-base main compare

This returns the latest common commit between the 2 branches. If it is the same commit as the tip of compare, that branch has been fully merged into main.
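One way to turn this into a yes/no check is to compare the merge base with the tip of the branch. The sketch below assumes a scratch repository with the branch names main and compare. git branch --merged main gives the same information as a list.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo a > file.txt
git add file.txt
git commit -q -m "Initial commit"
git checkout -q -b compare
echo b >> file.txt
git commit -q -am "Work on compare"
git checkout -q main

# A branch is fully merged when the merge base is the branch tip itself
is_merged() {
    [ "$(git merge-base main "$1")" = "$(git rev-parse "$1")" ]
}

if is_merged compare; then before=merged; else before=unmerged; fi
git merge -q compare
if is_merged compare; then after=merged; else after=unmerged; fi
echo "before: $before, after: $after"

git branch --merged main                  # compare is now listed
git branch -q -d compare                  # -d only deletes merged branches
```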

Merge branches

To merge branch1 changes to the main branch:

git checkout main

git merge branch1

Checkout `--ours` or `--theirs`

In case of a merge conflict, you can edit the corresponding file(s) manually in a text editor, or you can decide to keep one of the 2 versions by checking out. For example if you are in the main branch

git checkout --ours <filename>

will get the file as it is in the main branch, while

git checkout --theirs <filename>

will get the file as it is in branch1. And vice-versa, if you are currently in a feature branch git checkout --ours <filename> will get the file as it is in the feature branch.
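To see both options in action, a conflict can be provoked on purpose in a scratch repository. This is a sketch under assumptions: the branch names main and branch1 follow the text above, while file.txt and its contents are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo "original" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b branch1
echo "from branch1" > file.txt
git commit -q -am "Change on branch1"
git checkout -q main
echo "from main" > file.txt
git commit -q -am "Change on main"

git merge branch1 > /dev/null 2>&1 || true   # stops with a conflict in file.txt
markers=$(grep -c '^<<<<<<<' file.txt)       # the conflict markers are in the file

git checkout --theirs file.txt               # keep the branch1 version
git add file.txt                             # mark the conflict as resolved
git commit -q -m "Merge branch1, keeping their version of file.txt"
cat file.txt
```

Replacing --theirs with --ours in the same place would keep the version from main instead.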

Unmerged paths

When a file has the status Unmerged paths:

Unmerged paths:
  (use "git restore --staged <file>..." to unstage)
  (use "git add <file>..." to mark resolution)
        both modified:  file_name.ext

Either edit the file, looking for ======= to find the zones of conflict, then mark it as resolved with git add. Or unstage the file, as suggested by the status message, with

git restore --staged file_name.ext

Checkout a remote branch

When you fetch a remote repository, you get all its branches. The remote branches can be viewed with:

git branch -r

If there is only one branch with that name, you can simply do (without the remote prefix, only the branch name):

git checkout <branch name>

This creates a local branch that tracks the remote branch. You’ll be able to make changes to that local branch and push them to the remote branch later.

Branch naming convention

https://medium.com/@nitanshu1991/git-branching-conventions-102041e9bb71

“If your branch name is more than one words you should use underscore _ rather than hyphen - because - here separates the two different branch.”

Rename a branch

Create and push under a new branch name to the remote repository

git switch -c new-branch-name
git push -u origin new-branch-name

Delete the old branch name locally and in the remote repository

git branch -d old-branch-name
git push --delete origin old-branch-name

Configuring git

Configuring user name and email

Display your user name, email and remote repositories

git config -l

To change username and email globally

git config --global user.name "Your Name" 

git config --global user.email you@example.com

Stackoverflow git multiple user names for different projects explains how to change the email for the current repository only

git config user.email personal@example.org

Other config --global options

How do I make Git use the editor of my choice for commits?

git config --global core.editor "vim"

Edit the global configuration file

git config --global --edit

Remembering passwords

How to save user name and password in git

git config credential.helper store

Alternatively, it’s better to use ssh keys to connect to git repositories.

Debug git with trace

To see more output when a git command is not working as you would expect you can prefix it with GIT_TRACE=1:

GIT_TRACE=1 git commit

Display changes

To view modified files that have not been committed and to view commit history you can use:

git status
git log

git log --pretty=oneline

Show when the tip of branches have been updated

git reflog

Alternatively, call the repository browser with:

gitk

To view a shorter version of the log file, and get an idea at where I am in the history:

git log --graph --decorate --all --pretty=oneline

You can define an alias for git log as explained by Fred here:

git config --global alias.lg "log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr)%C(bold blue)<%an>%Creset' --abbrev-commit"

The new alias can then be used with

git lg

Use tags to specify important points in history, such as software versions.

Diff the working directory

Shows the changes between the working directory and the index.

git diff

Shows the changes between the index and the HEAD

git diff --cached

Shows all the changes between the working directory and HEAD

git diff HEAD
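The three comparisons are easiest to tell apart when one change is staged and another is not. A minimal sketch in a scratch repository, where file.txt and its lines are made up for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo "one" > file.txt
git add file.txt
git commit -q -m "Add file"

echo "two" >> file.txt
git add file.txt                          # "two" is now in the index
echo "three" >> file.txt                  # "three" is only in the working directory

git --no-pager diff                       # working directory vs index: +three
git --no-pager diff --cached              # index vs HEAD: +two
git --no-pager diff HEAD                  # working directory vs HEAD: +two and +three
```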

Compare between 2 commits

git diff <commit hash 1> <commit hash 2>

List all revisions of a given line number in the git history

FILE=filename.csv
LINE=1
git log --format=format:%H $FILE | xargs -L 1 git blame $FILE -L $LINE,$LINE

Useful to track different versions of CSV files headers.
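git log also has a -L option that follows a line range of a file through history, which might be a simpler route to the same information. A sketch with an invented data.csv whose header changes once:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com

printf 'id,name\n1,a\n' > data.csv
git add data.csv
git commit -q -m "Add data"
printf 'id,name,value\n1,a,10\n' > data.csv
git commit -q -am "Add value column"

# Follow line 1 (the header) of data.csv through history
git --no-pager log -L 1,1:data.csv --format='commit %h %s'

# Count the commits that touched the header line
header_commits=$(git log -L 1,1:data.csv --format='commit %h %s' | grep -c '^commit ')
echo "Commits touching the header: $header_commits"
```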

Diff csv files

To consider commas a separator when using word diff in csv file, you can use

git diff --word-diff-regex="[^[:space:],]+"
# Or 
git diff --color-words="[^[:space:],]+" x.csv y.csv

Show changes in previous commits

Show which change happened in a particular commit

git show <commit-hash>

Show only word changes

git show --word-diff <commit-hash>

Show all commits by a person

git log --author="Author Name"

With a regular expression

git log --author="^John"

Blame

To see the commit number and author of each line in a file use

git blame filename

Illustrations

Data flows and storage levels in the Git revision control system.

.gitignore

To ignore all files in a folder but not the folder itself. Put this .gitignore into the folder, then git add .gitignore

*
!.gitignore

To exclude everything except a specific directory foo/bar (note the /* - without the slash, the wildcard would also exclude everything within foo/bar):

/*
!/foo
/foo/*
!/foo/bar
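The effect of these four lines can be verified with git check-ignore and git status. The sketch below uses invented file names (top.txt, keep.txt, drop.txt) in a scratch repository:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p foo/bar foo/other
touch top.txt foo/bar/keep.txt foo/other/drop.txt
printf '%s\n' '/*' '!/foo' '/foo/*' '!/foo/bar' > .gitignore

# Only the content of foo/bar should show up as untracked
git status --short --untracked-files=all

for path in top.txt foo/other/drop.txt foo/bar/keep.txt; do
    if git check-ignore -q "$path"; then
        echo "$path: ignored"
    else
        echo "$path: not ignored"
    fi
done
```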

.gitkeep

SO question: How can I add a blank directory to a git repository?

Some answers suggest using .gitkeep, or simply .keep because files starting with “.git” should be reserved for git usage.

Git run to automate many repositories

Git run makes it possible to pull from many repositories with a single command.

Configuration was a bit complicated on debian.

Install nodejs and npm from the backports

sudo apt -t stretch-backports install nodejs
sudo apt -t stretch-backports install npm
sudo apt install build-essential

Configure npm to run only for the local user, that is, configure npm to use a different directory. Citing docs.npmjs.com, Manually change npm’s default directory:

In your home directory, create a directory for global installations:

mkdir ~/.npm-global

Configure npm to use the new directory path:

npm config set prefix '~/.npm-global'

Install Git run:

npm install -g git-run

Edit your bash profile:

vim ~/.bash_aliases

and add this line:

export PATH=~/.npm-global/bin:$PATH

Then the configuration file is a .json file on the home folder. Edit it:

vim ~/.grconfig.json

For example:

{
  "tags": {
    "all": [
      "/home/paul/repos/bioeconomy_notes",
      "/home/paul/repos/bioeconomy_papers",
      "/home/paul/repos/cbm3_python",
      "/home/paul/repos/cbmcfs3_data",
      "/home/paul/repos/cbmcfs3_pub",
      "/home/paul/repos/cbmcfs3_runner"
    ]
  }
}

Show the status and pull from many repos:

gr git status
gr git pull

Going back in time

Bisect

https://git-scm.com/docs/git-bisect

“This command uses a binary search algorithm to find which commit in your project’s history introduced a bug. You use it by first telling it a “bad” commit that is known to contain the bug, and a “good” commit that is known to be before the bug was introduced. Then git bisect picks a commit between those two endpoints and asks you whether the selected commit is “good” or “bad”. It continues narrowing down the range until it finds the exact commit that introduced the change. In fact, git bisect can be used to find the commit that changed any property of your project; e.g., the commit that fixed a bug, or the commit that caused a benchmark’s performance to improve. To support this more general usage, the terms “old” and “new” can be used in place of “good” and “bad”, or you can choose your own terms. See section “Alternate terms” below for more information.”
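When the good/bad decision can be scripted, git bisect run does the whole search by itself. Here is a sketch in a scratch repository with eight invented commits, where the fifth one adds a line containing the word bug. The grep test is just a stand-in for a real test script.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com

for i in 1 2 3 4 5 6 7 8; do
    echo "line $i" >> file.txt
    if [ "$i" -eq 5 ]; then
        echo "bug" >> file.txt            # commit 5 introduces the "bug"
    fi
    git add file.txt
    git commit -q -m "Commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)   # root commit, known to be good

git bisect start HEAD "$first" > /dev/null   # bad commit first, then good commit
# Exit status 0 marks a commit as good, 1 marks it as bad
git bisect run sh -c '! grep -q bug file.txt' > /dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset > /dev/null 2>&1            # go back to where we started
echo "First bad commit: $culprit"
```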

Log

Display the modification log

git log

In the log, you can copy the beginning of a commit hash for use as commit_sha in the reset commands below.

Display the log of a particular branch (after a fetch for example)

git log origin/master

Display the log of a particular file

git log path/filename

Display a compact log for one file or one directory only

git log --abbrev-commit --pretty=oneline path_to_file

Edit ~/.gitconfig to define an alias that shows the last 10 log entries in a compact way:

[alias]
lkj = log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(cyan)<%an>%Creset' --abbrev-commit -10

This alias also works with the log of one file only

git lkj path/filename

Commit statistics

Show the number of lines modified in each file for each commit:

git log --numstat

A one-liner version showing the number of files changed and the number of lines inserted and deleted, from how to get log with short stat in one line:

git log --pretty="@%h"  --shortstat |  tr "\n" " " | tr "@" "\n"

Export the log to csv

SO Export git log as csv:

git log --pretty=format:'%h,%an,%ad,"%s"' --date=iso -10

Edit ~/.gitconfig to define it as an alias:

logcsv = log --pretty=format:'%h,%an,%ad,\"%s\"' --date=iso

Another version adding stats on the number of files changed and the number of lines inserted and deleted, based on how to get log with short stat in one line:

git log --pretty=format:'linebegin%h,%an,%ad,"%s"' --shortstat --date=iso |  tr "\n" "," | sed -E 's/linebegin|,*$/\n/g'

Simpler solution using paste to glue lines together (there is an issue with extra spaces at the beginning and in the middle of some lines):

git log --pretty=format:'%h,%an,%ad,"%s",' --shortstat --date=iso | paste - - - #> /tmp/gitlog.csv

File history through rename and deletion

Find the last commit where a file was still present in the repository.

Show commits where a particular file was modified, following its history even if the file was renamed.

git log --follow -- notebooks/wood/belarus_ukraine_russia/wood_bur_comtrade_monthly.Rmd

Show all commits where a file was modified

git log --full-history -- biomass_mandate_s2p/biomass_mandate_s2p_report.Rmd

All commits affecting one file

Show all commits affecting one file including dangling commits

git reflog -- README.md

https://stackoverflow.com/questions/17195604/how-can-i-see-the-list-of-all-commits-that-affect-a-specific-file-with-git-incl

Reference log

git help reflog

> Reference logs, or "reflogs", record when the tips of branches and other 
> references were updated in the local repository. Reflogs are useful in 
> various Git commands, to specify the old value of a reference. For 
> example, HEAD@{2} means "where HEAD used to be two moves ago", 
> master@{one.week.ago} means "where master used to point to one week ago 
> in this local repository", and so on.

Tig

The tig utility is a repository browser for git log and git diff.

Rewriting history

Amend to the previous commit

In case you entered a wrong commit message, you can edit the last commit message with:

git commit --amend

In case you forgot a file, you can add a file to the last commit with:

git add filename
git commit --amend

A comment in How to change past commit to include a missed file? specifies that commits before and after --amend have different hashes, even if only the commit message is changed. The use of --amend is only for private commits, which have not yet been shared with others.
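The change of hash is easy to observe in a scratch repository. A minimal sketch, with an invented file and a deliberately misspelled first message:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo a > file.txt
git add file.txt
git commit -q -m "Add fiel"               # typo in the message
before=$(git rev-parse HEAD)

git commit -q --amend -m "Add file"       # only the message changes
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
git log --oneline                         # still a single commit, with a new hash
```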

Rewriting commits older than the previous one is much harder. It requires the use of git rebase. I regularly want to fix typos in the git log, but this is not easily possible: it was made inflexible on purpose, as explained in this Stackoverflow answer:

“The real reason git doesn’t allow you to change the commit message ends up being very simple: that way, you can trust the messages. If you allowed people to change them afterwards, the messages are inherently not very trustworthy.”

Commit to the wrong branch

https://stackoverflow.com/questions/2941517/how-to-fix-committing-to-the-wrong-git-branch

“If you haven’t yet pushed your changes, you can also do a soft reset:”

git reset --soft HEAD^

“This will revert the commit, but put the committed changes back into your index. Assuming the branches are relatively up-to-date with regard to each other, git will let you do a checkout into the other branch, whereupon you can simply commit:”

git checkout branch
git commit -c ORIG_HEAD

“The -c ORIG_HEAD part is useful to not type commit message again.”
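The sequence quoted above can be rehearsed in a scratch repository. In this sketch the branch names main and feature and the file are assumptions, and I use -C (capital) instead of -c so that the message is reused without opening an editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo a > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch feature                        # the branch the work was meant for

echo b >> file.txt
git commit -q -am "Add line b"            # oops, committed on main

git reset -q --soft HEAD^                 # undo the commit, keep the changes staged
git checkout -q feature                   # the staged changes travel along
git commit -q -C ORIG_HEAD                # reuse the message of the undone commit

git log --oneline main
git log --oneline feature
```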

Rebase your changes on top of the remote head commit

It is safe to use rebase as long as the commits have not been pushed to another repository. To rename older commits, simply run:

git rebase -i
git rebase -i @~9   # Show the last 9 commits in a text editor

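Without an argument, git rebase -i uses the upstream of the current branch and lists the commits that are not yet in it; with @~9 it lists the last 9 commits. The list opens in the configured editor (vim in my case). Each line starts with “pick” followed by the commit hash and commit message. Replace the word “pick” with the command you want to apply to that commit (for example reword to change its message), then press “escape” and type “:wq” to save and quit, and git performs the commands on the selected commits.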

Rebase when a remote upstream branch has diverged and there is a merge conflict

You can either go through with the rebase and resolve the conflict. In case you just want to get to the status of the remote branch

Abort the Rebase:

git rebase --abort

Reset the local branch

git fetch origin
git reset --hard origin/main

See also the section on checkout --ours or --theirs.

Rebase a feature branch to the main branch

This is not a good idea if other people are working on the same project and have committed to that branch. Go to the feature branch and rebase the changes onto the main branch

git checkout feature-branch
git rebase main

Note that if the main branch has already been merged into the feature branch with git checkout feature-branch followed by git merge main, then it’s too late, you cannot rebase anymore.

Below is the list of commands I used on a feature branch where I had mistakenly merged main: I wanted to delete the branch, recreate it, rebase, and then try again.

Delete the feature branch, recreate it and pull again

git checkout main
git branch -D crcf
git branch crcf
git checkout crcf
git pull origin crcf

From the feature branch, rebase on top of the main branch

git rebase main
git rebase --continue
git checkout main
git rebase crcf

Merge conflict

To keep only the version of their changes

git checkout --theirs .

Git history when changing a licence

SO question in a case where the project was not made available publicly yet.

Use rebase -i
or use git filter-branch

Clean

Remove all untracked files, including those that git ignores because they are listed in .gitignore. For example output pdf files and auxiliary files generated from latex or from pandoc.

git clean -xf

Remove untracked files, including untracked directories

git clean -f -d

Remove local (untracked) files from my current Git branch

Show what will be deleted with the -n option:

git clean -f -n

Then - beware: this will delete files - run:

git clean -f

Alternatively clean in interactive mode:

git clean -i
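The dry-run option makes it safe to compare what the different flags would delete. A sketch in a scratch repository, with invented file names and an ignore rule for pdf files:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo tracked > kept.txt
echo '*.pdf' > .gitignore
git add kept.txt .gitignore
git commit -q -m "Initial commit"

touch scratch.txt report.pdf              # one untracked file, one ignored file
mkdir build
touch build/out.log                       # an untracked directory

git clean -n                              # would remove scratch.txt
git clean -n -d                           # would also remove build/
git clean -n -d -x                        # would also remove report.pdf

git clean -q -f -d                        # really delete untracked files and directories
ls -A
```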

Reset whole folder

Identify the commit identity in the log and copy its sha number. Then to go back to this state for the whole folder:

git reset --hard commit_sha

Do this before pushing only!

Once you have pushed the changes, you need to revert the commit and commit those changes as well:

git revert -m 1 HEAD~
git revert commit_sha

See https://stackoverflow.com/questions/9804211/can-not-push-changes-after-using-git-reset-hard

Use the beginning of a commit_sha copied from the log.

To remove a commit where I accidentally committed a very large file, I used how to remove a too large file in a commit:

git reset HEAD^

Then added the large file to .gitignore and committed again.

Restore only one file

Git restore can be used to “restore working tree files”. The git checkout command can also be used for this and many other things. Git restore was created to simplify and have a dedicated command that only focuses on restoring files.

To go back to this state for only one file, see git checkout

git checkout commit_hash  path_to_file/file_name

No need to enter the commit hash to get the file from the latest commit

git checkout path_to_file/file_name

Checkout the older revision of a file under a new name

git show commit_sha:filename > new_file_name

Revert

Revert an unwanted commit with a new commit doing exactly the reverse operation

git revert <commit hash>

Linus Torvalds’ advice on reverting a merge

“When you have a problem you are chasing down, and you hit a “revert this merge”, what you’re hitting is essentially a single commit that contains all the changes (but obviously in reverse) of all the commits that got merged. So it’s debugging hell, because now you don’t have lots of small changes that you can try to pinpoint which part of it changes.

But does it all work? Sure it does. You can revert a merge, and from a purely technical angle, Git did it very naturally and had no real troubles. It just considered it a change from “state before merge” to “state after merge”, and that was it. Nothing complicated, nothing odd, nothing really dangerous. Git will do it without even thinking about it.

So from a technical angle, there’s nothing wrong with reverting a merge, but from a workflow angle it’s something that you generally should try to avoid.

If at all possible, for example, if you find a problem that got merged into the main tree, rather than revert the merge, try really hard to bisect the problem down into the branch you merged, and just fix it, or try to revert the individual commit that caused it.

Yes, it’s more complex, and no, it’s not always going to work (sometimes the answer is: “oops, I really shouldn’t have merged it, because it wasn’t ready yet, and I really need to undo all of the merge”). So then you really should revert the merge, but when you want to re-do the merge, you now need to do it by reverting the revert.”

Dangling blobs and trees

Show dangling blobs and trees with the git file system check command:

git fsck

Stackoverflow What is a dangling blob?

“Dangling blob = A change that made it to the staging area/index, but never got committed. One thing that is amazing with Git is that once it gets added to the staging area, you can always get it back because these blobs behave like commits in that they have a hash too!!”

git fsck shows dangling blobs and dangling trees. Running git cat-file -p blob_sha > /path/to/the/file followed by git diff shows me a lot of line deletions. I have to explore around to find which blob is adding content back.

To show the content of a blob

git show blob_sha

To retrieve a single file from a dangling blob

git cat-file -p blob_sha > /path/to/the/file
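The recovery can be rehearsed on purpose: stage a file, unstage it, delete it, and fetch its content back from the dangling blob. A sketch in a scratch repository, where draft.txt and its content are made up for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo "tracked" > file.txt
git add file.txt
git commit -q -m "Initial commit"

echo "precious draft" > draft.txt
git add draft.txt                         # writes a blob into the object store
git reset -q -- draft.txt                 # unstage it again
rm draft.txt                              # and lose the working copy

# The blob is no longer referenced by the index or by any commit
blob_sha=$(git fsck 2> /dev/null | awk '/dangling blob/ {print $3}')
git cat-file -p "$blob_sha" > recovered.txt
cat recovered.txt
```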

To retrieve many files from a dangling tree SO recover dangling blobs in git

git checkout  commit_sha -- .

Going forward in time

Cherry pick one commit

I was on a dev branch and committed a change to the README file. But in fact I wanted this change - related to installations instructions - to be visible by all users immediately. So that commit should have been on the main branch. The problem is that I had other commit in the dev branch that were not ready yet to be shared on the main branch. In order to pick just the commit that changed the README file and put it in the main branch I did:

git cherry-pick <commit-hash>
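The situation described above can be reproduced in a scratch repository. In this sketch the branch names main and dev follow the text, while the file contents and commit messages are invented:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email demo@example.com
echo "# Project" > README.md
git add README.md
git commit -q -m "Initial commit"

git checkout -q -b dev
echo "wip" > feature.txt
git add feature.txt
git commit -q -m "Start feature (not ready)"
echo "Run make install" >> README.md
git commit -q -am "Add installation instructions"
readme_commit=$(git rev-parse HEAD)       # the commit to share right away

git checkout -q main
git cherry-pick "$readme_commit" > /dev/null
git log --oneline main                    # the README commit, without the feature
```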

https://www.atlassian.com/git/tutorials/cherry-pick

“Cherry picking is a powerful and convenient command that is incredibly useful in a few scenarios. Cherry picking should not be misused in place of git merge or git rebase. The git log command is required to help find commits to cherry pick.”

GUI Graphical User Interface

For colleagues who need a GUI:

http://gitextensions.github.io/ recommended on this SO answer
Github Desktop

A section of the git book on Graphical User Interfaces describes the default git-gui and gitk graphical user interfaces to prepare commits and view the git log history. It also describes the github desktop client and in particular its “sync” feature:

“The main way you interact with other repositories over the network is through the “Sync” feature. Git internally has separate operations for pushing, fetching, merging, and rebasing, but the GitHub clients collapse all of these into one multi-step feature. Here’s what happens when you click the Sync button:
git pull --rebase. If this fails because of a merge conflict, fall back to git pull --no-rebase.
git push.
This is the most common sequence of network commands when working in this style, so squashing them into one command saves a lot of time. ”

Help

Get help on a command (will start a web browser):

git init --help

Hosting services and online platforms

Github

Press y to get a permalink to the current file on Github. The URL will include the hash number of the latest commit. https://docs.github.com/en/repositories/working-with-files/using-files/getting-permanent-links-to-files

Gitlab

Issues

Cross linking issues

https://docs.gitlab.com/ee/user/project/issues/crosslinking_issues.html

To link between issues in different repositories in the same group, prefix the #issue number with the project name:

project_name#1

If they are not in the same group, you can add the full URL to the issue (https://gitlab.com///-/issues/).

Duplicates

To mark an issue as duplicate:

/duplicate #issue_number

Issue priority triage

On the Gitlab labels page:

“Labels can be applied to issues and merge requests. Star a label to make it a priority label. Order the prioritized labels to change their relative priority, by dragging.”

Issue triage by priority

Priority	Importance	Intention
`~priority::1`	Urgent	We will address this as soon as possible regardless of the limit on our team capacity.
`~priority::2`	High	We will address this soon and will provide capacity from our team for it in the next few releases.
`~priority::3`	Medium	We want to address this but may have other higher priority items.
`~priority::4`	Low	We don’t have visibility when this will be addressed.

CI Continuous Integration

By default, job artifacts are not kept on failure. To enable this, add a when clause to .gitlab-ci.yml

  artifacts:
    when: always
    paths:
    - notebooks

Usage quotas

View the usage quota through time and by repository at the top project level under settings / usage quotas.

gitlab pages

Site URL available in the project repository under Settings/Pages

The app’s built output files are in a folder named “public”

The public folder also needs to be published as an artifact. See the .gitlab-ci.yml file in this repository.

Deploy token

Create a deploy token on the repository page. It also works at the project level for several repos.

A deploy token makes it possible to clone in an unauthenticated way.

SO answer

git clone https://username:token@gitlab.com/user/repo.git

Usage quotas

Check usage quotas for a group at (replace group_name by the actual group name):

https://gitlab.com/groups/group_name/-/usage_quotas

For a user at (link works as is):

https://gitlab.com/-/profile/usage_quotas#pipelines-quota-tab

Line endings

How to renormalize line endings in all files in all revisions?

git add --renormalize .

This is usually taken care of at the system level. But sometimes when you share files with windows machines (accessing them on a shared folder for example) you end up with commits that show a lot of modifications because the windows line ending CRLF appears on the linux machine. See Configuring git to handle line endings. To force the line ending to be LF only, create a .gitattributes file which contains:

* text=auto eol=lf

Publishing project documentation

On Github

Older content related to pages built from branches.

SO answer to the question “How to add a git repo as a submodule of itself? (Or: How to generate GitHub Pages programmatically?)”: An alternative to using Git Submodules to generate GitHub Pages is to use Git Subtree Merge Strategy.

In fact I didn’t use quite that strategy and I instead cloned a temporary copy of my repository, created the gh-pages branch and pushed it to github. Then I went back to the original repository (where I have a few large untracked data files I find handy to keep for analysis purposes).

Then within the inst folder, I cloned only the gh-pages branch. To clone only one branch:

git clone -b mybranch --single-branch git://sub.domain.com/repo.git

Then I renamed the folder to “web”, so that I had an inst/web folder, tracking the gh-pages branch. inst/web is ignored in the main repository.

On Gitlab

Gitlab has an integrated CI mechanism:

* Create a pages website by using a CI / CD template
* Rendering an R markdown presentation to gitlab pages uses the rocker/verse:4.0.0 docker image. Posted on September 16, 2020.
* GitLab CI for your bookdown project uses the rocker/verse:4.0.2 docker image. Posted on 2020-09-08. A setup generating pages for the master branch and also for merge requests.

To generate html pages from markdown and publish them, I added a .gitlab-ci.yml to the repository. The CI deployment is visible on the paulrougieux.jobs page. Clicking on “running” displays the latest activity of an ongoing build. It starts with “Pulling docker image rocker/verse:4.0.0 …”. Once the build is finished, the completed page is visible at https://paulrougieux.gitlab.io/paulrougieux/

Remote

Add a remote repository with:

git remote add origin git@github.com:paulrougieux/paulrougieux.github.io.git
# or
git remote add reponame https://repositoryurl

Change the url of a remote repository

git remote set-url origin git@github.com:paulrougieux/paulrougieux.github.io.git

When a repository is connected to several remote repositories, you can push content to repositories by specifying their names

git push -u origin master
git push -u anotherrepository master

To change the default git remote, push with:

git push -u origin master

Then later push of that branch to that remote can be made simply with:

git push

Using the -u flag without specifying the remote and the branch fails when the branch has no upstream yet:

$ git push -u

fatal: The current branch master has no upstream branch.
To push the current branch and set the remote as upstream, use
    git push --set-upstream origin master

After running the push with the --set-upstream flag, I can push to the remote server. Then I get this message:

[...]
 * [new branch]      master -> master
Branch master set up to track remote branch master from origin.

I’ll have to figure out what this does.
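As far as I understand, the flag records in the local configuration which remote branch the local branch follows, so that later git push and git pull calls need no arguments. A small sketch to observe this, assuming a local bare repository in a temporary folder stands in for the remote server (the paths and the demo identity are mine):

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git remote add origin "$tmp/remote.git"
# Before the push: no upstream is recorded for master.
git config --get branch.master.remote || echo "no upstream yet"
git push -q --set-upstream origin master
# After the push: git stores the remote and the branch that master tracks.
git config --get branch.master.remote
git config --get branch.master.merge
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}')
echo "$upstream"
```

The two branch.master.* entries are what a plain git push or git pull reads afterwards.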

Push to multiple remotes

Based on this blog post I added the push remote twice to the origin repo as such :

git remote set-url --add --push origin  git@gitlab.com:paulrougieux/paulrougieux.git
git remote set-url --add --push origin  git@github.com:paulrougieux/paulrougieux.github.io.git

The output of git push -v shows that it is now pushing to both repositories

git push -v
Pushing to git@gitlab.com:paulrougieux/paulrougieux.git
To gitlab.com:paulrougieux/paulrougieux.git
 = [up to date]      master -> master
updating local tracking ref 'refs/remotes/origin/master'
Everything up-to-date
Pushing to git@github.com:paulrougieux/paulrougieux.github.io.git
To github.com:paulrougieux/paulrougieux.github.io.git
 = [up to date]      master -> master
updating local tracking ref 'refs/remotes/origin/master'
Everything up-to-date
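The same setup can be reproduced offline. This is only a sketch: two local bare repositories (a.git and b.git, names of mine) stand in for gitlab and github.

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q --bare "$tmp/a.git"
git init -q --bare "$tmp/b.git"
git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git remote add origin "$tmp/a.git"
# Once a push URL is set, the fetch URL is no longer used for pushing,
# so both destinations are added explicitly.
git remote set-url --add --push origin "$tmp/a.git"
git remote set-url --add --push origin "$tmp/b.git"
git push -q origin master
# Both repositories should now hold the same commit as the local branch.
head=$(git rev-parse HEAD)
in_a=$(git --git-dir="$tmp/a.git" rev-parse master)
in_b=$(git --git-dir="$tmp/b.git" rev-parse master)
echo "$head $in_a $in_b"
```

Fetching still uses the single fetch URL of origin, only pushing is duplicated.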

The problem with pushing to multiple remotes while also rebasing commits is that the push to remote A may fail because there are new changes in that repository, while the push to the other remote B succeeds. If you then rebase your changes on top of the changes you pulled from A, your branch will have a different history than the branch in remote B (I think, to be checked).

Pushing to multiple remotes should therefore be combined with a command that pulls before any push.

If someone updated one remote, you did not have the changes, and the push went only to gitlab and not to github, you can merge the 2 branches as such:

git merge github/main origin/main

Pull rebase versus merge

Explanation on rebase versus merge

“If you pull remote changes with the flag --rebase, then your local changes are reapplied on top of the remote changes.”

git pull --rebase

“If you pull remote changes with the flag --merge, which is also the default, then your local changes are merged with the remote changes. This results in a merge commit that points to the latest local commit and the latest remote commit.”

Note that git pull has no --merge option. Merging can be requested explicitly with the --no-rebase flag:

git pull --no-rebase

“It is best practice to always rebase your local commits when you pull before pushing them.”

This can be set as a global configuration with

git config --global pull.rebase true
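A sketch of what pull --rebase does to the history, assuming two local clones ("alice" and "bob", names of mine) that share a bare repository in a temporary folder:

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git --git-dir="$tmp/remote.git" symbolic-ref HEAD refs/heads/master

git init -q "$tmp/alice" && cd "$tmp/alice"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt && git add base.txt && git commit -q -m "base"
git remote add origin "$tmp/remote.git"
git push -q -u origin master

# Bob publishes a commit while Alice commits locally: the branches diverge.
git clone -q "$tmp/remote.git" "$tmp/bob" && cd "$tmp/bob"
echo bob > bob.txt && git add bob.txt && git commit -q -m "bob's change"
git push -q origin master

cd "$tmp/alice"
echo alice > alice.txt && git add alice.txt && git commit -q -m "alice's change"
# Alice's commit is replayed on top of Bob's commit.
git pull -q --rebase origin master
merges=$(git rev-list --merges --count HEAD)
commits=$(git rev-list --count HEAD)
git log --oneline
```

The history stays linear: three commits and no merge commit. With --no-rebase the same situation would instead add a merge commit on top of the two diverging commits.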

Pull a new branch from the repository

Pull a newly created branch from the remote repository

git fetch
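After the fetch, the new branch only exists as a remote-tracking branch (origin/...). A sketch of the follow-up step, assuming a local repository plays the role of the remote and the new branch is called "newbranch" (both are placeholders of mine):

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q "$tmp/origin" && cd "$tmp/origin"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt && git add base.txt && git commit -q -m "base"
git clone -q "$tmp/origin" "$tmp/clone"
# A colleague creates a branch on the remote after the clone was made.
git branch newbranch

cd "$tmp/clone"
# Download the new remote-tracking branch origin/newbranch.
git fetch -q
git branch -r
# Create a local branch that follows origin/newbranch and switch to it.
git checkout -q --track origin/newbranch
current=$(git rev-parse --abbrev-ref HEAD)
echo "$current"
```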

Rename and delete files

Rename a file

git mv file_name file_name_new

Change the case of a file on a windows FAT 32 system:

git mv load.r load2.R
git mv load2.R load.R

If a file or folder has been renamed outside of git, I get this warning:

$ git add .
warning: You ran 'git add' with neither '-A (--all)' or '--ignore-remo
whose behaviour will change in Git 2.0 with respect to paths you removed
Paths like 'docs/efi/efi_logo_rgb_small_siw.jpg' that are
removed from your working tree are ignored with this version of Git.

* 'git add --ignore-removal ', which is the current default,
  ignores paths you removed from your working tree.

* 'git add --all ' will let you also record the removals.

Therefore I think I should always run “git add --all” after I have removed or renamed files.
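A quick check on a throwaway repository that a rename made outside of git is recorded once everything is staged (the file names and the temporary folder are placeholders of mine):

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
echo "some content" > old_name.txt
git add old_name.txt && git commit -q -m "add file"
# Rename the file outside of git.
mv old_name.txt new_name.txt
# Stage both the removal of the old path and the new file.
git add --all
# With rename detection (-M), the staged change is reported as a rename.
status=$(git diff --cached --name-status -M)
echo "$status"
```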

Replace and search text

Searching text with grep

Example from the git grep help page. Looks for time_t in all tracked .c and .h files in the working directory and its subdirectories.

git grep 'time_t' -- '*.[ch]'

Search all files in the subdirectory “subdir” for lines containing the words “factor” and “item”. Show 2 lines of context (2 leading and 2 trailing lines).

git grep -e item --and -e factor -C 2 -- subdir/

Search all R markdown files for the word lyx, ignoring case. The pattern is quoted so that git, and not the shell, expands it:

git grep -i lyx -- '*.Rmd'

Stackoverflow: How to search committed code in the git history?

Search in all possible locations in all commits

git grep <regexp> $(git rev-list --all)

Search for a particular location in the git repo

git grep <regexp> $(git rev-list --all -- lib/util) -- lib/util

The above grep commands will give you the list of commit shas where the grep pattern was present in the repo. You can then check out the older revision of a file under a new name:

git show commit_sha:filename > new_file_name
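The two steps can be chained. A sketch on a throwaway repository, where the file notes.txt and the search word "needle" are made up for the illustration:

```shell
set -eu
# Throwaway identity so that commits work without global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
printf 'keep this\nneedle to find\n' > notes.txt
git add notes.txt && git commit -q -m "add notes"
echo "keep this" > notes.txt
git commit -q -am "remove the needle line"
# The text is gone from the working tree but still present in the history.
# Each match is printed as sha:path:matching line.
git grep needle $(git rev-list --all)
# Take the sha of the first match and restore that version under a new name.
sha=$(git grep -l needle $(git rev-list --all) | head -n 1 | cut -d: -f1)
git show "$sha:notes.txt" > notes_old.txt
cat notes_old.txt
```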

Unique results

Get unique list of import statements in python files

git grep -h  import -- "*.py" | sort --unique

Searching for two terms on the same line

For example:

git grep -e module_dir --and -e "import"

Excluding a pattern from the search

For example (SO):

git grep -e "index=True" --and --not -e "ignore_index"
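A small sketch to see the effect of --and --not, using a made-up file analysis.py in a throwaway repository. Only the line that contains index=True without ignore_index should be printed:

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
printf 'df.to_csv("out.csv", index=True)\npd.concat(frames, ignore_index=True)\n' > analysis.py
# git grep searches tracked files, so the file is added to the index first.
git add analysis.py
# -h hides the file name so that only the matching line is printed.
out=$(git grep -h -e "index=True" --and --not -e "ignore_index")
echo "$out"
```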

Replacing strings

Use git grep to replace strings in many files in the directory :

git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
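A sketch to try the replacement safely on a throwaway repository before running it on real files. It assumes GNU sed, where -i edits files in place without a backup suffix; the file names are placeholders of mine.

```shell
set -eu
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
echo "original_text here" > a.txt
echo "nothing to change" > b.txt
git add a.txt b.txt
# Only the files listed by git grep -l are passed to sed.
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
# Review what changed before committing.
git diff --stat
cat a.txt
```

Since only tracked files are touched, the change can be reviewed with git diff and undone if the replacement went too far.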

See also the refactor function in the bash page.

Signing commits and tags with PGP keys and GPG

The Git book chapter on signing your work

> "Everyone Must Sign"
>
> "Signing tags and commits is great, but if you decide to
> use this in your normal workflow, you’ll have to make sure that everyone on
> your team understands how to do so. If you don’t, you’ll end up spending a
> lot of time helping people figure out how to rewrite their commits with
> signed versions. Make sure you understand GPG and the benefits of signing
> things before adopting this as part of your standard workflow. "

Manual page man git-config:

commit.gpgSign

“A boolean to specify whether all commits should be GPG signed. Use of this option when doing operations such as rebase can result in a large number of commits being signed. It may be convenient to use an agent to avoid typing your GPG passphrase several times.”

Generate a pgp key

gpg --gen-key

More information on gpg key publishing in the bash page. Display what keys are available

gpg --list-keys

Automatically sign tags and commits

Stackoverflow Is there a way to autosign commits in git with a GPG key?

Configure git to use the key

git config --global user.signingkey <key-id>

Configure git to always sign commits and tags

git config --global commit.gpgSign true
git config --global tag.gpgSign true

Configure git to always sign pushes, or to sign them only when the server supports it (use one of the two settings, the last one run wins):

git config --global push.gpgSign true
git config --global push.gpgSign "if-asked"

If you get an error

error: gpg failed to sign the data

Try first to test if gpg signing is working with that key

echo "test" | gpg --clearsign -bsau CC44A3EC868FDAE69F0E57148A77836103E2ADB0
# gpg: skipped "CC44A3EC868FDAE69F0E57148A77836103E2ADB0": No secret key
# gpg: [stdin]: clear-sign failed: No secret key

If you get an error

fatal: the receiving end does not support --signed push

You might make the push sign optional with:

git config --global push.gpgSign "if-asked"

Display a commit signature

Run git show on a commit to view the signature.

Show signatures in the log:

git log --show-signature

Trusted signatures on Gitlab

Gitlab sign commits with GPG

- Export the text of the public key with:

  gpg --list-keys
  gpg --armor --export paul.rougieux@gmail.com
  # Or export it to a file
  gpg --armor --export paul.rougieux@gmail.com > ~/downloads/gpg_pubkey_paul.asc

Once GPG is configured in git, you need to add the GPG key to your Gitlab account so that your commits appear as verified.
GPG keys can be added to github in the same way.

Statistics

Count the number of lines of code for each language in a remote git repository according to this stackoverflow answer

#!/usr/bin/env bash
git clone --depth 1 "$1" temp-linecount-repo &&
  printf "('temp-linecount-repo' will be deleted automatically)\n\n\n" &&
  cloc temp-linecount-repo &&
  rm -rf temp-linecount-repo

“This script requires CLOC (“Count Lines of Code”) to be installed. cloc can probably be installed with your package manager”

sudo apt install cloc

“You can install the script by saving its code to a file cloc-git, running chmod +x cloc-git, and then moving the file to a folder in your $PATH such as /usr/local/bin.”

Usage

cloc-git https://github.com/evalEmpire/perl5i.git

Tagging

Creating an annotated tag

git tag -a v1.4 -m 'my version 1.4'
git tag v0.0.2  -m "Version 0.0.2"

A regular push command won’t push a tag (bitbucket). To push all your tags:

git push origin --tags

List tags

git tag -l

You can add a tag after the fact. To tag an earlier commit, specify the commit checksum or part of it:

git log --pretty=oneline

git tag -a v1.2 -m 'version 1.2' 9fceb02
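A sketch of tagging an earlier commit on a throwaway repository. The abbreviated checksum is read with git rev-parse instead of being copied from the log by hand; the file name and messages are placeholders of mine.

```shell
set -eu
# Throwaway identity so that commits and annotated tags work without global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
echo one > file.txt && git add file.txt && git commit -q -m "first"
first=$(git rev-parse --short HEAD)
echo two > file.txt && git commit -q -am "second"
# Tag the earlier commit by its abbreviated checksum.
git tag -a v1.2 -m 'version 1.2' "$first"
git tag -l
# An annotated tag is stored as its own object of type "tag".
type=$(git cat-file -t v1.2)
# The commit that the tag points to.
tagged=$(git rev-parse --short 'v1.2^{commit}')
echo "$type $tagged"
```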

Delete a tag

git tag -d tag_name

To delete a tag on the remote server as well

git push --delete origin tagname

Return all files in the repository to a particular tag:

git checkout tags/v0.1

Optionally you can checkout and create a new branch to commit your changes with:

git checkout tags/v0.1 -b newbranchname

Push tags

git push --tags

TO DO lists

In one repo:

git grep TODO

In multiple repos (using git run):

gr git grep "^TODO "

Uploading content to online git platforms

Git is useful for version management on its own, but it is even more useful when code can be backed up online.

* Free public git storage is available on Github.
* Free public and private git storage are available on Bitbucket, with up to 5 collaborators per project.
* Free public and private git storage are available on Gitlab.

Commands I’ve used to upload content to an existing repository github.com/paul4forest/forestproductsdemand are:

    git remote add origin https://github.com/paul4forest/forestproductsdemand
    git pull origin master
    git add --all
    git commit -m "Explanatory message"
    git push origin master

Commands to upload content to a fresh repository on bitbucket :

    mkdir /path/to/your/project
    cd /path/to/your/project
    git init
    git remote add origin ssh://git@bitbucket.org/username/bbreponame.git
    # Commit at least once, otherwise there is no master branch to push
    git add --all
    git commit -m "Initial commit"
    # Upload and set the local changes as upstream
    git push -u origin master

See also this discussion on why do I need to set upstream?

Commands to copy an existing repository from bitbucket :

git clone git@bitbucket.org:username/bbreponame.git

Setting up SSH keys

See Generating a new SSH key and adding it to the ssh-agent and Adding a new SSH key to your GitHub account

To test if my new ssh key worked with github, I created a branch and pushed it to github. Then I deleted the branch locally and remotely:

git branch "key"
git checkout key
git push --set-upstream origin key
# Checked that the "key" branch became visible on github
git checkout master
git branch -d key
git push origin --delete key

Stash to keep uncommitted changes in the current branch while working quickly in another branch

For example you edit a file in branch develop

git checkout develop
echo "blabla" > newfile.txt

Imagine you are in the middle of your work and need to make an urgent modification in another branch. To move to another branch, you would normally need to commit this change first. If you don’t want to commit the change yet, you can store it temporarily with git stash. Because newfile.txt is a new, untracked file, add the --include-untracked flag so that it is stashed as well:

git stash --include-untracked

Move to the master branch

git checkout master
# perform important operation in the master branch.

Get back to the develop branch and reload the uncommitted changes

git checkout develop
git stash pop
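The whole round trip as a sketch on a throwaway repository. Note that a plain git stash only stores changes to tracked files; in this sketch I pass --include-untracked so that the new file is stashed too. The file names are placeholders of mine.

```shell
set -eu
# Throwaway identity so that commits and stashes work without global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
tmp=$(mktemp -d)
git init -q "$tmp/work" && cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
echo "v1" > file.txt && git add file.txt && git commit -q -m "first"
git checkout -q -b develop
# An uncommitted change to a tracked file and a new, untracked file.
echo "work in progress" >> file.txt
echo "blabla" > newfile.txt
# Store both away; the working directory is clean afterwards.
git stash -q --include-untracked
git checkout -q master
# ... the urgent modification on master would go here ...
git checkout -q develop
# Bring the uncommitted changes back and drop the stash entry.
git stash pop -q
git status --short
```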

Subversion

Subversion is another version control system. It is different than git and should be on another page, but since I don’t expect to use it very much, it probably makes sense to keep a Subversion section under the git page here. See also wiki.debian.org small SVN tutorial.

Install subversion from the Debian repository

sudo apt install subversion

Load the latest version of some repository

svn checkout --username yourusername http://repositoryurl

Get help on a command

svn help log

See the last 4 log entries

svn log -l 4

Bring changes from the repository HEAD into the working copy

svn up

Synchronize the working copy to the revision given by -r

svn up -r10860

Show the status

svn status

Show differences

svn diff

References

This git page is the continuation of my blog post on git commands.

Presentations:

Power Your Workflow With Git (towards 43 minutes, an animation of branches and merging) A recommendation if you work in a team, don't use pull, but use fetch + merge.
Getting Started with GitHub + Git

Workflow:

Mixing private and public repository in Git workflow

Book:

Pro Git book good introduction.

Tutorial:

Git foundations
Definition and pictures of git and related services

Blogs:

10 things I hate about git
list of git resources by Fernando Perez, the creator of ipython
