Git

Git is the distributed version control system that was developed for the Linux kernel (started by Linus Torvalds in 2005). Today, it is widely used in the open source development community.

At work, I came into contact with Subversion, and I really enjoy using it for all text-based projects (programming, LaTeX, you name it). However, the Linux kernel is managed with Git, and recently many projects have switched to Git. So I thought it was time to learn its basics. While learning, I maintain my cheat sheet here. It is not intended to be a stand-alone tutorial; there are many good ones available (e.g. the book on the Git homepage, which has also been translated into a number of languages). So this is merely a short reference. Further, my own view is that of an experienced Subversion user, so I base my notes on previous knowledge about source code revision systems.

While Subversion is a server-based version control system (the central repository is a server and all interaction is done with this server), Git is a distributed version control system. That means that every user has a local repository (therefore "distributed") that stores all revisions. These local repos can be synchronized by pushing the changes to a server or by pulling changes from someone into the local repo.

A Git server can be any system accessible via network with either SSH or HTTP(S). GitHub provides free Git hosting. For the following tests, I use my test repository.

Configuration

Every tutorial starts with the following lines to configure your name and E-Mail:

$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email you@yourdomain.example.com

So it appears to be really advisable to do this. The background is that Git stores the name and an E-Mail address with every commit. Unlike Subversion, where every user needs an account on the server, Git cryptographically hashes all commits and everybody (or every alias) can create commits. The --global parameter adds the settings to the config file in your home directory. If you want to contribute to a project with a different identity (e.g. your company E-Mail instead of your private one), the settings can also be made only for the current project by omitting this option. The global config file is ~/.gitconfig and the per-project one is .git/config.

Please refer to the documentation (man git-config and git --help) for further information. The command git config -l lists the current settings (if multiple entries exist, the last one overwrites previous ones).
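
To see how the global and the per-project settings interact, here is a small sketch. It runs in a throwaway directory and points HOME to a temporary location so that the real ~/.gitconfig is not touched; the directory names and E-Mail addresses are made up for the illustration.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; HOME is redirected so the real ~/.gitconfig stays untouched.
sandbox=$(mktemp -d)
export HOME="$sandbox/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME" "$sandbox/project"

# Global identity (written to $HOME/.gitconfig).
git config --global user.name "Your Name Comes Here"
git config --global user.email private@example.com

# Project-specific identity (written to .git/config of this repository).
cd "$sandbox/project"
git init -q
git config user.email work@example.com

global_email=$(git config --global user.email)
effective_email=$(git config user.email)
echo "global:    $global_email"
echo "effective: $effective_email"
```

Inside the project, the project setting wins; everywhere else the global one applies.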

Activate colors (older Git versions have them deactivated by default; since Git 1.8.4, color.ui defaults to auto) with:

$ git config --global color.diff auto
$ git config --global color.status auto
$ git config --global color.branch auto

Initialization

A new repository is initialized by entering

$ git init

in the base directory. If that's an existing project, you can add all files with

$ git add *
$ git commit -m "initial commit"

Instead of git add *, you probably want to add only selected files, such as *.c etc.
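
A minimal sketch of such a selective add, run in a temporary directory. The file names (main.c, util.c, Makefile, main.o) are only placeholders for a typical small C project.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"
git config user.email user@example.com

# Placeholder project: two sources, a Makefile and a build product.
echo 'int main(void) { return 0; }' > main.c
echo 'void util(void) {}' > util.c
echo 'all: ; cc -o hello main.c util.c' > Makefile
: > main.o

# The shell expands *.c, so main.o is never staged.
git add -- *.c Makefile
git commit -q -m "initial commit"

tracked=$(git ls-files | tr '\n' ' ')
echo "tracked: $tracked"
git status --short      # main.o shows up as untracked (??)
```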

It's also possible to clone an existing repository (like a Subversion checkout):

$ git clone https://github.com/georgwassen/HelloWorld

The URL is provided by the project that offers the repository.

First steps

Files that should be committed to the repository must be added first (called "staging"):

$ git add main.c
$ git add Makefile

Check what is staged (git status lists the files, git diff --cached shows the staged modifications):

$ git status
$ git diff --cached

Commit the staged changes (to the local repository) providing a change-log message:

$ git commit -m "initially adding my files"

If you don't provide the -m "message" parameter, Git will open an editor and ask for the commit message.

Now, further modifications can be done. Unlike Subversion, registered files are not automatically included in the next commit. With Git, the files must be added again:

$ git status
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   main.c
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       hello
#       main.o
no changes added to commit (use "git add" and/or "git commit -a")
$ git add main.c
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       modified:   main.c
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       hello
#       main.o
$ git commit -m "added Bye message"
[master 6c8bb69] added Bye message
 1 file changed, 1 insertion(+)

The cycle is:

  • Make (and test) changes. Prefer small and related changes and commit them with meaningful messages.
  • Add changed files: git add file.c
  • Check state and if you missed a file: git status
  • Commit changes to local repository: git commit -m "message"
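
The cycle above can be replayed in a scratch repository. This is just a sketch with made-up file contents; it changes two files but stages only one of them, to show that only staged changes end up in the commit.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"
git config user.email user@example.com

echo '/* Hello */' > main.c
echo 'first notes' > README
git add main.c README
git commit -q -m "initially adding my files"

# 1. Make changes to two tracked files.
echo '/* Bye */' >> main.c
echo 'more notes' >> README

# 2. Stage only main.c.
git add main.c

# 3. Check the state: "M " means staged, " M" means modified but not staged.
git status --short

# 4. Commit; only the staged main.c goes in.
git commit -q -m "added Bye message"

committed=$(git diff --name-only HEAD~1 HEAD)
pending=$(git status --short)
echo "committed: $committed"
echo "pending:  $pending"
```

README stays modified in the working copy until it is added and committed as well.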

So far, all work is done in the master branch.

Upload to a server

As explained in the GitHub help, a local repository can be pushed to a server with the commands

$ git remote add origin git@github.com:georgwassen/HelloWorld.git
$ git push origin master

(This uses the SSH URL of my test repository.)

Later, or if the local repository was cloned from a server, it suffices to issue:

$ git push origin master

("origin" is the local name for the server, called a remote, and "master" is the branch to push.)
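
To try the push without a real server, a bare repository in a local directory can stand in for it. This is only a sketch: the directory names are invented, and the branch name is pinned to master because newer Git installations may use a different default.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the server: a bare repository in a local directory.
git init -q --bare server.git

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master   # pin the branch name to master
git config user.name "Example User"
git config user.email user@example.com
echo 'hello' > main.c
git add main.c
git commit -q -m "initial commit"

# Register the "server" under the name origin and push master to it.
git remote add origin "$work/server.git"
git push -q origin master

local_head=$(git rev-parse master)
server_head=$(git --git-dir="$work/server.git" rev-parse master)
echo "local:  $local_head"
echo "server: $server_head"
```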

The full power of git

It is possible to synchronize the local repository with multiple servers.

Get the changes from a server:

$ git remote add upstream https://github.com/octocat/Spoon-Knife.git
$ git fetch upstream
$ git merge upstream/master
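
The same fetch and merge sequence can be tried locally. In this sketch, a second directory plays the role of the upstream project, and the clone names its remote upstream right away (the -o option of git clone); all names and file contents are made up.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Made-up upstream project with one commit.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Upstream Author"
git config user.email upstream@example.com
echo 'version 1' > file.txt
git add file.txt
git commit -q -m "first upstream commit"

# My repository: a clone whose remote is called "upstream".
cd "$work"
git clone -q -o upstream "$work/upstream" mine

# Upstream moves on.
cd "$work/upstream"
echo 'version two' > file.txt
git commit -q -am "second upstream commit"

# Get the new commit and merge it (a fast-forward, as I have no commits of my own).
cd "$work/mine"
git fetch -q upstream
git merge -q upstream/master

mine_head=$(git rev-parse HEAD)
upstream_head=$(git -C "$work/upstream" rev-parse HEAD)
cat file.txt
```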

The strongest feature of Git is said to be branching and merging. Create a new branch and switch to it (there should be no open changes, i.e. a clean working copy):

$ git branch newfeature
$ git checkout newfeature

With git branch, all available branches are displayed. Now, you can switch between the branches and commit changes.

To merge changes from one branch to the other (for example, to merge the changes developed in a branch back to master), just call

$ git checkout master
$ git merge newfeature

A merged branch can be removed with git branch -d newfeature.

Branches in Git are very light-weight and fast, so they can be used to keep separate issues apart and merge them when they work. If there is a merge conflict (e.g. if both branches changed the same line), the conflict is marked and reported. You need to clean up the conflicts manually and then add the conflicting files to the staging area. When all conflicts are resolved, commit the staged files and the pending merge will be completed.
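
Here is a sketch that provokes such a conflict on purpose and resolves it. The file msg.txt and its contents are invented; the "manual clean-up" is simulated by overwriting the file with the final wording.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email user@example.com

echo 'greeting = Hello' > msg.txt
git add msg.txt
git commit -q -m "initial commit"

# Change the same line on a branch ...
git branch newfeature
git checkout -q newfeature
echo 'greeting = Hi there' > msg.txt
git commit -q -am "friendlier greeting"

# ... and on master.
git checkout -q master
echo 'greeting = Good day to you' > msg.txt
git commit -q -am "formal greeting"

# The merge stops, reports the conflict and returns a non-zero exit status.
if git merge newfeature; then
    echo "unexpected: merged without conflict"
    exit 1
fi
git status --short      # "UU msg.txt" marks the conflicting file

# Resolve by hand (simulated), stage the file, commit to complete the merge.
echo 'greeting = Hi there, good day' > msg.txt
git add msg.txt
git commit -q -m "merge newfeature, combined greeting"

feature_tip=$(git rev-parse newfeature)
second_parent=$(git rev-parse HEAD^2)
git branch -d newfeature
```

The resulting merge commit has two parents: the previous master and the tip of newfeature.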

To get an overview of the current branch and how it's composed of commits and merges, the visual tool gitk is a great help. If you start it from within a Git repository, it displays the log of the current branch. With gitk --all, it displays all existing branches, which helps to remember which branches are still waiting to be merged.

Creating a server repository

Git supports four protocols to interact with remote repositories:

  • File: for local repos in other directories or on network drives (e.g. NFS)
  • SSH: encrypted transfer for retrieving and uploading
  • HTTP/HTTPS: easy publishing of source code; the older "dumb" variant is download-only, while the "smart" variant (Git 1.6.6 and later) also supports uploading
  • GIT: only for download

Refer to the documentation for benefits and disadvantages and for details on how to set these protocols up.

To create a personal repository on a Linux server accessible with SSH on the internet, I followed these steps:

  1. Create a user dedicated to Git (or use your personal account). The commits are already tagged with name and E-Mail; the git user is only used to determine read-only or read/write access to the repo. I copied the ~/.ssh/authorized_keys file from another account to enable private/public-key logins.

    $ ssh www.example.com
    $ su -
    # adduser -m git
    # su - git
    $ mkdir .ssh
    $ cp authorized_keys ~/.ssh
    

    Now, I can log in to the server with ssh git@www.example.com without being asked for a password. Note that every user who should access the Git repository via the SSH protocol must provide an SSH public key and can subsequently also log in on the server! (The git user can be configured to have no shell to disallow logging in, but that's out of the scope of this article.)

  2. Create a Git repository. Usually, a server repo should be a bare one where no working copy exists. On my server, an old version of Git does not know the --bare parameter, but this is how it should work:

    $ mkdir -p git/HelloWorld.git
    $ cd git/HelloWorld.git
    $ git init --bare
    

    Now, the new repository has the Git URL git@www.example.com:git/HelloWorld.git. It is a convention to name bare repositories with a .git extension. When cloning such a repository, the local directory is named without this extension by default.

  3. Add the new upstream repository to your local Git repository (where the working copy lives):

    $ git remote add myrepo git@www.example.com:git/HelloWorld.git
    $ git push myrepo master
    

    With git branch -a or gitk, you can see that branches of two remote repositories are listed now.
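
The situation with two remotes can also be rehearsed without any server. In this sketch, two local bare repositories stand in for GitHub and for the own server; the directory names are invented.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q --bare github-standin.git   # plays the role of origin
git init -q --bare HelloWorld.git       # plays the role of the own server

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email user@example.com
echo 'hello' > main.c
git add main.c
git commit -q -m "initial commit"

git remote add origin "$work/github-standin.git"
git remote add myrepo "$work/HelloWorld.git"
git push -q origin master
git push -q myrepo master

git remote -v       # lists both remotes with their URLs
git branch -a       # lists master plus one remote branch per remote
```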

Tips and Tricks

Amend to a commit

If you spot a typo in the commit message shortly after hitting ENTER, or forgot to compile the change and committed an error, the last commit can be amended. Use git commit --amend to edit the last commit message. To add a further change to the last commit, stage the additional (or corrected) file with git add first and then run git commit --amend.
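
A small sketch of both uses, with invented file names and messages: first the message is corrected, then a forgotten file is added to the same commit (--no-edit keeps the message as it is).

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git config user.name "Example User"
git config user.email user@example.com

echo 'int main(void) { return 0; }' > main.c
echo 'all: hello' > Makefile
git add main.c
git commit -q -m "inital comit"     # typo, and the Makefile was forgotten

# Fix the message of the last commit.
git commit -q --amend -m "initial commit"

# Add the forgotten file to the same commit, keeping the message.
git add Makefile
git commit -q --amend --no-edit

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
files=$(git ls-tree --name-only HEAD | tr '\n' ' ')
echo "$count commit(s): $subject [$files]"
```

Amending replaces the last commit with a new one, so it is best done before that commit has been pushed anywhere.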

Backdated Branch

Ever started committing changes and then determined that this should better have gone into a branch? It's easy to move the last N commits into a new branch:

  1. create the new branch (it contains all commits in the current state, thus also the ones that should be moved into it).
  2. reset the current branch (not the new one) to remove the additional commits.
  3. checkout the new branch and continue.
    $ git branch newbranch
    $ git reset --hard HEAD~3
    $ git checkout newbranch
    

Again: the reset is done on the master (or previous) branch where the last changes should be removed. The commits are preserved in the new branch.
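
To convince myself that nothing gets lost, the steps can be replayed in a scratch repository. The sketch creates four made-up commits and moves the last three (N = 3) into newbranch.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email user@example.com

# Four commits on master; the last three should have gone into a branch.
for i in 1 2 3 4; do
    echo "change $i" >> file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

git branch newbranch            # newbranch points at commit 4
git reset -q --hard HEAD~3      # master goes back to commit 1
git checkout -q newbranch       # continue working on the new branch

master_count=$(git rev-list --count master)
branch_count=$(git rev-list --count newbranch)
echo "master: $master_count commit(s), newbranch: $branch_count commit(s)"
```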

Source: Stack Overflow

Stashing

When changing between branches, the repository should be in a clean state. If you have modifications that are not yet ready for a commit but you need to change to another branch (e.g. for a hot fix), you can stash the pending changes. That's similar to a commit, but temporary.

$ git stash
$ git checkout otherbranch
# do some modifications, e.g. to fix an error
$ git commit -am "hotfix..."
$ git checkout firstbranch
$ git stash pop

The stashed changes can be applied to another branch or on the branch where you stashed them. The stashing mechanism uses a stack of stashes. The command git stash list shows the stack of stashes, and every line gives a hint on which branch it was created. See the book and the man-page git-stash for more details.
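
The hot-fix scenario from above as a self-contained sketch. The branch otherbranch and the files app.txt and notes.txt are made up; master plays the role of firstbranch.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email user@example.com

echo 'stable' > app.txt
echo 'draft' > notes.txt
git add app.txt notes.txt
git commit -q -m "initial commit"
git branch otherbranch          # made-up branch that needs the hot fix

# Work in progress that is not ready for a commit.
echo 'half-finished idea' >> notes.txt
git stash -q
git stash list                  # one entry, naming the branch it came from

# Switch away, fix, commit.
git checkout -q otherbranch
echo 'hotfix' >> app.txt
git commit -q -am "hotfix"

# Come back and restore the pending changes.
git checkout -q master
git stash pop -q

notes_tail=$(tail -n 1 notes.txt)
stash_entries=$(git stash list | wc -l | tr -d ' ')
git status --short              # notes.txt is modified again
```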

Import from Subversion

This part was moved to a dedicated article.

Display current branch in the Bash prompt

The bash-completion package contains a function that supports the display of the current Git branch in the Bash prompt.

  1. Check, if the function __git_ps1 exists:

    $ type __git_ps1
    

    If it displays a long function, proceed with step 2, otherwise, try the following:

    • Install with the package manager, look for packages like bash-completion, git-extras etc.
    • On Fedora, I found the file /usr/share/git-core/contrib/completion/git-prompt.sh and copied it to /etc/bash_completion.d/.
    • Google for that function and add it either to your local .bash_profile or to the global /etc/bash_completion.d.
  2. Include $(__git_ps1) in your PS1 definition. Example (with Colors):

    export PS1='\[\033[01;32m\]\u@\h\[\033[01;34m\] \w\[\033[01;33m\]$(__git_ps1)\[\033[01;34m\] \$\[\033[00m\] '
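
If __git_ps1 cannot be found at all, a tiny stand-in covers the basic case. This is only a sketch: the function name my_git_branch is my own invention, it shows nothing but the branch name, and it assumes a Git version that knows git symbolic-ref --short.

```shell
# Hypothetical minimal stand-in for __git_ps1: prints " (branch)" inside a
# Git working copy and nothing elsewhere (or on a detached HEAD).
my_git_branch() {
    local branch
    branch=$(git symbolic-ref --short -q HEAD 2>/dev/null) || return 0
    printf ' (%s)' "$branch"
}

# Prefer the real function when it is available.
if type __git_ps1 >/dev/null 2>&1; then
    PS1='\u@\h \w$(__git_ps1) \$ '
else
    PS1='\u@\h \w$(my_git_branch) \$ '
fi
```

The single quotes around the PS1 value matter: they delay the $(...) call until the prompt is actually displayed.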
    

Push into Working Repository

Usually, one should only push into a bare server repository. But sometimes, it happens that I clone a working repository from one PC (e.g. to my notebook) and later push the changes back to continue working on the PC. When pushing to a working repository, the .git directory is updated, but the checked-out working copy is not modified (it's a bit like the difference between fetch and pull). Note that recent Git versions refuse a push into the currently checked-out branch by default; the setting receive.denyCurrentBranch controls this.

Now, the checked out working copy is out of sync with the .git repository backend data. This can be fixed with the simple command:

$ git checkout -f HEAD

Warning: the current branch should be clean because that command will overwrite modifications.
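
The whole round trip as a local sketch, with two directories named pc and notebook standing in for the two machines. One assumption is needed to make the push possible at all on current Git versions: receive.denyCurrentBranch is set to ignore in the receiving repository.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# "pc" is the working repository that will receive the push.
git init -q pc
cd pc
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email user@example.com
# Assumption: allow pushes into the checked-out branch (refused by default).
git config receive.denyCurrentBranch ignore
echo 'first line' > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# Clone to the "notebook", work there, push back.
cd "$work"
git clone -q pc notebook
cd notebook
git config user.name "Example User"
git config user.email user@example.com
echo 'written on the notebook' >> notes.txt
git commit -q -am "notebook work"
git push -q origin master

# Back on the PC: the branch has the new commit, the files do not.
cd "$work/pc"
before=$(wc -l < notes.txt | tr -d ' ')
git checkout -q -f HEAD
after=$(wc -l < notes.txt | tr -d ' ')
echo "lines in notes.txt before: $before, after: $after"
```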

References

I will try to keep extending this little cheat sheet. But there's already a load of great tutorials on the 'net.
