Ever found an old copy of a repository and didn't know its purpose or state?
Was it just a test? Are there modifications that were never pushed anywhere?
When was it cloned in the first place? And when was it last used?
I didn't find any methods in Git itself or in third-party tools to query this information
(okay, I didn't search very thoroughly).
This little script demonstrates what can be found by digging through the .git directory.
The script is on GitHub.
Identify a Git repository
A Git repository carries its meta-information in a .git directory located
in its base directory. If the current working directory is a subdirectory,
the path must be followed upwards towards the root directory until a .git directory is found.
DIR=''
ORIG_DIR=$PWD
if [[ -d .git ]]; then
    DIR=$ORIG_DIR
else
    # walk upwards until a .git directory or the root directory is reached
    while [[ $PWD != / ]]; do
        cd ..
        if [[ -d .git ]]; then
            DIR=$PWD
            break
        fi
    done
fi
if [[ -z $DIR ]]; then
    echo "ERROR: no .git directory found in path '$ORIG_DIR'"
    exit 1
fi
These lines check whether there is a .git directory in the current working directory.
If there is none, the script steps upwards until either a .git directory is found
or the root directory is reached.
If the loop ends because of the while condition, the variable DIR is still empty
and an error message is printed.
This code changes to the base directory of the Git repository.
Because the script runs in its own process, it does not need to change back to the original
directory afterwards. The working directory of the calling shell is not changed.
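The same upward search can also be done without changing the working directory at all. The following is a minimal sketch, not the script's own code: find_git_root is a hypothetical helper name, and it only strips path components from a string until a .git directory turns up.

```bash
# find_git_root is a hypothetical helper, not part of the original script.
# It prints the nearest ancestor of the given directory (default: current
# directory) that contains a .git directory, and returns 1 if there is none.
find_git_root() {
    local dir
    # resolve to an absolute path in a subshell; the caller's $PWD is untouched
    dir=$(cd "${1:-.}" && pwd) || return 1
    while :; do
        if [ -d "$dir/.git" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        [ "$dir" = / ] && return 1   # reached the root without a match
        dir=${dir%/*}                # strip the last path component
        dir=${dir:-/}                # "/tmp" becomes "", which means "/"
    done
}

if DIR=$(find_git_root); then
    echo "repository root: $DIR"
else
    echo "ERROR: no .git directory found in path '$PWD'" >&2
fi
```

Because nothing calls cd outside the command substitution, this variant could also be sourced into an interactive shell without moving the user somewhere else.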
Age of the repo
The age of the repo can be derived from the oldest file in the .git directory.
This is probably not the most robust approach. The logs directory, explained later,
is a better source for this information.
# try to derive the age (date of init or clone) from .git files
# (use oldest file in .git directory)
echo -n "init'ed or clone'd most probably on: "
stat -c '%Y %y %n' "$DIR"/.git/* | sort -n | head -n1 | awk '{print $2 " " $3 " (" $5 ")" }'
The stat utility is given a format string that prints, for each file, the modification time
as seconds since the UNIX epoch, the same time in a readable format, and the file name.
This list is sorted, the first line is extracted, and columns 2 and 3 (date, time) and 5 (file name)
are printed. Column 4, the time zone offset, is skipped.
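To see how the columns line up, the pipeline can be tried on files with known timestamps. This is only a sketch under a few assumptions: it needs GNU stat and GNU touch, oldest_entry is a hypothetical helper name, and file names containing spaces would shift the awk columns.

```bash
# oldest_entry is a hypothetical helper; it relies on GNU stat (coreutils).
# It prints "date time (path)" for the entry with the oldest modification
# time directly inside the given directory.
oldest_entry() {
    stat -c '%Y %y %n' "$1"/* |
        sort -n |      # numeric sort on the epoch seconds in column 1
        head -n1 |     # oldest entry comes first
        awk '{ print $2, substr($3, 1, 8), "(" $5 ")" }'   # drop fractional seconds
}

# demo directory with two files whose modification times are set by hand
demo=$(mktemp -d)
touch -d '2001-02-03 04:05:06' "$demo/old"
touch -d '2020-01-01 00:00:00' "$demo/new"
oldest_entry "$demo"
```

The last line should report the file named old together with the date 2001-02-03 and the time 04:05:06.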
A basic piece of information is the file .git/description, which can be set for every repository.
It appears not to be used by the git tools themselves, but might be read by other tools (GitWeb)
or hooks.
If the file exists and does not contain the default text ("Unnamed repository..."),
its content is printed.
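The check described here could look roughly like the following. It is a sketch rather than the script's actual code: print_description is a hypothetical helper name, and the output label is made up.

```bash
# print_description is a hypothetical helper, not taken from the original script.
# It prints the repository description unless the file is missing, empty,
# or still starts with the default text "Unnamed repository".
print_description() {
    local file=$1/.git/description
    [ -s "$file" ] || return 0                 # missing or empty: nothing to print
    case $(head -n1 "$file") in
        "Unnamed repository"*) ;;              # default text: stay silent
        *) printf 'Description: %s\n' "$(cat "$file")" ;;
    esac
}

# ${DIR:-.} falls back to the current directory if DIR is not set
print_description "${DIR:-.}"
```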
Remote links
The remote links are printed by git remote -v, but each line is annotated with (fetch) or (push).
To see just the links, the second column is cut out, sorted, and de-duplicated:
echo -n "Remote links: "
git remote -v | cut -f2 | cut -d' ' -f1 | sort | uniq
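The effect of the two cut calls is easier to follow on canned input. The sketch below feeds sample text in the layout of git remote -v (name, TAB, URL, space, annotation) through the same pipeline. The remote names and example.com URLs are placeholders, and remote_urls is a hypothetical helper name.

```bash
# Sample text in the layout of `git remote -v`; all values are placeholders.
sample=$(printf '%s\t%s\n' \
    origin 'https://example.com/a.git (fetch)' \
    origin 'https://example.com/a.git (push)' \
    backup 'ssh://example.com/b.git (fetch)' \
    backup 'ssh://example.com/b.git (push)')

# remote_urls is a hypothetical helper: it reads that layout on stdin
# and prints each distinct URL once.
remote_urls() {
    cut -f2 |            # drop the remote name (TAB-separated column 1)
        cut -d' ' -f1 |  # drop the "(fetch)" / "(push)" annotation
        sort -u          # same effect as sort | uniq
}

printf '%s\n' "$sample" | remote_urls
```

The four input lines collapse to two, one per distinct URL.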
SVN connection
The command git svn info should reveal any ties to a Subversion repository.
If the command reports an error, the output is suppressed:
SVNINFO=$(git svn info 2>&1)
if [[ ! "$SVNINFO" =~ ^Unable ]]; then
    echo -n "Git-SVN info: " $SVNINFO
fi
Last commit
The commit log can be flexibly formatted with git log. The script uses:
echo -n "Last commit: "
git --no-pager log --all -n1 --format="${COLGITHASH}%h ${COLGITDATE}%ci${COLGITRESET}%d ${COLGITSUBJECT}%s${COLGITRESET}"
The variables $COLGIT* contain git log color placeholders of the form %C(...), as documented in its man page.
They can be set to empty strings to suppress color (see the final script linked above for the whole picture).
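How the color variables plug into the format string can be sketched as follows. The variable names are the ones used above, but the color values, the helper name build_log_format, and its on/off argument are assumptions made for illustration. The real values are in the full script.

```bash
# build_log_format is a hypothetical helper. The COLGIT* names come from the
# script; the colour values chosen here are assumptions.
# Argument: 1 = coloured output (default), anything else = plain output.
build_log_format() {
    local COLGITHASH='' COLGITDATE='' COLGITSUBJECT='' COLGITRESET=''
    if [ "${1:-1}" = 1 ]; then
        COLGITHASH='%C(yellow)'
        COLGITDATE='%C(green)'
        COLGITSUBJECT='%C(bold)'
        COLGITRESET='%C(reset)'
    fi
    printf '%s\n' "${COLGITHASH}%h ${COLGITDATE}%ci${COLGITRESET}%d ${COLGITSUBJECT}%s${COLGITRESET}"
}

build_log_format 0   # plain variant: %h %ci%d %s
build_log_format 1   # coloured variant
```

Inside a repository the result could be passed on as git --no-pager log --all -n1 --format="$(build_log_format 0)".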
Git logs
The directory .git/logs holds files with a history log for various refs.
The HEAD file records the initialization or cloning, any pushes, fetches and pulls,
as well as commits and checkouts.
The first line tells when the repo was created by initialization
or cloning. In the latter case, the clone source is given.
Depending on verbosity, the script prints either the first and last entry or the full
history.
Tokenizing the lines is a bit tricky because, after the two hashes, the
name can be one or multiple words. The e-mail address is enclosed in angle brackets.
It is followed by a UNIX time stamp (seconds since epoch), the time zone, and a description
of the action.
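To make the layout concrete, here is a made-up log line split into its parts with a single sed expression. This is an alternative sketch, not the approach the script takes: the hashes, name, address, and URL are invented, split_reflog_line is a hypothetical helper name, and the pattern assumes the name itself contains no " <".

```bash
# split_reflog_line is a hypothetical helper. It prints
# name|mail|time|zone|action for one line of a file below .git/logs.
split_reflog_line() {
    printf '%s\n' "$1" |
        sed -E 's/^([0-9a-f]+) ([0-9a-f]+) (.*) <([^>]*)> ([0-9]+) ([+-][0-9]{4})[[:space:]]*(.*)$/\3|\4|\5|\6|\7/'
}

# invented sample line; the action is separated from the time zone by a TAB
line=$(printf '%s\t%s' \
    '0000000000000000000000000000000000000000 1234567890abcdef1234567890abcdef12345678 Jane Q. Doe <jane@example.com> 86400 +0000' \
    'clone: from https://example.com/repo.git')

# read the five fields back into variables
IFS='|' read -r NAME MAIL TIME ZONE ACTION <<EOF
$(split_reflog_line "$line")
EOF
echo "$ACTION at $TIME $ZONE by $NAME <$MAIL>"
```

The multi-word name ends up in one variable because the pattern anchors on the angle brackets around the address rather than on spaces.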
function tokenize_log()
{
    read -r REV0 REV1 REST < <(echo "$@")       # prev. and current revision sha1, remainder
    echo_v3 "REV0 = '$REV0'"                    # echo_v3 prints only on verbosity >= 3
    echo_v3 "REV1 = '$REV1'"
    NAME=${REST%% <*}                           # Name is up to first angle bracket
    echo_v3 "NAME = '$NAME'"
    REST=${REST#* <}                            # Remainder is after first bracket
    MAIL=${REST%%>*}                            # Mail is up to closing angle bracket
    echo_v3 "MAIL = '$MAIL'"
    REST=${REST#*> }                            # Remainder is after angle bracket
    read -r TIME ZONE ACTION < <(echo "$REST")  # Time, Zone, Action (multiple words)
    echo_v3 "TIME = '$TIME'"
    echo_v3 "ZONE = '$ZONE'"
    DATE=$(date -d "@$TIME" +'%Y-%m-%d %H:%M:%S')  # convert UNIX time stamp into readable date
    echo_v3 "ACTION = '$ACTION'"
    if [[ -z $ACTION ]]; then
        ACTION="git init"                       # if no action: it was a `git init`
    fi
}
if [[ $PARAM_VERBOSE -ge 1 ]]; then
    # full history: loop over all lines of .git/logs/refs/heads/master
    echo -e "History of master"
    while read -r LINE; do
        # function fills global variables $REV0, $REV1, ..., $ACTION, $DATE, $NAME, $MAIL
        tokenize_log "$LINE"
        # print in a convenient format
        echo -e " $ACTION on $DATE by $NAME $MAIL"
    done < "$DIR/.git/logs/refs/heads/master"
else
    # print only the first and last line of .git/logs/HEAD
    tokenize_log "$(head -n1 "$DIR/.git/logs/HEAD")"
    echo -e "Source of this Repo: $ACTION on $DATE"
    tokenize_log "$(tail -n1 "$DIR/.git/logs/HEAD")"
    echo -e "Last action: $ACTION on $DATE by $NAME $MAIL"
fi
The function tokenize_log receives a line of the log file and fills the global variables.
The echo_v3 function prints the debug output only if the verbosity level is three or higher.
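The echo_v3 helper itself is not shown in this post. A minimal sketch of what it might look like follows, assuming that PARAM_VERBOSE holds the verbosity as an integer (as in the checks above) and that debug output should go to stderr. Both the body and the choice of stderr are guesses, and the real definition is in the full script.

```bash
# Assumed shape of echo_v3; the body is a guess, not the script's code.
# Prints its arguments only when PARAM_VERBOSE is 3 or higher.
echo_v3() {
    if [ "${PARAM_VERBOSE:-0}" -ge 3 ]; then
        printf '%s\n' "$*" >&2    # stderr keeps debug lines out of the normal output
    fi
    return 0                      # never fail, even when nothing is printed
}

PARAM_VERBOSE=3
echo_v3 "NAME = 'example'"        # printed
PARAM_VERBOSE=1
echo_v3 "NAME = 'example'"        # suppressed
```

Returning 0 unconditionally means a caller such as tokenize_log would keep working under set -e even when the message is suppressed.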
The same output is created for all remotes:
for D in "$DIR"/.git/logs/refs/remotes/*; do
    REMOTE=${D##*/} # extract last part of path
    # get name of server
    REMSERVER=$(git remote -v | grep "$REMOTE" | cut -f2 | cut -d' ' -f1 | sort | uniq)
    echo -e "History of $REMOTE ($REMSERVER)"
    if [[ $PARAM_VERBOSE -ge 1 ]]; then
        # full history
        while read -r LINE; do
            tokenize_log "$LINE"
            echo -e " $ACTION on $DATE by $NAME $MAIL"
        done < "$D/master"
    else
        # only first and last entry
        tokenize_log "$(head -n1 "$D/master")"
        echo -e " First action: $ACTION on $DATE"
        tokenize_log "$(tail -n1 "$D/master")"
        echo -e " Last action: $ACTION on $DATE by $NAME $MAIL"
    fi
done