What is the Git index

This is a recurring question: “Why keep the git index?” or, “Is the git index just a performance trick?”

Actually, the git index is a staging area. Every SCM (Source Code Management) system has one in some form; the difference is that git shows it to you and uses it effectively.

Git staging area

When you do:

git add some.file
git commit -m 'message'

The first command, git add some.file, does not commit the file, right? So where has it gone? Answer: into the staging area.

With the second command, git commit -m 'message', you finally commit the staged files. But what happens to the other modified files that weren't in the staging area? Are they committed? Answer: no. The new revision is the last revision updated with the version of some.file that was in the staging area, and nothing else.

So the index comes naturally to you: as soon as you run git add some.file, the file is in the index.
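To see this for yourself, here is a minimal sketch that can be run in a throwaway repository. The temporary directory, the placeholder identity and the second file name (other.file) are assumptions made for the example; only some.file comes from the text above.

```shell
#!/bin/sh
# Sketch: only what is in the staging area ends up in the commit.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo one > some.file
echo one > other.file                      # other.file is a hypothetical second file
git add some.file other.file
git commit -q -m 'initial'

echo two > some.file
echo two > other.file                      # modified, but never staged
git add some.file                          # only some.file goes into the index

staged=$(git diff --cached --name-only)    # index compared with HEAD
git commit -q -m 'message'

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)  # paths changed by the new commit
still_modified=$(git diff --name-only)     # working tree compared with index

echo "staged before commit: $staged"
echo "committed:            $committed"
echo "still modified:       $still_modified"
```

The commit should contain only some.file, while other.file stays modified in the working tree, waiting for its own git add.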

One special case exists:

git commit some.file

In this case, git assumes that you want to create a temporary staging area from the tip of the current branch (“HEAD”), update some.file in it, and commit the resulting state. After that state has been committed, the staging area is restored as it was before the commit, except that some.file is now updated there as well.

This operation (“save the current staging area, construct a new one, commit it, and then restore the staging area”) seems a bit illogical, since you would usually expect only one staging area. However, in practice it happens quite often that you slap your forehead, because you forgot to commit something very important. So, just edit the respective files, commit just these, and continue with what you were doing before you hurt yourself (slapping your forehead).
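The forehead-slapping scenario can be sketched as follows. The repository location, the identity and the file work.file (standing in for whatever you had already staged) are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: "git commit some.file" commits just that file and leaves
# previously staged work in the staging area.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo one > some.file
echo one > work.file                       # work.file is a hypothetical file
git add some.file work.file
git commit -q -m 'initial'

echo two > work.file
git add work.file                          # work in progress, staged for a later commit

echo two > some.file                       # the forgotten, very important change
git commit -q -m 'forgotten fix' some.file # commit only some.file

fix_commit=$(git diff-tree --no-commit-id --name-only -r HEAD)  # paths in the new commit
still_staged=$(git diff --cached --name-only)                   # what is left in the index

echo "committed:    $fix_commit"
echo "still staged: $still_staged"
```

If everything goes as described, the new commit touches only some.file, and work.file is still staged exactly as it was before.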

What is the Git index?

In essence: the Git index is a staging area for the next commit. For convenience, passing filenames explicitly to git commit builds a temporary staging area from the latest revision and the current version of the given files, and commits that state instead.

Git Merges and the index

Normally, a git user is rarely exposed to the index except when committing a revision. But there is one notable exception: merges.

When you merge the work of others, sometimes conflicts happen. These conflicts are put in the index.

That means the whole merge is done inside the index: the merge base (common ancestor), the current version, and the version from the branch to be merged are inserted into the index and combined with a three-way merge. If there are no conflicts, these three entries are collapsed into a single entry. Otherwise the three entries stay there (as stages 1, 2 and 3, respectively), and the merge result, with conflict markers, is written to the file in the working tree.
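The three entries can be inspected directly. The following is a sketch under stated assumptions: a throwaway repository, a placeholder identity, and a hypothetical branch named topic that changes some.file in a way that conflicts with the current branch.

```shell
#!/bin/sh
# Sketch: after a conflicting merge, the index holds three entries for the file.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo base > some.file
git add some.file
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)      # the default branch name varies

git checkout -q -b topic                   # topic is a hypothetical branch name
echo theirs > some.file
git commit -q -a -m 'topic change'

git checkout -q "$main"
echo ours > some.file
git commit -q -a -m 'main change'

# The merge stops with a conflict, so a non-zero exit status is expected here.
git merge topic >/dev/null 2>&1 || true

git ls-files --stage some.file             # mode, blob id, stage number, path
stages=$(git ls-files --stage some.file | awk '{print $3}' | xargs)
base_version=$(git show :1:some.file)      # stage 1: common ancestor
our_version=$(git show :2:some.file)       # stage 2: current branch
their_version=$(git show :3:some.file)     # stage 3: branch being merged
markers=$(grep -c '^<<<<<<<' some.file)    # conflict markers in the working tree file
```

Once you have resolved the conflict in some.file, git add some.file collapses the three entries back into a single one.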

Again, git is intelligent about what to show you upon a git diff: those entries which merged cleanly are already updated in the staging area, and it is unlikely that you want to see these differences right now, because you have to fix up the conflicts, if there are any. So git diff leaves the cleanly merged files out and shows you a combined diff, i.e. a simultaneous diff of the merged-with-conflicts file against both the current version and the version in the branch to be merged.
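A small experiment makes the difference visible. As before, this is only a sketch: the throwaway repository, the placeholder identity, the branch name topic and the second file clean.file (which only the other branch changes, so it merges cleanly) are assumptions for the example.

```shell
#!/bin/sh
# Sketch: during a conflicted merge, plain "git diff" shows only the conflict.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumed location)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo base > some.file
echo base > clean.file                     # clean.file is a hypothetical file
git add some.file clean.file
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)      # the default branch name varies

git checkout -q -b topic                   # topic is a hypothetical branch name
echo theirs > some.file
echo theirs > clean.file
git commit -q -a -m 'topic change'

git checkout -q "$main"
echo ours > some.file                      # conflicts with topic; clean.file does not
git commit -q -a -m 'main change'

# The merge stops with a conflict, so a non-zero exit status is expected here.
git merge topic >/dev/null 2>&1 || true

git diff                                   # combined diff of the conflicted file
first_line=$(git diff | head -n 1)
mentions_clean=$(git diff | grep -c 'clean.file' || true)
staged_clean=$(git diff --cached --name-only --diff-filter=M)  # cleanly merged, already staged
```

The combined diff is recognizable by its "diff --cc" header; the cleanly merged clean.file only shows up when you ask for the staged changes with git diff --cached.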