Git Unite - Fix Case Sensitive File Paths on Windows


Git Unite is a utility that fixes file path case mismatches in a git repository index on Windows. Because the Windows file system is case insensitive, the mismatch does not show itself until someone browses the repository on GitHub or clones it to a case sensitive file system on Linux.

Introducing case sensitive file paths into the git index on a case insensitive operating system like Windows is easier than you think. A simple 'git mv .\Where\Waldo where\is\Waldo' is all it takes to create two separate paths in the git index, while the Windows working directory reports only one. There might be git config settings that help avoid this problem, but controlling the settings and behavior of 20+ contributors on a project team is nearly impossible.
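As a rough illustration of the two-paths situation, the Python sketch below groups the directory portion of each index path by its case-folded spelling and reports any directory that git has recorded under more than one spelling. The function name colliding_directories and the sample paths are assumptions made for this example; it is not part of Git Unite.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def colliding_directories(index_paths):
    """Return groups of directory spellings that differ only by letter case."""
    spellings = defaultdict(set)
    for path in index_paths:
        # Every ancestor directory of the file counts, not just the immediate one.
        for parent in PurePosixPath(path).parents:
            if str(parent) != ".":
                spellings[str(parent).casefold()].add(str(parent))
    # Keep only the directories that appear under more than one spelling.
    return sorted(sorted(group) for group in spellings.values() if len(group) > 1)


# Hypothetical index contents after a hand-typed lowercase 'git mv' target.
sample_index = ["Where/Is/He", "Where/IsHere", "where/is/Waldo"]
print(colliding_directories(sample_index))
# prints [['Where', 'where'], ['Where/Is', 'where/is']]
```

On a case sensitive file system each spelling in a group becomes its own directory, which is why the problem only becomes visible after leaving Windows.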

The problem is exacerbated when hundreds of files are moved during a repository layout reorganization. If the user moving the files is not careful, these case sensitive path names will pollute the git index but appear fine in the working directory. Cleaning up these case sensitive file path issues on Windows is tedious, and this is where Git Unite helps out.

Git Unite will search the git repository index for file paths that do not match the same case that Windows is using. For each git index path case mismatch found, Git Unite will update the git index entry with the case reported by the Windows file system.
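The search step can be sketched in a few lines of Python. This is only an illustration of the idea, not Git Unite's actual implementation: the function name find_case_mismatches is hypothetical, and both path lists are assumed to be supplied by the caller with forward slashes as separators.

```python
def find_case_mismatches(index_paths, disk_paths):
    """Return (index_path, disk_path) pairs that differ only by letter case."""
    # Map each case-folded path to the spelling the file system reports.
    on_disk = {path.casefold(): path for path in disk_paths}
    mismatches = []
    for index_path in index_paths:
        disk_path = on_disk.get(index_path.casefold())
        # A mismatch is the same path, ignoring case, spelled differently.
        if disk_path is not None and disk_path != index_path:
            mismatches.append((index_path, disk_path))
    return mismatches


# Hypothetical inputs: what git recorded versus what Windows reports.
index_paths = ["Where/Is/He", "Where/IsHere", "where/is/Waldo"]
disk_paths = ["Where/Is/He", "Where/Is/Waldo", "Where/IsHere"]
print(find_case_mismatches(index_paths, disk_paths))
# prints [('where/is/Waldo', 'Where/Is/Waldo')]
```

Each pair names an index entry that would need to be renamed to the spelling used on disk.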

Usage

Usage: Git.Unite [OPTIONS]+ repository
Unite the git repository index file paths with current Windows case usage.
If no repository path is specified, the current directory is used.

Options:
      --dry-run              dry run without making changes
  -h, --help                 show this message and exit
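
When several repositories need checking, the command line above can be assembled from a script. The Python sketch below assumes Git.Unite.exe is on the PATH; the helper names unite_command and run_unite are hypothetical, and only the --dry-run option from the usage text is used.

```python
import subprocess


def unite_command(repository=None, dry_run=False):
    """Build the Git.Unite argument list: options first, then the repository."""
    command = ["Git.Unite.exe"]
    if dry_run:
        command.append("--dry-run")
    if repository is not None:
        # Without a repository argument the tool uses the current directory.
        command.append(repository)
    return command


def run_unite(repository=None, dry_run=True):
    """Run Git.Unite, defaulting to a dry run so nothing changes by accident."""
    return subprocess.run(unite_command(repository, dry_run), check=True)


print(unite_command(r"C:\demo", dry_run=True))
```

Defaulting the wrapper to a dry run is a design choice for this sketch, so that a real run has to be requested explicitly.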

History

I work on a project that has one particular git repository tracking over 7,000 files. The repository contains a mixture of ASP.NET MVC3 code, SQL Server SSIS ETL packages, and PowerShell scripts. It all started one day when an ETL developer could not locate the package she developed on the GitHub web site.

I took a look at the git repository on her machine and the ETL package was clearly there under an Etl\Some\Dir\Path folder. The repository reported being up to date with origin/master, but it took several minutes before I noticed an etl and Etl folder on the GitHub web site.

It turns out that the ETL team was in the process of reorganizing the ETL packages into a new directory layout. I booted up a VM running Ubuntu and cloned the repository down to a case sensitive file system. I found 694 ETL files that were tracked in the git index with a directory path case different from the one reported by the Windows file system.

I fixed the problem by using a combination of find, sort, and awk to build a bash script to run the 694 git mv commands. This was a painful process that I did not want to repeat so I decided to build a tool anyone on the team could use on Windows to fix the problem.
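The original script was built with find, sort, and awk and is not reproduced here. As a stand-in, the Python sketch below shows the general shape of such a generated script: given (old, new) path pairs, it emits one git mv per file. The function name build_git_mv_script is hypothetical, and the output is assumed to run in a clone on a case sensitive file system, such as the Ubuntu VM.

```python
import posixpath
import shlex


def build_git_mv_script(renames):
    """Return a bash script that renames each (old, new) path pair with git mv."""
    lines = ["#!/bin/bash", "set -e"]
    for old, new in sorted(renames):
        parent = posixpath.dirname(new)
        if parent:
            # git mv does not create a missing destination directory.
            lines.append(f"mkdir -p {shlex.quote(parent)}")
        lines.append(f"git mv {shlex.quote(old)} {shlex.quote(new)}")
    return "\n".join(lines) + "\n"


print(build_git_mv_script([("where/is/Waldo", "Where/Is/Waldo")]))
```

Quoting each path with shlex.quote keeps file names that contain spaces from breaking the generated commands.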

In fact, two months later the same issue appeared again in a different repository. This time I was able to install the Git Unite utility on the user’s machine and fix the issue in a couple of minutes. We tracked down the source of the problem to a developer who hand-typed the target directory of a git mv command in all lowercase.

Example Scenario

Here is a representative example, using Posh-Git on Windows 7, of how someone can introduce case sensitive file paths on a case insensitive file system.

Step 1 – Create a new git repository and push it to GitHub

C:\demo> mkdir Where
C:\demo> touch .\Where\Waldo
C:\demo> touch .\Where\IsHere
C:\demo> git init .
Initialized empty Git repository in C:/demo/.git/
C:\demo [master +1 ~0 -0 !]> git add .
C:\demo [master +2 ~0 -0]> git commit -m initial
[master (root-commit) 42ea0fc] initial
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Where/IsHere
 create mode 100644 Where/Waldo

C:\demo [master]> git remote add origin git@github.com:tawman/waldo.git
C:\demo [master]> git push -u origin master
Counting objects: 4, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (4/4), 265 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
To git@github.com:tawman/waldo.git
 * [new branch]      master -> master
Branch master set up to track remote branch master from origin.

When we look on GitHub, the repository appears as expected:

[Image: Initial repository as seen on GitHub]

Step 2 – Start asking some questions

C:\demo [master]> mkdir .\Where\Is
C:\demo [master]> touch .\Where\Is\He
C:\demo [master +1 ~0 -0 !]> git add -A
C:\demo [master +1 ~0 -0]> git commit -m "Good Question"
[master 3d9006e] Good Question
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Where/Is/He

Keep a close eye on where Waldo is going...

C:\demo [master]> git mv .\Where\Waldo where\is\Waldo
C:\demo [master +0 ~1 -0]> git commit -m "Find Me"
[master 35f843b] Find Me
 1 files changed, 0 insertions(+), 0 deletions(-)
 rename {Where => where/is}/Waldo (100%)

C:\demo [master]> find Where
Where
Where/Is
Where/Is/He
Where/Is/Waldo
Where/IsHere
C:\demo [master]> ls


    Directory: C:\demo


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
d----         1/12/2013  10:54 PM            Where

Seems quite obvious Where Waldo is, but let’s check what GitHub thinks:

C:\demo [master]> git push
Counting objects: 11, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (9/9), 683 bytes, done.
Total 9 (delta 1), reused 0 (delta 0)
To git@github.com:tawman/waldo.git
   42ea0fc..35f843b  master -> master

It would appear that git and GitHub have narrowed down the location of Waldo to one of two possible locations:

[Image: GitHub is not exactly sure where he is at]

Step 3 – Let the confusion begin

C:\demo [master]> ls .\Where\Is\Waldo


    Directory: C:\demo\Where\Is


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---         1/12/2013  10:50 PM          0 Waldo

According to Windows, Waldo should be hanging out right here:

[Image: Is he here?]

Unfortunately, according to git he is hanging out over there:

[Image: Or is he here?]
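
One way to see the spelling git has recorded, independent of what Windows shows, is to list the index with git ls-files. The Python sketch below wraps that command; the helper names split_nul_paths and list_index_paths are hypothetical, and git is assumed to be on the PATH and recent enough to support the -C option.

```python
import subprocess


def split_nul_paths(raw):
    """Split NUL-terminated 'git ls-files -z' output into a list of paths."""
    return [entry.decode("utf-8") for entry in raw.split(b"\0") if entry]


def list_index_paths(repo="."):
    """Return the paths recorded in the index, exactly as git spells them."""
    result = subprocess.run(
        ["git", "-C", repo, "ls-files", "-z"],
        check=True,
        capture_output=True,
    )
    return split_nul_paths(result.stdout)


# Parsing a captured sample instead of calling git, so the snippet runs anywhere.
sample = b"Where/Is/He\0Where/IsHere\0where/is/Waldo\0"
print(split_nul_paths(sample))
# prints ['Where/Is/He', 'Where/IsHere', 'where/is/Waldo']
```

The -z option makes git emit paths verbatim and NUL-terminated, which avoids any quoting of unusual file names.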

Step 4 – Get everyone back on the same page with Git Unite

C:\demo [master]> Git.Unite.exe C:\demo
C:\demo [master +0 ~1 -0]> git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       renamed:    where/is/Waldo -> Where/Is/Waldo
#
C:\demo [master +0 ~1 -0]> git commit -m fixed
[master 4495f40] fixed
 1 files changed, 0 insertions(+), 0 deletions(-)
 rename {where/is => Where/Is}/Waldo (100%)
C:\demo [master]> git push
Counting objects: 7, done.
Delta compression using up to 6 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 354 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
To git@github.com:tawman/waldo.git
   35f843b..4495f40  master -> master

Git Unite clears up the confusion by reconciling the git index file path with the case Windows is using. When I go back and look at the repository on GitHub, there is only one place Where Waldo could be:

[Image: Everyone is back Where expected]

As far as Windows was concerned, Waldo was here the whole time:

[Image: I knew he was here the whole time]
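
The case Windows is using can be read back by walking the working directory, since the file system reports each name with the case it was created with. The Python sketch below is one way to collect those names in the forward-slash form git uses; the function name list_disk_paths is hypothetical, and this is not a description of how Git Unite itself reads the file system.

```python
import os


def list_disk_paths(root):
    """Return working tree file paths relative to root, using '/' separators."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune git's own metadata directory so it is never descended into.
        dirnames[:] = [name for name in dirnames if name != ".git"]
        for name in filenames:
            relative = os.path.relpath(os.path.join(dirpath, name), root)
            paths.append(relative.replace(os.sep, "/"))
    return sorted(paths)


if __name__ == "__main__":
    for path in list_disk_paths("."):
        print(path)
```

Comparing this list against the paths in the index, ignoring case, shows which entries are spelled differently.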
