dVCS

Distributed Version Control Systems: Git vs. Mercurial

Assuming you've already decided that you want to use a dVCS (you do, right?), you now need to figure out which one to use, maybe even which is "best". I don't pretend to know which is best, but I am more than willing to say which I prefer (git) and why.

Here's my contribution at the end of a long Nov-2006 fedora-maintainers discussion:

From: Jim Meyering
Subject: Re: dist-hg proof-of-concept ready for use

On the Fedora version control wiki, I was glad to see the plan to use a distributed VCS.

However, I noticed that the comparison lists "Higher disk space requirements" as one of the "Git Cons". I'd like to point out that in my experience[*], git repository size is on par with that of mercurial, and these days, even smaller.

Jim

[*] I've been maintaining upstream coreutils for ages, and have wanted to switch development to a dVCS for the last year. I finally made the leap just last month. Over the months leading up to my decision/switch, I converted the 100MB coreutils CVS repository to git and to hg many times, using various tools and experimenting with evolving versions of git and hg. The resulting repositories were nearly the same size (in April, that size was 65-70MB) in the early months. Now the git repo, even with many more deltas, is smaller still, at just 57MB. Of course, if you use a conversion mechanism that does not periodically repack, you'll find that the git repository takes far _more_ space than the mercurial one -- more than 10 times the space in my case. The tool I used, git-cvsimport, does this repacking every thousand deltas, but older versions of git-cvsimport and tailor didn't, and you had to repack manually. Even over the last few months, git's packing algorithms have improved. When I first began considering the switch to a dVCS, early this year, mercurial did have a small repo-size advantage. Now, git may have the advantage -- I haven't converted to hg in a while. However, at least with respect to repository size, these two tools are so close that this criterion is no longer relevant when deciding which to use.

I've been using both hg and git for different projects, and I've spent a long time mulling over which to use for coreutils. I settled on git not just because of its larger developer base, but also because of its developers' mind set. I have the impression that there are far more people contributing to git development than to mercurial development. As you might expect, with so many more developers, git seems to have more features/tools than hg, but if you're new to the use of a dVCS this isn't immediately apparent. Also, I've been very impressed with (git maintainer) Junio's responsiveness to my few git bug reports.

Regarding developer "mind set," I found and reported a serious mercurial bug: Title: "hg commit non-existent f1 f2" succeeds!?!. But it has been over two months now, and it's still not fixed. In git, I'm confident that such a bug would have been fixed in days, or even hours. [update, it was marked as fixed on 2006-12-11]

In a similar vein (but not a critical bug), I reported what I thought was "obviously" a bug: "hg diff b a" should print the diffs for "b" first. That behavior seemed so patently counter-intuitive to me that I was totally taken aback to find the developers debating at length, arguing that this behavior is _desirable_. Disappointingly, last I heard, this will not be changed.

For the record, I too have been frustrated by the fact that the latest versions of those tools were too slow to make it into distribution. So I've been using cutting-edge versions (pulled and built every few days) of both hg and git for the better part of this year. I am glad that savannah now offers support (brand new) for git. If you want to try out a web interface, here's one for coreutils.

After converting coreutils development to git, I was motivated to convert gnulib, too. Here is some of the resulting discussion, e.g., convincing the other gnulib developers that it'd be good to switch:
  • switching gnulib from CVS to a dVCS
  • version control system
  • comparing gnulib repo sizes: git vs. cvs; $Id$ strings

Some more links:

  • Keith Packard switched to git and wrote Repository Formats Matter.
  • As for monotone and bzr, they're listed in this comparison of decentralized/distributed VCS.

    I searched a little, e.g., via this Google query, but didn't find anything compelling. Even a survey that's just 6 months old is seriously out of date. That was evident in the fedora-maintainers discussion.


    © 2007-2010 meyering.net | updated Mon 28-Jan-2008 3:10 PM