Blog Closed

This blog has moved to GitHub. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Thursday, September 24, 2009

Version Control Systems

After a discussion earlier this week, I created a poll on parrot.org about version control systems. The question is simple: Which VCS would you prefer Parrot to use? Of course, it's just a non-binding straw poll, but if there is overwhelming support for one option over the others, it might inform and even motivate such a decision later.

My personal opinion is that I would like to migrate Parrot from SVN to Git.

Here's some background: I've never really used Git and don't know a lot about using it. Further, all my distributed development work, on Parrot and on other smaller projects, has always used SVN. A few months ago I won a major victory at work when I convinced my boss to use SVN for all our in-house development. I set up and actively maintain the SVN server at work, teach my coworkers to use it and use it better, and am happy with the results overall.

pmichaud said it best the other day when I asked about it on IRC:
When rakudo needed to switch repositories earlier in the year, I put out a call to see whether people thought we should stick with svn versus moving to git. I got a lot of feedback. However, the clinching argument is that *nobody* argued in favor of SVN based on it's merits of being easier to use or having superior technology. The only reason that anybody had with staying with SVN was because that was what we had been using previously. So if it comes down to "influential arguments" as to making one choice versus another, I'd like to see informed reasons for staying with SVN. We clearly have a lot of Parrot developers who can give informed reasons for moving to Git

In the back of my head I am reminded of those grizzled old programmers who don't understand why anybody would ever write anything in a language besides Fortran or COBOL. I remember an argument I read once, back in the days when I was still young and stupid enough to participate in programming language flame wars on anonymous internet message boards, that said: "Perl is written in C, so anything Perl can do C can do but faster". Sure, it's faster if you don't count all the extra development and debugging time, if you take the time to properly refactor and optimize your algorithms, and if you aren't worried about problems like leaking memory or security issues. But I digress. C, Fortran, COBOL, and Perl all have their uses (C is great for writing infrastructure like Parrot, Perl is great for most other things, and Fortran and COBOL are great for being rewritten in C or Perl).

SVN is great for some things. It's centralized, and although that brings a certain amount of rigidity, it also brings some assurances and simplifications. Sure, you can use Git in a way similar to SVN, but you don't need to, and that freedom can be scary for some people. Sometimes you want one repository, strict revision numbering, and an unambiguous current state of the code. Sometimes you want to know that all your code is safe on a redundant SVN server, and that people don't have "checked in" changes on their local machines that aren't being included in the nightly tape backups. There are plenty of good reasons for workplaces to use SVN, and I think it's a great choice for many of those situations. It's a particular solution to a particular problem.

At work we moved to SVN because I could make arguments, impassioned and influential, that SVN was better than our previous system (externally-hosted VSS). And my arguments were good and true. The migration was not completely pain-free; there are some issues that our team is still working out months after the fact. For instance, the mechanics of branching, tagging, and merging are still foreign to some people who have never had to deal with them before. Of course, having problems using powerful new tools effectively is far better, in my opinion, than not having those tools at all. We went from a system with absolutely zero branches and tags to having dozens of each. We started with a culture having almost zero code sharing or collaboration between developers, and are gradually learning to embrace these ideas. SVN is the learning tool that makes these new ideas possible.

Git is a little more complicated, if only at the conceptual level. When I first started trying to learn Git I was bombarded by all sorts of terminology, long pages about philosophy, and page after page of very pretty charts and graphs. It was overwhelming and I wrote it off as being more hassle than it was worth. However, since my initial forays into the topic I've learned a lot more about it and have become much more comfortable with the ideas. I find use cases almost every single day where I have problems in SVN that could be easily and elegantly solved if I were using Git. Sometimes I want to make a branch where I can just make a few test commits. If those little "commits" turn into a huge SVN patch, it becomes much less easy to review and apply.
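To make the "few test commits" idea concrete, here is a minimal sketch in Python that drives the git command line. It is only an illustration under stated assumptions: git is installed and on the PATH, and the helper names (git, demo_scratch_branch), the branch name "scratch", and the file names are all made up for the example. It makes three small commits on a local branch, then asks Git for one patch per commit, so each little change stays separately reviewable instead of piling up into one big diff.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run one git command inside `repo` and return its stdout as text."""
    cmd = [
        "git",
        # Throwaway identity so commits work on a machine with no git config.
        "-c", "user.name=Example",
        "-c", "user.email=example@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def demo_scratch_branch():
    """Make small commits on a local branch and export one patch per commit."""
    repo = Path(tempfile.mkdtemp())
    try:
        git(repo, "init", "-q")
        notes = repo / "notes.txt"  # hypothetical file name
        notes.write_text("start\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", "-m", "initial commit")
        # The default branch name varies between installs, so ask git for it.
        base = git(repo, "symbolic-ref", "--short", "HEAD")

        # A purely local branch for experiments; nothing is shared yet.
        git(repo, "checkout", "-q", "-b", "scratch")
        for i in range(3):
            with notes.open("a") as fh:
                fh.write(f"experiment {i}\n")
            git(repo, "commit", "-q", "-am", f"experiment {i}")

        # List the commits that exist on scratch but not on the base branch.
        subjects = git(repo, "log", "--reverse", "--format=%s",
                       f"{base}..scratch").splitlines()
        # format-patch writes one patch file per commit and prints each path.
        patches = git(repo, "format-patch", "-o", "patches",
                      f"{base}..scratch").splitlines()
        return subjects, len(patches)
    finally:
        shutil.rmtree(repo, ignore_errors=True)


if __name__ == "__main__":
    subjects, patch_count = demo_scratch_branch()
    print(subjects, patch_count)
```

Everything happens in a temporary directory that is deleted afterwards, so running it should not touch any real repository.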

In SVN I follow a few personal rules that I've picked up gradually: Don't make multiple simultaneous branches that touch the same subsystem. Don't rename files in a branch. In fact, try to do as little file renaming as possible. Don't sync up branches with trunk too frequently. Don't let branches live too long without merging. Keep track of all your branching/merging revisions. Examine merge diffs very carefully, especially if there are any conflicts at all, to make sure things are resolving in a sane way. When making branches, I have a script I follow and I don't go off the script. If too many problems pop up, I'll create a second branch or even abandon branches entirely and use a flat diff instead. It's like any tool: you have to know when to use it and how to use it properly when you do.

So why do I like Git? First off, it's flashy and new. As Austin Hastings mentioned on IRC:

And FWIW, switching to a new VC system is always a fashion show. You should expect mad enthusiasm for whatever new thing comes along after git, too.

And he's right, to a point. New tools are good if they fix the problems present in old tools, add new features, and make life easier. Fewer problems and more features always make people excited, and for good reason. Nobody is going to claim that SVN is the pinnacle of VCS technology. Git is an improvement over SVN, and there will be systems in the future that improve upon Git as well. It's exactly the same way that we want programming language developers to move to Parrot instead of brewing their own interpreters from the ground up: Parrot represents a better system that will be easier to use than the old method. We have things like PCT that make compiler design far easier than it would be even on other similarly capable virtual machines. It's the same as how people switched to C++ 20 years ago, switched to Java 10 years ago, and switched to C# 2 years ago. You're never done learning, and new tools will always come along that make developers more productive.

People worry about the learning curve for Git, or the time it's going to take to get our developers working with the new tool. Yet you rarely hear people complain about the learning curve for SVN, which is not just the time it takes to learn the simple commands like commit, checkout and update, but also the time it takes to learn all the nuances and best practices. SVN makes it exceptionally easy in some cases to dig yourself into a hole that is difficult to climb out of. From what I have seen of Git, the same is not true. The easy cases stay easy, and the hardest cases become possible.

Another great quote from pmichaud on IRC illustrates this point:

I think we have a significant number of parrot devs who have said that they find git merging much easier than svn. I don't know why we need more examples than that. Either that or we simply believe our developers have no clue about what they're talking about. afaik, none of the people who say "git merging is easier" are doing so based on speculation. It's all from hard experience in doing merges in both svn and git.

I won't even claim that branching and merging are easier in Git. I've never done them there. However, I have heard lots of other people say so, and I am very optimistic that this is the case. If it makes the hard case less hard, that's really all I need to hear.

In summary, here are my arguments for moving to Git: It represents an improvement in a critical development tool that will increase productivity for our developers. Upgrading tools and methodologies is a natural and healthy thing for projects to do. It keeps best practices current, keeps developers productive and happy, and keeps everything moving forward. Git is not the be-all and end-all of version control, but being in a regular habit of upgrading and improving our toolset is a good thing for Parrot or any software development team. SVN is a great tool in certain cases, and has lots of merits. Supporting a large distributed development team working in large numbers of branches, like we have been doing recently in Parrot, is not one of those cases.

Note: Parrot Foundation members should be able to vote in the straw poll on the website. If you are a member but don't have the proper permissions on the website, talk to me or one of the other admins to make sure you get properly flagged so you can participate. Keep in mind that this is only an informal straw poll and that its results are not binding in any way. It's just being used as a tool to measure the general opinions of the development community.
