Version Control Habits of Effective Developers

(Apologies to Stephen R Covey for the title...)

Through some odd coincidence these posts crossed my path recently:

What I got to thinking about is that version control is one of the foundation tools in your development process. It enables many other processes and practices. The posts above make great points, but I'd like to put them in the context of the overall process. (I've also given some thought to the conflicting advice given, again in the context of the forest instead of the trees.)

  1. Use version control! If you're not doing this, you should get out of the profession now.
  2. Check in often. This is frequently given advice, but I feel it's often given out of context. [Jeff Atwood][Check In Early, Check In Often] includes a quote from AccuRev founder Damon Poole: "My rule of thumb is 'check-in early and often', but with the caveat that you have access to private versioning. If a check-in is immediately visible to other users, then you run the risk of introducing immature changes and/or breaking the build." (My emphasis.) "Private versioning" means a private branch: you can check in without affecting anyone else (see #8). But even with a private branch, consider Salvatore Iovene's advice: "SVN is not a backup tool". He makes the point that you shouldn't check in before you go home just because it's the end of the day. Instead, check in when you have a logical unit of work (e.g. everything compiles, most of the tests pass but you have to make some experimental changes to finish the feature or perform debugging, etc.). The danger of checking in at arbitrary points is that if you have to revert because you've made some disastrous changes, you won't have a sane state to revert to. Final point: Salvatore also reminds us that some tools don't commit to a central repository by default (git, darcs, etc.), so checking in doesn't replace a good backup.
  3. Put everything in version control. This is a key enabler for continuous integration (CI). James Shore has several great posts on CI. I think he under-emphasizes the need to check in everything necessary to build. First, it should be easy for a new hire to pull down a source tree and immediately get a good build. It makes life easier for them and for you. Second, your CI server should be able to do the same thing. If someone upgrades one of the tools locally and the build depends on this then the integration build should break. The tool upgrade must be checked in so that everyone picks up the change.
  4. Don't break the build. [When I say "build" I mean "build and tests". It's unfortunate naming but I think this is the way the term is used by a lot of people.] This is a promise that the team has to make to each other: "I will not break the build." What it means is that every team member agrees not to check in code to the trunk that does not build and pass all tests. If you have jerks on your team that constantly break the build, ask them to play by the rules, and if they can't then get rid of them because they're dragging you down.
  5. Check in bite sized chunks. Too big and you've probably been working too long: did you go dark? Large checkins are also harder to review well, so quality will suffer. If a given feature is large, create a branch and continue to check in bite sized chunks (see #6).
  6. Branch, but only when needed. If you don't have bite sized chunks, or if more than one person needs to work on a feature that can't go into the trunk, create a branch. When you have a branch, it is critical to update (or merge, or pull, whatever terminology you prefer) into the branch from the trunk on a regular basis. Integration will be a nightmare if you don't. When the code on the branch is mature enough, merge (or push) it back to the trunk. (Tip: it doesn't need to be perfect, it just shouldn't introduce regressions into the trunk.) If more non-bite-sized work is needed on the feature, create a new branch and repeat the process.
  7. Check in when you're ready. But not sooner or later. This should be obvious, but some people don't seem to get it. If you check in before you're ready, you break the build (and your promise to your team members). This can stop your entire team dead in their tracks! If you wait too long, you've lost the opportunity to get rapid feedback on your work. This practice enables continuous integration. It relies on having work broken down into bite sized chunks (see #5).
  8. Use private versioning. Private versioning means that each developer has his own branch (or even multiple branches). With a VC tool like AccuRev, this is automatic: every tree has an associated private stream (branch) on the server. With distributed VC tools like git, this is also automatic: everything is private until you push/pull to the central repository or another developer. It can be simulated with other tools but requires a bit more effort. This allows individual developers to maintain several versions of their work before they are ready to share with the rest of the team. It also allows a developer to keep a sane version of code before merging in changes from the trunk. (If the merge becomes a disaster, he can discard the merge results and go back to the sane version without much effort.)
  9. Update/merge/pull regularly. This also seems obvious, but again not everybody gets it. I mentioned it above as part of branching, and it's especially important if you're using private branches. You must merge from the trunk to your private branch/workspace/sandbox regularly. If you have private versioning, this is low risk because it's easy to revert a bad merge. (These two practices create a self-reinforcing virtuous cycle!)
  10. Train your users. I've worked in shops where it seems like only half of the team understands anything beyond "checkout, update, checkin". Everyone must know how to perform daily duties like checkout, checkin, update and merge from trunk, merge from private branch to trunk, how to navigate history, perform diffs, etc. Everyone should know how to perform less-frequently-needed duties like odd sorts of merges - cherry picking, etc, though this is maybe less important.
  11. Use a common project structure and naming convention. Thanks to Anders Sandvig for this bit of wisdom. I wouldn't have thought to include it in VC best practices but it makes sense to list it here. A common project structure makes it easier to add every project into the CI server. For example, I've used the CI server to enforce the rule that every project must provide a $BUILDROOT/build script that runs the entire build. (The server also enforced placing tests in $BUILDROOT/tests, list of images to be published in $BUILDROOT/MANIFEST, etc.) This also makes it easier to move staff from one project to another.
  12. Enable checkin notifications. This is typically seen as an email from a post-commit hook (or whatever your tool calls it). The checkin notification can also launch a CI build. Ideally, the notification will tie each checkin back to whatever issue/defect/task/project tracker you're using.
  13. Write good checkin comments. This makes notifications worthwhile. Remember that VC is a communication tool as much as a time machine! I've had several experiences where reading someone else's checkin comment has spurred a discussion that either causes them to rework their bug fix or it has caused me to change course on something I've got in progress.
  14. Perform atomic checkins. When it's time to check in, don't do ten different checkins for the ten different files you modified! They all belong together as a logical unit. This makes merging easier. It prevents the CI server from running multiple build and test cycles needlessly (and from reporting spurious breakages because it picked up half a change). It also makes history easier to understand when others look at comments. This is part of training, and is reinforced if you expect all developers to merge regularly, on the theory that merging is easier when checkins are atomic.
  15. [Advanced practice.] Use a holding area. Instead of checking in to trunk, developers check in to a holding branch (call it "pre-integration" or something similar). You perform code review and run integration tests on that branch. When review has approved the changes and the tests pass, promote (merge) the change from the holding area to trunk. The rest of the team pulls (merges) from the trunk, which is guaranteed safe. I've also seen the use of a two-level holding area where the first level is for review and the second level is for integration testing. This is the default if you use private branches (#8) and perform code reviews out of the private branch before integrating (#16).
  16. Perform reviews on checked-in code. This is so much better than manually gathering changes and emailing patches around. (Though tools like git have optimizations specifically for emailing patches around.) This is easy if you're using private branches since you can request a review before integrating. Some code review tools enable this.
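
To make practice 11 concrete, here is a minimal sketch of the kind of layout check a CI server might run before building a project. The entry names (a `build` script, a `tests` directory, a `MANIFEST` file) come from the example in #11; the function name `missing_layout_entries` and the exact rules are hypothetical, not taken from any particular CI tool.

```python
from pathlib import Path

# Assumed convention, based on the example in practice 11: every project
# root contains a build script, a tests directory and a MANIFEST file.
REQUIRED_ENTRIES = {
    "build": "file",
    "tests": "dir",
    "MANIFEST": "file",
}


def missing_layout_entries(build_root):
    """Return the sorted names of required entries missing under build_root."""
    root = Path(build_root)
    missing = []
    for name, kind in REQUIRED_ENTRIES.items():
        path = root / name
        # A directory entry must be a directory; a file entry must be a file.
        present = path.is_dir() if kind == "dir" else path.is_file()
        if not present:
            missing.append(name)
    return sorted(missing)
```

A CI job could call `missing_layout_entries(".")` first and fail the build with a clear message if the list is not empty, which is one way to enforce the convention instead of just documenting it.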

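For practice 13, a team might want a small automated check on checkin comments. The sketch below assumes the common Git-style layout (a short subject line, then a blank line, then an optional longer description). The function name `check_commit_message` and the 72-character limit are assumptions for illustration; pick whatever limit suits your team.

```python
def check_commit_message(message, max_subject_len=72):
    """Return a list of problems found; an empty list means the message passes.

    max_subject_len is an assumed limit, not a rule of any particular tool.
    """
    problems = []
    lines = message.rstrip("\n").split("\n")
    subject = lines[0].strip()
    if not subject:
        problems.append("subject line is empty")
    elif len(subject) > max_subject_len:
        problems.append(
            "subject line is longer than %d characters" % max_subject_len
        )
    # If there is a body, it should be separated from the subject by a blank line.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems
```

A check like this could run from a commit hook, but it only catches the mechanical problems. Whether the comment actually communicates the change is still up to the author and the reviewers.
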
Posted on 2009-01-07 by brian in process .

Comments

Write good checkin comments:

I think the Git commit message convention is a good idea to follow here: a single-line "subject" giving a short description of the commit, separated by an empty line from a more detailed description of the changes. Also, for open-source projects it could make sense to apply the Linux kernel and git project signoff (certificate of origin) policy.

Jakub Narębski
2009-01-09 12:32:34

Excellent article!

I'd like to switch to Git to use private branches, but unfortunately I can't miss the SVN integration in Eclipse, and TortoiseSVN is also mighty handy.

I'm currently implementing Hudson CI, the holding branch is a good idea! Probably going to implement that.

I'm going to print your article and hang it on our messageboard somewhere :) .

Dieter
2009-05-26 15:20:21