Developer Drain Brain

November 16, 2010

A Journey Through Source Control. Part 1. Generations of Source Control

Filed under: Development — rcomian @ 1:27 pm

In this series I’m going to be looking at some of the key concepts available in source control, and how they can be applied to gain full control over your source code.

Now, in all fairness, I don’t make my living by telling people how to use source control, but it’s one of the things that fascinate me about software development, purely because it’s so underused and misunderstood by a large percentage of my fellow developers.

The first thing that I think is important to grasp is that we’ve already churned through several generations of source control tools. I can easily identify 4 generations based on just the surface architectures involved, and each one represents a significant advantage over the previous generations.

Source Control Generations
Generation Feature Example
1 History of individual files SCCS/RCS
2 Networked server CVS/VSS
3 History of whole trees SVN/TFS
4 Distributed repositories Git/Hg

So with the first generation of tools such as RCS, we had the ability to keep the history of the files we were working on. That was pretty much it, really. Working with multiple files was tricky at best, and sharing wasn’t exactly safe, as everyone would work in the same shared folder.

CVS & VSS came along and rescued us from this situation by wrapping RCS and keeping all it’s files in a repository, which could be accessed through a server on the network. This meant that all developers could have their own copy of their files on their machines, isolated from everyone else. Then when work was complete, the changes could be sent up to the server and new changes brought down.

This CVS style working is all very well, but it still has problems – namely that files are still versioned individually and that it can be very hard to reliably isolate changes that happened together across a range of files after the fact. SVN and others came on to the scene to rectify this by keeping all the revisions in a single stack of changes, numbered sequentially. Changes to multiple files could be kept within a single revision, or changeset, and you could see not only what changes were made to a particular file, but drill down into each change to find out what the full story was each time. This also makes branching and merging a far more tenable idea since a complete ‘changeset’ could be reliably migrated in full across different branches.

And most recently, after having 3 generations of centralised repositories, Git, Hg and others threw away the need for an enforced central repository. The effect of this is that it no-longer matters exactly where source code is located – what matters is the actual content itself, which could come from anywhere. Whether I pulled revision ’24g4g2′ out of the main company server, a junior developer’s laptop, a pirate website or the smoking remains of a tape in the server room, I would be guaranteed that the code was exactly what I was after, including all the history up to that point. Central servers are now used at your team’s convenience, not just because the tool demands one.

So there we go, a very short, and incomplete, potted history of source control tools.

Next we’ll look in more detail at the jump from 2nd generation to 3rd generation – and how so many people are thinking of it the wrong way round.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: