The New Breed of Version Control Systems
by Shlomi Fish, 01/29/2004
A version control system enables developers to keep historical versions of the files under development and to retrieve past versions. It stores version information for every file (and the entire project structure) in a collection normally called a repository.
Inside the repository, several parallel lines of development, normally called branches, may exist. This can be useful to keep a maintenance branch for a stable, released version while still working on the bleeding-edge version. Another option is to open a dedicated branch to work on an experimental feature.
Version control systems also let the user give labels to a snapshot of a branch (often referred to as tags), to ease later extraction. This is useful to signify individual releases or the most recent usable development version.
Using a version control system is an absolute must for a developer of a project above a few hundred lines of code, and even more so for projects involving the collaboration of several developers. Using a good version control system is certainly better than the ad-hoc methods some developers use to maintain various revisions of their code.
Traditionally, the de-facto open source version control system was CVS, but lately many others have emerged that aim to be better in some or every way. This article provides an overview of several alternatives.
Common Features
Version control systems come in all shapes and sizes, but there are common guidelines for their design. Some systems support Atomic Commits, which means that the state of the entire repository changes all at once. Without atomic commits, each file or unit changes separately and so the state of the entire repository at any one point may not be preserved.
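As a rough illustration of the difference, here is a minimal Python sketch. It uses a toy in-memory repository, not any real VCS; the names ToyRepo, commit_per_file, commit_atomic, and reject_empty are invented for this example. The per-file commit leaves a half-applied change behind when one file is rejected, while the atomic commit changes nothing.

```python
class ToyRepo:
    """Hypothetical in-memory stand-in for a repository: path -> content."""

    def __init__(self):
        self.files = {}

    def commit_per_file(self, changes, validate):
        # Non-atomic: each file is stored as soon as it passes validation,
        # so a failure midway leaves the earlier files already changed.
        for path, content in changes.items():
            validate(path, content)
            self.files[path] = content

    def commit_atomic(self, changes, validate):
        # Atomic: stage everything on a copy and swap it in only if
        # every file passed validation.
        staged = dict(self.files)
        for path, content in changes.items():
            validate(path, content)
            staged[path] = content
        self.files = staged


def reject_empty(path, content):
    """Illustrative validation rule: refuse empty files."""
    if not content:
        raise ValueError(f"refusing empty file: {path}")


changes = {"a.c": "int a;", "b.c": ""}  # the second change is invalid

non_atomic = ToyRepo()
try:
    non_atomic.commit_per_file(changes, reject_empty)
except ValueError:
    pass  # a.c was already stored: the repository is half-updated

atomic = ToyRepo()
try:
    atomic.commit_atomic(changes, reject_empty)
except ValueError:
    pass  # nothing was stored: the repository is unchanged

print(non_atomic.files)
print(atomic.files)
```

Real systems achieve the same effect with transactions or by publishing a new revision only after all of its parts have been written.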
Most common VCSs allow merging of changes between branches. This means that changes committed to one branch will be committed to the trunk or another branch as well, with one automatic (or at least semi-automatic) operation.
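One common way to automate this is a three-way merge, which compares both branches against their common ancestor. The sketch below works on whole files only and is an illustration under that assumption, not the algorithm of any system discussed here; merge_trees and the sample file names are invented.

```python
def merge_trees(base, ours, theirs):
    """Three-way merge of {path: content} maps at whole-file granularity.

    Returns (merged, conflicts). A path is a conflict when both sides
    changed it in different ways relative to the common ancestor.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o          # both sides agree (or both deleted it)
        elif o == b:
            result = t          # only their side changed the file
        elif t == b:
            result = o          # only our side changed the file
        else:
            conflicts.append(path)
            result = o          # keep ours, flag for manual resolution
        if result is not None:
            merged[path] = result
    return merged, conflicts


base = {"main.c": "v1", "util.c": "v1", "README": "v1"}
trunk = {"main.c": "v2", "util.c": "v1", "README": "trunk edit"}
branch = {"main.c": "v1", "util.c": "v2-fix", "README": "branch edit"}

merged, conflicts = merge_trees(base, trunk, branch)
print(merged)
print(conflicts)
```

Here main.c and util.c merge automatically because only one side touched each, while README needs a human decision. That last case is the "semi-automatic" part.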
A distributed version control system allows the cloning of a remote repository, producing an exact copy. It also allows changes to propagate from one repository to another. In non-distributed VCSs, a developer needs repository access in order to commit changes to the repository. That leaves developers without repository access as second-class citizens. With a distributed VCS, this is a non-issue, as each developer can clone the master repository and work on it, later propagating his changes to the master repository.
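The clone-then-propagate workflow can be sketched in a few lines of Python. This is a toy model: the Repo class, its methods, and the change identifiers are hypothetical, and it ignores merging of conflicting changes entirely.

```python
import copy


class Repo:
    """Hypothetical repository holding an ordered list of changesets."""

    def __init__(self):
        self.history = []  # list of (change_id, description)

    def commit(self, change_id, description):
        self.history.append((change_id, description))

    def clone(self):
        # An exact, independent copy of the whole history.
        return copy.deepcopy(self)

    def pull_from(self, other):
        # Propagate every changeset this repository has not seen yet.
        known = {cid for cid, _ in self.history}
        for cid, desc in other.history:
            if cid not in known:
                self.history.append((cid, desc))
                known.add(cid)


master = Repo()
master.commit("c1", "initial import")

alice = master.clone()             # needs read access only
alice.commit("c2", "fix typo")     # committed locally, no master access

master.pull_from(alice)            # the maintainer propagates the change
print(master.history)
```

The point of the sketch is that the local commit never needs write access to the master repository; only the later propagation step involves it.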
Another common factor is whether the repository allows versioned file and directory renames (and possibly copies as well). If a file changes location, will the repository preserve its history? Can changes applied to the organization of the older files be applied to the new organization?
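To make the first question concrete, the following Python sketch shows what preserving history across a rename means. The log format and the history_of function are assumptions made for illustration; real systems record renames in their own ways.

```python
# Hypothetical commit log, oldest first: (action, path, detail).
# For "rename" entries, path is the old name and detail is the new name.
log = [
    ("add", "src/util.c", "initial"),
    ("modify", "src/util.c", "fix bug"),
    ("rename", "src/util.c", "lib/util.c"),
    ("modify", "lib/util.c", "add helper"),
]


def history_of(log, path, follow_renames=True):
    """Return the events affecting `path`, oldest first.

    With follow_renames=False the file's history stops at the rename,
    which is roughly what happens in a system without versioned renames.
    """
    events = []
    current = path
    for action, p, detail in reversed(log):
        if action == "rename":
            if follow_renames and detail == current:
                events.append(f"renamed from {p}")
                current = p     # keep walking under the old name
        elif p == current:
            events.append(f"{action}: {detail}")
    return list(reversed(events))


print(history_of(log, "lib/util.c"))
print(history_of(log, "lib/util.c", follow_renames=False))
```

With rename tracking, the file's history reaches back to its initial check-in; without it, only the changes made after the move are visible under the new name.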
Of these features, CVS itself supports only merging.
CVS
CVS, the Concurrent Versions System, is a mature and relatively reliable version control system. Many large open source projects, including KDE, GNOME, and Mozilla, use CVS. Most open source hubs, such as SourceForge, offer it as a service, which has led many other projects to adopt it.
Despite its popularity, CVS has its limitations. For example, it does not support file and directory renaming. Furthermore, binary files are not handled very well. CVS is not distributed and the commits are not atomic. As there are already better alternatives that aim to be a superset of its functionality, you are probably better off starting a new project by using something else.
On the plus side, CVS is extensively documented in its own online book and in many online tutorials. There are also many graphical clients and add-ons available for it.
Subversion
Subversion aims to create a better replacement for CVS. It retains most of the conventions of working with CVS, including a large part of the command set, so CVS users will quickly feel at home. Aside from that, Subversion offers many useful improvements over CVS: copies and renames of files and directories, truly atomic commits, efficient handling of binary files, and the ability to be networked over HTTP (and HTTPS). Subversion also has a native Win32 client and server.
Subversion has recently entered its beta period after being alpha for a long time. As such, it may still have some minor quirks, and its performance in some areas is lacking. Nevertheless, it is very usable for beta-stage software, and was so even for a large part of its alpha stage.
The HTTP (or HTTPS)-based Subversion service is difficult to deploy in comparison to other systems, as it requires setting up an Apache 2 service with its own specialized module. There is also an "svnserve" server that is less capable but easier to set up (and faster) and uses a custom protocol. Moreover, Subversion's support for merging is limited and resembles that of CVS (i.e., merges to branches where files were moved will not be performed correctly). It is also relatively resource intensive, especially with large operations.
Subversion is extensively documented in the free online book, Version Control with Subversion. The rudimentary online help system supplied by the Subversion client can also prove useful for reference. Subversion has many add-ons, but they are still less mature than their CVS counterparts.
Arch
GNU Arch is a VCS originally created by Tom Lord for his own version control needs, as well as those of other free software projects. Arch was initially prototyped as a collection of shell scripts, but its main client now is tla, which is written in C and should be portable to any UNIX. It has not been ported to Win32; while it is possible to do so, it is not a priority for the project.
Arch is a distributed version control system. It does not require a special service in order to set up a network-accessible repository; any remote file service (such as FTP, SFTP, or WebDAV) is a suitable Arch service. This makes setting up a service incredibly easy.
Arch supports versioned renames of files and directories, as well as intelligent merging that can detect if a file has been renamed and applies the changes cleanly. Arch aims to be superior to CVS, but there are still some individual features missing. Arch is a post-1.0 system and, as such, is declared mature and stable for any use.
Arch is documented with a very basic online help system and a tutorial.
OpenCM
OpenCM is a version control system created for the EROS project. OpenCM does not aim to be as feature-rich as CVS is, but it does have a few advantages. OpenCM has versioned renames of files and directories, atomic commits, automatic propagation of changes from branch to trunk, and some support for cryptographic authentication.
OpenCM uses its own custom protocol for communicating between the client and the server. It is not distributed. Since OpenCM is not very feature-rich, it is possible that other systems will better suit your needs. However, you may prefer using OpenCM if one or more of its features is attractive to you.
OpenCM runs on any UNIX and on Windows under the Cygwin emulation layer. It features a CVS-like command set and is well documented.
Aegis
Aegis is a source configuration management (SCM) system created by Peter Miller. It is not networked, and all operations are done via UNIX file-system operations. As such, it also uses the UNIX permissions system to determine who has permission to perform what operation. Despite the fact that Aegis is not networked, it is still distributed in the sense that repositories can be cloned and changes can be propagated from one repository to the other. Allowing network access requires using a file system such as NFS.
Being an SCM system, Aegis tries to assure the correctness of the code that was checked in. Namely, it:
Manages automated tests, prevents check-ins that do not pass the previous tests, and requires developers to add new tests.
Manages reviews of code. Check-ins must pass the review of a reviewer to get into the main line of development.
Has various other features that aim to ensure code quality.
Its command set reflects this philosophy and is quite tedious if you desire only a plain version control system.
Aegis is documented in several troff documents that are then rendered into PostScript. As such, it is sometimes hard to browse the documentation to find exactly what you want. Still, the documentation is of high quality.
Monotone
The Monotone Version Control System was created by Graydon Hoare, and exhibits a different philosophy than all of the above systems. It is distributed, with changesets propagated to a certain depot that can be a CGI script, an NNTP (Usenet news) receiver, or SMTP (email). From there, each developer pulls the desirable changes into his own copy of the repository.
This may have the unfortunate effect of causing the history or current state of the individual repositories to fall out of sync with each other, as individual repositories do not receive the appropriate changes, or receive inappropriate ones.
Monotone relies heavily on strong cryptography. It identifies files, directories, and revisions by SHA1 checksums. RSA certificates govern repository permissions.
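Identifying content by its checksum can be illustrated with Python's standard hashlib module. This is a sketch of the general idea only; the content_id and put helpers and the in-memory store are invented for the example and say nothing about Monotone's actual storage format.

```python
import hashlib

store = {}  # hypothetical object store: SHA1 hex digest -> content


def content_id(data: bytes) -> str:
    """Name a piece of content by the SHA1 checksum of its bytes."""
    return hashlib.sha1(data).hexdigest()


def put(data: bytes) -> str:
    """Store content under its checksum and return that identifier."""
    cid = content_id(data)
    store[cid] = data
    return cid


first = put(b"int main(void) { return 0; }\n")
same = put(b"int main(void) { return 0; }\n")
other = put(b"int main(void) { return 1; }\n")

print(first == same)   # identical content gets the identical name
print(first == other)  # any change in content produces a different name
print(len(store))      # the duplicate was stored only once
```

Because the identifier is derived from the content itself, two repositories can tell whether they hold the same file or revision without trusting each other's naming.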
Monotone supports renames and copies of files and directories. It has a command set that aims to be as CVS-compatible as possible, with some necessary deviations due to its different philosophy. It should be portable to Win32, but has not been explicitly ported yet.
Monotone is still under development, and may still have some behavioral glitches. The Monotone developers expect to resolve these problems as work continues.
All in all, Monotone holds a lot of promise, and is well worth examining.
BitKeeper
BitKeeper is not an open source version control system, but is listed here for completeness because some open source projects use it. BitKeeper is very reliable and feature-rich, supporting distributed repositories; serving over HTTP; file and directory copies and renames; patch management; tracking changes from branch to trunk; and many other features.
BitKeeper comes in two licenses. The commercial license costs a few thousand dollars per seat (lease or buy). The gratis license is available for development of open source software, but has some restrictions, among them a non-compete clause and a requirement to upgrade the system as new versions come out, even if they have a different license. Furthermore, the source code is not publicly available, and binaries exist only for the most common systems, including Win32.
A handful of projects use BitKeeper, including some of the Linux kernel developers and the core MySQL developers. It has been the subject of much controversy in the Linux Kernel Mailing List. Due to its license, BitKeeper is not suitable for open source development, as this will alienate more "idealistic" developers, and impose various problems on the users who choose to use it. If you are working on a non-public project and can afford to pay for BitKeeper, it is naturally an option.
Conclusion
You probably should not use CVS, as there are several better alternatives, unless you cannot get hosting for something else. (Note that GNU Savannah provides hosting for Arch, and there is documentation for using it with SourceForge.) You should also not use the free version of BitKeeper because of its restrictions.
Other systems are nicer than CVS and provide a better working experience. When I work in CVS, I always take a long time to think about where to place a file or how to name it, because I know I cannot rename it later without breaking its history. This is no problem in other version control systems that support moving or renaming. One project in which I was involved decided to rename their directories and split the entire project history.
And you certainly have a lot of choice.
More Information
An item-by-item comparison of these systems can be found at the Better SCM Site. Rick Moen has a list of Version Control and SCMs for Linux on his web site. Finally, the DMOZ Configuration Management Tools directory provides many other useful links.
In addition, more information about version control systems and configuration management tools can be found in the comp.software.config-mgmt FAQs page.
Shlomi Fish is a software professional who has been experimenting with programming since 1987 and with various UNIX technologies since 1996. He graduated from the Technion with a B.Sc. in Electrical Engineering, and has been heavily involved as a Linux and open source user, developer, and advocate.
His most successful project so far was Freecell Solver, but he also headed several other projects, and contributed to other projects such as Perl 5, Subversion, and the GIMP.
-
New breed?
2007-01-03 10:02:11 VCWizard
Although this article was about open source version control, I still don't get how fixing known issues with CVS can be considered 'new breed.' Version control tools are all but commoditized and part of larger suites of development tools. The only tool I can think of that isn't based on RCS or file-based is Accurev. There may be a learning curve if one is only familiar with CVS, but from what I saw in demonstrations, it is worth the investment to improve productivity and to eliminate scripting, the so-called Merge Day parties, and the habit of not branching off the mainline out of fear. If I were in school today, I'd be learning Accurev version control right along with Clearcase so when Clearcase is replaced, I still have a job.
-
The first distributed version control system
2006-02-11 12:46:01 DrBartosz
It's only fair to mention that the first distributed version control system was created by Reliable Software (www.relisoft.com) back in 1996. It's called Code Co-op and it's now available in version 4.6.
The creator of BitKeeper, Larry McVoy started working on his system about the same time. Code Co-op was first demoed at the Seventh International Workshop on Software Configuration Management, where I gave a talk about server-less version control systems.
If you Google "distributed version control", you'll get Code Co-op as one of the top hits (Monotone beats it, because open-source projects tend to get better ratings on Google).
-
Subversion 1.0 released
2004-02-23 07:51:00 otto
Subversion 1.0 was released today, so that's one reason less to stay with CVS.
The 2 candidates I'm considering for future projects are GNU Arch and Subversion. Arch's command set seems a lot more complicated, and documentation is lacking, but the feature set is quite impressive. Subversion looks very familiar if you know CVS, but the one thing I dislike is using Berkeley DB as backend instead of the file system. Not that I think it's unstable, but being a pragmatic programmer, I prefer plain text files over any other storage format.
-
Subversion 1.0 released
2005-06-12 21:08:12 qu1j0t3
Then you'll be pleased to know that Subversion supports the FSFS repository format as well as BDB. FSFS just uses the filesystem so avoids the drawbacks of a database. I had no particular objection to BDB but over time experienced some awkward locking behaviour with shared repositories, so I have converted all my projects to FSFS.
-
Didn't convince me
2004-02-05 15:44:17 craigg
I'm sorry but after reading this article I see no reason to move off of CVS. The alternatives are beta, customized for a particular OSS project and offer nothing over CVS. Did I miss something here??
-
Familiarity Wins the Day
2004-02-02 06:13:16 russfink
For many projects, version control software is viewed as "overhead." The bottom line is "the bottom line:" large projects will save money by using something that is popular, which in turn requires less re/training of staff, results in more familiarity with industry standards, and generally gets the job done with the least amount of fuss. Since money is core to the project, and no contract I'm aware of ever paid anyone per checkin, projects will go with what works. For about 90% of the industry, it's CVS.
Counterexample in point - a spacecraft control center project I worked on years ago was using a large, very expensive commercial version control system. The approximately 100 developers were having such difficulties with getting versioning to work and got so wrapped up in build environment issues that the project manager declared a halt of development and decreed a "Makefile Day." The festivities of Makefile Day had a core of 5 top-notch developers visit every development subgroup to help fix their makefiles to work with the versioning system. The cost to correct these problems was 100 developer days (Apologies to Frederick Brooks.)
If you are a new software developer and either work in the industry, or are looking to get into the industry, you are well advised to learn CVS. To the more seasoned developers thinking of changing to different control systems, I ask this: when have you gotten around to learning emacs so you can stop using VI? What's that, no time to relearn, what you're using works fine already? What about your neighbor? Precisely my point.
I have to agree with the first post - the "Troll" points out that the author recommends against CVS, but neither adequately supports that assertion, nor provides a recommended alternative.
Remember, for large projects, you have to use what the project is using. For managers, you must use what a bulk of your developers are familiar with, or what a bulk of your soon-to-be-hired developers are likely to be familiar with, or incur high overhead costs. For small projects, it doesn't much matter, but realize that any VCS is better than no VCS.
-
Re: Familiarity Wins the Day
2004-02-02 07:33:01 shlomif
Here, I disagree. Using a sub-standard version control system (or any such development tool, for that matter) will cost a development team a lot of lost time, money and frustration. Teaching workers to use something else takes a constant time, and afterwards everything works faster, they feel less frustrated, and you save more money. CVS is a sub-standard version control system, and most of the other VCSes I mentioned tried to be as similar to it as possible, so people can still use the old conceptual model, and retain some of their habits.
The Joel Test (http://www.joelonsoftware.com/articles/fog0000000043.html) does not say that you should use the best tools money can buy for nothing.
What you are saying is actually perpetuating the Status Quo, claiming that developers cannot effectively be retrained and shouldn't. Developers are smart people - they can always be retrained. People have switched from SCCS to RCS to CVS and there's no reason there wouldn't be a future switch to Subversion or whatever. (and I am almost sure there will be). They also switched from sh to csh to tcsh to bash, and from troff to latex.
The world is dynamic and should be - get used to it.
-
terrible article
2004-02-02 01:58:57 drek
Usually I see quality content in the articles on this site. This article is borderline ridiculous.
Without any solid reasoning or alternative provided, there is a recommendation at the end of the article to not use CVS.
IMO, this should be either rewritten or pulled.
-
terrible article
2004-02-26 12:07:53 dettifoss
I have to agree. I knew very little about versioning systems, other than that CVS is the "standard", while being simultaneously widely criticized for its failings. So I turned to this article - to O'Reilly - for pointers to alternatives.
What I found was profound equivocation. The message I got was, don't use CVS, and then, on the other hand, the alternatives aren't much better, if at all.
I felt there was really very little thought given to the content of the article beyond its structure, and the descriptions of the included systems left almost everything lacking.
As a result, I put the project on the backburner for a month.
-
terrible article
2004-02-26 12:25:50 chromatic
It's difficult to make one solid recommendation that suits everyone, but I do understand your frustration with the conclusion. There's a lot more heat than light, and the best anyone can do right now is give an opinion. Here's mine.
If you're already comfortable using CVS, it's worth trying Subversion on any new projects, now that they've released version 1.0. The learning curve is very gentle for existing CVS users and the new features and missing bugs make it nice.
If you're interested in more distributed development, where developers can maintain their own trees, Arch seems to be the best choice at the moment. It's under rapid development and has several high-profile projects. You will have to adapt to its style of use, though you'll see a lot of benefits from working with it, not against it.
If you'd rather have a system that manages your development process, perhaps Aegis will fit. I've not used it, but I'm impressed that it enforces practices such as test-driven development. Again, there's a lot to learn here.
If you're working with a hosting system, you're likely stuck with what they give you. If that's CVS, hopefully you know the limitations and can deal with them.
-
terrible article
2004-02-02 02:42:39 shlomif
Pheeww... what a troll.
I think many of the systems covered are superior to CVS in one way or another. Subversion, especially, is a superset of CVS's functionality, and so the choice is between Subversion and the others, not between CVS and the others.
I explained why CVS was not good and why other systems were better in the article.
-
terrible article
2004-02-05 05:23:38 zooko
Subversion appears to be riddled with bugs, from what I've read.
I would definitely use CVS over Subversion, in February of 2004.
-
terrible article
2004-02-05 09:56:30 sanchonevesgraca
FUD. Subversion has a release candidate and is used in several large projects. It has been usable for more than a year. How about leaving judgment to people who actually use the tool and can testify to its stability. I definitely use Subversion over CVS, in February of 2004.
-
Don't forget Perforce
2004-01-31 15:58:06 jstraitiff
Yeah, it's another closed source solution. However, it does have free licenses for open source work... It has atomic check-ins (of groups of files), renaming, good branching, etc.
-
Not so fast...
2004-01-30 01:22:52 chrisrimmer
CVS obviously has its problems. However, one advantage it has over the opposition is the very fact that it is the dominant player. This means that development tools are far more likely to integrate with it. Plus, your developers are much more likely to already know how to use it.
-
Not so fast...
2004-01-30 01:37:18 shlomif
As far as developers are concerned, they won't have any problem adjusting to a different version control system that is compatible enough with CVS. As for development tools, integration with them is a nice thing, but not a must to use such a tool (you can always use the command line).
Having used CVS before and using Subversion now, I can tell that I find CVS extremely painful to use now. I wouldn't recommend it for anything. I think we will eventually see CVS superseded by something else entirely, and one is advised to join the switch sooner rather than later.
CVS - Die! Die! Die!
-
Not so fast...
2004-02-26 12:16:05 dettifoss
"CVS - Die! Die! Die!" ? ? ? Maybe you should have used this as the title to the original piece: at least then your agenda would have been clearer.
So the point of the original, then, was not to present a balanced and informative comparison of available VC systems, but rather to stick it to CVS?
That's certainly what came across :-) Shame that the title was so misleading...
-
Not so fast...
2004-01-30 03:43:02 sanchonevesgraca
Thank you for your article. It provided an overview of SCM systems suitable for open source projects. I favour Subversion for the following reasons. I previously used CVS and a year ago I switched to Subversion. I am not involved in its development but I can write the following from a user perspective. The fact that CVS has an overwhelming presence in open source projects does not mean that the status quo should prevail. The lack of a move operation for files and directories that keeps history is definitely a strong reason to migrate to a new SCM system. Also very important is the support for the HTTP protocol. Apart from its ubiquity, it allows developers working behind a company firewall to access remote repositories without violating the firewall policy. Arguably, a natural replacement for CVS is Subversion, because of its Apache license and the fact that the project team developed it with the clear purpose of replacing CVS (at the beginning of this week, Subversion release candidate 1.0 was published; the Apache Software Foundation has had a preliminary Subversion repository since last year). The Subversion SCM concepts, while not trying to replace all other SCM systems, form a powerful paradigm. Everything is a directory or file, and most SCM patterns can be applied following a few naming conventions for the main project directories. Also note that the svn commands are much cleaner and more consistent than those of cvs, making the interaction from a Unix or DOS shell quite easy. Last but not least, the Subversion system was designed from the outset with both Unix and Windows in mind. Native support for Windows is very important because it forms the bulk of desktop systems, in and out of corporations. In this respect, the GNU Arch developers could learn a bit more about software engineering from Subversion, since they developed a system for Unix and still consider Windows as an afterthought, which is unrealistic.




