Keeping Your Life in Subversion
Subject:   Subversion cannot handle ~100 MB files
Date:   2005-01-07 15:52:28
From:   joeyh
Response to: Subversion cannot handle ~100 MB files

As a computer programmer, I don't generally edit large files. While I do have several gigabytes of data in svn, it's in hundreds of thousands of small files, with a large file being on the order of 5 or 10 MB. With this workload I find svn to be reasonably fast, even if I'm doing an update of my entire home directory. There is some overhead compared to rsync, but I can live with that.

Anyway, I did some benchmarking here with 200 MB files containing random data. It takes me 11 minutes to check such a file in locally to a repository checked out with file://, 10 minutes to check such a file out over svn+ssh:// to a remote machine over 802.11b wireless, 9 minutes to scp the same file over the same link, and 7 minutes to check it out locally. These don't seem to match your numbers, but then you're using the http:// protocol, which is probably much less efficient. In any case, the network operations appeared bandwidth/disk-bound to me.
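
To make those timings easier to compare, here is a rough Python sketch that converts them into average throughput. The file size and minutes are the figures quoted above; the function name and labels are just illustrative, and the Mbit/s column simply multiplies MB/s by 8 without accounting for protocol overhead.

```python
# Timings reported above for a 200 MB file of random data.
FILE_SIZE_MB = 200

timings_minutes = {
    "svn checkin, local file://": 11,
    "svn checkout, svn+ssh:// over 802.11b": 10,
    "scp over the same 802.11b link": 9,
    "svn checkout, local": 7,
}


def throughput_mb_per_s(size_mb, minutes):
    """Average throughput in MB/s for a transfer of size_mb taking `minutes`."""
    return size_mb / (minutes * 60)


for label, minutes in timings_minutes.items():
    rate = throughput_mb_per_s(FILE_SIZE_MB, minutes)
    # Multiply by 8 to express the same rate in Mbit/s.
    print(f"{label}: {rate:.2f} MB/s ({rate * 8:.1f} Mbit/s)")
```

By this estimate the svn+ssh:// checkout averages roughly 0.33 MB/s against roughly 0.37 MB/s for scp on the same link, so svn took about 11% longer than a plain copy in this one test.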

Hope this helps, but you can probably get better information on Subversion scalability with larger files on one of the Subversion mailing lists.

    Large files are also an issue for programmers: consider checking out large project dependencies such as libraries. The HTTP protocol is widely used by Subversion projects, since it allows connections to repositories through firewalls. So the limitation I pointed out is not a minor problem. The long checkout times I observed occurred over HTTP on both internet and intranet connections, so the limitation was not caused by network bandwidth, nor can it be explained by the HTTP protocol. I also noticed that after about ten minutes the svn client would take all the available processor capacity, and the transfer speed would be quite low (~5 KBps). I do not know the internals of the Subversion checkout implementation, but these observations suggest that the design does not scale up to this file size. Others have encountered this limitation (see