Subversion tools

The version control system we use is Subversion. Several tools already exist that make working with Subversion repositories more convenient; two noteworthy ones are TortoiseSVN and Subclipse. For some tasks, however, we did not find an existing tool and developed our own.


Subversion supports attaching metadata to files through the use of properties. Properties are versioned in the same way that files are versioned. A number of predefined special properties guide Subversion in the way it should treat a file. Some examples are the following:

  • svn:eol-style This property indicates whether Subversion should enforce a certain type of end-of-line encoding (such as CRLF, LF, CR, or the client platform's native convention) when checking out a text file.
  • svn:mime-type This property contains the MIME type of the file. When this property is available, Subversion uses it to decide whether to treat the file as a text file or as a binary file.
  • svn:keywords This property tells Subversion which keywords (e.g., "$Author$") it should substitute in the file.

Attaching metadata to a file in the repository has to be done manually: a Subversion command needs to be issued for every property of every file that needs to be configured. This makes setting correct metadata on an existing codebase a very labor-intensive and error-prone process. Moreover, whenever new files are added to the repository, one has to remember to set the correct metadata on them.
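To give a feel for the manual effort, the following minimal Ruby sketch builds the `svn propset` command lines needed for a single file. The file name `lib/example.rb`, the property values, and the helper `propset_commands` are assumptions chosen for illustration only; the sketch prints the commands and does not run them.

```ruby
require "shellwords"

# Hypothetical example: properties one might want on a single source file.
PROPERTIES = {
  "svn:eol-style" => "native",
  "svn:mime-type" => "text/plain",
  "svn:keywords"  => "Author Date Revision"
}

# Build one `svn propset NAME VALUE PATH` command line per property.
# Arguments are shell-escaped so values containing spaces stay intact.
def propset_commands(path, properties)
  properties.map do |name, value|
    ["svn", "propset", name, value, path].shelljoin
  end
end

puts propset_commands("lib/example.rb", PROPERTIES)
```

Three properties already mean three commands for one file, and the count grows with every file in the codebase.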

This process clearly needs to be automated to be reliable. An important observation is that files of the same type typically need the same metadata. All that is needed, then, is a mechanism that associates each type of file with a given set of properties. Subversion does implement such a mechanism (automatic property setting, or auto-props), but unfortunately only as a client-side configuration option. Every developer would therefore need to use the same configuration for this option at all times to ensure that new files are consistently initialized with correct metadata.
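The idea of tying properties to file types can be sketched in a few lines of Ruby. The rule table `RULES` and the helper `properties_for` below are hypothetical and only illustrate the pattern-to-properties mapping; they are not Subversion's own configuration format.

```ruby
# Hypothetical rules: file name pattern => properties to set on matching files.
RULES = {
  "*.rb"  => { "svn:eol-style" => "native", "svn:keywords" => "Author Date" },
  "*.txt" => { "svn:eol-style" => "native", "svn:mime-type" => "text/plain" },
  "*.png" => { "svn:mime-type" => "image/png" }
}

# Return the merged properties of every rule whose pattern matches
# the file's base name; an empty hash means no rule applies.
def properties_for(path, rules)
  name = File.basename(path)
  rules.each_with_object({}) do |(pattern, props), result|
    result.merge!(props) if File.fnmatch(pattern, name)
  end
end

p properties_for("docs/readme.txt", RULES)
p properties_for("Makefile", RULES)
```

Keeping the rules in a single table means the decision about which metadata a file type needs is made once, rather than per file.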

We believe that the latter mechanism is still error-prone and, for this reason, we decided to develop a script that visits all files in a given repository and corrects or adds metadata when needed. Meet svn_spider!

Using svn_spider

svn_spider is a Ruby script that uses the Subversion API to visit every file in a given set of Subversion repositories, correcting and adding metadata as needed. The script depends on both Ruby and the Subversion Ruby bindings. It can be run manually from the command line, but will typically be launched at regular intervals from a cron-like system. A sample configuration file is included with the svn_spider download.
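The core of such a visit, deciding which properties need to be corrected or added, might look roughly like the sketch below. This is not the actual svn_spider code and deliberately avoids the Subversion Ruby bindings: the repository is represented by a plain hash of paths to their current properties, and the names `corrections` and `plan` are hypothetical.

```ruby
# Return only the wanted properties that are missing on the file
# or currently set to a different value.
def corrections(current, wanted)
  wanted.reject { |name, value| current[name] == value }
end

# Visit every file and collect the corrections it needs.
# `files` maps each path to its current properties; `wanted_for` is a
# callable that returns the wanted properties for a given path.
def plan(files, wanted_for)
  files.each_with_object({}) do |(path, current), result|
    fixes = corrections(current, wanted_for.call(path))
    result[path] = fixes unless fixes.empty?
  end
end

# In-memory stand-in for a repository (assumption for illustration).
files = {
  "a.txt" => { "svn:eol-style" => "native" },
  "b.txt" => { "svn:eol-style" => "CRLF" },
  "c.txt" => {}
}
wanted = ->(_path) { { "svn:eol-style" => "native" } }

p plan(files, wanted)
```

Files that already carry the right metadata are skipped, so repeated runs from a scheduler only touch what has drifted.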