Good idea, bad idea

Good Idea
Have a common dump file format that uses key/value pairs of the form:

key: value\n

Bad Idea
Choose not to have full paths start with a leading ‘/’, so that you end up with oddities like:

Node-path: \n

I figured I’d take a look at Bazaar again, since they’ve made so many performance improvements. As a starting point, I decided to see how well svn2bzr worked, so I ran it on a dump file of The Subversion Repository. After some other tweaks to the script, I managed to get to revision 26429, where it choked:

Traceback (most recent call last):
  File "./svn2bzr.py", line 1086, in <module>
    main()
  File "./svn2bzr.py", line 1079, in main
    opts.prefix, opts.filter)
  File "./svn2bzr.py", line 1001, in svn2bzr
    dump = Dump(dump_file)
  File "./svn2bzr.py", line 662, in __init__
    self._read()
  File "./svn2bzr.py", line 900, in _read
    field, value = line.split(': ', 1)
ValueError: need more than 1 value to unpack
Exception exceptions.OSError: (2, 'No such file or directory', '/var/folders/9W/9WK-dXFfH2eMq+dEVqJZg++++TI/-Tmp-/tmproKrc3-saved-trees') in <bound method Dump.__del__ of <__main__.Dump object at 0x69a730>> ignored

Classic sign of expecting a value to be there, only to find out it’s not. :-( It turns out the empty Node-path is technically correct, since it was the root of the repository that was modified. It got there because SVK wrote a merge ticket as a property of the root directory in r26429. However, we really should have included a leading slash in the node paths, so that a path could never be an “empty” value. I guess hindsight is always 20/20.
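
For what it’s worth, a reader that tolerates empty values is easy to sketch. The following is a minimal Python sketch, not the actual svn2bzr code: the function name parse_header is made up for illustration, and I’m assuming the script strips trailing whitespace from each line before splitting, which would leave a bare “Node-path:” with no ': ' separator to split on.

```python
def parse_header(line):
    """Split a 'key: value' dump header line into (key, value).

    Hypothetical helper for illustration only; it accepts an empty
    value, as in 'Node-path: ' for the repository root.
    """
    line = line.rstrip("\r\n")
    # partition() always returns three parts, so an empty value
    # cannot cause an unpacking error the way split(': ', 1) can.
    key, sep, value = line.partition(":")
    if not sep:
        raise ValueError("not a header line: %r" % line)
    # Drop the single space that follows the colon, if present.
    if value.startswith(" "):
        value = value[1:]
    return key, value


# The failure mode: once trailing whitespace is gone, there is no
# ': ' left, so split() returns a single-element list.
print("Node-path:".split(": ", 1))

# The tolerant version returns an empty value instead.
print(parse_header("Node-path: \n"))
print(parse_header("Node-path: trunk/README\n"))
```

Splitting on the first colon alone, rather than on ': ', is the design choice here: it keeps working whether or not something upstream has already trimmed the trailing space.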