bzr-svn, round 2

In a previous post, I wrote at length about how to get started using bzr-svn as a client against a Subversion repository. After another eight months of using it, I have a few more thoughts I’d like to share.

A word of warning: much of this is about the negatives of using Bazaar against a Subversion repository. I’d like to be clear on a couple of points.

The issues raised are born out of the differences between Bazaar and Subversion. Bazaar is not Subversion. Bazaar made different choices about its model for version control, and bzr-svn does its best to bridge the gap.

I’d also like to take a moment to say that these issues in no way reflect on bzr-svn’s author, Jelmer Vernooij. Jelmer has been extremely responsive to my inquiries, quick to react to any branches I’ve proposed, and generous with his time and knowledge. I personally believe Jelmer is a fantastic programmer, and his prolific contributions to many open source projects are nothing less than astounding. I really don’t know how he finds the time.

Ghosts suck

In my previous article, I advocated merging your code with the mainline (trunk) and pushing those changes back to the Subversion trunk (alternatively, you could use a bound branch and commit, which is what I find myself doing most of the time). The problem is that if you follow that outline directly, you end up introducing ghost revisions into the Subversion repository.

A ghost revision is essentially a revision that should exist, but doesn’t exist on the branch. In the true Bazaar sense, ghosts don’t occur: all revisions are transferred when you merge one branch into another. As a result, I’m not entirely sure why they’re tolerated, but perhaps it was intentional to facilitate working with foreign repositories. The way ghosts were envisioned, the merge commit simply would not reference the missing revisions. Then, at some point down the road, perhaps you would merge in a branch that had knowledge of the missing revisions, and suddenly you could see the extra revisions. Unfortunately, that’s not how bzr-svn is introducing these revisions. It is referring to the missing revisions directly. Several Bazaar commands have been altered to work around this issue, but it’s busted on several fronts.
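
To make the idea concrete, here is a minimal sketch of what "referring to the missing revisions directly" means. It is a toy model only: plain Python dicts stand in for a repository, the revision ids are made up, and nothing here is bzrlib’s actual API.

```python
def find_ghosts(parents_by_revision):
    """Return the parent ids that are referenced but not present.

    parents_by_revision maps each revision id that exists in the
    repository to the list of its parent revision ids.
    """
    known = set(parents_by_revision)
    ghosts = set()
    for parents in parents_by_revision.values():
        for parent in parents:
            if parent not in known:
                ghosts.add(parent)
    return ghosts


# Hypothetical history as seen by someone who branched from
# Subversion alone. The ids are invented for illustration.
trunk2 = {
    "svn-r1": [],
    "svn-r2": ["svn-r1"],
    # The merge names the feature-branch commit as its second parent,
    # but that commit never made it into Subversion.
    "merge-my-feature": ["svn-r2", "feature-add-more-a"],
}

print(sorted(find_ghosts(trunk2)))  # prints ['feature-add-more-a']
```

Any command that walks every parent of the merge (log on a path, annotate) has to decide what to do when it reaches a parent like that, and that is where the trouble in the following sections comes from.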

‘bzr log’ with ghosts

Take the following example. I have a file called A.txt in the Subversion trunk. Let’s modify it, and merge it back into Subversion’s trunk. First, we check out trunk and create our feature branch:

$ bzr co $SVN_URL/trunk trunk
Initialising Subversion metadata cache in /.../.bazaar/svn-cache/...d4fd1b316ffe.
Copying history to "trunk". To checkout without local history use --lightweight.
$ bzr branch trunk my-feature
Branched 2 revision(s).

Next, we make the change:

$ cd my-feature
$ echo A >> A.txt
$ bzr ci -m "Add more A"
Committing to: /.../my-feature/
modified A.txt
Committed revision 3.

Then, we merge that back to Subversion’s trunk:

$ cd ../trunk
$ bzr merge ../my-feature
 M  A.txt
All changes applied successfully.
$ bzr ci -m "Merge my-feature."
Committing to: /.../repo/trunk
modified A.txt
Committed revision 3.

At this point, everything appears fine for you:

$ bzr log A.txt
------------------------------------------------------------
revno: 3 [merge]
committer: John Szakmeister <john@szakmeister.net>
branch nick: trunk
timestamp: Tue 2010-05-25 00:41:35 -0400
message:
  Merge my-feature.
------------------------------------------------------------
revno: 2
svn revno: 2 (on /trunk)
committer: jszakmeister
timestamp: Tue 2010-05-25 03:41:20 +0000
message:
  Add A.txt
------------------------------------------------------------
Use --include-merges or -n0 to see merged revisions.

Just what I’d expect. Revision 3 affected A.txt, and is listed.

Now, pretend that there is another user who doesn’t have access to the Bazaar branches that you have staged. I don’t think this is far-fetched by any stretch of the imagination, as large teams may have individuals exploring other tooling options (as I am) while still needing to work with the status quo. Given that, let’s create a new clone of trunk:

$ cd ..
$ bzr branch $SVN_URL/trunk trunk2

Note: I’m purposely not using shared repositories here because another developer wouldn’t have access to my shared repository. I want to show which revisions get transferred, and which don’t.

Let’s run the same command in our new version of trunk:

$ cd trunk2
$ bzr log A.txt
------------------------------------------------------------
revno: 2
svn revno: 2 (on /trunk)
committer: jszakmeister
timestamp: Tue 2010-05-25 03:41:20 +0000
message:
  Add A.txt

WTF? Where is revision 3 in that list? The way ‘bzr log’ works around this issue, unfortunately, means that you lose that information when asking about a specific file. ‘bzr log’ still knows about the rev, and it even seems to know that A.txt was modified:

$ bzr log -v -r -1
------------------------------------------------------------
revno: 3 [merge]
committer: John Szakmeister <john@szakmeister.net>
branch nick: trunk
timestamp: Tue 2010-05-25 00:41:35 -0400
message:
  Merge my-feature.
modified:
  A.txt
------------------------------------------------------------
Use --include-merges or -n0 to see merged revisions.

I filed an issue in Launchpad back in April, but haven’t heard anything. I just pinged again, so hopefully there is some movement on the problem. But, until the problem is fixed, you have to be wary of what bzr log <path> is showing you.

‘bzr diff’

In much the same way that log has a workaround, so does ‘bzr diff’. In this case, the actual time the file was modified in one of the ghost revisions is inaccessible. So it compromises by using the Unix epoch:

$ bzr diff -c3
=== modified file 'A.txt'
--- A.txt	2010-05-25 03:41:20 +0000
+++ A.txt	1970-01-01 00:00:00 +0000
@@ -1,1 +1,2 @@
 A
+A

In this case, I think the result is satisfactory, although I’d rather see the time when the merge was made. The above is just plain confusing as it is: how can I have had a base that came after the new line was added?

‘bzr annotate’

Perhaps one of the biggest annoyances is that ‘bzr annotate’ is completely busted:

$ bzr annotate A.txt 
bzr: ERROR: exceptions.KeyError: 'john@szakmeister.net-20100525044133-vab4iqddqrxnx4y8'                                

[snip large traceback]
KeyError: 'john@szakmeister.net-20100525044133-vab4iqddqrxnx4y8'

bzr 2.2.0dev1 on python 2.6.1 (Darwin-10.3.0-i386-64bit)
arguments: ['/Users/jszakmeister/bin/bzr', 'annotate', 'A.txt']
encoding: 'UTF-8', fsenc: 'utf-8', lang: 'en_US.UTF-8'
plugins:
  bookmarks            /Users/jszakmeister/.bazaar/plugins/bookmarks [unknown]
  bzrtools             /Users/jszakmeister/.bazaar/plugins/bzrtools [2.1.0]
  colo                 /Users/jszakmeister/.bazaar/plugins/colo [0.2.0dev]
  diffstat             /Users/jszakmeister/.bazaar/plugins/diffstat [0.2.0]
  explorer             /Users/jszakmeister/.bazaar/plugins/explorer [1.0.1]
  keychain             /Users/jszakmeister/.bazaar/plugins/keychain [0.1.0]
  launchpad            /Users/jszakmeister/Library/Python/2.6/site-packages/bzrlib/plugins/launchpad [2.2.0dev1]
  netrc_credential_store /Users/jszakmeister/Library/Python/2.6/site-packages/bzrlib/plugins/netrc_credential_store [2.2.0dev1]
  news_merge           /Users/jszakmeister/Library/Python/2.6/site-packages/bzrlib/plugins/news_merge [2.2.0dev1]
  qbzr                 /Users/jszakmeister/.bazaar/plugins/qbzr [0.19.0dev1]
  rewrite              /Users/jszakmeister/.bazaar/plugins/rewrite [0.6.0]
  svn                  /Users/jszakmeister/.bazaar/plugins/svn [1.0.3dev]

*** Bazaar has encountered an internal error.  This probably indicates a
    bug in Bazaar.  You can help us fix it by filing a bug report at
        https://bugs.launchpad.net/bzr/+filebug
    including this traceback and a description of the problem.

Again, this is because the merge revision pulled from Subversion refers to a non-existent revision, which then breaks the annotate algorithm. This bug was reported back in September of ’09, but hasn’t seen much movement either. I believe that’s partially due to the lack of a test case, something I hope to resolve, but it’s unclear how long it will linger for users of bzr-svn.

One workaround for this is to push your branches into Subversion, and then merge the branch to trunk. At this point, the revision is no longer a ghost and ‘bzr annotate’ will find the revision and use it. That workaround may be practical for some folks, and not others.

Tags… kind of

One other issue I’ve seen concerns tagging. Currently, bzr-svn brings in all the tag names, but you can end up with some being unresolved:

$ bzr tags
tag-unresolved       ?
tag1                 3

This can happen in one of two ways:

  • You checked out the root of the repository, and did the svn cp locally, then committed the result.

  • You created the tag remotely, but then edited the contents of the tag in some way (added binary builds, changed the final release numbers, updated some docs, etc.).

The end result in both cases is that the tags end up being more akin to branches than a true tag. The fact that the tag name made it into the branch is great, but unfortunately you can’t examine the contents of the tag because the revision that modified the tag is not in your current branch.

You can resolve the issue by bzr branch-ing the tag, merging it into trunk, and then using bzr revert . to revert the changes from the tag while keeping the fact that the revisions were merged. That enables Bazaar to find the revision, but it does introduce another commit to trunk in order to bring those revisions into your Bazaar branch:

$ bzr branch $SVN_URL/tags/tag-unresolved tag-unresolved
Branched 5 revision(s).
$ cd trunk
$ bzr merge ../tag-unresolved
 M  A.txt
Text conflict in A.txt
1 conflicts encountered.
$ bzr revert .
 M  A.txt
$ bzr ci -m "Merge tag-unresolved to trunk."
Committing to: $SVN_URL/trunk
Committed revision 7.
$ bzr tags
tag-unresolved       3.2.2
tag1                 3

Lack of feature parity

Let’s face it: Bazaar is not Subversion. I’m okay with that. However, I’m not okay with the current state of a few features, eol-style handling in particular. Bazaar handles line endings through its content filtering scheme, but you can only define rules in $BZR_HOME/rules. I personally don’t like this opt-in style, but the real rub is that many SVN repos out there have conflicting styles. In one repo, Makefiles are linefeed. In another, they’re native. At the moment, it is possible for someone using the Bazaar client to bork a file in a way that makes the svn client extremely unhappy (think mixed line endings, which some editors do introduce, like it or not).

Update: I tried to reproduce the problem of Bazaar introducing mixed line endings into a file and breaking the Subversion client. At the moment, it appears to depend on which version of Subversion is on the backend. Using Subversion 1.5, I cannot reproduce the problem of breaking the client. However, I did turn up a different one. Checking out the branch with Subversion, you’ll find that the file with mixed line endings has been converted to native line endings throughout. In Bazaar, you’ll still see the mixed line endings, even if you re-branch from the Subversion branch so that it doesn’t have your original commit history. This is somewhat unexpected. IIRC, files with svn:eol-style set are supposed to be canonicalized to LF format by Subversion. Apparently, that only happens client-side and isn’t enforced server-side. It’s good to see that more recent versions of Subversion cope well, but I need to investigate some older versions, as I still have to work against some repos that are running Subversion 1.4 or earlier.
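
Until the tooling catches this for you, it is easy enough to check a file yourself before committing. The following is a minimal sketch, assuming you simply want to know which line-ending styles appear in a file’s raw bytes; the function names are my own and are not part of Bazaar, bzr-svn, or Subversion.

```python
def line_ending_styles(data):
    """Return the set of line-ending styles found in a bytes object."""
    styles = set()
    crlf = data.count(b"\r\n")
    if crlf:
        styles.add("CRLF")
    # Any "\n" not accounted for by a CRLF pair is a bare LF.
    if data.count(b"\n") > crlf:
        styles.add("LF")
    # Any "\r" not accounted for by a CRLF pair is a bare CR.
    if data.count(b"\r") > crlf:
        styles.add("CR")
    return styles


def has_mixed_line_endings(data):
    """True when more than one line-ending style is present."""
    return len(line_ending_styles(data)) > 1


sample = b"one\r\ntwo\n"
print(sorted(line_ending_styles(sample)))  # prints ['CRLF', 'LF']
print(has_mixed_line_endings(sample))      # prints True
```

To check a real file, read it in binary mode (open(path, "rb").read()) so that Python does not translate the line endings before you get to look at them.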

Bazaar metadata sucks

Say you’re personally using Bazaar and bzr-svn on your team, but few others are doing the same (perhaps they haven’t yet had an opportunity to feel the speed of having history locally, or to try some excellent plugins such as qbzr). So, you’re working away, creating feature branches, pushing them to Subversion, merging them to trunk, etc. You’ve also been diligently helping your team with a stable branch in preparation for a release. One of the SVN guys needs to cherry-pick a rev from trunk to the stable branch, and it happens to be a revision that you committed.

The issue is that any revision introduced with ‘push’ or ‘commit’ has metadata written into the base of the branch. Subversion is terrible when it comes to merging properties, and in this case it has absolutely no idea how to resolve that metadata when the revision is cherry-picked from trunk. The poor SVN developer sees a conflict that he somehow has to resolve, but there really is no good answer on how that should be done. So for now, you need to have the guy using Bazaar and bzr-svn do the merge.

I haven’t examined this issue in depth yet, so it may be related more to a change in the bzr-svn mappings, but I suspect there’s more to the problem than that.

Update: I was searching around for more information, and ran across the following in Jelmer’s bzr-svn FAQ page:

bzr-svn can and will use revision properties rather than file properties if the Subversion server is running Subversion 1.5 or higher. These custom revision properties don’t show up in commit notifications or trac.

That’s really good news. It means you won’t end up with these conflicts with Subversion 1.5 or better on the backend.

That’s a lot of bad news

So far, I’ve raised a few pretty troubling concerns for bzr-svn users. I think it’s important to keep in mind that many of these problems depend on how you use bzr-svn. One way around these issues is to use a different workflow.

Git and Mercurial users have been using a rebase/dpush workflow against Subversion with a great deal of success. That workflow has some limitations as well. In particular, it becomes much more difficult to have long-term branches since dpush doesn’t insert metadata into the Subversion tree.

OTOH, it does appear to be friendly to the existing Subversion users. I’m only beginning to experiment with this and how it affects my crew. We need to maintain several long-term branches, and it’s unclear how to do that and still get things landed on Subversion’s trunk periodically without self-induced conflicts.

I believe there is a way to strike a middle ground that allows us to do the long-term branches, keep our metadata, and still not push that metadata into Subversion. I’ve not worked out the details just yet, but I think we can keep an integration branch that has all of the metadata, and periodically merge a mirrored version of trunk into the branch, as well as periodically merging our branch into trunk. I’ve found that bzr merge --lca tends to work well when bringing in trunk after I’ve landed our changes without the metadata. But I need to work through some more of the details before recommending it.

I promise to do the same as I have for these two bzr-svn articles, and when I discover a useful solution, I’ll do my best to present it for others to use. At the moment, I still need to spend more time working through the issues before I recommend anything in particular.

And for the record, yes, I’m still using Bazaar and bzr-svn pretty extensively. :-)