Mercurial pretxn Hook Race
By mkupfer on Feb 11, 2009
Currently the ON gate (or at least the open source part) is mirrored on opensolaris.org. We were having a discussion the other day about what needs to be done so that we can actually host it on opensolaris.org.
One of the issues that came up is the race in the Mercurial pre-transaction hooks. These hooks let a repository reject pushes that don't meet whatever criteria the hook has set. For the ON gate, we use them for things like making sure there is an approved RTI for the changegroup.
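As a rough illustration, an in-process Python hook for this kind of check might look like the sketch below. The hook signature, and the convention that a true return value rejects the transaction, follow Mercurial's Python hook interface. Everything else is an assumption made for the example: the `APPROVED_RTIS` set stands in for a real RTI lookup, the 7-digit ID at the start of each commit description is a made-up convention, and strings are plain `str` for readability (Mercurial on Python 3 passes bytes).

```python
import re

# Hypothetical stand-in for a query against a real RTI tracking system.
APPROVED_RTIS = {"6543210"}

# Assumed convention: every commit description starts with a 7-digit ID.
ID_PATTERN = re.compile(r"^(\d{7})\b")


def unapproved(descriptions, approved=APPROVED_RTIS):
    """Return the commit descriptions that have no approved RTI."""
    bad = []
    for desc in descriptions:
        match = ID_PATTERN.match(desc)
        if match is None or match.group(1) not in approved:
            bad.append(desc)
    return bad


def hook(ui, repo, hooktype, node=None, **kwargs):
    """Pre-transaction changegroup hook; a true return value rejects the push.

    `node` identifies the first incoming changeset; everything from there
    to the tip of the repository belongs to the incoming changegroup.
    """
    first = repo[node].rev()
    descs = [repo[rev].description() for rev in range(first, len(repo))]
    bad = unapproved(descs)
    for desc in bad:
        # Report only the first line of each offending description.
        ui.warn("no approved RTI: %s\n" % desc.split("\n", 1)[0])
    return bool(bad)
```

Keeping the policy check (`unapproved`) separate from the hook wrapper means the check can be exercised without a repository.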
The problem is that the implementation of these hooks opens up a race condition. The metadata for the changegroups gets written to the repository, then the pre-transaction hook gets run. The advantage of this approach is that the pre-transaction hooks can use existing APIs and code paths when examining the incoming changegroup. But Mercurial repositories are structured so that readers don't need a lock; instead they depend on an atomic update of the top-level metadata. So the disadvantage of this approach is that there's a window during which someone pulling from the gate could get the pending changegroup, even if the hook later rejects it.
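The window is easier to see in a toy model. The sketch below is not Mercurial's storage code; `ToyRepo` and the `during_hook` callback are hypothetical names invented for the example, and the callback simply stands in for a reader who happens to pull between the write and the hook's verdict.

```python
class ToyRepo:
    """Toy model: readers take no lock and trust whatever is in `changesets`."""

    def __init__(self):
        self.changesets = []

    def pull(self):
        # Lock-free read of the current top-level metadata.
        return list(self.changesets)

    def push(self, incoming, hook, during_hook=None):
        start = len(self.changesets)
        # Step 1: the incoming changegroup's metadata is written first.
        self.changesets.extend(incoming)
        # A concurrent reader may run here, before the hook has decided.
        if during_hook is not None:
            during_hook()
        # Step 2: the hook examines the new changesets via the normal read path.
        if not hook(self.changesets[start:]):
            # Rejected: roll the transaction back.
            del self.changesets[start:]
            return False
        return True


repo = ToyRepo()
leaked = []
accepted = repo.push(
    ["bad-change"],
    hook=lambda changesets: False,                 # hook rejects everything
    during_hook=lambda: leaked.extend(repo.pull()),
)
# The push was rejected and rolled back, yet the reader already has the change.
print(accepted, leaked, repo.pull())
```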
This issue is described in Section 10.3 of Bryan O'Sullivan's Mercurial book; it is also issue 1321 in the Mercurial bug tracker. The workaround that the Mercurial book describes is the one that we used for the ON gate: the repository that people push to is write-only. After the pre-transaction hooks have cleared the changegroup, another hook pushes the changegroup to a second clone repository, which developers pull from.
While this approach is functional, it's not esthetically pleasing. And there's a practical problem: the SCM infrastructure on opensolaris.org doesn't support having two repositories tied together like that. I'm sure it could be done, but administration (e.g., updating the access list) would be clumsy, and it might require giving the ON gatekeepers shell access to the opensolaris.org servers (which would not please them or the server administrators).
Fortunately, Matt Mackall has devised a fix for the race condition. The new changegroup will not be visible for pulls until it has passed the pre-transaction hooks. And if I understand correctly, the fix will not require changes to existing hooks, except for the case of Python hooks that spawn subprocesses.
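The intended behavior can be sketched with a toy model in which incoming changesets are staged where only the hook sees them, and are published in a single step once the hook accepts them. This illustrates the visibility rule only; it makes no claim about how the actual fix is implemented. `ToyRepo` and `during_hook` are hypothetical names, with the callback again standing in for a concurrent puller.

```python
class ToyRepo:
    """Toy model: pullers see only `published`; staged data stays private."""

    def __init__(self):
        self.published = []

    def pull(self):
        # Readers still take no lock; they read the published list as-is.
        return list(self.published)

    def push(self, incoming, hook, during_hook=None):
        # Stage the incoming changesets in a private copy.
        pending = self.published + list(incoming)
        # A concurrent reader running here sees only the published state.
        if during_hook is not None:
            during_hook()
        # The hook is handed the staged changesets explicitly.
        if not hook(pending[len(self.published):]):
            return False  # nothing to undo: readers never saw the staged data
        # Publish by rebinding the reference in one step.
        self.published = pending
        return True


repo = ToyRepo()
seen = []
rejected = repo.push(
    ["bad-change"],
    hook=lambda changesets: False,
    during_hook=lambda: seen.extend(repo.pull()),
)
accepted = repo.push(["good-change"], hook=lambda changesets: True)
print(rejected, seen, accepted, repo.pull())
```

In this model the hook has to be given the staged data explicitly, since it is no longer reachable through the ordinary read path.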
There are other changes that we will probably make before hosting the gate on opensolaris.org. For example, we'll probably change the SCM console (the web interface for managing repositories) so that it scales better for large numbers of committers. But getting a fix for this race condition means we'll have one less issue to deal with.