OpenSolaris DSCM Evaluation: Mercurial

A. Tool Information

Mercurial version 0.8.

B. Configuration Details

1. CPU architecture and machine specifics
kosh$ psrinfo -v
Status of virtual processor 0 as of: 03/30/2006 14:15:16
on-line since 03/09/2006 16:44:59.
The i386 processor operates at 1600 MHz,
and has an i387 compatible floating point processor.
kosh$ uname -a
SunOS kosh 5.11 snv_35 i86pc i386 i86pc
2. Memory available
Memory size: 1024 Megabytes
3. Size of the repository (space)

641MB, including working directory. This repository mirrors ON's internal Teamware workspace as of 2006-04-11 16:52:19Z and has history back to build 18 (shortly after the OpenSolaris launch).

4. Number of files in the repository

43,311.

5. Average number of deltas per file

1.3. More details: 1,884 changesets, 57,173 changes (57,173 / 43,311 = 1.3).

C. Evaluation Areas

1. Operation Functionality

(Refer to the Requirements Document for detail.)

e1. unbiased and disconnected distribution

Mercurial implements a model of independent repositories, though a repository can be configured to have a de facto parent. Mercurial supports updates between two repositories with a common ancestor.

If child-2 pulls from child-1 and has one or more changesets that aren't in child-1 (a common scenario in ON development), child-2 will have to do an "update -m", even if the changes are non-conflicting. Unlike Teamware, the merge must then be explicitly committed. (This is a feature, since it means you can test the merge before committing it, but it seems likely to cause confusion for Sun engineers. If we use Mercurial Queues for most development work, this issue probably goes away.)

The "disconnected-use" requirements are all satisfied.

e2. networked operation

As mentioned in the requirements document, Mercurial supports
remote access via

e3. interface stability and completeness

storage

The storage representation appears to be well-documented. Certainly there's more information on the Mercurial website than we've been able to find for SCCS's storage representation.

Mercurial's storage representation does not use Unix i-numbers, so snapshots such as those provided by ZFS or Network Appliance filers should not cause problems. A ZFS snapshot can be used as the source for a Mercurial clone operation.

At least some of the on-disk data structures do not appear to be versioned. This is a potential hazard. At least one storage representation change is planned: "RevlogNG", which is planned for Mercurial 0.9. This change is expected to address the versioning issue.

On-disk data structures are binary files, but I had no problems using the same repository from both SPARC and x86 systems. Binary files give improved performance, but in the unlikely event that manual repairs are needed, we'll need a binary editing program.

Mercurial does not provide its own access control mechanism for
controlling access to subtrees within a repository. While it
might be possible to restrict user access to certain subtrees
using filesystem ACLs, it would probably be better to use
various pre-operation hooks (e.g.,
command-line interface, hooks

The command-line and hook interfaces appear to be adequately documented. One nit: the current documentation appears to reflect the current development version of the code, rather than the most recent release[1]; there is nothing in the documentation to clarify what version it applies to. If OpenSolaris uses Mercurial, we may wish to place snapshots of the code and documentation on opensolaris.org to avoid confusion.

The hook infrastructure invokes the named hook(s) with a few tokens, such as the changeset ID, passed in via the environment. This means that the hook may need to invoke various Mercurial commands to find out more about the changeset.

A potential issue is that it may not always be possible to get the desired information back from the existing Mercurial command-line interfaces. For example, "hg log" gives the old and new names of a renamed file, along with the names of any other files involved in the changeset, but it can't tell you that file "foo" was renamed to "bar". Fortunately, all the known examples of this problem are considered bugs and have fixes planned.

Bryan O'Sullivan reports that the Mercurial team is also considering support for Python hooks that would run in the Mercurial process.

Lock-reentrancy does not appear to be a problem. Mercurial does not need read locks, and hooks are currently limited to only doing read operations.

At least one unexpected behavior was noted while testing:
pushing a changeset from repository A to repository B caused A's
network protocol(s)

There is some documentation on the network protocol, though it's a bit sketchy. The protocol is versioned.

e4. standard operations and transactions

Mercurial operates on files. Renaming or removing a directory is translated into a rename or remove of the directory's contents.

Rename is implemented as copy and delete. More work is needed here: merges don't track renames, and rename conflicts are not detected.

Deleting a file and creating a new file with the same path is supported. The new instance includes the history of the old instance.

Mercurial supports the scenario where a file is deleted in one workspace, the deletion is backed out in the "gate" repository, and the file is edited concurrently in a second workspace. The backout was done via

$ hg revert -r lastrev

where lastrev refers to the last changeset before the deletion. The merge in the second workspace was a little messy: after the "update -m", the working copy of the file had the old and new versions of the file concatenated together.

Deleted files can still be referenced using the path and the -r option, for example

$ hg cat -r rev file

e5. per-changeset metadata

Mercurial associates a text comment with each changeset; this
is added as part of the

c6. ease of use

Although I haven't built Mercurial from source, Bryan O'Sullivan reports:

"It builds out of the box on Solaris 10, once the system Python's Makefile is fixed to use gcc."

The primary interface is the

The model is straightforward: you

Mercurial offers subcommands specifically for generating and accepting source patches.

Mercurial supplies an HTTP server, as well. This can be used for browsing and for pulls over HTTP.

Support for backouts: the

By default, "hg status" lists files that aren't
tracked in the repository (e.g., compiled binaries, editor
backup files). While this is not peculiar to Mercurial, it's a
change from Teamware, and it will generate an impossible level
of noise in most real-life scenarios with ON.
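
To illustrate the kind of filtering that would be needed, here is a minimal Python sketch that post-processes "hg status" output and drops the entries for untracked files. It assumes the one-letter status prefix format, with "?" marking an untracked file. The function name and the sample file names other than usr/src/cmd/sort/Makefile.com are hypothetical, and this is not one of Mercurial's own filtering mechanisms.

```python
def tracked_changes(status_output):
    """Return the status lines that refer to tracked files only.

    Assumes each line looks like "<code> <path>", where the code "?"
    means the file is not tracked in the repository.
    """
    return [
        line
        for line in status_output.splitlines()
        if line and not line.startswith("? ")
    ]


# Hypothetical "hg status" output after a build in the working directory.
sample = (
    "M usr/src/cmd/sort/Makefile.com\n"
    "? usr/src/cmd/sort/sort.o\n"
    "? usr/src/cmd/sort/Makefile.com~\n"
    "A usr/src/cmd/pwd/newfile.c\n"
)

for line in tracked_changes(sample):
    print(line)
```

A wrapper like this only hides the noise after the fact; the full scan of the working directory still happens.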
While doing
"dmake clobber" will reduce the noise considerably[2], that is inconvenient for a tree the
size of ON. Mercurial does provide mechanisms to filter out
noise (e.g.,

Files should be imported with read-write permission. Mercurial keeps track of the permissions, and it complains if you try to update a read-only file (e.g., after a push or pull).

merging

Mercurial's default for resolving conflicts is the
If

While first experimenting with Mercurial, I found it very easy to get my repository into a state where it would keep complaining about "outstanding uncommitted changes", but it was hard to figure out how to get out of that state. (Answer: use "hg update -C".) This probably needs to go into a FAQ.

mismerges

The current version of

First, if

Second, the code for invoking the editor and determining whether the conflict was resolved is a bit brittle[3]. This may just be a bug that needs fixing. But we may also want a more explicit "yes I have resolved the conflicts" action from the user (which is something Subversion does).

There is at least one open issue in the Mercurial bug database related to failed merges. Resolving this issue may address the brittleness problem mentioned above.

intermediary snapshots

The current ON convention is that putbacks should not introduce
SCCS deltas for intermediary snapshots or Teamware merges. This
is achieved by using the

c7. no-dedicated-server mode

Mercurial can run without any server daemons.

c8. tool community health

Mercurial has an active developer community. At least one developer (Bryan O'Sullivan) has helped with our evaluation of Mercurial, and he is interested in helping to address issues that we have run into so far (e.g., rename).

There have been a few Solaris-specific problems with Mercurial.
The Mercurial developers have been quick to respond. And in at
least one case they installed Solaris so that they could
troubleshoot the issue. Also, the developers have paid
attention to larger issues, rather than applying a steady stream
of band-aids. For example, to address the recurring
shell-compatibility problems with

c9. OpenSolaris community expertise

Mercurial is almost entirely written in Python;
c10. interface extensibility

Mercurial has a hooks mechanism as well as a documented extensions mechanism. Some hooks can abort the current operation.

c11. transactional operations and corruption recovery

Mercurial's state files are updated by appending, so corrupted files can be repaired by rolling back to a consistent set of files.

Signal handling (e.g.,

I simulated a crash using
There also appear to be a couple of open issues related to locking[4].

c12. content generality

Mercurial supports binary files as well as text. However, the merge code appears to assume text files. We'll need to think about how to handle binary files.

o13. partial trees

Not currently supported, though there have been discussions about adding support for it.

o14. per-file histories

Changesets are for the entire repository, not per-file. But you can get a per-file history by specifying the file name with "hg log".

2. Storage

We didn't have any problems running out of storage or swap. No storage spikes were observed.

One thing that deserves further investigation is the storage consumed by the conflict tests in the test harness that we used. The first test introduces a content conflict in usr/src/cmd/sort/Makefile.com. The second test introduces a couple of rename conflicts in usr/src/cmd/pwd. This led to a size increase of 2.3MB in the test repository, which seems excessive.

3. Performance

A local clone of the OpenSolaris ON 20060222 tree takes a couple of minutes on the above hardware using ZFS; about twice that on UFS.

Using the repo with history, performance looks like this (times are in mm:ss):
D. Changes/Features Required/Desired

Must Have Initially
Want Eventually
Notes

[1] For example, the 2006-03-22 version of the

[2] Besides leaving editor backup files, our clobber builds leave some generated files behind.

[3] The relevant code is
$EDITOR "$LOCAL" "$LOCAL.rej" && test -s "$LOCAL.rej" || exit 0
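
As a rough aid to reading the line above, the sketch below models its control flow in Python. It assumes standard POSIX shell semantics, where "&&" and "||" have equal precedence and group left to right, so the line parses as (editor && test -s reject-file) || exit 0. The function name and the two boolean parameters are hypothetical stand-ins for the editor's exit status and for whether "$LOCAL.rej" exists with non-zero size.

```python
def exits_zero_immediately(editor_succeeded, rej_nonempty):
    """Model `EDITOR ... && test -s REJ || exit 0` as a boolean expression.

    Returns True when the shell would run `exit 0` at this line, and False
    when execution falls through to whatever follows in the script.
    """
    # `A && B || C` groups as `(A && B) || C`, so C runs unless both
    # A (the editor) and B (the size test) succeed.
    return not (editor_succeeded and rej_nonempty)


# Enumerate all four combinations of the two conditions.
for editor_succeeded in (True, False):
    for rej_nonempty in (True, False):
        print(editor_succeeded, rej_nonempty,
              exits_zero_immediately(editor_succeeded, rej_nonempty))
```

Under this reading, "exit 0" is reached whenever the editor returns a non-zero status, regardless of the state of the reject file.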
If
[4] issue132 "hg should revalidate its data after locking the repo" and issue154 "race between undo and all readers"

History