OpenSolaris


Project Requirements Specification
NFSv4.0 client referrals


Author

Evan Layton

Alok Aggarwal



Executive Summary

NFSv4 allows the file namespace to extend beyond the boundaries of a single server. This is done with a mechanism known as referrals. When a client crosses a namespace boundary at a server, the server refers the client to another server via a specific error and the fs_locations attribute. The client connects to the new server and resumes its operations.
The specification of file system location provides a means by which file systems located on one server can be associated with a name space defined by another server, thus allowing a general multi-server namespace facility.
Multi-server namespaces can provide many advantages by separating a file system's logical position in a name space from the (possibly changing) logistical and administrative considerations that result in particular file systems being located on particular servers.
This project implements the client-side capabilities for processing a referral. That is, the Solaris NFSv4 client will be able to process a referral returned from the server. This project does not add the capability for the server to return a referral.

Project Requirements

  1. Automatic mounting of referral points
    • Whenever the client traverses a referral point in the server namespace,
      the client shall automatically mount the target of that referral
      (subject to the triggering rules listed below) and in exactly the same place in the client namespace as where the referral was discovered.
    • The referral mount will be processed by the client in such a fashion that it is transparent to the user. No special configuration will be required on the client to enable this behavior.
    • As part of that mount an entry will be placed in mnttab as would normally be expected for a mount.


  2. Triggering Actions
    • Any action on a target directory resulting in a vnode operation of either VOP_LOOKUP(target) or VOP_GETATTR(target) is not a triggering action.
    • Any action on a target directory resulting in any other vnode operation is a triggering action.
    • Any action on a parent directory containing the target directory resulting in a vnode operation of VOP_READDIR(parent) is not a triggering action.
  3. Multiple referrals (referral is referred...)
    • The client must be able to handle getting the moved error while processing a previous referral.
    • A referral encountered during any mount (initial or otherwise) will result in the mount of the referred-to server.
    • If a referral points at a previously processed/mounted referral in a "chain" or set of nested referrals, both the referral and the mount will fail.
  4. List of referrals (multiple locations & possible failover)
    • For the first phase of referrals we will attempt to mount the first server/resource in the list returned in fs_locations.
    • If the first server in the list returned in fs_locations is not available we will iterate through that list until we find one that is available. If none are available the mount will fail.
  5. Host/node name and IP address resolution
    • When a host name is returned in fs_locations it will be resolved to an IP address.
    • We will also correctly handle a host's IP address (IPv4 or IPv6) when it is returned in fs_locations.
  6. Hierarchical unmounting
    • When unmounting the top of a tree/hierarchy of referrals all referrals under this must also be unmounted.
    • When autofs mounts are unmounted any referrals mounted within must also be automatically unmounted.
    • Referral mounts under a Mirror Mount or Mirror Mounts under Referral mounts must also be automatically unmounted when a mount above them is unmounted.
  7. Inherited mount properties
    • Mount properties specified by the client in the original mount should be used for the referral.
    • The client will always attempt to inherit the security flavor from the parent mount; however, if this does not match that of the referred-to server's file system, it will attempt to re-negotiate the security flavor. If the re-negotiation fails, the mount will also fail.
    • In the case where the security flavor inherited from the parent mount does not match that of the referred-to server's file system and the re-negotiation fails, the client will iterate through the list of servers returned in fs_locations. If the client is not able to agree on a security flavor with any of the servers in the list, the mount will fail.
  8. Replication/migration detection
    • The client will be able to distinguish between a referral event and a replication/migration event. A replication/migration event will not be processed by the client, and the failure mode will be the same as the one that currently exists in Solaris 10.
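
The location-list and security-flavor requirements above can be sketched as a single loop. This is only an illustration, not the Solaris implementation: the function name select_location, the (server, path) tuple shape, the result codes, and the attempt and renegotiate callbacks are all assumptions introduced here.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical shape of one fs_locations entry: (server, root path).
Location = Tuple[str, str]

# Hypothetical result codes for one mount attempt.
OK, UNAVAILABLE, WRONGSEC = "ok", "unavailable", "wrongsec"


def select_location(
    locations: Iterable[Location],
    attempt: Callable[[str, str, str], str],
    inherited_flavor: str,
    renegotiate: Callable[[str, str], Optional[str]],
) -> Optional[Location]:
    """Return the first location that mounts, or None if all of them fail."""
    for server, path in locations:
        # Always start with the security flavor inherited from the parent mount.
        result = attempt(server, path, inherited_flavor)
        if result == WRONGSEC:
            # The flavor does not match: try to agree on another one.
            flavor = renegotiate(server, path)
            if flavor is not None:
                result = attempt(server, path, flavor)
        if result == OK:
            return (server, path)
        # Unavailable, or no flavor agreed: move on to the next entry.
    return None
```

A return value of None corresponds to the case where the mount fails because no server in the list is available or no security flavor can be agreed upon.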
Other:
Out of scope/future work
This project will not be implementing v4.1 fs_locations_info.
Enabling of client side fs_locations based replication and migration will not be done as part of this project.
Enabling of server side referrals and fs_locations based replication and migration will not be done as part of this project.
Administration of server side referrals and fs_locations based replication and migration will not be done as part of this project.

Impact

This project will require changes to the automounter with respect to the unmounting of referrals that have been mounted under an autofs mounted file system.
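
To illustrate the hierarchical unmounting requirement that drives this change, the sketch below computes an order in which mounts under a given top could be unmounted. It assumes that deeper mounts must go before their parents (a mount point with another mount beneath it is normally busy); the function name unmount_order and the use of plain path strings are hypothetical, and nothing here reflects the actual automounter code.

```python
import posixpath
from typing import Iterable, List


def _depth(path: str) -> int:
    """Number of path components, so "/" is 0 and "/mnt/a" is 2."""
    return len([part for part in path.split("/") if part])


def unmount_order(mount_points: Iterable[str], top: str) -> List[str]:
    """List `top` and every mount beneath it, deepest first."""
    top = posixpath.normpath(top)
    prefix = top.rstrip("/") + "/"
    normalized = [posixpath.normpath(m) for m in mount_points]
    # Keep only the top itself and the mounts that live below it.
    selected = [m for m in normalized if m == top or m.startswith(prefix)]
    # Deepest mounts first, so children are unmounted before parents.
    return sorted(selected, key=_depth, reverse=True)
```

For example, with mounts at /mnt, /mnt/a and /mnt/a/b, unmounting /mnt would visit /mnt/a/b, then /mnt/a, then /mnt, and would leave an unrelated mount such as /mnt2 alone.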

Dependencies

This project is closely linked with the NFSv4 Mirror Mounts project, and will share its implementation.

Document History and Approvals

1.7 - 3/22/2007
Added comments from Bill Baker as well as from the Barker Meeting Review

1.6 - 3/15/2007
Formal approval draft including Rich Brown's comments
1.5 - 3/14/2007
NFS team draft
1.3 - 3/08/2007
I-team draft

Appendix

This line left intentionally blank...

Background Material



NFSv4.1 draft document

http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-10.txt

Differences between a referral event and a replication/migration event

A referral event occurs upon the "first access" to a server filesystem. For example, a client looks up an object in the server namespace for the very first time but is told that that object is located on another server. The client is subsequently referred to the server filesystem that contains that object. In such a case the client is said to have encountered a referral event.

If the client traverses into a server filesystem and finds that objects which once existed there (a fact established by having accessed them previously) are no longer present at that server but are present at an alternate server or set of servers, it is said to have encountered a migration/replication event.

A case of a migration/replication event is one in which the client accesses a file/directory a number of times before being told by the server that the file/directory in question has been migrated over to a different server.

As outlined above, a referral event will be handled by this project whereas a migration/replication event will not be.
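
The distinction can be sketched as a check on whether the client has touched the filesystem before. This is a simplified model under stated assumptions: the (server, filesystem id) key, the previously_accessed set, and the names classify_moved and is_handled are hypothetical and are not taken from the Solaris client.

```python
from enum import Enum
from typing import Set, Tuple

# Hypothetical key identifying a server filesystem: (server, filesystem id).
FsKey = Tuple[str, str]


class MovedEvent(Enum):
    REFERRAL = "referral"
    MIGRATION_OR_REPLICATION = "migration/replication"


def classify_moved(fs: FsKey, previously_accessed: Set[FsKey]) -> MovedEvent:
    """Classify a 'moved' indication for the filesystem `fs`."""
    if fs in previously_accessed:
        # Objects were reachable here before and are now elsewhere.
        return MovedEvent.MIGRATION_OR_REPLICATION
    # First access to this server filesystem.
    return MovedEvent.REFERRAL


def is_handled(event: MovedEvent) -> bool:
    """Only referral events are processed by this project."""
    return event is MovedEvent.REFERRAL
```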

Triggering actions

The triggering actions on a target directory that will result in a referral are defined in terms of their resultant vnode operations.
From an API perspective: with the single exception of stat(2), all filesystem calls involving the target directory will trigger a mount. However, a readdir(3)/getdents(2) of the parent directory (/parent) enclosing the target (/parent/target) will not trigger a mount.
For example:
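
A minimal sketch of the rule as a predicate, written in Python for illustration only. The function name is_triggering and the "target"/"parent" argument are hypothetical. Treating every parent-directory operation as non-triggering is an assumption; the requirements only state this explicitly for VOP_READDIR(parent).

```python
# Vnode operations on the target that do NOT trigger a mount (from the
# Triggering Actions requirement).
NON_TRIGGERING_ON_TARGET = frozenset({"VOP_LOOKUP", "VOP_GETATTR"})


def is_triggering(vop: str, directory: str) -> bool:
    """Return True if `vop` applied to `directory` should trigger a mount.

    `directory` is "target" for the referral point itself, or "parent"
    for the directory that contains it.
    """
    if directory == "parent":
        # Assumption: operations on the parent, such as
        # VOP_READDIR(parent), never trigger a mount of the target.
        return False
    if directory != "target":
        raise ValueError("directory must be 'target' or 'parent'")
    # Every operation on the target other than lookup/getattr triggers.
    return vop not in NON_TRIGGERING_ON_TARGET
```

Under this model a stat(2) of /parent/target maps to VOP_GETATTR(target) and does not mount, while reading the contents of /parent/target maps to VOP_READDIR(target) and does.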

Nested referrals

Automounter comparison: the automounter will only automatically mount nested mounts when encountered under /net.
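
The multiple-referrals requirement says that a referral pointing back at a previously processed referral in a chain must fail. One way to picture that check is the sketch below, which follows a chain and rejects any repeat. The names resolve_chain and ReferralLoopError, and the next_referral callback that returns the next location or None, are assumptions made for this example.

```python
from typing import Callable, List, Optional


class ReferralLoopError(Exception):
    """Raised when a referral points at a location already in the chain."""


def resolve_chain(
    start: str,
    next_referral: Callable[[str], Optional[str]],
) -> List[str]:
    """Follow referrals from `start` and return every location visited.

    `next_referral(loc)` returns the location `loc` refers to, or None
    when `loc` holds the real filesystem and the chain ends.
    """
    chain = [start]
    current = start
    while True:
        target = next_referral(current)
        if target is None:
            return chain
        if target in chain:
            # The referral points at something already processed: fail.
            raise ReferralLoopError(f"{current} refers back to {target}")
        chain.append(target)
        current = target
```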

Browsing

Referrals will enable a "browsing" feature similar, but not identical, to that of the automounter, using the same mechanism employed by NFSv4 mirror mounts.
Existing automount browsing behaviour
When the automounter browsing option is enabled for indirect maps, it is possible to see the existence of automount trigger points before they are mounted:
estale $ ls -ld /home/alice
dr-xr-xr-x   1 root     root           1 Oct 18 12:36 /home/alice
estale $ mount | grep alice
estale $
Note that the attributes of the directory are generated by the client, and do not match reality on the server. The directory is given mode 0555, with root ownership, and the modification time is the current time. If the directory is mounted - e.g. by changing into the directory - the automounter completes the mount, and the real directory attributes are seen:
estale $ ls -ld /home/alice
drwxr-xr-x  79 alice    pawns        20480 Oct 18 13:19 /home/alice
(as well as its contents).
Note that the automounter "/net" feature is a special case, where the automounter will automatically mount any server filesystems it traverses. The functionality proposed here is similar but includes being referred to another server, by using solely NFSv4 mechanisms, with no involvement of the automounter. In addition, of course, it is not tied to a particular trigger-point (/net).
Existing NFSv4 server namespace browsing
In the absence of any client automount map, the existing NFSv4 server implementation in Solaris still presents the entire server namespace to the client; i.e., server mount points are (in effect) visible to the client before the client has mounted them, even if the server mount points themselves are on a server filesystem that is not shared:
NFSv4-server # share
-               /dum   rw=pawns   ""
-               /dee   rw=pawns   ""

# note that the server does not share "/", yet we may mount it
NFSv4-client # mount NFSv4-server:/ /mnt
NFSv4-client # ls -l /mnt
total 4
drwxr-xr-x   3 alice    pawns        512 Oct 18 15:01 dee
drwxr-xr-x  37 root     sys         1024 Oct 18 14:50 dum
This continues to provide the useful browsing feature, previously available via the automounter, without imposing the overhead of a mount, which may be important in the presence of many server filesystems e.g. when using ZFS.
Note that the attributes of the file systems are not presented correctly. Because the new server has not been contacted yet, we don't have access to the correct attributes. What will be presented are the attributes of the referral point, not those of the file system on the new server.
However, the contents of the server's filesystems cannot be seen:
NFSv4-server # ls -al /dee
total 20
drwxr-xr-x   3 alice    pawns        512 Oct 18 15:01 .
drwxr-xr-x  31 root     root        1024 Oct 18 15:01 ..
drwx------   2 alice    pawns       8192 Oct 18 14:53 lost+found
-rw-r--r--   1 alice    pawns          0 Oct 18 14:58 this_file_is_in_slash_dee

NFSv4-client # ls -al /mnt/dee
total 4
drwxr-xr-x   3 alice    pawns        512 Oct 18 15:01 .
drwxr-xr-x  31 root     root        1024 Oct 18 15:01 ..
The proposed referral functionality would cause a real NFSv4 mount to occur when the client crosses into the new filesystem on the referred-to server by accessing /mnt/dee. This functionality is the same as mirror mounts, with the additional caveat that NFSv4-server is the referred-to server and the original lookup went to another server, which referred us to NFSv4-server.