Chapter 1. Understanding ONC3/NFS

This chapter introduces the SGI implementation of the Sun Microsystems Open Network Computing Plus (ONC+) distributed services, which was previously referred to as Network File System (NFS). In this guide, NFS refers to the distributed network file system in ONC3/NFS.

The information in this chapter is prerequisite to successful ONC3/NFS administration. It defines ONC3/NFS and its relationship to other network software, introduces the ONC3/NFS vocabulary, and identifies the software elements that support ONC3/NFS operation. It also explains special utilities and implementation features of ONC3/NFS.

Overview of ONC3/NFS

ONC3/NFS is the SGI implementation of ONC+ distributed services. ONC3/NFS is optimized for SGI systems and is integrated with the IRIX Interactive Desktop environment and system toolchest. ONC3/NFS runs only on SGI systems.

ONC3/NFS is made up of distributed services that allow users to access file systems and directories on remote systems and treat them as if they were local. Networks with heterogeneous architectures and operating systems can participate in the same ONC3/NFS service. The service can also include systems connected to different types of networks.

ONC3/NFS is a separate software product and must be installed on both the server and the client. Before setting up or modifying the ONC3/NFS environment, be sure you are familiar with the information in this chapter.

ONC3/NFS Components

This section summarizes the components of ONC3/NFS; the sections that follow describe each component in more detail.

NFS 

The Network File System (NFS) is the distributed network file system in ONC3/NFS and contains the automatic mounters and lock manager. ONC3/NFS supports NFS version 3 (NFS3) and NFS version 2, but uses NFS3 by default. NFS is multi-threaded to take advantage of multiprocessor performance. For more about NFS, see “About NFS”.

NIS 

The Network Information Service (NIS) is a database of network entity location information that can be used by NFS. NIS is implemented as part of the Unified Name Service (UNS). Information about NIS and UNS is published in a separate volume called the NIS Administration Guide. For more about the interaction of NFS with NIS, see “About NFS and the Network Information Service”.

AutoFS 

The AutoFS file system (AutoFS), introduced in IRIX 6.2, is an implementation of the automatic mounter that uses the autofs command instead of automount. Like automount, AutoFS provides automatic and transparent NFS mounts upon access of specified AutoFS file systems. AutoFS differs from automount mainly in providing multi-threaded service, in-place mounts, and use of the LoFS (loopback file system) to access local file systems. Because it is multi-threaded, AutoFS accepts dynamic configuration updates, and its access cannot be blocked by a server that is down or responding slowly: one thread may block, but this does not prevent other references through AutoFS from completing. autofs and automount cannot exist on the same system. By default, autofs is enabled upon installation, although automount can be selected instead with chkconfig. For more about AutoFS, see “About The AutoFS File System”.

CacheFS 

The Cache File System (CacheFS), introduced in IRIX 5.3, provides client-side caching for NFS and other file system types. Using CacheFS on NFS clients with local disk space can significantly increase the number of clients a server can support and reduce the data access time for clients using read-only file systems. For more about CacheFS, see “About CacheFS File System”.

Bulk Data Service

The SGI implementation of the Bulk Data Service protocol, BDSpro, is available as an option for NFS. BDSpro is an extension to NFS for handling large file transactions over high-speed networks. BDSpro exploits the data access speed of the XFS filesystem and the data transfer rates of network media, such as HIPPI and Fibre Channel, to accelerate standard NFS performance. The BDS protocol modifies NFS functions to reduce the time needed to transfer files of 100 megabytes or larger over a network connection. For more information about BDSpro, refer to Getting Started With BDSpro.

About NFS

NFS is a network service that allows users to access file hierarchies across a network in such a way that they appear to be local. File hierarchies can be entire file systems or individual directories. Systems participating in the NFS service can be heterogeneous. They may be manufactured by different vendors, use different operating systems, and be connected to networks with different architectures. These differences are transparent to the NFS application.

NFS is an application layer service that can be used on a network running the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP). UDP has traditionally been used as the transport layer protocol; it supports connectionless transmission and is stateless, making the service robust. TCP supports connection-based transmission, which is beneficial in WAN configurations. TCP provides high reliability, and its sophisticated packet tracking scheme reduces client and server input buffer overflow and multiple packet resends.

NFS relies on remote procedure calls (RPC) for session layer services and external data representation (XDR) for presentation layer services. XDR is a library of routines that translate data formats between processes.

Figure 1-1 illustrates the NFS software implementation in the context of the Open Systems Interconnect (OSI) model.

Figure 1-1. NFS Software Implementation

NFS and Diskless Workstations

It is possible to set up a system so that all the required software, including the operating system, is supplied from remote systems by means of the NFS service. Workstations operating in this manner are considered diskless workstations, even though they may be equipped with a local disk.

Instructions for implementing diskless workstations are given in the Diskless Workstation Administration Guide. However, it is important to acquire a working knowledge of NFS before setting up a diskless system.

About NFS and the Network Information Service

The Network Information Service (NIS) is a database service that provides location information about network entities to other network servers and applications, such as NFS. NFS and NIS are independent services that may or may not be operating together on a given network. On networks running NIS, NFS may use the NIS databases to locate systems when NIS queries are specified.

About UNS and NIS

The Unified Name Service (UNS) provides a system-wide interface to hostname, password, and many other lookups. It controls the resolution of hostnames used by AutoFS and automount. Both AutoFS and automount bypass UNS when using information from NIS.

About The AutoFS File System

AutoFS is the kernel virtual file system that supports automatic mounting of file systems. Together with the implementation of autofsd (the autofs daemon), AutoFS solves several fundamental problems with the earlier implementation of the automount daemon:

  • The symbolic links and the /tmp_mnt prepended to paths are replaced by in-place mounting.

  • AutoFS is file system independent.

By default, AutoFS tries NFS version 3 first, and if the server does not support version 3, AutoFS retries the mount using NFS2.

Without symbolic links, indirection to mount points is performed entirely within the kernel, improving performance. The autofsd daemon is stateless and is responsible for performing automatic mounts and unmounts. It allows mount points to be added or deleted without rebooting. The daemon is not required to access a file system once it is mounted.

In addition, autofsd can mount file systems other than NFS, such as removable-media file systems. These improvements are compatible with previously existing maps and administrative procedures.

Simplified autofs Operation

The autofs daemon, autofsd, starts at boot time from the /etc/init.d/network script, since by default autofs and nfs are chkconfig'ed on. The /etc/init.d/network script also runs the autofs command, which reads a master map and installs AutoFS mount points.
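
For example, you can verify and set the relevant flags with chkconfig (a minimal sketch; run as superuser):

    /etc/chkconfig | egrep 'autofs|automount|nfs'   # list current flag settings
    /etc/chkconfig autofs on                        # select the autofs mounter
    /etc/chkconfig automount off                    # deselect the older automount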

Unlike mount, autofs does not read the file /etc/fstab, which is specific to each workstation, for a list of file systems to mount. Rather, autofs is controlled within a domain (and on particular workstations) through the maps, saving a great deal of administrator time.

How Autofs Navigates Through the Network (Maps)

Autofs searches a series of maps to navigate its way through the network. Maps are files that contain information mapping local directories or mount points to remote server file systems. A special map, -hosts, is supported by AutoFS to provide a convenient way of accessing all host machines on the network. Maps are available locally or through a network name service like NIS or NIS+. You create maps to meet the needs of your users' environment. See “NFS Automatic Mounting”, and Chapter 3, “Using Automatic Mounter Map Options” for detailed information on automatic mounting and its maps.
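
For illustration, a master map and an indirect map might contain entries like these (the file names and entries are hypothetical; maps follow the standard key-to-location format described in Chapter 3):

    # Master map: mount point, map name [, mount options]
    /hosts   -hosts              # special map: reach any server by hostname
    /n       /etc/auto.n   -ro   # indirect map for mount points under /n

    # Indirect map /etc/auto.n: key  server:directory
    demos    server1:/usr/demos  # access /n/demos to mount server1:/usr/demos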

About CacheFS File System

A cache is a temporary storage area for data. With the cache file system (CacheFS), you can store frequently used data from a remote file system or CD-ROM on the local disk drive of a workstation. The data stored on the local disk is the cache.

When a file system is cached, data read from the original file system is stored in the cache on the local disk, and subsequent requests for the same data are satisfied from the cache rather than over the network. The reduction in network traffic improves performance. If the remote file system is on a storage medium with slower response time than the local disk (such as a CD-ROM), caching provides an additional performance gain.

CacheFS can use all or part of a local disk to store data from one or more remote file systems. A user accessing a file does not need to know whether the file is stored in a cache or is being read from the original file system. The user opens, reads, and writes files as usual.

A cache with default parameters can be created with the mount command. Default parameters can be changed with the cfsadmin command. See “Cached File System Administration” and “Cache Resource Parameters in CacheFS”. Specific details of CacheFS are discussed in “Planning a CacheFS File System” in Chapter 2.
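
For example, the following creates and uses a cache (a sketch; the server name, cache directory, and parameter value are examples):

    # Mount an NFS filesystem through CacheFS; a cache directory with
    # default parameters is created if one does not already exist:
    mount -t cachefs -o backfstype=nfs,cachedir=/local/cache \
          server1:/usr/share/man /usr/share/man

    # Later, change a cache resource parameter with cfsadmin:
    cfsadmin -u -o maxblocks=80 /local/cache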

Client-Server Fundamentals

In an NFS transaction, the workstation requesting access to remote directories is known as the client. The workstation providing access to its local directories is known as the server. A workstation can function as a client and a server simultaneously. It can allow remote access to its local file systems while accessing remote directories with NFS. The client-server relationship is established by two complementary processes, exporting and mounting.

Exporting NFS File Systems

Exporting is the process by which an NFS server provides access to its file resources to remote clients. Individual directories, as well as file systems, can be exported, but exported entities are usually referred to as file systems. Exporting is done either during the server's boot sequence or from a command line as superuser while the server is running.

Once a file system is exported, any authorized client can use it. A list of exported file systems, client authorizations, and other export options are specified in the /etc/exports file (see “Operation of /etc/exports and Other Export Files” in Chapter 2 for details). Exported file systems are removed from NFS service by a process known as unexporting.
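
For example, export entries and commands have this general form (the hostnames and path are illustrative; see exports(4) for the full option syntax):

    # /etc/exports: exported file system followed by its options
    /usr/demos   -ro,access=client1:client2   # read-only, two authorized clients

    # Export everything listed in /etc/exports, then unexport one entry:
    exportfs -a
    exportfs -u /usr/demos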

A server can export any file system or directory that is local. However, it cannot export both a parent and child directory within the same file system; to do so is redundant.

For example, assume that the file system /usr contains the directory /usr/demos. As the child of /usr, /usr/demos is automatically exported with /usr. For this reason, attempting to export both /usr and /usr/demos generates an error message that the parent directory is already exported. If /usr and /usr/demos were separate file systems, this example would be valid.

When exporting hierarchically related file systems such as /usr and /usr/demos in the previous example, we recommend the use of the -nohide option to reduce the number of mounts required by clients (see exports(4)).
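
If /usr and /usr/demos were separate file systems, entries such as these (illustrative) would let clients that mount /usr also reach /usr/demos without a second explicit mount:

    /usr         -ro,access=client1
    /usr/demos   -ro,access=client1,nohide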

Mounting NFS File Systems

Mounting is the process by which file systems, including NFS file systems, are made available to the IRIX operating system and, consequently, to the user. When NFS file systems or directories are mounted, they are made available to the client over the network by a series of remote procedure calls that enable the client to access the file system transparently from the server's disk. Mounted NFS directories or file systems are not physically present on the client system, but the mount looks like a local mount and users enter commands as if the file systems were local.

NFS clients can have directories mounted from several servers simultaneously. Mounting can be done as part of the client's boot sequence; automatically at file system access, with the help of a user-level daemon; or with a superuser command after the client is running. When mounted directories are no longer needed, they can be relinquished in a process known as unmounting.

Like locally mounted file systems, NFS mounted file systems and directories can be specified in the /etc/fstab file (see “Operation of /etc/fstab and Other Mount Files” in Chapter 2 for details). Since NFS file systems are located on remote systems, specifications for NFS mounted resources must include the name of the system where they reside.
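
For example, an NFS entry in /etc/fstab has this general form (the server name and paths are examples; see fstab(4)):

    # file system        mount point  type  options  dump  pass
    server1:/usr/demos   /n/demos     nfs   ro,bg    0     0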

NFS Mount Points

The access point in the client file system where an NFS directory is attached is known as a mount point. A mount point is specified by a conventional IRIX pathname.

Figure 1-2 illustrates the effect of mounting directories onto mount points on an NFS client.

Figure 1-2. Sample Mounted Directory

The pathname of a file system on a server can be different from its mount point on the client. For example, in Figure 1-2 the file system /usr/demos is mounted in the client's file system at mount point /n/demos. Users on the client gain access to the mounted directory with a conventional cd command to /n/demos, as if the directory were local.

NFS Mount Restrictions

NFS does not permit multihopping, mounting a directory that is itself NFS mounted on the server. For example, if host1 mounts /usr/demos from host2, host3 cannot mount /usr/demos from host1. This would constitute a multihop.

NFS also does not permit loopback mounting, mounting a directory that is local to the client via NFS. For example, the local file system /usr on host1 cannot be NFS mounted to host1; this would constitute a loopback mount.

NFS Automatic Mounting

As an alternative to standard mounting via /etc/fstab or the mount command, NFS provides two automatic mounting utilities: the original automatic mounter, called automount, and a newer implementation introduced in IRIX 6.2, called autofs. Both automatic mounters dynamically mount file systems when they are referenced by any user on the client system, then unmount them after a specified time interval. Unlike standard mounting, automount and autofs, once set up, do not require superuser privileges to mount a remote directory. They also create the mount points needed to access the mounted resource. NFS servers cannot distinguish between directories mounted by the automatic mounters and those mounted by conventional mount procedures. autofs and automount cannot coexist on the same system.

Unlike the standard mount process, automount and autofs do not read the /etc/fstab file for mount specifications. Instead, they read alternative files (either local or through NIS) known as maps for mounting information (see “Operation of Automatic Mounter Files and Maps” for details). They also provide special maps for accessing remote systems, and they automatically reflect changes in the /etc/hosts file and in remote servers' /etc/exports files.
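
For example, with the special -hosts map mounted at /hosts (the conventional mount point, though this is configurable in the options file), referencing a path under a server's name mounts that server's exported file systems on demand:

    cd /hosts/server1/usr/demos   # triggers an automatic mount from server1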

Default configuration information for automatic mounting is contained in the files /etc/config/automount.options (for automount) and /etc/config/autofs.options (for autofs). These files can be modified to use different options and more sophisticated maps.

UDP Stateless Protocol

When NFS is used with UDP as its transport protocol, it uses a stateless protocol in which the server maintains almost no information on NFS processes. This statelessness insulates clients and servers from the effects of failures. If a server fails, the only effect on clients is that NFS data on the server is unavailable to them. If a client fails, server performance is not affected.

Clients are independently responsible for completing NFS transactions if the server or network fails. By default, when a failure occurs, NFS clients continue attempting to complete the NFS operation until the server or network recovers. To the client, the failure can appear as slow performance on the part of the server. Client applications continue retransmitting until service is restored and their NFS operations can be completed. If a client fails, no action is needed by the server or its administrator in order for the server to continue operation.

TCP Connections for NFS

The TCP protocol transport option for NFS provides a highly efficient method for transmitting packets, especially in large WANs. With the TCP protocol, a connection is made between the client and the server, and all packets are labeled and tracked. Even though this tracking is more CPU-intensive, input buffer overflow and multiple packet resends due to lost packets and timeouts are handled much more efficiently than with UDP.
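
A client might request TCP for a particular mount with an fstab entry like the following (a sketch; the proto= option name is an assumption here, so verify it against fstab(4) and mount(1M) for your release):

    server1:/usr/demos   /n/demos   nfs   proto=tcp,rw,bg   0   0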

NFS Input/Output Management

In NFS2 transactions, data input and output are asynchronous, using read-ahead and write-behind, unless otherwise specified. As the server receives data, it notifies the client that the data was successfully written. The client responds by freeing the blocks of NFS data successfully transmitted to the server. In reality, however, the server might not write the data to disk before notifying the client, a technique called delayed writes. Writes are done when they are convenient for the server, but at least every 30 seconds. NFS2 uses delayed writes by default.

With synchronous writes, the server writes the data to disk before notifying the client that it has been written. Synchronous writes are supported as an option in NFS2 (see “/etc/exports Options” in Chapter 2 for details of NFS options), and in NFS3. Synchronous writes may slow NFS performance due to the time required for disk access, but increase data integrity in the event of system or network failure.
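
For example, a server can force synchronous writes for an exported file system with the wsync option in /etc/exports (the path is illustrative; see exports(4)):

    /export/data   -rw,wsync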

NFS File Locking Service

To help manage file access conflicts and protect NFS sessions during failures, NFS offers a file and record locking service called the network lock manager. The network lock manager is a separate service NFS makes available to user applications. To use the locking service, applications must make calls to standard IRIX lock routines (see the reference pages fcntl(2), flock(3B), and lockf(3C)). For NFS files, these calls are sent to the network lock manager process (see lockd(1M)) on the server.
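
The following minimal C sketch shows such a call (the file path is an example); when the file is NFS mounted, the kernel forwards the lock request to the network lock manager:

    /* Request an advisory write lock on a file with fcntl(2). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        struct flock fl;
        int fd = open("/n/demos/data", O_RDWR);  /* example NFS-mounted file */

        if (fd < 0) {
            perror("open");
            return 1;
        }
        fl.l_type   = F_WRLCK;   /* request an exclusive write lock */
        fl.l_whence = SEEK_SET;
        fl.l_start  = 0;         /* from the start of the file... */
        fl.l_len    = 0;         /* ...to the end (length 0 means whole file) */
        if (fcntl(fd, F_SETLKW, &fl) < 0) {   /* block until the lock is granted */
            perror("fcntl");
        } else {
            /* ... read and write the locked file ... */
            fl.l_type = F_UNLCK;              /* release the lock */
            fcntl(fd, F_SETLK, &fl);
        }
        close(fd);
        return 0;
    }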

The network lock manager processes must run on both client and server. Communication between the two processes is by means of RPC. Calls issued to the client process are handed to the server process, which uses its local IRIX locking utilities to handle the call. If the file is in use, the lock manager issues an advisory to the calling application, but it does not prevent the application from accessing a busy file. The application must determine how to respond to the advisory, using its own facilities.

Although the network lock manager adheres to lockf and fcntl semantics, its operating characteristics are influenced by the nature of the network, particularly during crashes.

NFS Locking and Crash Recovery

As part of the file locking service, the network lock manager assists with crash recovery by maintaining state information on locked files. It uses this information to reconstruct locks in the event of a server or client failure.

When an NFS client goes down, the lock managers on all of its servers are notified by their status monitors, and they simply release their locks, on the assumption that the client will request them again when it wants them. When a server crashes, however, matters are different. When the server comes back up, its lock manager gives the client lock managers a grace period to submit lock reclaim requests. During this period, the lock manager accepts only reclaim requests. The client status monitors notify their respective lock managers when the server recovers. The default grace period is 45 seconds.

After a server crash, a client may not be able to recover a lock that it had on a file on that server, because another process may have beaten the recovering application process to the lock. In this case the SIGLOST signal is sent to the process (the default action for this signal is to kill the application).

NFS Locking and the Network Status Monitor

To handle crash recoveries, the network lock manager relies on information provided by the network status monitor. The network status monitor is a general service that provides information about network systems to network services and applications. The network status monitor notifies the network lock manager when a network system recovers from a failure, and by implication, that the system failed. This notification alerts the network lock manager to retransmit lock recovery information to the server.

To use the network status monitor, the network lock manager registers with the status monitor process (see statd(1M)) the names of clients and servers for which it needs information. The network status monitor then tracks the status of those systems and notifies the network lock manager when one of them recovers from a failure.