Chapter 8. Buffering

This chapter provides an overview of buffering and a description of file buffering as it applies to I/O.

Buffering Overview

I/O is the process of transferring data between a program and an external device. The process of optimizing I/O consists primarily of making the best possible use of the slowest part of the path between the program and the device.

The slowest part is usually the physical channel, which is often slower than the CPU or a memory-to-memory data transfer. The time spent in I/O processing overhead can reduce the amount of time that a channel can be used, thereby reducing the effective transfer rate. The biggest factor in maximizing this channel speed is often the reduction of I/O processing overhead.

A buffer is a temporary storage location for data while the data is being transferred. A buffer is often used for the following purposes:

  • Small I/O requests can be collected into a buffer, and the overhead of making many relatively expensive system calls can be greatly reduced.

    A collection buffer of this type can be sized and handled so that the actual physical I/O requests made to the operating system match the physical characteristics of the device being used. For example, a 42-sector buffer, when read or written, transfers a track of data between the buffer and the DD-49 disk; a track is a very efficient transfer size.

  • Many data file structures, such as the f77 and cos file structures, contain control words. During the write process, a buffer can be used as a work area where control words can be inserted into the data stream (a process called blocking). The blocked data is then written to the device. During the read process, the same buffer work area can be used to examine and remove these control words before passing the data on to the user (deblocking).

  • When data access is random, the same data may be requested many times. A cache is a buffer that retains previously requested data in case it is needed again. A cache that is sufficiently large, efficient, or both can avoid a large part of the physical I/O by having the data ready in a buffer. When requested data is often found in the cache, the cache is said to have a high hit rate. For example, if the entire file fits in the cache and is present there, no more physical requests are required to perform the I/O; in this case, the hit rate is 100%.

  • Running the disks and the CPU in parallel often improves performance; therefore, it is useful to keep the CPU busy while data is being moved. To do this when writing, data can be transferred to the buffer at memory-to-memory copy speed and an asynchronous I/O request can be made. Control is then immediately returned to the program, which continues to execute as if the I/O were complete (a process called write-behind). A similar process, called read-ahead, can be used while reading: data is read into a buffer before the program requests it. When the data is needed, it is already in the buffer and can be transferred to the user at very high speed. Read-ahead is another use of a cache.
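The first use listed above, collecting small requests, can be sketched in C. This is only an illustration, not the library's implementation: the names collect_buf, collect_write, collect_flush, and count_bytes are hypothetical, the 4096-byte capacity is an assumed value, and a callback stands in for the expensive system call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical collection buffer: gathers small writes and passes them to a
 * caller-supplied callback only when the buffer is full. */
enum { COLLECT_CAPACITY = 4096 };   /* assumed size: one 4096-byte sector */

typedef size_t (*flush_fn)(const void *data, size_t len, void *ctx);

struct collect_buf {
    unsigned char data[COLLECT_CAPACITY];
    size_t        used;      /* bytes currently held */
    size_t        flushes;   /* number of large requests issued so far */
    flush_fn      flush;     /* stands in for the expensive system call */
    void         *ctx;       /* passed through to the callback */
};

void collect_init(struct collect_buf *b, flush_fn flush, void *ctx)
{
    assert(b != NULL && flush != NULL);
    b->used = 0;
    b->flushes = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Issue one request for everything collected so far. */
void collect_flush(struct collect_buf *b)
{
    if (b->used > 0) {
        b->flush(b->data, b->used, b->ctx);
        b->flushes++;
        b->used = 0;
    }
}

/* Copy a small request into the buffer, flushing each time it fills. */
void collect_write(struct collect_buf *b, const void *src, size_t len)
{
    const unsigned char *p = src;
    while (len > 0) {
        size_t room = COLLECT_CAPACITY - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p += n;
        len -= n;
        if (b->used == COLLECT_CAPACITY)
            collect_flush(b);
    }
}

/* Example callback that only counts the bytes it is given. */
size_t count_bytes(const void *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
    return len;
}
```

With this sketch, 128 calls to collect_write of 64 bytes each result in only two calls to the callback, each carrying a full 4096-byte buffer. A final collect_flush is needed to push out any partially filled buffer.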

Buffers are used extensively on UNICOS and UNICOS/mk systems. Some of the disk controllers have built-in buffers. The kernel has a cache of buffers called the system cache that it uses for various I/O functions on a system-wide basis. The Cray IOS uses buffers to enhance I/O performance. The UNICOS logical device cache (ldcache) is a buffering scheme that uses a part of the solid-state storage device (SSD) or buffer memory resident (BMR) in the IOS as a large buffer that is associated with a particular file system. The library routines also use buffers.

The I/O path is divided into two parts. One part includes the user data area, the library buffer, and the system cache. The second part is referred to as the logical device, which includes the ultimate I/O device and all of the buffering, caching, and processing associated with that device. This includes any caching in the disk controller and the operating system.

Users can directly or indirectly control some buffers. These include most library buffers and, to some extent, system cache and ldcache. Some buffering, such as that performed in the IOS, or the disk controllers, is not under user control.

A well-formed request is an I/O request that meets the criteria for efficient transfer on UNICOS systems. A well-formed request for a disk file requires the following:

  • The size of the request must be a multiple of the sector size in bytes. For most disk devices, this will be 4096 bytes.

  • The data that will be transferred must be located on a word boundary.

  • The file must be positioned on a sector boundary. This will be a 4096-byte sector boundary for most disks.
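The three criteria above can be expressed as a simple check. The following C sketch is illustrative only; is_well_formed is a hypothetical helper, not a UNICOS library routine. The 4096-byte sector size comes from the text, and the 8-byte word size is an assumption based on a 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 4096u  /* sector size given in the text for most disks */
#define WORD_BYTES   8u     /* assumption: 64-bit (8-byte) word */

/* Returns true if a request for nbytes at buf, with the file positioned at
 * file_offset, meets all three criteria for a well-formed disk request. */
bool is_well_formed(const void *buf, size_t nbytes, uint64_t file_offset)
{
    assert(buf != NULL);
    bool size_ok   = nbytes % SECTOR_BYTES == 0;        /* whole sectors   */
    bool addr_ok   = (uintptr_t)buf % WORD_BYTES == 0;  /* word boundary   */
    bool offset_ok = file_offset % SECTOR_BYTES == 0;   /* sector boundary */
    return size_ok && addr_ok && offset_ok;
}
```

For example, an 8192-byte request from a word-aligned buffer at file offset 4096 passes the check, while a 100-byte request, or any request starting at file offset 512, does not.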

Types of Buffering

The following sections briefly describe unbuffered I/O, library buffering, system cache buffering, and ldcache.

Unbuffered I/O

The simplest form of buffering is none at all; this unbuffered I/O is known as raw I/O. For sufficiently large, well-formed requests, buffering is not necessary; it can add unnecessary overhead and delay. The following assign(1) command specifies unbuffered I/O:

assign -s u  ...

This assign command bypasses library buffering and the UNICOS system cache for all well-formed requests; the data is transferred directly between the user data area and the logical device. Requests that are not well formed use the system cache.

Library Buffering

The term library buffering refers to a buffer that the I/O library associates with a file. When a file is opened, the I/O library checks the access, form, and any attributes declared on the assign or asgcmd(1) command to determine the type of processing that should be used on the file. Buffers are usually an integral part of the processing.

If the file is assigned with one of the following options, library buffering is used:

-s blocked
-s tape/bmx (deferred implementation on IRIX systems)
-F spec (buffering as defined by spec)
-s cos
-s bin
-s unblocked

The -F option specifies flexible file I/O (FFIO), which uses library buffering if the specifications selected include a need for some buffering. In some cases, more than one set of buffers might be used in processing a file. For example, the -F blankx,cos option specifies two library buffers for a read of a blank compressed COS blocked file. One buffer handles the blocking and deblocking associated with the COS blocked control words and the second buffer is used as a work area to process the blank compression. In other cases (for example, -F system), no library buffering occurs.

System Cache

The operating system or kernel uses a set of buffers in kernel memory for I/O operations. These are collectively called the system cache. The I/O library uses system calls to move data between the user memory space and the system buffer. The system cache ensures that the actual I/O to the logical device is well formed, and it tries to remember recent data in order to reduce physical I/O requests. In many cases, though, it is desirable to bypass the system cache and to perform I/O directly between the user's memory and the logical device.

On UNICOS and UNICOS/mk systems, if requests are well formed and the libraries set the O_RAW flag when the file is opened, the system cache is bypassed and I/O is done directly between the user's memory space and the logical device.

On UNICOS systems, if the requests are not well formed, the system cache is used even if the O_RAW flag was selected at open time.

If UNICOS ldcache is present, and the request is well formed, I/O is done directly between the user's memory and ldcache even if the O_RAW bit was not selected.
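The three rules above can be summarized as a small decision function. The following C sketch is a simplified model for illustration, not kernel code; choose_path, path_name, and the io_path values are hypothetical names. It considers only the O_RAW flag and ignores O_LDRAW.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the paths described in the text. */
enum io_path {
    PATH_SYSTEM_CACHE,    /* data staged through kernel buffers */
    PATH_DIRECT_LDCACHE,  /* user memory <-> ldcache            */
    PATH_DIRECT_DEVICE    /* user memory <-> logical device     */
};

/* Simplified model of the rules for UNICOS disk files. */
enum io_path choose_path(bool o_raw, bool well_formed, bool ldcache_present)
{
    if (!well_formed)
        return PATH_SYSTEM_CACHE;    /* used even if O_RAW was selected */
    if (ldcache_present)
        return PATH_DIRECT_LDCACHE;  /* even if O_RAW was not selected  */
    if (o_raw)
        return PATH_DIRECT_DEVICE;   /* system cache bypassed           */
    return PATH_SYSTEM_CACHE;
}

/* Printable name for a path, for use in diagnostics. */
const char *path_name(enum io_path p)
{
    switch (p) {
    case PATH_SYSTEM_CACHE:   return "system cache";
    case PATH_DIRECT_LDCACHE: return "direct to ldcache";
    case PATH_DIRECT_DEVICE:  return "direct to logical device";
    }
    assert(0 && "unknown io_path");
    return "unknown";
}
```

In this model, a request that is not well formed always goes through the system cache, whatever the other two inputs are.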

The following assign(1) command options do not set the O_RAW bit, and they can be expected to use the system cache:

-s sbin
-F spec (FFIO, depends on spec)

The following assign command options set the O_RAW flag and bypass the system cache on UNICOS and UNICOS/mk systems:

-r on
-s unblocked
-s cos (or -s blocked)
-s bin
-s u
-F spec (FFIO, depends on spec)

See the Tape Subsystem User's Guide for details about the use of system caching and tapes.

For the assign -s cos, assign -s bin, and assign -s bmx commands, a library buffer ensures that the actual system calls are well formed. This is not true for the assign -s u option. If you plan to bypass the system cache with assign -s u, note that only well-formed requests bypass it; all other requests still go through the cache.

The assign -l buflev option controls kernel buffering. It is used by Fortran I/O, auxiliary I/O, and FFIO. The buflev argument can be any of the following values:

  • none: sets O_RAW and O_LDRAW

  • ldcache: sets O_RAW, clears O_LDRAW

  • full: clears O_RAW and O_LDRAW

If this option is not set, the level of system buffering is dependent on the type of open operation being performed.
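The effect of the three buflev values on the open flags can be written as a lookup. The following C sketch is illustrative only: buflev_to_flags is a hypothetical helper, and DEMO_O_RAW and DEMO_O_LDRAW are placeholder bit values, not the real O_RAW and O_LDRAW definitions from the UNICOS headers.

```c
#include <assert.h>
#include <string.h>

/* Placeholder bits for illustration; the real values come from the
 * UNICOS system headers. */
#define DEMO_O_RAW   0x1
#define DEMO_O_LDRAW 0x2

/* Translate an assign -l buflev argument into flag bits.
 * Returns 0 on success, or -1 if buflev is not a recognized value. */
int buflev_to_flags(const char *buflev, int *flags)
{
    assert(buflev != NULL && flags != NULL);
    if (strcmp(buflev, "none") == 0) {
        *flags = DEMO_O_RAW | DEMO_O_LDRAW;  /* sets both flags */
        return 0;
    }
    if (strcmp(buflev, "ldcache") == 0) {
        *flags = DEMO_O_RAW;                 /* O_RAW set, O_LDRAW clear */
        return 0;
    }
    if (strcmp(buflev, "full") == 0) {
        *flags = 0;                          /* both flags clear */
        return 0;
    }
    return -1;
}
```

The sketch does not model the case in which the option is not set, because the resulting level of buffering then depends on the type of open operation.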

See the explanation of the -B option on the assign(1) man page for information about bypassing system buffering on IRIX systems.

Restrictions on Raw I/O

The conditions under which UNICOS/mk can perform raw I/O differ from those under the UNICOS operating system. For raw I/O to be possible under UNICOS/mk, the starting memory address of the transfer must be aligned on a cache line boundary. On CRAY T3E systems, this means that the address must be a multiple of 64 bytes (the address modulo 64 equals 0).

A C program can cause static or stack data to be aligned correctly by using the following compiler directive:

_Pragma("_CRI cache_align buff");

buff is the name of the data to be aligned.

The malloc library memory allocation functions always return aligned pointers.

In most cases where raw I/O cannot be performed because of incorrect alignment, the system performs buffered I/O instead. If the file was opened with the O_WELLFORMED flag, the ENOTWELLFORMED error is returned instead.
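A program can verify the alignment of a buffer before issuing a request. The following sketch uses only standard C11 and is not specific to UNICOS/mk; is_cache_aligned and alloc_cache_aligned are hypothetical helper names, and the 64-byte line size is the CRAY T3E value given above. On systems where malloc does not guarantee cache line alignment, the C11 aligned_alloc function can be used to request it explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_BYTES 64u  /* CRAY T3E cache line size from the text */

/* Returns true if p lies on a cache line boundary. */
bool is_cache_aligned(const void *p)
{
    return (uintptr_t)p % CACHE_LINE_BYTES == 0;
}

/* Allocate at least nbytes, aligned on a cache line boundary. The size is
 * rounded up to a whole number of cache lines because aligned_alloc expects
 * a size that is a multiple of the alignment. Returns NULL on failure; the
 * caller releases the memory with free(). */
void *alloc_cache_aligned(size_t nbytes)
{
    if (nbytes == 0)
        nbytes = 1;  /* avoid a zero-length allocation */
    size_t lines = (nbytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
    void *p = aligned_alloc(CACHE_LINE_BYTES, lines * CACHE_LINE_BYTES);
    assert(p == NULL || is_cache_aligned(p));
    return p;
}
```

A buffer returned by alloc_cache_aligned satisfies the alignment condition, but an address 8 bytes into that buffer does not, so a transfer starting there could not be performed as raw I/O on such a system.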

Logical Cache Buffering

On UNICOS systems, the following elements are part of the logical device: ldcache, IOS models B, C, and D, IOS buffer memory, and cache in the disk controllers. These buffers are connected to the file system on which the file resides.

Default Buffer Sizes

The Fortran I/O library automatically chooses appropriate default buffer sizes. On UNICOS systems, you can specify the default buffer sizes for the various types of I/O, using the loader for your compiler. See your loader documentation for complete details.