Chapter 8. System Administration for Guaranteed-Rate I/O

Guaranteed-rate I/O, or GRIO for short, is a mechanism that enables a user application to reserve part of a system's I/O resources for its exclusive use. For example, it can be used to enable “real-time” retrieval and storage of data streams. GRIO manages the system resources among competing applications, so the actions of new processes do not affect the performance of existing ones. GRIO can read and write only files on a real-time subvolume of an XFS filesystem. To use GRIO, the subsystem eoe.sw.xfsrt must be installed.

This chapter explains important guaranteed-rate I/O concepts, describes how to configure a system for GRIO, and provides instructions for creating an XLV logical volume for use with applications that use GRIO.

For additional information, see the grio(5) reference page.


Note: By default, IRIX supports four GRIO streams (concurrent uses of GRIO). To increase the number of streams to 40, you can purchase the High Performance Guaranteed-Rate I/O—5-40 Streams software option. For even more streams, you can purchase the High Performance Guaranteed-Rate I/O—Unlimited Streams software option.


Guaranteed-Rate I/O Overview

The guaranteed-rate I/O system (GRIO) allows applications to reserve specific I/O bandwidth to and from the filesystem. Applications request guarantees by providing a file descriptor, data rate, duration, and start time. The filesystem calculates the performance available and, if the request is granted, guarantees that the requested level of performance can be met for a given time. This frees programmers from having to predict system I/O performance and is critical for media delivery systems such as video-on-demand.

The GRIO mechanism is designed for use in an environment where many different processes attempt to access scarce I/O resources simultaneously. GRIO provides a way for applications to determine that resources are already fully utilized and that further use would have a negative performance impact.

If the system is running a single application that needs access to all the system resources, the GRIO mechanism does not need to be used. Because there is no competition, the application gains nothing by reserving the resources before accessing them.

Applications negotiate with the system to make a GRIO reservation, an agreement by the system to provide a portion of the bandwidth of a system resource for a period of time. The system resources supported by GRIO are files residing within real-time subvolumes of XFS filesystems. A reservation can be transferred to any process and to any file on the filesystem specified in the request.

A GRIO reservation associates a data rate with a filesystem. A data rate is defined as the number of bytes per a fixed period of time (called the time quantum). The application receives data from or transmits data to the filesystem starting at a specific time and continuing for a specific period. For example, a reservation could be for 1.2 MB every 1.29 seconds, for the next three hours, to or from the filesystem on /dev/xlv/video1. In this example, 1.29 seconds is the time quantum of the reservation.
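The arithmetic behind a reservation is straightforward: the data rate is the number of bytes per time quantum divided by the length of the quantum. A quick sketch of the example above (the function name is illustrative, not part of the GRIO library):

```python
def reservation_rate(bytes_per_quantum, time_quantum_seconds):
    """Average data rate, in bytes per second, implied by a GRIO reservation."""
    return bytes_per_quantum / time_quantum_seconds

# The example reservation: 1.2 MB every 1.29 seconds.
rate = reservation_rate(1.2 * 2**20, 1.29)  # roughly 0.93 MB per second
```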

The application issues a reservation request to the system, which either accepts or rejects the request. If the reservation is accepted, the application then associates the reservation with a particular file. It can begin accessing the file at the reserved time, and it can expect that it will receive the reserved number of bytes per time quantum throughout the time of the reservation. If the system rejects the reservation, it returns the maximum amount of bandwidth that can be reserved for the resource at the specified time. The application can determine whether the available bandwidth is sufficient for its needs and issue another reservation request for the lower bandwidth, or it can schedule the reservation for a different time.

The GRIO reservation continues until it expires or an explicit grio_unreserve_bw( ) library call is made (for more information, see the grio_unreserve_bw(3) reference page). A GRIO reservation is also removed on the last close of a file currently associated with a reservation.

If a process has a rate guarantee on a file, any reference by that process to that file uses the rate guarantee, even if a different file descriptor is used. However, any other process that accesses the same file does so without a guarantee or must obtain its own guarantee. This is true even when the second process has inherited the file descriptor from the process that obtained the guarantee.

Sharing file descriptors between processes in an ancestral process group is supported for files used for GRIO, and the processes share the guarantee. For example, if a process got a rate guarantee of 2 Mb/s on a file and then forked, and the parent and child access the same file, they would be able to receive a combined rate of 2 Mb/s. If the child wanted a 4 Mb/s guarantee on the file, it would have to close and reopen the file and get a new rate guarantee of 4 Mb/s on it.

Four sizes are important to GRIO:

Optimal I/O size
 

Optimal I/O size is the size of the I/O operations that the system actually issues to the disks. All the disks in the real-time subvolume of an XLV volume must have the same optimal I/O size. Optimal I/O sizes of disks in real-time subvolumes of different XLV volumes can differ. For more information, see “/etc/grio_disks File Format”.

XLV volume stripe unit size
 

The XLV volume stripe unit size is the amount of data written to a single disk in the stripe. The XLV volume stripe unit size must be an even multiple of the optimal I/O size for the disks in that subvolume. See “Introduction to XLV Logical Volumes” in Chapter 3 for more information.

Reservation size (also known as the rate)
 

The reservation size is the amount of I/O that an application issues in a single time quantum.

Application I/O size
 

The application I/O size is the size of the individual I/O requests that an application issues. An application I/O size that equals the reservation size is recommended, but not required. The reservation size must be an even multiple of the application I/O size, and the application I/O size must be an even multiple of the optimal I/O size.

The application is responsible for making sure that all I/O requests are issued within a given time quantum, so that the system can provide the guaranteed data rate.
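The relationships among these four sizes can be expressed as a small consistency check. A sketch (the function name and the example sizes are illustrative, not part of GRIO):

```python
def check_grio_sizes(optimal_io, stripe_unit, app_io, reservation):
    """Check the GRIO size relationships described above (sizes in bytes)."""
    # The stripe unit must be an even multiple of the optimal I/O size.
    assert stripe_unit % optimal_io == 0
    # The application I/O size must be an even multiple of the optimal I/O size.
    assert app_io % optimal_io == 0
    # The reservation size must be an even multiple of the application I/O size.
    assert reservation % app_io == 0
    return True

# Example: 64 KB optimal I/O, 256 KB stripe unit, and a 512 KB application
# I/O issued once per time quantum (so the reservation size is also 512 KB).
check_grio_sizes(64 * 1024, 256 * 1024, 512 * 1024, 512 * 1024)
```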

GRIO Guarantee Types

In addition to specifying the amount and duration of the reservation, the application must specify the type of guarantee desired. There are four different classes of options that need to be determined when obtaining a rate guarantee:

  • The rate guarantee can be made on a per-file or per-filesystem basis.

  • The rate guarantee can be private or shared.

  • The rate guarantee can be a fixed rotor, slip rotor, or non-rotor type.

  • The rate guarantee can have deadline or real-time scheduling, or it can be nonscheduled.

If the user does not specify any options, the rate guarantee has these options by default: shared, non-rotor, and deadline scheduling. Whether a guarantee is per-file or per-filesystem is determined by which libgrio call makes the reservation: the grio_reserve_file() or grio_reserve_file_system() library call.

Per-File and Per-Filesystem Guarantees

A per-file guarantee indicates that the given rate guarantee can be used only on one specific file. When a per-filesystem guarantee is obtained, the guarantee can be transferred to any file on the given filesystem.

Private and Shared Guarantees

A private guarantee can be used only by the process that obtained the guarantee; it cannot be transferred to another process. A shared guarantee can be transferred from one process to another. Shared guarantees are only transferable; they cannot be used by both processes at the same time.

Rotor and Non-Rotor Guarantees

The rotor type of guarantee (either fixed or slip) is also known as a VOD (video on demand) guarantee. It allows more streams to be supported per disk drive, but requires that the application provide careful control of when and where I/O requests are issued.

Rotor guarantees are supported only when using a striped real-time subvolume. When an application accesses a file, the accesses are time-multiplexed among the drives in the stripe. An application can only access a single disk during any one time quantum, and consecutive accesses are assumed to be sequential. Therefore, the stripe unit must be set to the number of kilobytes of data that the application needs to access per time quantum. (The stripe unit is set with the xlv_make command when volume elements are created.) If the application tries to access data on a different disk when it has a slip rotor guarantee, the system attempts to change the process's rotor slot so that it can access the desired disk. If the application has a fixed rotor guarantee it is suspended until the appropriate time quantum for accessing the given disk.

An application with a fixed rotor reservation that does not access a file sequentially, but rather skips around in the file, suffers a performance penalty. For example, if the real-time subvolume is created on a four-way stripe, it could take as long as four times the time quantum (the width of the volume stripe) for the first I/O request after a seek to complete.

Non-rotor guarantees do not have such restrictions. Applications with non-rotor guarantees normally access the file in entire stripe size units, but can access smaller or larger units without penalty as long as they are within the bounds of the rate guarantee. The accesses to the file do not have to be sequential, but must be on stripe boundaries. If an application tries to access the file more quickly than the guarantee allows, the actions of the system are determined by the type of scheduling guarantee.

An Example Comparing Rotor and Non-Rotor Guarantees

Assume the system has eight disks, each supporting twenty-three 64 KB operations per second. (You can use the command grio_bandwidth to learn the number of I/O operations of a given size that can be performed on a particular disk in one second.) For non-rotor GRIO, if an application needs 512 KB of data each second, the eight disks are arranged in an eight-way stripe. The stripe unit is 64 KB. Each application read/write operation is 512 KB and causes concurrent read/write operations on each disk in the stripe. The application can access any part of the file at any time, provided that the read/write operation always starts at a stripe boundary. This configuration provides 23 process streams with 512 KB of data each second.

With a rotor guarantee, the eight drives are given an optimal I/O size of 512 KB. Each drive can support seven such operations each second. The higher rate (7 x 512 KB versus 23 x 64 KB) is achievable because the larger transfer size does less seeking. Again the drives are arranged in an eight-way stripe, but with a stripe unit of 512 KB, so the stripe can support 8 * 7 = 56 streams of 512 KB per second. Each of the 56 streams is given a time period (also known as a time “bucket”). There are eight different time periods with seven different processes in each period; therefore, 56 processes are accessing data in a given time unit. At any given second, the processes in a single time period are allowed to access only a single disk.

Using a rotor guarantee more than doubles the number of streams that can be supported with the same number of disks. The tradeoff is that the time tolerances are very stringent. Each stream is required to issue the read/write operations within one time quantum. If the process issues the call too late and real-time scheduling is used, the request blocks until the next time period for that process on the disk. In this example, this could mean a delay of up to eight seconds. In order to receive the rate guarantee, the application must access the file sequentially. The time periods move sequentially down the stripe allowing each process to access the next 512 KB of the file.
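The stream counts in this comparison follow directly from the per-disk operation rates. A sketch of the arithmetic (figures taken from the example above; the function names are illustrative):

```python
def non_rotor_streams(ops_per_disk_per_second):
    # Each application request spans every disk in the stripe at once,
    # so the stream count is limited by a single disk's operation rate.
    return ops_per_disk_per_second

def rotor_streams(num_disks, ops_per_disk_per_second):
    # Each stream touches only one disk per time quantum, and streams
    # rotate across the stripe, so every disk's full rate is usable.
    return num_disks * ops_per_disk_per_second

# Per-disk rates from the example: 23 x 64 KB ops/s, or 7 x 512 KB ops/s.
print(non_rotor_streams(23))  # 23 streams of 512 KB/s
print(rotor_streams(8, 7))    # 56 streams of 512 KB/s
```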

Real-Time Scheduling, Deadline Scheduling, and Nonscheduled Reservations

Three types of reservation scheduling are possible: real-time scheduling, deadline scheduling, and nonscheduled reservations.

Real-time scheduling means that an application receives a fixed amount of data in a fixed length of time. The data can be returned at any time during the time quantum. This type of reservation is used by applications that do only a small amount of buffering. If the application requests more data than its rate guarantee, the system suspends the application until it falls within the guaranteed bandwidth.

Deadline scheduling means that an application receives a minimum amount of data in a fixed length of time. Such guarantees are used by applications that have a large amount of buffer space. The application requests I/O at a rate at least as fast as the rate guarantee and is suspended only when it is exceeding its rate guarantee and there is no additional device bandwidth available.

With a nonscheduled reservation, the guarantee received by the application is only a reservation of system bandwidth. The system does not enforce the reservation limits and therefore cannot guarantee the I/O rate of any of the guarantees on the system. Nonscheduled reservations should be used with extreme care.

GRIO System Components

Several components make up the GRIO mechanism: a system daemon, support commands, configuration files, and an application library.

The system daemon is ggd. It is started from the script /etc/rc2.d/S94grio when the system is started. It is always started; unlike some other daemons, it is not turned on and off with the chkconfig command. A lock file is created in the /tmp directory to prevent two copies of the daemon from running simultaneously. Requests for rate guarantees are made to the ggd daemon. The daemon reads the GRIO configuration file /etc/grio_disks.

/etc/grio_disks describes the performance characteristics for the types of disk drives that are supported on the system, including how many I/O operations of each size (64 KB, 128 KB, 256 KB, or 512 KB) can be executed by each piece of hardware in one second. You can edit the file to add support for new drive types. (You can use the command grio_bandwidth to learn the number of I/O operations of a given size that can be performed on a particular disk in one second.) The format of this file is described in “/etc/grio_disks File Format”.


The /usr/lib/libgrio.so libraries contain a collection of routines that enable an application to establish a GRIO session. The library routines are the only way in which an application program can communicate with the ggd daemon. The library also includes a library routine that applications can use to check the amount of bandwidth available on a filesystem. This enables them to quickly get an idea of whether or not a particular reservation might be granted—more quickly than actually making the request.

Hardware Configuration Requirements for GRIO

Guaranteed-rate I/O requires the hardware to be configured so that it follows these guidelines:

  • A disk used in a real-time subvolume should contain only real-time subvolume volume elements (no log or data subvolume volume elements). This configuration is recommended for soft guarantees and required for hard guarantees.

  • Each XLV volume you create with a real-time subvolume must include a data subvolume, even if you do not intend to use it. The data subvolume is used by XFS to store inodes and other internal filesystem information.

  • Disks used in the data and log subvolumes of the XLV logical volume must have their retry mechanisms enabled. The data and log subvolumes contain information critical to the filesystem and cannot afford an occasional disk error.

Configuring a System for GRIO


Caution: The procedure in this section can result in the loss of data if it is not performed properly. It is recommended only for experienced IRIX system administrators.

This section describes how to configure a system for GRIO: create an XLV logical volume with a real-time subvolume, make a filesystem on the volume and mount it, and configure and restart the ggd daemon.

  1. Choose disk partitions for the XLV logical volume and confirm the hardware configuration as described in “Hardware Configuration Requirements for GRIO”. This includes modifying the disk drive parameters as described in “Disabling Disk Error Recovery”. Be sure to create a data disk partition and subvolume for each real-time subvolume you create.

  2. Determine the values of variables used while constructing the XLV logical volume:

    vol_name  

    The name of the volume with a real-time subvolume.

    rate  

    The rate at which applications using this volume access the data. rate is the number of bytes per time quantum per stream, divided by 1 KB (that is, the per-stream rate expressed in kilobytes). This information may be available in published information about the applications or from the developers of the applications.

    num_disks  

    The number of disks included in the real-time subvolume of the volume.

    stripe_unit  

    When the real-time disks are striped (required for video on demand and recommended otherwise), this is the amount of data written to one disk before writing to the next. It is expressed in 512-byte sectors.

    For non-rotor guarantees:

    stripe_unit = rate * 1K / (num_disks * 512)
    

    For rotor guarantees:

    stripe_unit = rate * 1K / 512
    

    extent_size  

    The filesystem extent size.

    For non-rotor guarantees:

    extent_size = rate * 1K
    

    For rotor guarantees:

    extent_size = rate * 1K * num_disks
    

    opt_IO_size  

    The optimal I/O size. It is expressed in kilobytes. By default, the possible values for opt_IO_size are 64 (64 KB), 128 (128 KB), 256 (256 KB), and 512 (512 KB). Other values can be added by editing the /etc/grio_disks file (see “/etc/grio_disks File Format” for more information).

    For non-rotor guarantees, opt_IO_size must be an even factor of stripe_unit (with both expressed in the same units), but not less than 64.

    For rotor guarantees, opt_IO_size must be an even factor of rate. Setting opt_IO_size equal to rate is recommended.

    Table 8-1 gives examples for the values of these variables.

    Table 8-1. Examples of Values of Variables Used in Constructing an XLV Logical Volume Used for GRIO

    Variable      Type of Guarantee   Comment                                         Example Value
    ------------  ------------------  ----------------------------------------------  -------------
    vol_name      any                 This name matches the last component of the     xlv_grio
                                      device name for the volume, /dev/xlv/vol_name
    rate          any                 For this example, assume 512 KB per second      512
                                      per stream
    num_disks     any                 For this example, assume 4 disks                4
    stripe_unit   non-rotor           512*1K/(4*512)                                  256
                  rotor               512*1K/512                                      1024
    extent_size   non-rotor           512 * 1K                                        512 KB
                  rotor               512 * 1K * 4                                    2048 KB
    opt_IO_size   non-rotor           128/1 = 128 or 128/2 = 64 are possible          64
                  rotor               Same as rate                                    512

  3. Create an xlv_make script file that creates the XLV logical volume. (See “Creating Volume Objects With xlv_make” in Chapter 4 for more information.) Example 8-1 shows an example script file for a volume.

    Example 8-1. Configuration File for a Volume Used for GRIO

    # Configuration file for logical volume vol_name. In this
    # example, data and log subvolumes are partitions 0 and 1 of
    # the disk at unit 1 of controller 1. The real-time
    # subvolume is partition 0 of the disks at units 1-4 of
    # controller 2.
    # 
    vol vol_name 
    data 
    plex 
    ve dks1d1s0 
    log 
    plex 
    ve dks1d1s1 
    rt 
    plex 
    ve -stripe -stripe_unit stripe_unit dks2d1s0 dks2d2s0 dks2d3s0 dks2d4s0 
    show 
    end 
    exit 
    


  4. Run xlv_make to create the volume:

    # xlv_make script_file
    

    script_file is the xlv_make script file you created in step 3.

  5. Create the filesystem by entering this command:

    # mkfs -r extsize=extent_size /dev/xlv/vol_name
    

  6. To mount the filesystem immediately, enter these commands:

    # mkdir mountdir 
    # mount /dev/xlv/vol_name mountdir 
    

    mountdir is the full pathname of the directory that is the mount point for the filesystem.

  7. To configure the system so that the new filesystem is automatically mounted when the system is booted, add this line to /etc/fstab:

    /dev/xlv/vol_name mountdir xfs rw,raw=/dev/rxlv/vol_name 0 0
    

  8. Restart the ggd daemon:

    # /etc/init.d/grio stop 
    # /etc/init.d/grio start 
    

    Now the user application can be started. Files created on the real-time subvolume volume can be accessed using guaranteed-rate I/O.
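The stripe_unit and extent_size formulas from step 2 can be checked against the example values in Table 8-1. A sketch (rate in kilobytes, stripe_unit in 512-byte sectors, and extent_size in bytes, as in the text; the function names are illustrative):

```python
def stripe_unit_sectors(rate_kb, num_disks, rotor=False):
    """Stripe unit in 512-byte sectors, per the formulas in step 2."""
    if rotor:
        return rate_kb * 1024 // 512
    return rate_kb * 1024 // (num_disks * 512)

def extent_size_bytes(rate_kb, num_disks, rotor=False):
    """Filesystem extent size in bytes, per the formulas in step 2."""
    if rotor:
        return rate_kb * 1024 * num_disks
    return rate_kb * 1024

# The Table 8-1 example: rate = 512 (KB per second per stream), 4 disks.
assert stripe_unit_sectors(512, 4) == 256              # non-rotor
assert stripe_unit_sectors(512, 4, rotor=True) == 1024
assert extent_size_bytes(512, 4) == 512 * 1024         # 512 KB
assert extent_size_bytes(512, 4, rotor=True) == 2048 * 1024  # 2048 KB
```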

Additional Procedures for GRIO

The following subsections describe additional special-purpose procedures for configuring disks and GRIO system components. These tuning procedures are not generally advisable, because they can cause bad data to be returned from disk drives; however, in situations where data access speed is more important than data integrity, they may be helpful.

Disabling Disk Error Recovery


Caution: Setting disk drive parameters must be performed correctly, on approved disk drive types only. Performing the procedure incorrectly, or performing it on an unapproved type of disk drive, can severely damage the disk drive. Setting disk drive parameters should be performed only by experienced system administrators.

The procedure for setting disk drive parameters is shown below. In this example all of the parameters shown in Table 8-2 are changed for a disk on controller 131 at drive address 1.

Table 8-2. Disk Drive Parameters for GRIO

Parameter                                               New Setting
------------------------------------------------------  -----------
Auto bad block reallocation (read)                      Disabled
Auto bad block reallocation (write)                     Disabled
Delay for error recovery (disabling this parameter      Disabled
enables the read continuous (RC) bit)


  1. Start fx in expert mode:

    # fx -x 
    fx version 6.4, Sep 29, 1996
    

  2. Specify the disk whose parameters you want to change by answering the prompts:

    fx: "device-name" = (dksc) Enter 
    fx: ctlr# = (0) 131 
    fx: drive# = (1) 1 
    fx: lun# = (0)
    ...opening dksc(131,1,0)
    
    
    ...drive selftest...OK
    

  3. Confirm that the disk drive is disk drive type SGI 0664N1D 6s61 or disk drive type SGI 0664N1D 4I4I. These disk drive types are approved for changing disk parameters. The disk drive type appears in the next line of output:

    Scsi drive type == SGI     0664N1D         6s61
    ----- please choose one (? for help, .. to quit this menu)-----
    [exi]t               [d]ebug/             [l]abel/
    [b]adblock/          [exe]rcise/          [r]epartition/
    

  4. Show the current settings of the disk drive parameters (this command uses the shortcut of separating commands on a series of hierarchical menus with slashes):

    fx > label/show/parameters
    
    ----- current drive parameters-----
    Error correction enabled          Enable data transfer on error
    Don't report recovered errors     Do delay for error recovery
    Don't transfer bad blocks         Error retry attempts          10
    Do auto bad block reallocation (read)
    Do auto bad block reallocation (write)
    Drive readahead  enabled          Drive buffered writes disabled
    Drive disable prefetch   65535    Drive minimum prefetch         0
    Drive maximum prefetch   65535    Drive prefetch ceiling     65535
    Number of cache segments     4
    Read buffer ratio        0/256    Write buffer ratio         0/256
    Command Tag Queueing disabled
    
    
    ----- please choose one (? for help, .. to quit this menu)-----
    [exi]t               [d]ebug/             [l]abel/
    [b]adblock/          [exe]rcise/          [r]epartition/
    

    The parameters in Table 8-2 correspond to Do auto bad block reallocation (read), Do auto bad block reallocation (write), and Do delay for error recovery, in that order. Each of them is currently enabled.

  5. Give the command to start setting disk drive parameters and press Enter until you reach a parameter that you want to change:

    fx> label/set/parameters
    fx/label/set/parameters: Error correction = (enabled) Enter
    fx/label/set/parameters: Data transfer on error = (enabled) Enter
    fx/label/set/parameters: Report recovered errors = (disabled) Enter
    

  6. To change the delay for error recovery parameter to disabled, enter “disable” at the prompt:

    fx/label/set/parameters: Delay for error recovery = (enabled) disable
    

  7. Press Enter through other parameters that do not need changing:

    fx/label/set/parameters: Err retry count = (10) Enter
    fx/label/set/parameters: Transfer of bad data blocks = (disabled) Enter
    

  8. To change the auto bad block reallocation parameters, enter disable at their prompts:

    fx/label/set/parameters: Auto bad block reallocation (write) = (enabled) disable
    fx/label/set/parameters: Auto bad block reallocation (read) = (enabled) disable
    

  9. Press Enter through the rest of the parameters:

    fx/label/set/parameters: Read ahead caching = (enabled) Enter
    fx/label/set/parameters: Write buffering = (disabled) Enter
    fx/label/set/parameters: Drive disable prefetch = (65535) Enter
    fx/label/set/parameters: Drive minimum prefetch = (0) Enter
    fx/label/set/parameters: Drive maximum prefetch = (65535) Enter
    fx/label/set/parameters: Drive prefetch ceiling = (65535) Enter
    fx/label/set/parameters: Number of cache segments = (4) Enter
    fx/label/set/parameters: Enable CTQ = (disabled) Enter
    fx/label/set/parameters: Read buffer ratio = (0/256) Enter
    fx/label/set/parameters: Write buffer ratio = (0/256) Enter
    

  10. Confirm that you want to make the changes to the disk drive parameters by entering “yes” to this question and start exiting fx:

     * * * * * W A R N I N G * * * * *
    about to modify drive parameters on disk dksc(131,1,0)! ok? yes
    
    ----- please choose one (? for help, .. to quit this menu)-----
    [exi]t             [d]ebug/           [l]abel/           [a]uto
    [b]adblock/        [exe]rcise/        [r]epartition/     [f]ormat
    fx> exit
    

  11. Confirm again that you want to make the changes to the disk drive parameters by pressing Enter in response to this question:

    label info has changed for disk dksc(131,1,0).  write out changes? (yes) Enter
    

Restarting the ggd Daemon

After either the /etc/grio_disks or /etc/config/ggd.options file is modified, ggd must be restarted to make the changes take effect. Give these commands to restart ggd:

# /etc/init.d/grio stop 
# /etc/init.d/grio start 

When ggd is restarted, current rate guarantees are lost.

Running ggd as a Real-time Process

Running ggd as a real-time process dedicates one or more CPUs to performing GRIO requests exclusively. Follow this procedure on a multiprocessor system to run ggd as a real-time process:

  1. Create or modify the file /etc/config/ggd.options and add -c cpunum to the file. cpunum is the number of a processor to be dedicated to GRIO. This causes the CPU to be marked isolated, restricted to running selected processes, and nonpreemptive. Applications using GRIO should mark their processes as real-time and runnable only on CPU cpunum. The sysmp(2) reference page explains how to do this.

  2. Restart the ggd daemon. See “Restarting the ggd Daemon” for directions.

  3. After ggd is restarted, you can confirm that the CPU is marked by entering this command (cpunum is 3 in this example):

    # mpadmin -s 
    processors: 0 1 2 3 4 5 6 7
    unrestricted: 0 1 2 4 5 6 7
    isolated: 3
    restricted: 3
    preemptive: 0 1 2 4 5 6 7
    clock: 0
    fast clock: 0
    

  4. To mark an additional CPU for real-time processes after ggd is restarted, enter these commands:

    # mpadmin -rcpunum2 
    # mpadmin -Icpunum2 
    # mpadmin -Ccpunum2 
    

Using Real-Time Subvolumes

The files you create on the real-time subvolume of an XLV logical volume are known as real-time files. The next two sections describe the special characteristics of these files.

Files on the Real-Time Subvolume and Commands

Real-time files have some special characteristics that cause standard IRIX commands to operate in ways that you might not expect. In particular:

  • You cannot create real-time files using any standard commands. Only specially written programs can create real-time files. The section “File Creation on the Real-Time Subvolume” explains how.

  • Real-time files are displayed by ls, just as any other file. However, there is no way to tell from the ls output whether a particular file is on a data subvolume or is a real-time file on a real-time subvolume. Only a specially written program can determine the type of a file. The F_FSGETXATTR fcntl( ) system call can determine whether a file is a real-time or a standard data file. If the file is a real-time file, the fsx_xflags field of the fsxattr structure has the XFS_XFLAG_REALTIME bit set.

  • The df command displays the disk space in the data subvolume by default. When the -r option is given, the real-time subvolume's disk space and usage are added. df can report that there is free disk space in the filesystem when the real-time subvolume is full, and df -r can report that there is free disk space when the data subvolume is full.

File Creation on the Real-Time Subvolume

To create a real-time file, use the F_FSSETXATTR fcntl( ) system call with the XFS_XFLAG_REALTIME bit set in the fsx_xflags field of the fsxattr structure. This must be done after the file has first been created/opened for writing, but before any data has been written to the file. Once data has been written to a file, the file cannot be changed from a standard data file to a real-time file, nor can files created as real-time files be changed to standard data files.

Real-time files can only be read or written using direct I/O. Therefore, read( ) and write( ) system call operations to a real-time file must meet the requirements specified by the F_DIOINFO fcntl( ) system call. See the open(2) reference page for a discussion of the O_DIRECT option to the open( ) system call.

GRIO File Formats

The following subsections contain reference information about the contents of the two GRIO configuration files /etc/grio_disks and /etc/config/ggd.options.

/etc/grio_disks File Format

The file /etc/grio_disks contains information that describes I/O bandwidth parameters of the various types of disk drives that can be used on the system.

By default, /etc/grio_disks contains the parameters for disks supported by Silicon Graphics for optimal I/O sizes of 64 KB, 128 KB, 256 KB, and 512 KB. Table 8-3 lists some of these disks. Table 8-4 shows the optimal I/O sizes and the number of optimal I/O size requests each of the disks listed in Table 8-3 can handle in one second.

Table 8-3. Disks in /etc/grio_disks by Default

Disk ID String
--------------------------
"SGI IBM DFHSS2E 1111"
"SGI SEAGATE ST31200N8640"
"SGI SEAGATE ST31200N9278"
"SGI 066N1D 4I4I"
"SGI 0064N1D 4I4I"
"SGI 0664N1D 4I4I"
"SGI 0664N1D 6S61"
"SGI 0664N1D 6s61"
"SGI 0664N1H 6s61"
"IBM OEM 0663E15 eSfS"
"IMPRIMIS 94601-15 1250"
"SEAGATE ST4767 2590"


Table 8-4. Optimal I/O Sizes and the Number of Requests per Second Supported

Optimal I/O Size    Number of Requests per Second
----------------    -----------------------------
65536               23
131072              16
262144              9
524288              5

To add other disks or to specify a different optimal I/O size, you must add information to the /etc/grio_disks file. If you modify /etc/grio_disks, you must restart the ggd daemon for the changes to take effect (see “Restarting the ggd Daemon”).

The records in /etc/grio_disks are in these two forms:

ADD "disk id string" optimal_iosize number_optio_per_second 

REPLACE devicename optimal_iosize number_optio_per_second

If the first field is the keyword ADD, the next field is a 28-character string that is the drive manufacturer's disk ID string. The next field is an integer denoting the optimal I/O size of the device in bytes. The last field is an integer denoting the number of optimal I/O size requests that the disk can satisfy in one second.

Some examples of these records are:

ADD     "SGI     SEAGATE ST31200N9278"  64K     23

ADD     "SGI             0064N1D 4I4I"  50K     25

If the first field is the keyword REPLACE, the next field is the pathname of a device (for a description of pathnames, see the grio(1M) reference page). The third field is an integer denoting the optimal I/O size to be used on the device, and the fourth is the number of I/O operations of that size that the device can deliver per second.

An example of a REPLACE record is:

REPLACE /dev/rdsk/dks136d1s0 50K 20
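For illustration, records in either form can be parsed mechanically. A hedged sketch (the field rules are inferred from the format description above; the actual ggd parser may differ, for example in how it treats the fixed-width disk ID string or size suffixes):

```python
def _parse_size(s):
    # Assumption: a trailing "K" means kilobytes, as in "64K" and "50K".
    return int(s[:-1]) * 1024 if s.endswith("K") else int(s)

def parse_grio_record(line):
    """Parse one ADD or REPLACE record from /etc/grio_disks."""
    line = line.strip()
    if line.startswith("ADD"):
        # The disk ID string is quoted and may contain internal spaces.
        parts = line.split('"')
        disk_id = parts[1]
        size_str, count = parts[2].split()
        return ("ADD", disk_id, _parse_size(size_str), int(count))
    if line.startswith("REPLACE"):
        _, device, size_str, count = line.split()
        return ("REPLACE", device, _parse_size(size_str), int(count))
    raise ValueError("unrecognized record: " + line)

rec = parse_grio_record('ADD "SGI SEAGATE ST31200N9278" 64K 23')
# rec is ("ADD", "SGI SEAGATE ST31200N9278", 65536, 23)
```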

/etc/config/ggd.options File Format

/etc/config/ggd.options contains command-line options for the ggd daemon. Options you might include in this file are:

-c cpunum 

Dedicate CPU cpunum to performing GRIO requests exclusively.

-o iosize 

Specify the default optimal I/O size for all devices (for example, 64, 128, 256, or 512).

If you change this file, you must restart ggd to make your changes take effect. See “Restarting the ggd Daemon” for more information.