Chapter 2. Device Configuration

This chapter discusses how IRIX represents devices to software, and how it establishes the inventory of available hardware.

This information is essential when your work involves attaching a new device or a new class of devices to IRIX. The information is helpful background material when you intend to control a device from a user-level process.

The following primary topics are covered in this chapter.

In addition to the discussion here, you can find the system administrator's perspective on these issues in the books IRIX Admin: Disks and Filesystems and IRIX Admin: System Configuration and Operation.

Device Special Files

In a UNIX system, a device is represented as a device special file in a certain directory (historically, the /dev directory). Beginning with IRIX 6.4 the implementation of device special files has been changed and expanded, but the basic purpose, to treat a device as a special case of a file, is unchanged.

Devices as Files

A device special file consists of a filename and access permissions, but no associated disk data. The access permissions, owner ID, and group ID of the file control whether the file can be opened. A device special file can be used like a regular file in most IRIX commands; for example, a device file can be the target of a symbolic link, the destination of redirected input or output, the object of a chmod command, and so on. A process opens a device by passing the pathname of the device special file to the open() function (see the open(2) reference page).

Historically, a device special file contained three items of information about a device:

Block or Character

A flag showing which of two types of access, block or character, applies to this device.

Major device number

A numeric code for the device driver that controls this device.  

Minor device number

A number passed to the device driver to distinguish this device from others of the same type.  

The device numbers are no longer relevant, but the distinction between block and character access still exists. To display the details of all block and character devices in a system using the /hw filesystem (described under “Hardware Graph”) use a command such as the following:

find /hw \( -type c -o -type b \) -exec ls -l {} \; | more

Block and Character Device Access

IRIX supports two classes of device. A block device such as a disk drive transfers data in fixed size blocks between the device and memory, and usually has some ability to reposition the medium so as to read or write the same data again. The driver for a block device typically has to manage buffering, and it is free to schedule I/O operations in a different sequence than they are requested.

A character device such as a printer accepts or returns data as a stream of bytes, and usually acts as a sink or source of data—the medium cannot be repositioned and read again. The driver for a character device typically transfers data as soon as it is requested and completes one operation before accepting another request. Character devices are also called raw devices, because their input is not buffered.

The two kinds of devices are supported by two different kinds of kernel-level device drivers, block and character drivers. The two kinds of drivers are expected to offer different kinds of service. For example, a block device driver is expected to provide a “strategy” entry point where it schedules asynchronous, buffered transfers of data in units of 512 bytes. A character device driver is expected to provide read and write entry points that synchronously transfer any quantity of data from 1 byte upward.
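One practical consequence of the 512-byte unit is that a request for even a few bytes can oblige a block driver to move more than one whole block. The following arithmetic sketch shows the calculation; blocks_spanned() and BLOCK_SIZE are hypothetical names used only for illustration.

```c
#include <assert.h>

#define BLOCK_SIZE 512L   /* the transfer unit named in the text */

/* Number of whole 512-byte blocks that must be moved to satisfy
 * a request for nbytes bytes starting at byte offset. */
long blocks_spanned(long offset, long nbytes)
{
    long first, last;

    assert(offset >= 0 && nbytes > 0);
    first = offset / BLOCK_SIZE;                 /* block holding the first byte */
    last = (offset + nbytes - 1) / BLOCK_SIZE;   /* block holding the last byte */
    return last - first + 1;
}
```

A 2-byte request at offset 511 straddles a block boundary and so spans two blocks, whereas a character driver would be asked for exactly 2 bytes.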

Some device drivers offer both kinds of access. In particular, the disk device drivers support block-type access to data partitions of the disk, and character-type read/write access to the disk volume header.

Multiple Device Names

When a single device is accessed in different modes, the device is described by multiple device special files. Each device special file represents one way of accessing the device. Some reasons for using multiple names are as follows:

  • By convention, UNIX systems supply certain default device names, and this is done by creating extra symbolic links. For example, the default device /dev/tapens is a link to the first device file in /dev/rmt/*.

  • When a device supports both block and character modes of access, there is a separate device special file for each mode. For example, the following (edited) pathnames provide block and character access to one partition of a SCSI device:

    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/partition/0/block
    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/partition/0/char
    

  • When a device can be treated as independent, logical partitions, each partition is given an independent device special file name, although the device is the same in each case. The following (edited) pathnames provide block access to, respectively, an entire disk volume, partition 0 (root), partition 1 (swap), and the volume header (label) of the same disk:

    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/volume/block
    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/partition/0/block
    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/partition/1/block
    /hw/.../scsi_ctlr/0/target/1/lun/0/disk/volume_header/block
    

  • When a device needs different treatment at different times, it can have one device special file for each kind of treatment. The following pathnames all provide access to the identical tape drive. The user can open a different name for each combination of byte-swapped and non-byte-swapped I/O with fixed or variable record lengths:

    /hw/tape/tps0d3stat
    /hw/tape/tps0d3s
    /hw/tape/tps0d3sc
    /hw/tape/tps0d3nrs
    /hw/tape/tps0d3nrsc
    /hw/tape/tps0d3ns
    /hw/tape/tps0d3nsc
    /hw/tape/tps0d3
    /hw/tape/tps0d3c
    /hw/tape/tps0d3nrns
    /hw/tape/tps0d3nrnsc
    /hw/tape/tps0d3nr
    /hw/tape/tps0d3nrc
    /hw/tape/tps0d3sv
    /hw/tape/tps0d3svc
    /hw/tape/tps0d3nrsv
    /hw/tape/tps0d3nrsvc
    /hw/tape/tps0d3nsv
    /hw/tape/tps0d3nsvc
    /hw/tape/tps0d3v
    /hw/tape/tps0d3vc
    /hw/tape/tps0d3nrnsv
    /hw/tape/tps0d3nrnsvc
    /hw/tape/tps0d3nrv
    /hw/tape/tps0d3nrvc
    

Major Device Number

The major device number was, in traditional UNIX architecture, a numeric key that related a device special file to the device driver that managed it. When a special file was opened, IRIX selected the driver to handle the device based on the major device number. In the newer /hw filesystem, a different means is used. The major number is no longer relevant.

The major number in all device special files in /hw is always 0. The device special files in /hw are created dynamically, by the device drivers, as the devices are attached. The identity of the device driver is stored in the device special files at this time, but not as a number. When a process opens a device special file in /hw (or a name in /dev that is a symbolic link to /hw), the kernel can tell directly which driver to call.

Minor Device Number

In conventional UNIX, and in versions of IRIX previous to IRIX 6.4, a minor device number was encoded in the device special file and was passed to the device driver. The major and minor numbers were passed together in an integer called a dev_t. The driver could extract the minor device number by passing the dev_t value to the geteminor() function.
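The packing of the two numbers into one integer can be sketched in user-level C. The 18-bit minor width is the one given later in this chapter; the 32-bit total and the resulting 14-bit major width are assumptions, and every name beginning with sk_ or SK_ is hypothetical. Kernel code would use the dev_t type and geteminor() rather than anything shown here.

```c
#include <assert.h>

/* Assumed layout: low 18 bits hold the minor number,
 * the next 14 bits hold the major number. */
#define SK_MINOR_BITS 18
#define SK_MINOR_MASK ((1UL << SK_MINOR_BITS) - 1)
#define SK_MAJOR_MASK ((1UL << 14) - 1)

typedef unsigned long sk_dev_t;   /* stand-in for dev_t */

/* Combine a major and a minor number into one value. */
sk_dev_t sk_makedev(unsigned long major, unsigned long minor)
{
    assert(major <= SK_MAJOR_MASK && minor <= SK_MINOR_MASK);
    return (major << SK_MINOR_BITS) | minor;
}

/* Extract the major number from a packed value. */
unsigned long sk_major(sk_dev_t dev)
{
    return (dev >> SK_MINOR_BITS) & SK_MAJOR_MASK;
}

/* Extract the minor number from a packed value. */
unsigned long sk_minor(sk_dev_t dev)
{
    return dev & SK_MINOR_MASK;
}
```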

Historical Use of Minor Number

Prior to IRIX 6.4, the minor device number served as an argument to help the device driver distinguish one device from another. Many devices can have the same major number and be serviced by the same driver. Using the minor number, the driver could distinguish the particular device being serviced.

Some device drivers treated the minor device number as a logical unit number, while other drivers used it to contain multiple, encoded bit fields. For example:

  • The IRIX tape device driver used the minor device number to encode the options for rewind or no-rewind, byte-swap or nonswap, and fixed or variable blocking, along with the logical unit number.

  • The IRIX disk device drivers encoded the disk partition number into the minor device number along with a disk unit number.

  • Both disk and tape devices encoded the SCSI adapter number in the minor number.

With STREAMS drivers, the minor device number can be chosen arbitrarily during a CLONE open—see “Support for CLONE Drivers” in Chapter 22.

Present Use of Minor Numbers

Beginning with IRIX 6.4, the minor device number has little importance because the driver has a direct way to distinguish each device and its special needs, through the hardware graph (see “Hardware Graph”).

The minor number in device special files in /hw is an arbitrary integer with no relation to the device itself. The device special files in /hw are created dynamically, by the device drivers, as the devices are attached. The device driver stores any information it needs to distinguish one device from another directly in the device special file itself. When a process opens a device special file in /hw (or a name in /dev that is a symbolic link to /hw), the driver can retrieve the information directly, without needing to decode the minor number.

Creating Conventional Device Names

Starting with IRIX 6.4, there is a complete filesystem, /hw, that is devoted to device special files. However, the use of /hw is both new and unique to IRIX. For the sake of compatibility, the conventional device special files in the /dev filesystem that are used in UNIX systems generally and in previous releases of IRIX are retained. This topic describes these conventional names. See also “/hw Filesystem”.

Many device special files are created automatically at boot time by execution of the script /dev/MAKEDEV. Additional device special files can be created with administrator commands.

IRIX Conventional Device Names

Conventions for the format of device special filenames are spelled out in the following reference pages: intro(7), dks(7), dsreq(7), and tps(7). For example, the components of a disk device name in /dev/dsk include the following:

dks c 

Constant prefix “dks” followed by bus adapter number c.

d u 

Constant letter “d” followed by disk SCSI ID number u.

l n 

Optionally, letter “l” (ell) and logical unit number n (used only when disk u controls multiple drives).

s p or vh or vol 

Constant letter “s” and partition number p, or else “vh” for volume header, or “vol” for (entire) volume.

Programs throughout the system rely on the conventions for these device names. In addition, by convention the associated major and minor numbers agree with the names. For example, the logical unit and partition numbers that appear in a disk name are also encoded into the minor number.
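A program that relies on the disk naming convention has to take a name apart into its components. The following is a minimal sketch of such a parser, written from the component list above; the structure, enumeration, and function names are hypothetical, and the sketch accepts any non-negative decimal number in each position without checking it against real hardware limits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum dks_kind { DKS_PARTITION, DKS_VOLHDR, DKS_VOLUME };

struct dks_name {
    long ctlr;            /* c: bus adapter number */
    long unit;            /* u: disk SCSI ID */
    long lun;             /* n: logical unit, or -1 if absent */
    enum dks_kind kind;   /* partition, volume header, or volume */
    long part;            /* p: partition number, or -1 */
};

/* Read a decimal number at *pp and advance *pp; return -1 if none. */
static long take_number(const char **pp)
{
    char *end;
    long v;

    if (!isdigit((unsigned char)**pp))
        return -1;
    v = strtol(*pp, &end, 10);
    *pp = end;
    return v;
}

/* Parse a name such as "dks0d1s0"; return 0 on success, -1 on error. */
int dks_parse(const char *name, struct dks_name *out)
{
    const char *p = name;

    assert(name != NULL && out != NULL);
    if (strncmp(p, "dks", 3) != 0)
        return -1;
    p += 3;
    if ((out->ctlr = take_number(&p)) < 0 || *p++ != 'd')
        return -1;
    if ((out->unit = take_number(&p)) < 0)
        return -1;
    out->lun = -1;
    out->part = -1;
    if (*p == 'l') {                       /* optional logical unit */
        p++;
        if ((out->lun = take_number(&p)) < 0)
            return -1;
    }
    if (strcmp(p, "vh") == 0) {
        out->kind = DKS_VOLHDR;
    } else if (strcmp(p, "vol") == 0) {
        out->kind = DKS_VOLUME;
    } else if (*p == 's') {
        p++;
        out->kind = DKS_PARTITION;
        if ((out->part = take_number(&p)) < 0 || *p != '\0')
            return -1;
    } else {
        return -1;
    }
    return 0;
}
```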

Beginning with IRIX 6.4, these highly-compressed conventional names are unpacked into longer pathnames in the /hw filesystem. However, the older, encoded names in /dev are retained for compatibility and portability.

The Script MAKEDEV

The conventions for all the IRIX device special names are written into the script /dev/MAKEDEV. This is a make file, but unlike most make files, it is not used to compile executable programs. It contains the logic to prepare device special names and their associated major and minor numbers and file permissions.

The MAKEDEV script is executed during IRIX startup from a script in /etc/rc2.d. It is executed after all device drivers have been initialized, so it can use the output of the hinv command to construct device names to suit the actual configuration.

The system administrator can invoke MAKEDEV to construct device special files. Administrator use of MAKEDEV is described in IRIX Admin: System Configuration and Operation.

Making Conventional Device Files

You or a system administrator can create device special files explicitly using the commands mknod or install. Either command can be used in a make file such as you might create as part of the installation script for a product.

For details of these commands, see the install(1) and mknod(1M) reference pages, and IRIX Admin: System Configuration and Operation. The following is a hypothetical example of install:

# install -m 644 -u root -g sys -root /dev -chr 62,0 

The -chr option specifies a character device, and 62,0 are the major and minor device numbers, respectively.


Tip: The mknod command is portable, being used in most UNIX systems. The install command is unique to IRIX, and has a number of features and uses beyond those of mknod. Examples of both can be found by reading /dev/MAKEDEV.


Hardware Graph

Conventional UNIX software is designed based on the assumption that the computer has only a small, fixed set of peripheral devices under undemanding reliability constraints. IRIX 6.5 is designed to handle a system with a large complement of devices that can change dynamically, under high demands for reliability. To meet the new requirements, IRIX introduced the hwgraph (hardware graph) to represent system devices, and the /hw filesystem as the externally visible form of the hwgraph.

UNIX Hardware Assumptions, Old and New

Historically, UNIX was designed to support small computer systems that were administered by the same group of people that used them. When there are only a few, or a few dozen, peripheral devices, it is acceptable to:

  • Represent all devices as brief names in the /dev filesystem

  • Use a limited range of major device numbers to specify all possible device drivers

  • Use an 18-bit integer (the minor device number) as the sole parameter to represent a device's identity and access mode

  • Leave the details of device addressing to be specified in configuration files or by hard-coding in the source of device drivers.

When devices are only rarely added to or removed from the system, it is acceptable to require that the administrator shut the system down, modify a configuration file, and reboot, in order to remove or add a device. When the system has a small number of tolerant users, it is acceptable to shut the system down and restart it to make small changes in the I/O configuration.

All of these assumptions are challenged by the kinds of large-scale systems that can be built using the Silicon Graphics Origin2000 architecture.

  • It is possible to build very large Origin2000 systems with many independent nodes, each with a number of attached devices.

  • Because of the rich possibilities for interconnecting Origin2000 nodes, the topology of an Origin2000 system can be complex, with devices addressed by lengthy paths, and sometimes with multiple possible paths from a CPU to a device.

  • The hardware configuration of an Origin2000 system can change dynamically while the system runs, by adding and removing entire nodes, or single buses, or single cards on a PCI bus.

  • Origin2000 is designed to be the basis of systems that are available a very high percentage of the time, on which frequent or casual reboots are not allowed.

In this environment it is no longer acceptable to require downtime on any change, nor to require the administrator to issue detailed commands or to edit configuration files to make simple changes. Previous releases of IRIX addressed some of these points through the MAKEDEV script (see “The Script MAKEDEV”), which creates device special files automatically for many types of hardware.

IRIX 6.4 moves away from the conventional UNIX model by creating the hwgraph, and by requiring all kernel-level device drivers to maintain the hwgraph as devices are attached and detached.

Hardware Graph Features

The hwgraph is an in-memory, graph-structured database that describes all hardware units that are addressable by the system. For a very concise overview of the hwgraph, see the hwgraph(4) reference page.

Hwgraph Nomenclature  

“In-memory” means that the hwgraph is contained in kernel memory. It is reconstructed dynamically in memory each time the system boots up, and is kept current in memory as the hardware configuration changes.

“Graph-structured” means that the hwgraph is topologically a directed graph, consisting of a set of “vertexes” (points) that represent devices, and “edges” (lines) that connect the vertexes. Each edge is a one-way linkage from a source vertex to a target vertex (this is the definition of a directed graph). Each edge has a label, a character string that names the edge. A small part of a typical hwgraph is depicted in Figure 2-1.

Figure 2-1. Part of a Typical Hwgraph

Figure 2-1 shows the part of the graph that represents block-mode and character-mode access to the whole-volume partition of a disk. The more familiar path notation for the same graph would be as follows:

/hw/module/1/io/pci/slot/0/scsi_ctlr/0/target/1/lun/0/disk/volume/char
/hw/module/1/io/pci/slot/0/scsi_ctlr/0/target/1/lun/0/disk/volume/block
/hw/module/1/io/dks0d0vol/block
/hw/module/1/io/dks0d0vol/char

Figure 2-1 is color-coded to show when the parts of the graph are built:

  • The parts of the hwgraph built by the kernel during bootup are shown in blue.

  • The parts shown in cyan are built by the PCI bus adapter as it probes the bus.

  • The parts in magenta are built by the host adapter driver for the SCSI controller, to reflect the addressable units on the SCSI bus.

  • The parts shown in green are built by the disk device driver as it attaches the disk—including a shorthand link from /hw/module/1/io to the volume vertex.

Properties of Edges and Vertexes

An edge in the hwgraph originates in one vertex (the source vertex) and points to another vertex (the target vertex). The only property of an edge is its label.

A vertex in the hwgraph stores information about an addressable unit of hardware in the system. A vertex can contain the following kinds of information:

  • A pointer to an information structure supplied by the device driver.

  • One or more inventory_t objects, representing information to be reported out by the hinv command (see the hinv(1) reference page).

  • One or more labelled attributes, containing information that can be reported out by the attr command (see the attr(1) reference page).

  • One or more labelled attributes that are not exported for availability by attr.

  • The edges leading out of this vertex.

Not all vertexes have all this information.

Additional Edges

The basic hwgraph—as constructed by the kernel and by built-in drivers such as the PCI bus adapter—is highly detailed and explicit, and is generally tree-structured. However, kernel-level drivers are free to add edges between any two vertexes. A driver can add extra edges in order to provide short-circuit paths for convenient access to vertexes deep in the hwgraph.

Many device drivers distributed with IRIX create convenience vertexes and edges; and device drivers provided by OEMs are welcome to do so as well. One problem is that often a driver needs to label a convenience edge with a unique number—a controller number, a port number, or a line number of some kind. At the time a driver is initializing and creating vertexes, the total hardware complement is not known and it is impossible to decide which number of this kind to use. This problem is alleviated by a program like ioconfig; see “Using ioconfig for Global Controller Numbers”.

Implicit Edges

Every vertex has one implicit edge with the label “..” which leads back to a parent vertex. Every vertex has one implicit edge with the label “.” which leads to the vertex itself. This is deliberately the same convention used in a filesystem, where every directory contains “..” and “.” entries. No other edges are required.

A vertex that has only the implicit edges is a leaf vertex. A leaf vertex can stand for a device, so that a user process can name a leaf vertex in an open() call in order to open the device. A user process cannot open a non-leaf vertex, just as a process cannot open a directory as a file.
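The structure described in this section (labelled edges, the implicit “.” and “..” edges, convenience edges, and leaf vertexes) can be modelled in a few lines of user-level C. This is a toy model for illustration, not the kernel's hwgraph implementation: all names are hypothetical, and two choices are assumptions of the sketch, namely that the first edge added to a vertex determines its parent and that “..” at the root leads to the root itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EDGES 8

struct vertex;

struct edge {
    const char *label;         /* the only property of an edge */
    struct vertex *target;
};

struct vertex {
    struct vertex *parent;     /* target of the implicit ".." edge */
    struct edge edges[MAX_EDGES];
    int nedges;
};

/* Add a labelled edge from src to dst; return 0, or -1 if src is full. */
int add_edge(struct vertex *src, const char *label, struct vertex *dst)
{
    assert(src != NULL && label != NULL && dst != NULL);
    if (src->nedges == MAX_EDGES)
        return -1;
    src->edges[src->nedges].label = label;
    src->edges[src->nedges].target = dst;
    src->nedges++;
    if (dst->parent == NULL)
        dst->parent = src;     /* assumption: first edge in sets ".." */
    return 0;
}

/* Follow one edge by label, honoring the implicit "." and ".." edges. */
static struct vertex *follow(struct vertex *v, const char *label, size_t len)
{
    int i;

    if (len == 1 && label[0] == '.')
        return v;
    if (len == 2 && label[0] == '.' && label[1] == '.')
        return v->parent != NULL ? v->parent : v;
    for (i = 0; i < v->nedges; i++)
        if (strlen(v->edges[i].label) == len &&
            strncmp(v->edges[i].label, label, len) == 0)
            return v->edges[i].target;
    return NULL;
}

/* Resolve a slash-separated path of edge labels, starting at v. */
struct vertex *walk(struct vertex *v, const char *path)
{
    assert(path != NULL);
    while (v != NULL && *path != '\0') {
        size_t len = strcspn(path, "/");

        if (len > 0)
            v = follow(v, path, len);
        path += len;
        if (*path == '/')
            path++;
    }
    return v;
}

/* A leaf vertex has only the implicit edges. */
int is_leaf(const struct vertex *v)
{
    return v->nedges == 0;
}
```

In this model a convenience edge is simply a second add_edge() call that targets an existing vertex, so that two different paths resolve to the same vertex.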

/hw Filesystem

The /hw filesystem is a visible reflection of the hwgraph. The /hw filesystem is a filesystem, on a par with an EFS or XFS filesystem, but of a different type. It is built dynamically (it has no disk contents) and changes to reflect changes in the hwgraph. (You can compare the /hw filesystem to another artificial, dynamic filesystem, /proc, which is an externally visible representation of the currently executing user processes.)

Any user can navigate the /hw filesystem using commands such as cd, ls, find, and file. Users can browse the /hw filesystem to discover the hardware configuration. Names in the /hw filesystem have access permissions that are applied in the same way as in other filesystems. Pathnames beginning /hw can be used wherever other filesystem pathnames are used, and in particular,

  • A process can use a /hw pathname with the open() function to open a device.

  • An /hw pathname can be used to construct a symbolic link.

The use of symbolic links to /hw paths is important. All the device special filenames that are conventionally expected to exist in /dev are implemented by creating symbolic links from /dev to /hw. Here is a typical link:

lrwxr-xr-x   1 root   sys    13 Aug 16 11:23 /dev/root -> /hw/disk/root

However, a symbolic link is not a perfect alias. Links are given special treatment by commands such as ls, tar, and chmod; and by the system function stat() on which the commands are based (see the stat(2) reference page). What is needed is a way to make a functional alias for a device special file under a different name. That is supplied by mknod.
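The difference between a link and the device it names is visible to a program through the two POSIX calls stat(), which follows a symbolic link, and lstat(), which reports on the link itself. The following is a minimal sketch with hypothetical helper names.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if path is itself a symbolic link, 0 if not, -1 on error. */
int is_symlink(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (lstat(path, &sb) != 0)       /* lstat() does not follow the link */
        return -1;
    return S_ISLNK(sb.st_mode) ? 1 : 0;
}

/* Return 1 if path, after following any links, names a block or
 * character device; 0 if it names something else; -1 on error. */
int resolves_to_device(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) != 0)        /* stat() follows the link */
        return -1;
    return (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode)) ? 1 : 0;
}

/* Copy the link's target into buf as a NUL-terminated string;
 * return its length, or -1 if path is not a readable link. */
long link_target(const char *path, char *buf, size_t size)
{
    ssize_t n;

    assert(path != NULL && buf != NULL && size > 0);
    n = readlink(path, buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return (long)n;
}
```

For a name such as /dev/root in the listing above, is_symlink() and resolves_to_device() would both be expected to return 1, and link_target() would return the /hw pathname.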

Driver Interface to Hwgraph

A kernel-level device driver can make use of a variety of kernel functions for examining and modifying the hwgraph. These functions are covered in detail in “Hardware Graph Management” in Chapter 8. The kernel offers functions by which a driver can:

  • Traverse the hwgraph, following edges by name from vertex to vertex.

  • Create new vertexes.

  • Create new edges from existing vertexes to new vertexes.

  • Set, change, or retrieve the address of driver-defined data from a vertex.

  • Add hardware inventory data to a vertex.

  • Set, change, retrieve or remove labelled attributes, and specify whether the attributes should be accessible to the attr command or not.

  • Remove edges and destroy vertexes.

Some device drivers do not have to perform these functions, but most kernel-level drivers do need to create at least a few edges and vertexes to provide access to devices. Vertexes are typically created when the driver is called at its pfxattach() entry point (driver entry points are covered in detail in Chapter 7, “Structure of a Kernel-Level Driver”). Vertexes are typically destroyed when the driver is called at its pfxdetach() entry point.

Hardware Inventory

In IRIX previous to IRIX 6.4, during bootstrap, each device driver probed the hardware attachments for which it was responsible, and added information to a hardware inventory table. The kernel maintained a hardware inventory table in kernel virtual memory. The table could be queried by users and by programs.

Beginning with IRIX 6.4, what was once a simple table of devices has expanded into the hwgraph (“Hardware Graph”). Device drivers create the hardware inventory by adding vertexes to the hwgraph. However, existing programs continue to query the hardware inventory using the old programming interface, as well as new ones.

Using the Hardware Inventory

The hardware inventory is used by users, administrators, and programmers.

Contents of the Inventory

Using database terminology, the hardware inventory consists of a single table with the following columns:

Class

A code for the class of device; for example, audio, disk, processor, or network.

Type

A code for the type of device within its class; for example, FPU and CPU types within the processor class.

Controller

When applicable, the number of the controller, board, or attachment.

Unit

When applicable, the logical unit or device within a Controller number.

State

A descriptive number, such as the CPU model number.

Of these values,

  • The Class and Type are arbitrary codes that are defined in /usr/include/invent.h. Only the defined codes can be interpreted by the hinv command.

  • The Controller and Unit are small integers. The hinv command formats them based on the Class code. For example, when Class is INV_DISK, hinv might report “Disk drive: unit 4 on SCSI controller 56.” When Class is INV_NETWORK and Type is INV_NET_ETHER, hinv might report “Integral Ethernet controller: et2, Ebus slot 11.”

  • The Controller number is used to distinguish between identical controllers. The device driver can assign a controller number when it attaches inventory data to a device vertex; or the controller numbers can be assigned dynamically at boot time, as discussed under “Using ioconfig for Global Controller Numbers”.

Displaying the Inventory with hinv

The hinv command formats all or selected rows of the inventory table for display (see the hinv(1) reference page), translating the numbers to readable form. The user or system administrator can use command options to select a class of entries or certain specific device types by name. The class or type can be qualified with a unit number and a controller number. For example, the following command displays information about disk 4 on controller 1:

hinv -c disk -b 1 -u 4 
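The selection that this command performs amounts to a filter over the rows of the inventory table. The following sketch models the table as an array of structures and counts the rows that match a class, controller, and unit. The structure is a stand-in for inventory_t, the class codes are invented (the real codes are in the invent.h header), and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SK_ANY (-1)   /* selector value meaning "do not filter on this" */

/* Invented class codes, for illustration only. */
enum { SK_CLASS_PROCESSOR = 1, SK_CLASS_DISK = 2, SK_CLASS_NETWORK = 3 };

struct sk_row {       /* one row: Class, Type, Controller, Unit, State */
    int cls;
    int type;
    int controller;
    int unit;
    int state;
};

/* A selector matches when it is SK_ANY or equals the row's value. */
static int sk_field_ok(int want, int have)
{
    return want == SK_ANY || want == have;
}

/* Count the rows that match the class, controller, and unit selectors. */
size_t sk_count(const struct sk_row *tab, size_t n,
                int cls, int controller, int unit)
{
    size_t i, hits = 0;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (sk_field_ok(cls, tab[i].cls) &&
            sk_field_ok(controller, tab[i].controller) &&
            sk_field_ok(unit, tab[i].unit))
            hits++;
    return hits;
}
```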

You can use hinv to check the result of installing new hardware. The new hardware should show up in the report after the system is booted following installation, provided that the associated device driver was called and was written correctly.

A full inventory report (hinv -mv) is almost mandatory documentation for a software problem report, either submitted by your user to you, or by you to Silicon Graphics.

Testing the Inventory In Software

Within a shell script, the most convenient way to test the result of hinv is to check the command's exit status. The command sets an exit status of 0 when it finds or reports any items, and a status of 1 when it finds no items. The code in Example 2-1 could be used in a shell script to test the existence of a disk controller.

Example 2-1. Testing the Hardware Inventory in a Shell Script

if hinv -s -c disk -b 1
then
   : # controller 1 is present; nothing to report
else
   echo No second disk controller
fi
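The same exit-status test can be made from a C program by running the command through system(). The following is a minimal sketch using only standard POSIX calls; command_status() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run a shell command and return its exit status, or -1 if the
 * command could not be run or did not exit normally. */
int command_status(const char *cmd)
{
    int rc;

    assert(cmd != NULL);
    rc = system(cmd);
    if (rc == -1 || !WIFEXITED(rc))
        return -1;
    return WEXITSTATUS(rc);
}
```

With this helper, a nonzero result from command_status("hinv -s -c disk -b 1") corresponds to the else branch of Example 2-1.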

You can access the inventory table in a C program using the functions documented in the getinvent(3) reference page. The only access method supported is a sequential scan over the table, viewing all entries. Three functions permit access:

setinvent() 

initializes or reinitializes the scan to the first row

getinvent() 

returns the next table row in sequence

endinvent() 

releases storage allocated by setinvent() 

These functions use static variables and should only be used by a single process within an address space. Reentrant forms of the same functions, which can safely be used in a multithreaded process, are also available (see getinvent(3)). Example 2-2 demonstrates the use of these functions.

The format of one inventory table row is declared as type inventory_t in the sys/invent.h header file. This header file also supplies symbolic names for all the class and type numbers that can appear in the table, as well as containing commentary explaining the meanings of some of the numbers.

Example 2-2. Function Returning Type Code for CPU Module

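#include <stddef.h> /* for NULL */
#include <invent.h> /* includes sys/invent.h */
/* Returns the state value of the first CPU board row, or 0 if none. */
int getIPtypeCode(void)
{
   inv_state_t * pstate = NULL;
   inventory_t * work;
   int ret = 0;
   if (setinvent_r(&pstate) < 0)
      return 0; /* scan could not be started */
   /* getinvent_r() returns NULL after the last row, so stop there */
   while (!ret && NULL != (work = getinvent_r(pstate))) {
      if ( (INV_PROCESSOR == work->inv_class)
      &&   (INV_CPUBOARD == work->inv_type) )
         ret = work->inv_state;
   }
   endinvent_r(pstate); /* releases storage addressed by pstate */
   return ret;
}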


Creating an Inventory Entry

Device drivers supplied by Silicon Graphics add information to the hardware inventory by adding vertexes to the hwgraph (see “Driver Interface to Hwgraph”) and then by attaching inventory_t structures to vertexes using the device_inventory_add() function. This and other hwgraph functions are discussed on the hwgraph.inv(d3x) reference page, and under “Hardware Graph Management” in Chapter 8.  

The inventory_t structure is declared in the header file sys/invent.h, along with the inventory type and class numbers that are valid.

Drivers written for releases prior to IRIX 6.4 called the add_to_inventory() kernel function in order to add a row to the inventory table. This function is supported in IRIX 6.5 in a limited way. When called, it attaches the inventory information to the root of the hwgraph (to the /hw directory itself). As a result, the hinv command does see and report the added inventory information, but the information is not physically associated with the hwgraph vertex to which it applies.


Note: The only valid inventory types and classes are those declared in sys/invent.h. Only those numbers can be decoded and displayed by the hinv command, which prints an error message if it finds an unknown device class, and which prints nothing at all for an unknown device type within a known class. There is no provision for adding new device-class or device-type values for third-party devices.  

However, it is possible now for a driver to add any arbitrary descriptive string desired to any vertex. These labelled attributes can be retrieved by the attr command and in software by the attr_get() function (see attr(1) and attr_get(2)).

Using ioconfig for Global Controller Numbers

An Origin2000 system can be reconfigured dynamically, so the complement of devices can change from day to day or even minute to minute; this was a primary motive for creating the hwgraph. However, the dynamic nature of the hardware complement makes it difficult to define a stable, predictable numbering scheme for hardware devices. This need is met by the ioconfig command (see the ioconfig(1M) reference page).

Need for Stable Numbering

As discussed under “IRIX Conventional Device Names”, a conventional name for a disk device in the /dev/dsk directory is dksCdulnsp. The number C is the “controller” number, which in previous systems represented a fixed, well-known numbering of SCSI bus adapters. No such fixed numbering is inherent in the Origin2000 architecture. Controller cards can be added to and removed from modules, and entire modules can be added to and removed from the system.

Users of network interface cards, serial ports, bus adapters, and other devices need a predictable, static naming scheme for devices. The name /dev/ttyf2 should represent the same serial port tomorrow that it does today. A related problem is that some device drivers want to place extra, short-circuit vertexes under /hw to allow simpler access to their devices (see “Additional Edges”). Typically such short-circuit names ought to be distinguished by a predictable number.

However, it is impossible to assign stable, repeatable controller numbers dynamically at boot time, while the system is discovering the I/O complement. All the CPUs in the system boot at the same time. Bus controllers and device drivers are initialized in parallel on the nodes to which the hardware is connected. The sequence in which this happens is unpredictable; and in any case the hardware connections can change from boot to boot. A driver cannot know, when it is called to attach a device, what controller number it ought to specify in the hardware inventory.

Design of ioconfig

In order to solve these problems, the ioconfig command is invoked automatically, after device drivers have been initialized and the hwgraph has been initialized, but before user processes are started.

Operating in parallel for speed, ioconfig traverses the entire hwgraph, inspecting the hardware inventory data at each vertex. At a vertex where the hardware inventory Class value indicates a controller that should be numbered, ioconfig assigns a number, and updates the hardware inventory Controller value to reflect the assigned number. Then the program opens the device and optionally issues an ioctl() call. This results in an entry to the open() entry point, and optionally the ioctl() entry point, of the device driver (for an overview of this interaction, see “Overview of Device Open” in Chapter 3 and “Overview of Device Control” in Chapter 3).

In these entry points, the device driver can recognize that its device now has an assigned Controller number. The driver can use this information to create extra hwgraph vertexes and edges if it wishes. (For an overview of how the distributed SCSI drivers use this facility, see “SCSI Devices in the hwgraph” in Chapter 16.)

Configuration Control File

The ioconfig program uses three disk files. The first is /etc/ioconfig.conf, in which it records the controller numbers it has assigned and the related /hw pathnames. When it needs to assign a number, ioconfig first looks up the current hwgraph path in /etc/ioconfig.conf. If the path appears, ioconfig assigns the same controller number that it used last time. If the path does not appear, ioconfig assigns the lowest number that has never been assigned in this device Class, and adds the path and its number to /etc/ioconfig.conf.

This procedure ensures that a given device always receives the same controller number, even if the device is removed and later replaced. Users can inspect /etc/ioconfig.conf at any time to discover the numbering, and so can infer the relationship of a controller number in /dev/dsk (for example) to a vertex in the hwgraph. Alternatively, the system administrator can cause all numbers to be reassigned simply by removing the file /etc/ioconfig.conf.
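
The lookup-then-assign rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ioconfig source: the function name assign_controller_number is hypothetical, a dictionary stands in for the part of /etc/ioconfig.conf that belongs to one device Class, the default starting number of 0 is assumed, and the on-disk format of the file is not modeled.

```python
def assign_controller_number(table, hw_path, start_num=0):
    """Return the controller number for hw_path, assigning one if needed.

    table is a dict mapping hwgraph path -> controller number for ONE
    device Class. It is a hypothetical stand-in for /etc/ioconfig.conf.
    """
    # A path that was seen on an earlier boot keeps its old number.
    if hw_path in table:
        return table[hw_path]

    # Otherwise take the lowest number that has never been handed out.
    # Entries are never deleted from the table, so a number stays
    # reserved even while its device is absent from the system.
    used = set(table.values())
    number = start_num
    while number in used:
        number += 1

    table[hw_path] = number
    return number
```

Because entries are only ever added, a controller that is removed and later reinstalled at the same hwgraph path gets its old number back. Emptying the table corresponds to removing /etc/ioconfig.conf, after which numbering starts over.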

Permissions Control File

The ioconfig command also can be used to set ownership and permissions on the device special files. This enables the administrator to specify ownership and permissions for device names that are created dynamically, each time the system boots.

Assignment of permissions is driven by the file /etc/ioperms. Its format (as described in ioconfig(1M)) has four fields:

device_name 

A path in /hw or /dev. The path can include wildcards so it applies to many devices.

permissions 

The device file permissions, as an octal number, as described in chmod(1) or chmod(2).

owner_name 

A valid userid to own the device, usually root.

group_name 

A valid group name to own the device, usually sys.

There is no requirement that /etc/ioperms describe only existing devices; it can describe devices that are not currently in the system. Also it can describe devices defined by third parties other than Silicon Graphics.
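
To make the four fields concrete, the following Python sketch parses one /etc/ioperms line and looks up the entry that applies to a device path. It is an illustration only. The names PermEntry, parse_ioperms_line, and find_entry are hypothetical, as is the sample line in the usage note. The sketch also assumes that fields are separated by white space, that wildcards follow shell-style globbing, and that the first matching entry wins; ioconfig(1M) is the authority on all three points.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class PermEntry(NamedTuple):
    device_name: str   # path in /hw or /dev, possibly with wildcards
    permissions: int   # mode bits, parsed from the octal field
    owner_name: str
    group_name: str


def parse_ioperms_line(line):
    """Split one line into its four fields (white space assumed)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, found %d" % len(fields))
    device_name, permissions, owner_name, group_name = fields
    return PermEntry(device_name, int(permissions, 8), owner_name, group_name)


def find_entry(entries, path):
    """Return the first entry whose pattern matches path, else None."""
    for entry in entries:
        # Shell-style matching is an assumption made for this sketch.
        if fnmatchcase(path, entry.device_name):
            return entry
    return None
```

For example, parsing the hypothetical line "/dev/ttyf* 0666 root sys" yields an entry that find_entry() returns for /dev/ttyf2 but not for a name under /dev/dsk.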

Device Management File

The ioconfig command has built-in knowledge of Silicon Graphics network and disk controllers and other devices. However, you can cause ioconfig to assign a controller number to an OEM device, and to call your driver when it does so. You do this by placing a file in the directory /var/sysgen/ioconfig.

All files in that directory are processed by ioconfig. A noncomment line in any of these files has the following seven fields (not 8 fields, as some editions of the ioconfig(1M) reference page show):

class 

The inventory Class value that is found in a vertex of this kind, as an integer number.

type 

The inventory Type value that is found in a vertex of this kind, as an integer number. Use -1 for “any.”

state 

The inventory State value that is found in a vertex of this kind, as an integer number. Use -1 for “any.”

suffix 

A suffix to be added to the hwgraph path name when opening the device. Use the two characters -1 to mean “none.”

pattern 

A hwgraph path prefix that defines the set of controller numbers for this Class, Type, and State of device. Use the characters -1 to mean “use the hwgraph base path string.”

start_num 

The lowest (first) controller number to be assigned to devices of this Class, Type, and State; the first number assigned under pattern.

ioctl_num 

The ioctl command number to pass in an ioctl call after opening the device, as a decimal or hexadecimal integer. Use -1 to say “no ioctl.”

By placing a file in /var/sysgen/ioconfig, you can cause ioconfig to assign a controller number to devices that you support, and to open each device and optionally execute an ioctl call against each device, so the device driver can take note of the assigned number.
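
The seven fields and their -1 conventions can be summarized in a short Python sketch. It is an illustration under assumptions, not a description of how ioconfig is implemented: the names DeviceRule, parse_rule, and rule_matches are hypothetical, fields are assumed to be separated by white space, and the sample values in the usage note are invented.

```python
from typing import NamedTuple, Optional


class DeviceRule(NamedTuple):
    inv_class: int
    inv_type: Optional[int]    # None stands for -1, "any"
    inv_state: Optional[int]   # None stands for -1, "any"
    suffix: Optional[str]      # None stands for -1, "none"
    pattern: Optional[str]     # None stands for -1, use the base path
    start_num: int
    ioctl_num: Optional[int]   # None stands for -1, "no ioctl"


def _int_or_none(text):
    """Parse a decimal or 0x-prefixed integer; map -1 to None."""
    value = int(text, 0)
    return None if value == -1 else value


def parse_rule(line):
    """Split one noncomment line into its seven fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, found %d" % len(fields))
    cls, typ, state, suffix, pattern, start, ioctl = fields
    return DeviceRule(
        int(cls, 0),
        _int_or_none(typ),
        _int_or_none(state),
        None if suffix == "-1" else suffix,
        None if pattern == "-1" else pattern,
        int(start, 0),
        _int_or_none(ioctl),
    )


def rule_matches(rule, inv_class, inv_type, inv_state):
    """True when a vertex's inventory values select this rule."""
    return (rule.inv_class == inv_class
            and rule.inv_type in (None, inv_type)
            and rule.inv_state in (None, inv_state))
```

With the invented line "20 -1 -1 -1 /hw/mydev/ 1 0x20", parse_rule() produces a rule that matches any Type and State of Class 20, numbers devices from 1 under the prefix /hw/mydev/, and requests ioctl command 0x20 after the open.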

Configuration Files

IRIX uses a number of configuration files to supplement its knowledge of devices and device drivers. This is a summary of the files. The use of each file for device driver purposes is described in more detail in other chapters. (The use of these files for other system administration tasks is covered in IRIX Admin: System Configuration and Operation.)

Most configuration files used by the IRIX kernel are located in the directory /var/sysgen. Files used by the X11 display system are generally in /usr/lib/X11. With regard to device drivers, the important files are:

/var/sysgen/master.d/* 

Descriptions of the attributes of kernel modules

/var/sysgen/boot/* 

Kernel object modules

/var/sysgen/system/*.sm 

Kernel configuration directions

/var/sysgen/mtune/* 

Values and limits of tunable parameters

/var/sysgen/stune 

New values for tunable parameters

/var/sysgen/ioconfig/* 

Directives to the ioconfig program

/usr/lib/X11/input/config/* 

Initialization commands for Xdm input modules


Master Configuration Database

Every configurable module of the kernel (this includes kernel-level device drivers and other optional kernel modules) is represented by a single file in the directory /var/sysgen/master.d.

A file in master.d describes the attributes of a module of the kernel which is to be loaded at boot time (or loaded later). The general syntax of the file is documented in detail in the master(4) reference page. Only a subset of the syntax is used to describe a device driver module. In general, the master.d file specifies device driver attributes such as:

  • the driver's prefix, a name that qualifies all its entry points

  • whether it is a block, character, or STREAMS driver

  • the major number serviced by the driver

  • whether the driver can be loaded dynamically as needed

  • whether the driver is multiprocessor-aware

  • which of the possible driver entry points the driver supplies

For each module described in a master.d file there should be a corresponding object module in /var/sysgen/boot. The creation of device driver modules and the syntax of master.d files are covered in detail in Chapter 9, “Building and Installing a Driver”.

Kernel Configuration Files

The files /var/sysgen/system/*.sm direct the lboot command in loading the modules of the kernel at boot time. Although there are normally several files with the names of subsystems, all the files in this directory are treated as one single file. The exact syntax of these files is documented in the system(4) reference page.

Use of Configuration Files by lboot

The contents of the files direct lboot in loading components that are described by files in /var/sysgen/master.d, and in probing for devices to see if they exist. (For details of the operation of lboot, see the lboot(1M) and autoconfig(1M) reference pages.)

The VECTOR statement in a kernel configuration file directs lboot to probe for the existence of hardware at a stated address, and to include a device driver only when the hardware exists. Starting with IRIX 6.3, the kernel automatically probes the PCI bus and other attachments in which the hardware devices can identify themselves dynamically. The VECTOR statement is used only for VME and EISA devices (in systems that support them) because these cannot identify themselves automatically.

Storing Device and Driver Attributes

The system administrator can place statements in any file in /var/sysgen/system. These statements cause labelled attributes to be placed in the hardware graph, where device drivers can retrieve them (see “Driver Interface to Hwgraph” and the system(4) reference page).

The DEVICE_ADMIN statement is used to attach an attribute giving information about a particular device. The attribute is attached to a specific device special file in the hwgraph. Its syntax is as follows:

DEVICE_ADMIN : /hw/path label = value [, label = value]... 

The colon (:) is required; do not overlook it. The values you supply are:

path 

Completion of a path to a device special file in the /hw filesystem.

label 

The label for which the device driver will ask.

value 

The value, a character string, the driver will retrieve.

The path is terminated by white space. The label is terminated by the “=” or by white space. The value is terminated by a comma or by the end of the line, so the value can contain white space and special characters other than the comma. As one example of the use of DEVICE_ADMIN, you can find the following in /var/sysgen/system/irix.sm:

DEVICE_ADMIN: /hw/module/1/slot/io1/baseio/pci/0/scsi_ctlr/0
                                          ql_request_queue_depth=1024

The path specifies a particular SCSI controller. The label is “ql_request_queue_depth,” and the value is 1024.

The DRIVER_ADMIN statement is used to pass a value directly to a device driver. Its syntax is as follows:

DRIVER_ADMIN : prefix label = value [, label = value]... 

The values you supply are:

prefix 

The prefix string that identifies a driver (see “Driver Name Prefix” in Chapter 7).

label 

The label for which the device driver will ask.

value 

The value, a character string, the driver will retrieve.

The prefix is terminated by white space. The label is terminated by the “=” or by white space. The value is terminated by a comma or by the end of the line, so the value can contain white space and special characters other than the comma.

These two statements can be placed in any file in /var/sysgen/system, but typically appear in the irix.sm file. The device driver must expect to receive labeled values, and must request them using the interface described under “Retrieving Administrator Attributes” in Chapter 8.
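
The termination rules for both statements (the path or prefix ends at white space, each label ends at “=” or white space, each value ends at a comma or the end of the line) can be illustrated with a small Python parser. This is a sketch under assumptions and says nothing about how lboot or the kernel actually parses the files. The function name parse_admin_statement is hypothetical, the statement is assumed to arrive as a single line, and leading and trailing white space around labels and values is assumed to be insignificant.

```python
def parse_admin_statement(statement):
    """Parse one DEVICE_ADMIN or DRIVER_ADMIN statement.

    Returns (keyword, target, attributes), where target is the /hw path
    or the driver prefix and attributes maps each label to its value.
    """
    keyword, colon, rest = statement.partition(":")
    if not colon:
        raise ValueError("the colon after the keyword is required")
    keyword = keyword.strip()
    if keyword not in ("DEVICE_ADMIN", "DRIVER_ADMIN"):
        raise ValueError("unknown statement: %r" % keyword)

    # The path or prefix is terminated by white space.
    parts = rest.split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected a target followed by label=value")
    target, pairs = parts

    attributes = {}
    # Each value runs to the next comma or to the end of the line,
    # so a value may contain white space but not a comma.
    for pair in pairs.split(","):
        label, equals, value = pair.partition("=")
        if not equals:
            raise ValueError("missing '=' in %r" % pair)
        attributes[label.strip()] = value.strip()
    return keyword, target, attributes
```

Applied to the irix.sm example shown earlier, written on one line, the sketch returns the SCSI controller path as the target and a single attribute, ql_request_queue_depth, with the string value "1024". Values stay strings because, as the text says, the driver retrieves a character string.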

Setting Interrupt Targets and Levels

The DEVICE_ADMIN statement is used to perform general administration of device interrupts. These uses are documented with examples in /var/sysgen/system/irix.sm:

  • DEVICE_ADMIN: CPU-path NOINTR=1 prevents device interrupts from being directed to that CPU.

  • DEVICE_ADMIN: device-path INTR_TARGET=CPU-path directs all interrupts from a device to a CPU.

  • DEVICE_ADMIN: device-path INTR_SWLEVEL=n sets the dispatching priority for the thread that executes the interrupt handler for a device. The default is 230 and normally should not be changed.

Setting 32-bit Direct Mapping Node

The DEVICE_ADMIN statement is also used to administer 32-bit direct mapping.


Note: The following information does not apply to O2 or Octane systems.

When a PCI driver uses 32-bit direct mapping (with the pciio_dmatrans_addr() and pciio_dmatrans_list() functions), the memory space that is being mapped must be on one specific node. The default is node zero. You can use the DEVICE_ADMIN statement to change the mapping node for a specific PCI bus.


Caution: This change occurs at the PCI bus level, not the device level. This means that each device on that PCI bus will be affected by the change.

These uses are documented with examples in /var/sysgen/system/irix.sm:

  • DEVICE_ADMIN: pcibus-hwgraph-path PCIBUS_DMATRANS_NODE=node-hwgraph-path sets the node to be used by the specified PCI bus, for all 32-bit direct mapping.

  • The following example applies to SGI Origin 2000 systems only:

    
    DEVICE_ADMIN: /hw/module/1/slot/io11/xtalk_pci/pci PCIBUS_DMATRANS_NODE=/hw/nodenum/2
    

  • The following example applies to SGI Origin 3000 systems only:

    
    DEVICE_ADMIN: /hw/module/006p05/Pbrick/xtalk/8/pci PCIBUS_DMATRANS_NODE=/hw/nodenum/1
    

System Tuning Parameters

The IRIX kernel supports a variety of tunable parameters, some of which can be interrogated by device drivers. The current values of the parameters are recorded in the files /var/sysgen/mtune/* (one file per major subsystem).

You or the system administrator can view the current settings using the systune command (see the systune(1M) reference page). The system administrator can use systune to request changes in parameters. Some changes take effect at once; others are recorded in a modified kernel that is loaded the next time the system boots.

To retrieve certain tuning parameters from within a kernel-level device driver, include the header file sys/var.h.

The use of systune and its related files is covered in IRIX Admin: System Configuration and Operation.

X Display Manager Configuration

Most files related to the configuration of the X Display Manager Xdm are held in /var/X11. These files are documented in reference pages such as xdm(1) and in the programming manuals related to the X Window System.

One set of files, in /usr/lib/X11/input/config, controls the initialization of nonstandard input devices. These devices use STREAMS modules, and their configuration is covered in Chapter 22, “STREAMS Drivers”.