Glossary

active metadata server

A server-capable administration node chosen from the list of potential metadata servers. There can be only one active metadata server for any one filesystem.

administration node

See CXFS administration node.

cell ID

A number associated with a node that is used by the CXFS software and appears in messages.

CLI

Underlying command-line interface commands used by the CXFS Manager graphical user interface (GUI) and the cmgr(1M) command.

client

See client-only node, client administration node, and CXFS administration node.

cluster

A cluster is the set of systems (nodes) configured to work together as a single computing resource. A cluster is identified by a simple name and a cluster ID. A cluster running multiple operating systems is known as a multiOS cluster.

Only one cluster may be formed from a given pool of nodes.

Disks or logical units (LUNs) are assigned to clusters by recording the name of the cluster on the disk (or LUN). Thus, if any disk is accessible (via a Fibre Channel connection) from machines in multiple clusters, then those clusters must have unique names. When members of a cluster send messages to each other, they identify their cluster via the cluster ID. Thus, if two clusters will be sharing the same network for communications, then they must have unique cluster IDs. In the case of multiOS clusters, both the names and IDs must be unique if the clusters share a network.

Because of the above restrictions on cluster names and cluster IDs, and because cluster names and cluster IDs cannot be changed once the cluster is created (without deleting the cluster and recreating it), SGI advises that you choose unique names and cluster IDs for each of the clusters within your organization.

cluster ID

A unique number within your network in the range 1 through 128. The cluster ID is used by the operating system kernel to make sure that it does not accept cluster information from any other cluster that may be on the network. The kernel does not use the database for communication, so it requires the cluster ID in order to verify cluster communications. This information in the kernel cannot be changed after it has been initialized; therefore, you must not change a cluster ID after the cluster has been defined. Clusters that share a network must have unique names and IDs.

cluster administrator

The person responsible for managing and maintaining a cluster.

cluster database

Contains configuration information about all nodes and the cluster. The database is managed by the cluster administration daemons.

cluster domain

XVM concept in which a filesystem applies to the entire cluster, not just to the local node. See also local domain.

cluster database membership

The group of administration nodes in the pool that are accessible to cluster administration daemons and therefore are able to receive cluster database updates; this may be a subset of the nodes defined in the pool. The cluster administration daemons manage the distribution of the cluster database (CDB) across the administration nodes in the pool. (Also known as user-space membership and fs2d database membership.)

cluster mode

One of two methods of CXFS cluster operation, Normal or Experimental. In Normal mode, CXFS resets any node for which it detects heartbeat failure; in Experimental mode, CXFS ignores heartbeat failure. Experimental mode allows you to use the kernel debugger (which stops heartbeat) without causing node failures. You should only use Experimental mode during debugging.

cluster node

A node that is defined as part of the cluster. See also node.

coexecution

The ability to run CXFS and IRIS FailSafe together. For more information, see “Overview of IRIS FailSafe Coexecution” in Chapter 1.

control messages

Messages that cluster software sends between the cluster nodes to request operations on or distribute information about cluster nodes. Control messages and heartbeat messages are sent through a node's network interfaces that have been attached to a control network.

A node's control networks should not be set to accept control messages if the node is not a dedicated CXFS node. Otherwise, end users who run other jobs on the machine can have their jobs killed unexpectedly when CXFS resets the node.

control network

The network that connects nodes through their network interfaces (typically Ethernet) such that CXFS can send heartbeat messages and control messages through the network to the attached nodes. CXFS uses the highest priority network interface on the control network; it uses a network interface with lower priority when all higher-priority network interfaces on the control network fail.

CXFS administration node

A node in the pool that is installed with the cluster_admin.sw.base software product, allowing the node to perform cluster administration tasks and contain a copy of the cluster database. There are two types of administration nodes: server-capable administration nodes and client administration nodes.

client administration node

A node that is installed with the cluster_admin software product, allowing the node to perform cluster administration tasks and contain a copy of the cluster database, but that is not capable of coordinating cluster activity and metadata.

client-only node

A node that is installed with the cxfs_client.sw.base software product; it does not run cluster administration daemons and is not capable of coordinating cluster activity and metadata. Any node can be a client-only node. See also CXFS administration node and client administration node.

CXFS database

See cluster database.

CXFS kernel membership

The group of CXFS nodes that can share filesystems in the cluster, which may be a subset of the nodes defined in a cluster. During the boot process, a node applies for CXFS kernel membership. Once accepted, the node can share the filesystems of the cluster. (Also known as kernel-space membership.) CXFS kernel membership differs from cluster database membership and FailSafe membership. For more information about FailSafe, see IRIS FailSafe Version 2 Administrator's Guide.

CXFS shutdown

The failure action that stops CXFS kernel-based services on the node in response to a loss of CXFS kernel membership. The surviving cluster delays the beginning of recovery to allow the node time to complete the shutdown.

CXFS tiebreaker node

A node identified as a tiebreaker for CXFS to use in the process of computing CXFS kernel membership for the cluster, when exactly half the nodes in the cluster are up and can communicate with each other. There is no default CXFS tiebreaker. The CXFS tiebreaker differs from the FailSafe tiebreaker; see IRIS FailSafe Version 2 Administrator's Guide.

database

See cluster database.

database membership

See cluster database membership.

details area

The portion of the GUI window that displays details about a selected component in the view area. See also view area.

domain

See cluster domain and local domain.

FailSafe membership

The group of nodes that are actively sharing resources in the cluster, which may be a subset of the nodes defined in a cluster. FailSafe membership differs from CXFS kernel membership and cluster database membership. For more information about FailSafe, see IRIS FailSafe Version 2 Administrator's Guide.

failure action hierarchy

The set of instructions that determine what happens to a failed node; the second instruction will be followed only if the first instruction fails; the third instruction will be followed only if the first and second fail. The available actions are: I/O fencing, reset, and shutdown.
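As an illustration of this fallback ordering, the following Python sketch tries each action in turn and stops at the first one that succeeds. It is purely conceptual; the action functions are hypothetical placeholders, not CXFS interfaces.

    # Conceptual sketch of a failure action hierarchy; not CXFS code.
    def try_fence(node):
        """Attempt I/O fencing; return True on success (placeholder)."""
        return False  # pretend fencing is not available for this node

    def try_reset(node):
        """Attempt a serial hardware reset; return True on success (placeholder)."""
        return True

    def try_shutdown(node):
        """Attempt a CXFS shutdown; return True on success (placeholder)."""
        return True

    def handle_failed_node(node):
        # Each action is attempted only if every earlier action has failed.
        for action in (try_fence, try_reset, try_shutdown):
            if action(node):
                return action.__name__
        return None

    print(handle_failed_node("node42"))  # prints "try_reset" in this example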

fencing

See I/O fencing.

fencing recovery

The process of recovery from fencing, in which the affected node automatically withdraws from the CXFS kernel membership, unmounts all filesystems that are using an I/O path via fenced HBA(s), and then rejoins the cluster.

fs2d database membership

See cluster database membership.

heartbeat messages

Messages that cluster software sends between the nodes that indicate a node is up and running. Heartbeat messages and control messages are sent through the node's network interfaces that have been attached to a control network.

heartbeat interval

The time between heartbeat messages. The node timeout value must be at least 10 times the heartbeat interval for proper CXFS operation. The more frequent the heartbeat messages (that is, the smaller the heartbeat interval), the greater the potential for slowing down the network.

I/O fencing

The failure action that isolates a problem node so that it cannot access I/O devices, and therefore cannot corrupt data in the shared CXFS filesystem. I/O fencing can be applied to any node in the cluster (CXFS clients and metadata servers). The rest of the cluster can begin immediate recovery.

kernel-space membership

See CXFS kernel membership.

local domain

XVM concept in which a filesystem applies only to the local node, not to the cluster. See also cluster domain.

log configuration

A log configuration has two parts: a log level and a log file, both associated with a log group. The cluster administrator can customize the location and amount of log output, and can specify a log configuration for all nodes or for only one node. For example, the crsd log group can be configured to log detailed level-10 messages to the crsd-foo log only on the node foo and to write only minimal level-1 messages to the crsd log on all other nodes.

log file

A file containing notifications for a particular log group. A log file is part of the log configuration for a log group.

log group

A set of one or more CXFS processes that use the same log configuration. A log group usually corresponds to one daemon, such as gcd.

log level

A number that controls how many log messages CXFS will write into an associated log group's log file. A log level is part of the log configuration for a log group.

membership

See cluster database membership and CXFS kernel membership.

membership weight

In previous releases, a number (usually 0 or 1) that is assigned to a node for purposes of calculating the CXFS kernel membership quorum. 1 indicates that the node is eligible to be a potential metadata server. In most circumstances, this is no longer set and has been replaced by the node function definition.

membership version

A number associated with a node's cell ID that indicates the number of times the CXFS kernel membership has changed since a node joined the membership.

metadata

Information that describes a file, such as the file's name, size, location, and permissions.

metadata server

The administration node that coordinates updating of metadata on behalf of all nodes in a cluster. There can be multiple potential metadata servers, but only one is chosen to be the active metadata server for any one filesystem.

multiOS

A cluster that is running multiple operating systems, such as IRIX and Solaris.

node

A node is an operating system (OS) image, usually an individual computer. (This use of the term node does not have the same meaning as a node in an SGI Origin 3000 or SGI 2000 system.)

A given node can be a member of only one pool (and therefore of only one cluster).

A node can run the IRIX operating system or another operating system, such as Solaris, as defined in the CXFS MultiOS for CXFS Client-Only Nodes: Installation and Configuration Guide.

See also CXFS administration node, client administration node, client-only node, server-capable administration node, and standby node.

node ID

An integer in the range 1 through 32767 that is unique among the nodes in the pool. If you do not specify a number, CXFS will calculate an ID for you. The default for an IRIX node ID is a 5-digit number based on the machine's serial number and other machine-specific information; it is not sequential. You must supply the node ID for a client running an operating system other than IRIX. You must not change the node ID number after the node has been defined.

node membership

The list of nodes that are active (have CXFS kernel membership) in a cluster.

node timeout

If no heartbeat is received from a node in this period of time, the node is considered to be dead. The node timeout value must be at least 10 times the heartbeat interval for proper CXFS operation.
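This sizing rule amounts to a simple arithmetic check. The following Python fragment illustrates it; the numeric values are examples only, not recommended settings.

    # Check that the node timeout is at least 10 times the heartbeat interval.
    heartbeat_interval = 0.5   # example value, in seconds
    node_timeout = 7.0         # example value, in seconds

    if node_timeout >= 10 * heartbeat_interval:
        print("node timeout satisfies the 10x heartbeat interval rule")
    else:
        print("node timeout is too small for this heartbeat interval")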

notification command

The command used to notify the cluster administrator of changes or failures in the cluster and nodes. The command must exist on every node in the cluster.

owner host

A system that can control a node remotely, such as power-cycling the node. At run time, the owner host must be defined as a node in the pool.

owner TTY name

The device file name of the terminal port (TTY) on the owner host to which the system controller is connected. The other end of the cable connects to the node with the system controller port, so the node can be controlled remotely by the owner host.

pool

The pool is the set of nodes from which a particular cluster may be formed. Only one cluster may be configured from a given pool, and it need not contain all of the available nodes. (Other pools may exist, but each is disjoint from the others; they share no node or cluster definitions.)

A pool is formed when you connect to a given node and define that node in the cluster database using the CXFS GUI or cmgr(1M) command. You can then add other nodes to the pool by defining them while still connected to the first node, or to any other node that is already in the pool. (If you were to connect to another node and then define it, you would be creating a second pool).

port password

The password for the system controller port, usually set once in firmware or by setting jumper wires. (This is not the same as the node's root password.)

potential metadata server

A server-capable administration node that is listed in the metadata server list when defining a filesystem; only one node in the list will be chosen as the active metadata server.

quorum

The number of nodes required to form a cluster, which differs according to membership:

  • For CXFS kernel membership:

    • A majority (>50%) of the server-capable nodes in the cluster are required to form an initial membership

    • Half (50%) of the server-capable nodes in the cluster are required to maintain an existing membership

  • For cluster database membership, 50% of the nodes in the pool are required to form and maintain a cluster.
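Read literally, these rules reduce to simple arithmetic. The following Python sketch illustrates that arithmetic; the rounding behavior for odd node counts is an assumption made for illustration, not a statement of CXFS behavior.

    import math

    # Illustrative quorum arithmetic; not a CXFS interface.
    def cxfs_kernel_quorum(server_capable_nodes, forming=True):
        """Nodes needed to form (majority, >50%) or maintain (half, 50%)
        a CXFS kernel membership."""
        if forming:
            return server_capable_nodes // 2 + 1    # strict majority
        return math.ceil(server_capable_nodes / 2)  # half, rounded up

    def cluster_database_quorum(pool_nodes):
        """Nodes needed to form and maintain cluster database membership
        (50% of the pool, rounded up)."""
        return math.ceil(pool_nodes / 2)

    # Example: 4 server-capable nodes in a cluster drawn from a 6-node pool.
    print(cxfs_kernel_quorum(4, forming=True))    # 3 nodes to form
    print(cxfs_kernel_quorum(4, forming=False))   # 2 nodes to maintain
    print(cluster_database_quorum(6))             # 3 nodes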

recovery

The process by which the metadata server moves from one node to another due to an interruption in services on the first node. Recovery in this release is supported only on standby nodes.

relocation

The process by which the metadata server moves from one node to another due to an administrative action; other services on the first node are not interrupted. Relocation is disabled in this release.

reset

See serial hardware reset.

server-capable administration node

A node that is installed with the cluster_admin product and is also capable of coordinating cluster activity and metadata.

SAN

Storage area network, a high-speed, scalable network of servers and storage devices that provides storage resource consolidation, enhanced data access/availability, and centralized storage management.

serial hardware reset

The failure action that performs a system reset via a serial line connected to the system controller. This failure action hierarchy choice applies only to IRIX nodes with system controllers; see “Requirements” in Chapter 1.

shutdown

See CXFS shutdown.

snooping

A security breach involving illicit viewing.

split-brain syndrome

A situation in which multiple clusters are formed due to a network partition and the lack of serial hardware reset and/or CXFS tiebreaker capability.

spoofing

A security breach in which one machine on the network masquerades as another.

standby node

A server-capable administration node that is configured as a potential metadata server for a given filesystem, but does not currently run any applications that will use that filesystem.

system controller port

A port located on an IRIX node that provides a way to power-cycle the node remotely. Enabling or disabling a system controller port in the cluster database tells CXFS whether it can perform operations on the system controller port. System controller port information is optional for a node in the pool, but is required if the node will be added to a cluster; otherwise, resources running on that node will never be highly available.

tiebreaker node

See CXFS tiebreaker node.

user-space membership

See cluster database membership.

view area

The portion of the GUI window that displays components graphically. See also details area.

weight

See membership weight.