Chapter 4. Mutual Exclusion

You use mutual exclusion facilities whenever data is shared by multiple, independent processes or threads. Using such objects as locks (also called mutexes) and semaphores, you can:

In order to share data between processes, you share memory between them. Memory sharing is covered in Chapter 3, “Sharing Memory Between Processes.” When independent processes share access to data in disk files, they can ensure mutual exclusion using file locks, which are covered in Chapter 7, “File and Record Locking.”

This chapter covers the following major topics:

Overview of Mutual Exclusion

IRIX offers five kinds of mutual exclusion, each kind with its limits and advantages:

  • Test-and-set instructions use special instructions in the MIPS CPU to update a memory location in a predictable way.

  • The lock (or mutex) enables processes to enforce serial use of data or code.

  • The condition variable lets a thread give up a lock and sleep until an event happens, then reclaim the lock and continue, all in a single operation.

  • The semaphore lets independent processes manage a countable resource in an orderly way.

  • The barrier lets processes coordinate their initialization.

There is a hierarchy of complexity. Test-and-set instructions are a primitive facility that could be used to implement the others. The lock is a simple object that could be used to implement semaphores and barriers. The semaphore is the most flexible and general facility.

Test-and-Set Instructions

The MIPS instruction architecture includes two instructions designed to let programs update memory from independent processes running concurrently in a multiprocessor.

  • The Load Linked (LL) instruction loads a 32- or 64-bit word from memory and also tags that cache line so that the hardware can recognize any change to memory from any CPU in a multiprocessor.

  • The Store Conditional (SC) instruction stores a 32- or 64-bit word into memory provided that the destination cache line has not been modified. If the cache line has been altered since the LL instruction was used, SC does not update memory and sets a branch condition.

The combination of LL and SC can be used to guarantee that a change to a memory location is effective, even when multiple concurrent CPUs are trying to update the same location. You can use LL and SC only from an assembly language module. However, the IRIX kernel contains a family of services that are implemented using LL/SC, and you can call them from C or C++. These calls are discussed under “Using Test-and-Set Functions”.

Locks

A lock is a small software object that stands for the exclusive right to use some resource. The resource could be the right to execute a section of code, or the right to modify a variable in memory, or the right to read or write in a file, or any other software operation that must by performed serially, by one process at a time. Before using a serial resource, the program claims the lock, and releases the lock when it is done with the resource.

The POSIX standard refers to an object of this kind as a mutex, a contraction of “mutual exclusion” that is a conventional term in computer science. This book uses the simpler word “lock” when discussing locks in general and IRIX locks in particular, and uses “mutex” when discussing POSIX mutexes.

You can use IRIX locks to coordinate between unrelated processes or lightweight processes through an IRIX shared memory arena. You can use POSIX mutexes to coordinate between POSIX threads in a threaded program only (not IRIX processes).

You define the meaning of a lock in terms that are relevant to your program's design. You decide what resources can be used freely at any time, and you decide what resources must be used serially, by one process at a time. You create and initialize a lock for each serial resource.

It is also your job to ensure that locks are used consistently in all parts of the program. Two errors are easy to make. You can forget to claim a lock, so that some part of the program uses a resource freely instead of serializing. Or you can forget to release a lock, so that other processes trying to claim the lock “hang,” or wait forever.

Both of these errors can be hard to find because the symptoms can be intermittent. Most of the time, there is no contention for the use of a shared variable. For example, if one process sometimes fails to claim a lock before updating memory, the program can seem to run correctly for hours (or months) before it suffers precisely the right combination of coincidences that cause two processes to update the variable at the same time.

Semaphores

A semaphore is an integer count that is accessed atomically using two operations that are conventionally called P and V:

  • The P operation (mnemonic deplete) decrements the count. If the result is not negative, the operation succeeds and returns. If the result is negative, the P operation suspends the calling process until the count has been made nonnegative by another process doing a V operation.

  • The V operation (mnemonic revive) increments the count. If this changes the value from negative to nonnegative, one process that is waiting in a P operation is unblocked.

You can use a semaphore in place of a lock, to enforce serial use of resource. You initialize the semaphore to a value of 1. The P operation claims the semaphore, leaving it at 0 so that the next process to do P will be suspended. The V operation releases the semaphore.

You can also use a semaphore to control access to a pool that contains a countable number for resources. For example, say that a buffer pool contains n buffers. A process can proceed if there is at least 1 buffer available in the pool, but if there are no buffers, the process should sleep until at least 1 buffer is returned.

A semaphore, initialized to n, represents the population of the buffer pool. The pool itself might be implemented as a LIFO queue. The right to update the queue anchor (either to remove a buffer or to return one) is a separate resource that is guarded by a lock. The procedure for obtaining a buffer from the pool is as follows:

  1. Perform P on the pool semaphore. When the operation completes, you are assured there is at least one buffer in the pool; and you are also assured that the count representing the buffer you need has been decremented from the semaphore.

  2. Claim the lock that guards the buffer queue anchor. This ensures that there will be no conflict with another process taking or returning a buffer at the same time.

  3. Remove one buffer from the queue, updating the queue anchor. Step 1 assures that the queue is not empty.

  4. Release the lock on the queue anchor.

The procedure for returning a buffer to the pool is as follows:

  1. Claim the lock that guards the buffer queue anchor. This ensures that there will be no conflict with another process taking or returning a buffer at the same time.

  2. Put the returned buffer back on the queue, updating the queue anchor. The queue could be empty at this time.

  3. Release the lock on the queue anchor.

  4. Perform V on the pool semaphore. This announces that at least one additional buffer is now free, and may unblock some process waiting for a buffer.

The same two basic procedures work to allocate any collection of objects. For example, the semaphore could represent the number of open slots in a ring buffer, and the lock could stand for the right to update the ring buffer pointers. (A LIFO queue can be managed without a lock; see “Using Compare-and-Swap”.)

Semaphores created using POSIX functions, and semaphores created by the SVR4 IPC facility, can be used to coordinate IRIX processes or POSIX threads. Semaphores supported by the IRIX IPC facility can be used to coordinate IRIX processes only.

Condition Variables

A condition variable is a software object that represents the occurrence of an event. Typically the event is a software action such as “other thread has supplied needed data.”

Condition variable support is included in the POSIX pthreads library, and can be used only to coordinate among POSIX threads, not between IRIX processes. (See Chapter 13, “Thread-Level Parallelism” for information on the pthread library.)

A thread that wants to wait for an event claims the condition variable, which causes the thread to wait. The thread that recognizes the event signals the condition variable, releasing one or all threads that are waiting for the event.

In the expected mode of use, there is a shared resource that can be depleted. Access to the resource is represented by a mutex. A thread claims the mutex, but then finds that the shared resource is depleted or unready. This thread needs to do three things:

  1. Give up the mutex so that some other thread can renew the shared resource.

  2. Wait for the event that “resource is now ready for use.”

  3. Re-claim the mutex for the shared resource.

These three actions are combined into one action using a condition variable. When a thread claims a condition variable, it must pass a mutex that it owns. The claim releases the mutex, waits, and reclaims the mutex in one operation.

Barriers

Barriers provide a convenient way of synchronizing parallel processes on multiprocessor systems. To understand barriers, think of a time when you planned to go to lunch with other people at your workplace. The group agrees to meet in the lobby of the building. Some of your coworkers reach the lobby early, and others arrive later. One comes running in last, apologizing. When all of you have gathered and you know that everyone is ready, you all leave the building in a group.

A barrier is the software equivalent of the lobby where you waited. A group of processes are going to work on a problem. None should start until all the data has been initialized. However, starting each process is part of the initialization, and they cannot all be started at the same time. Each process must be created; each must join an arena and perhaps open a file; and you cannot predict when they will all be ready. To coordinate them, you create a barrier. Each process, when it is ready to start the main operation, calls barrier(), passing the address of the barrier and the number of processes that will meet. When that many processes have called barrier(), all of them are released to begin execution.

Barriers are part of IRIX IPC and require the use of a shared arena. Barriers cannot be used to coordinate POSIX threads.

POSIX Facilities for Mutual Exclusion

The POSIX real-time extensions (detailed in IEEE standard 1003.1b) include named and unnamed semaphores. The POSIX threads library (detailed in IEEE standard 1003.1c) introduces mutexes and condition variables.

Managing Unnamed Semaphores

An unnamed semaphore is a semaphore object that exists in memory only. An unnamed semaphore can be identified only by its memory address, so it can be shared only by processes or threads that share that memory location.

The functions for creating and freeing unnamed semaphores are summarized in Table 4-1.

Table 4-1. POSIX Functions to Manage Unnamed Semaphores

Function Name

Purpose and Operation

sem_init(3)

Initialize a semaphore object, setting its value and preparing it for use.

sem_destroy(3)

Make a semaphore unusable.

The type of a POSIX semaphore is sem_t, which is declared in the header file semaphore.h. You create an unnamed semaphore by allocating memory for a sem_t variable, either dynamically or statically, and initializing it with sem_init(). The function in Example 4-1 allocates and initializes an unnamed semaphore and returns its address. It returns NULL if there is a failure of either malloc() or sem_init().

Example 4-1. Dynamic Allocation of POSIX Unnamed Semaphore

sem_t * allocUnnSem(unsigned initVal)
{
   sem_t *psem = (sem_t*)malloc(sizeof(sem_t));
   if (psem) /* malloc worked */
   {
      if (sem_init(psem,0,initVal))
      {
         free(psem);
         psem = NULL;
      }
   }
   return psem;
}

The function in Example 4-1 passes the second argument of sem_init(), pshared, as 0, meaning the semaphore can only be used within the current process. A semaphore of this kind can be used to coordinate pthreads in a threaded program.

If you want to use a semaphore to coordinate between IRIX processes with separate address spaces, you must create the semaphore with a nonzero pshared, and place the semaphore in a memory segment that is shared among all processes. This feature is fully supported. However, you should specify pshared as 0 when possible, because nonshared semaphores have higher performance.

Managing Named Semaphores

A named semaphore is named in the filesystem, so it can be opened by any process (subject to access permissions), even when the process does not share address space with the creator of the semaphore. The functions used to create and manage named semaphores are summarized in Table 4-2.

Table 4-2. POSIX Functions to Manage Named Semaphores

Function Name

Purpose and Operation

sem_open(3)

Create or access a named semaphore, returning an address.

sem_close(3)

Give up access to a named semaphore, releasing a file descriptor.

sem_unlink(3)

Permanently remove a named semaphore.

The sem_open() function takes the following arguments:

name

Name of the semaphore in the form of a file pathname.

oflag

Either zero, or O_CREAT, or O_CREAT+O_EXCL.

mode

The access permissions to apply if the semaphore is created.

value

Initial value of the semaphore.


Creating a Named Semaphore

The POSIX standard leaves it to the implementation whether or not a named semaphore is represented by a disk file. The IRIX implementation does create a file to stand for each named semaphore (see “POSIX IPC Name Space”). The file that stands for a semaphore takes up no disk space other than the file node in a directory.

The oflag is used to handle the following cases:

  • Specify 0 to receive an error if the semaphore does not exist; that is, to require that the semaphore must exist.

  • Specify O_CREAT+O_EXCL to receive an error if the semaphore does exist; that is, to require that the semaphore not exist.

  • Specify O_CREAT to have the semaphore created if necessary.

When sem_open() creates a semaphore, it sets the file permissions specified by mode. These permissions control access to a semaphore by UID and GID, just as for a file. (See the open(2) and chmod(2) reference pages.)

When sem_open() creates a semaphore, it sets the initial value to value, or to 0 if value is not specified. Otherwise the value depends on the history of the semaphore since it was created. The value of a semaphore is not preserved over a reboot (the POSIX standard says it is not valid to depend on the value of a semaphore over a reboot).

A named semaphore is opened as a file, and takes up one entry in the file descriptor table for the process. There is no way to convert between the address of the sem_t and the file descriptor number, or vice versa. As a result, you cannot directly pass the semaphore to a function such as fcntl() or chmod().

Closing and Removing a Named Semaphore

When a process stops using a named semaphore, it can close the semaphore, releasing the associated file descriptor slot. This is done with sem_close(). The semaphore name persists in the filesystem, and as long as the system is up, the current semaphore value persists in a table in memory.

To permanently remove a semaphore, use sem_unlink().

Using Semaphores

POSIX named and unnamed semaphores can be used to coordinate the actions of IRIX processes and POSIX threads. They are the only mutual-exclusion objects that can be freely used to coordinate between threaded and unthreaded programs alike. (Message queues can be used between threaded and unthreaded programs also; see Chapter 6, “Message Queues.”)

The functions that operate on semaphores are summarized in Table 4-3.

Table 4-3. POSIX Functions to Operate on Semaphores

Function Name

Purpose and Operation

sem_getvalue(3)

Return a snapshot of the current value of a semaphore.

sem_post(3)

Perform the V operation, incrementing a semaphore and possibly unblocking a waiting process.

sem_trywait(3)

Perform the P operation only if the value of the semaphore is 1 or more.

sem_wait(3)

Perform the P operation, decrementing a semaphore and blocking if it becomes negative.

The abstract operation P is implemented as the sem_wait() function. Use this to decrement a semaphore's value and, if the result is negative, to suspend the calling function until the value is restored. The V operation is sem_post().

You can sample a semaphore's value using sem_getvalue(). The sem_trywait() operation is useful when a process or thread cannot tolerate being suspended.

Using Mutexes and Condition Variables

Two additional types of mutual exclusion are available only within a threaded program, to coordinate the actions of POSIX threads. The mutex is comparable to a lock or to a semaphore initialized to a count of 1. The condition variable provides a convenient way for a thread to give up ownership of a mutex, wait for something to happen, and then reclaim the mutex.

Both of these facilities are covered in detail in Chapter 13, “Thread-Level Parallelism.” See the headings “Mutexes” and “Condition Variables”.

IRIX Facilities for Mutual Exclusion

IRIX supports a wide selection of mutual-exclusion facilities, all tuned for use between processes that run concurrently in a multiprocessor.

Using IRIX Semaphores

Two kinds of semaphores are supported in IRIX IPC: normal and polled. Both are allocated in a shared memory arena (see “IRIX Shared Memory Arenas”).

Creating Normal Semaphores

The functions for managing normal semaphores are summarized in Table 4-4.

Table 4-4. IRIX Functions to Manage Nonpolled Semaphores

Function Name

Purpose and Operation

usnewsema(3P)

Allocate a semaphore in an arena and give it an initial value.

usfreesema(3P)

Release arena memory used by a semaphore (does not release any process waiting on the semaphore).

usinitsema(3P)

Reset a semaphore value and its metering information (does not release any process waiting on the semaphore).

usctlsema(3P)

Set and reset semaphore metering information and other attributes.

usdumpsema(3P)

Dump semaphore metering information to a file.

To allocate a new shared-arena semaphore and set its initial value, call usnewsema(). Use usctlsema() to enable recursive use of the semaphore and to enable the collection of metering information. You can use the metering information to find out whether a semaphore is a bottleneck or not.


Tip: When reading the reference pages cited above, notice that usnewsema() returns the address of a usema_t object, and all the other functions take the address of a usema_t. That is, usema_t represents the type of the semaphore object itself, and you refer to a semaphore by its address. This is different from locks, which are discussed later in this chapter.


Creating Polled Semaphores

A polled semaphore differs from a normal semaphore in the P operation. When decrementing the semaphore value produces a negative number, the calling process is not blocked. Instead, it receives a return code. The process then has to include the address of the semaphore in the list of events passed to poll() (see the poll(2) reference page). The V operation, applied to a polled semaphore, does not release a block process but rather causes a poll() operation to end.

You can use polled semaphores to integrate semaphore handling with other events for which you wait with poll(), such as file operations. You cannot combine the use of normal semaphores with the use of polled devices, since a single process cannot wait in a poll() call and in a uspsema() call at the same time. The functions for creating and controlling polled semaphores are summarized in Table 4-5.

Table 4-5. IRIX IPC Functions for Managing Polled Semaphores

Function Name

Purpose and Operation

usnewpollsema(3P)

Allocate a polled semaphore in an arena and give it an initial value.

usopenpollsema(3P)

Assign a file descriptor to a polled semaphore. The file descriptor can be passed to poll() or select(). This must be done before the semaphore can be used.

usclosepollsema(3P)

Release a file descriptor assigned with usopenpollsema().

usfreepollsema(3P)

Release arena memory used by a polled semaphore and invalidate any file descriptors assigned to it.


Operating on Semaphores

The functions for semaphore operations are summarized in Table 4-6.  

Table 4-6. IRIX IPC Functions for Semaphore Operations

Function Name

Purpose and Operation

uspsema(3P)

Perform the P operation on either type of semaphore.

usvsema(3P)

Perform the V operation on either type of semaphore.

ustestsema(3P)

Return the current (instantaneous) value of a semaphore.

uscpsema(3P)

Perform the P operation only if the resulting count will be nonnegative.

usinitsema(3P)

Reset a semaphore value and its metering information (does not release any process waiting on the semaphore).

usctlsema(3P)

Set and reset semaphore metering information and other attributes.

usdumpsema(3P)

Dump semaphore metering information to a file.

To perform the P operation on a semaphore of either type, use uspsema(). When the decremented semaphore value is nonnegative, the function returns 1. The action when the decremented count would be negative differs between the polled and normal semaphores:

  • When a normal semaphore count remains or becomes negative, the calling process is blocked; the function does not return until the count is nonnegative.

  • When a polled semaphore count remains or becomes negative, the function returns 0 and the calling process must use poll() to find out when it becomes nonnegative.

To perform the V operation on a semaphore of either type, call usvsema().

The uscpsema() function provides a conditional P operation: it performs a P operation on the semaphore only if it can do so without making the value negative. The ustestsema() function returns the current value of the semaphore—which of course is immediately out of date.

The usinitsema() function reinitializes the semaphore to a specified value. Note that if you reinitialize a semaphore on which processes are waiting, the processes continues to wait. You should reinitialize a semaphore only in unusual circumstances.

You can call usctlsema() to enable the keeping of either metering information—cumulative counts of usage—or a history trace. The metering information shows whether a semaphore is a bottleneck in the program's operations. The history trace can be used to analyze bugs.

Using Locks

IRIX locks are implemented differently depending on the hardware architecture of the computer using them. On a multiprocessor computer, locks are busy-wait locks, so the processor continually tries to acquire the lock until it succeeds. This implementation makes sense only on multiprocessor systems, where one processor can release the lock while another processor is “spinning,” trying to acquire the lock. On a uniprocessor, a process waiting to claim a lock is suspended until the lock is released by another process.

Creating and Managing Locks

The functions for creating and controlling locks are summarized in Table 4-7.

Table 4-7. IRIX IPC Functions for Managing Locks

Function Name

Purpose and Operation

usnewlock(3P)

Allocate a lock in a specified arena.

usfreelock(3P)

Release lock memory (does not release any process waiting on the lock).

usinitlock(3P)

Reset a lock and its metering information (does not release any process waiting on the lock).

usctllock(3P)

Fetch and reset semaphore metering information or debugging information.

usdumplock(3P)

Dump lock metering information to a file.

You decide whether the locks in an arena will have metering information or not. You specify this before creating the arena, to usconfig() (see “Initializing Arena Attributes”). When lock metering is enabled, you can retrieve the information about a lock at any time to find out whether a lock is a bottleneck in a program.

Claiming and Releasing Locks

The functions for using locks are summarized in Table 4-8.

Table 4-8. IRIX IPC Functions for Using Locks

Function Name

Purpose and Operation

ussetlock(3P)

Seize a lock, suspending the caller if necessary, until the lock is available.

usunsetlock(3P)

Release a lock, making it available for other processes.

uscsetlock(3P)

Seize a lock if it is available; otherwise return a 1.

uswsetlock(3P)

Seize a lock, suspending the caller if necessary; takes a specified number of spins as an argument.

ustestlock(3P)

Test a lock, returning 0 if it is instantaneously available and 1 if it is not available.



Tip: When reading the reference pages cited above, notice that usnewlock() returns a ulock_t object, which is simply a pointer. All the functions that operate on locks take a ulock_t object—not a pointer to a ulock_t. That is, the ulock_t type represents a handle or reference to a lock, not a lock itself. This differs from the treatment of semaphores, which is described under “Creating Normal Semaphores”.

On uniprocessors, none of the functions us[c,w]setlock() spin; if the lock is available they return immediately, and if it is not, they suspend the calling process and give up the CPU. On multiprocessors, ussettlock() spins for a default number of times before it suspends the process. The function uswsetlock() is the same, but you can specify the number of spins to take before suspending.

A process can call usunsetlock() on a lock that is either not locked or locked by another process. In either case, the lock is unlocked. “Double tripping”—calling a set-lock function twice with the same lock—is also permissible. The caller blocks until another process unsets the lock.

Using Barriers

The functions to manage and use barriers are summarized in Table 4-9.

Table 4-9. IRIX IPC Functions for Barriers

Function Name

Purpose and Operation

new_barrier(3P)

Allocate and initialize a barrier in a specified arena.

free_barrier(3P)

Release the storage associated with a barrier.

barrier(3P)

Wait at a barrier until a specified number of processes have gathered.

init_barrier(3P)

Reinitialize a barrier (does not release any processes waiting).

The main process uses new_barrier() to allocate a barrier in some arena. To use the barrier, each process calls barrier(), passing the number of processes that are supposed to meet before proceeding.


Note: The barrier() function assumes that it is used on a multiprocessor. It always passes time by spinning in an empty loop. When used on a uniprocessor (or when used on a multiprocessor with fewer available CPUs than barrier processes), a call to barrier(n) can be quite inefficient. The waiting functions spin until each in turn uses up its time-slice. In general it is not a good idea to use barrier() except in a multiprocessor with a number of CPUs approximately equal to the number of coordinating processes.


Using Test-and-Set Functions

The C library includes a family of functions that apply the MIPS instructions Load Linked and Store Conditional to modify memory words in a reliable way in a multiprocessor. These functions are detailed in the test_and_set(3) and uscas(3) reference pages. In addition, the MIPSpro C and C++ compilers, version 7.0 and after, contain built-in support for these operations.

Using Test-and-Set

All test-and-set functions solve a similar problem: how to update the contents of a memory word reliably from two or more CPUs concurrently. Use a test-and-set function to avoid the traditional “race” condition. For example, suppose that two or more processes could execute code to increment a variable, as in the C expression ++shared:

  • Process A loads shared into a register and adds 1 to it.

  • Process B loads shared into a register and adds 1 to it.

  • Process A stores the value in memory.

  • Process B stores the value in memory.

The result is to increment shared by 1 when it should be incremented by 2. However, if both processes use test_then_add(&shared,1) instead, they are assured that both increments will occur regardless of timing.

Using Compare-and-Swap

The test-and-set functions are not adequate to do race-free pointer manipulation; you need a compare-and-swap function for that. The C library includes the uscas() and uscas32() functions for this purpose. Use uscas() to work with pointer-sized values (which can be either 32 or 64 bits depending on compile options). Use uscas32() to work with words that should always be 32 bits in every program.

The compare-and-swap functions take four arguments:

destp

Address of the target memory field you want to update.

old

Expected current value of the memory field.

new

Desired new value, based on the expected old value.

u

Address of any IRIX shared memory arena.

The arena address u is not actually used by the functions. However, the functions cannot work until usinit() has been called at least once. Passing an arena address ensures that this has happened.

Use a compare-and-swap function in a loop like the following:

  1. Copy the current value of the target memory field.

  2. Calculate a new value based on that current value.

  3. Use compare-and-swap to install the new value, provided that the current value has not changed during step 2.

  4. If the compare failed so the swap was not done (uscas() returns 0), another process has changed the target: return to step 1 and repeat.

The code in Example 4-2 illustrates how this type of loop can be used to manage a simple LIFO queue.

Example 4-2. Using Compare-and-Swap on a LIFO Queue

#include <ulocks.h>
typedef struct item_s {
    struct item_s *next;
    /* ... other fields ... */
} item_t;
void push_item( item_t **lifo, item_t *new, usptr_t *u)
{
    item_t *old;
    do {
        new->next = old = *lifo;
    } while(0 == uscas(lifo,(ptrdiff_t)old,(ptrdiff_t)new,u));
}
item_t * pull_item( item_t **lifo, usptr_t *u)
{
    item_t *old, *new;  
    do {
        old = *lifo;
        if (!old) break;
        new = old->next;
    } while(0 == uscas(lifo,(ptrdiff_t)old,(ptrdiff_t)new,u));
    return old;
}
#include <stdio.h>
main()
{
    usptr_t *arena = usinit("/var/tmp/cas.arena");
    item_t *lifo = NULL;
    item_t t1, t2;
    item_t *p1, *p2;
    push_item(&lifo, &t1, arena);
    push_item(&lifo, &t2, arena);
    p2 = pull_item(&lifo, arena);
    p1 = pull_item(&lifo, arena);
    printf("%x == %x ?\n", &t1, p1);
    printf("%x == %x ?\n", &t2, p2);
}

In Example 4-2, the push_item() function pushes an item_t onto a LIFO queue, and pull_item() removes and returns the first item_t from a queue. Both use uscas() to update the queue anchor. The main() function contains a unit-test of the functions, first pushing two items, then pulling them off, finally displaying the addresses to verify that what was pushed, could be pulled.

Using Compiler Intrinsics for Test-and-Set

The MIPSpro C and C++ compilers version 7.0 introduce the intrinsic functions summarized in Table 4-10.

Table 4-10. Compiler Intrinsics for Atomic Operations

Intrinsic Prototype

Purpose

Barrier

__op_and_fetch(p,v...)

Atomically execute {*p op = v; *p;}. The op can be add, sub, or, and, xor, and nand.

Full

__fetch_and_op(p,v...)

Atomically execute {t = *p; *p op = v; t;}. The op can be add, sub, or, and, xor, and nand.

Full

__lock_test_and_set(p,v...)

Atomically execute {t = *p; *p = v; t;}.

Backward

__lock_release(p...)

Atomically execute {*p = 0;}.

Forward

__compare_and_swap(p,w,v...)

Atomically execute (w==*p) ?(*p=v, 1): 0.

Full

__synchronize(...)

Issue the MIPS-3 instruction sync to synchronize the cache with memory.

Full

Each of the compiler intrinsics except __synchronize() causes the compiler to generate inline code using Load Linked and Store Conditional to update memory predictably. In this respect they are similar to the library functions documented in the test_and_set(3) and uscas(3) reference pages. For example, the statement

__add_and_fetch(&shared,1);

is functionally equivalent to the library call

test_then_add(&shared,1);

The compiler intrinsic __compare_and_swap() is simpler to use than uscas() since you do not have to create a shared memory arena first, and avoids the overhead of a system call.

The compiler intrinsics are different from the library functions, and different from an assembly language subroutine you might write, in one important way. The optimizer phases of the compiler recognize these intrinsics as barriers to code motion. The “Barrier” column in Table 4-10 shows this effect. For example, the compiler cannot move code in either direction across s use of __compare_and_swap(). However, it can move code backward (but not forward) across __lock_test_and_set().

You can make the code motion barrier explicit or general. If you invoke __compare_and_swap() passing only the pointer and two value arguments, the compiler can move no code across that source line. Alternatively, you can list specific variables as additional arguments to __compare_and_swap() (this is why the functions are shown as having a variable number of arguments). When you do so, the compiler cannot move assignments to the named variables across this point, but can move assignments to other variables, if the optimizer needs to.

System V Facilities for Mutual Exclusion

The System V Release 4 (SVR4) semaphore facility lets you create persistent semaphores that can be used to coordinate any processes or threads. The SVR4 facility differs from POSIX named semaphores in two ways:

  • Each object is a set of from 1 to 25 independent semaphores, rather than a single semaphore. A process can operate on any selection of semaphores in a set in one system call.

  • You can use SVR4 semaphores in ways that IRIX and POSIX do not support: incrementing or decrementing by more than 1, and waiting for a zero value.

  • The name of a set is an integer in a kernel table, rather than a pathname in the filesystem (see “SVR4 IPC Name Space”).

The functions used to create and operate on semaphore sets are summarized in Table 4-11.

Table 4-11. SVR4 Semaphore Management Functions

Function Name

Purpose and Operation

semget(2)

Create a semaphore set, or return the ID of a semaphore set.

semctl(2)

Query or change semaphore values; query or change semaphore set attributes.

semop(2)

Perform operations on one or more semaphores in a set.

Semaphores are also discussed in the intro(2) reference page. You can display semaphore sets from the command line using ipcs, and remove them with ipcrm (see the ipcs(1) and ipcr(1) reference pages).

Creating or Finding a Semaphore Set

A process creates a semaphore set, or locates an existing set, using the semget() system function. The function creates a set only if the specified key is IPC_PRIVATE, or no set with that key exists, and the IPC_CREAT flag is used. When it creates a set, the arguments to the function establish

  • the numeric key of the set

  • the number of semaphores in the set, from 1 to 25

  • the access permissions to the set

In addition, the effective user ID and group ID of the calling process become the creator and owner identification of the new semaphore set. (See “Example Uses of semget()” for example code.)

When semget() locates an existing set, access is controlled by the access permissions of the set and by the user ID and group ID of the calling process.

The value returned by semget() is the ID number of the semaphore set. It is used to identify the segment to other functions.

Managing Semaphore Sets

The semctl() function gives you the ability to get information about a semaphore set, or to modify its attributes. These operations are summarized in Table 4-12.

Table 4-12. SVR4 Semaphore Set Management Operations

Keyword

Operation

Can Be Used By

IPC_STAT

Get information about the set.

Any process having read access.

IPC_SET

Set owner UID, owner GID, or access permissions.

Creator UID, owner UID, or superuser.

IPC_RMID

Remove the set from the IPC name space.

Creator UID, owner UID, or superuser.

GETALL

Copy current values of all semaphores to an array.

Any process having read access.

SETALL

Set current values of all semaphores from an array of integers.

Any process having write access.

Examples of some of these uses can be found under “Example Uses of semctl() for Management”.

In addition, semctl() allows you to query or set information about individual semaphores within the set, as summarized in Table 4-13.

Table 4-13. SVR4 Semaphore Management Operations

Keyword

Operation

Can Be Used By

GETVAL

Return value of one semaphore.

Any process having read access.

GETPID

Return process ID of the process that last operated on a semaphore.

Any process having read access.

GETNCNT

Return number of processes waiting for one semaphore to exceed zero

Any process having read access.

GETZCNT

Return number of processes waiting for one semaphore to equal zero.

Any process having read access.

SETVAL

Set current value of one semaphores.

Any process having write access.

Examples of some of these uses can be seen under “Example Uses of semctl() for Query”.


Caution: Some operations of the semctl() function use only three arguments, but some operations require a fourth argument (see reference page semctl(2) for details). When passing a fourth argument to semctl(), it is extremely important that you pass a union semun, as specified in the reference page. You might look at the contents of the union and think that, since all its fields are addresses, there is no effective difference between passing a union and passing a plain address of a buffer or array. However, if your program is compiled with the -n32 or -64 options, the alignment of the two kinds of arguments is different. Always pass an address as shown in the example programs in this chapter:


union semun arg4;
...
arg4.buffer = &ds_buffer;
semctl(a,b,c,arg4);

If your program passes only the address, as in

semctl(a,b,c,&ds_buffer);

the code will not work correctly when compiled -n32 or -64.

Using Semaphore Sets

You perform operations on the semaphores in a set by calling semop(). This function takes a semaphore set ID, and an array of one or more semaphore operation structures. Each of the operation structures specifies the following:

  • The index of a semaphore in the set, numbering the semaphores from 0

  • A number specifying one of three operations:

    • Zero, meaning to test the semaphore for equality to 0.

    • A positive number such as 1, meaning to increment the semaphore value, possibly releasing waiting processes or threads (the V operation).

    • A negative number such as -1, meaning to decrement the semaphore value when that can be done without making it negative (the P operation).

  • A flag word that can specify these flags:

    • IPC_NOWAIT, do not suspend but return an error if the Zero test fails or the P operation cannot be done.

    • SEM_UNDO, undo this operation if it succeeds but an operation later in the array should fail.

In the simplest case, you pass an array containing just one operation, to increment or decrement one semaphore by 1 (the traditional V or P operation). Used this way, a semaphore in a set is functionally the same as an IRIX or POSIX semaphore.

SVR4 semaphores permit additional operations not available with IRIX or POSIX semaphores. The negative or positive value in the operation structure is not required to be 1, so you can increment or decrement a semaphore by more than 1 in an operation. The wait-for-zero operation allows one process or thread to monitor the state of a semaphore, independent of the P and V operations performed on the semaphore by other processes or threads.

You can also perform a sequence of operations—a sequence of P, or V, or zero-wait operations, or a mix of operation types—on multiple semaphores in a single call. To do this, you specify an array containing more than one operation structure. The semop() function performs each operation in sequence.

You can use this feature, for example, to claim multiple resources, each represented by a different semaphore. Your array would specify the P operation on each of the semaphores in sequence. When semop() returns successfully, you own all the resources. A similar, multiple V operation returns all the resources at once.

The IPC_NOWAIT and SEM_UNDO flags are important when claiming multiple resources at once. Specify SEM_UNDO on all operations; and specify IPC_NOWAIT on all but the first one. If the second or later resource is unavailable, semop() restores all preceding claims and returns an error code. As long as all processes or threads operate on semaphores in the same order, this logic prevents deadlocks, and it avoids long, fruitless suspensions.

Example Programs

The programs in this section allow you to experiment with semaphore sets from the command line:

  • Example 4-3 on page 102 can be used to experiment with semget(), creating semaphore sets with different sizes and permissions.

  • Example 4-4 on page 104 can be used to test semctl() for displaying and changing owner IDs and permissions.

  • Example 4-5 on page 106 can be used to test semctl() for sampling the values of semaphores, or to display the state of a semaphore set.

  • Example 4-6 on page 108 can be used to test semop() for single or multiple operations.

Example Uses of semget()

The program in Example 4-3, semget, invokes semget() with arguments you specify on the command line:

-k key

Numeric key to identify the semaphore set, required; for example -k 99. Default is IPC_PRIVATE.

-p perms

Access permissions to apply to a created set; for example, -p 0664. Default is octal 0600.

-s setsize

Number of semaphores in a created set; for example -s 8. The limit is 25, but feel free to experiment with larger numbers to see the return code.

-c

Use IPC_CREAT. No set is created unless this is specified.

-x

Use IPC_EXCL. Use with -c to require that a set not exist.


Example 4-3. Program to Demonstrate semget()

/*
|| semget: program to test semget(2) for creating semaphores.
||      semget [-k <key>] [-c] [-x] [-p <perms>] [-s <setsize>]
||          -k <key>        the key to use, default == 0 == IPC_PRIVATE
||          -p <perms>      permissions to use, default is 0666
||          -s <setsize>    size to use, default is 1
||          -c              use IPC_CREAT
||          -x              use IPC_EXCL
*/
#include <unistd.h> /* for getopt() */
#include <sys/sem.h>    /* for shmget etc */
#include <errno.h>  /* errno and perror */
#include <stdio.h>
int main(int argc, char **argv)
{
    key_t key = IPC_PRIVATE;/* key */
    int nsems = 1;          /* setsize */
    int perms = 0600;       /* permissions */
    int semflg = 0;         /* flag values */
    struct semid_ds ds;     /* info struct */
    union semun arg4;       /* way to pass &ds properly aligned */
    int c, semid;
    while ( -1 != (c = getopt(argc,argv,"k:p:s:xc")) )
    {
        switch (c)
        {
        case 'k': /* key */
            key = (key_t) strtoul(optarg, NULL, 0);
            break;
        case 's': /* setsize */
            nsems = (int) strtoul(optarg, NULL, 0);
            break;
        case 'p': /* permissions */
            perms = (int) strtoul(optarg, NULL, 0);
            break;
        case 'c':
            semflg |= IPC_CREAT;
            break;
        case 'x':
            semflg |= IPC_EXCL;
            break;
        default: /* unknown or missing argument */
            return -1;      
        }
    }
    semid = semget(key,nsems,semflg+perms);
    if (-1 != semid)
    {
        printf("semid = %d\n",semid);
        arg4.buf = &ds;
        if (-1 != semctl(semid,0,IPC_STAT,arg4))
        {
            printf(
    "owner uid.gid: %d.%d  creator uid.gid: %d.%d  mode: 0%o nsems:%d\n",
            ds.sem_perm.uid,ds.sem_perm.gid,
            ds.sem_perm.cuid,ds.sem_perm.cgid,
            ds.sem_perm.mode, ds.sem_nsems);
        }
        else
            perror("semctl(IPC_STAT)");
    }
    else
        perror("semget()");
    return errno;
} 


Example Uses of semctl() for Management

The program in Example 4-4, semmod, allows you to call semctl() from the command line to display the size, permissions, and owner and creator IDs of a semaphore set, and to change the permissions and owner. It takes the following arguments on the command line:

-k key

Numeric key to identify the semaphore set; for example -k 99.

-i id

Semaphore ID number, alternative to specifying the key.

-p perms

Access permissions to apply to the selected set; for example, -p 0664.

-u uid

New user ID for the semaphore owner.

-g gid

New group ID for the semaphore owner.

If only the key or ID is given, the program only displays the state of the set. When you specify permissions, owner, or group, the program first queries the current information to initialize an information structure. Then it inserts the new items you specified, and calls semctl() with IPC_SET to change the information.

Example 4-4. Program to Demonstrate semctl() for Management

/*
|| semmod: program to test semctl(2) for status, ownership and permissions.
||      semmod {-k <key> | -i <semid>} [-p <perms>] [-u <user>] [-g <group>]
||          -k <key>            the key to use, or..
||          -i <semid>          ..the semid to use
||          -p <perms>          permissions to set with IPC_SET
||          -u <uid>            uid to set as owner with IPC_SET
||          -g <gid>            gid to set as owner with IPC_SET
*/
#include <unistd.h> /* for getopt() */
#include <sys/sem.h> /* for shmget etc */
#include <errno.h> /* errno and perror */
#include <stdio.h>
int main(int argc, char **argv)
{
    key_t key;              /* key */
    int semid = -1;         /* object ID */
    int perms, popt = 0;    /* perms to set, if given */
    int uid, uopt = 0;      /* uid to set, if given */
    int gid, gopt = 0;      /* gid to set, if given */
    int val, vopt = 0;      /* setall value if given */
    struct semid_ds ds;
    union semun arg4;       /* way to pass semctl 4th arg, properly aligned */
    int c;
    while ( -1 != (c = getopt(argc,argv,"k:i:p:u:g:")) )
    {
        switch (c)
        {
        case 'k': /* key */
            key = (key_t) strtoul(optarg, NULL, 0);
            break;
        case 'i': /* semid */
            semid = (int) strtoul(optarg, NULL, 0);
            break;
        case 'p': /* permissions */
            perms = (int) strtoul(optarg, NULL, 0);
            popt = 1;
            break;
        case 'u': /* uid */
            uid = (int) strtoul(optarg, NULL, 0);
            uopt = 1;
            break;
        case 'g': /* gid */
            gid = (int) strtoul(optarg, NULL, 0);
            gopt = 1;
            break;
        default: /* unknown or missing argument */
            return -1;      
        }
    }
    if (-1 == semid) /* -i not given, must have -k */
        semid = semget(key,0,0);
    if (-1 != semid)
    {
        arg4.buf = &ds;
        if (0 == semctl(semid,0,IPC_STAT,arg4))
        {
            if ((popt)||(uopt)||(gopt))
            {
                if (popt) ds.sem_perm.mode = perms;
                if (uopt) ds.sem_perm.uid = uid;
                if (gopt) ds.sem_perm.gid = gid;
                if (0 == semctl(semid,0,IPC_SET,arg4) )
                    semctl(semid,0,IPC_STAT,arg4); /* refresh info */
                else
                    perror("semctl(IPC_SET)");
            }
            printf(
    "owner uid.gid: %d.%d  creator uid.gid: %d.%d  mode: 0%o nsems:%d\n",
            ds.sem_perm.uid,ds.sem_perm.gid,
            ds.sem_perm.cuid,ds.sem_perm.cgid,
            ds.sem_perm.mode, ds.sem_nsems);
        }
        else
            perror("semctl(IPC_STAT)");
    }
    else
        perror("semget()");
}


Example Uses of semctl() for Query

The program in Example 4-5, semsnap, displays a snapshot of the current values of all semaphores in a set you specify. The value of each semaphore is displayed in the first row (GETVAL), followed by the count of processes waiting in a P operation (GETNCNT) and the count of processes waiting for zero (GETZCNT). The arguments are as follows:

-k key

Numeric key to identify the semaphore set; for example -k 99.

-i id

Semaphore ID number, alternative to specifying the key.


Example 4-5. Program to Demonstrate semctl() for Sampling

/*
|| semsnap: program to test semctl(2) for semaphore status commands
||      semsnap {-k <key> | -i <semid>}
||          -k <key>            the key to use, or..
||          -i <semid>      ..the semid to use
*/
#include <unistd.h> /* for getopt() */
#include <sys/sem.h>    /* for shmget etc */
#include <errno.h>  /* errno and perror */
#include <stdio.h>
int main(int argc, char **argv)
{
    key_t key;                  /* key */
    int semid = -1;         /* object ID */
    int nsems, j;               /* setsize, and loop variable */
    ushort_t semvals[25];   /* snapshot of values */
    ushort_t semns[25];     /* snapshot of P-waiting */
    ushort_t semzs[25];     /* snapshot of zero-waiting */  
    struct semid_ds ds;
    union semun arg4;       /* semctl 4th argument, properly aligned */
    int c;
    while ( -1 != (c = getopt(argc,argv,"k:i:")) )
    {
        switch (c)
        {
        case 'k': /* key */
            key = (key_t) strtoul(optarg, NULL, 0);
            break;
        case 'i': /* semid */
            semid = (int) strtoul(optarg, NULL, 0);
            break;
        default: /* unknown or missing argument */
            return -1;      
        }
    }
    if (-1 == semid) /* -i not given, must have -k */
        semid = semget(key,0,0);
    if (-1 != semid)
    {
        arg4.buf = &ds;
        if (0 == semctl(semid,0,IPC_STAT,arg4))
        {
            nsems = ds.sem_nsems;
            arg4.array = semvals;
            semctl(semid,0,GETALL,arg4);
            for (j=0; j<nsems; ++j)
            {
                semns[j] = semctl(semid,j,GETNCNT);
                semzs[j] = semctl(semid,j,GETZCNT);
            }
            printf("vals:");
            for (j=0; j<nsems; ++j) printf(" %2d",semvals[j]);
            printf("\nncnt:");
            for (j=0; j<nsems; ++j) printf(" %2d",semns[j]);
            printf("\nzcnt:");
            for (j=0; j<nsems; ++j) printf(" %2d",semzs[j]);
            putc('\n',stdout);
        }
        else
            perror("semctl(IPC_STAT)");
    }
    else
        perror("semget()");
}


Example Uses of semop()

The program in Example 4-6, semop, performs one or more semaphore operations on a set you specify. You can use it to specify any sequence of operations (including nonsensical sequences) from the command line. The command arguments are:

-k key

Numeric key to identify the semaphore set; for example -k 99.

-i id

Semaphore ID number, alternative to specifying the key.

-n

Apply IPC_NOWAIT to all following operations.

-u

Apply SEM_UNDO to all following operations.

-p sem

Apply the P (decrement by 1) operation to sem; for example, -p 1.

-v sem

Apply the V (increment by 1) operation to sem; for example, -v 1.

-z sem

Wait for sem to contain 0; for example, -z 4.

You can give a sequence of operations. For example, consider the following sequence:

  1. Wait for zero in semaphore 4.

  2. Increment semaphore 0, with undo if a following operation fails.

  3. Decrement semaphore 2, not waiting and with undo.

  4. Decrement semaphore 3, not waiting and with undo.

The sequence above can be specified as follows:

semop -k 0x101 -z 4 -u -v 0 -n -p 2 -p 3

The program does not support incrementing or decrementing by other than 1, and there is no way to turn off IPC_NOWAIT or SEM_UNDO once it is on.

Example 4-6. Program to Demonstrate semop()

/*
|| semop: program to test semop(2) for all functions.
||      semop {-k <key> | -i <semid>} [-n] [-u] {-p <n> | -v <n> | -z <n>}...
||          -k <key>        the key to use, or..
||          -i <semid>      ..the semid to use
||          -n              use the IPC_NOWAIT flag on following ops
||          -u              use the SEM_UNDO flag on following ops
||          -p <n>          do the P operation (+1) on semaphore <n>
||          -v <n>          do the V operation (-1) on semaphore <n>
||          -z <n>          wait for <n> to become zero
*/
#include <unistd.h>     /* for getopt() */
#include <sys/sem.h>    /* for shmget etc */
#include <errno.h>      /* errno and perror */
#include <stdio.h>
int main(int argc, char **argv)
{
    key_t key;                  /* key */
    int semid = -1;             /* object ID */
    int nsops = 0;              /* setsize, and loop variable */
    short flg = 0;              /* flag to use on all ops */
    struct semid_ds ds;
    int c, s;
    struct sembuf sops[25];
    while ( -1 != (c = getopt(argc,argv,"k:i:p:v:z:nu")) )
    {
        switch (c)
        {
        case 'k': /* key */
            key = (key_t) strtoul(optarg, NULL, 0);
            break;
        case 'i': /* semid */
            semid = (int) strtoul(optarg, NULL, 0);
            break;
        case 'n': /* use nowait */
            flg |= IPC_NOWAIT;
            break;
        case 'u': /* use undo */
            flg |= SEM_UNDO;
            break;
        case 'p': /* do the P() */
            sops[nsops].sem_num = (ushort_t) strtoul(optarg, NULL, 0);
            sops[nsops].sem_op = -1;
            sops[nsops++].sem_flg = flg;
            break;
        case 'v': /* do the V() */
            sops[nsops].sem_num = (ushort_t) strtoul(optarg, NULL, 0);
            sops[nsops].sem_op = +1;
            sops[nsops++].sem_flg = flg;
            break;
        case 'z': /* do the wait-for-zero */
            sops[nsops].sem_num = (ushort_t) strtoul(optarg, NULL, 0);
            sops[nsops].sem_op = 0;
            sops[nsops++].sem_flg = flg;
            break;
        default: /* unknown or missing argument */
            return -1;      
        }
    }
    if (-1 == semid) /* -i not given, must have -k */
        semid = semget(key,0,0);
    if (-1 != semid)
    {
        if (0 != semop(semid,sops,nsops) )
            perror("semop()");
    }
    else
        perror("semget()");
}


Using the Examples

The following commands demonstrate the use of the example programs. First, a semaphore set is created by semget and its existence verified with ipcs:

$ ipcs -s
IPC status from /dev/kmem as of Wed Jun 19 11:19:37 1996
T     ID     KEY        MODE       OWNER    GROUP
Semaphores:
$ semget -k 0xfab -c -x -p 0666 -s 4
semid = 130
owner uid.gid: 1110.20  creator uid.gid: 1110.20  mode: 0100666 nsems:4
$ ipcs -s
IPC status from /dev/kmem as of Wed Jun 19 11:19:59 1996
T     ID     KEY        MODE       OWNER    GROUP
Semaphores:
s    130 0x00000fab --ra-ra-ra-  cortesi     user

The effect of the IPC_EXCL flag is tested:

$ semget -k 0xfab -c -x
semget(): File exists

The permissions are changed using semmod:

$ semmod -i 130 -p 0640
owner uid.gid: 1110.20  creator uid.gid: 1110.20  mode: 0100640 nsems:4
$ ipcs -s
IPC status from /dev/kmem as of Wed Jun 19 11:20:09 1996
T     ID     KEY        MODE       OWNER    GROUP
Semaphores:
s    130 0x00000fab --ra-r-----  cortesi     user

The present state of the four semaphores in the set is displayed, then semop is used to increment the first two.

$ semsnap -i 130
vals:  0  0  0  0
ncnt:  0  0  0  0
zcnt:  0  0  0  0
$ semop -i 130 -v 0 -v 1
$ semsnap -i 130
vals:  1  1  0  0
ncnt:  0  0  0  0
zcnt:  0  0  0  0

One instance of semop is started in the background to wait on a sequence of operations. The semsnap display verifies that one process is waiting on zero in semaphore 0:

$ semop -i 130 -z 0 -p 1 -p 2 &
9956
$ semsnap -i 130
vals:  1  1  0  0
ncnt:  0  0  0  0
zcnt:  1  0  0  0

Semaphore 0 is decremented, and semsnap reveals that there is no longer a process waiting for zero in that semaphore, but that now a process is waiting for semaphore 2 to be incremented:

$ semop -i 130 -p 0
$ semsnap -i 130
vals:  0  1  0  0
ncnt:  0  0  1  0
zcnt:  0  0  0  0

Semaphore 2 is incremented and now there are no processes waiting:

$ semop -i 130 -v 2
$ semsnap -i 130
vals:  0  0  0  0
ncnt:  0  0  0  0
zcnt:  0  0  0  0

Another process is put in the background waiting on semaphore 0. Then the semaphore set is removed with ipcrm. The waiting instance of semop ends, displaying the error code from semop():

$ semop -i 130 -p 0 &
9962
$ ipcrm -s 130
$ semop(): Identifier removed