Chapter 4. OpenMP Fortran API Multiprocessing Directives

This chapter provides an overview of the supported multiprocessing directives. These directives are based on the OpenMP Fortran application program interface (API) standard. Programs that use these directives are portable and can be compiled by other compilers that support the OpenMP standard.

The complete OpenMP standard is available at http://www.openmp.org/specs. See that documentation for complete examples, rules of usage, and restrictions. This chapter provides only an overview of the supported directives and does not give complete details about usage or restrictions.

To enable recognition of the OpenMP directives, specify -mp on the f90(1) command line. The -mp option must be specified in order for the compiler to honor any -MP:... options that may also be specified on the command line. The -MP:open_mp=ON option is on by default and must be in effect during compilation.

The following example command line compiles the program ompprg.f, which contains OpenMP Fortran API directives:

f90 -mp ompprg.f

In addition to directives, the OpenMP Fortran API describes several library routines and environment variables. Information on these other utilities can be found in the following locations:

Programming Utility         Information Location

Command line information    For information on the -mp option and the -MP: option, see the f90 man page.

Library routines            omp_lock(3), omp_nested(3), and omp_threads(3) man pages

Environment variables       pe_environ(5) man page


Note: If individual loops in your program contain both OpenMP directives and extensions (prefixed with !$OMP or !$SGI) and any of the outmoded multiprocessing directives (prefixed with !$ or !$PAR), you must specify the set of directives that the compiler should use. To direct the compiler to ignore the OpenMP directives, compile with -MP:open_mp=OFF. To direct the compiler to ignore the outmoded multiprocessing directives, compile with -MP:old_mp=OFF. To direct the compiler to ignore the outmoded distributed shared memory directives, specify -MP:dsm=OFF.



Note: The SGI multiprocessing directives, including the Origin series distributed shared memory directives, are outmoded. Their preferred alternatives are the OpenMP Fortran API directives described in this chapter.


Using Directives

All multiprocessing directives are case-insensitive and are of the following form:

prefix directive [clause[[,] clause]...]

prefix

Each directive begins with a prefix, and the prefixes you can use depend on your source form, as follows:

  • If you are using fixed source form, the following prefixes can be used: !$OMP, C$OMP, or *$OMP.

    Prefixes must start in column one and appear as a single word with no intervening white space. Fortran fixed form line length, case sensitivity, white space, continuation, and column rules apply to the directive line.

  • If you are using free source form, the following prefix can be used: !$OMP.

    A prefix can appear in any column as long as it is preceded only by white space. It must appear as a single word with no intervening white space. Fortran free form line length, case sensitivity, white space, and continuation rules apply to the directive line.

directive

The name of the directive.

clause

One or more directive clauses. Clauses can appear in any order after the directive name and can be repeated as needed, subject to the restrictions listed in the description of each clause.

Directives cannot be embedded within continued statements, and statements cannot be embedded within directives.

In fixed source form, initial directive lines must have a space or zero in column six, and continuation directive lines must have a character other than a space or a zero in column six.

In free source form, initial directive lines must have a space after the prefix. Continued directive lines must have an ampersand as the last nonblank character on the line. Continuation directive lines can have an ampersand after the directive prefix with optional white space before and after the ampersand.

Example 4-1. OpenMP fixed source form

The following formats for specifying directives are equivalent (the first line represents the position of the first 9 columns):

C23456789
!$OMP PARALLEL DO SHARED(A,B,C)

C$OMP PARALLEL DO
C$OMP+SHARED(A,B,C)

C$OMP PARALLELDOSHARED(A,B,C)


Example 4-2. OpenMP free source form

The following formats for specifying directives are equivalent (the first line represents the position of the first 9 columns):

!23456789
       !$OMP PARALLEL DO &
                 !$OMP SHARED(A,B,C)

!$OMP PARALLEL &
       !$OMP&DO SHARED(A,B,C)

      !$OMP PARALLEL DO SHARED(A,B,C)

One or more blanks or tabs must be used to separate adjacent keywords in directives in free source form, except in the following cases where white space is optional between the keywords:

END CRITICAL
END DO
END MASTER
END ORDERED
END PARALLEL
END SECTIONS
END SINGLE
END WORKSHARE
PARALLEL DO
PARALLEL SECTIONS
PARALLEL WORKSHARE



Note: In order to simplify the presentation, the remainder of this chapter uses the !$OMP prefix in all syntax descriptions and examples.

Comments can appear on the same line as a directive. In free source form, an exclamation point initiates a comment; in fixed source form, it initiates a comment only when it appears after column 6. Regardless of form, the comment extends to the end of the source line and is ignored. If the first nonblank character after the initial prefix (or after a continuation directive line in fixed source form) is an exclamation point, the line is ignored.

Conditional Compilation

Fortran statements can be compiled conditionally as long as they are preceded by one of the following conditional compilation prefixes: !$, C$, or *$. The prefix must be followed by a Fortran statement on the same line. During compilation, the prefix is replaced by two spaces, and the rest of the line is treated as a normal Fortran statement.

Your program must be compiled with the -mp option in order for the compiler to honor statements preceded by conditional compilation prefixes; without the -mp command line option, statements preceded by conditional compilation prefixes are treated as comments.

The _OPENMP symbol can also be used for conditional compilation. You do not define it yourself; it is defined during OpenMP compilation to have the decimal value YYYYMM, where YYYY and MM are the year and month designators of the version of the OpenMP Fortran API that is supported.

The !$ prefix is accepted when compiling either fixed source form files or free source form files. The C$ and *$ prefixes are accepted only when compiling fixed source form. The source form you are using also dictates the following:

  • In fixed source form, the prefixes must start in column one and appear as a single word with no intervening white space. Fortran fixed form line length, case sensitivity, white space, continuation, and column rules apply to the line. Initial lines must have a space or zero in column six, and continuation lines must have a character other than a space or zero in column six.

  • In free source form, the !$ prefix can appear in any column as long as it is preceded only by white space. It must appear as a single word with no intervening white space. Fortran free source form line length, case sensitivity, white space, and continuation rules apply to the line. Initial lines must have a space after the prefix. Continued lines must have an ampersand as the last nonblank character on the line prior to any comment appearing on the conditionally compiled line. Continuation lines can have an ampersand after the prefix, with optional white space before and after the ampersand.
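The following free source form sketch illustrates the !$ conditional compilation prefix. The program name and variable names are illustrative only. It assumes that the OpenMP library routine omp_get_max_threads is available when -mp is in effect; without -mp, both !$ lines are treated as comments and the serial default is kept.

```fortran
program cond_compile
  implicit none
  integer :: nthreads
  ! The next line is compiled only when -mp is in effect;
  ! otherwise it is an ordinary comment.
!$ integer, external :: omp_get_max_threads

  ! Serial default, used when the program is compiled without -mp.
  nthreads = 1
  ! Under -mp the prefix is replaced by two spaces and the
  ! assignment below becomes a normal Fortran statement.
!$ nthreads = omp_get_max_threads()

  print *, 'threads available: ', nthreads
end program cond_compile
```

Because the OpenMP-specific lines are hidden behind the prefix, the same source file should build with or without -mp.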

Parallel Region Constructs

The PARALLEL and END PARALLEL directives define a parallel region. A parallel region is a block of code that is to be executed by multiple threads in parallel. This is the fundamental OpenMP parallel construct that starts parallel execution.

The END PARALLEL directive denotes the end of the parallel region. There is an implied barrier at this point. Only the master thread of the team continues execution past the end of a parallel region.
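A minimal sketch of a parallel region follows. The program and variable names are hypothetical, and the example assumes the OpenMP library routine omp_get_thread_num, which is guarded by the !$ prefix so that the source also compiles serially.

```fortran
program par_region
  implicit none
  integer :: tid
!$ integer, external :: omp_get_thread_num

  ! Every thread in the team executes the enclosed block.
  ! tid is PRIVATE so that each thread has its own copy.
!$OMP PARALLEL PRIVATE(tid)
  tid = 0
!$ tid = omp_get_thread_num()
  print *, 'hello from thread ', tid
!$OMP END PARALLEL

  ! Implied barrier above; only the master thread continues here.
  print *, 'parallel region finished'
end program par_region
```

The order in which the threads print their greeting is not defined and may vary from run to run.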

Work-sharing Constructs

A work-sharing construct divides the execution of the enclosed code region among the members of the team that encounter it. A work-sharing construct must be enclosed within a parallel region in order for the directive to execute in parallel. When a work-sharing construct is not enclosed dynamically within a parallel region, it is treated as though the thread that encounters it were a team of size one. The work-sharing directives do not launch new threads, and there is no implied barrier on entry to a work-sharing construct.

The following restrictions apply to the work-sharing directives:

  • Work-sharing constructs and BARRIER directives must be encountered by all threads in a team or by none at all.

  • Work-sharing constructs and BARRIER directives must be encountered in the same order by all threads in a team.

If NOWAIT is specified on the END DO, END SECTIONS, END SINGLE, or END WORKSHARE directive, an implementation may omit any code to synchronize the threads at the end of the work-sharing construct. In this case, threads that finish early may proceed straight to the instructions following the work-sharing construct without waiting for the other members of the team to finish the work-sharing construct.

The following list summarizes the work-sharing constructs:

  • The DO directive specifies that the iterations of the immediately following DO loop must be divided among the threads in the parallel region. If there is no enclosing parallel region, the DO loop is executed serially.

    The loop that follows a DO directive cannot be a DO WHILE or a DO loop without loop control. If an END DO directive is not specified, it is assumed at the end of the DO loop.

  • The SECTIONS directive specifies that the enclosed sections of code are to be divided among threads in the team. It is a noniterative work-sharing construct. Each section is executed once by a thread in the team.

    Each section must be preceded by a SECTION directive, though the SECTION directive is optional for the first section. The SECTION directives must appear within the lexical extent of the SECTIONS/END SECTIONS directive pair. The last section ends at the END SECTIONS directive. Threads that complete execution of their sections wait at a barrier at the END SECTIONS directive unless a NOWAIT is specified.

  • The SINGLE directive specifies that the enclosed code is to be executed by only one thread in the team. Threads in the team that are not executing the SINGLE directive wait at the END SINGLE directive unless NOWAIT is specified.

  • The WORKSHARE directive divides the work of executing the enclosed code into separate units of work, and causes the threads of the team to share the work of executing the enclosed code such that each unit is executed only once. The units of work may be assigned to threads in any manner as long as each unit is executed exactly once.
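The four work-sharing constructs summarized above can be sketched inside a single parallel region as follows. The array names, sizes, and arithmetic are illustrative assumptions and are not taken from the OpenMP standard.

```fortran
program workshare_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  real :: a(n), b(n), c(n)

  b = 1.0
  c = 2.0

!$OMP PARALLEL SHARED(a, b, c) PRIVATE(i)

  ! DO: the loop iterations are divided among the threads.
!$OMP DO
  do i = 1, n
     a(i) = b(i) + c(i)
  end do
!$OMP END DO

  ! SECTIONS: each section is executed once by one thread.
!$OMP SECTIONS
!$OMP SECTION
  b(1) = 2.0 * a(1)
!$OMP SECTION
  c(1) = 3.0 * a(n)
!$OMP END SECTIONS

  ! SINGLE: exactly one thread executes the enclosed code;
  ! the others wait at END SINGLE.
!$OMP SINGLE
  print *, 'first phase complete'
!$OMP END SINGLE

  ! WORKSHARE: the array assignment is split into units of work,
  ! each of which is executed exactly once.
!$OMP WORKSHARE
  a = a + b + c
!$OMP END WORKSHARE

!$OMP END PARALLEL

  print *, 'sum of a = ', sum(a)
end program workshare_demo
```

No NOWAIT is specified here, so each construct ends with a barrier and later constructs can safely read data written by earlier ones.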

Combined Parallel Work-sharing Constructs

The combined parallel work-sharing constructs are shortcuts for specifying a parallel region that contains only one work-sharing construct. The semantics of these directives are identical to those of explicitly specifying a PARALLEL directive followed by a single work-sharing construct.

The following list describes the combined parallel work-sharing directives:

  • The PARALLEL DO directive provides a shortcut form for specifying a parallel region that contains a single DO directive.

    If the END PARALLEL DO directive is not specified, the PARALLEL DO is assumed to end with the DO loop that immediately follows the PARALLEL DO directive. If used, the END PARALLEL DO directive must appear immediately after the end of the DO loop.

    The semantics are identical to explicitly specifying a PARALLEL directive immediately followed by a DO directive.

  • The PARALLEL SECTIONS/END PARALLEL SECTIONS directives provide a shortcut form for specifying a parallel region that contains a single SECTIONS directive. The semantics are identical to explicitly specifying a PARALLEL directive immediately followed by a SECTIONS directive.

  • The PARALLEL WORKSHARE/END PARALLEL WORKSHARE directives provide a shortcut form for specifying a parallel region that contains a single WORKSHARE directive. The semantics are identical to explicitly specifying a PARALLEL directive immediately followed by a WORKSHARE directive.
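As an illustration of the equivalence described above, the following sketch writes a loop first with the combined PARALLEL DO form and then with the explicit PARALLEL and DO directives, and finishes with a PARALLEL WORKSHARE. The arrays and the arithmetic are hypothetical.

```fortran
program combined_demo
  implicit none
  integer, parameter :: n = 100
  integer :: i
  real :: x(n), y(n)

  x = 1.0

  ! Combined form: one directive opens the region and shares the loop.
!$OMP PARALLEL DO SHARED(x, y) PRIVATE(i)
  do i = 1, n
     y(i) = 2.0 * x(i)
  end do
!$OMP END PARALLEL DO

  ! Equivalent explicit form: PARALLEL immediately followed by DO.
!$OMP PARALLEL SHARED(x, y) PRIVATE(i)
!$OMP DO
  do i = 1, n
     y(i) = y(i) + x(i)
  end do
!$OMP END DO
!$OMP END PARALLEL

  ! Combined form for array syntax.
!$OMP PARALLEL WORKSHARE SHARED(x, y)
  y = y * x
!$OMP END PARALLEL WORKSHARE

  print *, y(1), y(n)
end program combined_demo
```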

Synchronization Constructs

The following list describes the synchronization constructs:

  • The code enclosed within MASTER and END MASTER directives is executed only by the master thread of the team.

  • The CRITICAL and END CRITICAL directives restrict access to the enclosed code to one thread at a time.

    A thread waits at the beginning of a critical section until no other thread is executing a critical section with the same name. All unnamed CRITICAL directives map to the same name. Critical section names are global entities of the program. If a name conflicts with any other entity, the behavior of the program is unspecified.

  • The BARRIER directive synchronizes all the threads in a team. When it encounters a barrier, a thread waits until all other threads in that team have reached the same point.

  • The ATOMIC directive ensures that a specific memory location is updated atomically, rather than exposing it to the possibility of multiple, simultaneous writing threads.

  • The FLUSH directive identifies synchronization points at which thread-visible variables are written back to memory. This directive must appear at the precise point in the code at which the synchronization is required.

    Thread-visible variables include the following data items:

    • Globally visible variables (common blocks and modules)

    • Variables visible through host association

    • Variables that appear in an EQUIVALENCE statement with a thread-visible variable

    • Local variables that have had their address taken and saved or have had their address passed to another subprogram

    • Local variables that do not have the SAVE attribute that are declared shared in the enclosing parallel region

    • Dummy arguments

    • All pointer dereferences

  • The code enclosed within ORDERED and END ORDERED directives is executed in the order in which it would be executed in a sequential execution of an enclosing parallel loop.

    An ORDERED directive can appear only in the dynamic extent of a DO or PARALLEL DO directive. This DO directive must have the ORDERED clause specified. For information on directive binding, see “Directive Binding”.

    Only one thread is allowed in an ordered section at a time. Threads are allowed to enter in the order of the loop iterations. No thread can enter an ordered section until it is guaranteed that all previous iterations have completed or will never execute an ordered section. This sequentializes and orders code within ordered sections while allowing code outside the section to run in parallel. ORDERED sections that bind to different DO directives are independent of each other.
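The following sketch combines several of the synchronization constructs described above. All variable names, the critical section name update_max, and the test data are illustrative assumptions.

```fortran
program sync_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i, hits
  real :: biggest, v(n)

  ! Hypothetical test data in the range 0 to 100.
  do i = 1, n
     v(i) = real(mod(i * 37, 101))
  end do
  hits = 0
  biggest = v(1)

!$OMP PARALLEL SHARED(v, hits, biggest) PRIVATE(i)
!$OMP DO
  do i = 1, n
     if (v(i) > 50.0) then
        ! ATOMIC protects the single update statement that follows.
!$OMP ATOMIC
        hits = hits + 1
     end if
     ! Only one thread at a time executes this named critical section.
!$OMP CRITICAL (update_max)
     if (v(i) > biggest) biggest = v(i)
!$OMP END CRITICAL (update_max)
  end do
!$OMP END DO NOWAIT

  ! NOWAIT removed the implied barrier, so synchronize explicitly
  ! before the results are read.
!$OMP BARRIER

  ! Only the master thread prints the results.
!$OMP MASTER
  print *, 'hits = ', hits, ' biggest = ', biggest
!$OMP END MASTER
!$OMP END PARALLEL

  ! The ORDERED clause is required on the DO that encloses an
  ! ORDERED section; the section runs in sequential loop order.
!$OMP PARALLEL DO ORDERED PRIVATE(i)
  do i = 1, 5
!$OMP ORDERED
     print *, 'iteration ', i
!$OMP END ORDERED
  end do
!$OMP END PARALLEL DO
end program sync_demo
```

The BARRIER appears outside the DO construct and outside the MASTER section, in keeping with the nesting rules for these directives.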

Data Environment Constructs

The THREADPRIVATE directive makes named common blocks and named variables private to a thread but global within the thread.
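A minimal sketch of THREADPRIVATE applied to a named common block follows. The common block name /work/ and the variable names are hypothetical. The example also uses the COPYIN clause, which initializes each thread's copy from the master thread's copy, and assumes the OpenMP library routine omp_get_thread_num is available under -mp.

```fortran
program tp_demo
  implicit none
  integer :: counter, tid
!$ integer, external :: omp_get_thread_num
  common /work/ counter
  ! Each thread gets its own copy of the common block /work/.
!$OMP THREADPRIVATE(/work/)

  counter = 10

  ! COPYIN copies the master thread's value of counter into
  ! every thread's private copy on entry to the region.
!$OMP PARALLEL COPYIN(/work/) PRIVATE(tid)
  tid = 0
!$ tid = omp_get_thread_num()
  ! Each thread updates only its own copy; no synchronization needed.
  counter = counter + tid
  print *, 'thread ', tid, ' counter = ', counter
!$OMP END PARALLEL

  print *, 'master copy after the region: ', counter
end program tp_demo
```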

Data Scope Attribute Clauses

In addition to the THREADPRIVATE directive, several directives accept clauses that allow a user to control the scope attributes of variables for the duration of the construct. Not all of the clauses are allowed on all directives; usually, if no data scope clauses are specified for a directive, the default scope for variables affected by the directive is SHARED.

The following list describes the data scope attribute clauses:

  • The PRIVATE clause declares variables to be private to each thread in a team.

  • The SHARED clause makes variables shared among all the threads in a team. All threads within a team access the same storage area for SHARED data.

  • The DEFAULT clause allows the user to specify a PRIVATE, SHARED, or NONE default scope attribute for all variables in the lexical extent of any parallel region. Variables in THREADPRIVATE common blocks are not affected by this clause.

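  • The FIRSTPRIVATE clause provides a superset of the functionality provided by the PRIVATE clause. In addition to making each variable private, it initializes each private copy with the value the original object had before the construct.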

  • The LASTPRIVATE clause provides a superset of the functionality provided by the PRIVATE clause.

    When the LASTPRIVATE clause appears on a DO directive, the thread that executes the sequentially last iteration updates the version of the object it had before the construct. When the LASTPRIVATE clause appears in a SECTIONS directive, the thread that executes the lexically last SECTION updates the version of the object it had before the construct. Subobjects that are not assigned a value by the last iteration of the DO or the lexically last SECTION of the SECTIONS directive are undefined after the construct.

  • The REDUCTION clause performs a reduction on the variables specified, with the operator or the intrinsic specified.

    At the end of the REDUCTION, the shared variable is updated to reflect the result of combining the original value of the (shared) reduction variable with the final value of each of the private copies using the operator specified. The reduction operators are all associative (except for subtraction), and the compiler can freely reassociate the computation of the final value (the partial results of a subtraction reduction are added to form the final value).

    The value of the shared variable becomes undefined when the first thread reaches the containing clause, and it remains so until the reduction computation is complete. Normally, the computation is complete at the end of the REDUCTION construct; however, if the REDUCTION clause is used on a construct to which NOWAIT is also applied, the shared variable remains undefined until a barrier synchronization has been performed to ensure that all the threads have completed the REDUCTION clause.

  • The COPYIN clause applies only to common blocks that are declared THREADPRIVATE. A COPYIN clause on a parallel region specifies that the data in the master thread of the team be copied to the thread private copies of the common block at the beginning of the parallel region.

  • The COPYPRIVATE clause uses a private variable to broadcast a value, or a pointer to a shared object, from one member of a team to the other members. It is an alternative to using a shared variable, or pointer association, and is useful when providing such a shared variable would be difficult. The COPYPRIVATE clause can only appear on the END SINGLE directive.

There are several rules and restrictions that apply with respect to data scope. See the OpenMP specification at http://www.openmp.org/specs for complete details.
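The following sketch shows several of the data scope attribute clauses on one PARALLEL DO directive. The variable names and values are illustrative assumptions; see the OpenMP specification for the restrictions that apply to each clause.

```fortran
program scope_demo
  implicit none
  integer, parameter :: n = 100
  integer :: i, offset, last_i
  real :: total, tmp, a(n)

  do i = 1, n
     a(i) = real(i)
  end do
  offset = 5
  total = 0.0

  ! a        : SHARED through DEFAULT(SHARED)
  ! i, tmp   : PRIVATE, undefined on entry to the loop
  ! offset   : FIRSTPRIVATE, each copy starts with the value 5
  ! last_i   : LASTPRIVATE, receives the value from iteration n
  ! total    : REDUCTION, private partial sums combined at the end
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i, tmp) &
!$OMP& FIRSTPRIVATE(offset) LASTPRIVATE(last_i) REDUCTION(+:total)
  do i = 1, n
     tmp = a(i) + real(offset)
     total = total + tmp
     last_i = i
  end do
!$OMP END PARALLEL DO

  print *, 'total = ', total, ' last_i = ', last_i
end program scope_demo
```

With these inputs the loop adds i + 5 for i from 1 to 100, so total should be 5550.0 and last_i should be 100 regardless of the number of threads.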

Directive Binding

Some directives are bound to other directives. A binding specifies the way in which one directive is related to another. For instance, a directive is bound to a second directive if it can appear in the dynamic extent of that second directive. The following rules apply with respect to the dynamic binding of directives:

  • A parallel region is available for binding purposes, whether it is serialized or executed in parallel.

  • The DO, SECTIONS, SINGLE, MASTER, BARRIER, and WORKSHARE directives bind to the dynamically enclosing PARALLEL directive, if one exists. The dynamically enclosing PARALLEL directive is the closest enclosing PARALLEL directive regardless of the value of the expression in the IF clause, should the clause be present.

  • The ORDERED directive binds to the dynamically enclosing DO directive.

  • The ATOMIC directive enforces exclusive access with respect to ATOMIC directives in all threads, not just the current team.

  • The CRITICAL directive enforces exclusive access with respect to CRITICAL directives in all threads, not just the current team.

  • A directive can never bind to any directive outside the closest enclosing PARALLEL.

Directive Nesting

The following rules apply to the dynamic nesting of directives:

  • A PARALLEL directive dynamically inside another PARALLEL directive logically establishes a new team, which is composed of only the current thread unless nested parallelism is enabled.

  • DO, SECTIONS, SINGLE, and WORKSHARE directives that bind to the same PARALLEL directive cannot be nested one inside the other.

  • DO, SECTIONS, SINGLE, and WORKSHARE directives are not permitted in the dynamic extent of CRITICAL and MASTER directives.

  • BARRIER directives are not permitted in the dynamic extent of DO, SECTIONS, SINGLE, WORKSHARE, MASTER, CRITICAL, and ORDERED directives.

  • MASTER directives are not permitted in the dynamic extent of DO, SECTIONS, SINGLE, WORKSHARE, MASTER, CRITICAL, and ORDERED directives.

  • ORDERED directives must appear in the dynamic extent of a DO or PARALLEL DO directive which has an ORDERED clause.

  • ORDERED directives are not allowed in the dynamic extent of SECTIONS, SINGLE, WORKSHARE, CRITICAL, and MASTER directives.

  • CRITICAL directives with the same name are not allowed to be nested one inside the other.

  • Any directive set that is legal when executed dynamically inside a PARALLEL region is also legal when executed outside a parallel region. When executed dynamically outside a user-specified parallel region, the directive is executed with respect to a team composed of only the master thread.
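The binding rules and the last nesting rule above can be illustrated with a DO directive placed in a subroutine, outside the lexical extent of any PARALLEL directive. This is a sketch only; the subroutine name double_all and the program name are hypothetical.

```fortran
subroutine double_all(a, n)
  implicit none
  integer, intent(in) :: n
  real, intent(inout) :: a(n)
  integer :: i

  ! This DO directive binds to whichever PARALLEL region
  ! dynamically encloses the call, if one exists.
!$OMP DO
  do i = 1, n
     a(i) = 2.0 * a(i)
  end do
!$OMP END DO
end subroutine double_all

program binding_demo
  implicit none
  integer, parameter :: n = 8
  real :: a(n)

  a = 1.0

  ! Called inside a parallel region: the loop iterations in
  ! double_all are divided among the threads of this team.
!$OMP PARALLEL SHARED(a)
  call double_all(a, n)
!$OMP END PARALLEL

  ! Called outside any parallel region: the same directive is
  ! executed by a team composed of only the master thread.
  call double_all(a, n)

  print *, a
end program binding_demo
```

Each element is doubled exactly once per call in both cases, so every element of a should equal 4.0 at the end.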