Chapter 2. Invoking MIPSpro 7 Fortran 90

This chapter describes the options for the f90(1) command. “CPU Targeting (Cross Compiling) Using the compiler.defaults File” describes CPU targeting.

The f90(1) command invokes the MIPSpro 7 Fortran 90 compiler. The following syntax boxes show the f90(1) command syntax:

f90 [-64 | -n32][-mipsn]file.suffix[90][file.suffix[90]]...
f90 [-64 | -n32][-alignn][-ansi][-apo][-apokeep][-apolist][-auto_use module_name[,module_name] ...][-C][-check_bounds][-c][-chunk=integer][-cif][-coln][-cord][-cpp][-cray_mp][-Dvar[=def][,var[=def]]...][-DEBUG:...][-dn][-default64][-E][-extend_source][-fbfile][-fixedform][-flist][-FLIST:][-freeform][-ftpp][-fullwarn][-Gnum][-g[debug_lvl]][-help][-I[dir]][-INLINE:...][-ipa][-IPA[:...]][-in][-ignore_suffix][-KPIC][-keep][-Ldirectory][-llibrary][-LANG:...][-LIST:...][-LNO:...][-listing][-macro_expand][-MDupdate[file]][-mipsn][-mp][-mplist][-MP:...][-mp_schedtype=mode][-noappend][-nocpp][-noextend_source][-nostdinc][-Olevel][-OPT:...][-oout_file][-P][-pfa][-pfakeep][-pfalist][-rreal_spec][-rprocessor][-S][-static][-static_threadprivate][-TARG:...][-TENV:...][-Uvar][-u][-version][-Wl,opt[,arg][,opt[,arg]]...][-w[arg]][-woffnum][-xdirlist][-xgot][--]file.suffix[90][file.suffix[90]...]

In some cases, more than one option can have an effect on a single compiler feature. The following list shows some of the compiler features and the options that affect them:


Note: The MIPSpro Auto-Parallelizing Option is invoked when you specify the -apo command line option. You must be licensed for the MIPSpro Auto-Parallelizing Option to use this command line option.

Various environment variable settings can affect your compilation. For more information on the environment variables, see the pe_environ(5) man page.

Some f90(1) command options, for example, -LNO:..., -LIST:..., -MP:..., -OPT:..., -TARG:..., and -TENV:..., accept several suboptions and allow you to specify a setting for each suboption. To specify multiple suboptions, either use colons to separate each suboption or specify multiple options on the command line. For example, the following command lines are equivalent:

f90 -LIST:notes=ON:options=OFF b.f
f90 -LIST:notes=ON -LIST:options=OFF b.f

Some arguments to suboptions of this type are specified with a setting that either enables or disables the feature. To enable a feature, specify the suboption either alone or with =1, =ON, or =TRUE. To disable a feature, specify the suboption with either =0, =OFF, or =FALSE. For example, the following command lines are equivalent:

f90 -LNO:auto_dist:blocking=OFF:oinvar=FALSE a.f
f90 -LNO:auto_dist=1:blocking=0:oinvar=OFF a.f

For brevity, this manual shows only the ON or OFF settings to suboptions, but the compiler also accepts 0, 1, TRUE, and FALSE as settings.

-64, -n32

Specifies the Application Binary Interface (ABI), either -n32 or -64. Specifying -n32 generates 32-bit objects. Specifying -64 generates 64-bit objects.


Note: Certain predefined system defaults can greatly affect your compilation. These include system defaults for your ABI, Instruction Set Architecture (ISA), and processor type. To determine the default ABI for your system, look in file /etc/compiler.defaults. To determine your system's processor, use the hinv(1) command. The -64 and -n32 options can affect the Instruction Set Architecture (ISA) used during compilation. For more information on this interaction, see the -mipsn option.

When -n32 is specified, the total memory allocation for a program and individual arrays cannot exceed 2 gigabytes (2 GB, or 2,048 MB). When -64 is specified, the compiler supports arrays that are larger than 2 GB.

As the following example shows, the arrays can be local, global, or dynamically created when compiling with the following command line:

f90 -64 -i8 -mips3 whale.f

       MODULE DEFS
       INTEGER, PARAMETER   :: ARRAY_SIZE = 4294967304_8     ! Z'100000008'
       INTEGER              :: I(ARRAY_SIZE)
       END MODULE


       PROGRAM MAIN
       USE DEFS
       INTEGER, ALLOCATABLE :: J(:)
       INTEGER              :: STATUS

       ALLOCATE(J(ARRAY_SIZE), STAT=STATUS)

       IF (STATUS == 0) THEN
         I(ARRAY_SIZE) = 7
         J(ARRAY_SIZE) = 8
         CALL SUB
       END IF

       END PROGRAM


       SUBROUTINE SUB
       USE DEFS
       INTEGER :: K(ARRAY_SIZE)

       K(ARRAY_SIZE) = 9

       END SUBROUTINE


Note: In the preceding example, you cannot specify an array with a size greater than 32 bits in an input or an output list of a READ, WRITE, or PRINT statement.

You must have enough swap space to support the working set size, and you must have your shell limit datasize, stacksize, and vmemoryuse variables set to values large enough to support the sizes of the arrays. For information on these settings, see the sh(1) man page.

The following example compiles and runs the preceding code after setting the stack size to a correct value:

$uname -a
IRIX64 cydrome 6.2 03131016 IP19
$f90 -64 -i8 -mips3 whale.f
$limit
cputime         unlimited
filesize        unlimited
datasize        unlimited
stacksize       65536 kbytes
coredumpsize    0 kbytes
memoryuse       524288 kbytes
descriptors     300
vmemoryuse      unlimited
threads         1024
$limit stacksize unlimited
$limit
cputime         unlimited
filesize        unlimited
datasize        unlimited
stacksize       524288 kbytes
coredumpsize    0 kbytes
memoryuse       754544 kbytes
descriptors     300
vmemoryuse      unlimited
threads         1024

-alignn

Aligns data objects on specified boundaries. The -alignn specifications are as follows:

Option      Action
-align32    Aligns objects 32 bits or larger on 32-bit boundaries.
-align64    Aligns objects 64 bits or larger on 64-bit boundaries. Default.

When an alignment is specified, objects smaller than the specification are aligned on boundaries that correspond to their sizes. For example, when -align64 is specified, 32-bit and larger objects are aligned on 32-bit boundaries; 16-bit and larger objects are aligned on 16-bit boundaries; and 8-bit and larger objects are aligned on 8-bit boundaries.

-ansi

Causes the compiler to generate messages when it encounters source code that does not conform to the Fortran standard. Specifying this option in conjunction with the -fullwarn option causes all messages, regardless of level, to be generated. For more information on the -fullwarn option, see “-fullwarn”.

-apo, -apokeep, -apolist

Controls the Auto-Parallelizing Option (APO), which automatically converts sequential code into parallel code by inserting parallel directives where it is safe and beneficial to do so. Specifying -apo invokes APO and sets the -mp option, which enables recognition of parallel directives inserted into your code.

The -apolist option produces a parallelization listing, file.list.

The -apokeep option specifies that file.anl and file.m should be retained after compilation and enables -apolist. For information on these files, see “-apokeep and -apolist” in Chapter 9.


Note: These options are ignored unless you are licensed for the MIPSpro Auto-Parallelizing Option. For more information on this product, contact your sales representative.

The -apo, -apokeep, and -apolist options are equivalent to the -pfa, -pfakeep, and -pfalist options. The -apo options are preferred.

The outmoded forms of these options, -apo list and -apo keep, are still accepted, but the preferred format is the format without the extra space. For more information on APO, see Chapter 9, “The Auto-Parallelizing Option (APO)”.

If -apokeep is specified in conjunction with -ipa or -IPA, the default settings for IPA suboptions are used with the exception of the inline=setting suboption. For that suboption, the default becomes OFF. For more information on IPA, see the ipa(5) man page.

-auto_use module_name[,module_name] ...

Directs the compiler to behave as if a USE module_name statement were entered in your Fortran source code for each module_name. The USE statements are entered in every program unit and interface body in the source file being compiled.


Note: Using this option can add to compile time in some situations.
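
As a hedged illustration of the effect described above, the following sketch writes out the USE statement that -auto_use is described as supplying. The module name PHYS_CONSTANTS, the program name CIRCLE, and the file names mentioned in the comments are hypothetical and do not come from this manual.

```fortran
       MODULE PHYS_CONSTANTS
       REAL, PARAMETER :: PI = 3.1415927
       END MODULE PHYS_CONSTANTS

       PROGRAM CIRCLE
       ! This USE statement is what -auto_use is described as supplying.
       ! Assuming the module were compiled separately (for example, from
       ! a hypothetical file consts.f90) and this program unit were
       ! compiled with
       !     f90 -auto_use PHYS_CONSTANTS circle.f90
       ! the line below could be omitted from the source.
       USE PHYS_CONSTANTS
       IMPLICIT NONE
       REAL :: RADIUS

       RADIUS = 2.0
       PRINT *, 'Area = ', PI * RADIUS**2
       END PROGRAM CIRCLE
```

Because the USE statements are entered in every program unit of the file being compiled, this sketch assumes the module itself lives in a different source file from the program units that rely on -auto_use.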


-c

Disables the load step and writes the binary object file to file.o.

For example, the following command line produces file more.o:

% f90 -c more.f

-C, -check_bounds

Performs run-time array subscript range checking. Subscripts that are out of range cause fatal run-time errors. If you set the F90_BOUNDS_CHECK_ABORT environment variable to YES, the program aborts.

These options are equivalent to the -DEBUG:subscript_check option. For more information on this option, see the debug_group(5) man page.
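
The following sketch shows the kind of error these options are meant to catch. The program name, variable names, and file name are hypothetical. The loop bound is computed in a variable so that the out-of-range subscript arises during execution rather than appearing as a constant in the source.

```fortran
       PROGRAM OVERRUN
       IMPLICIT NONE
       INTEGER :: A(10)
       INTEGER :: I, N

       ! Deliberate bug: N is one larger than the declared extent of A.
       N = SIZE(A) + 1
       DO I = 1, N
         A(I) = I          ! I = 11 is outside the bounds 1:10
       END DO
       PRINT *, A(10)
       END PROGRAM OVERRUN
```

Compiled with a command line such as f90 -C overrun.f90, the store to A(11) would be expected to produce a fatal run-time error, as described above. Without -C or -check_bounds, the subscript is not checked and the effect of the store is unpredictable.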

-chunk=integer

When compiling a multitasked program, this option specifies the number of loop iterations per chunk. For scheduling purposes, the iterations of a loop are broken up into pieces. This option must be specified in conjunction with the -mp option.

Specify a nonzero, unsigned, positive integer for integer. There is no default value for integer.

-cif

Generates a compiler information file (CIF) for use by the programming tools. For more information on CIF, see the Compiler Information File (CIF) Reference Manual.

-coln

Specifies the line width for fixed-format source lines. Specify 72, 80, or 120 for n. By default, fixed-format lines are 72 characters wide. Specifying -col120 implies -extend_source and recognizes lines up to 132 characters wide.

For more information on specifying line length, see the -extend_source and -noextend_source options.

-cord

Runs the procedure rearranger, cord(1), on the resulting file after loading. The rearrangement is done to reduce virtual memory paging and/or instruction cache misses.

For more information on procedure rearranging, see the cord(1), pixie(1), and prof(1) man pages.

-cpp

Runs a nondefault source preprocessor, cpp(1), on all input source files, regardless of suffix, before compiling. This preprocessor automatically expands macros outside of preprocessor statements.

The default is to run the Fortran preprocessor if the input file ends in a .F or .F90 suffix.

For more information on source preprocessing compiler options, see the following options: -Dvar[=def][,var[=def]]..., -E, -ftpp, -macro_expand, -nocpp, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-cray_mp

Specifies that all Autotasking directives described in Appendix C, “Autotasking Directives (Outmoded)”, should be recognized. These directives are also implemented in the CF90 compiler on UNICOS systems. The prefix for these directives is !MIC$.


Note: The Autotasking directives are outmoded. The preferred alternatives are the OpenMP Fortran API directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”.

You must specify this option if you want the following directives to be recognized in your code: DOALL, DOPARALLEL, ENDDO, [END]GUARD, [END]PARALLEL, [END]CASE, and NUMCPUS. For more information on these directives, see Appendix C, “Autotasking Directives (Outmoded)”.

It is not necessary to specify this option in order for the following two directives to be recognized: PERMUTATION and CNCALL. These two directives are recognized even when -cray_mp is not specified.

This option can be specified on the command line along with -apo or -pfa, but it cannot be specified along with -mp.

-dn

Specifies the KIND specification used for objects declared DOUBLE COMPLEX and DOUBLE PRECISION, as follows:

Option    KIND value
-d8       Uses REAL(KIND=8) for objects declared as DOUBLE PRECISION. Uses COMPLEX(KIND=8) for objects declared DOUBLE COMPLEX. Default.
-d16      Uses REAL(KIND=16) for objects declared as DOUBLE PRECISION. Uses COMPLEX(KIND=16) for objects declared DOUBLE COMPLEX.

-Dvar[=def][,var[=def]]...

Defines variables used for source preprocessing as if they had been defined by a #define directive. If no def is specified, 1 is used. For information on undefining variables, see the -Uvar option.

For more information on source preprocessing compiler options, see the following options: -cpp, -E, -ftpp, -macro_expand, -nocpp, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.
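
As a minimal sketch of how a variable defined with -D might be used, the following source guards a diagnostic PRINT statement with a conditional. It assumes that the preprocessor accepts the customary #ifdef and #endif conditionals; Chapter 7 is the authoritative reference for the directives actually supported. The program name TRACE, the variable DEBUG, and the file name trace.F90 are hypothetical.

```fortran
       PROGRAM TRACE
       IMPLICIT NONE
       INTEGER :: I, TOTAL

       TOTAL = 0
       DO I = 1, 10
         TOTAL = TOTAL + I
#ifdef DEBUG
         ! Kept only when DEBUG is defined, for example by -DDEBUG
         PRINT *, 'I = ', I, ' TOTAL = ', TOTAL
#endif
       END DO
       PRINT *, 'TOTAL = ', TOTAL
       END PROGRAM TRACE
```

Under these assumptions, f90 -DDEBUG trace.F90 would keep the PRINT statement inside the loop, and f90 trace.F90 would remove it. The .F90 suffix is used so that the Fortran preprocessor runs by default. DEBUG appears only in a preprocessor statement, so the example does not depend on macro expansion in ordinary source lines.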

-DEBUG:...

Controls the compiler's attempts to detect various errors (at compile time or run time) and controls how the errors are reported. For more information on the debugging options, see the debug_group(5) man page.

-default64

Sets the sizes of default integer, real, logical, and double precision objects to be the same as if the program were executing on a UNICOS system. This option causes the following options to go into effect: -r8, -i8, -d16, and -64.

Calling a routine in a specialized library, such as SCSL, requires that its 64-bit entry point be specified when 64-bit data are used. Similarly, its 32-bit entry point must be specified when 32-bit data are used.
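
One way to observe which default sizes are in effect is to print the results of the standard KIND and BIT_SIZE inquiry intrinsics, as in the following sketch. The program and variable names are hypothetical.

```fortran
       PROGRAM KINDS
       IMPLICIT NONE
       INTEGER          :: I
       REAL             :: R
       DOUBLE PRECISION :: D
       LOGICAL          :: L

       ! KIND and BIT_SIZE are inquiry intrinsics; they examine only the
       ! type of their argument, so the variables need not be defined.
       PRINT *, 'Default INTEGER kind:  ', KIND(I)
       PRINT *, 'Bits in an INTEGER:    ', BIT_SIZE(I)
       PRINT *, 'Default REAL kind:     ', KIND(R)
       PRINT *, 'DOUBLE PRECISION kind: ', KIND(D)
       PRINT *, 'Default LOGICAL kind:  ', KIND(L)
       END PROGRAM KINDS
```

Going by the option descriptions in this chapter, a compilation with no size options would be expected to report a 32-bit default integer and KIND=8 for DOUBLE PRECISION, and a compilation with -default64 would be expected to report a 64-bit default integer and KIND=16 for DOUBLE PRECISION.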

-E

Runs only the source preprocessor on the input files, without considering suffixes, and writes the result to stdout. This option overrides the -nocpp option. The output contains line directives. To generate output without line directives, see the -P option.

For more information on source preprocessing compiler options, see the following options: -cpp, -Dvar[=def][,var[=def]]..., -ftpp, -macro_expand, -nocpp, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-extend_source

Specifies a 132-character line length for fixed-format source lines. By default, fixed-format lines are 72 characters wide. For more information on controlling line length, see the -coln option.

-fbfile

Specifies the feedback file to be used. This file (with the suffix .cfb) can be produced by prof(1) with its -feedback option from one or more .Counts files generated by the execution of the instrumented program produced by pixie(1).

-fixedform

Treats all input source files, regardless of suffix, as if they were written in fixed source form. By default, only input files suffixed with .f or .F are assumed to be written in fixed source form.

-flist

Invokes all Fortran listing control options. Shows lowering, versioning, and tiling. The effect is the same as if all -FLIST:... options had been enabled.

-FLIST:...

Invokes the Fortran listing control group, which controls the translation of the compiler's internal program representation back into Fortran code, after IPA inlining and loop-nest transformations. This is used primarily as a diagnostic tool, and the generated Fortran code may not always compile.

The following sections describe the individual -FLIST:... options.

-FLIST:=setting

Enables or disables the listing. setting can be either ON or OFF. The default is OFF.

This option is enabled when any other -FLIST:... options are enabled, but it can also be used to enable a listing when no other options are enabled.

-FLIST:ansi_format=setting

Sets ANSI format. setting can be either ON or OFF. When set to ON, the compiler uses a space (instead of tab) for indentation and a maximum of 72 characters per line. The default is OFF.

-FLIST:emit_pfetch=setting

Writes prefetch information, as comments, in the transformed source file. setting can be either ON or OFF. The default is OFF.

In the listing, PREFETCH identifies a prefetch and includes the variable reference (with an offset in bytes), an indication of read/write, a stride for each dimension, and a number in the range from 1 (low) to 3 (high), which reflects the confidence in the prefetch analysis. prefetch identifies the reference(s) being prefetched by the PREFETCH descriptor. The comments occur after a read/write to a variable and note the identifier of the PREFETCH-spec for each level of the cache.

-FLIST:emit_omp=setting

Controls whether or not code written to listings and intermediate files is written using OpenMP Fortran API directives. setting can be either ON or OFF. The default is ON.

When ON is in effect, which is the default, all generated files are written using OpenMP directives. When OFF is in effect, all generated files are written using the outmoded MIPS multiprocessing directives.

-FLIST:ftn_file=file

Writes the program to file. By default, the program is written to file.w2f.f.

-FLIST:linelength=n

Sets the maximum line length to n characters.

-FLIST:show=setting

Writes the input and output filenames to stderr. setting can be either ON or OFF. The default is ON.

-freeform

Treats all input source files, regardless of suffix, as if they were written in free source form. By default, only input files suffixed with .f90 or .F90 are assumed to be written in free source form.

-ftpp

Runs the Fortran source preprocessor on input Fortran source files that are suffixed with .f or .f90 before compiling. By default, only files suffixed with .F or .F90 are run through the Fortran source preprocessor.

The Fortran source preprocessor does not automatically expand macros outside of preprocessor statements, so you need to specify -macro_expand if you want macros expanded.

If -ftpp and -P are specified, the preprocessed source code is placed in file.i, and file.i does not contain # lines.

For more information on source preprocessing compiler options, see the following options: -cpp, -Dvar[=def][,var[=def]]..., -E, -macro_expand, -nocpp, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-fullwarn

Requests that the compiler generate comment-level messages. These messages are suppressed by default. This option can be useful during software development.

-Gnum

Specifies the maximum size, in bytes, of a data item that is to be accessed from the Global Pointer (GP). num must be a decimal number.

If num is 0, no data is accessed from the GP. The default value for num is 8 bytes. Data stored relative to the GP can be accessed by the program quickly, but this space is limited. Large programs can overflow the space accessed by the GP at load time.

If the loader gives the Bad -G num value error message, recompile the program with -G0. Use the same value for this option when compiling all files that comprise a program executable or DSO.

-gdebug_lvl

Generates debugging information and establishes a debugging level. Specify one of the following:

Option     Support
-g0        No debugging information produced. Default.
-g2, -g    Information for symbolic debugging is produced, and optimization is disabled.
-g3        Information for symbolic debugging of fully optimized code is produced. The debugging information produced may be inaccurate. This option can be used in conjunction with the -O, -O1, -O2, and -O3 options.

-help

Lists all available options. The compiler is not invoked.

To list all suboptions within an option group, specify -LIST:all_options=ON. This shows, for example, all the suboptions to the -TENV:, -OPT:, and -LNO: options. For more information on the -LIST: option, see “-LIST:...”.

-in

Specifies the length of default integer constants, default integer variables, and logical quantities. Specify one of the following:

Option    Action
-i4       Specifies 32-bit (4-byte) objects. Default.
-i8       Specifies 64-bit (8-byte) objects. Also see the -default64 option.

-Idir

Specifies a directory to be searched for the following types of files:

  • Files named in INCLUDE lines in the Fortran source file that do not begin with a slash (/) character

  • Files named in #include source preprocessing directives that do not begin with a slash (/) character

  • Files specified on USE statements

Files are searched in the following order: first, in the directory that contains the input file; second, in the directories specified by dir; and third, in the standard directory, /usr/include.

-ignore_suffix

Compiles all files as if they were Fortran source files. By default, the f90(1) command determines the type of processing necessary for an input file based on its suffix. Files that end in .c, for example, are compiled by cc(1). When -ignore_suffix is specified, the compiler processes all files named as if they were Fortran source files, regardless of suffix.

-INLINE:...

Specifies actions for the standalone inliner. These options control the application of subprogram inlining within one file when interprocedural analysis (IPA) is not enabled.

If you have included inlining directives in your source code, the -INLINE option must be specified in order for those directives to be recognized.

For more information on the individual options in this group, see ipa(5).

-ipa

Invokes interprocedural analysis (IPA). Specifying this option is identical to specifying -IPA or -IPA:. Default settings for the individual IPA suboptions are used.

-IPA[:...]

Controls the application of interprocedural analysis (IPA) and optimization. This includes inlining, common block array padding, constant propagation, dead function elimination, alias analysis, and other features. Specify -IPA with no arguments to invoke the interprocedural analysis phase with default options.

If you have included IPA directives in your source code, the -IPA option must be specified in order for those directives to be recognized.

If you compile and load in distinct steps, you must use at least -IPA for the compile step, and you must specify -IPA and the individual options in the group for the load step. For more information on the individual options in this group, see the ipa(5) man page.

-keep

Writes all intermediate compilation files. file.s contains the generated assembly language code. file.i contains the preprocessed source code.

These files are retained after compilation is finished.

If IPA is in effect and you want to retain file.s, you must specify -IPA:keeplight=OFF in addition to -keep.

-KPIC

Generates position-independent code (PIC), which is necessary for programs loaded with dynamic shared libraries. Enabled by default.

-llibrary

Searches the library named liblibrary.a or liblibrary.so. Libraries are searched in the order given on the command line.

If you are using another compiler, for example the C compiler, to load Fortran object files, you need to explicitly specify to the C compiler that the Fortran libraries be loaded.

The following table shows the Fortran libraries that the f90(1) command loads by default.

-l option    Link library               Content
-lfortran    /usr/lib*/libfortran.so    Intrinsic procedure, I/O, multiprocessing, IRIX interface, and indexed sequential access method library for shared loading and compiling.
-lm          /usr/lib*/libm.so          Mathematics library.

Example 1. In the following example, the cc(1) command loads Fortran object files. The -l option loads the Fortran library files:

cc -o myprog main.o rest.o -lfortran -lm 

See the ld(1) man page for information on specifying the -l option.

Example 2. You may need to specify libraries when you use IRIX system packages that are not part of a particular language. Most of the man pages for these packages list the required libraries. For example, the getwd(3c) subroutine requires the BSD compatibility library libbsd.a. Specify this library as follows:

% f90 main.o more.o rest.o -lbsd

Example 3. To load the Silicon Graphics/Cray Scientific Library (SCSL), specify one of the following command lines:

% f90 -lscs sci.f

or

% f90 -lscs_mp mpsci.f

The -lscs_mp option used in the preceding command line loads the multiprocessed version of SCSL, which is supported on Origin series systems.

Example 4. To specify a library created with the archiver, type in the path name of the library as follows:

% f90 main.o more.o rest.o libfft.a


Note: The loader searches libraries in the order you specify. Therefore, if you have a library named libfft.a that uses data or procedures from -lfourier, you must specify libfft.a first.


-Ldirectory

Changes the library search algorithm for the loader. For directory, specify the path to a directory that should be searched before using the default system libraries. You can specify multiple -L options on the command line. The library search algorithm searches these directories in left to right order.

-LANG:...

Controls the language option group. The following sections describe the suboptions available in this group.

-LANG:heap_allocation_threshold=size

Determines heap or stack allocation. If the size of an automatic array or compiler temporary variable exceeds size bytes, it is allocated on the heap instead of the stack. If size is -1, objects are always put on the stack. If size is 0, objects are always put on the heap. The default is -1, which allows for maximum performance and for compatibility with previous releases.
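
The following sketch shows an automatic array of the kind this suboption applies to. The subroutine SMOOTH, the program DRIVER, and the file name used afterward are hypothetical.

```fortran
       SUBROUTINE SMOOTH(X, N)
       IMPLICIT NONE
       INTEGER, INTENT(IN) :: N
       REAL, INTENT(INOUT) :: X(N)
       REAL :: WORK(N)       ! automatic array: extent N is set per call
       INTEGER :: I

       ! Copy the input, then replace each interior element with the
       ! average of itself and its two neighbors.
       WORK = X
       DO I = 2, N - 1
         X(I) = (WORK(I-1) + WORK(I) + WORK(I+1)) / 3.0
       END DO
       END SUBROUTINE SMOOTH

       PROGRAM DRIVER
       IMPLICIT NONE
       INTEGER, PARAMETER :: N = 100000
       REAL, ALLOCATABLE :: X(:)

       ALLOCATE(X(N))
       X = 1.0
       CALL SMOOTH(X, N)
       PRINT *, X(N/2)
       END PROGRAM DRIVER
```

Assuming 4-byte default reals, WORK occupies 400,000 bytes in this call. With a command line such as f90 -LANG:heap_allocation_threshold=65536 smooth.f90, WORK exceeds the threshold and would be allocated on the heap. With the default of -1, it would be placed on the stack, so the shell stacksize limit would need to be large enough to hold it.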

-LANG:IEEE_minus_zero=setting

Controls whether or not a minus sign (-) is written for negative zero. Specify either ON or OFF for setting. The default is OFF, which suppresses the minus sign. The minus sign is suppressed by default to prevent problems from hardware instructions and optimizations that can return a -0.0 result from a 0.0 value.

-LANG:recursive=setting

Invokes the language option control group to control recursion support. setting can be either ON or OFF. The default is OFF.

In either mode, the compiler supports a recursive, stack-based calling sequence. The difference lies in the optimization of statically allocated local variables, as follows:

  • With -LANG:recursive=ON, the compiler assumes that a statically allocated local variable could be referenced or modified by a recursive procedure call. Therefore, such a variable must be stored into memory before making a call and reloaded afterwards.

  • With -LANG:recursive=OFF, the compiler can safely assume that a statically allocated local variable is not referenced or modified by a procedure call. This setting enables the compiler to optimize more aggressively.
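
The distinction above can be sketched with a recursive procedure that updates a statically allocated local variable. The names DESCEND, CALLS, BEFORE, and RECUR are hypothetical.

```fortran
       RECURSIVE SUBROUTINE DESCEND(N)
       IMPLICIT NONE
       INTEGER, INTENT(IN) :: N
       INTEGER, SAVE :: CALLS = 0    ! statically allocated local
       INTEGER :: BEFORE

       CALLS = CALLS + 1
       BEFORE = CALLS
       ! The nested call below modifies CALLS.
       IF (N > 0) CALL DESCEND(N - 1)
       PRINT *, 'N =', N, ' before =', BEFORE, ' after =', CALLS
       END SUBROUTINE DESCEND

       PROGRAM RECUR
       IMPLICIT NONE
       CALL DESCEND(3)
       END PROGRAM RECUR
```

In every invocation except the innermost, the value of CALLS read after the CALL statement differs from the value stored before it. Code of this kind relies on the variable being stored before the call and reloaded afterward, which is the behavior described for -LANG:recursive=ON. Under -LANG:recursive=OFF, the compiler is permitted to assume that the call leaves CALLS unchanged.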

-LIST:...

Writes an assembler listing file to file.l. If the -S option is also in effect, the content of this listing is also written to the assembly language file (file.s).


Note: For information on how to obtain a source listing and cross reference, see the -listing option.

The following sections describe the individual -LIST: options.


Note: If -LIST: is not specified, all the suboptions described in the following sections are set to OFF.


-LIST:=setting

Writes or suppresses the listing file. Specify ON or OFF for setting.

If one or more -LIST options are enabled, the listing file is written. By default, the listing file contains a list of compiler options in effect during compilation.

-LIST:all_options=setting

Writes or suppresses the list of all supported options in the listing file. Specify ON or OFF for setting. The default is OFF.

-LIST:notes=setting

Writes or suppresses notes regarding various optimization phases in the assembly listing file (file.s). Must be specified in conjunction with -S. Specify ON or OFF for setting. The default is ON.

-LIST:options=setting

Writes or suppresses a listing of the compiler options in effect during compilation in the listing file. Specify ON or OFF for setting. The default is OFF.

-LIST:symbols=setting

Writes or suppresses a listing of the internal compiler symbol tables used in the compilation in the listing file. Specify ON or OFF for setting. The default is OFF.

-listing

Writes a source code listing and a cross reference listing to file.L.

-LNO:...

Specifies options and transformations performed on loop nests by the Loop Nest Optimizer. The -LNO options are enabled only if -O3 is also specified on the f90(1) command line.

The arguments to -LNO are divided into the following groups:

  • General options

  • Transformation options

  • Cache memory management options

  • Translation Lookaside Buffer (TLB) options

  • Prefetch options

For information on the LNO options that are in effect during a compilation, specify -LIST:all_options=ON.

The following sections describe the individual LNO options.

General Options

The following sections describe the general options.

-LNO:auto_dist=setting (Origin Series Only)

Distributes local arrays and arrays in common blocks that are accessed in parallel. Specify ON or OFF for setting. The default is OFF.

When -LNO:auto_dist=ON, the compiler distributes local and COMMON arrays that are accessed in parallel based on access patterns inside the routines that contain definitions of arrays (as opposed to array declarations). Access patterns of arrays used as dummy arguments are ignored. This optimization works with either automatic parallelism or parallelism expressed through directives. This optimization is always safe, does not affect the layout of arrays in virtual space, and does not incur addressing overhead.

Example:

      PROGRAM FRED
      REAL A(1000,100)
      COMMON A
!$OMP PARALLEL DO PRIVATE (I,J)
      DO I=1,N
        DO J=1,N
          A(J,I) = 0.0
        END DO
      END DO
      END

In the preceding code fragment, every processor accesses a block of iterations of parallel loop I. This implies that every processor will zero a block of columns of array A. When this option is enabled, the compiler distributes the array using the !$SGI DISTRIBUTE A(*,BLOCK) directive so that each processor accesses data local to its own memory. The compiler might not pick the best distribution. In particular, if arrays are accessed differently in different subroutines, the distribution is that which suits the majority. This option is useful for programs that are not written with data distribution in mind. For more information on the DISTRIBUTE directive, see “Determining the Data Distribution for an Array: !$SGI DISTRIBUTE, !$SGI DISTRIBUTE_RESHAPE, and !$SGI REDISTRIBUTE” in Chapter 5.

-LNO:gather_scatter=n

Performs gather-scatter optimizations. Specify 0, 1, or 2 for n. The default is 1.

gather_scatter=0 disables all gather-scatter optimization. gather_scatter=1 performs gather-scatter optimizations on non-nested IF statements. gather_scatter=2 performs multilevel gather-scatter optimizations.

The following code fragment shows gather-scatter optimization:

      SUBROUTINE SUB(N)
      COMMON/BLK/A(1000),B(1000),C(1000),INDEX(1000)
      DO J = 1,N
        IF(A(J) .EQ. B(J)) THEN
            C(J) = SQRT(A(J))
        END IF
      END DO
      END

The compiler transforms this as follows:

        INC = 0
        DO J = 1, N
          ITEMP(INC + 1) = J
          IF(A0(J) .EQ. B0(J)) THEN
            INC = INC + 1
          ENDIF
        END DO
        DO IND_0 = 0, INC + -1
          J_TMP = ITEMP(IND_0 + 1)
          C0(J_TMP) = SQRT(A0(J_TMP))
        END DO

-LNO:ignore_pragmas=setting

Specifies that the command line options override directives in the source file. Specify either ON or OFF for setting. The default is ignore_pragmas=OFF.

By default, directives within a file override command line options.

-LNO:oinvar=setting

Controls outer loop hoisting. Hoisting is the process by which invariant statements or expressions are taken out of a loop. The compiler looks for expressions that vary in the inner loop but are invariant in an outer loop. The compiler precomputes all the invariant expressions and stores them in a temporary array. All references to the expression in the inner loop are replaced by loads from the array. Specify ON or OFF for setting. The default is oinvar=ON.
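
The transformation can be sketched by hand as follows. This is an illustration of the idea, not compiler output, and the program and array names are hypothetical. The first loop nest recomputes SQRT(B(J)) on every iteration of the outer loop; the second precomputes the values into a temporary array T and loads them in the inner loop.

```fortran
       PROGRAM HOIST
       IMPLICIT NONE
       INTEGER, PARAMETER :: N = 4, M = 3
       REAL :: A(M,N), A2(M,N), B(M), T(M)
       INTEGER :: I, J

       B = (/ 1.0, 4.0, 9.0 /)
       A = 0.0
       A2 = 0.0

       ! Original nest: SQRT(B(J)) changes with J (inner loop) but is
       ! recomputed unchanged for every I (outer loop).
       DO I = 1, N
         DO J = 1, M
           A(J,I) = A(J,I) + SQRT(B(J))
         END DO
       END DO

       ! Hand-written equivalent of the hoisted form: each value is
       ! computed once, kept in temporary array T, and then loaded.
       DO J = 1, M
         T(J) = SQRT(B(J))
       END DO
       DO I = 1, N
         DO J = 1, M
           A2(J,I) = A2(J,I) + T(J)
         END DO
       END DO

       PRINT *, 'Results agree: ', ALL(A == A2)
       END PROGRAM HOIST
```

Both nests perform the same additions on the same values, so the two result arrays are expected to match. The hoisted form trades M extra words of temporary storage for fewer SQRT evaluations.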

-LNO:opt=n

Controls the LNO optimization level. Specify either 0 or 1 for n. The default is 1.

opt=0 disables nearly all loop nest optimization. opt=1 performs full LNO transformations.

-LNO:outer=setting

Enables or disables outer loop fusion. Specify ON or OFF for setting. The default is outer=ON.

For more information on controlling loop fusion, see the -LNO:fusion option.

-LNO:parallel_overhead=num_cycles

Overrides internal compiler estimates concerning the efficiency to be gained by executing certain loops in parallel rather than serially. num_cycles specifies the number of processor cycles. Specify an integer for num_cycles. The default is 2600.

When the -apo or -pfa options are in effect, loops in a program are parallelized automatically. The compiler tests each DO loop to ensure that there is enough work in the loop to make it worth executing in parallel. Generally, the testing performed by the compiler evaluates each loop as follows:

IF ((work_per_processor(N, P) + parallel_overhead) < total_work_in_loop(N)) THEN
   perform parallel execution
ELSE
   perform serial execution
END IF

The work_per_processor, parallel_overhead, and total_work_in_loop are compiler-generated estimates, expressed in machine cycles, as follows:

  • The work_per_processor depends on the loop's trip count, N (a value that is known just before the loop is executed), and the number of processors, P, upon which the loop is to be executed.

  • The total_work_in_loop depends only on N, the loop's trip count.

  • The parallel_overhead represents the costs to initiate execution of the loop by all the threads and to synchronize them at the end. This value is 2600 cycles by default.

The -LNO:parallel_overhead=num_cycles option changes parallel_overhead to num_cycles cycles.

As parallel_overhead increases, it becomes less likely that loops will run in parallel. Increasing parallel_overhead is useful when the compiler underestimates work_per_processor, which causes a loop to run in parallel even though the parallelized loop actually runs more slowly than the serial version.

Conversely, as parallel_overhead decreases, it becomes more likely for the loop to run in parallel. Decreasing parallel_overhead is useful if the loop runs faster in parallel than in serial, but the work_per_processor determination overestimates the actual execution time and causes the slower serial version of the loop to be executed.
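The test described above can be sketched as follows. This is a minimal illustration, not the compiler's implementation: the module and function names are hypothetical, and the linear cost model (a fixed CYCLES_PER_ITER per iteration, with iterations divided evenly among processors) is an assumption, because the compiler's actual estimates are internal.

```fortran
MODULE OVERHEAD_DEMO
  IMPLICIT NONE
CONTAINS
  ! Returns .TRUE. when the test selects parallel execution.
  ! CYCLES_PER_ITER is a hypothetical per-iteration cost.
  LOGICAL FUNCTION RUN_PARALLEL(N, P, CYCLES_PER_ITER, PARALLEL_OVERHEAD)
    INTEGER, INTENT(IN) :: N, P, CYCLES_PER_ITER, PARALLEL_OVERHEAD
    INTEGER :: TOTAL_WORK, WORK_PER_PROCESSOR
    ! Assumed model: total work grows linearly with the trip count N.
    TOTAL_WORK = N * CYCLES_PER_ITER
    ! Assumed model: iterations are split evenly, rounding up.
    WORK_PER_PROCESSOR = ((N + P - 1) / P) * CYCLES_PER_ITER
    RUN_PARALLEL = (WORK_PER_PROCESSOR + PARALLEL_OVERHEAD) < TOTAL_WORK
  END FUNCTION RUN_PARALLEL
END MODULE OVERHEAD_DEMO
```

Under these assumptions, with 4 processors, 10 cycles per iteration, and the default overhead of 2600 cycles, a loop with N=100 runs serially (250 + 2600 is not less than 1000), and a loop with N=1000 runs in parallel (2500 + 2600 is less than 10000). Raising the overhead to 8000 makes the N=1000 loop run serially (2500 + 8000 is not less than 10000).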

-LNO:pure=n

Specifies the extent to which the compiler should consider the effect of a PURE procedure or a !DIR$ NOSIDEEFFECTS directive when performing parallel analysis. Specify 0, 1, or 2 for n, as follows:

n value

Description

0

Directs the compiler to ignore the fact that a PURE procedure or a procedure preceded by a !DIR$ NOSIDEEFFECTS directive does not modify global data or its arguments.

1

Directs the compiler to consider the fact that PURE procedures and procedures preceded by a !DIR$ NOSIDEEFFECTS directive do not modify global data or procedure arguments when performing parallel analysis. Default.

2

Asserts to the compiler that PURE procedures and procedures preceded by a !DIR$ NOSIDEEFFECTS directive do not modify global data, do not modify procedure dummy arguments, and do not access global data.

This setting asserts that the only non-local data items referenced by the procedure are the dummy arguments to the procedure. This is an extension of the Fortran standard meaning of PURE and of the meaning of !DIR$ NOSIDEEFFECTS. At this setting, more aggressive parallelization can occur if procedures are known not to access global data.
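The difference between the pure=1 and pure=2 settings can be illustrated with the following sketch. The module name PURE_DEMO, the function names, and the module variable SCALE are hypothetical. Both functions are valid PURE procedures under the Fortran standard, but only SQUARE also satisfies the stronger pure=2 assertion.

```fortran
MODULE PURE_DEMO
  IMPLICIT NONE
  REAL :: SCALE = 2.0        ! global (module) data
CONTAINS
  ! References only its dummy argument, so it satisfies the
  ! pure=2 assertion as well as the standard meaning of PURE.
  PURE FUNCTION SQUARE(X) RESULT(Y)
    REAL, INTENT(IN) :: X
    REAL :: Y
    Y = X * X
  END FUNCTION SQUARE

  ! Reads, but does not modify, the global variable SCALE.
  ! This is a valid PURE function, but the pure=2 assertion
  ! (no access to global data) would not hold for it.
  PURE FUNCTION SCALED(X) RESULT(Y)
    REAL, INTENT(IN) :: X
    REAL :: Y
    Y = SCALE * X
  END FUNCTION SCALED
END MODULE PURE_DEMO
```

A program that calls procedures like SCALED from loops that are candidates for parallelization should therefore not be compiled with pure=2.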

-LNO:vintr=setting

Specifies that vectorizable versions of the math intrinsic functions should be used. Vector versions of routines return multiple results per call, reducing the number of calls made in a loop and, thus, the call overhead. Specify ON or OFF for setting. The default is vintr=ON.

For information on the math intrinsic functions, see the math(3m) man page.

Transformation Options

The loop transformation options described in the following sections allow you to control cache blocking, loop fission, loop fusion, loop unrolling, and loop interchange.

-LNO:blocking=setting

Specifies whether cache blocking is performed.

Specify blocking=OFF to disable cache blocking. Cache blocking is performed to improve reuse of data in cache. Specify ON or OFF for setting. The default is blocking=ON.

For more information on blocking, see the MIPSpro Compiling and Performance Tuning Guide.
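To show what cache blocking does, the following sketch applies the transformation by hand to a matrix transpose. The module name, routine names, and the block size argument NB are hypothetical, and the transpose is chosen only as a simple example; the loops that the compiler blocks, and the block sizes it chooses, depend on the program and the target cache.

```fortran
MODULE BLOCKING_DEMO
  IMPLICIT NONE
CONTAINS
  ! Unblocked transpose: one of the two arrays is always
  ! traversed with a stride of N elements.
  SUBROUTINE TRANSPOSE_PLAIN(A, B, N)
    INTEGER, INTENT(IN) :: N
    REAL, INTENT(IN) :: A(N,N)
    REAL, INTENT(OUT) :: B(N,N)
    INTEGER :: I, J
    DO J = 1, N
      DO I = 1, N
        B(J,I) = A(I,J)
      END DO
    END DO
  END SUBROUTINE TRANSPOSE_PLAIN

  ! Blocked transpose: the iteration space is divided into
  ! NB-by-NB tiles so that the data touched by one tile can be
  ! reused while it is still in cache. NB must be at least 1.
  SUBROUTINE TRANSPOSE_BLOCKED(A, B, N, NB)
    INTEGER, INTENT(IN) :: N, NB
    REAL, INTENT(IN) :: A(N,N)
    REAL, INTENT(OUT) :: B(N,N)
    INTEGER :: I, J, II, JJ
    DO JJ = 1, N, NB
      DO II = 1, N, NB
        ! MIN handles partial tiles when NB does not divide N.
        DO J = JJ, MIN(JJ + NB - 1, N)
          DO I = II, MIN(II + NB - 1, N)
            B(J,I) = A(I,J)
          END DO
        END DO
      END DO
    END DO
  END SUBROUTINE TRANSPOSE_BLOCKED
END MODULE BLOCKING_DEMO
```

Both routines produce the same result; only the order in which the elements are visited differs.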

-LNO:blocking_size=n1[,n2]

Specifies a code blocking size that the compiler must use when performing any blocking. Specify a value for n2 when using a 2-level cache. For n1 or n2, enter an integer number that represents the number of iterations.

-LNO:fission=n

Controls loop fission. Specify 0, 1, or 2 for n. The default is 1.

Loop fission is an optimization process by which a loop is divided into smaller, independent loops. This can improve register use for large inner loops. It also enables other optimizations, such as loop interchange and blocking, to execute more efficiently. Consider the following loop:

DO I ...
   DO J1 ...
      ...
   ENDDO
   DO J2 ...
      ...
   ENDDO
ENDDO

With loop fission, the preceding loop is transformed into the following two loops:

DO I1 ...
   DO J1 ...
      ...
   ENDDO
ENDDO
DO I2 ...
   DO J2 ...
      ...
   ENDDO
ENDDO

fission=0 disables loop fission. fission=1 performs normal fission as necessary. fission=2 specifies that fission be tried before fusion.

If -LNO:fission=n and -LNO:fusion=n are both set to 1 or to 2, fusion is performed.

-LNO:fusion=n

Controls loop fusion. Loop fusion is an optimization process by which two small loops are transformed into one larger loop. Loop fusion can lower the number of memory references and improve cache behavior. It also enables other optimizations, such as loop interchange and cache blocking, to execute more efficiently. Specify 0, 1, or 2 for n. The default is 1. The loops to be fused need not have identical iteration counts, but the iteration counts should be approximately the same.

Consider the following loops:

DO I = 1,N
   DO J = 1,N
      A(I,J) = B(I,J) + B(I,J-1) + B(I,J+1)
   END DO
END DO
DO I = 1,N
   DO J = 1,N
      B(I,J) = A(I,J) + A(I,J-1) + A(I,J+1)
   END DO
END DO

With loop fusion, the preceding loops are transformed into the following loop:

DO I=1,N
   A(I,1) = B(I,0) + B(I,1) + B(I,2)
   DO J = 2,N
      A(I,J) = B(I,J) + B(I,J-1) + B(I,J+1)
      B(I,J-1) = A(I,J-2) + A(I,J-1) + A(I,J)
   END DO
   B(I,N) = A(I,N-1) + A(I,N) + A(I,N+1)
END DO

fusion=0 disables loop fusion. fusion=1 performs standard outer loop fusion. fusion=2 specifies that outer loops should be fused, even if it means partial fusion. The compiler attempts fusion before fission. The compiler performs partial fusion if not all levels can be fused in the multiple-level fusion.

If -LNO:fission=n and -LNO:fusion=n are both set to 1 or to 2, fusion is performed. For information on controlling outer loop fusion, see the -LNO:outer option.

The fusion= options affect the singly nested loops produced by the compiler.

-LNO:fusion_peeling_limit=n

Sets the limit for the number of iterations allowed to be peeled, where n ≥ 0. By default, fusion_peeling_limit=5.

Two loops can be fused directly only when their iteration counts are identical. When the counts differ by a few iterations, the compiler may need to peel the extra iterations from one of the loops prior to loop fusion. For example, consider the following loops:

      DO I = 1,N     ! loop 1
        . . .
      END DO

      DO I = 1,N-1     ! loop 2
        . . .
      END DO

In the preceding example, the iteration counts of loop 1 and loop 2 differ. The compiler removes (peels) one iteration from loop 1; fuses loop 1 and loop 2; and executes the peeled iteration from loop 1 separately from the resulting fused loop. In this example, one iteration was peeled. The default maximum number of iterations that can be peeled is five iterations. This option allows you to specify a different maximum number of iterations that the compiler can peel.
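The peeling and fusion steps described above can be sketched by hand as follows. The loop bodies, the module name, and the routine names are hypothetical, and this sketch peels the last iteration of loop 1; the iteration that the compiler chooses to peel and the code it generates may differ.

```fortran
MODULE PEEL_DEMO
  IMPLICIT NONE
CONTAINS
  ! Original form: loop 1 runs N iterations, loop 2 runs N-1.
  SUBROUTINE SEPARATE_LOOPS(A, B, C, N)
    INTEGER, INTENT(IN) :: N
    REAL, INTENT(OUT) :: A(N)
    REAL, INTENT(IN) :: B(N)
    REAL, INTENT(INOUT) :: C(N)
    INTEGER :: I
    DO I = 1, N            ! loop 1
      A(I) = B(I) + 1.0
    END DO
    DO I = 1, N-1          ! loop 2
      C(I) = A(I) * 2.0
    END DO
  END SUBROUTINE SEPARATE_LOOPS

  ! Peeled and fused form: one iteration of loop 1 is peeled so
  ! that both loops run N-1 iterations and can share one loop.
  SUBROUTINE PEELED_AND_FUSED(A, B, C, N)
    INTEGER, INTENT(IN) :: N
    REAL, INTENT(OUT) :: A(N)
    REAL, INTENT(IN) :: B(N)
    REAL, INTENT(INOUT) :: C(N)
    INTEGER :: I
    DO I = 1, N-1          ! fused loop
      A(I) = B(I) + 1.0
      C(I) = A(I) * 2.0
    END DO
    ! Peeled iteration of loop 1, executed separately.
    IF (N >= 1) A(N) = B(N) + 1.0
  END SUBROUTINE PEELED_AND_FUSED
END MODULE PEEL_DEMO
```

Here one iteration is peeled, which is within the default limit of five.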

-LNO:interchange=setting

Specifies whether loop interchange optimizations are performed.

Loop nests such as the following benefit from loop interchange optimizations:

DO I ...
   DO J ...
      DO K ...
         A(J,K) = A(J,K) + B(I,K)
      END DO
   END DO
END DO

In the preceding loop, each iteration of loop K requires two loads and one store. Also, if the loop bounds are large, every memory reference results in a cache miss.

With -LNO:interchange=ON, the loop is transformed into the following loop:

DO K ...
   DO J ...
      DO I ...
         A(J,K) = A(J,K) + B(I,K)
      END DO
   END DO
END DO

In the new loop, note that A(J,K) is a loop invariant entity; only one load is needed per iteration. The new loop is also more efficient with regard to cache management.

Specifying -LNO:interchange=OFF disables loop interchange optimizations. Specify ON or OFF for setting. The default is interchange=ON.

-LNO:ou=n, ou_max=n, and ou_prod_max=n

Specifies aspects of loop unrolling. When a loop is unrolled, the compiler makes copies of the loop body and executes them in sequence. The compiler performs some loop unrolling by default, but these options let you override default system assumptions.

Specifying ou=n indicates that all outer loops for which unrolling is legal should be unrolled n times; the result is that the compiler creates n copies of the loop. Specify an integer for n. The compiler unrolls loops by this amount (if specified) or not at all.

Specifying ou_max=n indicates that the compiler can unroll as many as n copies per loop, but no more.

Specifying ou_prod_max=n indicates that the product of unrolling of the various outer loops in a given loop nest is not to exceed n. The default is 16.

Example. The following loop is compiled with -LNO:ou=2:

DO I = 1,N
   DO J = 1,N
      A(J,I) = A(J,I) + B(J)
   END DO
END DO

After unrolling, the loop is as follows:

DO I = 1,N-1,2
   DO J = 1,N
      A(J,I) = A(J,I) + B(J)
      A(J,I+1) = A(J,I+1) + B(J)
   END DO
END DO
DO I = I,N        ! This nest computes remaining iterations.
   DO J = 1,N     ! This is the wind down loop.
      A(J,I) = A(J,I) + B(J)
   END DO
END DO

The advantage of unrolling, in this example, is that each B(J) is loaded N/2 times rather than N times.

-LNO:ou_deep=setting

Specifies that for loops with a nesting depth of 3 or more, the compiler should outer unroll the wind-down loops that result from outer unrolling loops further out. This results in a large executable file, but it generates much faster code whenever wind-down loop execution costs are important. The default is ou_deep=ON.

-LNO:ou_further=n

Specifies whether the compiler performs outer loop unrolling on wind-down loops. When unrolling a loop with n iterations u times, the compiler must generate a wind-down loop to handle cases in which n is not a multiple of u. The wind-down loop handles the extra iterations at the end. The wind-down loop will have at most u-1 iterations. When the unrolling factor, u, is large, it may be beneficial to unroll the wind-down loop itself. When this option is set to n, the compiler unrolls a wind-down loop only if the original loop was unrolled by at least a factor of n. Specify an integer for n.

You can disable additional wind-down unrolling by specifying -LNO:ou_further=999999. Unrolling is enabled as much as is sensible by specifying -LNO:ou_further=3.

Cache Memory Management Options

LNO does several transformations, such as blocking and loop interchange, to improve the cache behavior of programs. When performing these transformations, LNO assumes that the target platform has certain cache characteristics. The following sections describe suboptions that allow you to change the default cache characteristics, thereby giving finer control over the optimizations that LNO performs.

The cache memory management options allow you to tune up to four aspects of the memory hierarchy on your system. For example, these four levels could include level 1 cache, level 2 cache, the TLB, and main memory.

The numbering in these arguments starts with the cache level closest to the processor and works outward.

-LNO:assoc1=n, assoc2=n, assoc3=n, assoc4=n

Specifies cache set associativity. For a fully associative cache (for example, main memory acting as a fully associative cache for disk), set n to any sufficiently large number, such as 128. Specifying n=0 indicates that there is no cache at that level.

-LNO:cmp1=n, cmp2=n, cmp3=n, cmp4=n and dmp1=n, dmp2=n, dmp3=n, dmp4=n

Specifies, in processor cycles, the time for a clean or dirty miss to the next outer level of the memory hierarchy. This number is approximate because it depends upon a clean or dirty line, read or write miss, and so on. Specifying n=0 indicates that there is no cache at that level.

-LNO:cs1=n, cs2=n, cs3=n, cs4=n

Specifies the cache size. The value n can be 0, or it can be a positive integer followed by one of the following letters: k, K, m, or M. This specifies the cache size in kilobytes or megabytes. Specifying n=0 indicates that there is no cache at that level.

cs1 refers to the primary cache. cs2 refers to the secondary cache. cs3 refers to memory. cs4 refers to disk. The default cache size for each type of cache depends on your system. You can specify -LIST:all_options=ON to direct the compiler to generate a listing that includes the default cache sizes used during compilation. In addition, you can enter the following command to see the secondary cache size(s) on your system:

hinv -c memory | grep Secondary

-LNO:is_mem1=setting, is_mem2=setting, is_mem3=setting, is_mem4=setting

Specifies that certain memory hierarchies should be modeled as memory, not cache. Specify ON or OFF for setting. The default is OFF for each option.

If is_memk=ON is specified for a memory hierarchy level, the corresponding assock=n specification is ignored. Blocking can be attempted for this memory hierarchy level, and blocking appropriate for memory, rather than cache, is applied. No prefetching is performed, and any prefetching options are ignored. Any cmpk=n and dmpk=n options on the command line are ignored.

-LNO:local_pad_size=n

Specifies the amount by which to pad local array dimensions. By default, the compiler automatically chooses the amount of padding to improve cache behavior for local array accesses. The unit for n is in elements of the original arrays.

-LNO:ls1=n, ls2=n, ls3=n, ls4=n

Specifies the line size, in bytes. This is the number of bytes, specified in the form of an integer number, n, that are moved from the memory hierarchy level further out to this level on a miss. Specifying n=0 indicates that there is no cache at that level.

Translation Lookaside Buffer (TLB) Options

The following options control the TLB. The TLB is a cache for the page table. Blocking for the TLB can improve cache performance. The following sections describe options that control how the loop nest optimizer models the TLB when performing transformations. The TLB hardware is assumed to be fully associative.

-LNO:ps1=n, ps2=n, ps3=n, ps4=n

Specifies the number of bytes in a page, where n is an integer in the range 4000 ≤ n ≤ 256000. The default n depends on your system hardware, and you can obtain this information through the getpagesize(2) system call. For more information on this system call, see the getpagesize(2) man page.

-LNO:tlb1=n, tlb2=n, tlb3=n, tlb4=n

Specifies the number of entries in the TLB for this cache level, where n is an integer in the range 40 ≤ n ≤ 100. The default n depends on your system hardware.

-LNO:tlbcmp1=n, tlbcmp2=n, tlbcmp3=n, tlbcmp4=n and tlbdmp1=n, tlbdmp2=n, tlbdmp3=n, tlbdmp4=n

Specifies the number of processor cycles it takes to service a clean or dirty TLB miss, where n is an integer in the range 40 ≤ n ≤ 200. The default n depends on your system hardware.

Prefetch Options

The following options control use of the prefetch operation. When an LNO prefetch option is enabled, the compiler examines the source code for memory references that can cause cache misses. It then inserts prefetches into the generated code so that the prefetches are performed ahead of the corresponding memory references.

The -mips4 and -r10000 (or -r12000) options must be in effect in order for the LNO prefetch options to be recognized.

-LNO:pfk=setting

Selectively disables and enables prefetching for cache level k, where 1 ≤ k ≤ 4. Specify ON or OFF for setting.

When -r10000 or -r12000 is in effect, pf1=ON and pf2=ON by default. At any other -rn setting, OFF is in effect for all cache levels.

-LNO:prefetch=n

Specifies levels of prefetching.

prefetch=0 disables all prefetching. This is the default when -r4000, -r5000, or -r8000 is in effect.

prefetch=1 enables conservative prefetching. This is the default when -r10000 or -r12000 is in effect.

prefetch=2 enables aggressive prefetching.

-LNO:prefetch_ahead=n

Prefetches the specified number of cache lines ahead of the reference. The default is 2.

-LNO:prefetch_manual=setting

Specifies whether manual prefetches (through directives) should be respected or ignored. Specify ON or OFF for setting.

prefetch_manual=OFF ignores manual prefetches. This is the default when -r4000, -r5000, or -r8000 is in effect.

prefetch_manual=ON respects manual prefetches. This is the default when -r10000 or -r12000 is in effect.

-macro_expand

Enables macro expansion in preprocessed Fortran source files throughout each file.

When -macro_expand is specified, macro expansion occurs throughout the source file. When -macro_expand is not specified, macro expansion is limited to preprocessor (#) directives in files processed by the Fortran preprocessor.

For more information on source preprocessing compiler options, see the following options: -cpp, -Dvar[=def][,var[=def]]..., -E, -ftpp, -nocpp, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-MDupdate[file]

Updates makefile dependencies in file. The file can be included by smake(1) and pmake(1) to get dependencies. Files named on INCLUDE statements and modules named on USE statements are updated.

When file is not specified, the lines updated are those that begin with the name of the output file, followed by a colon (:), and end with a distinctive make(1) comment.

When file is specified, file is updated during compilation to contain header, library, and run-time make(1) dependencies for the output file.

For example, assume that file foo.f90 contains the following two lines:

INCLUDE "bar.h"
USE mod

The updated file will contain a line similar to the following:

foo.o : bar.h MOD.mod

-mipsn

Specifies the Instruction Set Architecture (ISA). Specify -mips3 to specify the MIPS III instruction set. Specify -mips4 to specify the MIPS IV instruction set. For information on the default setting for your system, see file /etc/compiler.defaults.

The -mipsn option interacts with the -64 and -n32 options.

-mp

Generates multiprocessing code for the files being compiled. This option causes the compiler to recognize all multiprocessing directives and enables all -MP:... options.

If you have specified more than one type of multiprocessing directive for an individual loop, you must disable one or more sets of directives, because only one set of multiprocessing directives can be recognized for a specific loop. Specifying -mp sets all the -MP:... options to ON. To disable one or more sets of directives, specify the corresponding -MP:... options in conjunction with -mp.

The -MP:dsm=setting, -MP:old_mp=setting, and -MP:open_mp=setting options each control recognition of one set of multiprocessing directives. For the directives that each setting affects, see the descriptions of the -MP:... options.

At load time, you can specify both object files produced with the -mp option and object files produced without it. If any or all of the files are compiled with -mp, the executable must be loaded with -mp so that the correct libraries are used.

Example 1: Multiprocessor executable. The following command line compiles and loads the Fortran program foo.f:

% f90 -mp foo.f

Example 2: Multiprocessor and optimizer. In the following example, the Fortran routines in the file snark.f are compiled with multiprocessing code generation enabled. The optimizer is also used.

% f90 -c -mp -O2 snark.f

A standard snark.o binary is produced, which must be loaded.

% f90 -mp -o boojum snark.o bellman.o

In this example, the -mp option signals the loader to use the Fortran multiprocessing library. The bellman.o file did not have to be compiled with the -mp option.

After loading, the resulting executable can be run like any executable file. Creating multiple execution threads, running and synchronizing threads, and task termination are all handled automatically.

When an executable file is loaded with -mp, the Fortran initialization routines determine how many parallel threads of execution to create. This determination occurs each time the task starts; the number of threads is not compiled into the code. The default is to use either 8 or the number of processors that are on the machine, whichever is less. You can override the default by setting the OMP_NUM_THREADS environment variable to a value that is less than or equal to the number of physical processors. If it is set, Fortran tasks use the specified number of execution threads. For more information on the OMP_NUM_THREADS environment variable, see pe_environ(5).

-MP:...

Specifies individual multiprocessing options that provide fine control over certain optimizations.

Specifying -mp enables all the -MP:... options. The -mp option must be specified in conjunction with any -MP: options in order for the -MP: options to be honored.

The following sections describe the -MP: options.

-MP:check_reshape=setting

Enables or disables run-time consistency checks across procedure boundaries when passing reshaped arrays (or portions thereof) as actual arguments. Specify ON or OFF for setting. The default is check_reshape=OFF.

-MP:clone=setting

Enables or disables autocloning. Specify ON or OFF for setting. The compiler automatically duplicates procedures that are called with reshaped arrays as actual arguments for the incoming distribution. If you have explicitly specified the distribution on all relevant dummy arguments, you can disable autocloning. The consistency checking of the distribution between actual and dummy arguments is not affected by this option and is always enabled. The default is clone=ON.

For more information on regular and reshaped distribution, see Chapter 5, “Parallel Processing on Origin Series Systems”.

-MP:dsm=setting (Origin Series Systems Only)

Enables or disables recognition of the Origin series distributed shared memory directives described in Chapter 5, “Parallel Processing on Origin Series Systems”. These directives begin with a !$ prefix and are outmoded.

Specify ON or OFF for setting. When the -mp option is also in effect, the default is dsm=ON. When the -mp option is not in effect, the default is dsm=OFF.


Note: The Origin series distributed shared memory directives that begin with the !$ prefix are outmoded. Silicon Graphics and Cray Research encourage you to write new codes using the Silicon Graphics directives that are extensions to OpenMP Fortran API. The OpenMP extension directives begin with the !$SGI prefix and are otherwise identical to the Origin series distributed shared memory directives.

The effects of this option when used in conjunction with -mp are as follows:

Options specified 

Directives recognized

-MP:dsm=ON and -mp 

OpenMP multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extension directives to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”.

Multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”.

Origin series distributed shared memory multiprocessing directives described in Chapter 5, “Parallel Processing on Origin Series Systems”, that begin with the !$ prefix.

-MP:dsm=OFF and -mp 

OpenMP multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extension directives to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”.

Multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”.

When the -mp option is specified on the f90(1) command line, the compiler silently generates bookkeeping information in the rii_files directory. This information is used to implement data distribution directives and to perform consistency checks of these directives across multiple source files. To disable the processing of the data distribution directives and not generate the rii_files directory, compile the program with the -MP:dsm=OFF option.

-MP:old_mp=setting

Enables or disables recognition of the Silicon Graphics multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”, and the Origin series distributed shared memory directives described in Chapter 5, “Parallel Processing on Origin Series Systems”, that begin with a !$ prefix. These directives are the loop-level multiprocessing directives (including those for Origin series systems) and the PCF directives. These directives begin with a !$ or !$PAR prefix.

Specify ON or OFF for setting. The default is ON.


Note: The Silicon Graphics multiprocessing directives are outmoded. Their preferred alternatives are the OpenMP Fortran API directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”.

The effects of this option when used in conjunction with -mp are as follows:

Options specified 

Directives recognized

-MP:old_mp=ON and -mp 

OpenMP multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extension directives to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”.

Multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”.

Origin series distributed shared memory multiprocessing directives described in Chapter 5, “Parallel Processing on Origin Series Systems”, that begin with the !$ prefix.

-MP:old_mp=OFF and -mp 

OpenMP multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extension directives to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”.

-MP:open_mp=setting

Enables or disables recognition of the OpenMP Fortran API multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extensions to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”. These directives begin with a !$OMP or a !$SGI prefix.

Specify ON or OFF for setting. The default is ON.

The effects of this option when used in conjunction with -mp are as follows:

Options specified 

Directives recognized

-MP:open_mp=ON and -mp 

OpenMP multiprocessing directives described in Chapter 4, “OpenMP Fortran API Multiprocessing Directives”, and the Silicon Graphics extension directives to OpenMP described in Chapter 5, “Parallel Processing on Origin Series Systems”.

Multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”.

Origin series distributed shared memory multiprocessing directives described in Chapter 5, “Parallel Processing on Origin Series Systems”, that begin with a !$ prefix.

-MP:open_mp=OFF and -mp 

Multiprocessing directives described in Appendix D, “Multiprocessing Directives (Outmoded)”.

Origin series distributed shared memory multiprocessing directives described in Chapter 5, “Parallel Processing on Origin Series Systems”, that begin with a !$ prefix.

-mplist

Generates file.w2f.f.


Note: Because of data conflicts, do not specify the -mplist or -FLIST options when -apokeep or -pfakeep is specified. For more information on -FLIST, see “-FLIST:...”. For more information on -apokeep and -pfakeep, see “-apo, -apokeep, -apolist”.


-mp_schedtype=mode

Specifies a default mode for scheduling work among the participating tasks in loops. This option must be specified in conjunction with -mp.

Specifying this option has the same effect as putting a !$MP_SCHEDTYPE=mode directive at the beginning of the file. Specify one of the following for mode:

mode

Action

DYNAMIC

Breaks the iterations into pieces, the size of which is specified by the -chunk=integer option. As each process executes a piece, it enters a critical section and obtains the next available piece. For more information, see the -chunk=integer option.

GSS

Schedules pieces according to the sizes of the pieces awaiting execution.

INTERLEAVE

Breaks the iterations into pieces, the size of which is specified by the -chunk=integer option. Execution of the pieces is interleaved among the processes. For more information, see the -chunk=integer option.

RUNTIME

Schedules pieces according to information contained in the MP_SCHEDTYPE environment variable.

SIMPLE

Divides the iterations among processes by dividing them into contiguous pieces and assigning one piece to each process. Default.

For more information on environment variables, these modes, and their effects, see the pe_environ(5) man page.
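As a rough illustration of how the SIMPLE and INTERLEAVE modes differ, the following sketch computes which process executes a given iteration. The module and function names are hypothetical, processes are numbered from 0, and the exact split is an assumption: SIMPLE is modeled as contiguous pieces of ceiling(N/P) iterations, and INTERLEAVE as pieces of CHUNK iterations assigned to processes in rotation. The run-time library may divide leftover iterations differently.

```fortran
MODULE SCHED_DEMO
  IMPLICIT NONE
CONTAINS
  ! SIMPLE (assumed model): N iterations are divided into P
  ! contiguous pieces, one piece per process.
  INTEGER FUNCTION OWNER_SIMPLE(I, N, P)
    INTEGER, INTENT(IN) :: I, N, P
    INTEGER :: PIECE
    PIECE = (N + P - 1) / P          ! piece size, rounded up
    OWNER_SIMPLE = (I - 1) / PIECE
  END FUNCTION OWNER_SIMPLE

  ! INTERLEAVE (assumed model): pieces of CHUNK iterations are
  ! assigned to processes 0, 1, ..., P-1, 0, 1, ... in rotation.
  INTEGER FUNCTION OWNER_INTERLEAVE(I, CHUNK, P)
    INTEGER, INTENT(IN) :: I, CHUNK, P
    OWNER_INTERLEAVE = MOD((I - 1) / CHUNK, P)
  END FUNCTION OWNER_INTERLEAVE
END MODULE SCHED_DEMO
```

Under this model, with 8 iterations and 2 processes, SIMPLE assigns iterations 1 through 4 to process 0 and iterations 5 through 8 to process 1. INTERLEAVE with a chunk size of 2 assigns iterations 1, 2, 5, and 6 to process 0 and iterations 3, 4, 7, and 8 to process 1.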

-noappend

Prevents the compiler from appending a trailing underscore character (_) on external names.

-nocpp

Disables the source preprocessor.

For more information on source preprocessing compiler options, see the following options: -cpp, -Dvar[=def][,var[=def]]..., -E, -ftpp, -macro_expand, -P, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-noextend_source

Restricts Fortran source code lines to columns 1 through 72. See the -coln and -extend_source options for more information on controlling line length.

-nostdinc

Directs the system to skip the standard directory, /usr/include, when searching for #include files and files named on Fortran INCLUDE statements.

-oout_file

Writes the executable file to out_file. By default, the executable file is written to a.out.

For example, the following command line loads object module myprog.o and produces an executable object named myprog:

% f90 -o myprog myprog.o

-Olevel

Specifies the basic optimization level, as follows:

Option

Action

-O0

No optimization. Default.

-O1

Local optimization.

-O2, -O

Extensive optimization. Optimizations performed at this level are almost always beneficial. The execution time is shortened, but compile time may be lengthened.

-O3

Aggressive optimization. Optimizations performed at this level may generate results that differ from those obtained when -O2 is specified.

-Ofast[=ipxx]

Enables -O3 and -ipa.

This option enables optimizations selected to maximize performance for the target platform ipxx processor type. To determine a platform ipxx designation, use the hinv(1) command.

The optimizations performed may differ from release to release and among the supported platforms. The optimizations always enable the full instruction set of the target platform (for example, -mips4 for an R10000). Although the optimizations are generally safe, they may affect floating-point accuracy due to operator reassociation. Typical optimizations selected include those performed at -O3. See the -TARG:platform=ipxx option for more information on the ipxx argument. The default is an R10000 POWER CHALLENGE, IP25.

-OPT:...

Controls miscellaneous optimizations. These options override defaults based on the main optimization level.

For information on inlining, see the -INLINE:... option. For information on loop nest optimization, see the -LNO:... option. For information on interprocedural optimization, see the -IPA:... option.

The following sections describe the various general optimization options.

-OPT:alias=name

Specifies the pointer aliasing model to be used. By specifying one of the following for name, the compiler is able to make assumptions throughout the compilation:

name 

Assumption

parm or no_parm 

parm asserts that Fortran arguments do not alias to any other variable. Default.

no_parm asserts that Fortran arguments can alias to any other variable.

cray_pointer or no_cray_pointer 

cray_pointer asserts that a pointee's storage is never overlaid on another variable's storage. The pointee is stored in memory before a call to an external procedure and is read out of memory at its next reference. It is also stored before a RETURN or END statement of a subprogram.

no_cray_pointer asserts that a pointee's storage can overlay on another variable's storage. Default.

-OPT:cis=setting

Converts SIN/COS pairs with the same argument to a single call that calculates both values at once. Specify ON or OFF for setting. The default is cis=ON.
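As a rough illustration of the idea only (not of the code the compiler generates), the following Python sketch contrasts two separate calls with one combined call that yields both values. The helper name cis and the use of cmath.rect are choices made for this example.

```python
import cmath
import math

def cis(x):
    """Return (cos(x), sin(x)) from one combined call."""
    z = cmath.rect(1.0, x)  # unit-modulus complex number with phase x
    return z.real, z.imag

x = 0.75
c, s = cis(x)
print(c, math.cos(x))  # cosine from the combined call and from a separate call
print(s, math.sin(x))  # sine from the combined call and from a separate call
```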

-OPT:cray_ivdep=setting

Instructs the compiler to ignore all dependencies when an IVDEP directive is encountered. Specify ON or OFF for setting. The default is OFF.

For more information on the IVDEP directive, see “Ignore Vector Dependencies: IVDEP” in Chapter 6.

-OPT:div_split=setting

Enables or disables the calculation of x/y as x × (1.0/y). Specify ON or OFF for setting. The default is div_split=OFF.

This is enabled by the -OPT:IEEE_arithmetic=3 option. Also see the -OPT:recip option. This option should be used with caution because it produces less accurate results.
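The loss of accuracy comes from rounding twice (once for the reciprocal, once for the multiplication) instead of once. The following Python sketch, which uses IEEE double-precision arithmetic and a hypothetical helper named mismatches, searches for small integers where the two forms disagree. It illustrates the arithmetic only, not the compiler's transformation.

```python
def mismatches(limit):
    """Return integers n in 1..limit where n/n differs from n*(1.0/n)."""
    found = []
    for n in range(1, limit + 1):
        x = float(n)
        direct = x / x          # one correctly rounded division
        split = x * (1.0 / x)   # reciprocal first, then multiply: two roundings
        if direct != split:
            found.append(n)
    return found

print(mismatches(100))
```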

-OPT:fast_bit_intrinsics=setting

fast_bit_intrinsics=ON turns off the check for the bit count being within range for Fortran bit intrinsics (for example, BTEST and ISHFT). Specify ON or OFF for setting. The default is fast_bit_intrinsics=OFF.

-OPT:fast_complex=setting

fast_complex=ON enables fast calculations for values declared as type complex. When set to ON, complex absolute value (norm) and complex division calculations use fast algorithms that can cause overflow for an operand (divisor, in the case of division) that has an absolute value that is larger than the square root of the largest representable floating-point number (or underflow for a value that is smaller than the square root of the smallest representable floating point number).

Specify ON or OFF for setting. The default is fast_complex=OFF. fast_complex=ON is enabled if -OPT:roundoff=3 is in effect.
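The overflow described above can be reproduced with a minimal Python sketch. It assumes that the fast algorithm resembles an unscaled sum of squares, which is an assumption made for illustration. The function names fast_abs and safe_abs are hypothetical.

```python
import math

def fast_abs(re, im):
    """Unscaled norm: the intermediate squares can overflow."""
    return math.sqrt(re * re + im * im)

def safe_abs(re, im):
    """Scaled norm: math.hypot avoids the intermediate overflow."""
    return math.hypot(re, im)

big = 1e200  # larger than the square root of the largest double (about 1.3e154)
print(fast_abs(big, big))
print(safe_abs(big, big))
```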

-OPT:fast_exp=setting

fast_exp=ON optimizes exponentiation by replacing the run-time call for exponentiation by multiplication and/or square root operations for certain compile-time constant exponents (integers and halves). This can produce results that are rounded differently than the run-time routine. fast_exp=ON is in effect unless -OPT:roundoff=1 is in effect.

Specify ON or OFF for setting. The default is fast_exp=ON.
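As a sketch of the kind of rewrite involved, the following Python fragment computes x raised to the constant power 2.5 with two multiplications and one square root, and compares it with the general exponentiation routine. The function name pow_2_5_fast is hypothetical, and the exact sequence the compiler emits may differ.

```python
import math

def pow_2_5_fast(x):
    """x**2.5 rewritten as two multiplications and one square root."""
    return x * x * math.sqrt(x)

x = 7.3
# The two results agree closely but are not guaranteed to round identically.
print(x ** 2.5, pow_2_5_fast(x))
```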

-OPT:fast_nint=setting

fast_nint=ON uses hardware features to implement NINT and ANINT (both single- and double-precision versions). Specify ON or OFF for setting. The default is fast_nint=OFF, but fast_nint=ON is enabled by default if -OPT:roundoff=3 is in effect. fast_nint=ON is also enabled when fast_trunc=ON is in effect.

When fast_nint=ON is in effect, rounding is performed according to the IEEE standard rather than the Fortran standard. For example, the Fortran standard requires that NINT(1.5)=2 and NINT(2.5)=3. The IEEE standard, however, rounds both of these to 2.
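The two rounding rules can be compared with a short Python sketch. The function names nint_fortran and nint_ieee are hypothetical; Python's built-in round() is used here because it rounds halfway cases to even, as the IEEE default rounding mode does.

```python
import math

def nint_fortran(x):
    """Round to nearest, halfway cases away from zero (Fortran NINT rule)."""
    return int(math.floor(x + 0.5)) if x >= 0 else int(math.ceil(x - 0.5))

def nint_ieee(x):
    """Round to nearest, halfway cases to even (IEEE default mode)."""
    return round(x)  # Python's built-in round() uses round-half-to-even

for value in (1.5, 2.5):
    print(value, nint_fortran(value), nint_ieee(value))
```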

-OPT:fast_sqrt=setting

fast_sqrt=ON calculates square roots using the identity sqrt(x)=x*rsqrt(x), where rsqrt is the reciprocal square root operation. Specify ON or OFF for setting. The default is OFF.

The -mips4 and -r8000 options must be in effect in order for -OPT:fast_sqrt to be recognized.


Warning: This option results in sqrt(0.0) producing a NaN result. Use it only when zero sqrt operands are not valid.
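The NaN arises because the reciprocal square root of zero is infinity, and zero times infinity is NaN under IEEE arithmetic. The following Python sketch models this. The rsqrt helper is hypothetical and returns infinity at zero explicitly, because Python raises an exception on division by zero rather than returning infinity.

```python
import math

def rsqrt(x):
    """Reciprocal square root with IEEE behavior at zero (1/sqrt(0) = +inf)."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)

def fast_sqrt(x):
    """sqrt(x) computed as x * rsqrt(x)."""
    return x * rsqrt(x)

print(fast_sqrt(4.0))
print(fast_sqrt(0.0))  # 0.0 * inf is NaN under IEEE arithmetic
```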


-OPT:fast_trunc=setting

fast_trunc=ON inlines the NINT, ANINT, AINT, and AMOD Fortran intrinsics, both single- and double-precision versions. Specify ON or OFF for setting. The default is fast_trunc=OFF. fast_trunc=ON is enabled automatically if -OPT:roundoff=1 (or greater) is in effect.

Although fully compliant with the Fortran standard, fast_trunc=ON reduces the valid argument range somewhat.

If fast_trunc=ON is in effect, fast_nint=ON is also enabled.

-OPT:fold_reassociate=setting

fold_reassociate=ON allows optimizations involving reassociation of floating-point quantities. Specify ON or OFF for setting. The default is fold_reassociate=OFF. fold_reassociate=ON is enabled automatically when -O3 is in effect or when -OPT:roundoff=2 or greater is in effect.
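Floating-point addition is not associative, so regrouping operands can change a result. The following Python sketch, using IEEE double precision and values chosen for this example, evaluates the same three-term sum with two groupings.

```python
a, b, c = 1e16, -1e16, 1.0

as_written = (a + b) + c     # a and b cancel first, then c is added
reassociated = a + (b + c)   # c is absorbed when added to b, before the cancellation

print(as_written, reassociated)
```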

-OPT:fold_unsafe_relops=setting

fold_unsafe_relops=ON folds relational operators in the presence of possible integer overflow. Specify ON or OFF for setting. The default is fold_unsafe_relops=ON.

-OPT:fold_unsigned_relops=setting

fold_unsigned_relops=ON folds unsigned relational operators in the presence of possible integer overflow. Specify ON or OFF for setting. The default is fold_unsigned_relops=OFF.

-OPT:got_call_conversion=setting

got_call_conversion=ON allows loads of function addresses to be moved out of loops. The load is set up with the proper relocation so that the address is resolved at program start-up time. Specify ON or OFF for setting. got_call_conversion=OFF is the default when -O2 or lower is in effect. got_call_conversion=ON is the default when -O3 is in effect.


Note: This option should be disabled when compiling shared objects that contain function addresses that may be preempted by rld(1). For more information, see the dso(5) man page.


-OPT:IEEE_arithmetic=n

Specifies the level of conformance to ANSI/IEEE 754-1985, the IEEE Standard for Binary Floating-point Arithmetic, which describes a standard for NaN and inf operands, arithmetic roundoff, and overflow. Specify one of the following for n:

n

Description

1

Inhibits optimizations that produce less accurate results than required by ANSI/IEEE 754-1985. This is the default.

2

Allows compiler optimizations that can produce less accurate inexact results (but accurate exact results) on the target hardware. That is, expressions that would have produced a NaN or an inf may produce different answers, but otherwise answers are the same as those obtained when IEEE_arithmetic=1 is in effect.

Examples: 0*X may be changed to 0, and X/X may be changed to 1 even though this is inaccurate when X is +inf, -inf, or NaN.

3

Performs arbitrary, mathematically valid transformations, even if they can produce inaccurate results for operations specified in ANSI/IEEE 754-1985. These transformations can cause overflow or underflow for a valid operand range. An example is the conversion of x/y to x*recip(y) for MIPS IV targets. Also see -OPT:roundoff=n.
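The 0*X and X/X examples given for IEEE_arithmetic=2 can be checked with a short Python sketch that evaluates both expressions without folding. The helper name unfolded is hypothetical.

```python
import math

def unfolded(x):
    """Evaluate 0*x and x/x with IEEE arithmetic, without folding."""
    return 0.0 * x, x / x

# For an ordinary value the folded results (0 and 1) match; for inf and NaN they do not.
for x in (2.0, math.inf, -math.inf, math.nan):
    print(x, unfolded(x))
```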

-OPT:IEEE_comparisons=setting

Forces all comparisons to yield results that conform to ANSI/IEEE 754-1985, the IEEE Standard for Binary Floating-point Arithmetic, which describes a standard for NaN and inf operands. Specify ON or OFF for setting. The default is IEEE_comparisons=OFF.

IEEE_comparisons=OFF produces non-IEEE results for comparisons. For example, x=x is treated as TRUE without executing a test.


Note: This option has been deprecated and will be removed in a future release. The preferred alternative is -OPT:IEEE_NaN_inf=setting.


-OPT:IEEE_NaN_inf=setting

Forces all operations that might have NaN or inf operands to yield results that conform to ANSI/IEEE 754-1985, the IEEE Standard for Binary Floating-point Arithmetic, which specifies the standard for NaN and inf operands. Specify ON or OFF for setting. The default is IEEE_NaN_inf=OFF.

IEEE_NaN_inf=OFF produces non-IEEE results for various operations. For example, x=x is treated as TRUE without executing a test and x/x is simplified to 1 without dividing. Turning this option on can suppress many such common optimizations and hurt performance.

-OPT:inline_intrinsics=setting

inline_intrinsics=OFF turns all Fortran intrinsics that have a library function into a call to that function. Specify ON or OFF for setting. The default is inline_intrinsics=ON.

-OPT:liberal_ivdep=setting

Specifies that the compiler should use UNICOS semantics when a !DIR$ IVDEP directive is encountered. The compiler ignores all lexically backward loop iteration dependencies. Specify ON or OFF for setting. The default is OFF, which directs the compiler to use IRIX semantics when a !DIR$ IVDEP directive is encountered.

For more information on the !DIR$ IVDEP directive, see “Ignore Vector Dependencies: IVDEP” in Chapter 6.

-OPT:Olimit=n

Specifies that any routine bigger than n should not be optimized.

You should use this option if you receive a message indicating that a different Olimit value is needed for optimizing your program; the value itself is based on internal compiler calculations. If -O2 or greater is in effect and a routine is so big that the compile speed can be slow, the compiler generates a message indicating the Olimit value that is needed to optimize. You can recompile with that value of n, or you can recompile with -OPT:Olimit=0 and avoid having any Olimit cutoff.

-OPT:pad_common=setting

pad_common=ON reorganizes common blocks to improve the cache behavior of accesses to members of the common block. This may involve adding padding between members and/or breaking a common block into a collection of common blocks. Specify ON or OFF for setting. The default is pad_common=OFF.

This option should not be used unless the common block definitions (including EQUIVALENCE) are consistent among all sources comprising a program. In addition, pad_common=ON should not be specified if common blocks are initialized with DATA statements. If specified, pad_common=ON must be used for all source files in the program.

pad_common=ON is supported for Fortran only. It should not be used if a common block is referenced from C code.

-OPT:procedure_reorder=setting

procedure_reorder=ON must be specified in conjunction with the ld(1) command's -LD_LAYOUT:reorder_file=feedback_file option to enable linker cording. Linker cording is the linker's ability to optimize the layout of functions based upon a feedback file; this minimizes page faults and cache misses. The default is OFF.

For more information on the -LD_LAYOUT option, see the ld(1) man page. For an example that shows reordering of code regions, see the MIPSpro Compiling and Performance Tuning Guide.

-OPT:recip=setting

The -OPT:recip=setting option causes your program's executable code to conform more closely to the IEEE floating-point standard than the default mode. When specified, many identity optimizations are disabled, executable code is slower, and a scaled complex divide mechanism is enabled that increases the range of complex values that can be handled without producing an underflow.

By default, the compiler optimizes expressions such as X.NE.X to false and X/X to 1, where X is a floating-point value. With -OPT:recip=setting in effect, these and other similar arithmetic identity optimizations are not performed.

recip=ON specifies that faster, but potentially less accurate, reciprocal operations should be performed. Specify ON or OFF for setting. The default is recip=OFF. -r8000 must be in effect in order for -OPT:recip=ON to have an effect. If -O3 is in effect, or if -OPT:IEEE_arithmetic=2 or above is in effect, recip=ON is enabled automatically.

-OPT:reorg_common=setting

reorg_common=ON reorganizes common blocks to improve the cache behavior of accesses to members of the common block. The reorganization is performed only if the compiler detects that it is safe to do so. Specify ON or OFF for setting.

This option produces consistent results for programs that conform to the Fortran standard; for example, programs that do not overindex arrays in common blocks. The optimizations performed are safe even if common blocks are declared differently in different subroutines or if elements in the common block are equivalenced.

reorg_common=ON is enabled by default when -O3 is in effect and when all files that reference the common block are compiled at -O3. reorg_common=OFF is set when the file that contains the common block is compiled at -O2 (or below).

-OPT:roundoff=n

Specifies the level of acceptable departure from source language floating-point round-off and overflow semantics. Specify 0, 1, 2, or 3 for n. Program performance is best at roundoff=3.

roundoff=0 is the default when optimization levels -O0, -O1, and -O2 are in effect. This inhibits optimizations that might affect the floating-point behavior.

roundoff=1 allows simple transformations that might cause limited round-off or overflow differences. Compounding such transformations could have more extensive effects.

roundoff=2 is the default level when -O3 is in effect. This level allows more extensive transformations, such as the reordering of reduction loops.

roundoff=3 enables any mathematically valid transformation.

To obtain best performance in conjunction with software pipelining, specify roundoff=2 or roundoff=3. This is because reassociation is required for many transformations to break recurrences in loops. Also see the descriptions for -OPT:IEEE_arithmetic, -OPT:fast_complex, -OPT:fast_trunc, and -OPT:fast_nint.

-OPT:rsqrt=setting

rsqrt=ON specifies that faster, but potentially less accurate, reciprocal square root operations may be performed. Specify ON or OFF for setting. The default is rsqrt=OFF.

If -O3 is in effect, or if -OPT:IEEE_arithmetic=2 or above is in effect, rsqrt=ON is enabled.

-OPT:space=setting

space=ON specifies that code space is to be given priority in tradeoffs with execution time in optimization choices. For instance, this forces all exits from a function to go through a single exit block. Specify ON or OFF for setting. The default is space=OFF.

This option can affect loop unrolling size. For more information on this, see “-OPT:unroll_size=n”.

-OPT:speculative_ptr_deref=setting

This option allows speculative loads of memory locations that differ by a small offset from some referenced memory location.

This option is enabled by default at -O2 and -O3. However, the legal offset ranges are different at each level. At -O2, the range is 32 (-16 .. +16). At -O3, the range is 128 (-64 .. +64).

This optimization can result in an exception if the speculated location is on a different page than that of the referenced memory location. The chance of this happening with these legal offset ranges is very remote.

-OPT:swp=setting

swp=ON enables software pipelining. Software pipelining is a compiler code generation technique in which operations from various loop iterations are overlapped in order to exploit instruction-level parallelism, increase the instruction issue rate, and better hide memory and instruction latency. As an optimization technique, software pipelining is similar to bottom loading, but it includes additional, and more efficient, scheduling optimizations.

Specify ON or OFF for setting. The default is swp=OFF, but swp=ON is enabled when -O3 is in effect.

-OPT:unroll_analysis=setting

unroll_analysis=ON analyzes resource usage and recurrences in bodies of innermost loops that do not qualify for being fully unrolled. Such loops are unrolled only to the extent for which there is a potential benefit in doing so. A loop could be unrolled, for example, to decrease the shortest possible schedule length per iteration. Specify ON or OFF for setting. The default is unroll_analysis=ON.

unroll_analysis=ON can have the negative effect of unrolling loops less than the upper limit dictated by the -OPT:unroll_times_max and -OPT:unroll_size specifications.

-OPT:unroll_size=n

Specifies the maximum size (in instructions) of an unrolled loop. Specify an integer for n. When -OPT:space=OFF is in effect, the default is unroll_size=80. When -OPT:space=ON is in effect, the default is unroll_size=20. For more information, see “-OPT:space=setting”.

This option indirectly determines which loops can be fully unrolled. Also see the -OPT:unroll_times_max option.

-OPT:unroll_times_max=n

Specifies the maximum number of times a loop will be unrolled if it is not going to be fully unrolled. Specify an integer for n.

The default value of n depends on the target processor. The default is 8 when -r8000, -r10000 or -r12000 are in effect, and the default is 4 in all other cases. Also see the -OPT:unroll_size option.

-OPT:wrap_around_unsafe_opt=setting

Allows you to prevent the compiler from performing potentially unsafe optimizations involving induction variable replacement and linear function test replacement. These optimizations are performed by default when -O2 or -O3 is specified. Specify ON or OFF for setting.

Setting wrap_around_unsafe_opt=OFF disables both the induction variable replacement and linear function test replacement optimizations. These optimizations are safe when loop induction variables do not overflow or wrap around in memory. They are unsafe when multiple induction variables in a loop have combined initial values that overflow or wrap around in memory, because incorrect code can then be generated. Using this option can degrade performance. It is provided as a diagnostic tool.

-P

Runs only the source preprocessor and puts the results for each source file (that is, for file.f[90], file.F[90], and/or file.s) in a corresponding file.i. The file.i that is generated does not contain # lines.

For more information on source preprocessing compiler options, see the following options: -cpp, [-Dvar[=def][,var[=def]]...], -E, -ftpp, -macro_expand, -nocpp, and -Uvar.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-pad_char_literals

Blank pads all character literal constants that are shorter than the size of the default integer type and that are passed as actual arguments. The padding extends the length to the size of the default integer type.

-pfa, -pfakeep, -pfalist

The -pfa, -pfakeep, and -pfalist options control the Auto-Parallelizing Option (APO). These options have been superseded by the -apo, -apokeep, and -apolist options.

For more information on the -apo, -apokeep, and -apolist options, see “-apo, -apokeep, -apolist”. For more information on APO, see Chapter 9, “The Auto-Parallelizing Option (APO)”.


Note: These options are ignored unless you are licensed for the MIPSpro Auto-Parallelizing Option. For more information on this product contact your sales representative.

If -pfakeep is specified in conjunction with -ipa or -IPA, the default settings for IPA suboptions are used with the exception of the inline=setting suboption. For that suboption, the default becomes OFF. For more information on IPA, see the ipa(5) man page.

-rprocessor

Specifies the code scheduler. The -r option accepts 4000, 5000, 8000, 10000, and 12000 as arguments, as follows:

Option

Action

-r4000

Schedules code for the R4000 processor.

-r5000

Schedules code for the R5000 processor.

-r8000

Schedules code for the R8000 processor.

-r10000

Schedules code for the R10000 processor.

-r12000

Schedules code for the R12000 processor.

Note that these options can also be specified with a k substituted for 000, as follows: -r8k, -r10k, and so on.

This option adds one of the following to the head of the library search path, where processor is as you specified:

  • -L/usr/lib32/mips3/processor

  • -L/usr/lib32/mips4/processor

  • -L/usr/lib64/mips3/processor

  • -L/usr/lib64/mips4/processor

The actual library search path that is added depends on the ABI that is specified or implied. For information on specifying an ABI, see the -64 and -n32 options described in “-64, -n32”.

-rreal_spec

Specifies the default kind specification for real values, as follows:

Option

Kind value

-r4

Uses REAL(KIND=4) and COMPLEX(KIND=4) for real and complex variables, respectively. Default.

-r8

Uses REAL(KIND=8) and COMPLEX(KIND=8) for real and complex variables, respectively. You can specify -r8 when porting programs from UNICOS systems.

-S

Generates an assembly file, file.s, rather than an object file (file.o).

-static

Statically allocates all local variables. Statically allocated local variables are initialized to zero and exist for the life of the program. This option can be useful when porting programs from older systems in which all variables are statically allocated.

When compiling with the -static option, global data is allocated as part of the compiled object (file.o) file. The total size of any file.o cannot exceed 2 GB, but the total size of a program loaded from multiple .o files can exceed 2 GB. An individual common block cannot exceed 2 GB, but you can declare multiple common blocks each having that size.

For more information on compiling with large files, see the -64 and -n32 options described in “-64, -n32”.

If a parallel loop in a multiprocessed program calls an external routine, that external routine cannot be compiled with the -static option. You can mix static and multiprocessed object files in the same executable, but a static routine cannot be called from within a parallel region.

-static_threadprivate

Makes all static variables private to each thread. This option can be specified in conjunction with the -static option, which statically allocates all local variables.

-TARG:...

Cross compiling is compiling a program on one system and executing it on another. To cross compile, you can either use the -TARG: command line options to control the target architecture and machine for which code is generated or you can set the COMPILER_DEFAULTS_PATH environment variable to specify the file that contains the default processor information needed to generate executable code for the target system.

The following sections describe cross compiling using both the -TARG: options and the COMPILER_DEFAULTS_PATH environment variable.

-TARG:dismiss_mem_faults=setting

Forces the kernel to dismiss any memory faults, such as SIGSEGV or SIGBUS, that occur during execution of the program (not just the code being compiled). This option allows optimizations that might cause extra faults and can slow down execution if extra faults occur. It also prevents recognition of legitimate faults. setting can be ON or OFF. The default is OFF.

-TARG:exc_max=letter

Specifies the maximum set of IEEE-754 floating-point exceptions for which traps may be enabled at run time for the program (not just the code being compiled). The default is IUOZV. The default can be affected by the -TENV:X option.

This option allows optimizations that might cause extra exceptions, and it may prevent recognition of legitimate faults. It does not affect explicit setting of exception enable flags by the program and should be avoided if the program does this.

Specify zero or more of the following for letter to specify exceptions: I specifies inexact; U specifies underflow; O specifies overflow; Z specifies divide by zero; and V specifies invalid operations.

For related information, see the -TARG:exc_min option description. For information on the -TENV:X option, see “-TENV:X=n”.

-TARG:exc_min=letter

Specifies the minimum set of IEEE-754 floating-point exceptions for which traps must be enabled at run time for the program (not just the code being compiled). The default is none.

This option does not affect explicit setting of exception enable flags by the program and should be avoided if the program does this.

Specify zero or more of the following for letter to specify exceptions: I specifies inexact; U specifies underflow; O specifies overflow; Z specifies divide by zero; and V specifies invalid operations.

For related information, see the -TARG:exc_max option description. The -TARG:exc_max and -TARG:exc_min options specified for the various files that comprise a program must be consistent; for example, none of the -TARG:exc_min values may require exceptions disabled by -TARG:exc_max values.
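The consistency requirement amounts to a subset test: every exception that a -TARG:exc_min value requires must be permitted by every -TARG:exc_max value. A minimal Python sketch of that check follows. The function name consistent is hypothetical, and accepting lowercase letters is an assumption of this example, not documented compiler behavior.

```python
VALID = set("IUOZV")  # inexact, underflow, overflow, divide by zero, invalid

def consistent(exc_min, exc_max):
    """Return True if every exception required by exc_min is allowed by exc_max."""
    required, allowed = set(exc_min.upper()), set(exc_max.upper())
    if not (required <= VALID and allowed <= VALID):
        raise ValueError("letters must be drawn from IUOZV")
    return required <= allowed

print(consistent("OZ", "IUOZV"))  # requirements fall within the permitted set
print(consistent("OZ", "IU"))     # O and Z are required but not permitted
```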

-TARG:fp_precise=setting

Forces the target processor into precise floating-point mode at execution time. Using this option to compile any component source files of a program invokes this feature in the resulting program. Specify ON or OFF for setting. The default is OFF.

This option is only meaningful when -r8000 is in effect. It can cause significant performance degradation for programs with heavy floating-point usage. For more information on floating-point mode, see the fpmode(1) man page.

-TARG:isa=instruction_set

Identifies the target instruction set architecture for compilation, that is, the set of instructions that are generated. For instruction_set specify mips3 or mips4. Specify -TARG:isa=mips3 for code that must run on R4000 processors. This option is equivalent to specifying -mips3 or -mips4. For information on defaults, and for information on the -mipsn option, see “-mipsn”.

-TARG:madd=setting

Enables or prevents transformations from using multiply and add instructions. Specify ON or OFF for setting. The default is ON. This option is ignored unless -mips4 is in effect.

These instructions perform a multiply/add with a single round off. They are more accurate than the usual discrete operations, and they may cause results not to match baselines from other targets. Use this option to determine whether observed differences are due to multiply/add instructions.

-TARG:platform=ipxx

Specifies the target platform for compilation, choosing various internal parameters (such as cache sizes) appropriately. Supported values are as follows: ip19, ip20, ip21, ip22_4k, ip22_5k, ip24, ip25, ip26, ip27, ip28, ip30, ip32_5k, and ip32_10k. The appropriate selection for your platform can be determined by entering the following command:

hinv -c processor

The first line of output identifies the proper IP number. If a processor suffix (for example, _4k) is required, the next line identifies the processor (for example, R4000).

-TARG:processor=processor

Selects the processor for which to schedule code. The chosen processor must support the instruction set architecture (ISA) that is specified (or implied by the ABI). Specify one of the following for processor: r4000, r5000, r8000, r10000, or r12000.

-TARG:r4krev22=setting

Generates code to work around bugs in the R4000 rev 2.2 chip. This currently means simulating 64-bit variable shifts in software. Specify ON or OFF for setting. The default is OFF.

-TARG:sync=setting

Enables or disables use of SYNC instructions. Specify ON or OFF for setting. The default is ON.

CPU Targeting (Cross Compiling) Using the compiler.defaults File

The MIPSpro 7 Fortran 90 compiler retrieves default information for the Application Binary Interface (ABI), instruction set architecture (ISA), processor type, optimization level, and IEEE arithmetic level from /etc/compiler.defaults.

To compile for a different system, set the COMPILER_DEFAULTS_PATH environment variable to a path or to a colon-separated list of paths designating where the compiler is to look for the compiler.defaults file. For more information on this environment variable, see the pe_environ(5) man page.

The target compiler.defaults file must contain a -DEFAULT:option specifier that specifies the default information in the following format:

-DEFAULT:[abi=n32|64] [:isa=mips3|mips4]
[:proc=r4000|r5000|r8000|r10000|r12000] [:opt=0|1|2|3]
[:arith=1|2|3]

Note that command line settings override any settings in the system-supplied compiler.defaults file or in the compiler.defaults file that you create.
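As a hedged illustration of the -DEFAULT: format shown above, the following Python sketch assembles a specifier from the listed keys and values. It assumes that the suboptions are joined with colons and no intervening spaces, as the format suggests. The function name default_spec and the sample settings are hypothetical.

```python
ALLOWED = {
    "abi": ("n32", "64"),
    "isa": ("mips3", "mips4"),
    "proc": ("r4000", "r5000", "r8000", "r10000", "r12000"),
    "opt": ("0", "1", "2", "3"),
    "arith": ("1", "2", "3"),
}

def default_spec(**settings):
    """Build a -DEFAULT: specifier from keyword settings, validating each value."""
    unknown = set(settings) - set(ALLOWED)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    parts = []
    for key in ALLOWED:  # keep the order shown in the format
        if key in settings:
            value = str(settings[key])
            if value not in ALLOWED[key]:
                raise ValueError(f"{key}={value} is not a listed value")
            parts.append(f"{key}={value}")
    return "-DEFAULT:" + ":".join(parts)

print(default_spec(abi="n32", isa="mips4", proc="r10000", opt=2, arith=2))
```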

-TENV:...

Specifies the target environment option group. The target environment is the system upon which the executable code will be run. These options control the target environment assumed and/or produced by the compiler.

The following sections describe the -TENV:... options.

-TENV:align_aggregate=bytes

Controls alignment of allocated aggregates (that is, arrays and derived types). The value specified for bytes specifies that any aggregate object at least that large is to be given at least that alignment. By default, or if bytes is not specified, aggregates are aligned to the integer register size, which, for example, is 8 bytes for 64-bit programs and 4 bytes for 32-bit programs.

If align_aggregate=0 is specified, the minimum alignment consistent with the ABI is used. Otherwise, the value specified must be 1, 2, 4, 8, or 16.

-TENV:check_div=n

Inserts checks for divide by zero operations and overflow conditions on integer divide operations. Specify 0, 1, 2, or 3 for n. The default is check_div=1.

check_div=0 inhibits checking. check_div=1 checks for division by zero. check_div=2 checks for overflow. check_div=3 checks for both division by zero and overflow.
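One way to read these values is as two independent flags, with 1 selecting the division-by-zero check and 2 selecting the overflow check. The following Python sketch encodes that reading. The function name check_div_checks is hypothetical, and the bit-flag interpretation is inferred from the value list, not stated by the compiler documentation.

```python
def check_div_checks(n):
    """Return the checks selected by -TENV:check_div=n, per the value list."""
    if n not in (0, 1, 2, 3):
        raise ValueError("n must be 0, 1, 2, or 3")
    checks = []
    if n & 1:
        checks.append("division by zero")
    if n & 2:
        checks.append("overflow")
    return checks

for n in range(4):
    print(n, check_div_checks(n))
```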


Note: This option is deprecated. It will be removed in a future release. The preferred alternative is to specify -DEBUG:div_check. For more information on the -DEBUG:div_check option, see the debug_group(5) man page.


-TENV:large_GOT=setting

Generates code to accommodate a larger Global Offset Table (GOT) than is standard. Specify ON or OFF for setting. The default is large_GOT=OFF.

You can set this option to ON if you get a GOT or GP overflow message. For more information about the GOT, see the “What should I do about a GOT overflow?” question in the FAQ section of the dso(5) man page.


Note: If you specify both -TENV:large_GOT=ON and -TENV:small_GOT=ON on your command line, a message is issued and the -TENV:small_GOT=ON directive is recognized.


-TENV:small_GOT=setting

Assumes that the GOT for shared code is smaller than 64 KB, that is, assumes small offsets for references to it. Specify ON or OFF for setting. The default is small_GOT=ON.

For more information on controlling the GOT, see the -TENV:large_GOT option.

-TENV:X=n

Specifies the level of enabled exceptions that will be assumed for purposes of performing speculative code motion. The exceptions considered here are the floating-point exceptions defined in ANSI/IEEE 754-1985, the IEEE Standard for Binary Floating-point Arithmetic (inexact, overflow, underflow, divide-by-zero, and invalid operation), and memory traps (SIGSEGV and SIGBUS).

Specify 0, 1, 2, 3, or 4 for n. The default is X=2 when -O3 is in effect and X=1 when any other -O optimization level is in effect.

Generally, an instruction is not speculated (moved above a branch by the optimizer) unless any exceptions it might cause are disabled by this option. X=0 inhibits speculative code motion.

X=1 specifies that safe speculative code motion be performed and disables all underflow and inexact exceptions according to ANSI/IEEE 754-1985.

X=2 disables all exceptions described in ANSI/IEEE 754-1985, except divide by zero.

X=3 disables all exceptions described in ANSI/IEEE 754-1985, including divide by zero.

X=4 disables or ignores memory exceptions.

At levels higher than the X=1 default level, various hardware exceptions, which are normally useful for debugging, or which are trapped and repaired by the hardware, may be disabled or ignored. This can hide obscure bugs. The program should not explicitly manipulate the IEEE floating-point trap-enable flags in the hardware if this option is used.

-u

Makes the default type of a variable undefined, rather than using default Fortran rules.

-Uvar

Undefines a variable for the source preprocessor. See the [-Dvar[=def][,var[=def]]...] option for information on defining variables.

For more information on source preprocessing compiler options, see the following options: -cpp, [-Dvar[=def][,var[=def]]...], -E, -ftpp, -macro_expand, -nocpp, and -P.

For information on source preprocessing and the macros available, see Chapter 7, “Source Preprocessing”.

-version

Writes compiler release version information to stdout. No input file needs to be specified when this option is used.

-w[arg]

Controls the display of warning messages. This option can take one of the following forms:

Option

Action

-w

Suppresses warning messages.

-w2

Shows warning messages. Default.

-Wl,opt[,arg][,opt[,arg]]...

Specifies options to be passed directly to the loader. For opt, specify any of the options that the loader, ld(1), accepts. For arg, specify an argument, if necessary, to opt. For information on possible values for opt and arg, see the ld(1) man page.

Example. The following command line passes the loader options -B static and -nostdlib to ld(1):

f90 -Wl,-B,static,-nostdlib herfile.f

-woffnum

Specifies message numbers to suppress. Examples:

  • Specifying -woff2026 suppresses message number 2026.

  • Specifying -woff2026-2352 suppresses messages 2026 through 2352.

  • Specifying -woff2026-2352,2400-2500 suppresses messages 2026 through 2352 and messages 2400 through 2500.
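The range syntax can be made concrete with a small Python sketch that expands a -woff argument into the set of message numbers it names. It assumes that both ends of each range are included. The function name suppressed is hypothetical and does not correspond to any compiler interface.

```python
def suppressed(spec):
    """Expand a -woff argument such as '2026-2352,2400-2500' into a set of numbers."""
    numbers = set()
    for item in spec.split(","):
        first, _, last = item.partition("-")
        low = int(first)
        high = int(last) if last else low
        numbers.update(range(low, high + 1))  # ranges are treated as inclusive
    return numbers

msgs = suppressed("2026-2352,2400-2500")
print(2026 in msgs, 2352 in msgs, 2400 in msgs, 2500 in msgs)
```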

The message number appears after the dash in the prefix of the message itself. For example, in the following message prefix, the message number is 197:

f90-197 mfef90:...

You cannot suppress messages issued at the ERROR level.
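As an illustration of reading the number out of a message prefix, the following sketch turns a prefix like the one shown above into the matching -woff option. The helper name woff_for is hypothetical, and the snippet assumes the prefix always has the form f90-number followed by a space:

```shell
# Hypothetical helper: derive a -woff option from a message prefix such as
# "f90-197 mfef90:...". The number is the text between the dash and the space.
woff_for() {
    num=${1#*-}       # strip everything up to and including the first dash
    num=${num%% *}    # strip everything from the first space onward
    printf '%s\n' "-woff$num"
}

woff_for "f90-197 mfef90:..."   # prints: -woff197
```

The resulting option has an effect only for messages below the ERROR level.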

-xdirlist

Disables specified directives or specified classes of directives. If you specify a multiword directive, either enclose the directive name in quotation marks or remove the spaces between the words in the directive's name.

For dirlist, enter one of the following:

dirlist

Directives disabled

all or mipspro

All directives.

conditional_omp

Directives prefixed with !$.

dir

Directives with a !DIR or CDIR prefix.

mic

Directives with a !MIC or CMIC prefix.

directive

One or more directives. If specifying more than one, separate them with commas, as follows: -x ORDERED,"ASSERT NOARGUMENTALIASING".

--

Separates options and file names. This option, which consists of two dashes, signifies the end of the options. After this symbol, you can specify the files to be processed.

file.suffix[90][file.suffix[90]...]

File or files to be processed, where suffix is either an uppercase F or a lowercase f for source files. Files ending in .i, .o, and .s are also accepted. The Fortran source files are compiled, and an executable object file is produced.

The default name of the executable object file is a.out. For example, the following command line produces a.out:

% f90 myprog.f

By default, several files are created during processing. The MIPSpro 7 Fortran 90 compiler can add a suffix to the file portion of the file name and write the files it creates to your working directory.

The following is a file summary:

File

Content

a.out

Executable output file.

file.a

Object file archive.

file.B

Intermediate file written by the front end of the compiler. To retain this file, specify the -keep option. For more information on the -keep option, see “-keep”.

file.cfb

Feedback file for use with performance tools.

file.f or file.F

Input Fortran source file in fixed source form. If file ends in .F, the source preprocessor is invoked. For more information on preprocessing, see Chapter 7, “Source Preprocessing”.

file.f90 or file.F90

Input Fortran source file in free source form. If file ends in .F90, the source preprocessor is invoked. For more information on preprocessing, see Chapter 7, “Source Preprocessing”.

file.i

File generated by the source preprocessor. To retain this file, specify the -P option. For more information on the -P option, see “-P”.

file.l

Assembler listing file. To retain this file, specify the -LIST option. For more information on the -LIST option, see “-LIST:...”.

file.list

APO listing file. To retain this file, specify the -apolist option. For more information on the -apolist option, see “-apo, -apokeep, -apolist”.

file.L

Listing file containing a cross reference and a source listing. To retain this file, specify the -listing option. For more information on the -listing option, see “-listing”.

file.mod

Module file.

file.o

Object file.

The implementation of alternate returns is not compatible between the MIPSpro Fortran 77 and the MIPSpro 7 Fortran 90 compilers. If the files use alternate returns, you cannot specify file.o files from both the MIPSpro Fortran 77 and the MIPSpro 7 Fortran 90 compilers as input files on the f90(1) command line.

file.s

Assembly language file. To retain this file, specify the -S option. For information on the -S option, see “-S”.

file.so

Dynamic Shared Object (DSO) library.

f90sigfpe.h

Floating-point exception-handling include file.
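The suffix rules for input files in the summary above can be condensed into a small lookup. The following is a sketch only; the function name classify_source and the wording of its output are hypothetical, and it covers just the suffixes this section describes as input files:

```shell
# Hypothetical helper: report how an input file would be treated, based only
# on the suffix rules in the file summary above. It does not invoke f90.
classify_source() {
    case $1 in
        *.f)   echo "fixed form" ;;
        *.F)   echo "fixed form, preprocessed" ;;
        *.f90) echo "free form" ;;
        *.F90) echo "free form, preprocessed" ;;
        *.i|*.o|*.s) echo "accepted non-source input" ;;
        *)     echo "not covered by this sketch" ;;
    esac
}

classify_source myprog.f     # prints: fixed form
classify_source solver.F90   # prints: free form, preprocessed
```

An uppercase suffix is the only difference between a file that goes through the source preprocessor and one that does not, so renaming myprog.f to myprog.F changes how it is processed.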