Chapter 1. Introduction

This document is a guide to the SGI compilers, the compiling tools, and the documentation for those products. It gives an overview of the compilers and of the performance tools used with them, and briefly describes the documentation available for each of these SGI products.

The SGI compilers include the FORTRAN 77 compiler, the Fortran 90 compiler, the C compiler, the C++ compiler, and the Assembler.

Compiling tools include the WorkShop suite of tools (Debugger, Performance Analyzer, Static Analyzer, ProMP, and Tester) as well as dbx and SpeedShop.

This book discusses the following topics:

There are three “versions” of Fortran and C/C++ compilers in use at SGI.

Chapter 2, “Compilers and Compiler Documentation”, discusses these different compiling systems and the documentation that supports those systems.

Sources of Performance Problems

To tune a program's performance, you must first determine where machine resources are being used. At any point in a process's execution, one limiting resource controls the speed of execution. A process can be slowed down by any of the following:

  • CPU speed and availability: a CPU-bound process spends its time executing in the CPU and is limited by CPU speed and availability. To improve the performance of CPU-bound processes, you may need to streamline your code. This can entail modifying or replacing algorithms, reordering code to avoid interlocks, removing nonessential steps, or blocking to keep data in cache and registers.

  • I/O processing: an I/O-bound process has to wait for input/output (I/O) to complete. I/O may be limited by disk access speeds or memory caching. To improve the performance of I/O-bound processes, you can try one or more of the following techniques:

    • Improve overlap of I/O with computation

    • Optimize data usage to minimize disk access

    • Use data compression

  • Memory size and availability: a program that continually needs to swap pages of memory in and out is called memory-bound. Page thrashing often results from accessing virtual memory haphazardly rather than in a planned pattern; the same poor locality also produces cache misses. Insufficient memory bandwidth can also be the problem.

    To fix a memory-bound process, you can try to improve the memory reference patterns or, if possible, decrease the memory used by the program.

  • Bugs: you may find that a bug is causing the performance problem. For example, you may find that you are reading in the same file twice in different parts of the program, that floating-point exceptions are slowing down your program, that old code has not been completely removed, or that you are leaking memory (making malloc calls without the corresponding calls to free).

  • Performance phases: because programs exhibit different behavior during different phases of operation, you need to identify the limiting resource during each phase. A program can be I/O-bound while it reads in data, CPU-bound while it performs computation, and I/O-bound again in its final stage while it writes out data. Once you have identified the limiting resource in a phase, you can perform an in-depth analysis to find the problem. After you have solved that problem, you can check for other problems within the phase. Performance analysis is an iterative process.
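As a rough first look at which resource limits a phase, you can compare the CPU time a phase consumes with the wall-clock time it takes. The following C sketch is only an illustration and is not taken from the SGI tools: the names phase_time, phase_now, phase_since, and phase_classify are hypothetical, and the 0.8 threshold is an arbitrary assumption. It relies on the standard C clock function and the POSIX clock_gettime function.

```c
#define _POSIX_C_SOURCE 199309L  /* request clock_gettime from <time.h> */
#include <stdio.h>
#include <time.h>

/* Hypothetical record: wall-clock and CPU seconds for one measurement. */
typedef struct {
    double wall_s;  /* elapsed real time, in seconds */
    double cpu_s;   /* CPU time used by this process, in seconds */
} phase_time;

typedef enum { PHASE_CPU_BOUND, PHASE_WAITING } phase_kind;

/* Take a snapshot of both clocks. */
phase_time phase_now(void)
{
    struct timespec ts;
    phase_time t;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    t.wall_s = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    t.cpu_s = (double)clock() / CLOCKS_PER_SEC;
    return t;
}

/* Return the wall-clock and CPU time that have passed since 'start'. */
phase_time phase_since(phase_time start)
{
    phase_time now = phase_now();
    phase_time d;

    d.wall_s = now.wall_s - start.wall_s;
    d.cpu_s = now.cpu_s - start.cpu_s;
    return d;
}

/*
 * Assumption: if at least 80% of the elapsed time was spent on the CPU,
 * treat the phase as CPU-bound; otherwise the process was mostly waiting
 * (for example, on I/O or paging). The 0.8 cutoff is illustrative only.
 */
phase_kind phase_classify(phase_time d)
{
    if (d.wall_s > 0.0 && d.cpu_s / d.wall_s >= 0.8)
        return PHASE_CPU_BOUND;
    return PHASE_WAITING;
}

/* Print one line per phase so that phases can be compared. */
void phase_report(const char *name, phase_time d)
{
    printf("%-10s wall %.3f s, cpu %.3f s: %s\n", name, d.wall_s, d.cpu_s,
           phase_classify(d) == PHASE_CPU_BOUND ? "likely CPU-bound"
                                                : "likely waiting");
}
```

To use the sketch, call phase_now before a phase such as reading input, then pass the result of phase_since to phase_report when the phase ends. In a multithreaded program the CPU time can exceed the wall-clock time, so the ratio is only a hint about where to begin a detailed analysis.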

The documentation available for the compilers and the performance tools can help you pinpoint where these problems are occurring, and can help you determine how to make the necessary changes to improve program performance.
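As one illustration of such a change, the blocking technique mentioned for CPU-bound processes, which also improves memory reference patterns, can be sketched in C with a matrix transpose. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the tile size of 32 is an arbitrary choice that would need tuning for a particular cache.

```c
#include <stddef.h>

/* Assumed tile edge length; a suitable value depends on the cache size. */
enum { TILE = 32 };

/*
 * Straightforward transpose of an n-by-n row-major matrix. Writes to dst
 * stride through memory n elements at a time, which tends to miss in
 * cache once the matrix is large.
 */
void transpose_naive(size_t n, const double *src, double *dst)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/*
 * Blocked transpose: process the matrix one TILE-by-TILE block at a time
 * so that the source and destination blocks can stay in cache while they
 * are being used. The result is identical to transpose_naive.
 */
void transpose_blocked(size_t n, const double *src, double *dst)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < n; jj += TILE) {
            /* Clamp the block edges when n is not a multiple of TILE. */
            size_t i_end = (ii + TILE < n) ? ii + TILE : n;
            size_t j_end = (jj + TILE < n) ? jj + TILE : n;

            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions perform the same number of loads and stores; only the order of memory references differs. Whether the blocked version is faster on a given machine is something to confirm by measurement rather than assume.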