Chapter 8. Miscellaneous Commands

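This chapter describes additional SpeedShop commands that are useful in analyzing application performance. It contains the following sections:

  • Using the thrash Command

  • Using the squeeze Command

  • Calculating the Working Set of a Program

  • Combining Multiple Experiment Files into One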

Using the thrash Command

The thrash command allows you to explore paging behavior by allocating a region of virtual memory and accessing that memory either randomly or sequentially.

thrash Syntax

The syntax for the thrash(1) command is as follows:

thrash args [-n count] [-s] [-w time]
  • args can be one of the following options:

    • -k n: the amount of memory to access in kilobytes, where n is the number of kilobytes. The minimum value for n is the size of one page; a smaller value is adjusted accordingly.

    • -m n: the amount of memory to access in megabytes, where n is the number of megabytes.

    • -p n: the amount of memory to access in pages, where n is the number of pages.

  • -n count: the number of references to make before exiting. The default is 10,000.

  • -s: sequential thrashing. The default is random.

  • -w time: an integer amount of time, in seconds, thrash should sleep after thrashing but before exiting. The default is 0 seconds.
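
The difference between the random and sequential modes can be pictured with a small model. The following Python sketch is an illustration only and is not the implementation of thrash: the function name touch_pages, the 4096-byte page size, and the choice of one reference per page are all assumptions made for the example. It models only the order in which pages are referenced, not real paging.

```python
import random

PAGE_SIZE = 4096  # assumption: bytes per page; the real value is system-dependent


def touch_pages(n_pages, count=10_000, sequential=False, seed=None):
    """Reference one byte per iteration in a buffer of n_pages pages.

    Returns the list of page indices referenced, in order.
    """
    buf = bytearray(n_pages * PAGE_SIZE)   # the region being exercised
    rng = random.Random(seed)              # seed makes the random order repeatable
    touched = []
    for i in range(count):
        # Sequential mode walks the pages in order and wraps around;
        # random mode picks any page with equal probability.
        page = i % n_pages if sequential else rng.randrange(n_pages)
        buf[page * PAGE_SIZE] ^= 1         # write one byte on that page
        touched.append(page)
    return touched


# Six sequential references over four pages wrap back to page 0.
order = touch_pages(4, count=6, sequential=True)   # [0, 1, 2, 3, 0, 1]
```

Sequential references revisit a page only after every other page has been touched, while random references can return to a page at any time.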

Effects of thrash

After the memory is allocated, thrash prints a message on stdout that states how much memory it is using, and then proceeds to access that memory. The following is an example:

% thrash -m 4

thrashing randomly: 4.00 MB (= 0x00400000 = 4194304 bytes = 1024 pages)

        10000 iterations
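
The figures in the message are the same quantity expressed in different units. The following Python sketch reproduces that arithmetic. The names region_size and describe are hypothetical, and the 4096-byte page size is an assumption inferred from the example (4194304 bytes / 1024 pages); the page size on your system may differ.

```python
PAGE_SIZE = 4096  # assumption inferred from the example: 4194304 bytes / 1024 pages

# Bytes per unit for each size option described in the syntax section
UNIT_BYTES = {"-k": 1024, "-m": 1024 * 1024, "-p": PAGE_SIZE}


def region_size(option, n):
    """Return (bytes, pages) for a size option such as ("-m", 4)."""
    nbytes = UNIT_BYTES[option] * n
    # Assumption: a request smaller than one page is raised to one page.
    nbytes = max(nbytes, PAGE_SIZE)
    # The text does not say how partial pages are rounded; this sketch rounds down.
    return nbytes, nbytes // PAGE_SIZE


def describe(option, n):
    """Format the size in the same style as the example message."""
    nbytes, pages = region_size(option, n)
    mb = nbytes / (1024 * 1024)
    return f"{mb:.2f} MB (= 0x{nbytes:08x} = {nbytes} bytes = {pages} pages)"


line = describe("-m", 4)
# line == "4.00 MB (= 0x00400000 = 4194304 bytes = 1024 pages)"
```

Under the same assumption, -p 1024 and -k 4096 describe the same region as -m 4.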

You can use thrash in conjunction with ssusage(1) and squeeze(1) to determine the approximate available working memory on a system, as described in “Calculating the Working Set of a Program”.

Using the squeeze Command

The squeeze command lets you specify an amount of virtual memory to lock down into real memory, thus making it unavailable to other processes. This command can be used only in superuser mode.

squeeze Syntax

The syntax for the squeeze(1) command is as follows:

squeeze [unit] amount

The following arguments are used:

  • unit: can be one of the following options indicating the unit of measure. If no option is specified, the default is megabytes.

    • -k: kilobytes

    • -m: megabytes

    • -p: pages

    • -%: a percentage of the installed memory

  • amount: the amount of memory to be locked.

Effects of squeeze

The squeeze(1) command performs the following operations:

  • Locks down the amount of virtual memory you supply as an argument to the command.

  • Prints a message to stdout that provides information on how much memory has been locked and how much working memory is available.

  • Sleeps indefinitely, or until interrupted by SIGINT or SIGTERM. At that time, it frees up the memory and exits with an exit message.

Wait until after the exit message is printed before doing any experiments.

Here is an example that locks down 4 megabytes of memory:

% squeeze 4
squeeze: leaving 60.00 MB ( = 0x03c01000 = 62918656 ) available memory;
         pinned 4.00 MB ( = 0x00400000 = 4194304 ) at address 0x1000e000;
         from 64.00 MB ( = 0x04001000 = 67112960 ) installed memory.

Use Ctrl-C to exit squeeze. The following message is printed:

squeeze exiting
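
The three figures in the squeeze message are related by a simple subtraction: available memory is installed memory minus pinned memory. The following Python sketch checks that relationship against the example output. The helper names fmt and squeeze_report are hypothetical, and the sketch models only the arithmetic, not the locking itself.

```python
MB = 1024 * 1024


def fmt(nbytes):
    """Format a byte count in the same style as the example output."""
    return f"{nbytes / MB:.2f} MB ( = 0x{nbytes:08x} = {nbytes} )"


def squeeze_report(installed, pinned):
    """Return (available_bytes, message_lines) for the given byte counts."""
    available = installed - pinned
    return available, [
        f"leaving {fmt(available)} available memory",
        f"pinned {fmt(pinned)}",
        f"from {fmt(installed)} installed memory",
    ]


# Values taken from the example: 0x04001000 bytes installed, 4 MB pinned.
available, lines = squeeze_report(installed=0x04001000, pinned=4 * MB)
# available == 62918656, which is the 0x03c01000 shown in the example
```

Note that the installed figure in the example is one page (0x1000 bytes) more than exactly 64 MB, which is why the available figure is also slightly more than 60 MB.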

Calculating the Working Set of a Program

You can use the thrash, squeeze, and ssusage commands together to determine the approximate working set of a program. For all practical purposes, the working set of your program is the size of memory allocated. The following procedure assumes that you are running on a system that is either stand-alone or where the environment will not change while you are running these tests.

Procedure 8-1. Calculating the Working Set

  1. Determine the working set of the kernel and other applications:

    • Choose a machine that has a large amount of physical memory (enough to allow your target application to run without any paging other than at startup).

    • Make sure that the machine is running a minimal number of applications that will remain fairly consistent for the duration of these steps.

    • Run thrash with ssusage to determine the working set of the kernel and any other applications you have running.

      In this example, the thrash command uses 4 MB of memory:

      % ssusage thrash -m 4

      When the thrash command completes, ssusage prints the resource usage of thrash. The value labeled majf gives the number of major page faults (that is, the number of faults that required a physical I/O). When you run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run. For more information on ssusage, see Chapter 5, “Collecting Data on Machine Resource Usage”.

    • As superuser in a separate window, run the squeeze command to lock down an amount of memory.

    • Rerun thrash with ssusage, as shown here:

      % ssusage thrash -m 4

    • Repeat the previous two steps, increasing the amount of memory for squeeze, until the majf number begins to rise.

      The amount of available working memory that squeeze reports when the page faults for thrash begin to rise is the combined working set of thrash (approximately 4 MB), the kernel, and any other applications you have running.

    • Deduct the 4 MB that thrash uses from the amount of working memory reported by squeeze at the point the page faults began to rise.

      This computation helps you find the approximate base working set of the kernel and any other applications that are running on the machine. You will need this number when you reach the next steps.

  2. Determine the combined working set of your program, the kernel, and other applications:

    • Make sure that the machine is running the same applications as it was in step 1.

    • Run ssusage with your program to ensure that the machine has the amount of memory your program needs.

      % ssusage prog_name

      When your program exits, ssusage prints the application's resource usage. The majf field gives the number of major page faults. When run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run.

    • In another window, become superuser.

    • In this new window, run squeeze to lock down an amount of memory. The following example locks down 15 megabytes of memory:

      % squeeze 15

    • In the first window, rerun your program with ssusage.

    • In the second window, where squeeze is running, press Ctrl-C to exit squeeze.

    • Repeat these steps, using squeeze to lock down increasing amounts of memory until the majf number begins to rise.

    • Deduct the amount squeezed at the point at which the application begins to page fault from the total amount of physical memory in the system. This computation determines the combined working set of your program, the kernel, and any other applications you have running.

  3. Calculate the working set size of your program.

    Deduct the base working set calculated at the end of step 1 from the combined working set size calculated at the end of step 2. This computation determines the approximate working set of your program.
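
The arithmetic in this procedure can be sketched as follows. This Python example is a minimal illustration, not part of SpeedShop: the function names are hypothetical, and every measurement (the 64 MB machine, the squeeze amounts, and the majf counts) is invented for the example. It also assumes that the available memory reported by squeeze equals installed memory minus the amount squeezed, as in the squeeze example earlier in this chapter.

```python
def first_rise(runs, baseline):
    """Return the squeezed amount (MB) of the first run whose majf exceeds baseline.

    runs is a list of (squeezed_mb, majf) pairs in increasing order of
    squeezed_mb. Returns None if majf never rises above the baseline.
    """
    for squeezed_mb, majf in runs:
        if majf > baseline:
            return squeezed_mb
    return None


def base_working_set(available_at_rise_mb, thrash_mb):
    """Step 1: memory left available when thrash starts faulting, minus thrash's size."""
    return available_at_rise_mb - thrash_mb


def combined_working_set(installed_mb, squeezed_at_rise_mb):
    """Step 2: installed memory minus the amount squeezed when the program starts faulting."""
    return installed_mb - squeezed_at_rise_mb


def program_working_set(combined_mb, base_mb):
    """Step 3: combined working set minus the base working set."""
    return combined_mb - base_mb


# Hypothetical measurements (not from a real run).
INSTALLED_MB = 64
THRASH_MB = 4

# Step 1: (squeezed MB, majf reported by ssusage for "thrash -m 4")
step1_runs = [(40, 12), (44, 12), (48, 12), (52, 310)]
squeezed1 = first_rise(step1_runs, baseline=12)                # 52
base = base_working_set(INSTALLED_MB - squeezed1, THRASH_MB)   # (64 - 52) - 4 = 8

# Step 2: (squeezed MB, majf reported by ssusage for the target program)
step2_runs = [(15, 20), (25, 20), (35, 20), (40, 450)]
squeezed2 = first_rise(step2_runs, baseline=20)                # 40
combined = combined_working_set(INSTALLED_MB, squeezed2)       # 64 - 40 = 24

program = program_working_set(combined, base)                  # 24 - 8 = 16
```

The result is only as precise as the increments used for squeeze. Smaller increments near the point where majf begins to rise give a tighter estimate.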

Combining Multiple Experiment Files into One

The ssaggregate(1) command lets you combine the data from two or more experiment files of the same experiment type (such as bbcounts) into a single file. You can then view the new file with either prof(1) or the WorkShop performance analyzer, cvperf(1).

The ssaggregate command takes the following form:

ssaggregate -e files [-noverbose] -o output_file

The following example combines two pcsamp experiments into a single file and displays the file with prof:

% ssaggregate -e generic.pcsamp.f14636 generic.pcsamp.f14635 -o combo
% prof combo

The output from prof is as follows:

-------------------------------------------------------------------------
SpeedShop profile listing generated Tue Nov 24 11:30:03 1998
   prof combo
                 generic (n32): Target program
                        pcsamp: Experiment name
               pc,2,10000,0:cu: Marching orders
                 R5000 / R5000: CPU / FPU
                             1: Number of CPUs
                           180: Clock frequency (MHz.)
  Experiment notes--
          From file combo:
        Caliper point 0 at target begin, PID 14635
                        /home/saffron02/speedshop/c/generic ll.u.cvt.d.i.f.dso ll.u.cvt.d.i.f.dso ll.u.cvt.d.i.f.dso
        Caliper point 0 at target begin, PID 14636
                        /home/saffron02/speedshop/c/generic ll.u.cvt.d.i.f.dso ll.u.cvt.d.i.f.dso ll.u.cvt.d.i.f.dso
        Caliper point 1 at exit(0)
-------------------------------------------------------------------------
Summary of statistical PC sampling data (pcsamp)--
                          4012: Total samples
                        40.120: Accumulated time (secs.)
                          10.0: Time per sample (msecs.)
                             2: Sample bin width (bytes)
-------------------------------------------------------------------------
Function list, in descending order by time
-------------------------------------------------------------------------
 [index]      secs    %    cum.%   samples  function (dso: file, line)

     [1]    37.480  93.4%  93.4%      3748  anneal (generic: generic.c, 1573)
     [2]     1.450   3.6%  97.0%       145  slaveusrtime (dlslave.so: dlslave.c, 22)
     [3]     0.490   1.2%  98.3%        49  _read (libc.so.1: read.s, 15)
     [4]     0.330   0.8%  99.1%        33  _xstat (libc.so.1: xstat.s, 12)
     [5]     0.300   0.7%  99.8%        30  cvttrap (generic: generic.c, 317)
     [6]     0.030   0.1%  99.9%         3  _write (libc.so.1: write.s, 15)
     [7]     0.010   0.0%  99.9%         1  fread (libc.so.1: fread.c, 27)
     [8]     0.010   0.0% 100.0%         1  _syscall (libc.so.1: syscall.s, 15)
             0.020   0.0% 100.0%         2  **OTHER** (includes excluded DSOs, rld, etc.)

            40.120 100.0% 100.0%      4012  TOTAL
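
The columns of the listing can be derived from the sample counts and the summary values. The following Python sketch recomputes the first three rows. It is an illustration of the arithmetic under the assumption that prof accumulates the cumulative percentage before rounding; it is not prof's implementation.

```python
TIME_PER_SAMPLE_MS = 10.0   # "Time per sample (msecs.)" in the summary
TOTAL_SAMPLES = 4012        # "Total samples" in the summary

# (function, samples) pairs copied from the first three rows of the listing
rows = [("anneal", 3748), ("slaveusrtime", 145), ("_read", 49)]


def secs(samples):
    """Time attributed to a function: samples times the sampling interval."""
    return samples * TIME_PER_SAMPLE_MS / 1000.0


def percent(samples):
    """Share of all samples, as a percentage."""
    return 100.0 * samples / TOTAL_SAMPLES


cumulative = 0.0
table = []
for name, samples in rows:
    cumulative += percent(samples)   # accumulate before rounding
    table.append(
        (name, f"{secs(samples):.3f}", f"{percent(samples):.1f}%", f"{cumulative:.1f}%")
    )

total_secs = f"{secs(TOTAL_SAMPLES):.3f}"   # "40.120", the accumulated time
```

Accumulating before rounding explains why the third row shows 98.3% in the cum.% column even though the rounded values 93.4, 3.6, and 1.2 add up to 98.2.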

By default, ssaggregate issues periodic status messages while it is processing. The -noverbose option turns the status messages off. For more information, see the ssaggregate(1) man page.
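
If you aggregate experiment files from a script, the command line can be assembled programmatically. The following Python sketch builds the argument list using only the options shown in this section (-e, -noverbose, and -o). The function name ssaggregate_argv is hypothetical, and the sketch does not run the command.

```python
def ssaggregate_argv(experiment_files, output_file, verbose=True):
    """Build the argument list for combining experiment files.

    Only the options described in this section are used.
    """
    if len(experiment_files) < 2:
        raise ValueError("ssaggregate combines two or more experiment files")
    argv = ["ssaggregate", "-e", *experiment_files]
    if not verbose:
        argv.append("-noverbose")   # turn off the periodic status messages
    argv += ["-o", output_file]
    return argv


argv = ssaggregate_argv(
    ["generic.pcsamp.f14636", "generic.pcsamp.f14635"], "combo", verbose=False
)
# On a system where SpeedShop is installed, the list could be passed to
# subprocess.run(argv, check=True).
```

The sketch does not check that the files are of the same experiment type; that requirement still applies.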