Chapter 9. Miscellaneous Commands

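This chapter describes SpeedShop commands for exploring memory usage and paging, and for printing data files generated by SpeedShop tools. It contains the following sections:

  • Using the thrash Command

  • Using the squeeze Command

  • Calculating the Working Set of a Program

  • Dumping Performance Data Files

  • Dumping Compiler Feedback Files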

Using the thrash Command

The thrash command lets you explore system paging behavior by allocating a region of virtual memory and then accessing that memory either randomly or sequentially.

thrash Syntax

thrash [args]

args 

One or more of the following flags:

-kN

The amount of memory to access in kilobytes, where N is the number of kilobytes.

-mN

The amount of memory to access in megabytes, where N is the number of megabytes.

-ncount

The number of references to make before exiting. The default is 10,000.

-pN

The amount of memory to access in pages, where N is the number of pages.

-s

Sequential thrashing. The default is random.

-wtime

The amount of time thrash should sleep after thrashing but before exiting.
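The access pattern that these flags control can be pictured with a small simulation. The following Python sketch is not the thrash source: it assumes a 4096-byte page, assumes that one reference touches one byte of one page, and uses hypothetical names (thrash_sketch, PAGE_SIZE) chosen for illustration only.

```python
import random

PAGE_SIZE = 4096  # assumption: bytes per page; the real value is system-dependent


def thrash_sketch(nbytes, count=10_000, sequential=False, seed=None):
    """Touch `count` pages of an nbytes buffer; return the page indices touched."""
    buf = bytearray(nbytes)               # the region of memory to walk over
    npages = max(1, nbytes // PAGE_SIZE)
    rng = random.Random(seed)
    touched = []
    for i in range(count):
        # Sequential mode walks the pages in order and wraps around;
        # random mode picks any page with equal probability.
        page = i % npages if sequential else rng.randrange(npages)
        buf[page * PAGE_SIZE] ^= 1        # one byte is enough to reference the page
        touched.append(page)
    return touched
```

With sequential=True the sketch visits pages 0, 1, 2, and so on in a cycle, which corresponds to the -s flag; the default mirrors the random mode, and count defaults to the 10,000 references described for -n.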


Effects of thrash

Once the memory is allocated, thrash prints a message on stdout saying how much memory it is using and then proceeds to thrash over it. Here's an example:

fraser 82% thrash -m 4
thrashing randomly: 4.00 MB (= 0x00400000 = 4194304 bytes = 1024 pages)
        10000 iterations
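The figures in that message are all the same quantity in different units. As a rough check, the Python sketch below converts a -k, -m, or -p value to bytes and rebuilds the size portion of the message. The 4096-byte page size is an assumption inferred from the example (4194304 bytes reported as 1024 pages), and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumption inferred from the example: 4194304 bytes / 1024 pages

# Bytes per unit for the -k, -m, and -p flags.
UNIT_BYTES = {"k": 1024, "m": 1024 * 1024, "p": PAGE_SIZE}


def region_bytes(flag, n):
    """Return the region size in bytes for a flag letter and its numeric value."""
    return n * UNIT_BYTES[flag]


def describe(nbytes):
    """Format a size the way the sample message for random thrashing shows it."""
    return "thrashing randomly: %.2f MB (= 0x%08x = %d bytes = %d pages)" % (
        nbytes / (1024 * 1024), nbytes, nbytes, nbytes // PAGE_SIZE)


print(describe(region_bytes("m", 4)))
```

Under these assumptions, -m 4, -k 4096, and -p 1024 all describe the same 4 MB region.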

You can use thrash in conjunction with ssusage and squeeze to determine the approximate available working memory on a system, as described in the section "Calculating the Working Set of a Program".

Using the squeeze Command

The squeeze command allows you to specify an amount of virtual memory to lock down into real memory, thus making it unavailable to other processes. This command can be used only by the superuser.

squeeze Syntax

squeeze [flag] amount

flag 

One of the following flags. If no flag is specified, the default is megabytes.

-k

Kilobytes

 

-m

Megabytes

-p

Pages

-%

A percentage of the installed memory


amount 

The amount of memory to be locked.

Effects of squeeze

squeeze performs the following operations:

  • Locks down the amount of virtual memory you supply as an argument to the command.

  • Prints a message to stdout that provides information on how much memory has been locked, and how much working memory is available.

  • Sleeps indefinitely, or until interrupted by SIGINT or SIGTERM. At that time, it frees up the memory and exits with an exit message.

After you interrupt squeeze, wait until the exit message is printed before doing any further experiments.

Here's an example:

fraser 1# squeeze 4
squeeze: leaving 60.00 MB ( = 0x03c01000 = 62918656 ) available memory;
          pinned 4.00 MB ( = 0x00400000 = 4194304 ) at address 0x1000e000;
         from 64.00 MB ( = 0x04001000 = 67112960 ) installed memory.

Use Ctrl-C to exit squeeze. The following message is printed:

squeeze exiting
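The three figures that squeeze reports are related by simple subtraction: available memory is installed memory minus pinned memory. The Python sketch below reproduces that arithmetic for the example above. The 4096-byte page size and the helper names are assumptions made for illustration, and the handling of -% is only a guess at how a percentage of installed memory might be rounded.

```python
PAGE_SIZE = 4096        # assumption: bytes per page on the example machine
MB = 1024 * 1024


def pinned_bytes(amount, flag="m", installed=None):
    """Convert a squeeze amount to bytes; megabytes is the default unit."""
    if flag == "k":
        return amount * 1024
    if flag == "m":
        return amount * MB
    if flag == "p":
        return amount * PAGE_SIZE
    if flag == "%":
        if installed is None:
            raise ValueError("a percentage needs the installed memory size")
        return installed * amount // 100   # assumption: rounded down
    raise ValueError("unknown flag: %r" % flag)


def available_bytes(installed, pinned):
    """Memory left for other processes once `pinned` bytes are locked."""
    return installed - pinned


installed = 67112960                       # installed memory from the example
left = available_bytes(installed, pinned_bytes(4))
print("%.2f MB (= 0x%08x = %d)" % (left / MB, left, left))
```

For the example, 67112960 - 4194304 = 62918656 bytes, which matches the 60.00 MB that squeeze reports as available.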

Calculating the Working Set of a Program

You can use the thrash, squeeze, and ssusage commands together to determine the approximate working set of a program as follows. For all practical purposes, the working set of your program is the size of memory allocated.

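The process has three phases. First, you determine the working set of the kernel and other applications (steps 1 through 7). Next, you determine the combined working set of your program, the kernel, and other applications (steps 8 through 14). Finally, you compute the working set of your program from those two figures (step 15).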

  1. Choose a machine that has a large amount of physical memory (enough to allow your target application to run without any paging other than at start-up).

  2. Make sure that the machine is running a minimal number of applications that will remain fairly consistent for the duration of these steps.

  3. Run thrash with ssusage to determine the working set of the kernel and any other applications you have running.

    In this example, the thrash command uses 4 MB of memory:

    ssusage thrash -m 4

    When the thrash command completes, ssusage prints the resource usage of thrash. The value labelled majf gives the number of major page faults (that is, the number of faults that required a physical read). When you run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run. For more information on ssusage, see Chapter 5, "Collecting Data on Machine Resource Usage."

  4. As superuser in a separate window, run the squeeze command to lock down an amount of memory.

  5. Rerun thrash with ssusage:

    ssusage thrash -m 4

  6. Repeat steps 4 and 5, increasing the amount of memory for squeeze, until the majf number begins to rise.

    The amount of working memory available reported by squeeze at the point at which page faults begin to rise for thrash tells you the combined working set of thrash (approximately 4 MB), the kernel and any other applications you have running.

  7. Deduct the 4 MB that thrash uses from the amount of working memory reported by squeeze at the point the page faults began to rise.

    This computation helps you find out the approximate working set of the kernel and any other applications that are running on the machine. You'll need this number when you reach the next steps.

  8. Determine the working set of the program you're interested in. Make sure the applications that the machine is running remain consistent with the setup from step 2.

  9. Run ssusage with your program to ensure that the machine has the amount of memory your program needs.

    ssusage prog_name

    When your program exits, ssusage prints the application's resource usage: the majf field gives the number of major page faults. When run on a machine with a large amount of physical memory, this value is the number of faults needed to start the program, which is the minimum number for any run.

  10. Switch to superuser.

  11. Run squeeze to lock down an amount of memory. The following example locks down 15 megabytes of memory:

    squeeze 15

  12. Rerun your program with ssusage.

  13. Repeat steps 11 and 12 until the majf number begins to rise.

  14. Deduct the amount squeezed at the point at which the application begins to page fault from the total amount of physical memory in the system.

    This computation determines the combined working set of your program, the kernel and any other applications you have running.

  15. Deduct the working set calculated in step 7 from the combined working set calculated in step 14.

    This computation determines the approximate working set of your program.
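The arithmetic behind the procedure can be sketched in a few lines of Python. The sketch reads the final step as subtracting the step 7 figure (kernel and other applications) from the combined working set of step 14. All run data below are hypothetical numbers invented for illustration, as are the function names; real values come from your own ssusage and squeeze runs.

```python
def knee(runs, baseline_majf):
    """Return the first squeezed amount (MB) at which majf exceeds the baseline."""
    for squeezed_mb, majf in runs:
        if majf > baseline_majf:
            return squeezed_mb
    return None


def program_working_set(physical_mb, thrash_mb, thrash_knee_mb, program_knee_mb):
    # Steps 6 and 7: memory still available at the thrash knee, minus thrash itself,
    # approximates the working set of the kernel and other applications.
    base_ws = (physical_mb - thrash_knee_mb) - thrash_mb
    # Step 14: memory still available at the program knee is the combined working set.
    combined_ws = physical_mb - program_knee_mb
    # Step 15: what remains is attributed to the program.
    return combined_ws - base_ws


# Hypothetical (squeezed MB, majf) pairs on a 64 MB machine.
thrash_runs = [(0, 5), (20, 5), (40, 5), (44, 5), (48, 310)]
program_runs = [(0, 12), (15, 12), (25, 12), (30, 95)]

t_knee = knee(thrash_runs, baseline_majf=5)     # 48
p_knee = knee(program_runs, baseline_majf=12)   # 30
print(program_working_set(64, 4, t_knee, p_knee))
```

With these made-up numbers, the kernel and other applications account for (64 - 48) - 4 = 12 MB, the combined working set is 64 - 30 = 34 MB, and the program's approximate working set is 22 MB.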

Dumping Performance Data Files

All the performance data for a single process is in one file. The file begins with a prologue and continues with a mixture of performance data, sample records, and control records.

The ssdump command can be used for printing performance data files. It provides a formatted ASCII dump of one or more performance experiment data files. This command is most likely to be useful in verifying performance data that does not seem accurate when reported through prof.

ssdump Syntax

ssdump [options] {datafile1 ... datafileN} ...

options 

Zero or more of the following print options:

-d

Prints detailed information for each bead. For compressed beads, the compressed form will be dumped.

 

-D

Prints detailed information for each bead. For compressed beads, the uncompressed form will be dumped.

 

-h

Prints the hex contents of the body of each bead.

-iindex

Prints only one bead at index in the file.

-q

Suppresses printing of those fields that will normally change from run to run such as process IDs and time stamps. This option is useful for QA work, to enable automatic comparisons of recorded experiments.

-soffset

Prints only one bead at offset into the file.
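As an illustration of the automatic comparison that -q is meant to support, the Python sketch below compares two dumps that have already been captured as text, for example by redirecting the output of ssdump -q to a file for each experiment. It uses only the standard difflib module; the function name and the sample strings are hypothetical.

```python
import difflib


def dump_differences(dump_a, dump_b):
    """Return the lines that differ between two ssdump -q text dumps."""
    a = dump_a.splitlines()
    b = dump_b.splitlines()
    diff = difflib.unified_diff(a, b, "run_a", "run_b", lineterm="", n=0)
    # Keep only added and removed lines; drop the file headers and hunk markers.
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]


same = dump_differences("Experiment name\n  pcsamp", "Experiment name\n  pcsamp")
changed = dump_differences("Experiment name\n  pcsamp", "Experiment name\n  usertime")
print(same, changed)
```

An empty result suggests that the two recordings match in every field that -q leaves in the dump.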


Experiment File Format

The file is written as a string of "beads," each of which is a record with

  • a 32-bit type

  • a 32-bit byte count

  • a body whose length is given by the byte-count, rounded up to a double-word boundary
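A reader for this layout might look like the following Python sketch. It assumes that the two header fields are big-endian and that a double word is 8 bytes; neither detail is stated above, so treat both as assumptions, along with the function names. The sample data at the end is synthetic, not a real experiment file.

```python
import struct

HEADER = struct.Struct(">II")  # assumption: big-endian 32-bit type, 32-bit byte count


def round_up8(n):
    """Round n up to the next multiple of 8 (assumed double-word size)."""
    return (n + 7) & ~7


def walk_beads(data):
    """Yield (offset, type, body) for each bead in a bytes object."""
    offset = 0
    while offset + HEADER.size <= len(data):
        bead_type, count = HEADER.unpack_from(data, offset)
        body_start = offset + HEADER.size
        yield offset, bead_type, data[body_start:body_start + count]
        # The next bead starts after the body, padded to a double-word boundary.
        offset = body_start + round_up8(count)


# Synthetic data: a 5-byte body (padded to 8) followed by an 8-byte body.
sample = (HEADER.pack(1, 5) + b"hello" + b"\0" * 3 +
          HEADER.pack(2, 8) + b"12345678")
beads = list(walk_beads(sample))
print(beads)
```

Because each body is padded, a reader can skip beads whose type it does not understand by using the byte count alone.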

The file prologue consists of these beads:

  • file-identifier bead, which acts as a magic number, indicating that the file is a SpeedShop data file

  • machine and executable name

  • hardware inventory describing the machine

  • machine page size

  • O/S revision, date, and checksum information about the executable

  • target name (the target is the executable after instrumentation)

  • arguments with which the target was invoked

  • instrumentation performed

  • types of performance data that are to be recorded in the remainder of the file

The following example calls ssdump on performance data for a pcsamp experiment:

ssdump generic.pcsamp.m847

Below is some partial output from ssdump. The format has been adjusted slightly to meet presentation needs.

Printing experiment record file "generic.pcsamp.m847" (2688 bytes), last written on Tue  15 Apr 1997  15:27:02
SpeedShop File Preface                 1, offset 0 = 0x00000000 (size 32)
          file type 1 (SSRUN); version 4
          process control flags: 0xd
                    _SPEEDSHOP_TRACE_FORK=True
                    _SPEEDSHOP_TRACE_FORK_TO_EXEC=False
                    _SPEEDSHOP_TRACE_SPROC=True
                    _SPEEDSHOP_TRACE_EXEC=True
                    _SPEEDSHOP_TRACE_SYSTEM=False
          ancestor exp file name:
          created: Tue  15 Apr 1997  15:26:10.719
Hardware Inventory                     2, offset 40 = 0x00000028 (size 280)
          hardware inventory: 17 items
          class   1, type   1, contrlr 100, unit 255, state 12
          class   1, type   3, contrlr   0, unit   0, state 8192
          class   1, type   2, contrlr   0, unit   0, state 8208
          class   4, type   8, contrlr   0, unit   0, state 2
          class   5, type   5, contrlr   0, unit   0, state 1
          class   3, type   3, contrlr   0, unit   0, state 16384
          class   3, type   4, contrlr   0, unit   0, state 16384
          class   3, type   9, contrlr   0, unit   0, state 64
          class   3, type   1, contrlr   0, unit   0, state 67108864
          class  12, type   3, contrlr   0, unit   0, state 16
          class   8, type   7, contrlr  17, unit   0, state 16777472
          class  10, type   3, contrlr   0, unit   0, state 16400
          class   8, type   0, contrlr   0, unit   0, state 1
          class   2, type   1, contrlr   0, unit  13, state 2
          class   2, type   2, contrlr   0, unit   2, state 0
          class   2, type   2, contrlr   0, unit   1, state 0
          class   7, type  14, contrlr   0, unit   0, state 0
 
Experiment name                        3, offset 328 = 0x00000148 (size 8)
          pcsamp
 
Experiment marching orders             4, offset 344 = 0x00000158 (size 16)
          pc,2,10000,0:cu
 
Capture module symbol                  5, offset 368 = 0x00000170 (size 16)
          pc,2,10000,0
 
Capture module symbol                  6, offset 392 = 0x00000188 (size 8)
          cu
 
Executable file                        7, offset 408 = 0x00000198 (size 8)
          generic
  
Target file                            8, offset 424 = 0x000001a8 (size 8)
          generic
 
Target arguments                       9, offset 440 = 0x000001b8 (size 32)
          Time: Tue  15 Apr 1997  15:26:10.719,  process pid = 847
          arguments: ""
Target begin                          10, offset 480 = 0x000001e0 (size 40)
          process # -1, pid = 847, event # 0
          event type = 0,0
                    at time = Tue  15 Apr 1997  15:26:10.719
Program Object List                   11, offset 528 = 0x00000210 (size 312)
          process # -1, pid = 847, event # 0, -- 5 DSOs
          Program Object 0, Named `generic'
               Link Time Address: 0x0000000010000000
                Run Time Address: 0x0000000010000000
                            Size: 0x0000000000007000 (28672)
                    Base Pointer: 0x0000000000000000
 
          Program Object 1, Named `/usr/lib32/libss.so'
               Link Time Address: 0x0000000009e50000
                Run Time Address: 0x0000000009e50000
                            Size: 0x0000000000002000 (8192)
                    Base Pointer: 0x0000000000000000
 
          Program Object 2, Named `/usr/lib32/libssrt.so'
               Link Time Address: 0x0000000009da0000
                Run Time Address: 0x0000000009da0000
                            Size: 0x000000000008b000 (569344)
                    Base Pointer: 0x0000000000000000
 
          Program Object 3, Named `/usr/lib32/libm.so'
               Link Time Address: 0x000000000f840000
                Run Time Address: 0x000000000f840000
                            Size: 0x0000000000028000 (163840)
                    Base Pointer: 0x0000000000000000
 
          Program Object 4, Named `/usr/lib32/libc.so.1'
               Link Time Address: 0x000000000fa00000
                Run Time Address: 0x000000000fa00000
                            Size: 0x0000000000108000 (1081344)
                    Base Pointer: 0x0000000000000000
 
 
Target DSO open                       12, offset 848 = 0x00000350 (size 56)
          process # -1, pid = 847, event # 0
                    at time = Tue  15 Apr 1997  15:27:00.716
          fname = ./dlslave.so
Program Object List                   13, offset 912 = 0x00000390 (size 360)
          process # -1, pid = 847, event # 0, -- 6 DSOs
          Program Object 0, Named `generic'
               Link Time Address: 0x0000000010000000
                Run Time Address: 0x0000000010000000
                            Size: 0x0000000000007000 (28672)
                    Base Pointer: 0x0000000000000000
 
          Program Object 1, Named `/usr/lib32/libss.so'
               Link Time Address: 0x0000000009e50000
                Run Time Address: 0x0000000009e50000
                            Size: 0x0000000000002000 (8192)
                    Base Pointer: 0x0000000000000000
 
          Program Object 2, Named `/usr/lib32/libssrt.so'
               Link Time Address: 0x0000000009da0000
                Run Time Address: 0x0000000009da0000
                            Size: 0x000000000008b000 (569344)
                    Base Pointer: 0x0000000000000000
 
          Program Object 3, Named `/usr/lib32/libm.so'
               Link Time Address: 0x000000000f840000
                Run Time Address: 0x000000000f840000
                            Size: 0x0000000000028000 (163840)
                    Base Pointer: 0x0000000000000000
 
          Program Object 4, Named `/usr/lib32/libc.so.1'
               Link Time Address: 0x000000000fa00000
                Run Time Address: 0x000000000fa00000
                            Size: 0x0000000000108000 (1081344)
                    Base Pointer: 0x0000000000000000
 
          Program Object 5, Named `./dlslave.so'
               Link Time Address: 0x000000005ffe0000
                Run Time Address: 0x000000005ffe0000
                            Size: 0x0000000000001000 (4096)
                    Base Pointer: 0x0000000000000000
 
Sample event trigger                  14, offset 1280 = 0x00000500 (size 40)
          process # -1, trap index # -1
                    at time = Tue  15 Apr 1997  15:27:01.989, #-1
 
Compressed PC sampling array (16-bit)   15, offset 1328 = 0x00000530 (size 320)
          compressed short array, dso index = 0, array size = 7168, 156
          compressed
 
Compressed PC sampling array (16-bit)   16, offset 1656 = 0x00000678 (size 16)
          compressed short array, dso index = 1, array size = 2048, 4 compressed
 
Compressed PC sampling array (16-bit)   17, offset 1680 = 0x00000690 (size 40)
          compressed short array, dso index = 2, array size = 142336, 16
          compressed
 
Compressed PC sampling array (16-bit)   18, offset 1728 = 0x000006c0 (size 16)
          compressed short array, dso index = 3, array size = 40960, 4 compressed
 
Compressed PC sampling array (16-bit)   19, offset 1752 = 0x000006d8 (size 64)
          compressed short array, dso index = 4, array size = 270336, 28
          compressed
 
Compressed PC sampling array (16-bit)   20, offset 1824 = 0x00000720 (size 48)
          compressed short array, dso index = 5, array size = 1024, 20 compressed
 
PC sampling array (16-bit)            21, offset 1880 = 0x00000758 (size 16)
          short array, dso index = -1, array size = 1
 
Resource usage                        22, offset 1904 = 0x00000770 (size 680)
 
Sample data end marker                23, offset 2592 = 0x00000a20 (size 40)
 
Target termination                    24, offset 2640 = 0x00000a50 (size 40)
          process # -1, pid = 847, event # 0
          event type = 0,0 (normal termination, exit status 0)
                    at time = Tue  15 Apr 1997  15:27:02.231
 
 
  ** End-of-File                     25, offset 2688 = 0x00000a80 (size 0)
 
**** End of experiment record file "generic.pcsamp.m847"
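The offsets and sizes in this listing can be cross-checked with a little arithmetic. The Python sketch below assumes that each bead is preceded by an 8-byte header (the 32-bit type and 32-bit byte count), so that each offset plus 8 plus the size gives the next offset. It also checks an observation about this particular listing, not a documented guarantee: each PC sampling array size equals the size of its DSO divided by 4. The variable and function names are hypothetical.

```python
BEAD_HEADER = 8  # assumption: 32-bit type + 32-bit byte count precede each body

# (offset, size) pairs copied from the listing above, in file order.
BEADS = [
    (0, 32), (40, 280), (328, 8), (344, 16), (368, 16), (392, 8), (408, 8),
    (424, 8), (440, 32), (480, 40), (528, 312), (848, 56), (912, 360),
    (1280, 40), (1328, 320), (1656, 16), (1680, 40), (1728, 16), (1752, 64),
    (1824, 48), (1880, 16), (1904, 680), (2592, 40), (2640, 40), (2688, 0),
]

# (DSO size in bytes, PC sampling array size) pairs for DSO indexes 0 through 5.
DSOS = [(28672, 7168), (8192, 2048), (569344, 142336),
        (163840, 40960), (1081344, 270336), (4096, 1024)]


def offsets_consistent(beads):
    """True if every bead starts where the previous header and body end."""
    return all(off + BEAD_HEADER + size == nxt
               for (off, size), (nxt, _) in zip(beads, beads[1:]))


def one_bucket_per_four_bytes(dsos):
    """True if each sampling array has one entry per 4 bytes of its DSO."""
    return all(size == 4 * buckets for size, buckets in dsos)


print(offsets_consistent(BEADS), one_bucket_per_four_bytes(DSOS))
```

The End-of-File bead sits at offset 2688, which matches the file size reported in the first line of the listing.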

Dumping Compiler Feedback Files

The fbdump command can be used to print out the compiler feedback files generated by running prof -feedback. For more information on using compiler feedback files, view the cord or cc reference pages.

fbdump Syntax

fbdump [options] filename

options 

Zero or more of the options described in Table 9-1.

filename 

The feedback filename. This file has a .fb extension.

Table 9-1. Options for fbdump

Option

Prints.

-all

Feedback using all options. This is the default.

-ascii

Feedback in the same style as earlier versions of the feedback dump program.

-bb

Feedback per basic block table as described in "cmplrs/fb.h". If -verbose is specified, all basic blocks are printed, even those with zero execution counts. If -verbose is not specified, fbdump prints only the basic blocks that have non-zero execution counts.

-call

Feedback call table as described in "cmplrs/fb.h". If -verbose is specified, all the points of call are printed, even if they have not been called. If -verbose is not specified, fbdump prints only the relevant information on the calls.

-header

Feedback file header as described in "cmplrs/fb.h".

-proc

Feedback procedure table as described in "cmplrs/fb.h". If -verbose is specified, all procedures will be printed, even if they are not invoked. If -verbose is not specified, fbdump prints only the relevant information on the procedures that have been invoked.

-sections

Feedback file section headers table as described in "cmplrs/fb.h".

-str

Feedback string table.

-verbose

All the information in verbose mode including a table with all zero entries.