Chapter 5. Comprehensive System Accounting

The IRIX system has three types of accounting: basic accounting, extended accounting, and Comprehensive System Accounting (CSA). You can use either one type of accounting or a combination of them, depending on your site's accounting needs. This chapter contains detailed information about CSA.

You can use the three types of IRIX accounting to log and charge for certain types of system activity. Using accounting data, you can determine how system resources were used and if a particular user has used more than a reasonable share; trace significant system events, such as security breaches, by examining the list of all processes invoked by a particular user at a particular time; and set up billing systems to charge login accounts for using system resources.

Basic accounting consists of standard UNIX accounting features. Basic accounting is process oriented; a new accounting record is produced for each process that has been run, containing statistics about the resources used by that individual process. The runacct(1M) command is the main daily accounting shell script usually initiated by cron(1M). The runacct(1M) command processes accounting records written into the process accounting data file.

Extended accounting is an IRIX feature that has extended process accounting capabilities, along with project and array session accounting features. Unlike basic process accounting and CSA, which write accounting data directly to an accounting data file, extended accounting writes data files using the system audit trail (SAT) facility. Audit data is collected directly from the kernel by the satd(1M) program. The extended accounting data is a superset of the data collected and reported by basic accounting.

CSA adds capabilities that yield more detailed and accurate accounting data per job. It also contains data from some daemons. The csarun(1M) command, usually initiated by the cron(1M) command, directs the processing of the CSA daily accounting files and processes the accounting records written into the CSA accounting data file.

For more detailed information on basic accounting and extended accounting, see “About the Process Accounting System” and “IRIX Extended Accounting”, respectively, in Chapter 7, “System Accounting” of the IRIX Admin: Backup, Security and Accounting manual.

This chapter contains the following sections:

Read Me First

The sections in this chapter contain information about installing CSA software on your system. Read them in the order listed here:

  1. For a general description of CSA, see “CSA Overview”.

  2. To install the CSA package and job limits package used by CSA, see “Enabling or Disabling CSA”.

  3. For information about CSA directories and files, see “CSA Files and Directories”.

  4. For detailed information about CSA, such as setting up CSA on your system, daily operation, and tailoring CSA to your system, see “Comprehensive System Accounting Expanded Description”.

  5. For a list of CSA man pages, see “CSA Man Pages”.

  6. For information about the types of reports you can generate using CSA, see “CSA Reports”.

CSA Overview

Comprehensive System Accounting (CSA) is a set of C programs and shell scripts that, like the other accounting packages, provide methods for collecting per-process resource usage data, monitoring disk usage, and charging fees to specific login accounts. CSA provides:

  • Per-job accounting

  • Daemon accounting (tape, NQS and workload management systems)

  • Flexible accounting periods: daily and periodic (monthly) accounting reports can be generated as often as desired and are not restricted to once per day or once per month

  • Flexible system billing units (SBUs)

  • Offline archiving of accounting data

  • User exits for site-specific customization of daily and periodic (monthly) accounting

  • Configurable parameters within the /etc/csa.conf file

  • User job accounting (ja(1) command)

CSA takes this per-process accounting information and combines it by job identifier (jid) within system boot uptime periods. CSA accounting for a job consists of all accounting data for a given job identifier during a single system boot period. However, since NQS jobs or workload management jobs may span multiple reboots and thereby consist of multiple job identifiers, CSA accounting for these jobs includes the accounting data associated with the NQS identifier or the workload management identifier.
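The grouping described above can be illustrated with a small sketch. It assumes a hypothetical text dump with four whitespace-separated columns (boot time, job ID, command, CPU seconds); real CSA pacct files are binary and are processed by the CSA commands, so the layout and the function name sum_by_job are illustrative only.

```shell
#!/bin/sh
# Illustrative only: the four-column text layout
# (boot_time jid command cpu_seconds) is a hypothetical stand-in
# for CSA's binary accounting records.
sum_by_job() {
    awk '
        { key = $1 " " $2; cpu[key] += $4; n[key]++ }
        END { for (k in cpu) printf "%s %d %.2f\n", k, n[k], cpu[k] }
    ' | sort
}

# Job ID 0x2a appears in two boot periods, so it is reported as two jobs.
sum_by_job <<'EOF'
1000 0x2a sh 0.10
1000 0x2a cc 2.40
1000 0x2b vi 0.30
2000 0x2a ls 0.05
EOF
```

Each output line carries the boot time, the job ID, the number of process records, and the summed CPU time for that job within that boot period.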

Daemon accounting records are written at the completion of daemon specific events. These records are combined with per-process accounting records associated with the same job.

By default, CSA only reports accounting data for terminated jobs. Interactive jobs, cron jobs and at jobs terminate when the last process in the job exits, which is normally the login shell. An NQS or workload management job is recognized as terminated by CSA based upon daemon accounting records and an end-of-job record for that job. Jobs which are still active are recycled into the next accounting period. This behavior can be changed through use of the csarun command -A option.

A system billing unit (SBU) is a unit of measure that reflects use of machine resources. SBUs are defined in the CSA configuration file /etc/csa.conf and are set to 0.0 by default. The weighting factor associated with each field in the CSA accounting records can be altered to obtain an SBU value suitable for your site. For more information on SBUs, see “System Billing Units (SBUs)”.
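As a rough illustration of how weighting factors shape an SBU value, the sketch below assumes the simplest possible case, a plain weighted sum over three resource figures. The weight names, the three resource fields, and the function name sbu are placeholders and are not the labels used in /etc/csa.conf; the formula CSA actually applies is covered in “System Billing Units (SBUs)”.

```shell
#!/bin/sh
# Hypothetical illustration:
#   sbu = w_cpu * cpu_seconds + w_mem * memory_units + w_io * io_units
# The weights and resource fields are placeholders, not csa.conf labels.
sbu() {
    # usage: sbu w_cpu w_mem w_io cpu_seconds memory_units io_units
    awk -v wc="$1" -v wm="$2" -v wi="$3" -v c="$4" -v m="$5" -v i="$6" \
        'BEGIN { printf "%.2f\n", wc * c + wm * m + wi * i }'
}

sbu 0.0 0.0 0.0 120 50 300     # all weights 0.0: no SBUs are charged
sbu 0.5 0.1 0.01 120 50 300    # example site-chosen weights
```

With every weight left at 0.0 the result is 0.00 regardless of usage, which matches the default behavior of reporting no SBUs.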

The CSA accounting records are not written into the basic accounting pacct file but are written into a separate CSA /var/adm/acct/day/pacct file. The CSA commands can only be used with CSA generated accounting records. Similarly, the basic accounting commands can only be used with the records generated by basic accounting.

There are four user exits available with the csarun(1M) daily accounting script. There is one user exit available with the csaperiod(1M) monthly accounting script. These user exits allow sites to tailor the daily and monthly run of accounting to their specific needs by creating user exit scripts to perform any additional processing and to allow archiving of accounting data. See the csarun(1M) and csaperiod(1M) man pages for further information.

CSA provides two user accounting commands, csacom(1) and ja(1). The csacom command reads the CSA pacct file and writes selected accounting records to standard output. The csacom command is very similar to the basic accounting acctcom(1) command. The ja command provides job accounting information for the current job of the caller. This information is obtained from a separate user job accounting file to which the kernel writes. See the csacom(1) and ja(1) man pages for further information.

The /etc/csa.conf file contains CSA configuration variables. These variables are used by the CSA commands.

Like any accounting or monitoring package, the CSA features do contribute to overall system overhead. For this reason, CSA is disabled in the kernel by default. To enable CSA, see “Enabling or Disabling CSA”.

Concepts and Terminology

The following concepts and terms are important to understand when using the accounting features:

Term 

Description

Daily accounting 

Daily accounting is the processing, organizing, and reporting of the raw accounting data, generally performed once per day.

In basic accounting, daily accounting can only be run once a day. With CSA, it can be run as many times as necessary during a day; however, this feature is still referred to as daily accounting.

Job 

A job is a grouping of processes that the system treats as a single entity and is identified by a unique job identifier (job ID).

CSA is the only accounting type to organize accounting data by jobs and boot times and then place the data into a sorted pacct file.

For non-NQS or non-workload management jobs, a job consists of all accounting data for a given job ID during a single boot period.

An NQS job consists of the accounting data for all job IDs associated with the job's NQS sequence number, and a workload management job consists of the accounting data for all job IDs associated with the workload management request ID. NQS or workload management jobs may span multiple boot periods. If a job is restarted, it has the same job ID associated with it during all boot periods in which it runs. Rerun NQS or workload management jobs have multiple job IDs. CSA treats all phases of an NQS job or workload management job as being in the same job.

Periodic accounting 

Periodic (monthly) accounting further processes, reports, and summarizes the daily accounting reports to give a higher level view of how the system is being used.

In basic accounting, this refers to accounting that is run on a monthly basis. CSA, however, lets system administrators specify the time periods for which monthly or cumulative accounting is to be run. Thus, periodic accounting can be run more than once a month, but sometimes is still referred to as monthly accounting.

Daemon accounting 

Daemon accounting is the processing, organizing, and reporting of the raw accounting data, performed at the completion of daemon specific events.

Recycled data 

Recycled data is data left in the raw accounting data file, saved for the next accounting report run.

By default, accounting data for active jobs is recycled until the job terminates. CSA reports only data for terminated jobs unless csarun is invoked with the -A option. csarun places recycled data into the /var/adm/acct/day/pacct0 data file.

The following abbreviations and definitions are used throughout this chapter:

Abbreviation 

Definition

MMDD 

Month, day

hhmm 

Hour, minute

Enabling or Disabling CSA

The following steps are required to set up CSA job accounting:

  1. Use the inst(1M) utility to install the eoe.sw.csaacct subsystem from your IRIX distribution media. Installing CSA also requires that the eoe.sw.acct and eoe.sw.jlimits subsystems are installed.

  2. Enable CSA within the kernel by using the systune(1M) utility to set do_csaacct to a nonzero value. It will be necessary to reboot the system after completing this step.

  3. Configure CSA to be on across system reboots by using the chkconfig(1M) utility as follows:

    chkconfig csaacct on

  4. Modify the CSA configuration variables in /etc/csa.conf as desired.

  5. Use the csaswitch(1M) command to turn on the accounting record types and thresholds defined in /etc/csa.conf as follows:

    csaswitch -c on

    This step will be done automatically for subsequent system reboots when CSA is configured on via the chkconfig(1M) utility.

    For information on adding entries to the crontabs file so that the cron(1M) command automatically runs daily accounting, see “Setting Up CSA”.

The following steps are required to disable CSA job accounting:

  1. To turn off CSA, use the csaswitch(1M) command:

    csaswitch -c halt

  2. To stop CSA from initiating after a system reboot, use the chkconfig(1M) command:

    chkconfig csaacct off

  3. Disable CSA within the kernel by using the systune(1M) utility to set do_csaacct to a zero value. It will be necessary to reboot the system after completing this step.
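The command-line parts of the two procedures above can be collected into a dry-run helper, sketched here. It only prints the commands quoted in the steps and does not run them; the function name csa_plan is hypothetical. The systune step appears as a reminder comment because its exact invocation is not shown in this chapter, and the installation and /etc/csa.conf editing steps are left out.

```shell
#!/bin/sh
# Dry-run helper: prints the commands from the enable and disable steps
# instead of executing them.
csa_plan() {
    case "$1" in
        enable)
            echo "# set do_csaacct to a nonzero value with systune(1M), then reboot"
            echo "chkconfig csaacct on"
            echo "csaswitch -c on"
            ;;
        disable)
            echo "csaswitch -c halt"
            echo "chkconfig csaacct off"
            echo "# set do_csaacct to zero with systune(1M), then reboot"
            ;;
        *)
            echo "usage: csa_plan enable|disable" >&2
            return 1
            ;;
    esac
}

csa_plan enable
csa_plan disable
```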

CSA Files and Directories

The following sections describe the CSA files and directories.

Files in the /var/adm/acct Directory

The /var/adm/acct directory contains the CSA data collection files and report files within various subdirectories. CSA and IRIX basic accounting access separate pacct files. The following diagram shows the directory and file layout for CSA:

Figure 5-1. The /var/adm/acct Directory


Each data and report file for CSA has a month-day-hour-minute suffix.


Warning: On an IRIX security-enhanced system, the csacom(1) command is considered to be a covert channel. You may want to consider restricting access to this command to the adm group.


Files in the /var/adm/acct/ Directory

The /var/adm/acct directory contains the following directories:

Directory

Description

day

Contains the current raw accounting data files in pacct format.

work

Used by CSA as a temporary work area. Contains raw files that were moved from /var/adm/acct/day at the start of a CSA daily accounting run and the spacct file.

sum/csa

Contains the cumulative daily accounting summary files and reports created by csarun(1M). The ASCII format is in /var/adm/acct/sum/csa/rprt.MMDDhhmm.

The binary data is in /var/adm/acct/sum/csa/cacct.MMDDhhmm, /var/adm/acct/sum/csa/cms.MMDDhhmm, and /var/adm/acct/sum/csa/dacct.MMDDhhmm.

fiscal/csa

Contains periodic accounting summary files and reports created by csaperiod(1M). The ASCII format is in /var/adm/acct/fiscal/csa/rprt.MMDDhhmm.

The binary data is in /var/adm/acct/fiscal/csa/cms.MMDDhhmm and /var/adm/acct/fiscal/csa/pdacct.MMDDhhmm.

nite/csa

Contains log files, csarun state, and execution times files.

Files in the /var/adm/acct/day Directory

The following files are located in the /var/adm/acct/day directory:

File 

Description

dodiskerr 

Disk accounting error file.

pacct 

Process and daemon accounting data.

pacct0 

Recycled process and daemon accounting data.

dtmp 

Disk accounting data (ASCII) created by dodisk.

Files in the /var/adm/acct/work Directory

The following files are located in the /var/adm/acct/work/MMDD/hhmm directory:

File 

Description

BAD.Wpacct* 

Unprocessed accounting data containing invalid records (verified by csaverify(1M)).

Ever.tmp1 

Data verification work file.

Ever.tmp2 

Data verification work file.

Rpacct0 

Process and daemon accounting data to be recycled in the next accounting run.

Wdiskcacct 

Disk accounting data (cacct.h format) created by dodisk(1M) (See the dodisk(1M) man page).

Wdtmp 

Disk accounting data (ASCII) created by dodisk(1M).

Wpacct* 

Raw process and daemon accounting data.

spacct 

Sorted pacct file.

Files in the /var/adm/acct/sum/csa Directory

The following data files are located in the /var/adm/acct/sum/csa directory:

File

Description

cacct.MMDDhhmm

Consolidated daily data in cacct.h format. This file is deleted by csaperiod if the -r option is specified.

cms.MMDDhhmm

Daily command usage data in command summary (cms) record format. This file is deleted by csaperiod if the -r option is specified.

dacct.MMDDhhmm

Daily disk usage data in cacct.h format. This file is deleted by csaperiod if the -r option is specified.

loginlog

Login record file created by lastlogin.

rprt.MMDDhhmm

Daily accounting report.

Files in the /var/adm/acct/fiscal/csa Directory

The following files are located in the /var/adm/acct/fiscal/csa directory:

File 

Description

cms.MMDDhhmm 

Periodic command usage data in command summary (cms) record format.

pdacct.MMDDhhmm 

Consolidated periodic data.

rprt.MMDDhhmm 

Periodic accounting report.

Files in the /var/adm/acct/nite/csa Directory

The following files are located in the /var/adm/acct/nite/csa directory:

File 

Description

active  

Used by the csarun(1M) command to record progress and print warning and error messages. activeMMDDhhmm is the same as active after csarun detects an error.

clastdate 

Last two times csarun was executed; in MMDDhhmm format.

dk2log 

Diagnostic output created during execution of dodisk (see the cron entry for dodisk in “Setting Up CSA”).

diskcacct 

Disk accounting records in cacct.h format, created by dodisk.

EaddcMMDDhhmm 

Error/warning messages from the csaaddc(1M) command for an accounting run done on MMDD at hhmm.

Earc1MMDDhhmm 

Error/warning messages from the csa.archive1(1M) command for an accounting run done on MMDD at hhmm.

Earc2MMDDhhmm 

Error/warning messages from the csa.archive2(1M) command for an accounting run done on MMDD at hhmm.

Ebld.MMDDhhmm 

Error/warning messages from the csabuild(1M) command for an accounting run done on MMDD at hhmm.

Ecmd.MMDDhhmm 

Error/warning messages from the csacms(1M) command when generating an ASCII report for an accounting run done on MMDD at hhmm.

Ecms.MMDDhhmm 

Error/warning messages from the csacms(1M) command when generating binary data for an accounting run done on MMDD at hhmm.

Econ.MMDDhhmm 

Error/warning messages from the csacon(1M) command for an accounting run done on MMDD at hhmm.

Ecrep.MMDDhhmm 

Error/warning messages from the csacrep(1M) command for an accounting run done on MMDD at hhmm.

Ecrpt.MMDDhhmm 

Error/warning messages from the csacrep(1M) command for an accounting run done on MMDD at hhmm.

Edrpt.MMDDhhmm 

Error/warning messages from the csadrep(1M) command for an accounting run done on MMDD at hhmm.

Erec.MMDDhhmm 

Error/warning messages from the csarecy(1M) command for an accounting run done on MMDD at hhmm.

Euser.MMDDhhmm 

Error/warning messages from the csa.user(1M) user exit for an accounting run done on MMDD at hhmm.

Epuser.MMDDhhmm 

Error/warning messages from the csa.puser(1M) user exit for an accounting run done on MMDD at hhmm.

Ever.tmp1MMDDhhmm 

Output file from invalid record offsets from the csaverify(1M) command for an accounting run done on MMDD at hhmm.

Ever.tmp2MMDDhhmm 

Error/warning messages from the csaverify(1M) command for an accounting run done on MMDD at hhmm.

Ever.MMDDhhmm 

Error/warning messages from the csaedit(1M) and csaverify(1M) commands (from the Ever.tmp2 file) for an accounting run done on MMDD at hhmm.

fd2log  

Diagnostic output created during execution of csarun (see cron entry for csarun in “Setting Up CSA”).

lock lock1 

Used to control serial use of the csarun(1M) command.

pd2log 

Diagnostic output created during execution of csaperiod (see cron entry for csaperiod in “Setting Up CSA”).

pdact 

Progress and status of csaperiod. pdact.MMDDhhmm is the same as pdact after csaperiod detects an error.

statefile  

Used to record current state during execution of the csarun command.

/usr/lib/acct Directory

The /usr/lib/acct directory contains the following commands and shell scripts used by CSA:

Command 

Description

csaaddc 

Combines cacct records.

csabuild 

Organizes accounting records into job records.

csachargefee 

Charges a fee to a user.

csackpacct 

Checks the size of the CSA process accounting file.

csacms 

Summarizes command usage from per-process accounting records.

csacon 

Condenses records from the sorted pacct file.

csacrep 

Reports on consolidated accounting data.

csadrep 

Reports daemon usage.

csaedit 

Displays and edits the accounting information.

csagetconfig 

Searches the accounting configuration file for the specified argument.

csajrep 

Prints a job report from the sorted pacct file.

csaperiod 

Runs periodic accounting.

csarecy 

Recycles unfinished job records into next accounting run.

csarun 

Processes the daily accounting files and generates reports.

csaswitch 

Checks the status of, enables or disables the different types of Comprehensive System Accounting (CSA), and switches accounting files for maintainability.

csaverify 

Verifies that the accounting records are valid.

The /usr/bin directory contains user commands associated with CSA:

Command 

Description

ja 

Starts and stops user job accounting information.

csacom 

Searches and prints the CSA process accounting files.

The /usr/lib/acct directory may also contain the following scripts if your site uses the accounting user exits:

Script 

Description

csa.archive1 

Site-generated user exit for csarun.

csa.archive2 

Site-generated user exit for csarun.

csa.fef 

Site-generated user exit for csarun.

csa.user 

Site-generated user exit for csarun.

csa.puser 

Site-generated user exit for csaperiod.

/etc Directory

The /etc directory is the location of the csa.conf file that contains the parameter labels and values used by CSA software.

/etc/config Directory

The /etc/config directory is the location of the csaacct file used by the chkconfig(1M) command. The csaacct.options file contains options passed to the csaswitch(1M) command. Use a text editor to add any csaswitch(1M) options to be passed to csaswitch during system startup only.

Comprehensive System Accounting Expanded Description

This section contains detailed information about CSA and covers the following topics:

Daily Operation Overview

When the IRIX operating system is run in multiuser mode, accounting behaves in a manner similar to the following process. However, because sites may customize CSA, the following may not reflect the actual process at a particular site:

  1. When CSA accounting is enabled and the system is switched to multiuser mode, /etc/rc2 calls the /usr/lib/acct/csaswitch command (see the csaswitch(1M) man page).

  2. By default, csa, memory, and I/O record types are enabled in /etc/csa.conf. However, to run NQS, workload management, or tape daemon accounting you must modify the /etc/csa.conf file and the appropriate subsystem. For more information, see “Setting Up CSA”.

  3. The amount of disk space used by each user is determined periodically. The /usr/lib/acct/dodisk command (see dodisk(1M)) is run periodically by the cron command to generate a snapshot of the amount of disk space being used by each user. The dodisk command should be run at most once for each time /usr/lib/acct/csarun is run (see csarun(1M)). Multiple invocations of dodisk during the same accounting period write over previous dodisk output.

  4. A fee file is created. Sites desiring to charge fees to certain users can do so by invoking /usr/lib/acct/csachargefee (see csachargefee(1M)). Each accounting period's fee file (/var/adm/acct/day/fee) is merged into the consolidated accounting records by /usr/lib/acct/csaperiod (see csaperiod(1M)).

  5. Daily accounting is run. At specified times during the day, csarun is executed by the cron command to process the current accounting data. The output from csarun is daily accounting files and an ASCII report.

  6. Periodic (monthly) accounting is run. At a specific time during the day, or on certain days of the month, /usr/lib/acct/csaperiod (see csaperiod(1M)) is executed by the cron command to process consolidated accounting data from previous accounting periods. The output from csaperiod is periodic (monthly) accounting files and an ASCII report.

  7. Accounting is disabled. When the system is shut down gracefully, the csaswitch(1M) command is executed to halt all CSA process and daemon accounting.

Setting Up CSA

The following is a brief description of setting up CSA. Site-specific modifications are discussed in detail in “Tailoring CSA”. As described in this section, CSA is run by a person with superuser permissions. CSA also can be run by users who are in the adm group and have the CAP_ACCT_MGT capability. See the capability(4) and capabilities(4) man pages for more information on the capability mechanism that provides fine-grained control over the privileges of a process. See “Allowing Non Superusers to Execute CSA” for the necessary modifications.

  1. Change the default system billing unit (SBU) weighting factors, if necessary. By default, no SBUs are calculated. If your site wants to report SBUs, you must modify the configuration file /etc/csa.conf.

  2. Modify any necessary parameters in the /etc/csa.conf file, which contains configurable parameters for the accounting system.

  3. If you want daemon accounting, you must enable daemon accounting at system startup time by performing the following steps:

    1. Ensure that the variables in /etc/csa.conf for the subsystems for which you want to enable daemon accounting are set to on. Set NQS_START to on to enable NQS accounting. Set WKMG_START to on to enable workload management accounting. Set TAPE_START to on to enable tape accounting.

    2. If necessary, enable accounting from the daemon's side. Specifically, NQS, workload management, and tape accounting must also be enabled by the associated daemon. Use the qmgr set accounting on command to turn on NQS accounting. To enable tape daemon accounting, execute tmdaemon with the -c option. For more information on the tmdaemon command, see the TMF Administrator's Guide. To enable the workload management accounting, see the appropriate workload management guide for your system.

  4. As root, use the crontab(1) command with the -e option to add entries similar to the following:


    Note: If you do not use the crontab(1) command to update the crontab file (for example, using the vi(1) editor to update the file), you must signal cron(1M) after updating the file. The crontab command automatically updates the crontab file and signals cron(1M) when you save the file and exit the editor. For more information on the crontab command, see the crontab(1) man page.


    0 4 *  * 1-6  if /etc/chkconfig csaacct; then /usr/lib/acct/csarun 2> /var/adm/acct/nite/csa/fd2log; fi
    0 2 *  * 4    if /etc/chkconfig csaacct; then /usr/lib/acct/dodisk -c > /var/adm/acct/nite/csa/dk2log; fi
    5 * *  * 1-6  if /etc/chkconfig csaacct; then /usr/lib/acct/csackpacct; fi
    0 5 1  * *    if /etc/chkconfig csaacct; then /usr/lib/acct/csaperiod -r  \
    2> /var/adm/acct/nite/csa/pd2log; fi

    These entries are described in the following steps:

    1. For most installations, entries similar to the following should be made in /var/spool/cron/crontabs/root so that cron(1M) automatically runs daily accounting:

      0 4 *  * 1-6  if /etc/chkconfig csaacct; then /usr/lib/acct/csarun 2> /var/adm/acct/nite/csa/fd2log; fi
      0 2 *  * 4    if /etc/chkconfig csaacct; then /usr/lib/acct/dodisk -c > /var/adm/acct/nite/csa/dk2log; fi

      The csarun(1M) command should be executed at such a time that dodisk has sufficient time to complete. If dodisk does not complete before csarun executes, disk accounting information may be missing or incomplete.

      The dodisk command must be invoked with the -c option. For more information, see the dodisk(1M) man page.

    2. Periodically check the size of the pacct files. An entry similar to the following should be made in /var/spool/cron/crontabs/root:

      5 * *  * 1-6  if /etc/chkconfig csaacct; then /usr/lib/acct/csackpacct; fi

      The cron command should periodically execute the csackpacct(1M) shell script. If the pacct file grows larger than 4000 1K blocks (default), csackpacct calls the command /usr/lib/acct/csaswitch -c switch to start a new pacct file. The csackpacct command also makes sure that there are at least 2000 1K blocks free on the file system containing /var/adm/acct (located in the /var directory by default). If there are not enough blocks, CSA accounting is turned off. The next time csackpacct is executed, it turns CSA accounting back on if there are enough free blocks.

      Ensure that the MIN_BLKS variable has been set correctly in the /etc/csa.conf configuration file. MIN_BLKS is the minimum number of free 1K blocks needed on the file system on which the /var/adm/acct directory resides. The default is 2000.

      It is very important that csackpacct be run periodically so that an administrator is notified when the accounting file system (located in the /var directory by default) runs out of disk space. After the file system is cleaned up, the next invocation of csackpacct enables process and daemon accounting. You can manually re-enable accounting by invoking csaswitch -c on.

      If csackpacct is not run periodically, and the accounting file system runs out of space, an error message is written to the console stating that a write error occurred and that accounting is disabled. If you do not free disk space as soon as possible, a vast amount of accounting data can be lost unnecessarily. Additionally, lost accounting data can cause csarun to abort or report erroneous information.

    3. To run monthly accounting, an entry similar to the command shown below should be made in /var/spool/cron/crontabs/root. This command generates a monthly report on all consolidated data files found in /var/adm/acct/sum/csa/* and then deletes those data files:

      0 5 1  * *    if /etc/chkconfig csaacct; then /usr/lib/acct/csaperiod -r \
      2> /var/adm/acct/nite/csa/pd2log; fi

      This entry is executed at such a time that csarun has sufficient time to complete. This example results in the creation of a periodic accounting file and report on the first day of each month. These files contain information about the previous month's accounting.
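The size and free-space checks described for csackpacct can be sketched as a small decision function. This is not the csackpacct script itself: it only prints which action the description implies. MAX_PACCT_BLKS is a hypothetical name for the 4000-block default, MIN_BLKS mirrors the /etc/csa.conf variable, and the order in which the two checks are made is an assumption.

```shell
#!/bin/sh
# Sketch of the csackpacct decision described above; prints an action only.
# MAX_PACCT_BLKS is a hypothetical name; MIN_BLKS mirrors the csa.conf variable.
MAX_PACCT_BLKS=4000
MIN_BLKS=2000

ckpacct_action() {
    # usage: ckpacct_action pacct_size_in_1K_blocks free_1K_blocks
    pacct_blks=$1
    free_blks=$2
    if [ "$free_blks" -lt "$MIN_BLKS" ]; then
        echo "off"       # too little free space: accounting is turned off
    elif [ "$pacct_blks" -gt "$MAX_PACCT_BLKS" ]; then
        echo "switch"    # pacct too large: csaswitch -c switch starts a new file
    else
        echo "ok"        # nothing to do
    fi
}

ckpacct_action 4500 10000
ckpacct_action 100 1500
```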

  5. On Trusted IRIX systems, perform the following steps:

    1. Ensure that user adm has the CAP_ACCT_MGT capability.

    2. Ensure that the following user exits (if they exist) are both readable and executable by user adm:

      • /usr/lib/acct/csa.archive1

      • /usr/lib/acct/csa.archive2

      • /usr/lib/acct/csa.fef

      • /usr/lib/acct/csa.puser

    3. Include an entry similar to the one shown below in /var/spool/cron/crontabs/root:

      0 2 * * 4 suattr -M dbadmin -C CAP_DAC_READ_SEARCH,CAP_DAC_WRITE,
      CAP_FOWNER,CAP_MAC_READ+eip -c "if /etc/chkconfig csaacct; 
      then /usr/lib/acct/dodisk -c 2> /var/adm/acct/nite/csa/dk2log; fi"

    4. Include entries similar to the ones shown below in /var/spool/cron/crontabs/adm:

      0 4 * * 1-6 su adm -C CAP_ACCT_MGT+pi -c "if /etc/chkconfig csaacct; 
      then /usr/lib/acct/csarun 2> /var/adm/acct/nite/csa/fd2log; fi"
      5 * * * 1-6 su adm -C CAP_ACCT_MGT+pi -c "if /etc/chkconfig csaacct; 
      then /usr/lib/acct/csackpacct; fi"
      0 5 1 * * if /etc/chkconfig csaacct; 
      then /usr/lib/acct/csaperiod -r 2> /var/adm/acct/nite/csa/pd2log; fi

  6. Update the holidays file. The file /usr/lib/acct/holidays contains the prime/nonprime table for the accounting system. The table should be edited to reflect your location's holiday schedule for the year. The format is composed of three types of entries:

    • Comment Lines, which may appear anywhere in the file as long as the first character in the line is an asterisk.

    • Year Designation Line, which should be the first data line (noncomment line) in the file and must appear only once. The line consists of three fields of four digits each (leading white space is ignored). For example, to specify the year as 1992, prime time at 9:00 a.m., and nonprime time at 4:30 p.m., the following entry is appropriate:

      1992 0900 1630 

      A special condition allowed for in the time field is that the time 2400 is automatically converted to 0000.

    • Company Holidays Lines, which follow the year designation line and have the following general format:

      day-of-year Month Day Description of Holiday 

      The day-of-year field is a number in the range of 1 through 366, indicating the day for the corresponding holiday (leading white space is ignored). The other three fields are actually commentary and are not currently used by other programs.
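The holidays file format described in step 6 can be read with a few lines of awk, sketched below. The function name parse_holidays and the sample entries are illustrative, and skipping blank lines is an assumption that the text does not state.

```shell
#!/bin/sh
# Sketch of a reader for the holidays format described above.
# Prints the year designation fields and each holiday's day-of-year.
parse_holidays() {
    awk '
        /^\*/   { next }    # comment line: first character is an asterisk
        NF == 0 { next }    # blank line (skipping these is an assumption)
        !seen_year {
            prime = $2; nonprime = $3
            if (prime == "2400") prime = "0000"         # 2400 is treated as 0000
            if (nonprime == "2400") nonprime = "0000"
            printf "year=%s prime=%s nonprime=%s\n", $1, prime, nonprime
            seen_year = 1
            next
        }
        { printf "holiday day-of-year=%d\n", $1 }
    '
}

parse_holidays <<'EOF'
* Prime/Nonprime Table (sample data, not a real schedule)
1992 0900 1630
1   Jan 1  New Year's Day
360 Dec 25 Christmas
EOF
```

Only the first field of each holiday line is used, which matches the note that the remaining fields are commentary.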

The csarun Command

The /usr/lib/acct/csarun command directs the processing of the daily accounting files. It processes accounting records written into the pacct file and is normally initiated by cron(1M) during nonprime hours.

The csarun command also contains four user-exit points, allowing sites to tailor the daily run of accounting to their specific needs.

The csarun command does not damage files in the event of errors. It contains a series of protection mechanisms that attempt to recognize an error, provide intelligent diagnostics, and terminate processing in such a way that csarun can be restarted with minimal intervention.

Daily Invocation

The csarun command is invoked periodically by cron. It is very important that you ensure that the previous invocation of csarun completed successfully before invoking csarun for a new accounting period. If this is not done, information about unfinished jobs will be inaccurate.

Data for a new accounting period can also be interactively processed by executing the following:

nohup csarun 2> /var/adm/acct/nite/csa/fd2log &

Before executing csarun in this manner, ensure that the previous invocation completed successfully. To do this, look at the files active and statefile in /var/adm/acct/nite/csa. Both files should specify that the last invocation completed successfully. See “Restarting csarun”.

Error and Status Messages

The csarun error and status messages are placed in the /var/adm/acct/nite/csa directory. The progress of a run is tracked by writing descriptive messages to the file active. Diagnostic output during the execution of csarun is written to fd2log. The lock and lock1 files prevent concurrent invocations of csarun; csarun will abort if these two files exist when it is invoked. The clastdate file contains the month, day, and time of the last two executions of csarun.

Errors and warning messages from programs called by csarun are written to files that have names beginning with E and ending with the current date and time. For example, Ebld.11121400 is an error file from csabuild for a csarun invocation on November 12, at 14:00.

If csarun detects an error, it writes a message to the SYSLOG file, removes the locks, saves the diagnostic files, and terminates execution. It also sends mail to MAIL_LIST if the error is fatal, or to WMAIL_LIST if it is a warning; both lists are defined in the configuration file /etc/csa.conf.
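The files described in this section lend themselves to a quick check before csarun is run by hand. The following is a minimal sketch under stated assumptions: csa_preflight is a hypothetical helper, not a CSA command, and because the exact wording that csarun writes to active and statefile is not reproduced here, the sketch displays those files for a person to read instead of parsing them.

```shell
#!/bin/sh
# csa_preflight [DIR]
# Returns 1 if a lock file from an earlier csarun is still present in DIR
# (default /var/adm/acct/nite/csa). Otherwise shows statefile and the last
# lines of active so that an operator can confirm the previous run finished.
csa_preflight() {
    dir=${1:-/var/adm/acct/nite/csa}
    for f in lock lock1; do
        if [ -e "$dir/$f" ]; then
            echo "found $dir/$f: csarun may be running or may have ended abnormally" >&2
            return 1
        fi
    done
    # The contents are shown as is; no particular wording is assumed.
    [ -r "$dir/statefile" ] && { printf 'statefile: '; cat "$dir/statefile"; }
    [ -r "$dir/active" ] && tail -n 5 "$dir/active"
    return 0
}
```

A nonzero return only means that a lock file exists; deciding whether it is stale remains a manual step.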

States

Processing is broken down into separate reentrant states so that csarun can be restarted. As each state completes, /var/adm/acct/nite/csa/statefile is updated to reflect the next state. When csarun reaches the CLEANUP state, it removes various data files and the locks, and then terminates.

The following describes the events that occur in each state. MMDD refers to the month and day csarun was invoked. hhmm refers to the hour and minute of invocation.

State

Description

SETUP

The current accounting file is switched via csaswitch. The accounting file is then moved to the /var/adm/acct/work/MMDD/hhmm directory. File names are prefaced with W. /var/adm/acct/nite/csa/diskcacct is also moved to this directory.

VERIFY

The accounting files are checked for valid data. Records with invalid data are removed. Names of bad data files are prefixed with BAD. in the /var/adm/acct/work/MMDD/hhmm directory. The corrected files do not have this prefix.

ARCHIVE1

First user exit of the csarun script. If a script named /usr/lib/acct/csa.archive1 exists, it will be executed through the shell . (dot) command. The . (dot) command will not execute a compiled program, but the user exit script can. You might use this user exit to archive the accounting files in ${WORK}.

BUILD

The pacct accounting data is organized into a sorted pacct file.

ARCHIVE2

Second user exit of the csarun script. If a script named /usr/lib/acct/csa.archive2 exists, it will be executed through the shell . (dot) command. The . (dot) command will not execute a compiled program, but the user exit script can. You might use this exit to archive the sorted pacct file.

CMS

Produces a command summary file in cms.h format. The cms file is written to /var/adm/acct/sum/csa/cms.MMDDhhmm for use by csaperiod.

REPORT

Generates the daily accounting report and puts it into /var/adm/acct/sum/csa/rprt.MMDDhhmm. A consolidated data file, /var/adm/acct/sum/csa/cacct.MMDDhhmm, is also produced from the sorted pacct file. In addition, accounting data for unfinished jobs is recycled.

DREP

Generates a daemon usage report based on the sorted pacct file. This report is appended to the daily accounting report, /var/adm/acct/sum/csa/rprt.MMDDhhmm.

FEF

Third user exit of the csarun script. If a script named /usr/lib/acct/csa.fef exists, it will be executed through the shell . (dot) command. The . (dot) command will not execute a compiled program, but the user exit script can. The csarun variables are available, without being exported, to the user exit script. You might use this exit to convert the sorted pacct file to a format suitable for a front-end system.

USEREXIT

Fourth user exit of the csarun script. If a script named /usr/lib/acct/csa.user exists, it will be executed through the shell . (dot) command. The . (dot) command will not execute a compiled program, but the user exit script can. The csarun variables are available, without being exported, to the user exit script. You might use this exit to run local accounting programs.

CLEANUP

Cleans up temporary files, removes the locks, and then exits.
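A user exit for the ARCHIVE1 state can be as small as the following sketch. It is an illustration only: ARCHIVE_ROOT and archive_work_files are hypothetical names, and the sketch assumes that csarun sets ${WORK} to the work directory of the current run. Because the file is read with the . (dot) command, it runs inside the csarun shell, so it should not call exit.

```shell
# Sketch of a user exit such as /usr/lib/acct/csa.archive1.
# csarun reads this file with the . (dot) command, so "exit" here would
# end csarun itself; failures are reported and processing continues.

# ARCHIVE_ROOT is a hypothetical site-chosen location, not a CSA variable.
ARCHIVE_ROOT=${ARCHIVE_ROOT:-/var/adm/acct/archive}

# archive_work_files SRC DEST
# Copies every regular file in SRC into DEST, preserving timestamps.
archive_work_files() {
    mkdir -p "$2" || return 1
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        cp -p "$f" "$2/" || return 1
    done
}

# The guard keeps this file harmless if it is read outside csarun,
# where WORK is not set.
if [ -n "${WORK:-}" ] && [ -d "${WORK}" ]; then
    archive_work_files "${WORK}" "${ARCHIVE_ROOT}/$(date +%m%d%H%M)" ||
        echo "csa.archive1: archive of ${WORK} failed" >&2
fi
```

Copying instead of moving leaves the work directory intact for the states that follow.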

Restarting csarun

If csarun is executed without arguments, the previous invocation is assumed to have completed successfully.

The following operands are required with csarun if it is being restarted:

csarun [MMDD [hhmm [state]]]

MMDD is month and day, hhmm is hour and minute, and state is the csarun entry state.

To restart csarun, follow these steps:

  1. Remove all lock files, by using the following command line:

    rm -f /var/adm/acct/nite/csa/lock*

  2. Execute the appropriate csarun restart command, using the following examples as guides:

    1. To restart csarun using the time and the state specified in clastdate and statefile, execute the following command:

      nohup csarun 0601 2> /var/adm/acct/nite/csa/fd2log &

      In this example, csarun will be rerun for June 1, using the time and state specified in clastdate and statefile.

    2. To restart csarun using the state specified in statefile, execute the following command:

      nohup csarun 0601 0400 2> /var/adm/acct/nite/csa/fd2log &

      In this example, csarun will be rerun for the June 1 invocation that started at 4:00 A.M., using the state found in statefile.

    3. To restart csarun using the specified date, time, and state, execute the following command:

      nohup csarun 0601 0400 BUILD 2> /var/adm/acct/nite/csa/fd2log &

      In this example, csarun will be restarted for the June 1 invocation that started at 4:00 A.M., beginning with state BUILD.

Before csarun is restarted, the appropriate directories must be restored. If the directories are not restored, further processing is impossible. These directories are as follows:

/var/adm/acct/work/MMDD/hhmm
/var/adm/acct/sum/csa

If you are restarting at state ARCHIVE2, CMS, REPORT, DREP, or FEF, the sorted pacct file must be in /var/adm/acct/work/MMDD/hhmm. If the file does not exist, csarun automatically will restart at the BUILD state. For a restart at the site-specific USEREXIT state, the sorted pacct file may or may not need to exist, depending on the tasks performed in that state.
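Because a mistyped operand restarts processing at the wrong point, it can help to build the restart command and read it before running it. The following is a minimal sketch: csarun_restart_cmd is a hypothetical helper that only prints the command line, and its checks cover the form of the operands and the state names listed in “States”, not whether the work directories exist.

```shell
#!/bin/sh
# csarun_restart_cmd MMDD [hhmm [state]]
# Prints the restart command for review; nothing is executed.
csarun_restart_cmd() {
    case ${1:-} in
        [0-1][0-9][0-3][0-9]) ;;
        *) echo "MMDD must be four digits, for example 0601" >&2; return 1 ;;
    esac
    if [ -n "${2:-}" ]; then
        case $2 in
            [0-2][0-9][0-5][0-9]) ;;
            *) echo "hhmm must be four digits, for example 0400" >&2; return 1 ;;
        esac
    fi
    if [ -n "${3:-}" ]; then
        case $3 in
            SETUP|VERIFY|ARCHIVE1|BUILD|ARCHIVE2|CMS|REPORT|DREP|FEF|USEREXIT|CLEANUP) ;;
            *) echo "unknown state: $3" >&2; return 1 ;;
        esac
    fi
    echo "nohup csarun $* 2> /var/adm/acct/nite/csa/fd2log &"
}
```

For example, csarun_restart_cmd 0601 0400 BUILD prints the third restart command shown above, while an unknown state name is rejected.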

Verifying and Editing Data Files

This section describes how to remove bad data from various accounting files.

The csaverify(1M) command verifies that the accounting records are valid and identifies invalid records. The accounting file can be a pacct or sorted pacct file. When csaverify finds an invalid record, it reports the starting byte offset and length of the record. This information can be written to a file in addition to standard output. A length of -1 indicates the end of file. The resulting output file can be used as input to csaedit(1M) to delete pacct or sorted pacct records.

  1. The pacct file is verified with the following command line, and the following output is received:

    $  /usr/lib/acct/csaverify -P pacct -o offsetfile
    acct.cat-330 /usr/lib/acct/csaverify: CAUTION
       readacctent(): An error was returned from the 'readpacct()' routine.

  2. The file offsetfile from csaverify is used as input to csaedit to delete the invalid records as follows (remaining valid records are written to pacct.NEW):

    /usr/lib/acct/csaedit -b offsetfile -P pacct -o pacct.NEW

  3. The new pacct file is reverified as follows to ensure that all the bad records have been deleted:

    /usr/lib/acct/csaverify -P pacct.NEW

You can use the csaedit -A option to produce an abbreviated ASCII version of pacct or sorted pacct files.

CSA Data Processing

The flow of data among the various CSA programs is explained in this section and is illustrated in Figure 5-2.

Figure 5-2. CSA Data Processing


  1. Generate raw accounting files. Various daemons and system processes write to the raw pacct accounting files.

  2. Create a fee file. Sites that want to charge fees to certain users can do so with the csachargefee(1M) command. The csachargefee command creates a fee file that is processed by csaaddc(1M).

  3. Produce disk usage statistics. The dodisk(1M) shell script allows sites to take snapshots of disk usage. dodisk does not report dynamic usage; it only reports the disk usage at the time the command was run. Disk usage is processed by csaaddc.

  4. Organize accounting records into job records. The csabuild(1M) command reads accounting records from the CSA pacct file and organizes them into job records by job ID and boot times. It writes these job records into the sorted pacct file. This sorted pacct file contains all of the accounting data available for each job. The configuration records in the pacct files are associated with the job ID 0 job record within each boot period. The information in the sorted pacct file is used by other commands to generate reports and for billing.

  5. Recycle information about unfinished jobs. The csarecy(1M) command retrieves job information from the sorted pacct file of the current accounting period and writes the records for unfinished jobs into a pacct0 file for recycling into the next accounting period. csabuild(1M) marks unfinished accounting jobs (jobs without an end-of-job record). csarecy takes these records from the sorted pacct file and puts them into the next period's accounting files directory. This process is repeated until the job finishes.

    Sometimes data for terminated jobs is continually recycled. This can occur when accounting data is lost. To prevent data from recycling forever, edit csarun so that csabuild is executed with the -o nday option, which causes all jobs older than nday days to terminate. Select an appropriate nday value (see the csabuild man page and “Data Recycling” for more information).

  6. Generate the daemon usage report, which is appended to the daily report. csadrep(1M) reports usage of the NQS, workload management, and tape daemons. Input is either from a sorted pacct file created by csabuild(1M) or from a binary file created by csadrep with the -o option. The files operand specifies the binary files.

  7. Summarize command usage from per-process accounting records. The csacms(1M) command reads the sorted pacct files. It adds all records for processes that executed identically named commands, and it sorts and writes them to /var/adm/acct/sum/csa/cms.MMDDhhmm, using the cms format. The csacms(1M) command can also create an ASCII file.

  8. Condense records from the sorted pacct file. The csacon(1M) command condenses records from the sorted pacct file and writes consolidated records in cacct format to /var/adm/acct/sum/csa/cacct.MMDDhhmm.

  9. Generate an accounting report based on the consolidated data. The csacrep(1M) command generates reports from data in cacct format, such as output from the csacon(1M) command. The report format is determined by the value of CSACREP in the /etc/csa.conf file. Unless modified, it will report the CPU time, total KCORE minutes, total KVIRTUAL minutes, block I/O wait time, and raw I/O wait time. The report will be sorted first by user ID and then by the secondary key of project ID, and the headers will be printed.

  10. Create the daily accounting report. The daily accounting report includes the following:

    • Consolidated information report (step 11)

    • Unfinished recycled jobs (step 5)

    • Disk usage report (step 3)

    • Daily command summary (step 7)

    • Last login information

    • Daemon usage report (step 6)

  11. Combine cacct records. The csaaddc(1M) command combines cacct records by specified consolidation options and writes out a consolidated record in cacct format.

  12. Summarize command usage from per-process accounting records. The csacms(1M) command reads the cms files created in step 7. Both an ASCII and a binary file are created.

  13. Produce a consolidated accounting report. csacrep(1M) is used to generate a report based on a periodic accounting file.

  14. The periodic accounting report layout is as follows:

    • Consolidated information report

    • Command summary report

Steps 4 through 11 are performed during each accounting period by csarun(1M). Periodic (monthly) accounting (steps 12 through 14) is initiated by the csaperiod(1M) command. Daily and periodic accounting, as well as fee and disk usage generation (steps 2 through 3), can be scheduled by cron(1M) to execute regularly. See “Setting Up CSA” for more information.

Data Recycling

A system administrator must correctly maintain recycled data to ensure accurate accounting reports. The following sections discuss data recycling and describe how an administrator can purge unwanted recycled accounting data.

Data recycling allows CSA to properly bill jobs that are active during multiple accounting periods. By default, csarun reports data only for jobs that terminate during the current accounting period. Through data recycling, CSA preserves data for active jobs until the jobs terminate.

In the sorted pacct file, csabuild flags each job as being either active or terminated. csarecy reads the sorted pacct file and recycles data for the active jobs. csacon consolidates the data for the terminated jobs, which csaperiod uses later. csabuild, csarecy, and csacon are all invoked by csarun.

csarun puts recycled data in the /var/adm/acct/day/pacct0 file.

Normally, an administrator should not have to manually purge the recycled accounting data. This purge should only be necessary if accounting data is missing. Missing data can cause jobs to recycle forever and consume valuable CPU cycles and disk space.

How Jobs Are Terminated

Interactive jobs, cron jobs, and at jobs terminate when the last process in the job exits. Normally, the last process to terminate is the login shell. The kernel writes an end-of-job (EOJ) record to the pacct file when the job terminates.

When the NQS daemon or workload management daemon delivers an NQS or workload management request's output, the request terminates. The daemon then writes an NQ_DISP record type for NQS or WM_TERM record type for workload management to the pacct accounting file, while the kernel writes an EOJ record to the pacct file.

Unlike interactive jobs, NQS or workload management requests can have multiple EOJ records associated with them. In addition to the request's EOJ record, there can be EOJ records for pipe clients (NQS only), net clients, and checkpointed portions of the request. The pipe client and net client perform NQS or workload management processing on behalf of the request. The Load Sharing Facility (LSF) system currently does not support net clients.

The csabuild command flags jobs in the sorted pacct file as being terminated if they meet one of the following conditions:

  • The job is an interactive, cron, or at job, and there is an EOJ record for the job in the pacct file.

  • The job is an NQS request, and there is both an EOJ record for the request and an NQ_DISP record type in the pacct file.

  • The job is a workload management request, and there is both an EOJ record for the request and a WM_TERM record type in the pacct file.

  • The job is an interactive, cron, or at job and is active at the time of a system crash.

  • The job is manually terminated by the administrator using one of the methods described in “How to Remove Recycled Data”.

Why Recycled Sessions Should Be Scrutinized

Recycling unnecessary data can consume large amounts of disk space and CPU time. The sorted pacct file and recycled data can occupy a vast amount of disk space on the file system containing /var/adm/acct/day. Sites that archive data also require additional offline media. Wasted CPU cycles are used by csarun to reexamine and recycle the data. Therefore, to conserve disk space and CPU cycles, unnecessary recycled data should be purged from the accounting system.

Any of the following situations can cause CSA to erroneously recycle terminated jobs:

  • Kernel or daemon accounting is turned off.

    The kernel or csackpacct(1M) command can turn off accounting when there is not enough space on the file system containing /var/adm/acct/day.

  • Accounting files are corrupt. Accounting data can be lost or corrupted during a system or disk crash.

  • Recycled data is erroneously deleted in a previous accounting period.

How to Remove Recycled Data

Before choosing to delete recycled data, you should understand the repercussions, as described in “Adverse Effects of Removing Recycled Data”. Data removal can affect billing and can alter the contents of the consolidated data file, which is used by csaperiod.

You can remove recycled data from CSA in the following ways:

  • Interactively execute the csarecy -A command. Administrators can select the active jobs that are to be recycled by running csarecy with the -A option. Users are not billed for the resources used in the jobs terminated in this manner. Deleted data is also not included in the consolidated data file.

    The following example is one way to execute csarecy -A (which generates two accounting reports and two consolidated files):

    1. Run csarun at the regularly scheduled time.

    2. Edit a copy of /usr/lib/acct/csarun. Change the -r option on the csarecy invocation line to -A. Also, do not redirect standard output to ${SUM_DIR}/recyrpt. The result should be similar to the following:

      csarecy -A -s ${SPACCT} -P ${WTIME_DIR}/Rpacct \
          2> ${NITE_DIR}/Erec.${DTIME}

      Since both the -A and -r options write output to stdout, the -r option is not invoked and stdout is not redirected to a file. As a result, the recycled job report is not generated.

    3. Execute the jstat command, as follows, to display a list of currently active jobs:

      jstat -a > jstat.out

    4. Execute the qstat command to display a list of NQS requests. The qstat command is used for seeing whether there are requests that are not currently running. This includes requests that are checkpointed, held, queued, or waiting.

      To list all NQS requests, execute the qstat command, as follows, using a login that has either NQS manager or NQS operator privilege:

      qstat -a > qstat.out

    5. Interactively run the modified version of csarun. If you execute the modified csarun soon after the first step is complete, little data is lost because not very much data exists.

      For each active job, csarecy asks you if you want to preserve the job. Preserve the active and nonrunning NQS jobs found in the third and fourth steps. All other jobs are candidates for removal.

  • Execute csabuild with the -o ndays option, which terminates all active jobs older than the specified number of days. Resource usage for these terminated jobs is reported by csarun, and users are billed for the jobs. The consolidated data file also includes this resource usage.

    To execute csabuild with the -o option, edit a copy of /usr/lib/acct/csarun. Add the -o ndays option to the csabuild invocation line. Specify an ndays value that is appropriate for your site.

    Recycled data for currently active jobs will be removed if you specify an inappropriate value for ndays.

  • Execute csarun with the -A option. It reports resource usage for both active and terminated jobs, so users are billed for recycled sessions. This data is also included in the consolidated data file.

    None of the data for the active jobs, including the currently active jobs, is recycled. No recycled data file is generated in the /var/adm/acct/day directory.

  • Remove the recycled data file from the /var/adm/acct/day directory. You can delete data for all of the recycled jobs, both terminated and active, by executing the following command:

    rm /var/adm/acct/day/pacct0

    The next time csarun is executed, it will not find data for any recycled jobs. Thus, users are not billed for the resources used in the recycled jobs, and this data is not included in the consolidated data file. csarun recycles the data for currently active jobs.

Adverse Effects of Removing Recycled Data

CSA assumes that all necessary accounting information is available to it, which means that CSA expects kernel and daemon accounting to be enabled and recycled data not to have been mistakenly removed. If some data is unavailable, CSA may provide erroneous billing information. Sites should be aware of the following facts before removing data:

  • Users may or may not be billed for terminated recycled jobs. Administrators must understand which of the previously described methods cause the user to be billed for the terminated recycled jobs. It is up to the site to decide whether or not it is valid for the user to be billed for these jobs.

    For those methods that cause the user to be billed, both csarun and csaperiod report the resource usage.

  • It may be impossible to reconstruct a terminated recycled job. If a recycled job is terminated by the administrator, but the job actually terminates in a later accounting period, information about the job is lost. If a user questions the resource billing, it may be extremely difficult or impossible for the administrator to correctly reassemble all accounting information for the job in question.

  • Manually terminated recycled jobs may be improperly billed in a future billing period. If the accounting data for the first portion of a job has been deleted, CSA may be unable to correctly identify the remaining portion of the job. Errors may occur, such as NQS or workload management requests being flagged as interactive jobs, or NQS or workload management requests being billed at the wrong queue rate. This is explained in detail in “NQS or Workload Management Requests and Recycled Data”.

  • CSA programs may detect data inconsistencies. When accounting data is missing, CSA programs may detect errors and abort.

The following table summarizes the effects of using the methods described in “How to Remove Recycled Data”.

Table 5-1. Possible Effects of Removing Recycled Data

Method: csarecy -A

  • Underbilling? Yes. Users are not billed for the portion of the job that was terminated by csarecy -A.

  • Incorrect billing? Possible. Manually terminated recycled jobs may be billed improperly in a future billing period.

  • Consolidated data file: Does not include data for jobs terminated by csarecy -A.

Method: csabuild -o

  • Underbilling? No. Users are billed for the portion of the job that was terminated by csabuild -o.

  • Incorrect billing? Possible. Manually terminated recycled jobs may be billed improperly in a future billing period.

  • Consolidated data file: Includes data for jobs terminated by csabuild -o.

Method: csarun -A

  • Underbilling? No. All active and recycled jobs are billed.

  • Incorrect billing? Possible. All active and recycled jobs that eventually terminate may be billed improperly in a future billing period, because no data is recycled.

  • Consolidated data file: Includes data for all active and recycled jobs.

Method: rm

  • Underbilling? Yes. Users are not billed for the portion of the job that was recycled.

  • Incorrect billing? Possible. All recycled jobs that eventually terminate may be billed improperly in a future billing period.

  • Consolidated data file: Does not include data for any recycled job.

By default, the consolidated data file contains data only for terminated jobs. Manual termination of recycled data may cause some of the recycled data to be included in the consolidated file.

NQS or Workload Management Requests and Recycled Data

For CSA to identify all NQS or workload management requests, data must be properly recycled. When an administrator manually purges recycled data for an NQS or workload management request, errors such as the following can occur:

  • CSA fails to flag the job as an NQS or workload management job. This causes the request to be billed at standard rates instead of an NQS or workload management queue rate (see “NQS SBUs” or “Workload Management SBUs”).

  • The request is billed at the wrong queue rate.

  • The wrong queue wait time is associated with the request.

These errors occur because valuable NQS or workload management accounting information was purged by the administrator. Only a few NQS or workload management accounting records are written by the NQS or workload management daemon, and all of the records are needed for CSA to properly bill NQS or workload management requests.

NQS or workload management accounting records are only written under the following circumstances:

  • The NQS or workload management daemon receives a request.

  • A request is routed to a queue. (NQS only)

  • A request executes. This includes executing a request for the first time, restarting, and rerunning a request.

  • A request terminates. An NQS request can terminate because it is completed, requeued, preempted, held, or rerun. A workload management request can terminate because it is completed, requeued, held, rerun, or migrated.

  • Output is delivered.

Thus, for long running requests that span days, there can be days when no NQS or workload management data is written. Consequently, it is extremely important that accounting data be recycled. If the site administrator manually terminates recycled jobs, care must be taken to be sure that only nonexistent NQS or workload management requests are terminated.

Tailoring CSA

This section describes the following actions in CSA:

  • Setting up SBUs

  • Setting up daemon accounting

  • Setting up user exits

  • Writing a user exit

  • Modifying the charging of NQS or workload management jobs based on NQS or workload management termination status

  • Tailoring CSA shell scripts

  • Using at(1) instead of cron(1M) to periodically execute csarun

  • Allowing users without superuser permissions to run CSA

  • Using an alternate configuration file

System Billing Units (SBUs)

A system billing unit (SBU) is a unit of measure that reflects use of machine resources. You can alter the weighting factors associated with each field in each accounting record to obtain an SBU value suitable for your site. SBUs are defined in the accounting configuration file, /etc/csa.conf. By default, all SBUs are set to 0.0.

Accounting allows different periods of time to be designated either prime or nonprime time (the time periods are specified in /usr/lib/acct/holidays).

Following is an example of how the prime/nonprime algorithm works:

Assume that a user uses 10 seconds of CPU time, executing for 100 seconds of prime wall-clock time and pausing for 100 seconds of nonprime wall-clock time. The elapsed time is therefore 200 seconds (100 + 100). If

prime = prime time / elapsed time
nonprime = nonprime time / elapsed time
cputime[PRIME] = prime * CPU time
cputime[NONPRIME] = nonprime * CPU time

then

cputime[PRIME] == 5 seconds
cputime[NONPRIME] == 5 seconds
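The same arithmetic can be reproduced at the command line. The following is a minimal sketch: split_cpu is a hypothetical helper, not a CSA command, and returning zero for both values when there is no elapsed time is an assumption made only to avoid a division by zero.

```shell
#!/bin/sh
# split_cpu CPU_SECONDS PRIME_WALL NONPRIME_WALL
# Prints "cputime[PRIME] cputime[NONPRIME]" using the ratios defined above.
split_cpu() {
    awk -v cpu="$1" -v p="$2" -v n="$3" 'BEGIN {
        elapsed = p + n
        if (elapsed <= 0) { print "0 0"; exit }   # assumption: nothing to split
        printf "%g %g\n", cpu * p / elapsed, cpu * n / elapsed
    }'
}

split_cpu 10 100 100    # the example from the text; prints: 5 5
```

With 150 seconds of prime and 50 seconds of nonprime wall-clock time, the same 10 CPU seconds would split as 7.5 and 2.5.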

Under CSA, an SBU value is associated with each record in the sorted pacct file when that file is assembled by csabuild. Final summation of the SBU values is done by csacon during the creation of the cacct record file.

The following examples show how a site can bill different NQS or workload management queues at differing rates.

Total SBU = (NQS queue SBU value) * (sum of all process record SBUs
     + sum of all tape record SBUs)

or

Total SBU = (Workload management queue SBU value) * (sum of all process record SBUs
     + sum of all tape record SBUs)
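As a worked instance of this formula, the sketch below multiplies a queue SBU value by the summed process and tape record SBUs. total_sbu is a hypothetical helper and the numbers in the usage note are invented for illustration; they are not defaults from /etc/csa.conf.

```shell
#!/bin/sh
# total_sbu QUEUE_SBU PROCESS_SBU_SUM TAPE_SBU_SUM
# Prints QUEUE_SBU * (PROCESS_SBU_SUM + TAPE_SBU_SUM).
total_sbu() {
    awk -v q="$1" -v p="$2" -v t="$3" 'BEGIN { printf "%g\n", q * (p + t) }'
}
```

For a job with 40 process SBUs and 10 tape SBUs, a queue value of 0.5 gives total_sbu 0.5 40 10, or 25, while a queue value of 1 leaves the sum at 50.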

Process SBUs

The SBUs for process data are separated into prime and nonprime values. Prime and nonprime use is calculated by a ratio of elapsed time. If you do not want to make a distinction between prime and nonprime time, set the nonprime time SBUs and the prime time SBUs to the same value. Prime time is defined in /usr/lib/acct/holidays. By default, Saturday and Sunday are considered nonprime time.

The following is a list of prime time process SBU weights. Descriptions and factor units for the nonprime time SBU weights are similar to those listed here. SBU weights are defined in /etc/csa.conf.

Value 

Description

P_BASIC 

Prime-time weight factor. P_BASIC is multiplied by the sum of prime time SBU values to get the final SBU factor for the process record.

P_TIME 

General-time weight factor. P_TIME is multiplied by the time SBUs (made up of P_STIME, P_UTIME, P_QTIME, P_BWTIME, and P_RWTIME) to get the time contribution to the process record SBU value.

P_STIME 

System CPU-time weight factor. The unit used for this weight is billing units per second. P_STIME is multiplied by the system CPU time.

P_UTIME 

User CPU-time weight factor. The unit used for this weight is billing units per second. P_UTIME is multiplied by the user CPU time.

P_QTIME 

Run queue wait time weight factor. The unit used for this weight is billing units per second. P_QTIME is multiplied by the run queue wait time.

P_BWTIME 

Block I/O wait time weight factor. The unit used for this weight is billing units per second. P_BWTIME is multiplied by the block I/O wait time.

P_RWTIME 

Raw I/O wait time weight factor. The unit used for this weight is billing units per second. P_RWTIME is multiplied by the raw I/O wait time.

P_MEM 

General-memory-integral weight factor. P_MEM is multiplied by the memory SBUs (made up of P_XMEM and P_VMEM) to get the memory contribution to the process record SBU value.

P_XMEM 

CPU-time-core-physical memory-integral weight factor. The unit used for this weight is billing units per Mbyte-minute. P_XMEM is multiplied by the core-memory integral.

P_VMEM 

CPU-time-virtual-memory-integral weight factor. The unit used for this weight is billing units per Mbyte-minute. P_VMEM is multiplied by the virtual memory integral.

P_IO 

General-I/O weight factor. P_IO is multiplied by the I/O SBUs (made up of P_BIO, P_CIO, and P_LIO) to get the I/O contribution to the process record SBU value.

P_BIO 

Blocks-transferred weight factor. The unit used for this weight is billing units per block transferred. P_BIO is multiplied by the number of I/O blocks transferred.

P_CIO 

Characters-transferred weight factor. The unit used for this weight is billing units per character transferred. P_CIO is multiplied by the number of I/O characters transferred.

P_LIO 

Logical-I/O-request weight factor. The unit used for this weight is billing units per logical I/O request. P_LIO is multiplied by the number of logical I/O requests made. The number of logical I/O requests is the total number of read and write system calls.

The formula for calculating the whole process record SBU is as follows:

PSBU = (P_TIME * (P_STIME * stime + P_UTIME * utime + P_QTIME * qwtime +
                  P_BWTIME * bwtime + P_RWTIME * rwtime))
     + (P_MEM * (P_XMEM * coremem + P_VMEM * virtmem))
     + (P_IO * (P_BIO * bio + P_CIO * cio + P_LIO * lio));

NSBU = (NP_TIME * (NP_STIME * stime + NP_UTIME * utime + NP_QTIME * qwtime +
                   NP_BWTIME * bwtime + NP_RWTIME * rwtime))
     + (NP_MEM * (NP_XMEM * coremem + NP_VMEM * virtmem))
     + (NP_IO * (NP_BIO * bio + NP_CIO * cio + NP_LIO * lio));

SBU = P_BASIC * PSBU + NP_BASIC * NSBU;

The variables in this formula are described as follows:

Variable 

Description

stime 

System CPU time in seconds

utime 

User CPU time in seconds

qwtime 

Run queue wait time in seconds

bwtime 

Block I/O wait time in seconds

rwtime 

Raw I/O wait time in seconds

coremem 

Core (physical) memory integral in Mbyte-minutes

virtmem 

Virtual memory integral in Mbyte-minutes

bio 

Number of blocks of data transferred

cio 

Number of characters of data transferred

lio  

Number of logical I/O requests
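The formula can be tried with sample weights before they are committed to /etc/csa.conf. The following is a minimal sketch that follows the formula literally, using the same usage values for the prime and nonprime terms. sbu_part and process_sbu are hypothetical helpers, qwtime is taken to be the run queue wait time in seconds (the quantity that P_QTIME weights), and the weights in the usage note are invented.

```shell
#!/bin/sh
# sbu_part WEIGHTS USAGE
#   WEIGHTS: "TIME STIME UTIME QTIME BWTIME RWTIME MEM XMEM VMEM IO BIO CIO LIO"
#   USAGE:   "stime utime qwtime bwtime rwtime coremem virtmem bio cio lio"
# Prints the PSBU (or NSBU) term for one set of weights.
sbu_part() {
    echo "$1 $2" | awk '{
        # Fields 1-13 are the weights; fields 14-23 are the usage values.
        tsum = $1  * ($2 * $14 + $3 * $15 + $4 * $16 + $5 * $17 + $6 * $18)
        msum = $7  * ($8 * $19 + $9 * $20)
        isum = $10 * ($11 * $21 + $12 * $22 + $13 * $23)
        printf "%g\n", tsum + msum + isum
    }'
}

# process_sbu P_BASIC P_WEIGHTS NP_BASIC NP_WEIGHTS USAGE
# Prints SBU = P_BASIC * PSBU + NP_BASIC * NSBU.
process_sbu() {
    psbu=$(sbu_part "$2" "$5")
    nsbu=$(sbu_part "$4" "$5")
    awk -v pb="$1" -v p="$psbu" -v nb="$3" -v n="$nsbu" \
        'BEGIN { printf "%g\n", pb * p + nb * n }'
}
```

For example, with the weight string "1 0.5 0.5 0 0 0 2 1 0 0 0 0 0" and the usage string "4 4 0 0 0 3 0 0 0 0", sbu_part yields 1 * (0.5 * 4 + 0.5 * 4) + 2 * (1 * 3) = 10.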

NQS SBUs

The /etc/csa.conf file contains the configurable parameters that pertain to NQS SBUs.

The NQS_NUM_QUEUES parameter sets the number of queues for which you want to set SBUs (the value must be set to at least 1). Each NQS_QUEUEx variable in the configuration file has a queue name and an SBU pair associated with it (the total number of queue/SBU pairs must equal NQS_NUM_QUEUES). The queue/SBU pairs define weights for the queues. If an SBU value is less than 1.0, there is an incentive to run jobs in the associated queue; if the value is 1.0, jobs are charged as though they are non-NQS jobs; and if the SBU is 0.0, there is no charge for jobs running in the associated queue. SBUs for queues not found in the configuration file are automatically set to 1.0.

The NQS_NUM_MACHINES parameter sets the number of originating machines for which you want to set SBUs (the value must be at least 1). Each NQS_MACHINEx variable in the configuration file has an originating machine and an SBU pair associated with it (the total number of machine/SBU pairs must equal NQS_NUM_MACHINES). SBUs for originating machines not specified in /etc/csa.conf are automatically set to 1.0.

The queue and machine SBUs are multiplied together to give an NQS multiplier. If the SBUs are set to less than 1.0, there is an incentive to run jobs in these queues or from these machines. SBUs of 1.0 indicate that jobs in the queues or from associated hosts are billed normally.
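The multiplication described above can be sketched in a few lines. The nqs_multiplier function is a hypothetical helper and the SBU values are invented; the sketch only shows how a discounted queue and a discounted originating machine combine.

```shell
#!/bin/sh
# nqs_multiplier is a hypothetical helper, not a CSA command.
#   $1  SBU configured for the queue
#   $2  SBU configured for the originating machine
nqs_multiplier() {
    awk -v q="$1" -v m="$2" 'BEGIN { printf "%.3f\n", q * m }'
}

nqs_multiplier 0.5 0.75   # discounted queue and discounted machine
nqs_multiplier 1.0 1.0    # neither is discounted: billed normally
```

The first call prints 0.375 and the second prints 1.000. Because a queue or machine that is missing from the configuration file defaults to 1.0, it leaves the multiplier unchanged.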

Workload Management SBUs

The /etc/csa.conf file contains the configurable parameters that pertain to workload management SBUs.

The WKMG_NUM_QUEUES parameter sets the number of queues for which you want to set SBUs (the value must be set to at least 1). Each WKMG_QUEUEx variable in the configuration file has a queue name and an SBU pair associated with it (the total number of queue/SBU pairs must equal WKMG_NUM_QUEUES). The queue/SBU pairs define weights for the queues. If an SBU value is less than 1.0, there is an incentive to run jobs in the associated queue; if the value is 1.0, jobs are charged as though they are non-workload management jobs; and if the SBU is 0.0, there is no charge for jobs running in the associated queue. SBUs for queues not found in the configuration file are automatically set to 1.0.

The WKMG_NUM_MACHINES parameter sets the number of originating machines for which you want to set SBUs (the value must be at least 1). Each WKMG_MACHINEx variable in the configuration file has an originating machine and an SBU pair associated with it (the total number of machine/SBU pairs must equal WKMG_NUM_MACHINES). SBUs for originating machines not specified in /etc/csa.conf are automatically set to 1.0.

Tape SBUs

There is a set of weighting factors for each group of tape devices. By default, there are only two groups, tape and cart. The TAPE_SBUi parameters in /etc/csa.conf define the weighting factors for each group. There are SBUs associated with the following:

  • Number of mounts

  • Device reservation time (seconds)

  • Number of bytes read

  • Number of bytes written

Example SBU Settings

The following shows how you could set up the SBU system. This example is restricted to the process records.

All time is considered prime time. Therefore, the nonprime time SBUs should be set to the same values as their prime time counterparts.

Users are charged $10 per hour of user CPU time. This is equal to $10 per 3600 seconds, which is $0.002777777777777 per second (P_UTIME).

Therefore, the prime time weight factors are set as follows:

Weight Factor  Charge
=============  =================
P_BASIC        1.0
P_TIME         1.0
P_STIME        0.0
P_UTIME        0.002777777777777
P_QTIME        0.0
P_BWTIME       0.0
P_RWTIME       0.0
P_MEM          0.0
P_XMEM         0.0
P_VMEM         0.0
P_IO           0.0
P_BIO          0.0
P_CIO          0.0
P_LIO          0.0
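To check the arithmetic behind these settings, the following sketch derives the per-second rate from the hourly charge and then applies the example weights to a process. The function names are hypothetical, and the sketch assumes, as the example does, that all usage falls in prime time so that only the P_ weights matter.

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of CSA.

# Convert a charge in dollars per hour into a per-second weight.
rate_per_second() {
    awk -v d="$1" 'BEGIN { printf "%.15f\n", d / 3600 }'
}

# SBU for a process under the example settings: every weight except
# P_BASIC, P_TIME, and P_UTIME is 0.0, so only user CPU time contributes.
#   $1  user CPU time in seconds
example_sbu() {
    awk -v utime="$1" 'BEGIN {
        p_basic = 1.0; p_time = 1.0; p_utime = 0.002777777777777
        printf "%.2f\n", p_basic * (p_time * (p_utime * utime))
    }'
}

rate_per_second 10   # $10 per hour expressed per second
example_sbu 7200     # two hours of user CPU time
```

The second call prints 20.00, that is, two hours at $10 per hour. The first call prints the rate rounded to 15 decimal places, so its last digit differs from the truncated value used in the table.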

 

Daemon Accounting

Accounting information is available from the NQS, workload management, and online tape daemons. Data is written to the pacct file in the /var/adm/acct/day directory.

In most cases, daemon accounting must be enabled by both the CSA subsystem and the daemon. “Setting Up CSA” describes how to enable daemon accounting at system startup time. You can also enable daemon accounting after the system has booted.

You can enable accounting for a specified daemon by using the csaswitch command. For example, to start tape accounting, enter the following command:

/usr/lib/acct/csaswitch -c on -n tape

The NQS, workload management, and online tape daemons must also enable accounting. Use the qmgr set accounting on command to turn on NQS accounting. Tape daemon accounting is enabled when tmdaemon(1M) is executed with the -c option. See the appropriate workload management guide for information on how to enable workload management accounting.


Note: If you are running the Load Sharing Facility (LSF) system and want to enable workload management accounting, you must set two LSF configuration variables in the lsf.conf file as follows:
LSF_ENABLE_CSA=y
LSF_ULDB_DOMAIN = <ULDB_domain_name>



If LSF_ENABLE_CSA is defined in the lsf.conf file, LSF writes LSF batch job events to the pacct file for processing through CSA. For LSF job accounting, records are written to pacct at the start and end of each LSF job.

If a ULDB domain for LSF is defined in the lsf.conf file, LSF creates an IRIX job and applies the configured resource limits to it. LSF resource limits defined in lsb.queues or at job submission override IRIX job limits defined in the ULDB.

For more information on the Load Sharing Facility (LSF) system and workload management accounting, see the appropriate LSF documentation.


Daemon accounting is disabled at system shutdown (see “Setting Up CSA”). It can also be disabled at any time by the csaswitch command when used with the off operand. For example, to disable NQS accounting, execute the following command:

/usr/lib/acct/csaswitch -c off -n nqs

These dynamic changes using csaswitch are not saved across a system reboot.

Setting up User Exits

CSA accommodates the following user exits, which can be called from certain csarun states:

csarun state  User exit
============  =========
ARCHIVE1      /usr/lib/acct/csa.archive1
ARCHIVE2      /usr/lib/acct/csa.archive2
FEF           /usr/lib/acct/csa.fef
USEREXIT      /usr/lib/acct/csa.user

CSA accommodates the following user exit, which can be called from certain csaperiod states:

csaperiod state  User exit
===============  =========
USEREXIT         /usr/lib/acct/csa.puser

These exits allow an administrator to tailor the csarun procedure (or csaperiod procedure) to the individual site's needs by creating scripts to perform additional site-specific processing during daily accounting. The following comments also apply to csaperiod.

While executing, csarun checks in the ARCHIVE1, ARCHIVE2, FEF and USEREXIT states for a shell script with the appropriate name.

If the script exists, it is executed via the shell . (dot) command. If the script does not exist, the user exit is ignored. The . (dot) command will not execute a compiled program, but the user exit script can. csarun variables are available, without being exported, to the user exit script. csarun checks the return status from the user exit and if it is nonzero, the execution of csarun is terminated.

If CSA is run by a user without superuser permissions, the user exits must be both readable and executable by this user (see “Allowing Non Superusers to Execute CSA”).

Writing a User Exit

This section provides information about writing a user exit. The first example shows a user exit that saves the sorted pacct file after a daily accounting run. The second example shows a user exit that consolidates information for a daily report by project rather than by user.

Example 5-1. Save a sorted pacct File During a Daily Accounting Run

The csarun(1M) and csaperiod(1M) scripts use shell variables that are available for use within a user exit script. For example, the sorted pacct file is deleted after a successful daily accounting run. However, if you want to save that file, you could use any of the user exits that are executed after the sorted pacct file is created (see the csarun(1M) man page). Here is a simple user exit script to do just that:

#! /bin/sh
# Keep a copy of the sorted pacct file, which csarun would otherwise delete.
echo "Copying spacct file to /tmp/spacct"
cp "${SPACCT}" /tmp/spacct
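Because the script above always writes to the same path, each run overwrites the previous copy. The following variation is a sketch that keeps one copy per run. SPACCT and DTIME are csarun variables that appear in the examples in this section; SAVE_DIR and save_spacct are hypothetical names, and the sketch assumes that DTIME is already set when the chosen user exit runs.

```shell
#!/bin/sh
# Sketch of a user exit that keeps one copy of the sorted pacct file per run.
# SPACCT and DTIME are csarun variables; SAVE_DIR and save_spacct are
# hypothetical names introduced for this example.
save_spacct() {
    save_dir=${SAVE_DIR:-/tmp/spacct.save}
    # Fall back to a MMDDhhmm time stamp if DTIME is not set.
    stamp=${DTIME:-$(date +%m%d%H%M)}
    mkdir -p "$save_dir" || return 1
    cp "$SPACCT" "$save_dir/spacct.$stamp"
}

# Run only when csarun has set SPACCT, so the file is harmless elsewhere.
if [ -n "${SPACCT:-}" ]; then
    echo "Copying spacct file to ${SAVE_DIR:-/tmp/spacct.save}"
    save_spacct
fi
```

The status of the copy is the last status the script produces. Since csarun terminates when a user exit returns a nonzero status, a failed copy stops the accounting run instead of passing unnoticed.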

Example 5-2. Consolidated Information Report by Project Rather than by User

The default output for consolidated information from a daily report is as follows:

CONSOLIDATED INFORMATION REPORT BETWEEN 08/09 04:00   AND   08/09 14:48

PROJECT    USER    LOGIN    CPU-TIM  KCORE *  KVIRT *   IOWAIT [SECS]
 NAME       ID     NAME     [SECS]   CPU-MIN  CPU-MIN   BLOCK     RAW
======== ======== ======== ======== ======== ======== ======== ========
sysadm   0        root           30      536     1177       48        0
root     4        sys             0        5       11        0        0
csa      5        adm             5       24      194        1        0
root     1461     security        1        2       16        0        0
nqe      10320    user12          2        5       68        1        0

To show consolidated information for a daily report by project rather than by user, use the csacon(1M) and csacrep(1M) commands with the project option as follows:

/usr/lib/acct/csacon -Ap -s /tmp/spacct > /tmp/cacct_p
/usr/lib/acct/csacrep -hpcw < /tmp/cacct_p > /tmp/csacrep.out.p

The output is as follows:

PROJECT    USER    LOGIN    CPU-TIM  KCORE *  KVIRT *   IOWAIT [SECS]
 NAME       ID     NAME     [SECS]   CPU-MIN  CPU-MIN   BLOCK     RAW
======== ======== ======== ======== ======== ======== ======== ========
root     Unknown  Unknown         1        8       28        0        0
sysadm   Unknown  Unknown        31      537     1187       49        0
csa      Unknown  Unknown         5       24      194        1        0
nqe      Unknown  Unknown         2        7       83        1        0

The following example /usr/lib/acct/csa.user script performs the same operation as the csacon(1M) and csacrep(1M) commands above and adds the consolidated information report by project to the daily report:

#!/sbin/sh
#
csacon ${ALLJOBS} -p -s ${SPACCT} > ${SUM_DIR}/cacct_p.${DTIME} \
        2> ${NITE_DIR}/Econ.${DTIME}
if [ ${?} -ne 0 ]
then
        CSAERRMSG="REPORT - csacon errors \
                \n\tSee ${NITE_DIR}/Econ.${DTIME} and/or ${NITE_DIR}/fd2log"
        ERROR_EXIT
fi
chgrp ${CHGRP} ${SUM_DIR}/cacct_p.${DTIME}
#
csacrep -hpcw < ${SUM_DIR}/cacct_p.${DTIME} \
> ${SUM_DIR}/conrpt_p.${DTIME} 2> ${NITE_DIR}/Ecrpt_p.${DTIME}
if [ ${?} -ne 0 ]
then
        CSAERRMSG="REPORT - csacrep errors \
                \n\tSee ${NITE_DIR}/Ecrpt_p.${DTIME} and/or ${NITE_DIR}/fd2log"
        ERROR_EXIT
fi
#
cd ${SUM_DIR}
echo "${RPTHDR}\n" > tmprprt
echo "Put some header message here\n"  >> tmprprt
cat conrpt_p.${DTIME} >> tmprprt
pr -h "${DAYHDR} ${SYSNAME} ${RELMSG}" tmprprt >> rprt.${DTIME}
#

If you want the new binary data files (cacct_p in the user exit example, above) to be used with the periodic report, you need to create a user exit for /usr/lib/acct/csaperiod.


Charging for NQS Jobs

By default, SBUs are calculated for all NQS jobs regardless of the job's NQS termination code. If you do not want to bill portions of an NQS request, set the appropriate NQS_TERM_xxxx variable (termination code) in the /etc/csa.conf file to 0, which sets the SBU for this portion to 0.0. By default, all portions of a request are billed.

The following table describes the termination codes:

Code              Description
================  ===========
NQS_TERM_EXIT     Generated when the request finishes running and is no longer in a queued state. At NQS shutdown time, requests that specified both the -nc (no checkpoint) and -nr (no rerun) options for qsub also have NQS_TERM_EXIT records written. In addition, this record is written for requests that specified the -nr option for qsub and were running at the time of a system crash.
NQS_TERM_REQUEUE  Written for running requests that are checkpointed and then requeued when NQS shuts down.
NQS_TERM_PREEMPT  Written when a request is preempted with the qmgr preempt request command.
NQS_TERM_HOLD     Written for a request that is checkpointed with the qmgr hold request command. The hold request command differs from the checkpoint done at daemon shutdown time because a "hold" keeps the job from being scheduled until a qmgr release command is executed.
NQS_TERM_OPRERUN  Written when a request is rerun with the qmgr rerun request command. At NQS shutdown time, jobs that cannot be checkpointed and do not have the -nr (no rerun) option for qsub specified have this type of termination record written. The requests are requeued with this status.
NQS_TERM_RERUN    Written when a request is a non-operator rerun request.

Charging for Workload Management Jobs

By default, SBUs are calculated for all workload management jobs regardless of the workload management termination code of the job. If you do not want to bill portions of a workload management request, set the appropriate WKMG_TERM_xxxx variable (termination code) in the /etc/csa.conf file to 0, which sets the SBU for this portion to 0.0. By default, all portions of a request are billed.

The following table describes the termination codes:

Code               Description
=================  ===========
WKMG_TERM_EXIT     Generated when the request finishes running and is no longer in a queued state.
WKMG_TERM_REQUEUE  Written for a request that is requeued.
WKMG_TERM_HOLD     Written for a request that is checkpointed and held.
WKMG_TERM_RERUN    Written when a request is rerun.
WKMG_TERM_MIGRATE  Written when a request is migrated.


Note: The above descriptions of the termination codes are very generic. Different workload managers will tailor the meaning of these codes to suit their products. LSF currently only uses the WKMG_TERM_EXIT termination code.


Tailoring CSA Shell Scripts and Commands

Modify the following variables in /etc/csa.conf if necessary:

Variable    Description
==========  ===========
MAIL_LIST   List of users to whom mail is sent if fatal errors are detected in the accounting shell scripts. The default is root and adm.
WMAIL_LIST  List of users to whom mail is sent if warning errors are detected by the accounting scripts at cleanup time. The default is root and adm.
MIN_BLKS    Minimum number of free blocks needed on the file system on which the /var/adm/acct directory resides to run csarun or csaperiod. The default is 2000 free blocks. Block size is 1024 bytes.
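With 1024-byte blocks, the default MIN_BLKS of 2000 corresponds to roughly 2 Mbytes of free space. If you want to confirm the free space by hand before starting an accounting run, a check along the following lines is one possibility. This is only a sketch: free_blocks and has_min_blocks are hypothetical helpers, and the sketch assumes a df command that accepts the POSIX -k and -P options. csarun and csaperiod perform their own test.

```shell
#!/bin/sh
# Hypothetical pre-flight check; csarun and csaperiod perform their own test.

# Print the free space, in 1024-byte blocks, of the file system holding $1.
# Assumes a df that accepts the POSIX -k and -P options.
free_blocks() {
    df -kP "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when at least $2 blocks are free on the file system holding $1.
has_min_blocks() {
    [ "$(free_blocks "$1")" -ge "$2" ]
}

ACCT_DIR=${ACCT_DIR:-/var/adm/acct}   # directory named in the MIN_BLKS description
MIN_BLKS=${MIN_BLKS:-2000}            # documented default
if [ -d "$ACCT_DIR" ] && has_min_blocks "$ACCT_DIR" "$MIN_BLKS"; then
    echo "enough free space to run csarun"
fi
```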

Using at to Execute csarun

You can use the at command instead of cron to execute csarun periodically. If your system is down when csarun is scheduled to run via cron, csarun will not be executed until the next scheduled time. On the other hand, at jobs execute when the machine reboots if their scheduled execution time was during a down period.

You can execute csarun by using at in several ways. For example, a separate script can be written to execute csarun and then resubmit the job at a specified time. Also, an at invocation of csarun could be placed in a user exit script, /usr/lib/acct/csa.user, that is executed from the USEREXIT section of csarun. For more information, see “Setting up User Exits”.

Allowing Non Superusers to Execute CSA

Your site may want to allow users without superuser permissions to run CSA accounting. CSA can be run by users who are in the group adm and have the CAP_ACCT_MGT capability. See the capability(4) and capabilities(4) man pages for more information on the capability mechanism, which provides fine-grained control over the privileges of a process.

The following steps describe the process of setting up CSA so it is executed automatically on a daily and periodic basis by a user without superuser permissions. In this example, the user without superuser permissions is adm:

  1. Ensure that user adm is a member of group adm and has the CAP_ACCT_MGT capability.

  2. Ensure that the following user exits (if they exist) are both readable and executable by user adm:

    • /usr/lib/acct/csa.archive1

    • /usr/lib/acct/csa.archive2

    • /usr/lib/acct/csa.fef

    • /usr/lib/acct/csa.user

    • /usr/lib/acct/csa.puser

  3. Follow steps 1 through 5 of “Setting Up CSA”, to set up system billing units, record system boot times, and turn off accounting before system shutdown.

  4. Include an entry similar to the one shown below in /var/spool/cron/crontabs/root so that cron automatically runs dodisk(1M):

    0 2 * * 4 if /etc/chkconfig csaacct; then /usr/lib/acct/dodisk -c 2> /var/adm/acct/nite/csa/dk2log; fi

    The dodisk command must be executed by root, because no other user has the correct permissions to read /dev/dsk/*. For more information on the dodisk(1M) command, see the dodisk(1M) man page.

  5. Include entries similar to the ones shown below in /var/spool/cron/crontabs/adm so that user adm automatically runs daily accounting by using cron:

    0 4 * * 1-6 su adm -C CAP_ACCT_MGT+pi -c "if /etc/chkconfig csaacct;
    then /usr/lib/acct/csarun 2> /var/adm/acct/nite/csa/fd2log; fi"
    5 * * * 1-6 su adm -C CAP_ACCT_MGT+pi -c "if /etc/chkconfig csaacct;
    then /usr/lib/acct/csackpacct; fi"

    The csarun command should be executed at a time that allows dodisk to complete. If dodisk does not complete before csarun executes, disk accounting information may be missing or incomplete.

  6. To run monthly accounting, place an entry similar to the one below in /var/spool/cron/crontabs/adm (this command generates a monthly report on all consolidated data files found in /var/adm/acct/sum/csa and then deletes those data files):

      0 5 1 * * if /etc/chkconfig csaacct;
    then /usr/lib/acct/csaperiod -r 2> /var/adm/acct/nite/csa/pd2log; fi
    

  7. Update the holidays file as described in “Setting Up CSA”.


Note: The cron entries listed above only work when the login shell of user adm is sh or ksh.


Using an Alternate Configuration File

By default, the /etc/csa.conf configuration file is used when any of the CSA commands are executed. You can specify a different file by setting the shell variable CSACONFIG to another configuration file, and then executing the CSA commands.

For example, you would execute the following commands to use the configuration file /tmp/myconfig while executing csarun:

CSACONFIG=/tmp/myconfig
export CSACONFIG
/usr/lib/acct/csarun 2> /var/adm/acct/nite/fd2log

CSA Reports

You can use CSA to create accounting reports. The reports can be used to help track system usage, monitor performance, and charge users for their time on the system.

The CSA daily reports are located in the /var/adm/acct/sum/csa directory; periodic reports are located in the /var/adm/acct/fiscal/csa directory. To view the reports, go to the ASCII file rprt.MMDDhhmm in the report directories.

The CSA reports contain more detailed data than the other accounting reports. For CSA accounting, daily reports are generated by the csarun command. The daily report includes the following:

  • disk usage statistics

  • unfinished job information

  • command summary data

  • consolidated accounting report

  • last login information

  • daemon usage report

Periodic reports are generated by the csaperiod command. You can also create a disk usage report using the diskusg command.

CSA Daily Report

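This section describes the following reports:

  • Consolidated Information Report

  • Unfinished Job Information Report

  • Disk Usage Report

  • Command Summary Report

  • Last Login Report

  • Daemon Usage Report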

Consolidated Information Report

The Consolidated Information Report is sorted by user ID and then project ID. The following usage values are the total amount of resources used by all processes for the specified user and project during the reporting period.

Heading          Description
===============  ===========
PROJECT NAME     Project associated with this resource usage information
USER ID          User identifier
LOGIN NAME       Login name for the user identifier
CPU_TIME         Total accumulated CPU time in seconds
KCORE * CPU-MIN  Total accumulated amount of Kbytes of core (physical) memory used per minute of CPU time
KVIRT * CPU-MIN  Total accumulated amount of Kbytes of virtual memory used per minute of CPU time
IOWAIT BLOCK     Total accumulated block I/O wait time in seconds
IOWAIT RAW       Total accumulated raw I/O wait time in seconds

Unfinished Job Information Report

The Unfinished Job Information Report describes jobs which have not terminated and are recycled into the next accounting period.

Heading     Description
==========  ===========
JOB ID      Job identifier
USERS       Login name of the owner of this job
PROJECT ID  Project identifier associated with this job
STARTED     Beginning time of this job

Disk Usage Report

The Disk Usage Report describes the amount of disk resource consumption by login name.

There are no column headings for this report. The first column gives the user identifier. The second column gives the login name associated with the user identifier. The third column gives the number of disk blocks used by this user.

Command Summary Report

The Command Summary Report summarizes command usage during this reporting period. The usage values are the total amount of resources used by all invocations of the specified command. Commands which were run only once are combined together in the "***other" entry. Only the first 44 command entries are displayed in the daily report. The periodic report displays all command entries.

Heading              Description
===================  ===========
COMMAND NAME         Name of the command (program)
NUMBER OF COMMANDS   Number of times this command was executed
TOTAL KCORE-MINUTES  Total amount of Kbytes of core (physical) memory used per minute of CPU time
TOTAL KVIRT-MINUTES  Total amount of Kbytes of virtual memory used per minute of CPU time
TOTAL CPU            Total amount of CPU time used in minutes
TOTAL REAL           Total amount of real (wall clock) time used in minutes
MEAN SIZE KCORE      Average amount of core (physical) memory used in Kbytes
MEAN SIZE KVIRT      Average amount of virtual memory used in Kbytes
MEAN CPU             Average amount of CPU time used in minutes
HOG FACTOR           Total CPU time used divided by the total real time (elapsed time)
K-CHARS READ         Total number of characters read in Kbytes
K-CHARS WRITTEN      Total number of characters written in Kbytes
BLOCKS READ          Total number of blocks read
BLOCKS WRITTEN       Total number of blocks written
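The HOG FACTOR column is the only derived ratio in this report, so a short sketch may help when you check a value by hand. The hog_factor function is a hypothetical helper and the figures are invented; it divides TOTAL CPU by TOTAL REAL as described above.

```shell
#!/bin/sh
# hog_factor is a hypothetical helper, not a CSA command.
#   $1  total CPU time in minutes
#   $2  total real (wall clock) time in minutes
hog_factor() {
    awk -v cpu="$1" -v real="$2" 'BEGIN {
        # Avoid dividing by zero when no real time was recorded.
        if (real <= 0) { print "n/a"; exit }
        printf "%.2f\n", cpu / real
    }'
}

hog_factor 30 120   # command was on the CPU for a quarter of its elapsed time
hog_factor 45 45    # command was on the CPU for all of its elapsed time
```

The two calls print 0.25 and 1.00. A value near 1 indicates a command that spends most of its elapsed time on the CPU, while a value near 0 indicates a command that mostly waits.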

Last Login Report

The Last Login Report shows the last login date for each login account listed.

There are no column headings for this report. The first column is the last login date. The second column is the login account name.

Daemon Usage Report

The Daemon Usage Report shows usage of the NQS or workload management daemons and the tape daemon. It consists of several individual reports, depending on whether there was NQS, workload management, or tape daemon activity within the reporting period.

The Job Type Report gives the interactive and NQS or workload management job usage count.

Heading          Description
===============  ===========
Job Type         Type of job (interactive or NQS or workload management)
Total Job Count  Number and percentage of jobs per job type
Tape Jobs        Number and percentage of tape jobs associated with these interactive and NQS or workload management jobs

The CPU Usage Report gives the NQS or workload management and interactive job usage related to CPU usage.

Heading          Description
===============  ===========
Job Type         Type of job (interactive or NQS or workload management)
Total CPU Time   Total amount of CPU time used in seconds and percentage of CPU time
System CPU Time  Amount of system CPU time used of the total and the percentage of the total time which was system CPU time usage
User CPU Time    Amount of user CPU time used of the total and the percentage of the total time which was user CPU time usage

The Tape Usage Report gives the NQS or workload management and interactive job usage related to tape activity for these jobs.

Heading         Description
==============  ===========
Job Type        Type of job (interactive or NQS or workload management)
Device Group    Tape device group name
Rsv Time        Tape reservation time in seconds
Mounts          Number of tape mounts
KBytes Read     Tape amount read in Kbytes
KBytes Written  Tape amount written in Kbytes
User CPU        Amount of user CPU time used in seconds
Sys CPU         Amount of system CPU time used in seconds

The Batch Queue Report gives the following information for each NQS or workload management queue.

Heading         Description
==============  ===========
Queue Name      Name of the NQS or workload management queue
Number of Jobs  Number of jobs initiated from this queue
CPU Time        Amount of system and user CPU times used by jobs from this queue and percentage of CPU time used
Used Tapes      How many jobs from this queue used tapes
Ave Queue Wait  Average queue wait time before initiation in seconds

Periodic Report

This section describes two periodic reports as follows:

Consolidated accounting report

The following usage values for the Consolidated accounting report are the total amount of resources used by all processes for the specified user and project during the reporting period.

Heading          Description
===============  ===========
PROJECT NAME     Project associated with this resource usage information
USER ID          User identifier
LOGIN NAME       Login name for the user identifier
CPU_TIME         Total accumulated CPU time in seconds
KCORE * CPU-MIN  Total accumulated amount of Kbytes of core (physical) memory used per minute of CPU time
KVIRT * CPU-MIN  Total accumulated amount of Kbytes of virtual memory used per minute of CPU time
IOWAIT BLOCK     Total accumulated block I/O wait time in seconds
IOWAIT RAW       Total accumulated raw I/O wait time in seconds
DISK BLOCKS      Total number of disk blocks used
DISK SAMPLES     Number of times disk accounting was run to obtain the disk blocks used value
FEE              Total fees charged to this user from csachargefee(1M)
SBUs             System billing units charged to this user and project

Command summary report

The following information summarizes command usage during the defined reporting period. The usage values are the total amount of resources used by all invocations of the specified command. Unlike the daily command summary report, the periodic command summary report displays all command entries. Commands executed only once are not combined together into an "***other" entry but are listed individually in the periodic command summary report.

Heading              Description
===================  ===========
COMMAND NAME         Name of the command (program)
NUMBER OF COMMANDS   Number of times this command was executed
TOTAL KCORE-MINUTES  Total amount of Kbytes of core (physical) memory used per minute of CPU time
TOTAL KVIRT-MINUTES  Total amount of Kbytes of virtual memory used per minute of CPU time
TOTAL CPU            Total amount of CPU time used in minutes
TOTAL REAL           Total amount of real (wall clock) time used in minutes
MEAN SIZE KCORE      Average amount of core (physical) memory used in Kbytes
MEAN SIZE KVIRT      Average amount of virtual memory used in Kbytes
MEAN CPU             Average amount of CPU time used in minutes
HOG FACTOR           Total CPU time used divided by the total real time (elapsed time)
K-CHARS READ         Total number of characters read in Kbytes
K-CHARS WRITTEN      Total number of characters written in Kbytes
BLOCKS READ          Total number of blocks read
BLOCKS WRITTEN       Total number of blocks written

CSA and Existing IRIX Software

This section describes some changes and additions to existing documentation for the IRIX operating system.

acct(1M) Man Page

The acctdisk command contains a -c option that reads standard input and converts records to cacct format, which it writes to standard output.

acctsh(1M) Man Page

The lastlogin(1M) command contains a -c option with an infile argument that specifies that lastlogin should process infile, which is a consolidated accounting file in cacct format.

The dodisk command information is now contained in a new dodisk(1M) man page.

dodisk(1M) Man Page

The IRIX 6.5.8 release introduced a new dodisk(1M) man page. The dodisk command information was previously in the acctsh(1M) man page.

explain(1) Man Page

CSA uses the message catalog system. There are two files that CSA uses for the message catalog:

  • /usr/lib/locale/C/LC_MESSAGES/acct.cat

  • /usr/lib/locale/C/LC_MESSAGES/acct.exp

The group code acct for the CSA Software Product has been added to the explain(1) page in the 6.5.8f release of the IRIX operating system.

capabilities(4) Man Page

Basic accounting and CSA require the same capability. CAP_ACCT_MGT is the privilege required to use accounting setup system calls, acct(2). The same privilege is required to use the new acctctl(3c) call. acctctl(3c) has been added to the capabilities(4) man page in the 6.5.8f release of the IRIX operating system.

Migrating Accounting Data

No changes have been made to basic accounting or extended accounting records. There is no migration of accounting data between these two IRIX accounting methods and CSA. That is, basic accounting commands should continue to be used with basic accounting, and third party packages should continue to be used with extended accounting data.

CSA accounting commands can only be used with CSA accounting data. CSA commands cannot process basic accounting or extended accounting records. Basic accounting commands cannot process CSA generated accounting data.

CSA Man Pages

The man command provides online help on all resource management commands. To view a man page online, type man commandname.

User-Level Man Pages

The following user-level man pages are provided with CSA software:

User-level man page  Description
===================  ===========
csacom(1)            Searches and prints the CSA process accounting files.
ja(1)                Starts and stops user job accounting information.

Administrator Man Pages

The following administrator man pages are provided with CSA software:

Administrator man page  Description
======================  ===========
csaaddc(1M)             Combines cacct records.
csabuild(1M)            Organizes accounting records into job records.
csachargefee(1M)        Charges a fee to a user.
csackpacct(1M)          Checks the size of the CSA process accounting file.
csacms(1M)              Summarizes command usage from per-process accounting records.
csacon(1M)              Condenses records from the sorted pacct file.
csacrep(1M)             Reports on consolidated accounting data.
csadrep(1M)             Reports daemon usage.
csaedit(1M)             Displays and edits the accounting information.
csagetconfig(1M)        Searches the accounting configuration file for the specified argument.
csajrep(1M)             Prints a job report from the sorted pacct file.
csarecy(1M)             Recycles unfinished jobs into the next accounting run.
csaswitch(1M)           Checks the status of, enables or disables the different types of CSA, and switches accounting files for maintainability.
csaverify(1M)           Verifies that the accounting records are valid.