Chapter 1. Introduction to System Configuration and Operation

One of the first jobs of a system administrator is to bring a system online, either on an existing network or standing alone, and to configure it to meet the needs for which it was installed. This configuration usually involves installing any necessary software and hardware, setting the name and network address of the system, creating accounts for the expected users, and generally taking a system from out-of-the-box uniformity and customizing it to meet your preferences and your users' needs.

Hardware installation is described in the documentation for the hardware. Software installation is described in the IRIX Admin: Software Installation and Licensing volume. This guide describes the tasks you perform once the system has been powered up, to bring it from its initial as-distributed state to the state in which you or your users will use it.

This guide assists you by describing the procedures that you, the system administrator, use to configure systems and by explaining why these procedures exist and why they work the way they do. Some of these tasks are typically performed only at times of major change: when a system is commissioned, when ownership changes, or when there has been a significant hardware upgrade. Others are ongoing tasks or tasks that may come up during standard usage of an installed system.

As system administrator, you should familiarize yourself with the graphical interface tools available through the System Manager. You can conveniently perform many common administrative tasks with this tool. This document does not describe the System Manager, but instead discusses how to use the command line and file interface to perform administrative functions.

This chapter provides information on the general nature of IRIX system administration. Many good books on system administration are listed in Appendix F of this guide and are available through computer bookstores. SGI systems are similar to those described in many of these books, but they also differ in significant areas. The principles of good system administration, though, are constant.

Principles of Good System Administration

The following sections outline basic principles of good system administration. Each administrator must make individual decisions about the best practices for a site. The principles discussed here are generally considered to be wise and safe practices.

Account Passwords

To make your site as secure as possible, each user should have an account, with a unique user ID number, and each account should have a password. Users should never give out their passwords to anyone else under any circumstances. For more information on passwords and system security, see the IRIX Admin: Backup, Security, and Accounting volume. For additional accounting information, see IRIX Admin: Resource Administration.

Superuser (root) Account Access Restriction

Most system administration is performed while the system administrator is logged in as root (the superuser). This account is different from an ordinary user account because root has access to all system files and is not constrained by the usual system of permissions that control access to files, directories, and programs. The root account exists so that the administrator can perform all necessary tasks on the system while maintaining the privacy of user files and the integrity of system files. Other operating systems that do not differentiate between users have little or no means of providing for the privacy of users' files or for keeping system files uncorrupted. UNIX-based systems place the power to override system permissions and to change system files only with the root account.

All administrators at your site should have regular user accounts for their ordinary user tasks. The root account should be used only for necessary system administration tasks.

To obtain the best security on a multiuser system, restrict access to the root account. On workstations, the primary user of the workstation can generally use the root account safely, though most users should not have access to the root account on other users' workstations.

Make it a policy to give root passwords to as few people as is practical. Some sites maintain locked file cabinets of root passwords so that the passwords are not widely distributed but are available in an emergency.

User Privacy

On a multiuser system, users may have access to personal files that belong to others. Such access can be controlled by setting file permissions with the chmod(1) command. Default permissions are controlled by the umask shell parameter. (See “Default File Permissions (umask)” in Chapter 5 for information on setting umask.)

By default, it is easy for users to exchange data because permission to read files is granted to everyone. Users can change this default for their own files. However, many users do not set their umask, and they forget to change the access permissions of personal files. Make sure users are aware of file permissions and of your policy on examining other users' personal files. You can make this policy as lenient or stringent as you deem necessary.
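The effect of umask and chmod on file permissions can be sketched as follows. This is a minimal illustration, not a recommended site policy: the scratch directory and file names are hypothetical, and the umask values 022 and 077 are chosen only to contrast a permissive default with a private one.

```shell
# Scratch directory under /tmp (or $TMPDIR); the name is hypothetical.
workdir="${TMPDIR:-/tmp}/umask_demo.$$"
mkdir "$workdir"

# umask 022 masks write permission for group and others:
# new files are created with mode 644 (rw-r--r--), readable by everyone.
umask 022
: > "$workdir/shared.txt"
: > "$workdir/report.txt"

# chmod tightens an existing file: remove all group and other access.
chmod go-rwx "$workdir/report.txt"

# umask 077 masks all group and other permissions:
# new files are created with mode 600 (rw-------), owner only.
umask 077
: > "$workdir/private.txt"

# Compare the permission columns of the three files.
ls -l "$workdir"
```

A umask set on the command line lasts only for the current shell session; users who want a private default every time they log in would typically set it in their shell startup file.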

Password File Check

At least once a week, run the pwck(1M) and grpck(1M) programs to check your /etc/passwd and /etc/group files for errors. You can automate this process using the cron(1) command, and you can direct cron to mail the results of the checks to your user account. For more information on using cron to automate your routine tasks, see “Task Scheduling with the at, batch, and cron Commands” in Chapter 2.
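As a rough sketch of the weekly schedule, a crontab entry for these checks might look like the one built below. The time of day (04:00 on Mondays) and the scratch file name are assumptions chosen for illustration, and the entry assumes that pwck and grpck are on the search path that cron gives its jobs.

```shell
# Crontab fields: minute hour day-of-month month day-of-week command
# This entry runs both checks at 04:00 every Monday (day-of-week 1).
# Assumption: pwck and grpck are on the PATH that cron gives its jobs;
# use full path names if they are not.
entry='0 4 * * 1 pwck; grpck'

# Write the entry to a file for review before installing it with crontab(1).
# The file name is hypothetical.
cronfile="${TMPDIR:-/tmp}/pwcheck.cron.$$"
printf '%s\n' "$entry" > "$cronfile"
cat "$cronfile"
```

Because cron mails any output produced by a job to the owner of the crontab, no explicit mail command is needed in the entry. Note that installing a file with crontab replaces the user's existing crontab, so merge the new entry with the output of crontab -l first.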

The pwck and grpck commands read the password and group files and report any incorrect or inconsistent entries. Any inconsistency with normal IRIX operation is reported. For example, if you have /etc/passwd entries for two user names with the same user identification (UID) number, pwck reports this as an error. grpck performs a similar function on the /etc/group file. The standard passwd file shipped with the system can generate several errors.
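To illustrate the kind of inconsistency described above, the following sketch finds duplicate UID numbers in a passwd-format file. It is not how pwck is implemented and it checks far less than pwck does; the function name dup_uids and the sample accounts are hypothetical.

```shell
# dup_uids FILE: list UIDs that appear on more than one passwd-format line.
dup_uids() {
    awk -F: '
        {
            # Field 1 is the login name, field 3 is the numeric UID.
            if ($3 in names) { names[$3] = names[$3] "," $1 }
            else { names[$3] = $1 }
            count[$3]++
        }
        END {
            for (uid in count)
                if (count[uid] > 1) print uid ": " names[uid]
        }
    ' "$1" | sort
}

# Hypothetical sample data; a real check would pass /etc/passwd instead.
sample="${TMPDIR:-/tmp}/passwd.sample.$$"
cat > "$sample" <<'EOF'
root:x:0:0:Super-User:/:/bin/sh
alice:x:1001:20:Alice:/usr/people/alice:/bin/csh
bob:x:1002:20:Bob:/usr/people/bob:/bin/csh
carol:x:1001:20:Carol:/usr/people/carol:/bin/csh
EOF

dup_uids "$sample"
```

For the sample data, the function prints the single line 1001: alice,carol, because those two login names share one UID.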

Hardware Change Check

Be aware that changing hardware configurations can affect the system, even if the change you make seems simple. Make sure you are available to help users with problems after the system is changed in any way.

Software Upgrade Check

Changing the software also affects the system, even if the change you make is as trivial as a small upgrade to a new version of an application. Some software installations can overwrite customized configuration files. Users may have scripts that assume that a utility or program is in a certain directory, and a software upgrade may move the utility. Or the new version of the software simply may not work in the same way as the old version.

Whenever you change the software configuration of your systems, let your users know and be ready to perform some detective work if seemingly unrelated software suddenly stops working as a result. Make sure you are available to help users with problems after the system is changed in any way.

Before you upgrade a system to new software, check your user community to see which parts of the old software they use, and if they might be inconvenienced by the upgrade. Often users need extra time to switch from one release of an application to a newer version.

If possible, do not strand your users by completely removing the old software. Try to keep both versions on the system until everyone switches to the new version.

System Unavailability Notification

In general, try to provide the user community as much notice as possible about events affecting the use of the system. When the system must be taken out of service, also tell the users when to expect the system to be available. Use the message-of-the-day file /etc/motd to keep users informed about changes in hardware, software, policies, and procedures.

Many administrative tasks require the system to be shut down to a run level other than the multiuser state. This means that conventional users cannot access the system. Just before the system is taken out of the multiuser state, users on the system are requested to log off. You should do these types of tasks when they interfere the least with the activities of the user community.

Sometimes situations arise that require the system to be taken down with little or no notice provided to the users. This is often unavoidable, but try to give at least 5 to 15 minutes' notice, if possible.

At your discretion, the following actions should be prerequisites for any task that requires the system to leave the multiuser state:

  • When possible, perform service tasks during periods of low system use. For scheduled actions, use /etc/motd to inform users of future actions.

  • Check to see who is logged in before taking any actions that would affect a logged-in user. You can use the /etc/whodo, /bin/who, or /usr/bsd/w command to see who is on the system. You may also wish to check for large background tasks, such as background compilations, by executing ps -ef.

  • If the system is in use, provide the users advance warning about changes in system states or pending maintenance actions. For immediate actions, use the /etc/wall command to send a broadcast message announcing that the system will be taken down at a given time. Give the users a reasonable amount of time (5 to 15 minutes) to terminate their activities and log off before taking the system down.
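The checks and the broadcast described in the list above can be sketched as a short sequence. The function name compose_notice and the wording of the message are hypothetical. The commands that query or affect the live system are shown as comments so that the sketch itself has no side effects.

```shell
# compose_notice MINUTES RETURN_TIME: print a shutdown announcement.
compose_notice() {
    printf 'System going down for maintenance in %s minutes.\n' "$1"
    printf 'Expected back in service at %s.\n' "$2"
    printf 'Please save your work and log off.\n'
}

# Preview the message locally.
compose_notice 15 18:00

# On the live system, as root (commented out so the sketch has no side effects):
#   who                                  # see who is logged in
#   ps -ef                               # look for large background jobs
#   compose_notice 15 18:00 | /etc/wall  # broadcast to all logged-in users
```

Stating both the time remaining and the expected return to service in one message covers the two things users most need to know before they log off.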

Malicious Activity Policy

Set a policy regarding malicious activities that covers:

  • Deliberately crashing the system

  • Breaking into other accounts; for example, using password-guessing and password-stealing programs

  • Forging e-mail from other users

  • Creating and unleashing malicious programs, such as worm and virus processes

Make sure that all users at the site are aware that these sorts of activities are potentially very harmful to the community of users on the system. Penalties for malicious behavior should be severe and the enforcement should be consistent.

The most important thing you can do to prevent malicious damage to the system is to restrict access to the root password.

System Log Book Maintenance

It is important to keep a complete set of records about each system you administer. A system log book is a useful tool when troubleshooting transient problems or when trying to establish system operating characteristics over a period of time. Keeping a hard copy book is important, since you cannot refer to an online log if you have trouble starting the system.

Some of the information to consider entering into the log book for each system you administer is:

  • Maintenance records (dates and actions)

  • Printouts of error messages and diagnostic phases

  • Equipment and system configuration changes (dates and actions), including serial numbers of various parts (if applicable)

  • Copies of important configuration files

  • Output of prtvtoc(1M) for each disk on the system

  • /etc/passwd file

  • /etc/group file

  • /etc/fstab file

  • /etc/exports file
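One way to gather the configuration files listed above for printing is a small helper that copies them into a dated directory. This is only a sketch under stated assumptions: the function name snapshot_configs and the destination path in the example are hypothetical, and the function copies files only. It does not run prtvtoc.

```shell
# snapshot_configs DEST FILE...: copy each readable FILE into directory DEST.
snapshot_configs() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -r "$f" ]; then
            cp "$f" "$dest/"
        else
            # Report files that are missing or unreadable instead of failing.
            echo "skipped (not readable): $f" >&2
        fi
    done
}

# Example invocation (commented out; the destination path is hypothetical):
#   snapshot_configs /var/tmp/logbook.$(date +%Y%m%d) \
#       /etc/passwd /etc/group /etc/fstab /etc/exports
# The prtvtoc(1M) output for each disk can be redirected into the same directory.
```

Printing the contents of the dated directory after each significant configuration change keeps the hard-copy log book current.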

The format of the system log and the types of items noted in the log should follow a logical structure. Think of the log as a diary that you update periodically. To a large extent, how you use your system dictates the form of the system log and the importance of maintaining it.

In addition to the system log, you may find it helpful to keep a user trouble log. The problems that users encounter fall into patterns. If you keep a record of how problems are resolved, you do not have to start from scratch when a problem recurs. Also, a user trouble log can be very useful for training new administrators in the specifics of your local system, and for helping them learn what to expect.

User Request Service

Provide a convenient way for your users to report problems. For example, set up a trouble mail alias, so that users with problems can simply send mail to trouble for assistance. Refer to IRIX Admin: Networking and Mail for more information on mail aliases.

System Administrator Task List

A system administrator has many varied responsibilities. Some of the most common responsibilities addressed in this guide are:

Operations 

Ensuring that systems stay up and running, scheduling preventive maintenance downtime, adding new users, installing new software, and updating the /etc/motd and /etc/issue files. See Chapter 2, “Making the Most of the IRIX System”, Chapter 3, “System Startup, Shutdown, and Run Levels”, and Chapter 4, “Configuring the IRIX Operating System”. Also see Chapter 5, “System Administration in a Multiuser Environment” and Chapter 9, “Using the Command (PROM) Monitor”.

Failure analysis 

Troubleshooting by reading system logs and drawing on past experience. See Chapter 1, “Introduction to System Configuration and Operation”.

Capacity planning 

Knowing the general level of system use and planning for additional resources when necessary. See Chapter 6, “Configuring Disk and Swap Space”, Chapter 7, “Managing User Processes”, and Chapter 10, “System Performance Tuning”.

System tuning 

Tuning the kernel and user process priorities for optimum performance. See Chapter 10, “System Performance Tuning” and Appendix A, “IRIX Kernel Tunable Parameters”.

Application tuning 

Tuning your applications to more closely follow your system's resource limits. See Appendix C, “Application Tuning”.

Resource management 

Planning process and disk accounting and other resource sharing. See the IRIX Admin: Backup, Security, and Accounting guide and IRIX Admin: Resource Administration.

Networking 

Interconnecting systems, modems, and printers. See the IRIX Admin: Networking and Mail guide.

Security 

Maintaining sufficient security against break-ins as well as maintaining internal privacy and system integrity. See the IRIX Admin: Backup, Security, and Accounting guide.

User migration 

Helping users work on all workstations at a site. See the IRIX Admin: Networking and Mail guide.

User education 

Helping users develop good habits and instructing them in the use of the system. See Chapter 5, “System Administration in a Multiuser Environment” and Chapter 8, “Using the File Alteration Monitor”.

Backups 

Creating and maintaining system backups. See the IRIX Admin: Backup, Security, and Accounting guide.

If you are using the Array Services product, you will need to perform additional configuration. See Getting Started With Array Systems.

Administration Tools Overview

Depending on the exact configuration of your system, you may have the following tools available for performing system administration:

System Manager  

This tool provides a quick and easy method of performing most system administration tasks. The System Manager is available only on those systems that have graphics capability.

Command line tools 

The IRIX system provides a rich set of system administration tools that have command line interfaces. These are especially useful for automatically configuring systems with shell scripts and for repairing the system in unusual circumstances, such as when you must log in remotely from another system.

For example, using command line tools, a site administrator can alter the system automatically at designated times in the future (for instance, to distribute configuration files at regular intervals). These commands are available on all IRIX systems.

The suite of IRIX Admin guides is primarily concerned with the command-line interface and direct system file manipulation. Refer to the Personal System Administration Guide for a GUI approach to system administration tasks.