Chapter 8. Troubleshooting

This chapter provides information about diagnosing a variety of hardware problems using the blinking and colored LEDs on the front of the system. It also provides you with information on how to access and run diagnostics, how to use the non-maskable interrupt (NMI) button, how to recover from a system crash, and what to do when you have lost or forgotten the system password.


Diagnosing the Problem

If you suspect there is a problem with your hardware, use these diagnostics to help isolate and solve the problem:

  • Lightbar LEDs - Front of Octane, Diagnostic Chart, Figure 8-1

  • Lightbar LEDs - Front of Octane, Diagnostic Chart, Figure 8-2

  • Lightbar LEDs - Front of Octane, Diagnostic Chart, Figure 8-3

See also “Using the NMI Button”.

Using the LEDs on the Lightbar

Figure 8-1. Lightbar LEDs - Front of Octane, Diagnostic Chart #1


Figure 8-2. Lightbar LEDs - Front of Octane, Diagnostic Chart #2


Figure 8-3. Lightbar LEDs - Front of Octane, Diagnostic Chart #3


Using the NMI Button

The Octane workstation can force a non-maskable interrupt (NMI) to the system. Use it when the system is experiencing problems but does not report any error information. Pressing the NMI button causes the system to record to a file its activity at the moment the button is pushed; the system then powers down.


Caution: Use of the NMI button results in the loss of any work in progress at the moment the button is pushed.

Figure 8-4. Locating the NMI Button


See Chapter 7 for instructions on removing the bezel. The recessed NMI button is located behind the bezel, beneath the Power and Reset buttons. Use a straightened paper clip to press the recessed NMI button.

Customer Support uses the files generated by the NMI button to help diagnose the specific problem that caused the system to malfunction. The files are placed in the /var/adm/crash directory. Have these files accessible when you contact Customer Support.
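Before contacting Customer Support, it can help to check which files are present in the crash directory. The following is a minimal sketch, not part of the Octane software: it assumes a Python 3 interpreter is available on a system that can read the directory, and the function name list_crash_files is hypothetical. It makes no assumption about what the crash files are called.

```python
from pathlib import Path


def list_crash_files(crash_dir="/var/adm/crash"):
    """Return (name, size_in_bytes) for each regular file, newest first.

    Hypothetical helper for illustration; the default path is the crash
    directory named in this chapter.
    """
    directory = Path(crash_dir)
    if not directory.is_dir():
        return []  # directory missing: nothing to report
    files = [p for p in directory.iterdir() if p.is_file()]
    # Newest files first, so the most recent crash is at the top.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, p.stat().st_size) for p in files]


if __name__ == "__main__":
    for name, size in list_crash_files():
        print(f"{size:>12}  {name}")
```

Run as a script, it prints one line per file with its size in bytes, which is usually enough to confirm that the NMI button produced output.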

Accessing the System Maintenance Menu

You may wish to access the System Maintenance menu for some specialized tasks, including installing system software, running diagnostics, recovering the system, entering the command monitor, or selecting your keyboard layout.

You can access the System Maintenance menu either by shutting down and powering off the system from the Toolchest, or by pressing the power button.

Accessing the System Maintenance Menu From the Toolchest

  1. From the Toolchest, choose System > System Shutdown.

  2. After a few seconds you see the notifier shown in Figure 8-5.

    Figure 8-5. System Shutdown Notifier


  3. After a few seconds you see the notifier shown in Figure 8-6. At this point you can power off the system by pressing the power button.

    Figure 8-6. Okay to Power Off Notifier


  4. After the system is powered off, press the power button to power on. You see the following notifier:

    Figure 8-7. Stop for Maintenance Notifier


  5. Click the Stop for Maintenance button.

  6. Choose the option you wish to complete your task.

Accessing the System Maintenance Menu Using the Power Button

To access the System Maintenance menu by using the power button, follow these steps.

  1. Press the power button on the front of the Octane workstation and power off.

  2. After a few seconds, press the power button and power on.

  3. You see the following notifier:

    Figure 8-8. Starting Up the System Notifier


  4. Click the Stop for Maintenance button.

  5. Choose the option you wish to complete your task.

Running Diagnostic Tests

Three types of software diagnostic tests are provided on the Octane workstation. Each is described below:

Power-On Tests

These tests run automatically on the major hardware components of the workstation each time it is turned on. If the tests find a faulty part, the LED in the lightbar on the front of the workstation is red and there will probably be an error message. See also “Diagnosing the Problem.”

Confidence Tests

There are confidence tests for the mouse, keyboard, monitor, audio subsystem, external SCSI devices (excluding hard disks), Presenter, and ISDN connection.

To run the Confidence Tests, from the Toolchest, choose System > Confidence Tests. Double-click the icon for the part you believe is faulty, and follow the online instructions.

IDE Tests

The Interactive Diagnostic Environment (IDE) tests are more comprehensive than the Confidence Tests, and take longer (as long as 30-45 minutes) to run. Follow these steps:

  1. Turn off the workstation, wait a few seconds, and then turn it on again.

  2. When you see the System Startup notifier, click Stop for Maintenance, or press Esc.

    Figure 8-9. Starting Up the System Notifier


  3. When you see the System Maintenance menu, choose Run Diagnostics, or type 3 on the keyboard.


    Tip: You can stop the tests at any time by pressing Esc.



    Note: If you cannot reach the System Maintenance menu, your system is faulty. If you cannot run the diagnostics, you may have a faulty disk drive or other problem. Contact your service provider.


  4. At the end of the tests you see a message with the results of the tests. Press Enter and then Esc to return to the System Maintenance menu, from where you can restart the system.

Recovering From a System Crash

In most cases, your system recovers from a system crash automatically if you reboot it. If, however, you have lost data on your system disk, and you cannot communicate with your workstation using the mouse or keyboard, or over the network, follow these instructions. The instructions assume you have a backup tape of your system that has been made using the System Manager backup tool, or with the /usr/sbin/Backup script. You also need a CD with your current IRIX operating system level. If you are recovering data from a tape on a remote tape device, you need to know the hostname, tape device name, and IP address of the remote system.


Note: If you need instructions on creating a system disk from the PROM monitor, see Appendix E, “Regulatory Information.” Also see the Installation Instructions booklet that came with your CDs for information on installing the operating system or other software.


  1. Use a pen tip or an unwound paper clip to press the Reset button located on the front panel.

    Figure 8-10. Pressing the Reset Button


  2. When you see the System Startup notifier (Figure 8-11), click Stop for Maintenance or press Esc.

    Figure 8-11. System Startup Notifier


  3. From the System Maintenance menu, choose Recover System, or type 4 on the keyboard.

    The System Recovery Menu appears (Figure 8-12).

    Figure 8-12. System Recovery Menu


  4. If you have a CD-ROM drive connected to your system and the IRIX CD, click Local CD-ROM. Then click Accept to start. Insert the CD when prompted. The system takes five minutes or more to copy the information.

    If you don't have a CD-ROM drive, use a drive that is connected to another system on the network. Click Remote Directory.

  5. When a notifier appears asking you for the remote hostname, type the system's name, a colon (:), and the full pathname of the CD-ROM drive, followed by /dist. For example, to access a CD-ROM drive on the system mars, you type

    mars:/CDROM/dist

    After everything is copied from the CD to the system disk, you can restore your data from a recent full backup tape. The backup must be one that has been made using the System Manager backup tool, or with the /usr/sbin/Backup script.


    Tip: If you need to check something on your system during the restore process, you can get a shell prompt by typing sh at most question prompts.


  6. If you have a local tape device, you see this message:

    Restore will be from <tapename> OK? ([Y]es, [N]o): [Y]
    

    tapename is the name of the local tape device.

  7. If you have a remote (network) tape device, if no tape device is found, or if you answered “No” to the question in the previous step, you see this message:

    Remote or local restore ([r]emote, [l]ocal):
    

    • If you answer “remote,” you have chosen to restore from the network, and you must know the hostname, tape device name, and IP address of the remote system. You also need to know the IP address of your system. The IP address, such as 192.0.2.1, always has four components separated by periods.

    • If you answer “local,” you have chosen a tape device that is connected to your system, and you are prompted to enter the name of the tape device.

  8. When you see the following message, remove the CD-ROM, insert your most recent full backup tape, then press Enter.

    Insert the first backup tape in the drive, then press <Enter>, [q]uit (from recovery), [r]estart:
    

    There is a pause while the program retrieves several files from the tape describing the system state at the time the backup was made. Then you see this message:

    Erase /x filesystem and make new one (y,n)? [n]
    

    It prompts you for every filesystem that was known at the time of the backup.

    Read the following to decide whether to answer y or n.

    • If you answer n for no, the system tries to salvage as many files as possible. Then it uses your backup tape to replace the files it could not salvage. Usually you should answer no, especially if your backup tape is not very recent. If the filesystems were badly damaged, or the backup was made at a different operating system level, you may need to answer yes.

    • If you answer y for yes, the system erases the filesystem and copies everything from your backup tape to the disk. You lose any information on that filesystem that you created after you made your backup tape.

  9. You see this message:

    Starting recovery from tape.
    

    After two or three minutes, the names of the files that the system is copying to the disk start scrolling.

    After the first tape is complete, if it is the first in a set, you are prompted for the second tape, and so on.

    After the first tape set (full backup) is completed, you are prompted for any incremental or additional tapes to be restored.

    When all tapes have been restored, you are asked whether you are ready to restart the system.

    When the recovery is complete, you see this message:

    Recovery complete, restarting system.
    


Note: If your backup tapes were old, or you were changing your operating system level, you should reinstall the operating system from the IRIX CD that came with your system after system recovery is complete. When you see the System Startup notifier, press Esc or click Stop for Maintenance. Then click Install System Software. For more information on installing the operating system, see “Installing Software” in the Personal System Administration Guide or the Installation Instructions booklet that came with your CDs.
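Two of the values you type during recovery have a fixed format: the remote directory in step 5 (hostname, a colon, the full pathname of the CD-ROM drive, then /dist) and the IP address in step 7 (four components separated by periods). The following is a minimal sketch for checking these values on another machine before you start. It is not part of the recovery program; the function names are hypothetical, and the default /CDROM path is taken from the mars example above.

```python
import ipaddress


def remote_dist_path(hostname, cdrom_path="/CDROM"):
    """Build the hostname:path/dist string described in step 5.

    Hypothetical helper for illustration only.
    """
    if not hostname or ":" in hostname:
        raise ValueError("hostname must be non-empty and contain no colon")
    if not cdrom_path.startswith("/"):
        raise ValueError("the CD-ROM pathname must be a full (absolute) path")
    # Strip any trailing slash so the result never contains "//dist".
    return f"{hostname}:{cdrom_path.rstrip('/')}/dist"


def is_dotted_quad(address):
    """Return True if address is a valid four-component IPv4 address."""
    try:
        ipaddress.IPv4Address(address)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(remote_dist_path("mars"))
    print(is_dotted_quad("192.0.2.1"))
```

For the example system mars, remote_dist_path("mars") produces the same string shown in step 5.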


Disabling the System Maintenance Password

If you are in the System Maintenance menu, and you choose Install System Software, Run Diagnostics, Recover System, or Enter Command Monitor, you may be prompted for a password.

If you do not know the password, you can disable it by installing a jumper (a small cap that connects two pins) on the system board inside the workstation. The system board is located in the system module. To install the jumper, you must first remove the system module.

Go to Chapter 2, “Preparing the Workstation to Remove and Install a CPU and Memory” and follow the instructions through “Removing the System Module.” After the system module is removed, return here for instructions on disabling the password.


Note: The jumper is a small plastic cap that resides near the CPU. See Figure 8-13. The jumper sits on the right two of three pins.

Figure 8-13. Removing the Jumper for Disabling the Password


Follow these steps:

  1. Locate the white rectangle printed on the system board at the base of the jumper. The rectangle has “Disable passwd” beside it. The white rectangle surrounds the jumper and one pin to the left of the jumper. The jumper is on the middle and right pins.

  2. Pull up on the jumper and remove it.

  3. Place the jumper on the left and middle pins within the rectangle. The password is now disabled.

  4. Return to Chapter 2, “Replacing the System Module” and follow the instructions through powering on the Octane workstation.

Enabling the System Maintenance Password

After you have powered on the Octane workstation, you can choose another system maintenance password.

  1. Power on the Octane workstation, if you have not already done so.

  2. Go to the Toolchest > Help > InfoSearch > Online Books > SGI End User > Personal System Administration Guide. Go to the section, “Creating, Changing and Deleting Passwords” in Chapter 5. Follow the instructions for creating or changing a password. Then return here for instructions on moving the jumper.

  3. Go to Chapter 2, “Preparing the Workstation to Remove and Install a CPU and Memory”, and follow the instructions through “Removing the System Module.” After the system module is removed, return here for instructions on enabling the password.


    Note: The jumper is a small plastic cap that resides near the CPU. The jumper is on the left side of the 3-pin configuration because the password was disabled by moving the jumper to the left. To enable the password, you must move the jumper to the right. Go to the next step for instructions on moving the jumper to the right.

    Figure 8-14. Removing the Jumper to Enable the Password


  4. Locate the white rectangle printed on the system board at the base of the jumper. The rectangle has “Disable passwd” beside it. The white rectangle surrounds the jumper and one pin to the left of the jumper. The password is currently disabled, and the jumper sits on the left and middle pins.

  5. Pull up on the jumper and remove it.

  6. Place the jumper on the middle and right pins (within the rectangle). The password is now enabled.

    You have finished enabling the password and are ready to replace the system module.

  7. Return to Chapter 2, “Replacing the System Module” and follow the instructions through powering on the Octane workstation.

System Does Not Power Off

If the system does not power off, either it never came up all the way or the operating system is hung. If you do not see any activity for several minutes, follow the steps below.

Figure 8-15. Pressing the Power Button


  1. Press the power button again.


    Note: If you press the power button a second time, the system powers off immediately, but it is not a clean shutdown. Avoid using this method unless the system does not respond for several minutes after you press the power button the first time.

    Figure 8-16. Pressing the Reset Button


  2. If pressing the power button a second time does not work, use a pencil or pen to press the reset button.

  3. If the system still fails to power off, unplug the power cord from the back of the workstation and contact your service provider.

Returning Parts

To return any part, use the packaging materials and box that came with your replacement part.

For product support information, see the Introduction of this guide.