HP-UX System Administrator's Guide: Configuration Management: HP-UX 11i Version 3 > Chapter 9 Configuring Peripherals

Configuring PCI Error Recovery


The PCI Error Recovery feature detects, isolates, and automatically recovers from PCI errors, so that a PCI error does not crash the system. It is included with the HP-UX 11i v3 operating system and is enabled by default.

To enable or disable PCI Error Recovery, see “Controlling PCI Error Recovery”.

What is PCI Error Recovery?

If PCI Error Recovery is enabled and an error occurs on a PCI bus containing an I/O card that supports PCI Error Recovery, the following steps are taken:

  1. The PCI bus is quarantined to isolate the system from further I/O and prevent the error from damaging the system.

  2. The PCI Error Recovery feature attempts to recover from the error and re-initialize the bus so I/O can resume.

If an error occurs during the automated error recovery process, the bus and I/O card remain quiesced and I/O does not resume.

If the bus contains a card that supports online addition, replacement, or deletion (OL*) and the card is in a hot-pluggable slot, you can recover from the error manually by replacing the card, using either the olrad command or the attention button.

For information on OL* operations, see the Interface Card OL* Support Guide. To determine if OL* is supported, see the documentation or support matrix for the specific I/O card.

If the PCI Error Recovery feature is disabled and an error occurs on a PCI bus, a Machine Check Abort (MCA) or a High Priority Machine Check (HPMC) will occur and the system will crash.
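The outcomes described in this section can be summarized as a small decision model. The following Python sketch is illustrative only and is not HP-UX code: the names Outcome and handle_pci_error are hypothetical, and the sketch assumes the error occurs on a bus whose I/O card supports PCI Error Recovery.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of a PCI bus error, as described in this section."""
    IO_RESUMED = "bus re-initialized, I/O resumes"
    QUIESCED = "bus and card remain quiesced, manual recovery needed"
    SYSTEM_CRASH = "MCA or HPMC occurs, system crashes"


def handle_pci_error(recovery_enabled: bool, recovery_succeeds: bool) -> Outcome:
    """Hypothetical model of what happens after a PCI error.

    Assumes the card on the affected bus supports PCI Error Recovery.
    """
    if not recovery_enabled:
        # Feature disabled: the error leads to an MCA or HPMC.
        return Outcome.SYSTEM_CRASH

    # Step 1: the bus is quarantined (modeled implicitly here).
    # Step 2: recovery is attempted and the bus is re-initialized.
    if recovery_succeeds:
        return Outcome.IO_RESUMED

    # An error during automated recovery leaves the bus and card quiesced.
    return Outcome.QUIESCED


if __name__ == "__main__":
    for enabled in (True, False):
        for succeeds in (True, False):
            result = handle_pci_error(enabled, succeeds)
            print(f"enabled={enabled} succeeds={succeeds}: {result.value}")
```

In the QUIESCED case, a card that supports OL* in a hot-pluggable slot can still be recovered manually, as described above.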

CAUTION: If you use HP Serviceguard, HP recommends that you enable the PCI Error Recovery feature only if your storage devices are configured with multiple paths and you have not disabled HP-UX native multipathing. If PCI Error Recovery is enabled, but your storage devices are configured with only a single path, HP Serviceguard may not detect when connectivity is lost. HP Serviceguard will not cause a failover unless it detects a loss of connectivity.

Controlling PCI Error Recovery

PCI Error Recovery is controlled by two tunables, which you can configure with HP SMH, kcweb, or kctune. See “Managing Kernel Tunable Parameters with kctune” and “Managing Kernel Tunable Parameters with HP SMH”.

  • pci_eh_enable

    This tunable enables or disables the PCI Error Recovery feature. The feature is enabled by default. Because pci_eh_enable is not a dynamic tunable, a reboot is required for changes to take effect.

  • pci_error_tolerance_time

    This tunable determines whether automatic PCI error recovery is attempted on an I/O slot, based on the time interval between two PCI errors. If two PCI errors occur on the same slot within the interval specified by pci_error_tolerance_time, the card in that slot is suspended and you must perform a manual recovery operation to restore the card.
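The effect of pci_error_tolerance_time can be illustrated with a short Python sketch. This is a simplified model, not the kernel's implementation: the function name should_auto_recover is hypothetical, times are plain numbers in whatever unit the tunable uses, and the handling of an error that arrives exactly at the end of the interval is an assumption.

```python
from typing import Optional


def should_auto_recover(
    previous_error_time: Optional[float],
    current_error_time: float,
    tolerance_time: float,
) -> bool:
    """Return True if automatic recovery would be attempted on the slot.

    previous_error_time is None when the slot has had no earlier error.
    All three values must use the same time unit.
    """
    if previous_error_time is None:
        # First error on this slot: nothing to compare against.
        return True

    elapsed = current_error_time - previous_error_time
    # Assumption: an error exactly at the boundary counts as outside
    # the interval, so recovery is still attempted.
    return elapsed >= tolerance_time


if __name__ == "__main__":
    tolerance = 100.0  # illustrative value only, not the tunable's default
    print(should_auto_recover(None, 10.0, tolerance))   # first error
    print(should_auto_recover(10.0, 50.0, tolerance))   # second error soon after
    print(should_auto_recover(10.0, 500.0, tolerance))  # second error much later
```

When the function returns False, the model corresponds to the case where the card is suspended and a manual recovery operation is required.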

PCI Error Recovery Documentation

PCI Error Recovery is supported by the following documentation, available on the HP Technical Documentation web site at http://docs.hp.com:

In the High Availability section:

  • PCI Error Recovery Product Note

  • PCI Error Recovery Support Matrix

  • Interface Card OL* Support Guide

In the HP-UX Reference:

© 2008 Hewlett-Packard Development Company, L.P.