  OSSG Documentation
  26-NOV-1996 11:20:15.37

Guidelines for OpenVMS Cluster Configurations

You can use local adapters to connect each disk to two access paths (dual ports). Dual porting allows automatic failover of disks between nodes.

5.8.1 Internal Buses

Locally connected storage devices attach to a system's internal bus.

Alpha systems use the following internal buses:

PCI
EISA
XMI
SCSI
TURBOchannel
Futurebus+

VAX systems use the following internal buses:

VAXBI
XMI
Q-bus
SCSI

5.8.2 Local Adapters

Following is a list of local adapters and their bus types:

KZPSM (PCI)
KZPDA (PCI)
KZPSC (PCI)
KZPAC (PCI)
KZESC (EISA)
KZMSA (XMI)
PB2HA (EISA)
PMAZB (TURBOchannel)
PMAZC (TURBOchannel)
KDM70 (XMI)
KDB50 (VAXBI)
KDA50 (Q-bus)

Chapter 6
Configuring OpenVMS Clusters for Availability

Availability is the percentage of time that a computing system provides application service. By taking advantage of OpenVMS Cluster features, you can configure your OpenVMS Cluster system for various levels of availability, including disaster tolerance.
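As a rough illustration of this definition, availability can be computed from the time a system was expected to provide service and the time it was actually down. The function name and the sample figures below are assumptions for illustration only, not values from this guide.

```python
def availability_percent(scheduled_hours: float, downtime_hours: float) -> float:
    """Return the percentage of scheduled time during which service was provided."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    if not 0 <= downtime_hours <= scheduled_hours:
        raise ValueError("downtime_hours must be between 0 and scheduled_hours")
    return 100.0 * (scheduled_hours - downtime_hours) / scheduled_hours


# Hypothetical example: one year of continuous service (24 x 365 hours)
# with 8.76 hours of accumulated down time.
hours_per_year = 24 * 365
print(round(availability_percent(hours_per_year, 8.76), 2))
```

Expressing a requirement this way makes it easier to compare it against the categories in Table 6-1.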

This chapter provides strategies and sample optimal configurations for building a highly available OpenVMS Cluster system. You can use these strategies and examples to help you make choices and tradeoffs that enable you to meet your availability requirements.

6.1 Availability Requirements

You can configure OpenVMS Cluster systems for different levels of availability, depending on your requirements. Most organizations fall into one of the broad (and sometimes overlapping) categories shown in Table 6-1.

**Table 6-1 Availability Requirements**
Availability Requirements	Description
Conventional	For business functions that can wait with little or no effect while a system or application is unavailable.
24 x 365	For business functions that require uninterrupted computing services, either during essential time periods or during most hours of the day throughout the year. Minimal down time is acceptable.
Disaster tolerant	For business functions with extremely stringent availability requirements. These businesses need to be immune to disasters like earthquakes, floods, and power failures.

Reference: See Building Dependable Systems: The OpenVMS Approach for more information about configuring highly available OpenVMS Clusters.

6.2 How OpenVMS Clusters Provide Availability

OpenVMS Cluster systems offer the following features that provide increased availability:

A highly integrated environment that allows multiple systems to share access to resources
Redundancy of major hardware components
Software support for failover between hardware components
Software products to support high availability

6.2.1 Shared Access to Storage

In an OpenVMS Cluster environment, users and applications on multiple systems can transparently share storage devices and files. When you shut down one system, users can continue to access shared files and devices. You can share storage devices in two ways:

Direct access
Connect disk and tape storage subsystems to CI and DSSI interconnects rather than to a node. This gives all nodes attached to the interconnect shared access to the storage system. The shutdown or failure of a system has no effect on the ability of other systems to access storage.
Served access
Storage devices attached to a node can be served to other nodes in the OpenVMS Cluster. MSCP and TMSCP server software enable you to make local devices available to all OpenVMS Cluster members. However, the shutdown or failure of the serving node affects the ability of other nodes to access storage.
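The difference between the two kinds of access can be sketched as a small reachability check. The data model here is hypothetical: each disk has a set of serving nodes, and an empty set stands for a disk that sits directly on a shared interconnect. The node and device names are invented, and the sketch assumes every running node is attached to that interconnect.

```python
def disk_accessible(serving_nodes: set[str], running_nodes: set[str]) -> bool:
    """Return True if the running nodes can still reach the disk.

    An empty serving_nodes set models direct access: the disk is on a shared
    interconnect, so no particular node has to be up to serve it.
    """
    if not serving_nodes:
        # Direct access: any node that is still running can reach the disk.
        return bool(running_nodes)
    # Served access: at least one serving node must still be running.
    return bool(serving_nodes & running_nodes)


disks = {
    "DIRECT_DISK": set(),        # on a shared CI or DSSI interconnect
    "SERVED_DISK": {"NODE_A"},   # local to NODE_A and served to the other nodes
}
running = {"NODE_B", "NODE_C"}   # NODE_A has shut down or failed

for name, servers in disks.items():
    print(name, disk_accessible(servers, running))
```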

6.2.2 Component Redundancy

OpenVMS Cluster systems allow for redundancy of many components, including:

Systems
Interconnects
Adapters
Storage devices and data

With redundant components, if one component fails, another is available to users and applications.

6.2.3 Failover Mechanisms

OpenVMS Cluster systems provide failover mechanisms that enable recovery from a failure in part of the OpenVMS Cluster. Table 6-2 lists these mechanisms and the levels of recovery that they provide.

**Table 6-2 Failover Mechanisms**
Mechanism	What Happens if a Failure Occurs	Type of Recovery
DECnet-Plus cluster alias	If a node fails, OpenVMS Cluster software automatically distributes new incoming connections among other participating nodes.	Manual. Users who were logged in to the failed node can reconnect to a remaining node. Automatic for appropriately coded applications. Such applications can reinstate a connection to the cluster alias node name, and the connection is directed to one of the remaining nodes.
I/O paths	With redundant paths to storage devices, if one path fails, OpenVMS Cluster software fails over to a working path, if one exists.	Transparent, provided another working path is available.
Interconnect	With redundant or mixed interconnects, OpenVMS Cluster software uses the fastest working path to connect to other OpenVMS Cluster members. If an interconnect path fails, OpenVMS Cluster software fails over to a working path, if one exists.	Transparent.
Boot and disk servers	If you configure at least two nodes as boot and disk servers, satellites can continue to boot and use disks if one of the servers shuts down or fails. Failure of a boot server does not affect nodes that have already booted, provided that they have an alternate path to access MSCP-served disks.	Automatic.
Terminal servers and LAT software	Attach terminals and printers to terminal servers. If a node fails, the LAT software automatically connects to one of the remaining nodes. In addition, if a user process is disconnected from a LAT terminal session, when the user attempts to reconnect to a LAT session, LAT software can automatically reconnect the user to the disconnected session.	Manual. Terminal users who were logged in to the failed node must log in to a remaining node and restart the application.
Generic batch and print queues	You can set up generic queues to feed jobs to execution queues (where processing occurs) on more than one node. If one node fails, the generic queue can continue to submit jobs to execution queues on remaining nodes. In addition, batch jobs submitted using the /RESTART qualifier are automatically restarted on one of the remaining nodes.	Transparent for jobs waiting to be dispatched. Automatic or manual for jobs executing on the failed node.
Autostart batch and print queues	For maximum availability, you can set up execution queues as autostart queues with a failover list. When a node fails, an autostart execution queue and its jobs automatically fail over to the next logical node in the failover list and continue processing on another node. Autostart queues are especially useful for print queues directed to printers that are attached to terminal servers.	Transparent.
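
The interconnect row of Table 6-2 can be sketched as a selection over candidate paths: ignore failed paths and prefer the fastest of those that remain. The `Path` record, its fields, and the speed figures are illustrative assumptions. This guide does not describe the actual selection logic in OpenVMS Cluster software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    name: str
    megabits_per_second: float
    working: bool


def select_path(paths: list[Path]) -> Optional[Path]:
    """Return the fastest working path, or None if every path has failed."""
    candidates = [p for p in paths if p.working]
    return max(candidates, key=lambda p: p.megabits_per_second, default=None)


paths = [
    Path("FDDI", 100.0, working=False),    # fastest path, but it has failed
    Path("Ethernet", 10.0, working=True),  # slower path that still works
]
chosen = select_path(paths)
print(chosen.name if chosen else "no working path")
```

Returning `None` when no path works mirrors the "if one exists" condition in the table.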

Reference: For more information about cluster alias, generic queues, and autostart queues, see OpenVMS Cluster Systems.
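
The failover-list behavior described for autostart queues amounts to walking an ordered list and picking the first node that is still available. The sketch below models only that selection. The function and node names are hypothetical, and it does not represent the queue manager's actual implementation.

```python
from typing import Optional


def next_queue_node(failover_list: list[str], available: set[str]) -> Optional[str]:
    """Return the first node in the failover list that is still available."""
    for node in failover_list:
        if node in available:
            return node
    return None  # no listed node is available, so the queue cannot start


failover_list = ["NODE_A", "NODE_B", "NODE_C"]

# NODE_A has failed; the queue moves to the next listed node that is up.
print(next_queue_node(failover_list, available={"NODE_B", "NODE_C"}))
```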

6.2.4 Related Software Products

Table 6-3 shows a variety of related OpenVMS Cluster software products that Digital offers to increase availability.

**Table 6-3 Products That Increase Availability**
Product	Description
DECamds	Collects and analyzes data from multiple nodes simultaneously and directs all output to a centralized DECwindows display. The analysis detects availability problems and suggests corrective actions.
Volume Shadowing for OpenVMS	Makes any disk in an OpenVMS Cluster system a redundant twin of any other same-model disk in the OpenVMS Cluster.
DECevent	Simplifies disk monitoring. DECevent notifies you when it detects that a disk may fail. If the OpenVMS Cluster system is properly configured, DECevent can add a new disk and start a shadow copy operation.
POLYCENTER Console Manager (PCM)	Helps monitor OpenVMS Cluster operations. PCM provides a central location for coordinating and managing up to 24 console lines connected to OpenVMS nodes or HSJ/HSC console ports.

6.3 Strategies for Configuring Highly Available OpenVMS Clusters

The hardware you choose and the way you configure it have a significant impact on the availability of your OpenVMS Cluster system. This section presents strategies for designing an OpenVMS Cluster configuration that promotes availability.

6.3.1 Availability Strategies

Table 6-4 lists strategies for configuring a highly available OpenVMS Cluster. These strategies are listed in order of importance, and many of them are illustrated in the sample optimal configurations shown in this chapter.

**Table 6-4 Availability Strategies**
Strategy	Description
Eliminate single points of failure	Make components redundant so that if one component fails, the other is available to take over.
Shadow system disks	The system disk is vital for node operation. Use Volume Shadowing for OpenVMS to make system disks redundant.
Shadow essential data disks	Use Volume Shadowing for OpenVMS to improve data availability by making data disks redundant.
Provide shared, direct access to storage	Where possible, give all nodes shared direct access to storage. This reduces dependency on MSCP server nodes for access to storage.
Minimize environmental risks	Take the following steps to minimize the risk of environmental problems: Provide a generator or uninterruptible power system (UPS) to replace utility power for use during temporary outages. Configure extra air-conditioning equipment so that failure of a single unit does not prevent use of the system equipment.
Configure at least three nodes	OpenVMS Cluster nodes require a quorum to continue operating. An optimal configuration uses a minimum of three nodes so that if one node becomes unavailable, the two remaining nodes maintain quorum and continue processing. Reference: For detailed information on quorum strategies, see Section 8.5 and OpenVMS Cluster Systems.
Configure extra capacity	For each component, configure at least one unit more than is necessary to handle capacity. Try to keep component use at 80% of capacity or less. For crucial components, keep resource use sufficiently less than 80% capacity so that if one component fails, the work load can be spread across remaining components without overloading them.
Keep a spare component on standby	For each component, keep one or two spares available and ready to use if a component fails. Be sure to test spare components regularly to make sure they work. More than one or two spare components increases complexity as well as the chance that the spare will not operate correctly when needed.
Use homogeneous nodes	Configure nodes of similar size and performance to avoid capacity overloads in case of failover. If a large node fails, a smaller node may not be able to handle the transferred work load. The resulting bottleneck may decrease OpenVMS Cluster performance.
Use reliable hardware	Consider the probability of a hardware device failing. Check product descriptions for MTBF (mean time between failures). In general, newer technologies are more reliable.
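
The "Configure extra capacity" strategy in Table 6-4 can be made concrete with a little arithmetic. The sketch below assumes identical units whose load is spread evenly over the survivors when one unit fails. It also assumes, purely for illustration, that "overloaded" means exceeding the same 80% figure. The table itself says only that the remaining components must not be overloaded.

```python
def utilization_after_failure(units: int, utilization: float) -> float:
    """Per-unit utilization after one of `units` identical units fails
    and its load is spread evenly across the remaining units."""
    if units < 2:
        raise ValueError("at least two units are needed to absorb a failure")
    return utilization * units / (units - 1)


def max_safe_utilization(units: int, limit: float = 0.80) -> float:
    """Highest per-unit utilization that stays within `limit` after one failure."""
    if units < 2:
        raise ValueError("at least two units are needed to absorb a failure")
    return limit * (units - 1) / units


for units in (2, 3, 5):
    print(units, round(max_safe_utilization(units), 2))
```

Under these assumptions, the fewer units you have, the further below 80% each one must run for the survivors to absorb a failure.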

6.4 Strategies for Maintaining Highly Available OpenVMS Clusters

Achieving high availability is an ongoing process. How you manage your OpenVMS Cluster system is just as important as how you configure it. This section presents strategies for maintaining availability in your OpenVMS Cluster configuration.

6.4.1 Strategies for Maintaining Availability

After you have set up your initial configuration, follow the strategies listed in Table 6-5 to maintain availability in your OpenVMS Cluster system.

**Table 6-5 Strategies for Maintaining Availability**
Strategy	Description
Plan a failover strategy	OpenVMS Cluster systems provide software support for failover between hardware components. Be aware of what failover capabilities are available and which can be customized for your needs. Determine which components must recover from failure, and make sure that components are able to handle the additional work load that may result from a failover. Reference: Table 6-2 lists OpenVMS Cluster failover mechanisms and the levels of recovery that they provide.
Code distributed applications	Code applications to run simultaneously on multiple nodes in an OpenVMS Cluster system. If a node fails, the remaining members of the OpenVMS Cluster system are still available and continue to access the disks, tapes, printers, and other peripheral devices that they need.
Minimize change	Assess carefully the need for any hardware or software change before implementing it on a running node. If you must make a change, test it in a noncritical environment before applying it to your production environment.
Reduce size and complexity	After you have achieved redundancy, reduce the number of components and the complexity of the configuration. A simple configuration minimizes the potential for user and operator errors as well as hardware and software errors.
Set polling timers identically on all nodes	Certain system parameters control the polling timers used to maintain an OpenVMS Cluster system. Make sure these system parameter values are set identically on all OpenVMS Cluster member nodes. Reference: For information about these system parameters, see OpenVMS Cluster Systems.
Manage proactively	The more experience your system managers have, the better. Allow privileges for only those users or operators who need them. Design strict policies for managing and securing the OpenVMS Cluster system.
Use AUTOGEN proactively	With regular AUTOGEN feedback, you can analyze resource usage that may affect system parameter settings.
Reduce dependencies on a single server or disk	Distributing data across several systems and disks prevents one system or disk from being a single point of failure.
Implement a backup strategy	Performing backups frequently and on a regular schedule helps ensure that you can recover data after failures. None of the strategies listed in this table can take the place of a solid backup strategy.
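
The "Set polling timers identically on all nodes" strategy in Table 6-5 lends itself to a simple consistency check. The sketch below assumes the per-node parameter values have already been collected into a dictionary. The parameter names `TIMER_A` and `TIMER_B` are placeholders, not real system parameters; see OpenVMS Cluster Systems for the actual names.

```python
from typing import Optional


def mismatched_parameters(
    settings: dict[str, dict[str, int]]
) -> dict[str, dict[str, Optional[int]]]:
    """Return {parameter: {node: value}} for every parameter whose value
    is not the same on all nodes (a missing value is reported as None)."""
    names: set[str] = set()
    for params in settings.values():
        names.update(params)

    mismatches: dict[str, dict[str, Optional[int]]] = {}
    for name in sorted(names):
        values = {node: params.get(name) for node, params in settings.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches


# Placeholder parameter names and values, for illustration only.
settings = {
    "NODE_A": {"TIMER_A": 20, "TIMER_B": 5},
    "NODE_B": {"TIMER_A": 20, "TIMER_B": 8},
}
print(mismatched_parameters(settings))
```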

6.5 Availability in a LAN OpenVMS Cluster

Figure 6-1 shows an optimal configuration for a small-capacity, highly available LAN OpenVMS Cluster system. Figure 6-1 is followed by an analysis of the configuration that includes:

Analysis of its components
Advantages and disadvantages
Key availability strategies implemented

Figure 6-1 LAN OpenVMS Cluster System

6.5.1 Components

The LAN OpenVMS Cluster configuration in Figure 6-1 has the following components:

Part	Description
1	Two Ethernet interconnects. For higher network capacity, use FDDI interconnects instead of Ethernet. Rationale: For redundancy, use at least two LAN interconnects and attach all nodes to all LAN interconnects. A single interconnect would introduce a single point of failure.
2	Three to eight Ethernet-capable OpenVMS nodes. Each node has its own system disk so that it is not dependent on another node. Rationale: Use at least three nodes to maintain quorum. Use fewer than eight nodes to avoid the complexity of managing eight system disks. Alternative 1: If you require satellite nodes, configure one or two nodes as boot servers. Note, however, that the availability of the satellite nodes is dependent on the availability of the server nodes. Alternative 2: For more than eight nodes, use a LAN OpenVMS Cluster configuration as described in Section 6.10.
3	System disks. System disks generally are not shadowed in LAN OpenVMS Clusters because of boot-order dependencies. Alternative 1: Shadow the system disk across two local controllers. Alternative 2: Shadow the system disk across two nodes. The second node mounts the disk as a nonsystem disk. Reference: See Section 8.2.4 for an explanation of boot-order and satellite dependencies.
4	Essential data disks. Use volume shadowing to create multiple copies of all essential data disks. Place shadow set members on at least two nodes to eliminate a single point of failure.

6.5.2 Advantages

This configuration offers the following advantages:

Lowest cost of all the sample configurations shown in this chapter.
Some potential for growth in size and performance.
The LAN interconnect supports the widest choice of nodes.

6.5.3 Disadvantages

This configuration has the following disadvantages:

No shared direct access to storage. The nodes are dependent on an MSCP server for access to shared storage.
Shadowing disks across the LAN nodes causes shadow copies when the nodes boot.
Shadowing the system disks is not practical because of boot-order dependencies.

6.5.4 Key Availability Strategies

The configuration in Figure 6-1 incorporates the following strategies, which are critical to its success:

This configuration has no single point of failure.
Volume shadowing provides multiple copies of essential data disks across separate nodes.
At least three nodes are used for quorum, so the OpenVMS Cluster continues if any one node fails.
Each node has its own system disk; there are no satellite dependencies.
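
The three-node recommendation can be checked with a short calculation. The sketch assumes one vote per node and uses the quorum formula documented in OpenVMS Cluster Systems, (EXPECTED_VOTES + 2) / 2 rounded down; the formula is not stated in this chapter. The function names are hypothetical, and the model ignores any adjustment of quorum while the cluster is running.

```python
def quorum(expected_votes: int) -> int:
    """Quorum derived from the expected votes: (EXPECTED_VOTES + 2) / 2, rounded down."""
    return (expected_votes + 2) // 2


def keeps_quorum(total_votes: int, lost_votes: int) -> bool:
    """True if the votes that remain still meet the quorum computed from the total."""
    return total_votes - lost_votes >= quorum(total_votes)


# Three nodes with one vote each: losing one node leaves two votes.
print("three nodes:", keeps_quorum(total_votes=3, lost_votes=1))

# Two nodes with one vote each: losing one node leaves a single vote.
print("two nodes:", keeps_quorum(total_votes=2, lost_votes=1))

# Two nodes plus a quorum disk that contributes one vote (assumed).
print("two nodes and quorum disk:", keeps_quorum(total_votes=3, lost_votes=1))
```

The last case shows why a two-node configuration needs an additional vote, such as a quorum disk, to survive the loss of one node.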

6.6 Configuring Multiple LANs

Follow these guidelines to configure a highly available multiple LAN cluster:

Bridge LAN segments together to form a single extended LAN.
Provide redundant LAN segment bridges for failover support.
Configure LAN bridges to pass the LAN and MOP multicast messages.
Reference: Refer to the documentation for your LAN bridge and to the documentation for RBMS, DECelms, or POLYCENTER Framework for more information about configuring LAN bridges to pass these multicast messages.
Use the Local Area OpenVMS Cluster Network Failure Analysis Program to monitor and maintain network availability. (See OpenVMS Cluster Systems for more information.)
Use the troubleshooting suggestions in OpenVMS Cluster Systems to diagnose performance problems with the SCS layer and the NISCA transport protocol.
Keep LAN average utilization below 50%.

Reference: See Section 7.7.7 for information about extended LANs (ELANs).

6.6.1 Selecting MOP Servers

When using multiple LAN adapters with multiple LAN segments, distribute the connections to LAN segments that provide MOP service. The distribution allows MOP servers to downline load satellites even when network component failures occur.

It is important to ensure sufficient MOP servers for both VAX and Alpha nodes to provide downline load support for booting satellites. By careful selection of the LAN connection for each MOP server (Alpha or VAX, as appropriate) on the network, you can maintain MOP service in the face of network failures.
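
One way to review a planned layout is to list, for each architecture, the LAN segments that have no MOP server of that architecture. The sketch below is a hypothetical model: the `MopServer` record, the segment names, and the node names are invented, and each node is assumed to provide MOP service on a single segment.

```python
from typing import NamedTuple


class MopServer(NamedTuple):
    name: str
    architecture: str   # "Alpha" or "VAX"
    segment: str        # the LAN segment on which this node provides MOP service


def uncovered_segments(
    servers: list[MopServer], segments: set[str], architecture: str
) -> set[str]:
    """Return the segments that have no MOP server of the given architecture."""
    covered = {s.segment for s in servers if s.architecture == architecture}
    return set(segments) - covered


segments = {"LAN1", "LAN2"}
servers = [
    MopServer("ALPHA1", "Alpha", "LAN1"),
    MopServer("ALPHA2", "Alpha", "LAN2"),
    MopServer("VAX1", "VAX", "LAN1"),
]

for arch in ("Alpha", "VAX"):
    print(arch, sorted(uncovered_segments(servers, segments, arch)))
```

A segment that shows up as uncovered depends on a bridge to reach a MOP server on another segment, so a bridge or segment failure could prevent satellites of that architecture from booting.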

6.6.2 Configuring Two LAN Segments

Figure 6-2 shows a sample configuration for an OpenVMS Cluster system connected to two different LAN segments. The configuration includes Alpha and VAX nodes, satellites, and two bridges.

Figure 6-2 Two-LAN Segment OpenVMS Cluster Configuration

The figure illustrates the following points:

Connecting critical nodes to multiple LAN segments provides increased availability in the event of segment or adapter failure. Disk and tape servers can use some of the network bandwidth provided by the additional network connection. Critical satellites can be booted using the other LAN adapter if one LAN adapter fails.
Connecting noncritical satellites to only one LAN segment helps to balance the network load by distributing systems equally among the LAN segments. These systems communicate with satellites on the other LAN segment through one of the bridges.
Only one LAN adapter per node can be used for DECnet and MOP service to prevent duplication of LAN addresses.
LAN adapters providing MOP service (Alpha or VAX, as appropriate) should be distributed among the LAN segments to ensure that LAN failures do not prevent satellite booting.
Using redundant LAN bridges prevents the bridge from being a single point of failure.

6.6.3 Configuring Three LAN Segments

Figure 6-3 shows a sample configuration for an OpenVMS Cluster system connected to three different LAN segments. The configuration also includes Alpha and VAX nodes, satellites, and multiple bridges.

Figure 6-3 Three-LAN Segment OpenVMS Cluster Configuration

The figure illustrates the following points:

Connecting disk and tape servers to two or three LAN segments can help provide higher availability and better I/O throughput.
Connecting critical satellites to two or more LAN segments can also increase availability. If any of the network components fails, these satellites can use the other LAN adapters to boot and still have access to the critical disk servers.
Distributing noncritical satellites equally among the LAN segments can help balance the network load.
A MOP server (Alpha or VAX, as appropriate) is provided for each LAN segment.

Reference: See Section 8.2.4 for more information about boot order and satellite dependencies in a LAN. See OpenVMS Cluster Systems for information about LAN bridge failover.

6.7 Availability in a DSSI OpenVMS Cluster

Figure 6-4 shows an optimal configuration for a medium-capacity, highly available DSSI OpenVMS Cluster system. Figure 6-4 is followed by an analysis of the configuration that includes:

Analysis of its components
Advantages and disadvantages
Key availability strategies implemented

Figure 6-4 DSSI OpenVMS Cluster System

6.7.1 Components

The DSSI OpenVMS Cluster configuration in Figure 6-4 has the following components:

Part	Description
1	Two DSSI interconnects with two DSSI adapters per node. Rationale: For redundancy, use at least two interconnects and attach all nodes to all DSSI interconnects.
2	Two to four DSSI-capable OpenVMS nodes. Rationale: Three nodes are recommended to maintain quorum. A DSSI interconnect can support a maximum of four OpenVMS nodes. Alternative 1: Two-node configurations require a quorum disk to maintain quorum if a node fails. Alternative 2: For more than four nodes, configure two DSSI sets of nodes connected by two LAN interconnects.
3	Two Ethernet interconnects. Rationale: The LAN interconnect is required for DECnet-Plus communication. Use two interconnects for redundancy. For higher network capacity, use FDDI instead of Ethernet.
4	System disk. Shadow the system disk across DSSI interconnects. Rationale: Shadow the system disk across interconnects so that the disk and the interconnect do not become single points of failure.
5	Data disks. Shadow essential data disks across DSSI interconnects. Rationale: Shadow the data disk across interconnects so that the disk and the interconnect do not become single points of failure.

6.7.2 Advantages

The configuration in Figure 6-4 offers the following advantages:

The DSSI interconnect gives all nodes shared, direct access to all storage.
Moderate potential for growth in size and performance.
There is only one system disk to manage.

6.7.3 Disadvantages

This configuration has the following disadvantages:

Applications must be shut down before you swap DSSI cables. This is referred to as a "warm swap." The DSSI cable is warm swappable for the adapter, the cable, and the node.
A node's location on the DSSI affects the recoverability of the node. If the adapter fails on a node located at the end of the DSSI interconnect, the OpenVMS Cluster may become unavailable.

6.7.4 Key Availability Strategies

The configuration in Figure 6-4 incorporates the following strategies, which are critical to its success:

This configuration has no single point of failure.
Volume shadowing provides multiple copies of system and essential data disks across separate DSSI interconnects.
All nodes have shared, direct access to all storage.
At least three nodes are used for quorum, so the OpenVMS Cluster continues if any one node fails.
There are no satellite dependencies.
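
The shadowing strategy above depends on each shadow set having members on more than one interconnect. A planned layout can be checked with a few lines of code. The data model and all names below are hypothetical, and the sketch does not query Volume Shadowing for OpenVMS itself.

```python
def single_interconnect_shadow_sets(shadow_sets: dict[str, dict[str, str]]) -> list[str]:
    """Return the names of shadow sets whose members all sit on one interconnect.

    shadow_sets maps a shadow set name to {member disk: interconnect}.
    """
    return sorted(
        name
        for name, members in shadow_sets.items()
        if len(set(members.values())) < 2
    )


# Hypothetical layout: the system disk is shadowed across both interconnects,
# but both members of the data shadow set are on the same interconnect.
layout = {
    "SYSTEM": {"DISK_1": "DSSI_1", "DISK_2": "DSSI_2"},
    "DATA": {"DISK_3": "DSSI_1", "DISK_4": "DSSI_1"},
}
print(single_interconnect_shadow_sets(layout))
```

Any shadow set the check reports still has the interconnect as a single point of failure.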

6.8 Availability in a CI OpenVMS Cluster

Figure 6-5 shows an optimal configuration for a large-capacity, highly available CI OpenVMS Cluster system. Figure 6-5 is followed by an analysis of the configuration that includes:

Analysis of its components
Advantages and disadvantages
Key availability strategies implemented

Figure 6-5 CI OpenVMS Cluster System

6.8.1 Components

The CI OpenVMS Cluster configuration in Figure 6-5 has the following components:
