
Guidelines for OpenVMS Cluster Configurations


Part Description
1 Two LAN interconnects.



Rationale: LAN interconnects are additionally required for DECnet-Plus communication. Having two LAN interconnects (Ethernet or FDDI) increases redundancy. For higher network capacity, use FDDI instead of Ethernet.

2 Two to 16 CI-capable OpenVMS nodes.

Rationale: Three nodes are recommended to maintain quorum. A CI interconnect can support a maximum of 16 OpenVMS nodes. With a CI-to-PCI (CIPCA) adapter, Alpha nodes that have a combination of PCI and EISA buses can be connected to the CI. Alpha nodes with XMI buses can use the CIXCD adapter to connect to the CI.

Reference: For more extensive information about the CIPCA, see Appendix C.

Alternative: Two-node configurations require a quorum disk to maintain quorum if a node fails. (The quorum arithmetic is sketched after this table.)

3 Two CI interconnects with two star couplers.

Rationale: Use two star couplers to allow for redundant connections to each node.

4 Critical disks are dual-ported between HSJ or HSC controllers.

Rationale: Connect each disk to two controllers for redundancy. Shadow and dual port system disks between HSJ or HSC controllers. Periodically alternate the primary path of dual-ported disks to test hardware.

5 Data disks.

Rationale: Single port nonessential data disks; they do not need the redundancy that dual porting provides.

6 Essential data disks are shadowed across controllers.

Rationale: Shadow essential disks and place shadow set members on different HSCs to eliminate a single point of failure.
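
The quorum recommendation in part 2 of this table can be checked with the OpenVMS quorum rule, quorum = (EXPECTED_VOTES + 2)/2, truncated to a whole number. The following sketch is illustrative only; the helper functions are not OpenVMS utilities, and the sketch assumes that each node and the quorum disk contribute one vote each:

    # Illustrative sketch of the OpenVMS quorum rule; not a product utility.
    # Assumes each voting member (node or quorum disk) contributes one vote.

    def quorum(expected_votes):
        """Quorum is (EXPECTED_VOTES + 2) // 2, truncated."""
        return (expected_votes + 2) // 2

    def survives_one_failure(voting_members):
        """True if the cluster keeps quorum after losing one voting member."""
        return (voting_members - 1) >= quorum(voting_members)

    print(survives_one_failure(2))   # False: two nodes alone lose quorum
    print(survives_one_failure(3))   # True: three nodes, or two nodes plus
                                     # a quorum disk, tolerate one failure

With two nodes and no quorum disk, quorum is 2, so the loss of either node suspends the cluster; adding a third vote (a third node or a quorum disk) lets the cluster continue with one member down.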

6.8.2 Advantages

This configuration offers the following advantages:

6.8.3 Disadvantages

This configuration has the following disadvantage:

6.8.4 Key Availability Strategies

The configuration in Figure 6-5 incorporates the following strategies, which are critical to its success:

6.9 Availability in a MEMORY CHANNEL OpenVMS Cluster

Figure 6-6 shows a highly available MEMORY CHANNEL (MC) cluster configuration. Figure 6-6 is followed by an analysis of the configuration that includes:

Figure 6-6 MEMORY CHANNEL Cluster



6.9.1 Components

The MEMORY CHANNEL configuration shown in Figure 6-6 has the following components:
Part Description
1 Two MEMORY CHANNEL hubs.

Rationale: Having two hubs and multiple connections to the nodes prevents having a single point of failure.

2 Three to four MEMORY CHANNEL nodes.

Rationale: Three nodes are recommended to maintain quorum. A MEMORY CHANNEL interconnect can support a maximum of four OpenVMS Alpha nodes.

Alternative: Two-node configurations require a quorum disk to maintain quorum if a node fails.

3 Fast-wide differential (FWD) SCSI bus.

Rationale: Use a FWD SCSI bus because it enhances the data transfer rate (20 MB/s) and supports up to two HSZ controllers.

4 Two HSZ controllers.

Rationale: Two HSZ controllers ensure redundancy in case one of the controllers fails. With two controllers, you can connect two single-ended SCSI buses and more storage.

5 Essential system disks and data disks.

Rationale: Shadow essential disks and place shadow set members on different SCSI buses to eliminate a single point of failure.

6.9.2 Advantages

This configuration offers the following advantages:

6.9.3 Disadvantages

This configuration has the following disadvantage:

6.9.4 Key Availability Strategies

The configuration in Figure 6-6 incorporates the following strategies, which are critical to its success:

6.10 Availability in an OpenVMS Cluster with Satellites

Satellites are systems that do not have direct access to a system disk and other OpenVMS Cluster storage. Satellites are usually workstations, but they can be any OpenVMS Cluster node that is served storage by other nodes in the cluster.

Because satellite nodes are highly dependent on server nodes for availability, the sample configurations presented earlier in this chapter do not include satellite nodes. However, because satellite/server configurations provide important advantages, you may decide to trade off some availability to include satellite nodes in your configuration.

Figure 6-7 shows an optimal configuration for an OpenVMS Cluster system with satellites. Figure 6-7 is followed by an analysis of the configuration that includes:

The base configurations in Figure 6-4 and Figure 6-5 could replace the base configuration shown in Figure 6-7. In other words, the FDDI and satellite segments shown in Figure 6-7 could just as easily be attached to the configurations shown in Figure 6-4 and Figure 6-5.

Figure 6-7 OpenVMS Cluster with Satellites



6.10.1 Components

The satellite/server configuration in Figure 6-7 has the following components:
Part Description
1 Base configuration.

The base configuration performs server functions for satellites.

2 Three to 16 OpenVMS server nodes.

Rationale: At least three nodes are recommended to maintain quorum. Using more than 16 server nodes introduces excessive complexity.

3 FDDI ring between base server nodes and satellites.

Rationale: The FDDI ring provides greater network capacity than the slower Ethernet.

Alternative: Use two Ethernet segments instead of the FDDI ring.

4 Two Ethernet segments from the FDDI ring, attaching each critical satellite through two Ethernet adapters. Each of these critical satellites has its own system disk.

Rationale: Having their own boot disks increases the availability of the critical satellites.

5 For noncritical satellites, place a boot server on the Ethernet segment.

Rationale: Noncritical satellites do not need their own boot disks.

6 Limit the satellites to 15 per segment.

Rationale: More than 15 satellites on a segment may cause I/O congestion.

6.10.2 Advantages

This configuration provides the following advantages:

6.10.3 Disadvantages

This configuration has the following disadvantages:

6.10.4 Key Availability Strategies

The configuration in Figure 6-7 incorporates the following strategies, which are critical to its success:

6.11 Multiple-Site OpenVMS Cluster System

Multiple-site OpenVMS Cluster configurations contain nodes that are located at geographically separated sites. Depending on the technology used, the distances between sites can be as great as 150 miles. FDDI, asynchronous transfer mode (ATM), and DS3 are used to connect these separated sites to form one large cluster. Available from most common telephone service carriers, DS3 and ATM services provide long-distance, point-to-point communications for multiple-site clusters.

Figure 6-8 shows a typical configuration for a multiple-site OpenVMS Cluster system. Figure 6-8 is followed by an analysis of the configuration that includes:

Figure 6-8 Multiple-Site OpenVMS Cluster Configuration Connected by DS3



6.11.1 Components

Although Figure 6-8 does not show all possible configuration combinations, a multiple-site OpenVMS Cluster can include:

6.11.2 Advantages

The benefits of a multiple-site OpenVMS Cluster system include the following:

Reference: For additional information about multiple-site clusters, see OpenVMS Cluster Systems.

6.12 Disaster-Tolerant BRS System

The Business Recovery Server (BRS) is a disaster-tolerant system integration product for clusters with mission-critical applications. BRS enables systems at two different geographic sites to be combined into a single, manageable OpenVMS Cluster system. As in the multiple-site cluster discussed in the previous section, the physically separate data centers are connected by FDDI or by a combination of FDDI and ATM or T3. When configured with FDDI, a BRS cluster can span up to 25 miles; with FDDI and ATM or T3, it can span up to 500 miles.

BRS clusters have all of the power of regular clusters (such as multiple-site clusters) as well as the following additional features:

BRS cluster configurations can be large and highly complex. Within each data center, the interconnects among systems can be CI, DSSI, Ethernet, FDDI/ATM, or a combination of these. RS232 can also be used to connect communication devices to terminals or host consoles.

Each of the two sites in a BRS cluster can have one or more OMS nodes, which provide site-independent management and control. The OMS software is unique to BRS and plays a central role in cluster management and disaster recovery.

Figure 6-9 shows a typical configuration for a disaster-tolerant OpenVMS Cluster system. Figure 6-9 is followed by an analysis of the configuration that includes:

Figure 6-9 Disaster-Tolerant BRS System



6.12.1 Components

A Business Recovery Server can include:

Reference: For additional configuration information, see Planning and Configuring the Business Recovery Server.

6.12.2 Advantages

Most disaster recovery techniques are application dependent and can be expensive to implement and maintain. The disaster-tolerant OpenVMS Cluster system is not application dependent. Because the recovery process is intrinsic to the system's implementation, the disaster recovery plan is in continuous operation.

To achieve a short recovery period, physical data center independence is crucial. The Business Recovery Server package provides the only reliable way to achieve this independence. Through the combination of software and service, Business Recovery Server offers data center independence in four areas:

Site-independent system management is provided by the deployment of two OMS nodes, each serving as backup for the other.


Chapter 7
Configuring an OpenVMS Cluster for Scalability

This chapter explains how to maximize scalability in many different kinds of OpenVMS Clusters.

7.1 What Is Scalability?

Scalability is the ability to expand an OpenVMS Cluster system in any dimension (system, storage, or interconnect) while continuing to make full use of the initial configuration equipment. Your OpenVMS Cluster system can grow in many dimensions, as shown in Figure 7-1. Each dimension also enables your applications to expand.

Figure 7-1 OpenVMS Cluster Growth Dimensions



7.1.1 Scalable Dimensions

Table 7-1 describes the growth dimensions for systems, storage, and interconnects in OpenVMS Clusters.

Table 7-1 Scalable Dimensions in OpenVMS Clusters

This Dimension          Grows by...

Systems

  CPU                   Implementing SMP within a system.
                        Adding systems to a cluster.
                        Accommodating various processor sizes in a cluster.
                        Adding a bigger system to a cluster.
                        Migrating from VAX to Alpha systems.

  Memory                Adding memory to a system.

  I/O                   Adding interconnects and adapters to a system.
                        Adding MEMORY CHANNEL to a cluster to offload the I/O interconnect.

  OpenVMS               Tuning system parameters.
                        Moving to OpenVMS Alpha.

  Adapter               Adding storage adapters to a system.
                        Adding CI and DSSI adapters to a system.
                        Adding LAN adapters to a system.

Storage

  Media                 Adding disks to a cluster.
                        Adding tapes and CD-ROMs to a cluster.

  Volume shadowing      Increasing availability by shadowing disks.
                        Shadowing disks across controllers.
                        Shadowing disks across systems.

  I/O                   Adding solid-state or DECram disks to a cluster.
                        Adding disks and controllers with caches to a cluster.
                        Adding RAID disks to a cluster.

  Controller and array  Moving disks and tapes from systems to controllers.
                        Combining disks and tapes in arrays.
                        Adding more controllers and arrays to a cluster.

Interconnect

  LAN                   Adding Ethernet and FDDI segments.
                        Upgrading from Ethernet to FDDI.
                        Adding redundant segments and bridging segments.

  CI, DSSI, SCSI, and MEMORY CHANNEL
                        Adding CI, DSSI, SCSI, and MEMORY CHANNEL interconnects to a cluster or adding redundant interconnects to a cluster.

  I/O                   Adding faster interconnects for capacity.
                        Adding redundant interconnects for capacity and availability.

  Distance              Expanding a cluster inside a room or a building.
                        Expanding a cluster across a town or several buildings.
                        Expanding a cluster between two sites (spanning 40 km).

The ability to add to the components listed in Table 7-1 in any way that you choose is an important feature of OpenVMS Clusters. You can add hardware and software in a wide variety of combinations by carefully following the suggestions and guidelines offered in this chapter, in the products' documentation, and in the Digital Systems and Options Catalog. When you choose to expand your OpenVMS Cluster in a specific dimension, be aware of the advantages and tradeoffs with regard to the other dimensions.

Table 7-2 describes strategies that promote OpenVMS Cluster scalability. Understanding these scalability strategies can help you maintain a higher level of performance and availability as your OpenVMS Cluster grows.

7.2 Strategies for Configuring a Highly Scalable OpenVMS Cluster

The hardware that you choose and the way that you configure it have a significant impact on the scalability of your OpenVMS Cluster. This section presents strategies for designing an OpenVMS Cluster configuration that promotes scalability.

7.2.1 Scalability Strategies

Table 7-2 lists, in order of importance, strategies that ensure scalability. This chapter contains many figures that show how these strategies are implemented.

Table 7-2 Scalability Strategies

Strategy: Capacity planning

  Running a system above 80% capacity (near performance saturation) limits the amount of future growth possible.

  Understand whether your business and applications will grow. Try to anticipate future requirements for processor, memory, and I/O. (A simple headroom check is sketched after this table.)

Strategy: Shared, direct access to all storage

  The ability to scale compute and I/O performance is heavily dependent on whether all of the systems have shared, direct access to all storage.

  The CI and DSSI OpenVMS Cluster illustrations that follow show many examples of shared, direct access to storage, with no MSCP overhead.

  Reference: For more information about MSCP overhead, see Section 7.8.1.

Strategy: Limit node count to between 3 and 16

  Smaller OpenVMS Clusters are simpler to manage and tune for performance and require less OpenVMS Cluster communication overhead than do large OpenVMS Clusters. You can limit node count by upgrading to a more powerful processor and by taking advantage of OpenVMS SMP capability.

  If your server is becoming a compute bottleneck because it is overloaded, consider whether your application can be split across nodes. If so, add a node; if not, add a processor (SMP).

Strategy: Remove system bottlenecks

  To maximize the capacity of any OpenVMS Cluster function, consider the hardware and software components required to complete the function. Any component that is a bottleneck may prevent other components from achieving their full potential. Identifying bottlenecks and reducing their effects increases the capacity of an OpenVMS Cluster.

Strategy: Enable the MSCP server

  The MSCP server enables you to add satellites to your OpenVMS Cluster so that all nodes can share access to all storage. In addition, the MSCP server provides failover for access to shared storage when an interconnect fails.

Strategy: Reduce interdependencies and simplify configurations

  An OpenVMS Cluster system with one system disk is completely dependent on that disk for the OpenVMS Cluster to continue. If the disk, the node serving the disk, or the interconnects between nodes fail, the entire OpenVMS Cluster system may fail.

Strategy: Ensure sufficient serving resources

  If a small disk server has to serve a large number of disks to many satellites, the capacity of the entire OpenVMS Cluster is limited. Do not overload a server, because it will become a bottleneck and will be unable to handle failover recovery effectively.

Strategy: Configure resources and consumers close to each other

  Place servers (resources) and satellites (consumers) close to each other. If you need to increase the number of nodes in your OpenVMS Cluster, consider dividing it. See Section 8.2.4 for more information.

Strategy: Set adequate system parameters

  If your OpenVMS Cluster is growing rapidly, important system parameters may be out of date. Run AUTOGEN, which automatically calculates significant system parameters and resizes page, swap, and dump files.
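
The 80% guideline in the capacity planning strategy lends itself to a simple headroom check. The following sketch is illustrative only; the component names and utilization figures are invented for the example:

    # Illustrative capacity-planning check for the 80% guideline in Table 7-2.
    # Component names and utilization figures are made up for this example.

    SATURATION_THRESHOLD = 0.80   # running above this leaves little room to grow

    def needs_headroom(utilization):
        """Return the components running above the 80% guideline."""
        return [name for name, used in utilization.items()
                if used > SATURATION_THRESHOLD]

    current_load = {"CPU": 0.72, "memory": 0.86, "I/O": 0.64}
    print(needs_headroom(current_load))   # ['memory']: plan memory growth first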

7.3 Scalability in CI OpenVMS Clusters

Each CI star coupler can have up to 32 nodes attached; 16 can be systems and the rest can be storage controllers and storage. Figure 7-2, Figure 7-3, and Figure 7-4 show a progression from a two-node CI OpenVMS Cluster to a seven-node CI OpenVMS Cluster.
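
These star coupler limits can be checked mechanically. The following sketch is illustrative only; the helper is hypothetical, and the example populations are invented:

    # Hypothetical check of the CI star coupler limits described above:
    # at most 32 attached nodes in total, at most 16 of them OpenVMS systems.

    MAX_CI_NODES = 32
    MAX_CI_SYSTEMS = 16

    def star_coupler_ok(systems, storage_nodes):
        """True if the proposed population fits a single star coupler."""
        return (systems <= MAX_CI_SYSTEMS and
                systems + storage_nodes <= MAX_CI_NODES)

    print(star_coupler_ok(7, 12))    # True: 7 systems and 12 storage nodes fit
    print(star_coupler_ok(16, 20))   # False: 36 attached nodes exceed the limit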

7.3.1 Two-Node CI OpenVMS Cluster

In Figure 7-2, two nodes have shared, direct access to storage that includes a quorum disk. The VAX and Alpha systems each have their own system disks.

Figure 7-2 Two-Node CI OpenVMS Cluster



