Volume Shadowing for OpenVMS

When you dismount a shadow set on a single node in an OpenVMS Cluster system, and other nodes in the cluster still have the shadow set mounted, none of the shadow set members are spun down, even if you have not specified the DMT$M_NOUNLOAD flag. After this call completes, the shadow set is unavailable on the node from which the call was made. The shadow set is still available to other nodes in the cluster that have the shadow set mounted.

If the node on which the shadow set is being dismounted is the only node that has the shadow set mounted, the shadow set dissolves. The shadow set member devices are spun down unless you specify the DMT$M_NOUNLOAD flag.

The MACRO-32 code in Example 5-5 demonstrates how to use the $DISMOU system service to dismount the shadow set represented by the virtual unit DSA23.

Example 5-5 Dismounting and Dissolving a Shadow Set Locally

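$DMTDEF                         ; Define the DMT$ flag symbols
FLAGS:  .LONG   0               ; No flags set: dismount on this node only
DSA23:  .ASCID  /DSA23:/        ; Descriptor for the virtual unit name
 .
 .
 .

$DISMOU_S -
 devnam = DSA23, -
 flags = FLAGS
 .
 .
 .
.END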

When a shadow set is dissolved, there is no longer any meaningful relationship between the virtual unit and the shadow set members, or among the shadow set members.

Each of the former shadow set members can be mounted as a single disk for other purposes.
Each volume, however, continues to be marked as having been part of a shadow set. After you dissolve a shadow set, each volume retains the volume shadowing generation number that identifies it as being a former shadow set member (unless you remount the volume outside of the shadow set). Volumes marked as having been part of a shadow set are automatically software write-locked to prevent accidental deletion of data. You cannot mount these volumes for writing outside of a shadow set unless you use the MNT$M_OVR_SHAMEM option with the system service MNT$_FLAGS item code.
The virtual unit changes to an offline state.

The MACRO-32 code in Example 5-6 demonstrates a call to the $DISMOU system service to perform a dismount across the cluster. When the shadow set is dismounted from the last node, the shadow set is dissolved.

Example 5-6 Dismounting and Dissolving a Shadow Set Across the Cluster

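$DMTDEF                         ; Define the DMT$ flag symbols
FLAGS:  .LONG   DMT$M_CLUSTER   ; Dismount on every node in the cluster
DSA23:  .ASCID  /DSA23:/        ; Descriptor for the virtual unit name
 .
 .
 .

$DISMOU_S -
 devnam = DSA23, -
 flags = FLAGS
 .
 .
 .
.END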

You must specify the DMT$M_CLUSTER option with the flags argument if you want the shadow set dismounted from every node in the cluster. When each node in the cluster has dismounted the shadow set (the number of hosts having the shadow set mounted reaches zero), the volume shadowing software dissolves the shadow set.
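The dismount behavior described in this section can be summarized with a small Python model. This is an illustrative sketch only: the ShadowSet class, the dismount function, and the node names are hypothetical and are not part of OpenVMS or of the $DISMOU interface. The model assumes that the set dissolves when the last node dismounts it, and that the members are spun down at that point unless the no-unload option is given.

```python
# Toy model of local and clusterwide dismounts; not the OpenVMS implementation.
from dataclasses import dataclass, field


@dataclass
class ShadowSet:
    virtual_unit: str
    members: list
    mounted_on: set = field(default_factory=set)  # nodes with the set mounted
    dissolved: bool = False
    spun_down: bool = False


def dismount(shadow_set, node, cluster=False, nounload=False):
    """Dismount from one node, or from every node when cluster=True."""
    if cluster:
        shadow_set.mounted_on.clear()        # models DMT$M_CLUSTER
    else:
        shadow_set.mounted_on.discard(node)  # local dismount only
    if not shadow_set.mounted_on:
        # The host count reached zero: the shadow set dissolves.
        shadow_set.dissolved = True
        # Members spin down unless the no-unload option was requested.
        shadow_set.spun_down = not nounload
    return shadow_set
```

In this model, a dismount on NODEA while NODEB still has the set mounted leaves the set intact and the members spinning; only the dismount that empties mounted_on dissolves the set.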

5.4.3 Setting $DISMOU Flags for Shadow Set Operations

Table 5-2 lists the options for the $DISMOU flags argument and describes the shadow set operations that use these options. For a full description of each of these flag options, see the description of the $DISMOU service in the OpenVMS System Services Reference Manual.

Table 5-2 $DISMOU Flag Options
Option	Description
DMT$M_UNLOAD	Valid for all shadowing-related requests
DMT$M_CLUSTER	Valid for all shadowing-related requests
DMT$M_ABORT	Honored for virtual units, ignored for member units
DMT$M_UNIT	Ignored for virtual units and their members

5.5 Evaluating Condition Values Returned by $DISMOU and $MOUNT

This section discusses the condition values returned by the $DISMOU and $MOUNT system services that pertain to mounting and using shadow sets. For a complete list of the condition values returned by these services, see the OpenVMS System Services Reference Manual.

If $MOUNT returns the condition value SS$_BADPARAM, your item list probably contains one of the following errors:

- The virtual unit specified in one of your MNT$_SHANAM item descriptors contains a name other than DSAn:.
- A MNT$_SHAMEM item descriptor appears in the item list before any MNT$_SHANAM item descriptor.
- Your item list contains a MNT$_SHANAM item descriptor, but it is not followed by the item descriptor MNT$_SHAMEM.
- A MNT$_DEVNAM item descriptor appears in the item list in the middle of a series of item descriptors that specify a single shadow set. You can construct a volume set that contains one or more nonshadowed disks, as well as one or more shadow sets. However, when you use the MNT$_DEVNAM item descriptor to specify the nonshadowed disk, it must not appear between the MNT$_SHANAM item descriptor that specifies a virtual unit and the item descriptors that specify the members of the shadow set that the virtual unit represents.

The following list contains possible status messages that $MOUNT can return when mounting and using shadow sets:
- SS$_VOLINV (label mismatched)
- SS$_SHACHASTA (shadow state change occurred during a mount operation)
- SS$_MEDOFL (physical unit not accessible)
- SS$_INCSHAMEM (physical disk incompatible for shadow set)

See also Appendix B for shadowing-related status messages.
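The item list ordering rules that lead to SS$_BADPARAM, listed earlier in this section, can be checked before calling $MOUNT. The following Python sketch is one possible pre-check, not the validation that $MOUNT itself performs. The function check_item_list, the (item code, value) tuple representation, and the strict DSAn: pattern are assumptions made for illustration; item codes other than the three named here are simply ignored.

```python
import re

# Assumption: a virtual unit name is exactly "DSA", a unit number, and a colon.
VIRTUAL_UNIT = re.compile(r"^DSA\d+:$")


def check_item_list(items):
    """Return a list of problems; an empty list means no rule was violated.

    items is a sequence of (item_code, value) tuples in item list order.
    """
    problems = []
    in_set = False   # a MNT$_SHANAM has opened a shadow set description
    members = 0      # MNT$_SHAMEM items seen for the open shadow set
    for code, value in items:
        if code == "MNT$_SHANAM":
            if in_set and members == 0:
                problems.append("MNT$_SHANAM is not followed by MNT$_SHAMEM")
            if not VIRTUAL_UNIT.match(value.upper()):
                problems.append(f"virtual unit name {value!r} is not DSAn:")
            in_set, members = True, 0
        elif code == "MNT$_SHAMEM":
            if in_set:
                members += 1
            else:
                problems.append(
                    "MNT$_SHAMEM does not follow a MNT$_SHANAM for its shadow set")
        elif code == "MNT$_DEVNAM":
            if in_set and members == 0:
                problems.append(
                    "MNT$_DEVNAM appears between MNT$_SHANAM and its members")
            in_set, members = False, 0  # a nonshadowed disk ends the set
    if in_set and members == 0:
        problems.append("MNT$_SHANAM is not followed by MNT$_SHAMEM")
    return problems
```

A list that names a virtual unit, then its members, then a nonshadowed disk passes the check; a MNT$_DEVNAM placed between the members of one shadow set is reported because the member that follows it no longer belongs to an open set.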

5.6 Using $GETDVI to Obtain Information About Shadow Sets

The $GETDVI system service is useful for obtaining information about the shadow set devices on your system. Through the use of the shadow set item codes, you can determine the following types of information:

Whether a device is a shadow set virtual unit or a shadow set member
Whether a device is the target of a copy or merge operation
The name of the virtual unit that represents the shadow set of which the particular device is a member
The entire membership of a shadow set, including the virtual unit and all of the members
Whether or not a member has been removed from the shadow set

The call to $GETDVI has the following format:

SYS$GETDVI [efn],[chan],[devnam],itmlst,[iosb],[astadr],[astprm],[nullarg]

For a complete description of the $GETDVI and $GETDVIW services and their arguments, see the OpenVMS System Services Reference Manual.

Note
If you use the file-system-related item codes with the $GETDVI system service to obtain meaningful system information (such as FREEBLOCK information) for a shadow set, you should specify the virtual unit name with the $GETDVI service. If you specify the device name of one of the shadow set members, the $GETDVI service returns a value of 0.

5.6.1 $GETDVI Shadow Set Item Codes

Table 5-3 lists the information returned by the $GETDVI shadow set item codes.

Table 5-3 SYS$GETDVI Item Codes
Item Code	Function
DVI$_SHDW_CATCHUP_COPYING	Returns a Boolean longword. The value 1 indicates that the device is the target of a copy operation.
DVI$_SHDW_MASTER	Returns a Boolean longword. The value 1 indicates that the device is a virtual unit.
DVI$_SHDW_MASTER_NAME¹	When the specified device is a shadow set member, $GETDVI returns the name of the virtual unit that represents the shadow set of which the specified device is a member. If you specify a virtual unit or a device that is not a shadow set member, $GETDVI returns a null string.
DVI$_SHDW_MEMBER	Returns a Boolean longword. The value 1 indicates that the device is a shadow set member.
DVI$_SHDW_MERGE_COPYING	Returns a Boolean longword. The value 1 indicates that the device is a merge member of the shadow set.
DVI$_SHDW_NEXT_MBR_NAME¹	Returns the device name of the next member in the shadow set. If you specify a virtual unit, $GETDVI returns the device name of the lowest-numbered member of the shadow set (see Section 5.6.2.1). If you specify the name of a device that is neither a virtual unit nor a shadow set member, $GETDVI returns a null string.

¹Because shadow set device names can include up to 64 characters, the buffer length field of this item descriptor should specify 64 (bytes).

5.6.2 Obtaining the Device Names of Shadow Set Members

To obtain the device names of all members of a shadow set, you must make a series of calls to $GETDVI. In your first call to $GETDVI, you can specify either the virtual unit that represents the shadow set or the device name of a member of the shadow set.

5.6.2.1 Virtual Unit Names

If your first call specifies the name of the virtual unit, the item list should contain a DVI$_SHDW_NEXT_MBR_NAME item descriptor into which $GETDVI returns the name of the lowest-numbered member of the shadow set. The devnam argument of the next call to $GETDVI should specify the device name returned in the previous call's DVI$_SHDW_NEXT_MBR_NAME item descriptor. This second call's item list should contain a DVI$_SHDW_NEXT_MBR_NAME item descriptor to receive the name of the next-highest-numbered unit in the shadow set. You should repeat these calls to $GETDVI until $GETDVI returns a null string, which means that there are no more members in the shadow set.

5.6.2.2 Member Unit Names

If your first call specifies the device name of a shadow set member, you must determine the name of the virtual unit that represents the shadow set before you can obtain the device names of all members contained in the shadow set. Therefore, if your first call specifies a member, it should also specify an item list that contains a DVI$_SHDW_MASTER_NAME item descriptor. $GETDVI returns the name of the virtual unit that represents the shadow set into this descriptor. You can now make the series of calls to $GETDVI described in Section 5.6.2.1. The devnam argument of each call specifies the name of the device returned in the previous call's DVI$_SHDW_NEXT_MBR_NAME item descriptor. You repeat these calls until $GETDVI returns a null string, indicating that there are no more members in the shadow set.
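The call sequence described in Sections 5.6.2.1 and 5.6.2.2 can be sketched as a loop. The Python below is a simulation, not a call to the real system service: getdvi is a hypothetical stand-in that answers only the two item codes involved, and the device names in SHADOW_SETS are invented. It assumes that members are returned in ascending unit order and that a null string ends the walk.

```python
# Hypothetical in-memory stand-in for the device database.
SHADOW_SETS = {"DSA23:": ["$1$DUA10:", "$1$DUA11:", "$1$DUA12:"]}


def getdvi(devnam, item):
    """Simulated $GETDVI that answers two shadow set item codes."""
    for virtual_unit, members in SHADOW_SETS.items():
        if item == "DVI$_SHDW_MASTER_NAME":
            if devnam in members:
                return virtual_unit
        elif item == "DVI$_SHDW_NEXT_MBR_NAME":
            if devnam == virtual_unit:
                return members[0] if members else ""
            if devnam in members:
                nxt = members.index(devnam) + 1
                return members[nxt] if nxt < len(members) else ""
    return ""  # null string: nothing to report for this device


def shadow_set_members(devnam):
    """Return (virtual_unit, members) starting from a virtual unit or a member."""
    # A member reports its virtual unit; a virtual unit reports a null string.
    master = getdvi(devnam, "DVI$_SHDW_MASTER_NAME")
    virtual_unit = master or devnam
    members = []
    name = getdvi(virtual_unit, "DVI$_SHDW_NEXT_MBR_NAME")
    while name:  # a null string means there are no more members
        members.append(name)
        name = getdvi(name, "DVI$_SHDW_NEXT_MBR_NAME")
    return virtual_unit, members
```

Starting from either "DSA23:" or "$1$DUA11:" yields the same virtual unit and the same member list, because the member case first resolves the virtual unit and then restarts the walk from it.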

Chapter 6
Ensuring Shadow Set Consistency

Volume shadowing performs four basic functions. The two most important, as with any disk I/O subsystem, are to satisfy read and write requests. The other two functions, copy and merge, are required for shadow set maintenance.

Copy and merge operations are the cornerstone of achieving data availability. Under certain circumstances, Volume Shadowing for OpenVMS must perform a copy or a merge operation to ensure that corresponding LBNs on all shadow set members contain the same information. Although volume shadowing automatically performs these operations, this chapter provides an overview of their operation.

Copy and merge operations occur at the same time that applications and user processes read and write to active shadow set members, thereby having a minimal effect on current application processing. Refer also to Chapter 8 for information about copy and merge operation performance.

6.1 Shadow Set Consistency

During the life of a shadow set, the state of any shadow set member relative to the rest of the members of the shadow set can vary. The shadow set is considered to be in a steady state when all of its members contain identical data. Changes in the composition of the shadow set are inevitable because:

Disk drives occasionally need corrective maintenance.
New disks are added to replace other disks.
System failures occur, requiring merge operations to take place within the shadow set.
Controllers fail, requiring maintenance.
System management functions, such as backup, are required.

For example, suppose an operator dismounts a member of a shadow set and then remounts the member back into the shadow set. During the member's absence, the remaining members of the shadow set may have experienced write operations. Thus, the information on the member being remounted into the shadow set will differ from the information on the rest of the shadow set. Therefore a copy operation is required.

As another example, consider a situation where a shadow set is mounted by several systems in an OpenVMS Cluster configuration. If one of those systems fails, the data on the members of the shadow set may differ because of outstanding or incomplete write operations issued by the failed system. The shadowing software resolves this situation by performing a merge operation.

In any event, copy and merge operations allow volume shadowing to preserve the consistency of the data written to the shadow set. A shadow set is considered to be in a transient state when one or more of its members are undergoing a copy or a merge operation. Additionally, volume shadowing maintains shadow set consistency by:

Maintaining consistent data on shadow set members by automatically detecting and replacing bad blocks on one shadow set member and rewriting those bad blocks with good data from another shadow set member.
Notifying all nodes when a member is added or removed from a shadow set, and ensuring the shadow set membership is consistent clusterwide.

Volume shadowing uses two internal mechanisms to coordinate shadow set consistency:

Shadow set generation number
Volume shadowing uses a shadow set generation number as a primary method of determining shadow set member validity and status. A shadow set generation number is an incrementing value that is stored on every member of a shadow set. Each time a membership change occurs to the shadow set (members are mounted, dismounted, or fail), the generation number on the remaining members is incremented. Thus, if a shadow set's generation number is 100 and a member is dismounted from the set, the generation numbers on the remaining members are incremented to 101. The removed member's generation number remains at 100. When mounting shadow sets, the shadowing software uses the generation numbers on the physical units to determine the need for and direction of copy operations.
Storage control block (SCB)
Volume shadowing uses a storage control block (SCB) as a primary method for controlling shadow set membership. Each physical disk contains an SCB in which the shadowing software records the names of all the current members of the shadow set. Each time the composition of the shadow set changes, the SCB on all members is updated. This feature simplifies clusterwide membership coordination and is also used by the MOUNT qualifier /INCLUDE to reconstruct a shadow set.
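The generation number example above (100 becoming 101 on the remaining members) can be expressed as a short Python sketch. The function names and device names are hypothetical, and the rule used in copy_direction, that members holding the highest generation number are the copy sources, is an assumption made for illustration; the actual decision also involves the other SCB fields.

```python
def remove_member(generations, removed):
    """Increment the generation number on every member except the removed one.

    generations maps a device name to its generation number.
    """
    return {name: gen + (0 if name == removed else 1)
            for name, gen in generations.items()}


def copy_direction(generations):
    """Split members into copy sources (newest generation) and copy targets."""
    newest = max(generations.values())
    sources = sorted(n for n, g in generations.items() if g == newest)
    targets = sorted(n for n, g in generations.items() if g < newest)
    return sources, targets
```

With three members at generation 100, removing one leaves it at 100 while the other two move to 101, so a later remount would treat the removed disk as the copy target.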

Table 6-1 lists the information contained in the SCB.

Table 6-1 Information in the Storage Control Block (SCB)
SCB Information	Function
Volume label	Identifies a unique name for the volume. Every member of a shadow set must use the same volume label.
BACKUP revision number	A BACKUP/IMAGE restoration rearranges the location of data on a volume and sets a revision number to record this change. The Mount utility (MOUNT) checks the revision number of the proposed shadow set member against the numbers on current or other proposed shadow set members. If the revision number differs, the shadowing software determines whether a copy or merge operation is required to bring the data on the less current members up to date.
Volume shadowing generation number	When a volume joins a shadow set, it is marked with a volume shadowing generation number. You can erase the generation number by using the /OVERRIDE=SHADOW_MEMBERSHIP qualifier with the MOUNT command.
Mount and dismount status	The SCB mount status field is used as a flag that is set when a volume is mounted and cleared when it is dismounted. The MOUNT command checks this field when a disk is mounted. If the flag is set, this indicates that the disk volume was incorrectly dismounted. This will occur in the event of system failure. When mounting shadow sets that were incorrectly dismounted, the shadowing software automatically initiates merge operations.

Upon receiving a command to mount a shadow set, volume shadowing immediately determines whether a copy or a merge operation is required; if so, the volume shadowing software automatically performs the operation to reconcile data differences. If you are not sure which disks might be targets of copy operations, you can specify the /CONFIRM or /NOCOPY qualifiers when you use the MOUNT command. To disable the copy operations, use the /NOCOPY qualifier. If you mount a shadow set interactively, use the /CONFIRM qualifier to instruct MOUNT to display the targets of copy operations and request permission before the operations are performed.

When you dismount an individual shadow set member, you produce a situation similar to a hardware disk failure. Because files remain open on the virtual unit, the removed physical unit is marked as not being properly dismounted.

After one of the devices is removed from a shadow set, the remaining shadow set members have their generation number incremented, identifying them as being more current than the former shadow set member. This generation number aids in determining the correct copy operation if you remount the volume in a shadow set.

6.2 Copy Operations

The purpose of a copy operation is to duplicate data on a source disk to a target disk. At the end of a copy operation, both disks contain identical information, and the target disk becomes a complete member of the shadow set. Read and write access to the shadow set continues while a disk or disks are undergoing a copy operation.

The DCL command MOUNT initiates a copy operation when a disk is added to a shadow set. A copy operation is simple in nature: A source disk is read and the data is written to the target disk. This is usually done in multiblock increments referred to as LBN ranges. In an OpenVMS Cluster environment, all systems that have the shadow set mounted know about the target disk and include it as part of the shadow set. However, only one of the OpenVMS systems actually manages the copy operation.

Two complexities characterize the copy operation:

Handling user I/O requests while the copy operation is in progress
Dealing with writes to the area that is currently being copied without losing the new write data

Volume Shadowing for OpenVMS handles these situations differently depending on the operating system version number and the hardware configuration. For systems running software prior to VMS Version 5.5-2, the copy operation is performed by a VAX node and is known as an unassisted copy operation (see Section 6.2.1).

With Version 5.5-2 and later, the copy operation includes enhancements for shadow sets that are configured on controllers. These enhancements enable the controllers to perform the copy operation and are referred to as assisted copies (see Section 6.2.2).

Volume Shadowing for OpenVMS supports both assisted and unassisted shadow sets in the same cluster. Whenever you create a shadow set, add members to an existing shadow set, or boot a system, the shadowing software reevaluates each device in the changed configuration to determine whether it is capable of supporting the copy assist.

6.2.1 Unassisted Copy Operations

Unassisted copy operations are performed by a node. The actual transfer of data from the source member to the target is done through host node memory. Although unassisted copy operations are not CPU intensive, they are I/O intensive and consume a small amount of CPU bandwidth on the node that is managing the copy. An unassisted copy operation also consumes interconnect bandwidth.

On the CPU that manages the copy operation, user and copy I/Os compete evenly for the available I/O bandwidth. For other nodes in the cluster, user I/Os proceed normally and contend for resources in the controller with all the other nodes. Note that the copy operation may take longer as the user I/O load increases.

The volume shadowing software performs an unassisted copy operation when it is not possible to use the assisted copy feature (see Section 6.2.2). The most common cause of an unassisted copy operation is when the source and target disk or disks are not on line to the same controller subsystem. For unassisted copy operations, two disks can be active targets of the copy simultaneously. Disks participating in an unassisted copy operation may be on line to any controller anywhere in a cluster.

During an unassisted copy operation, the shadowing software maintains a copy fence. The fence moves across the disk, logically separating the copied and uncopied LBN areas. The node that is managing the copy operation knows the precise location of the fence and periodically notifies the other nodes in the cluster of the fence location. Thus, if the node performing the copy operation shuts down, another node can continue the operation without restarting at the beginning.

Read I/O requests to the copied side of the fence can be serviced by any shadow set member. Read requests to the uncopied side of the fence can be serviced only from a source member.

For write I/O requests, volume shadowing propagates the requests to all members of the shadow set.
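The copy fence rules above can be illustrated with a small Python model. It is a sketch under stated assumptions, not the shadowing driver: disks are plain lists of block values, the fence is taken to be the first uncopied LBN, and the copy is assumed to advance from LBN 0 upward. All function and device names are hypothetical.

```python
def copy_step(source, target, fence, range_size):
    """Copy one LBN range from source to target and return the new fence."""
    end = min(fence + range_size, len(source))
    target[fence:end] = source[fence:end]
    return end


def read_candidates(lbn, fence, sources, targets):
    """Members allowed to service a read of block lbn."""
    if lbn < fence:
        return sources + targets  # copied side: any member will do
    return list(sources)          # uncopied side: source members only


def write_members(sources, targets):
    """Writes are propagated to every member, including copy targets."""
    return sources + targets
```

Because every write goes to all members, a block written on the uncopied side is current on the target even before the fence reaches it, which is how new write data survives the copy.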

The time and amount of I/O required to complete an unassisted copy operation depends heavily on the similarities of the data on the source and target disks. It can take at least two and a half times longer to copy dissimilar data than it does to complete a copy operation on disks containing similar data.

6.2.2 Assisted Copy Operations

Unlike an unassisted copy, an assisted copy does not transfer data through the host node memory. The actual transfer of data is performed within the controller, by direct disk-to-disk data transfers, without data passing through the system. Thus, the assisted copy decreases the impact on the system, the I/O bandwidth consumption, and the time required for copy operations.

Shadow set members must be accessed from the same controller in order to take advantage of the assisted copy. The shadowing software controls the copy operation by using special MSCP copy commands, called disk copy data (DCD) commands, to instruct the controller to copy specific ranges of LBNs. For an assisted copy, only one disk can be an active target for a copy at a time.

For OpenVMS Cluster configurations, the node that is managing the copy operation issues a DCD MSCP command to the controller for each LBN range. The controller then performs the disk-to-disk copy, thus avoiding consumption of interconnect bandwidth.

By default, the Volume Shadowing for OpenVMS software (beginning with VMS Version 5.5-2) and the controller automatically enable the copy assist if the source and target disks are accessed through the same HSC controller.

Shadowing automatically disables the copy assist if:

- The shadow set is mounted on a controller that does not support the copy assist, or on a controller with the copy assist disabled.
- A copy operation is initiated from a node running software earlier than VMS Version 5.5-2.
- The source and target disks are not accessed using the same controller.
  In the case of dual-ported disks, you can use the $QIO SET PREFERRED PATH feature to force both disks to be accessed via the same controller. See the PREFER program in SYS$EXAMPLES and refer to the OpenVMS I/O User's Reference Manual for more information about setting a preferred path.
- The number of assisted copies specified by the DCD connection limit has been reached, at which point additional copies will be performed unassisted.
  See Determining the DCD Connection Limit for more information about setting the DCD connection limit.

See also Section 6.4 for information about disabling and reenabling the assisted copy capability.
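The conditions under which shadowing disables the copy assist can be read as a single decision. The Python function below restates that list as a sketch; the function name and its parameters are hypothetical, and the real software evaluates these conditions internally rather than through any such interface.

```python
def use_copy_assist(source_controller, target_controller,
                    controller_assist_enabled, node_supports_assist,
                    assisted_copies_active, dcd_connection_limit):
    """Return True if a copy would run assisted, False if unassisted."""
    if not controller_assist_enabled:
        return False  # controller lacks the assist or has it disabled
    if not node_supports_assist:
        return False  # node runs software earlier than Version 5.5-2
    if source_controller != target_controller:
        return False  # disks are not accessed through the same controller
    if assisted_copies_active >= dcd_connection_limit:
        return False  # DCD connection limit reached
    return True
```

Any single failing condition is enough to fall back to an unassisted copy, which is why a preferred path that moves both disks to one controller can restore the assist.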

6.3 Merge Operations

The purpose of a merge operation is to compare data on shadow set members and to ensure that inconsistencies are resolved. A merge operation is initiated when a system failure results in the possibility of incomplete writes. For example, if a write request is made to a shadow set but the system fails before a completion status is returned from all the shadow set members, it is possible that:

All members might contain the new data.
All members might contain the old data.
Some members might contain new data and others might contain old data.

The exact timing of the failure during the original write request defines which of these three scenarios results. When the system recovers, however, it is essential that corresponding LBNs on each shadow set member contain the same data (old or new). Thus, the issue here is not one of data availability, but rather of reconciling potential differences among shadow set members. Once the data on all disks is made identical, old data can be reconciled, if necessary, either by the user reentering the data or by database and application journaling techniques.

During a merge operation, the members of a shadow set are physically compared to each other to ensure that they contain the same data. This is done by performing a block-by-block comparison of the disks. As the merge proceeds, any blocks that are different are made the same (either both old or both new) by means of a copy operation. Because the shadowing software does not know which member contains the new data, any member can be the source member of the merge operation.

However, the shadowing software selects one member as a logical master for the merge operation. Any difference is resolved by propagating the information from the merge master to the other members.

Because one system is responsible for doing the merge operation on a given shadow set, the merge fence for this shadow set is updated after a range of LBNs is reconciled. This fence "proceeds" across the disk and separates the merged and unmerged portions of the shadow set. Read requests to the merged side of the fence can be satisfied by any member of the shadow set. Read requests to the unmerged side of the fence are also satisfied by any member of the shadow set; however, any potential data differences (discovered by doing a data compare operation) are corrected on all members of the shadow set before returning the data to the user or application that requested it.
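The merge behavior described in this section can be modeled in a few lines of Python. This is a simplified sketch, not the shadowing software's algorithm: each member is a list of block values keyed by a hypothetical device name, the merge master is chosen by the caller, and the fence is assumed to be the first unmerged LBN.

```python
def merge_range(members, master, start, end):
    """Reconcile blocks [start, end) by propagating the master's data.

    Returns end, which becomes the new merge fence when the ranges are
    processed in order.
    """
    master_blocks = members[master]
    for lbn in range(start, end):
        for blocks in members.values():
            if blocks[lbn] != master_blocks[lbn]:
                blocks[lbn] = master_blocks[lbn]  # difference found: fix it
    return end


def read_block(members, master, lbn, fence):
    """Read one block, reconciling it first if it lies on the unmerged side."""
    if lbn >= fence:
        merge_range(members, master, lbn, lbn + 1)
    # After reconciliation all members agree, so any member could answer.
    return members[master][lbn]
```

Note that the model does not try to decide which member holds the newer data; it only makes the members identical, which matches the goal stated above.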

  5423P005.HTM
  OSSG Documentation
  22-NOV-1996 13:03:43.36
