**************************** ECO SUMMARY INFORMATION ****************************

Release Date: 20-DEC-2005

=======================================================================
Hewlett-Packard OpenVMS ECO Cover Letter
=======================================================================

1  KIT NAME:

   VAXSHAD02_073

2  KIT DESCRIPTION:

2.1  Installation Rating:

     INSTALL_2: To be installed by all customers using the following
                feature(s):

                - None

     This installation rating, based upon current CLD information, is
     provided to serve as a guide to which customers should apply this
     remedial kit. (Reference attached Disclaimer of Warranty and
     Limitation of Liability Statement)

2.2  Reboot Requirement:

     Reboot Required.

     HP strongly recommends that a reboot is performed immediately
     after kit installation to avoid system instability.

     If you have other nodes in your OpenVMS cluster, they must also be
     rebooted in order to make use of the new image(s). If it is not
     possible or convenient to reboot the entire cluster at this time,
     a rolling re-boot may be performed.

2.3  Version(s) of OpenVMS to which this kit may be applied:

     OpenVMS VAX V7.3

2.4  New functionality or new hardware support provided:

     No.

3  KITS SUPERSEDED BY THIS KIT:

   - VAXSHAD01_073

4  KIT DEPENDENCIES:

4.1  The following remedial kit(s), or later, must be installed BEFORE
     installation of this, or any required kit:

     - None

4.2  In order to receive all the corrections listed in this kit, the
     following remedial kits, or later, should also be installed:

     - None.

5  NEW FUNCTIONALITY AND/OR PROBLEMS ADDRESSED IN THE VAXSHAD02_073 KIT

5.1  New functionality addressed in this kit

     Not Applicable

5.2  Problems addressed in this kit

5.2.1  Cluster Members Hang When Accessing Shadowset

5.2.1.1  Problem Description:

     In a multi-node cluster, some cluster members may hang when
     accessing a shadowset if:

     o  The shadowset being accessed has multiple members.

     o  All the shadowset members are local to one of the cluster
        nodes.
     o  All the shadowset members are being MSCP-served by the local
        node to the other cluster members.

     o  The local node goes down and remains down for at least
        MVTIMEOUT seconds.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.1.2  CLDs, and QARs reporting this problem:

5.2.1.2.1  CLD(s)

     70-3-4743

5.2.1.2.2  QAR(s)

     None.

5.2.1.3  Problem Analysis:

     Members are not removed from the shadowset after MVTIMEOUT
     seconds.

5.2.1.4  Release Version of OpenVMS that will contain this change:

     Not applicable.

5.2.1.5  Work-arounds:

     None.

5.2.2  OPCOM Displays Unit Without Any Other Message.

5.2.2.1  Problem Description:

     OPCOM will frequently display a virtual unit device, or the member
     device string, without any other message. This can be seen during
     startup if you are using a shadowed system disk, or even during
     the dismount of the virtual unit or the dismount of members.

     This has been corrected so that you now will receive a complete
     VMS message, such as:

       %SHADOW-I-VOLPROC, DSA719: shadow master has changed.
       Dump file WILL be written if system crashes.

     rather than just displaying:

       DSA719:

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.2.2  CLDs, and QARs reporting this problem:

5.2.2.2.1  CLD(s)

     None.

5.2.2.2.2  QAR(s)

     75-2-411

5.2.2.3  Problem Analysis:

     See problem description.

5.2.2.4  Release Version of OpenVMS that will contain this change:

     Not Applicable

5.2.2.5  Work-arounds:

     None.

5.2.3  System Hang During The Mounting Of A Shadowset.

5.2.3.1  Problem Description:

     During the mounting of a shadowset, a series of protocols can
     collide such that a thread is left waiting to be resumed and there
     is no thread to resume it. This can result in a system hang.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.3.2  CLDs, and QARs reporting this problem:

5.2.3.2.1  CLD(s)

     CFS.90085

5.2.3.2.2  QAR(s)

     75-13-829

5.2.3.3  Problem Analysis:

     The START_MBR_CHANGE_VP macro used in the START_PROTOCOL_END macro
     will exit with an error when it detects PASSIVE_MV.
     This causes either a loop in the START_PROTOCOL_END macro or two
     NL enques in a row which causes the STALL mechanism in the
     GRANT_LOCK code to fail to resume a stalled thread. The check for
     PASSIVE was added to prevent incorrect member removal during
     volume processing.

5.2.3.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.3.5  Work-arounds:

     None.

5.2.4  Multi-site Cluster Shadowset Member Is Returned To The
       Shadowset Incorrectly

5.2.4.1  Problem Description:

     In a multi-site cluster with all timeouts set to the maximum, a
     shadowset member is returned to the shadowset incorrectly. Manual
     removal of a member from one site, followed by manual aborting of
     the virtual unit at a second site, allowed a third site to return
     the member to the shadowset without either a copy or a merge.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.4.2  CLDs, and QARs reporting this problem:

5.2.4.2.1  CLD(s)

     70-3-5465, CFS.88963

5.2.4.2.2  QAR(s)

     None.

5.2.4.3  Problem Analysis:

     When the virtual unit is aborted, if there is an outstanding write
     a merge is triggered. Although the member has been removed from
     the first site, the third site still thinks it can do a merge. As
     soon as it can access the removed member it starts the merge. The
     update of the merge being started causes the member to be added
     back into the set on the first site.

     The fix is to not allow NODE_FAILURE to proceed until PASSIVE_MV
     has completed.

5.2.4.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.4.5  Work-arounds:

     None.
5.2.5  Ensure Shadow Copies Handle Bad Blocks Correctly

5.2.5.1  Problem Description:

     If a bad block is detected on the source disk during a full copy
     operation, the copy will abort with the following OPCOM message:

       %%%%%%%%%%%  OPCOM  10-MAY-2002 09:41:23.94  %%%%%%%%%%%
       (from node UKVMS3 at 10-MAY-2002 09:41:22.46)
       Message from user SYSTEM on UKVMS3
       %SHADOW_SERVER-E-SSRVTRMSTS, reason for termination of operation
       on device _DSA1:
       IVADDR, invalid media address

     The virtual unit will look like this afterwards:

       Device                 Device           Error  Volume        Free      Trans  Mnt
        Name                  Status           Count  Label         Blocks    Count  Cnt
       DSA1:                  Mounted              0  ALPHAE731_CD  15330438      1    1
       $7$DKA100:   (UKVMS3)  ShadowSetMember      2  (member of DSA1:)
       $7$DKA1000:  (UKVMS3)  ShadowCopying        0  (copy trgt DSA1:11% copied)

     Note that the virtual unit will still be accessible in this state.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.5.2  CLDs, and QARs reporting this problem:

5.2.5.2.1  CLD(s)

     None.

5.2.5.2.2  QAR(s)

     75-66-1156

5.2.5.3  Problem Analysis:

     When the SHADOW_SERVER is asked to do a shadow copy, it "steps"
     through the disk processing one 127-block chunk after another (the
     SCB is handled differently). It does this by sending IO$_COPYSHAD
     $QIOs to the shadowing driver. There are three pieces of
     information that the driver sends back to SHADOW_SERVER in
     response to an IO$_COPYSHAD, which are:

     1.  Status code

     2.  Byte transfer count

     3.  LBN copy fence (i.e. the last LBN successfully copied)

     If the LBN copy fence does not agree with what SHADOW_SERVER
     thinks it should be, then SHADOW_SERVER adjusts its value before
     moving on to the next LBN range.

     The problem occurs when a bad block is detected on the source
     volume. In this case, a zero is returned erroneously as the "LBN
     copy fence". The SHADOW_SERVER then attempts to start copying at
     LBN 1 again. It encounters a consistency check in SHDRIVER which
     aborts the COPYSHAD with an SS$_IVADDR error status.
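
     The fence handling described in 5.2.5.3 can be illustrated with a
     small model. The following is only a sketch in Python, not
     SHADOW_SERVER or SHDRIVER code. The function names, the 512-byte
     block size, the sample LBNs, and the exact form of the consistency
     check are all assumptions made for illustration; only the
     127-block chunk size, the three returned values, and the zero
     fence come from the analysis above.

```python
CHUNK_BLOCKS = 127   # chunk size quoted in the analysis
BLOCK_BYTES = 512    # assumed disk block size


def driver_copy(start_lbn, bad_lbn, zero_fence_on_error):
    """Hypothetical model of one IO$_COPYSHAD request.

    Returns the three values named in the analysis:
    (status, byte transfer count, LBN copy fence).
    """
    end = start_lbn + CHUNK_BLOCKS
    if start_lbn <= bad_lbn < end:
        copied = bad_lbn - start_lbn
        # Old behaviour: the fence is erroneously reported as zero.
        fence = 0 if zero_fence_on_error else bad_lbn - 1
        return "ERROR", copied * BLOCK_BYTES, fence
    return "OK", CHUNK_BLOCKS * BLOCK_BYTES, end - 1


def consistency_check(start_lbn, true_fence):
    """Assumed driver check: a request must begin right after the real fence."""
    return "OK" if start_lbn == true_fence + 1 else "SS$_IVADDR"


def simulate(zero_fence_on_error, start_lbn=1270, bad_lbn=1300):
    """Run one failing chunk, then let the server pick its next start LBN."""
    status, nbytes, fence = driver_copy(start_lbn, bad_lbn, zero_fence_on_error)
    true_fence = start_lbn + nbytes // BLOCK_BYTES - 1
    next_start = fence + 1   # the server adjusts to the reported fence
    return next_start, consistency_check(next_start, true_fence)


if __name__ == "__main__":
    print("zero fence reported:", simulate(True))
    print("real fence reported:", simulate(False))
```

     In this model a zero fence sends the server back to LBN 1, which
     the assumed check rejects, while a truthful fence lets the next
     request line up with the driver's own position. How the bad block
     itself is then handled is outside the scope of the sketch.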

5.2.5.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.5.5  Work-arounds:

     None.

5.2.6  Shadowset Aborts After Node Is Shut Down

5.2.6.1  Problem Description:

     Shadowsets on the remaining node of a multi-site cluster abort
     after the serving node of one member is shut down.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.6.2  CLDs, and QARs reporting this problem:

5.2.6.2.1  CLD(s)

     CFS.90313, CFS.91498, CFS.93820

5.2.6.2.2  QAR(s)

     None.

5.2.6.3  Problem Analysis:

     An attempt was made to keep a shadowset together when a cluster
     interconnect is intermittent. This allowed the behaviour of
     hanging the set until MVTIMEOUT then aborting it.

5.2.6.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.6.5  Work-arounds:

     None.

5.2.7  INVEXCEPTN Crash During Shadow Copy

5.2.7.1  Problem Description:

     When doing a controller-assisted copy, available with HSC and HSJ
     controllers, if the source member gets an error, an incorrect
     index is set up that results in a crash.

       Crashdump Summary Information:
       ------------------------------
       Bugcheck Type:     INVEXCEPTN, Exception while above ASTDEL
       Current Process:   NULL
       Current Image:     <not available>
       Failing PC:        FFFFFFFF 80268D60
       Failing PS:        30000000 00000804
       Module:            SYS$SHDRIVER
       Offset:            0004AD60
       :
       Exception Frame:
       R2 = FFFFFFFF 837A6380
       R3 = 00000000 00000000
       R4 = FFFFFFFF 839CB3C0
       R5 = FFFFFFFF 83689D80
       R6 = FFFFFFFF 839CB680
       R7 = 00000000 00000000
       PC = FFFFFFFF 80268D60
       PS = 30000000 00000804

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.7.2  CLDs, and QARs reporting this problem:

5.2.7.2.1  CLD(s)

     CFS.99512

5.2.7.2.2  QAR(s)

     None.

5.2.7.3  Problem Analysis:

     Ensure all users of shad$ca_target_index use a longword to move in
     and out of this field.

5.2.7.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.7.5  Work-arounds:

     None.
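
     The analysis in 5.2.7.3 asks that every access to
     shad$ca_target_index use a longword. The kit does not say which
     access width was mixed in, so the following Python sketch is only
     an illustration of the general hazard, under the assumption that
     one code path stored a 16-bit word into a 32-bit field. The
     function names and the initial field contents are hypothetical.

```python
import struct


def store_index(field, value, width):
    """Store an index into a 4-byte field using a word or longword move."""
    fmt = {"word": "<H", "longword": "<I"}[width]
    struct.pack_into(fmt, field, 0, value)


def load_index(field):
    """Read the field back as a full longword (little-endian, as on VAX)."""
    return struct.unpack_from("<I", field, 0)[0]


def demo(width):
    # Assume the field previously held all ones (for example, a -1 marker).
    field = bytearray(b"\xff\xff\xff\xff")
    store_index(field, 2, width)
    return load_index(field)


if __name__ == "__main__":
    print("word store, longword load:    ", hex(demo("word")))
    print("longword store, longword load:", hex(demo("longword")))
```

     With a word store the upper two bytes keep their old contents, so
     a longword reader sees 0xFFFF0002 instead of 2. An index of that
     kind, used to address a table, is one plausible way to end up with
     an invalid reference.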

5.2.8  System Crashes With SHADDETINCON Bugcheck During Boot

5.2.8.1  Problem Description:

     During boot from a shadowed system disk, the system can crash with
     a SHADDETINCON at SYS$SHDRIVER+72858 very early in the boot
     process.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.8.2  CLDs, and QARs reporting this problem:

5.2.8.2.1  CLD(s)

     CFS.99960, CFS.10599, CFS.10587

5.2.8.2.2  QAR(s)

     None.

5.2.8.3  Problem Analysis:

     The WATCHER lock can remain in use after an attempt to create the
     system disk shadowset. When this happens, the next attempt finds
     it in use and crashes the system.

5.2.8.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.8.5  Work-arounds:

     None.

5.2.9  Write Not Completed To All Members Of The Shadow Set

5.2.9.1  Problem Description:

     On shadow sets where write logging was being used for minimerge
     recovery operations, a condition existed where a write might not
     be completed to all members of the shadow set. The write, making
     it to at least one member, was logged but the shadowing driver did
     not recognize that it was logged. As a result, this difference in
     the data on the shadow set members was not corrected during a
     controller-based minimerge. The set would stay in the condition
     where some members had the written data while others did not. The
     analyze disk shadow utility would detect this discrepancy.

     This problem only occurred when using controller-based write
     logging.

     Images Affected:

     - [SYS$LDR]SHDRIVER.EXE

5.2.9.2  CLDs, and QARs reporting this problem:

5.2.9.2.1  CLD(s)

     None.

5.2.9.2.2  QAR(s)

     75-80-298

5.2.9.3  Problem Analysis:

     The CNID field, used to determine whether or not the failing node
     had an outstanding write to a set, was being cleared. If the node
     failure occurred on a system with a CNID other than zero, the
     write logging merge would not work since it would appear that the
     system had no writes.
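
     The effect described in 5.2.9.3 can be shown with a toy model.
     This Python sketch is not the driver's data structure: the write
     log as a list of records, the function names, and the sample LBN
     are assumptions for illustration. It only shows why a cleared CNID
     hides the writes of any node whose CNID is not zero.

```python
def log_write(write_log, cnid, lbn, cnid_cleared):
    """Record an outstanding write; optionally model the cleared CNID field."""
    write_log.append({"cnid": 0 if cnid_cleared else cnid, "lbn": lbn})


def node_had_writes(write_log, failed_cnid):
    """Decide whether the failed node appears to have outstanding writes."""
    return any(entry["cnid"] == failed_cnid for entry in write_log)


def merge_needed_after_failure(failed_cnid, cnid_cleared):
    """One node logs a write and then fails; is a merge triggered for it?"""
    log = []
    log_write(log, failed_cnid, lbn=4096, cnid_cleared=cnid_cleared)
    return node_had_writes(log, failed_cnid)


if __name__ == "__main__":
    for cnid in (0, 3):
        for cleared in (True, False):
            print(cnid, cleared, merge_needed_after_failure(cnid, cleared))
```

     In this model a node with CNID 0 is unaffected because the cleared
     value happens to match, which is consistent with the problem only
     appearing on systems with a CNID other than zero.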

5.2.9.4  Release Version of OpenVMS that will contain this change:

     Next release of OpenVMS VAX

5.2.9.5  Work-arounds:

     None.

6  FILES PATCHED OR REPLACED:

   o  [SYSEXE]SHADOW_SERVER.EXE (new image)

      Image Identification Information

        image name: "SHADOW_SERVER"
        image file identification: "X-13"
        link date/time: 16-MAR-2005 20:01:00.93
        linker identification: "V11-38"

   o  [SYS$LDR]SHDRIVER.EXE (new image)

      Image Identification Information

        image name: "SHDRIVER"
        image file identification: "SHADOW03"
        link date/time: 16-MAR-2005 20:01:42.04
        linker identification: "V11-38"

7  INSTALLATION INSTRUCTIONS

7.1  Test/Debug Image Loss

     In the course of debugging problems reported to OpenVMS
     Engineering, customers may be given debug or point-fix images to
     install. Typically, these images do not have the same image
     generation flags contained in images released via the OpenVMS
     remedial patch process. Because of this, any debug or point-fix
     image that is in the SYS$COMMON area will be replaced by any image
     of the same name installed by this kit. If this occurs, you will
     lose any functionality that is provided by the replaced image.

     If you wish to retain these debug or point-fix images, you can
     take the following steps:

     o  Prior to installing this kit, move the test/debug image(s) to
        be saved to the SYS$SPECIFIC area.

     o  During kit installation, you will be asked if you wish to
        delete the image(s) in SYS$SPECIFIC. You should answer "No" for
        each image that you want to keep.

     o  After installation completes, but before rebooting the system
        (if required), move the image(s) from SYS$SPECIFIC back to
        SYS$COMMON.

7.2  Compressed File

     This kit is provided as a DCX compressed kit. To expand this file
     to the installable .PCSI file, run the file with a RUN file_name
     command. When the file is run you will see the following output:

       $ RUN VAXSHAD02_073.A-DCX_VAXEXE
       FTSV DCX auto-extractible compressed file for OpenVMS (VAX)
       FTSV V3.0 -- FTSV$VAX_AXP_AUTO_EXTRACT
       Copyright (c) Digital Equipment Corp. 1993

       Options: [output_file_specification] [input_file_specification]

       The decompressor needs to know the filename to use for the
       decompressed file. If you don't specify any, it will use the
       original name of the file before it was compressed, and create
       it in the current directory. If you specify a directory name,
       the file will be created in that directory.

       Decompress into (file specification):

     If you want the file to be expanded into a different directory,
     enter the directory specification. DO NOT enter a new file name.
     The expanded file must retain the original name.

     If you want to expand the file via batch, the command file must
     contain an answer to the "Decompress into (file specification):"
     question, either a <CR> or an alternate directory specification.

7.3  Installation Command

     Install this kit with the VMSINSTAL utility by logging into the
     SYSTEM account, and typing the following at the DCL prompt:

       @SYS$UPDATE:VMSINSTAL VAXSHAD02_073 [location of the saveset]

     The saveset location may be a tape drive, CD, or a disk directory
     that contains the kit saveset.

8  COPYRIGHT AND DISCLAIMER:

   (C) Copyright 2005 Hewlett-Packard Development Company, L.P.

   Confidential computer software. Valid license from HP and/or its
   subsidiaries required for possession, use, or copying.

   Consistent with FAR 12.211 and 12.212, Commercial Computer Software,
   Computer Software Documentation, and Technical Data for Commercial
   Items are licensed to the U.S. Government under vendor's standard
   commercial license.

   Neither HP nor any of its subsidiaries shall be liable for technical
   or editorial errors or omissions contained herein. The information
   in this document is provided "as is" without warranty of any kind
   and is subject to change without notice. The warranties for HP
   products are set forth in the express limited warranty statements
   accompanying such products. Nothing herein should be construed as
   constituting an additional warranty.

   DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY

   THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND. ALL
   EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
   INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
   PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
   EXTENT PERMITTED BY APPLICABLE LAW. IN NO EVENT WILL HP BE LIABLE
   FOR ANY LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
   CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
   REGARDLESS OF THE THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH
   MADE AVAILABLE HERE OR TO THE USE OF SUCH PATCH.