
OpenVMS VAX System Dump Analyzer Utility Manual



You can use the SDA COPY command or the DCL COPY command in your site-specific startup procedure. Digital recommends using the SDA COPY command because it marks the dump file as copied. This is particularly important if the dump was written into the paging file, SYS$SYSTEM:PAGEFILE.SYS, because the SDA COPY command releases to the pager the pages that were occupied by the dump.

Using /IGNORE=NOBACKUP

Because system dump files are set to NOBACKUP, the Backup utility (BACKUP) does not copy dump files to tape unless you use the qualifier /IGNORE=NOBACKUP when invoking BACKUP. When you use the SDA COPY command to copy the system dump file to another file, the new file is not set to NOBACKUP.

As included in the distribution kit, SYS$SYSTEM:SYSDUMP.DMP is protected against world access. Because a dump file can contain privileged information, Digital recommends that you continue to protect dump files from universal read access.

2.3 Invoking SDA in the Site-Specific Startup Command Procedure

Because a listing of the SDA output is an important source of information in determining the cause of a system failure, it is a good idea to have SDA produce such a listing after every failure. The system manager can ensure the creation of a listing by modifying the site-specific startup command procedure SYS$MANAGER:SYSTARTUP_VMS.COM so that it invokes SDA when the system is booted.

When invoked in the site-specific startup procedure, SDA executes the specified commands only if the system is booting immediately after a system failure. SDA examines a flag in the dump file's header that indicates whether it has already processed the file. If the flag is set, SDA merely exits. If the flag is clear, SDA executes the specified commands and sets the flag. The operating system clears this flag when it initially writes a crash dump, unless the dump results from an operator-requested shutdown (for instance, one performed with SYS$SYSTEM:SHUTDOWN.COM).
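
The flag check described above can be modeled in a few lines of Python. This is only an illustrative sketch: the names DumpHeader, analyzed, write_dump, and startup_sda are hypothetical and do not correspond to actual dump header fields or SDA internals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DumpHeader:
    # Hypothetical stand-in for the "already processed" flag in the dump header.
    analyzed: bool = False


def write_dump(header: DumpHeader, operator_shutdown: bool) -> None:
    """Model the system writing a dump: the flag is clear after a failure,
    but set when the dump results from an operator-requested shutdown."""
    header.analyzed = operator_shutdown


def startup_sda(header: DumpHeader, commands: List[str]) -> List[str]:
    """Return the commands SDA would execute when invoked at startup."""
    if header.analyzed:
        return []              # flag set: SDA merely exits
    header.analyzed = True     # flag clear: run the commands, then set the flag
    return list(commands)
```

In this model, calling startup_sda twice after a single failure runs the commands only the first time, which is why the startup procedure can invoke SDA unconditionally on every boot.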

Using SYSDUMP.DMP

The following example shows typical commands that you might add to your site-specific startup command procedure to produce an SDA listing after each failure.

$ ! 
$ !      Print dump listing if system just failed 
$ ! 
$ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP 
   COPY SYS$SYSTEM:SAVEDUMP.DMP       ! Save dump file 
   SET OUTPUT DISK1:SYSDUMP.LIS       ! Create listing file 
   READ/EXEC         ! Read symbols into the SDA symbol table 
   SHOW CRASH        ! Display crash information 
   SHOW STACK        ! Show current stack 
   SHOW SUMMARY      ! List all active processes 
   SHOW PROCESS/PCB/PHD/REG           ! Display current process 
   SHOW SYMBOL/ALL   ! Print system symbol table 
   EXIT 
$ PRINT DISK1:SYSDUMP.LIS 

The COPY command in the preceding example saves the contents of the file SYS$SYSTEM:SYSDUMP.DMP. If your system's startup command file does not save a copy of the contents of this file, this crash dump information is lost in the next system failure, when the system saves the information about the new failure, overwriting the contents of SYS$SYSTEM:SYSDUMP.DMP.

Using PAGEFILE.SYS

If you are using SYS$SYSTEM:PAGEFILE.SYS as the crash dump file, you must include SDA commands in SYS$MANAGER:SYSTARTUP_VMS.COM that free the space occupied by the dump so that the pager can use it. For instance:

$ ANALYZE/CRASH_DUMP SYS$SYSTEM:PAGEFILE.SYS 
   .
   .
   .
   COPY dump_filespec 
   EXIT 

3 Analyzing a System Dump

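SDA performs certain tasks prior to bringing a dump into memory, presenting its initial displays, and accepting command input. This section describes those tasks: mapping the contents of the dump file (Section 3.2), building the SDA symbol table (Section 3.3), and executing the SDA initialization file (Section 3.4).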

For detailed information about the investigation of a system failure, see Section 9.

Requirements

To be able to analyze a dump file, your process must have the following:

3.1 Invoking SDA

If your process satisfies these conditions, you can issue the DCL command ANALYZE/CRASH_DUMP to invoke SDA. If you do not specify the name of a dump file in the command, SDA prompts you for the name of the file, as follows:

$ ANALYZE/CRASH_DUMP
_Dump File:

The default file specification is as follows:

disk:[default-dir]SYSDUMP.DMP 

disk and [default-dir] represent the disk and directory specified in your last SET DEFAULT command.

3.2 Mapping the Contents of the Dump File

SDA first attempts to map the contents of physical memory as stored in the specified dump file. To do this, it must first locate the system page table (SPT) among its contents. The SPT contains one entry for each page of system virtual address space.

The SPT appears at the largest physical addresses in a typical configuration. As a result, if a dump file is too small, the SPT cannot be written to it in the event of system failure.

If SDA cannot find the SPT in the dump file, it displays either of the following messages:

%SDA-E-SPTNOTFND, system page table not found in dump file 
%SDA-E-SHORTDUMP, the dump only contains m out of n pages of physical memory 

If SDA displays either of these error messages, you cannot analyze the crash dump, but must take steps to ensure that any subsequent dump can be preserved. To do this, you must increase the size of the dump file, as indicated in Section 2.1, or adjust the system DUMPSTYLE parameter, as discussed in Section 2.1.2.

Under certain conditions, the system might not save some memory locations in the system dump file. For instance, during halt/restart bugchecks, the system does not preserve the contents of general registers. If such a bugcheck occurs, SDA indicates in the SHOW CRASH display that the contents of the registers were destroyed. Additionally, if a bugcheck occurs during system initialization, the contents of the register display might be unreliable. The symptom of such a bugcheck is a SHOW SUMMARY display that shows no processes or only the swapper process.

Also, if you use an SDA command to access a virtual address that has no corresponding physical address, SDA displays the following error message:

%SDA-E-NOTINPHYS, 'location' not in physical memory 

When you analyze a subset dump file, if you use an SDA command to access a virtual address that has a corresponding physical address but was not saved in the dump file, SDA displays the following error message:

%SDA-E-MEMNOTSVD, memory not saved in the dump file 

3.3 Building the SDA Symbol Table

After locating and reading the system dump file, SDA attempts to read the system symbol table file into the SDA symbol table. This file, named SYS$SYSTEM:SYS.STB by default, contains most of the global symbols used by the operating system. SDA also reads into its symbol table a subset of SYS$SYSTEM:SYSDEF.STB, called SYS$SYSTEM:REQSYSDEF.STB, that it requires to identify locations in memory.

If SDA cannot find the system symbol table file, or if the file specified with the /SYMBOL qualifier to the ANALYZE command is not a system symbol table, SDA halts with a fatal error.

When SDA finishes building its symbol table, it displays a message identifying itself and the immediate cause of the crash. In the following example, the cause of the crash was an illegal exception occurring at an IPL above IPL$_ASTDEL or while using the interrupt stack.

Dump taken on 28-Jan-1993  18:10:09.79 
INVEXCEPTN, Exception while above ASTDEL or on interrupt stack 

3.4 Executing the SDA Initialization File (SDA$INIT)

After displaying the crash summary, SDA executes the commands in the SDA initialization file, if you have established one. SDA refers to its initialization file by using the logical name SDA$INIT. If SDA cannot find the file defined as SDA$INIT, it searches for the file SYS$LOGIN:SDA.INIT.
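
The lookup order described above can be sketched as follows. The function name find_init_file and its two arguments (a table of logical names and a set of files that exist) are assumptions made for illustration; they are not part of SDA.

```python
from typing import Dict, Optional, Set


def find_init_file(logical_names: Dict[str, str],
                   existing_files: Set[str]) -> Optional[str]:
    """Return the initialization file SDA would use, or None if there is none."""
    # First choice: the file that the logical name SDA$INIT translates to.
    target = logical_names.get("SDA$INIT")
    if target is not None and target in existing_files:
        return target
    # Fallback: the file SDA.INIT in the login directory.
    fallback = "SYS$LOGIN:SDA.INIT"
    return fallback if fallback in existing_files else None
```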

The initialization file can contain SDA commands that read symbols into SDA's symbol table, define keys, establish a log of SDA commands and output, or perform other tasks. For instance, you might want to use an SDA initialization file to augment SDA's symbol table with definitions helpful in locating system code.

If you issue the following command, SDA includes those symbols that define many of the system's data structures, including those in the I/O database:

READ SYS$SYSTEM:SYSDEF.STB 

You might also find it very helpful to define those symbols that identify the modules in the images that make up the executive. You can do this by issuing the following command:

READ/EXECUTIVE SYS$LOADABLE_IMAGES 

After SDA executes the commands in the initialization file, it displays its prompt, as follows:

SDA> 

The SDA> prompt indicates that you can use SDA interactively and enter SDA commands.

An SDA initialization file can invoke a command procedure with the @ command. However, such command procedures cannot themselves invoke a command procedure (that is, you cannot have nested command procedures).

4 Analyzing a Running System

Occasionally, an internal problem hinders system performance but does not cause a system failure. By allowing you to examine the running system, SDA provides the means to search for the solution to the problem without disturbing the operating system. For example, you can use SDA to examine the stack and memory of a process that is stalled in a scheduler state, such as a miscellaneous wait (MWAIT) or a suspended (SUSP) state (see the Guide to OpenVMS Performance Management).

If your process has change-mode-to-kernel (CMKRNL) privilege, you can invoke SDA to examine the system. Use the following DCL command:

$ ANALYZE/SYSTEM

SDA then does the following:

  1. Attempts to load the system symbol table (SYS$SYSTEM:SYS.STB) and symbol table SYS$SYSTEM:REQSYSDEF.STB.
  2. Executes the contents of any existing SDA initialization file, as it does when invoked to analyze a crash dump (see Sections 3.3 and 3.4, respectively).
  3. Displays its identification message and prompt, as follows:
    OpenVMS System analyzer 
     
    SDA> 
    

The SDA> prompt indicates that you can use SDA interactively and enter SDA commands. When analyzing a running system, SDA sets its process context to that of the process running SDA.

If you are undertaking an analysis of a running system, take the following considerations into account:

5 SDA Context

When invoked to analyze either a crash dump or a running system, SDA establishes a default context from which it interprets certain commands.

When the subject of analysis is a uniprocessor system, SDA's context is solely process context. That is, SDA can interpret its process-specific commands in the context of either the process current on the uniprocessor or some other process in some other scheduling state.

When you initially invoke SDA to analyze a crash dump, its process context defaults to that of the process that was current at the time of the crash. When you invoke SDA to analyze a running system, its process context defaults to that of the current process; that is, the one executing SDA.

You can change SDA's process context by issuing the SET PROCESS or SHOW PROCESS command with either a process name or the /INDEX=n qualifier (see Section 7).

6 CPU Context

In a uniprocessor system only one CPU exists, and the concept of SDA CPU context is not an issue. However, for a multiprocessor system with more than one active CPU, SDA must maintain an idea of CPU context to provide a way of displaying information bound to a specific CPU, such as the reason for the bugcheck exception, the currently executing process, the current IPL, the contents of CPU registers, and any owned spin locks. When you first invoke SDA to analyze a crash dump, the SDA current CPU is the CPU that induced the system failure.

Changing the CPU Context

You can use several SDA commands to change the CPU context. When you change the CPU context, the "SDA current process" is changed to the current process on the "SDA current CPU" to synchronize CPU context and process context. If no current process is on the "SDA current CPU," the "SDA current process" is undefined; no process context information will be available until you set SDA process context to a specific process.

Type HELP PROCESS_CONTEXT for specific information about the "SDA current process."

The following SDA commands change the "SDA current CPU":
Command            Description
SET CPU cpu_id     Changes the "SDA current CPU" to CPU cpu_id
SHOW CPU cpu_id    Changes the "SDA current CPU" to CPU cpu_id
SHOW CRASH         Changes the "SDA current CPU" to the CPU that induced the system failure

If you select a process that is the current process on a CPU, the following commands change the "SDA current CPU" to that CPU: SET PROCESS name, SET PROCESS /INDEX=n, SHOW PROCESS name, and SHOW PROCESS /INDEX=n.

No other SDA commands affect the "SDA current CPU."


Note

When you analyze the running system, you cannot use the SET CPU and SHOW CPU commands because SDA does not have access to all the CPU-specific information about the running system.

7 Process Context

In a uniprocessor system, process context might be the process that is current on the CPU or the process in whose context process-specific SDA commands are interpreted. For a multiprocessor system with more than one active CPU, however, the meaning of SDA process context changes so that it includes a way to display information relevant to a specific process both when the process is current on a processor and when the process is not.

You can use several SDA commands to change SDA process context. Following is a list of the results of some of these changes:

Type HELP CPU_CONTEXT for specific information about the "SDA current CPU."

The following SDA commands change the "SDA current process":
Command                  Description
SET PROCESS name         Changes the "SDA current process" to the named process
SET PROCESS /INDEX=n     Changes the "SDA current process" to the process with index n
SHOW PROCESS name        Changes the "SDA current process" to the named process
SHOW PROCESS /INDEX=n    Changes the "SDA current process" to the process with index n

The following commands change the SDA process context if the "SDA current process" is not the current process on the selected CPU:
Command            Description
SET CPU cpu_id     Changes the "SDA current process" to the current process on CPU cpu_id
SHOW CPU cpu_id    Changes the "SDA current process" to the current process on CPU cpu_id
SHOW CRASH         Changes the "SDA current process" to the current process on the CPU that induced the system failure

No other SDA commands affect the "SDA current process."


Note

When you analyze the running system, CPU context is not used because all the CPU-specific information might not be available.

Changing the SDA CPU Context

When you invoke SDA to analyze a crash dump from a multiprocessing system with more than one active CPU, SDA maintains a second dimension of context, its CPU context, that allows it to display certain processor-specific information, such as the reason for the bugcheck exception, the currently executing process, the current IPL, the contents of processor-specific registers, the interrupt stack pointer (ISP), and the spin locks owned by the processor. When you invoke SDA to analyze a multiprocessor's crash dump, its CPU context defaults to that of the processor that induced the system failure.³

You can change the SDA CPU context by using any of the following commands: SET CPU cpu_id, SHOW CPU cpu_id, or SHOW CRASH.

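Changing CPU context involves an implicit change in process context in either of the following ways: if a process is current on the newly selected CPU, SDA process context changes to that of that process; if no process is current on that CPU, SDA process context is undefined, and no process context information is available until you set SDA process context to a specific process.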

Likewise, changing process context can involve a switch of CPU context as well. For instance, if you issue a SET PROCESS command for a process that is current on another CPU, SDA automatically changes its CPU context to that of the CPU on which that process is current. The following commands can have this effect if the name or index number (nn) refers to a current process: SET PROCESS name, SET PROCESS/INDEX=nn, SHOW PROCESS name, and SHOW PROCESS/INDEX=nn.


Note

³ When you are analyzing a running system, CPU context is not accessible to SDA. Therefore, the SET CPU and SHOW CPU commands are not permitted.
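
The way CPU context and process context track each other can be summarized with a small model. The following Python sketch is illustrative only: the class SdaContext, its method names, and the process names in the usage note are hypothetical, and the model covers only the rules stated in the preceding sections for a multiprocessor crash dump.

```python
from typing import Dict, Optional


class SdaContext:
    """Toy model of SDA's CPU and process context for a multiprocessor dump."""

    def __init__(self, current_on_cpu: Dict[int, Optional[str]], crash_cpu: int):
        # Maps each CPU id to the process current on it (None if there is none).
        self.current_on_cpu = current_on_cpu
        self.crash_cpu = crash_cpu
        # Initial context: the CPU that induced the failure and its current process.
        self.cpu = crash_cpu
        self.process = current_on_cpu[crash_cpu]

    def set_cpu(self, cpu_id: int) -> None:
        """SET CPU or SHOW CPU: process context follows the CPU's current process."""
        self.cpu = cpu_id
        self.process = self.current_on_cpu[cpu_id]  # None models "undefined"

    def show_crash(self) -> None:
        """SHOW CRASH: return to the CPU that induced the system failure."""
        self.set_cpu(self.crash_cpu)

    def set_process(self, name: str) -> None:
        """SET PROCESS or SHOW PROCESS: also switch CPU context if the
        selected process is current on some CPU."""
        self.process = name
        for cpu_id, current in self.current_on_cpu.items():
            if current == name:
                self.cpu = cpu_id
```

For example, with PROC_A current on CPU 0, PROC_B current on CPU 1, and no current process on CPU 2, selecting PROC_B moves the CPU context to CPU 1, while selecting CPU 2 leaves the process context undefined until a process is selected explicitly.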


8 SDA Command Format

The following sections describe the format of SDA commands and the expressions you can use with SDA commands.

8.1 General Command Format

SDA uses a command format similar to that used by the DCL interpreter. You issue commands in this general format:

command-name[/qualifier...] [parameter][/qualifier...] [!comment] 

where:
command-name
   Is an SDA command. Each command tells the utility to perform a function. Commands can consist of one or more words, and can be abbreviated to the number of characters that make the command unique. For example, SH stands for SHOW and SE stands for SET.

/qualifier
   Modifies the action of an SDA command. A qualifier is always preceded by a slash (/). Several qualifiers can follow a single parameter or command name, but a slash must precede each. You can abbreviate qualifiers to the shortest string of characters that uniquely identifies the qualifier.

parameter
   Is the target of the command. For example, SHOW PROCESS RUSKIN tells SDA to display the context of the process RUSKIN. The command EXAMINE 80104CD0;40 displays the contents of 40 bytes of memory, beginning with location 80104CD0.

   When you supply part of a file specification as a parameter, SDA assumes default values for the omitted portions of the specification. The default device SYS$DISK and default directory are those specified in your most recent SET DEFAULT command. See the OpenVMS DCL Dictionary for a description of the DCL command SET DEFAULT.

!comment
   Consists of text that describes the command, but this text is not actually part of the command. Comments are useful for documenting SDA command procedures. When executing a command, SDA ignores the exclamation point (!) and all characters that follow it on the same line.

8.2 Expressions

You can use expressions as parameters for some SDA commands, such as SEARCH and EXAMINE. To create expressions, you can use any of the following elements: numerals, radix operators, arithmetic and logical operators, precedence operators, and symbols.

The following sections describe elements other than numerals.

8.2.1 Radix Operators

Radix operators determine which numeric base SDA uses to evaluate expressions. You can use one of three radix operators to specify the radix of the numeric expression that follows the operator:

The default radix is hexadecimal. SDA displays hexadecimal numbers with leading zeros and decimal numbers with leading spaces.

8.2.2 Arithmetic and Logical Operators

There are two types of arithmetic and logical operators, both of which are listed in Table SDA-8.

In evaluating expressions containing binary operators, SDA performs logical AND, OR, and XOR operations, and multiplication, division, and arithmetic shifting before addition and subtraction. Note that the SDA arithmetic operators perform integer arithmetic on 32-bit operands.

Table SDA-8 SDA Operators

Operator    Action

Unary Operators
#           Performs a logical NOT of the expression
+           Makes the value of the expression positive
-           Makes the value of the expression negative
@           Evaluates the following expression as a virtual address, then uses the contents of that address as value
G           Adds 80000000₁₆ to the value of the expression¹
H           Adds 7FFE0000₁₆ to the value of the expression²

Binary Operators
+           Addition
-           Subtraction
*           Multiplication
&           Logical AND
|           Logical OR
\           Logical XOR
/           Division³
@           Arithmetic shifting


¹The unary operator G corresponds to the first virtual address in system space. For example, the expression GD40 can be used to represent the address 80000D40₁₆.
²The unary operator H corresponds to a convenient base address in the control region of a process (7FFE0000₁₆). You can therefore refer to an address such as 7FFE2A64₁₆ as H2A64.
³In division, SDA truncates the quotient to an integer, if necessary, and does not retain a remainder.
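
The arithmetic behind the G and H operators can be checked with a short Python sketch. The function names g and h are illustrative only, and masking the result to 32 bits is an assumption based on the statement that SDA operates on 32-bit operands.

```python
SYSTEM_SPACE_BASE = 0x80000000    # value added by the unary operator G
CONTROL_REGION_BASE = 0x7FFE0000  # value added by the unary operator H
MASK32 = 0xFFFFFFFF               # assumption: results are kept to 32 bits


def g(value: int) -> int:
    """Model of the unary G operator: add the base address of system space."""
    return (SYSTEM_SPACE_BASE + value) & MASK32


def h(value: int) -> int:
    """Model of the unary H operator: add the control region base address."""
    return (CONTROL_REGION_BASE + value) & MASK32


print(f"{g(0xD40):08X}")    # GD40  -> 80000D40
print(f"{h(0x2A64):08X}")   # H2A64 -> 7FFE2A64
```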

8.2.3 Precedence Operators

SDA uses parentheses as precedence operators. Expressions enclosed in parentheses are evaluated first. SDA evaluates nested parenthetical expressions from the innermost to the outermost pairs of parentheses.
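
The evaluation order described in Sections 8.2.2 and 8.2.3 differs from that of many programming languages, so a small model may help. The following Python sketch evaluates hexadecimal expressions with two precedence levels and parentheses. It is not SDA's parser: it omits unary operators, radix operators, symbols, and the shift operator, and it assumes left-to-right evaluation within a level and unsigned 32-bit wraparound, neither of which is stated in this manual.

```python
import re

MASK32 = 0xFFFFFFFF                  # assumption: unsigned 32-bit wraparound
HIGH = {"*", "/", "&", "|", "\\"}    # evaluated before addition and subtraction
LOW = {"+", "-"}


def evaluate(text: str) -> int:
    """Evaluate a hexadecimal expression using two precedence levels."""
    tokens = re.findall(r"[0-9A-Fa-f]+|[-+*/&|\\()]", text)
    pos = 0

    def primary() -> int:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = low_level()   # parenthesized expressions are evaluated first
            pos += 1              # skip the closing parenthesis
            return value
        return int(tok, 16)       # default radix is hexadecimal

    def high_level() -> int:
        nonlocal pos
        value = primary()
        while pos < len(tokens) and tokens[pos] in HIGH:
            op = tokens[pos]
            pos += 1
            rhs = primary()
            if op == "*":
                value *= rhs
            elif op == "/":
                value //= rhs     # quotient truncated, remainder discarded
            elif op == "&":
                value &= rhs
            elif op == "|":
                value |= rhs
            else:
                value ^= rhs      # backslash is logical XOR
            value &= MASK32
        return value

    def low_level() -> int:
        nonlocal pos
        value = high_level()
        while pos < len(tokens) and tokens[pos] in LOW:
            op = tokens[pos]
            pos += 1
            rhs = high_level()
            value = (value + rhs if op == "+" else value - rhs) & MASK32
        return value

    return low_level()
```

Under this model, evaluate("10+F&3") yields 13 (hexadecimal) because the AND is performed before the addition, whereas evaluate("(10+F)&3") yields 3.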

8.2.4 Symbols

Names of symbols can contain from 1 to 31 alphanumeric characters and can include the dollar sign ($) and underscore (_) characters. Symbols can take values from -7FFFFFFF₁₆ to 7FFFFFFF₁₆.

By default, SDA copies symbols into its symbol table from the files SYS$SYSTEM:SYS.STB and SYS$SYSTEM:REQSYSDEF.STB. To add more symbols to the symbol table, you can use the following SDA commands:

In addition, SDA provides the symbols described in Table SDA-9.

Table SDA-9 SDA Symbols

Symbol            Meaning
. (period)        Current location
2P_CDDB           Address of alternate CDDB for MSCP-served device¹
2P_UCB            Address of alternate UCB for dual-pathed device¹
AMB               Associated mailbox UCB pointer¹
AP                Argument pointer²
CDDB              Address of class driver descriptor block for MSCP-served device¹
CLUSTRLOA         Base address of loadable VAXcluster code
CRB               Address of channel request block¹
DDB               Address of device data block¹
DDT               Address of driver dispatch table¹
nnDRIVER          Base address of a driver prologue table (DPT); such a symbol exists for each loaded device driver in the system³
ESP               Executive stack pointer²
FP                Frame pointer²
FPEMUL            Base address of the code that emulates floating-point instructions
G                 80000000₁₆, the base address of system space
H                 7FFE0000₁₆
IRP               Address of I/O request packet¹
JIB               Job information block
KSP               Kernel stack pointer²
LNM               Address of logical name block for mailbox¹
MCHK              Address within loadable CPU-specific routines
MSCP              Address of loadable MSCP server code
ORB               Address of object rights block¹
P0BR              Base register for the program region (P0)²
P0LR              Length register for the program region (P0)²
P1BR              Base register for the control region (P1)²
P1LR              Length register for the control region (P1)²
PC                Program counter²
PCB               Process control block
PDT               Address of port descriptor table¹
PHD               Process header
PSL               Processor status longword²
R0 through R11    General registers²
RMS               Base address of the RMS image
RWAITCNT          Resource wait count for MSCP-served device¹
SB                Address of system block¹
SCSLOA            Base address of loadable common SCS services
SP                Current stack pointer of a process²
SSP               Supervisor stack pointer²
SYSLOA            Base address of loadable processor-specific system code
TMSCP             Address of loadable TMSCP server code
UCB               Address of unit control block¹
USP               User stack pointer²
VCB               Address of volume control block for mounted device¹


¹The SHOW DEVICE command defines this symbol, if appropriate, to represent information pertinent to the last displayed device unit. See the description of the SHOW DEVICE command for additional information.
²The value of those symbols representing the current SDA process context changes whenever you issue a command that changes the context (see Section 5). These symbols include the general-purpose registers (R0 through R11, AP, FP, PC, and SP); the per-process stack pointers (USP, SSP, KSP); the page table base and length registers (P0BR, P0LR, P1BR, and P1LR); and the processor status longword (PSL).
³The notation nn within the symbol nnDRIVER represents a 2-letter, generic device/controller name (for example, LPDRIVER).

When SDA displays an address, it displays that address both in hexadecimal and as a symbol, if possible. If the address is within FFF₁₆ of the value of a symbol, SDA displays the symbol plus the offset from the value of that symbol to the address. If more than one symbol's value is within FFF₁₆ of the address, SDA displays the symbol whose value is the closest. If no symbols have values within FFF₁₆ of the address, SDA displays no symbol. (For an example, see the description of the SHOW STACK command.)
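
The symbol selection rule above can be sketched in Python as follows. The function symbolize, the symbol names, and the SYMBOL+offset output format are assumptions for illustration. The sketch also assumes that only symbols whose values are at or below the address are candidates, so that the displayed offset is never negative.

```python
from typing import Dict, Optional

MAX_OFFSET = 0xFFF  # a symbol qualifies only if the address is within FFF (hex) of it


def symbolize(address: int, symbols: Dict[str, int]) -> Optional[str]:
    """Return the closest qualifying symbol plus offset, or None if none qualifies."""
    best = None
    for name, value in symbols.items():
        offset = address - value
        # Keep the candidate with the smallest non-negative offset.
        if 0 <= offset <= MAX_OFFSET and (best is None or offset < best[1]):
            best = (name, offset)
    if best is None:
        return None               # no symbol close enough: display no symbol
    name, offset = best
    return name if offset == 0 else f"{name}+{offset:X}"
```

With two hypothetical symbols, SYM_A at 80001000 and SYM_B at 80001800, the address 80001810 is within range of both, and the sketch selects SYM_B because its value is closer.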

9 Investigating System Failures

This section discusses how the operating system handles internal errors and suggests procedures that can aid you in determining the causes of these errors. To conclude, it illustrates, through detailed analysis of a sample system failure, how SDA helps you find the causes of operating system problems.

For a complete description of the commands discussed in the sections that follow, refer to the SDA Commands section.

9.1 General Procedure for Analyzing System Failures

When the operating system detects an internal error so severe that normal operation cannot continue, it signals a condition known as a fatal bugcheck and shuts itself down. A specific bugcheck code describes each such error.

To resolve the problem, you must find the reason for the bugcheck. Most failures are caused by errors in user-written device drivers or other privileged code not supplied by Digital. To identify and correct these errors, you need a listing of the code in question.



  4556P001.HTM
  OSSG Documentation
  22-NOV-1996 14:13:03.33

Copyright © Digital Equipment Corporation 1996. All Rights Reserved.
