6017P037.HTM OSSG Documentation 22-NOV-1996 14:22:10.21
Copyright © Digital Equipment Corporation 1996. All Rights Reserved.
OpenVMS Cluster transitions do not change the state of the queue manager. Newly available nodes do not attempt to start the queue manager (unless you enter START/QUEUE/MANAGER).
Use the /CLUSTER qualifier to stop a clusterwide queue manager. If you enter the obsolete command STOP/QUEUE/MANAGER (without the /CLUSTER qualifier), the command performs the same function as STOP/QUEUES/ON_NODE. (Use STOP/QUEUES/ON_NODE to stop all queues on a single node without stopping the queue manager.)
The queue manager restarts automatically whenever you reboot the system. However, you might need to enter START/QUEUE/MANAGER for one of the following reasons:
How to Perform This Task
To restart the queue manager, use a command in the following format:
START/QUEUE/MANAGER[/ON=(node,...)] [dirspec]
Specify the /ON=(node,...) qualifier and dirspec parameter only if you want to change the value you are currently using for the qualifier or parameter. The command you enter to start the queue manager is stored in the queue database, with any qualifier or parameter you specify. If you do not specify a qualifier or parameter, the queue manager is started using the node list and location (if any) stored in the queue database.
If the queue manager does not start, see Section 12.11.1 for a troubleshooting checklist.
You can use multiple queue managers to distribute the batch and print work load among nodes and disk volumes. You need to understand what multiple queue managers are and how to create additional queue managers.
Explanations of items related to the operation of multiple queue managers follow.
Restrictions on Using Multiple Queue Managers
Multiple queue managers have the following restrictions:
Names of Multiple Queue Managers
The process name for a queue manager is the first twelve characters of the queue manager name. The default queue manager name is SYS$QUEUE_MANAGER; the default queue manager process name is QUEUE_MANAGE. If you create an additional queue manager named PRINT_MANAGER, the process name is PRINT_MANAGE.
Know the process names of all your queue managers so that you can troubleshoot queue manager problems, as explained in Section 12.11.
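The naming rule above can be sketched in a few lines of Python. This is only an illustration, not part of the queuing system: the function name `process_name` is hypothetical, and uppercasing the input is an assumption made for convenience. The default queue manager is treated as a special case because its documented process name (QUEUE_MANAGE) is not simply the first twelve characters of SYS$QUEUE_MANAGER.

```python
DEFAULT_MANAGER = "SYS$QUEUE_MANAGER"


def process_name(manager_name: str) -> str:
    """Return the process name expected for a queue manager name.

    Hypothetical helper: applies the documented rule (first twelve
    characters) and the documented exception for the default manager.
    """
    name = manager_name.strip().upper()  # assumption: names compared in uppercase
    if name == DEFAULT_MANAGER:
        # Documented special case for the default queue manager.
        return "QUEUE_MANAGE"
    return name[:12]


if __name__ == "__main__":
    for manager in ("SYS$QUEUE_MANAGER", "PRINT_MANAGER", "BATCH_MANAGER"):
        print(manager, "->", process_name(manager))
```

A list produced this way is a convenient record of the process names to look for in operator log messages.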
Multiple Queue Managers' Use of Queue Database Files
Multiple queue managers share a single master file. However, a queue database with multiple queue managers contains a queue file and a journal file for each queue manager, as explained in Section 12.2.
Commands for Managing Multiple Queue Managers
By default, the following commands affect the default queue manager SYS$QUEUE_MANAGER or the queues running on the default queue manager:
The /NAME_OF_MANAGER qualifier allows you to specify a different queue manager for these commands.
START/QUEUE/MANAGER/ADD/NAME_OF_MANAGER=name[/ON=(node,...)] [dirspec]
Qualifier or Parameter | Description |
---|---|
/ADD | Creates an additional queue manager in the existing master file and creates new queue and journal files. |
/NAME_OF_MANAGER=name | Creates a non-default queue manager with a name up to 31 characters long. You can create a maximum of five queue managers. |
/ON=(node,...) | Allows you to customize failover of the queue manager. For more information, see Section 12.6. |
dirspec | Specifies the location of the queue and journal files, as explained in Section 12.3.2. Use this parameter if you are creating the queue and journal files in a location other than the default. |
Caution
Do not specify the /NEW_VERSION qualifier when you create an additional queue manager: multiple queue managers share a single master file. An additional queue file and journal file are created automatically for each additional queue manager.
Example
The command in the following example creates and starts a new queue manager named BATCH_MANAGER.
$ START/QUEUE/MANAGER/ADD/NAME_OF_MANAGER=BATCH_MANAGER/ON=(A,B,*) DUA2:[QUEUES]
When you create a queue with the INITIALIZE/QUEUE command, specify the name of the queue manager on which it is to run by including the /NAME_OF_MANAGER qualifier. If you do not specify the /NAME_OF_MANAGER qualifier, the queue is created to run on the default queue manager, SYS$QUEUE_MANAGER.
To move an existing queue from its original queue manager to a different queue manager, delete the queue with the DELETE/QUEUE command and re-create the queue with the INITIALIZE/QUEUE command.
When you enter DCL commands to maintain a queue manager, include the /NAME_OF_MANAGER qualifier to identify the queue manager to which the command applies. If you omit the /NAME_OF_MANAGER qualifier, the command is executed on the default queue manager, SYS$QUEUE_MANAGER.
Example
In the following example:
$ START/QUEUE/MANAGER/NEW_VERSION/NAME_OF_MANAGER=PRINT_MANAGER -
_$ /ON=(JADE,RUBY,*)
$ START/QUEUE/MANAGER/ADD/NAME_OF_MANAGER=BATCH_MANAGER -
_$ /ON=(OPAL,PEARL,*)
$ SHOW QUEUE/MANAGERS/FULL
Master file: SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT;

Queue manager PRINT_MANAGER, running, on JADE::
/ON=(JADE,RUBY,*)
Database location: SYS$COMMON:[SYSEXE]

Queue manager BATCH_MANAGER, running, on OPAL::
/ON=(OPAL,PEARL,*)
Database location: SYS$COMMON:[SYSEXE]
Each time you want to preserve changes to your queue configuration, save a copy of your queue database files. In this way, if your queue database files are not accessible, you can restore the queue database you have saved; you thus avoid having to redefine forms and characteristics and reinitialize each queue.
To save a record-by-record copy of your queue database files while the queuing system is functioning, perform the following steps. This procedure saves definitions of queues, forms, and characteristics. No job information is preserved. (Digital recommends not saving the journal file because timed and pending jobs might be reexecuted after the journal file is restored.)
CONVERT/SHARE QMAN$MASTER.DAT master-filename
CONVERT/SHARE SYS$QUEUE_MANAGER.QMAN$QUEUES queue-filename
BACKUP/LOG master-filename,queue-filename device:saveset-name/LABEL=label
Example
The following example is a simple procedure showing how to save the queue database.
$ SET DEFAULT SYS$COMMON:[SYSEXE]
$ CONVERT/SHARE QMAN$MASTER.DAT MASTERFILE_9SEP.KEEP;
$ CONVERT/SHARE SYS$QUEUE_MANAGER.QMAN$QUEUES QFILE_9SEP.KEEP;
$ INITIALIZE MUA0: QDB
$ MOUNT/FOREIGN MUA0:
%MOUNT-I-MOUNTED, QDB mounted on _LILITH$MUA0:
$ BACKUP/LOG MASTERFILE_9SEP.KEEP,QFILE_9SEP.KEEP MUA0:QDB_9SEP.SAV/LABEL=QDB
%BACKUP-S-COPIED, copied SYS$COMMON:[SYSEXE]MASTERFILE_9SEP.KEEP;
%BACKUP-S-COPIED, copied SYS$COMMON:[SYSEXE]QFILE_9SEP.KEEP;
$ DISMOUNT MUA0:
When you restore queue database files, all queue, form, characteristic, and queue manager information is restored. However, information about jobs in the queues is not restored.
How to Perform This Task
Note
When you restore your queue database, you must always restore both the master and queue files, even if you lost only one of those files.
Example
The following example is a simple procedure showing how to restore the queue database from tape.
$ STOP/QUEUE/MANAGER/CLUSTER
$ SET DEFAULT SYS$COMMON:[SYSEXE]
$ DELETE SYS$QUEUE_MANAGER.QMAN$JOURNAL;,SYS$QUEUE_MANAGER.QMAN$QUEUES;, -
_$ QMAN$MASTER.DAT;
$ MOUNT/FOREIGN MUA0:
%MOUNT-I-MOUNTED, QDB mounted on _LILITH$MUA0:
$ BACKUP/LOG MUA0:QDB_9SEP.SAV/SELECT=[SYSEXE]MASTERFILE_9SEP.KEEP; -
_$ QMAN$MASTER.DAT;
%BACKUP-S-CREATED, created SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT;1
$ SET MAGTAPE/REWIND MUA0:
$ BACKUP/LOG MUA0:QDB_9SEP.SAV/SELECT=[SYSEXE]QFILE_9SEP.KEEP; -
_$ SYS$QUEUE_MANAGER.QMAN$QUEUES
%BACKUP-S-CREATED, created SYS$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES;1
$ DISMOUNT MUA0:
$ START/QUEUE/MANAGER
The following resources have the most effect on queuing system performance:
Use the following methods to maximize your queuing system's performance:
Use the following sections to help solve queue manager problems:
Topic | For More Information |
---|---|
Avoiding common problems: a troubleshooting checklist | Section 12.11.1 |
If the queue manager does not start | Section 12.11.2 |
If the queuing system stops or the queue manager does not run on specific nodes | Section 12.11.3 |
If the queue manager becomes unavailable | Section 12.11.4 |
If the queuing system does not work on a specific OpenVMS Cluster node | Section 12.11.5 |
If you see inconsistent queuing behavior on different OpenVMS Cluster nodes | Section 12.11.6 |
Reporting a queuing system problem to Digital support representatives | Section 12.12 |
To avoid the most common queuing system problems, make sure you have met the following requirements:
Requirement | For More Information |
---|---|
QMAN$MASTER is identically defined on all nodes in the cluster. | Section 12.3 |
The queue database is in the specified location. | Section 12.3 |
The queue database disk is mounted and available. | Section 12.3 |
The node list specified with the /ON qualifier contains a sufficient number of nodes. If you specify a node list, Digital recommends you include an asterisk (*) at the end of the node list. | Section 12.11.4 |
The system address parameters SCSNODE and SCSSYSTEMID match the DECnet for OpenVMS node name and node ID. | Section 12.11.5 |
If the queue manager does not start when you enter the START/QUEUE/MANAGER command, the system displays the following message:
%JBC-E-QMANNOTSTARTED, queue manager could not be started
Search the operator log file SYS$MANAGER:OPERATOR.LOG (or look on the operator console) for messages from the queue manager and job controller for information about the problem, as follows:
$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=5 QUEUE_MANAGE,JOB_CONTROL,BATCH_MANAGE
Use the information provided with these messages to further investigate the problem, making sure you have met the requirements listed in Section 12.11.1.
The cause of the problem is the system's inability to find the queue master file. Often the logical name QMAN$MASTER is not defined correctly, or the disk is not available. For example, the following messages indicate that the queue master file does not exist in the expected location:
%%%%%%%%%%% OPCOM 13-MAR-1996 15:53:52.84 %%%%%%%%%%%
Message from user JOB_CONTROL on ABDCEF
%JBC-E-OPENERR, error opening SYS$COMMON:[SYSEXE]QMAN$MASTER.DAT
%%%%%%%%%%% OPCOM 13-MAR-1996 15:53:53.04 %%%%%%%%%%%
Message from user JOB_CONTROL on ABDCEF
-SYSTEM-W-NOSUCHFILE, no such file
On systems with multiple queue managers, search for messages displayed by additional queue managers by including their process names in the search string. To display information about queue managers running on your system, use the SHOW QUEUE/MANAGERS command as explained in Section 12.4. Correct any problem indicated in the displayed information.
Example
$ START/QUEUE/MANAGER DUA55:[SYSQUE]  (1)
%JBC-E-QMANNOTSTARTED, queue manager could not be started  (2)
$ SEARCH SYS$MANAGER:OPERATOR.LOG /WINDOW=5 QUEUE_MANAGE,JOB_CONTROL  (3)
%%%%%%%%%%% OPCOM 14-APR-1996 18:55:18.23 %%%%%%%%%%%
Message from user QUEUE_MANAGE on CATNIP
%QMAN-E-OPENERR, error opening DUA55:[SYSQUE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 14-APR-1996 18:55:18.29 %%%%%%%%%%%
Message from user QUEUE_MANAGE on CATNIP
-RMS-F-DEV, error in device name or inappropriate device type for operation
%%%%%%%%%%% OPCOM 14-APR-1996 18:55:18.31 %%%%%%%%%%%
Message from user QUEUE_MANAGE on CATNIP
-SYSTEM-W-NOSUCHDEV, no such device available  (4)
$ START/QUEUE/MANAGER DUA5:[SYSQUE]  (5)
For more information about multiple queue managers and their process names, see Section 12.8.1.
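When several queue managers exist, the search string grows with each additional process name. The following Python sketch shows one way to assemble the SEARCH command text shown in this section. The helper name `search_command` is hypothetical, and the sketch only builds a string; it does not run anything on the system.

```python
# Process names that are always of interest, taken from the examples
# in this section.
BASE_PROCESSES = ["QUEUE_MANAGE", "JOB_CONTROL"]


def search_command(extra_process_names=(), window=5):
    """Build the SEARCH command line for the operator log.

    extra_process_names: process names of additional queue managers,
    for example "BATCH_MANAGE". Duplicates are dropped while the
    original order is kept.
    """
    names = []
    for name in list(BASE_PROCESSES) + [n.upper() for n in extra_process_names]:
        if name not in names:
            names.append(name)
    return "$ SEARCH SYS$MANAGER:OPERATOR.LOG/WINDOW=%d %s" % (
        window,
        ",".join(names),
    )


if __name__ == "__main__":
    print(search_command(["BATCH_MANAGE"]))
```

Passing the process name of each additional queue manager keeps the search from missing messages that a non-default queue manager wrote to the log.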
Use this section if the queue manager does not run on a specific node in the cluster, or if the queuing system stops, especially after one of the following:
Check the operator log that was current at the time the queue manager started up or failed over. Search the log for operator messages from the queue manager.
On systems with multiple queue managers, also search for messages displayed by additional queue managers by including their process names in the search string. To display information about queue managers running on your system, use the SHOW QUEUE/MANAGERS command, as explained in Section 12.4.
For more information about multiple queue managers and their process names, see Section 12.8.1.
The following messages indicate that the queue database is not in the specified location:
%%%%%%%%%%% OPCOM 4-FEB-1996 15:06:25.21 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
%QMAN-E-OPENERR, error opening CLU$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-1996 15:06:27.29 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-RMS-E-FNF, file not found
%%%%%%%%%%% OPCOM 4-FEB-1996 15:06:27.45 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-SYSTEM-W-NOSUCHFILE, no such file
The following messages indicate that the queue database disk is not mounted:
%%%%%%%%%%% OPCOM 4-FEB-1996 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
%QMAN-E-OPENERR, error opening DISK888:[QUEUE_DATABASE]SYS$QUEUE_MANAGER.QMAN$QUEUES;
%%%%%%%%%%% OPCOM 4-FEB-1996 15:36:51.69 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-RMS-F-DEV, error in device name or inappropriate device type for operation
%%%%%%%%%%% OPCOM 4-FEB-1996 15:36:52.20 %%%%%%%%%%%
Message from user QUEUE_MANAGE on MANGLR
-SYSTEM-W-NOSUCHDEV, no such device available
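The two message patterns above differ only in the secondary status lines that follow %QMAN-E-OPENERR. As a rough illustration, the following Python sketch classifies a block of operator log text by those status codes. The function name `diagnose`, the regular expression, and the assumption that each message starts at the beginning of a line are all illustrative; only the status codes and their meanings come from the samples in this section.

```python
import re

# Secondary status codes from the sample messages, mapped to the
# condition each sample is said to indicate.
CAUSES = {
    "RMS-E-FNF": "queue database is not in the specified location",
    "SYSTEM-W-NOSUCHFILE": "queue database is not in the specified location",
    "RMS-F-DEV": "queue database disk is not mounted",
    "SYSTEM-W-NOSUCHDEV": "queue database disk is not mounted",
}

# Matches "%FAC-S-IDENT," or "-FAC-S-IDENT," at the start of a line.
STATUS = re.compile(r"^[%-]([A-Z$]+-[A-Z]-[A-Z]+),", re.MULTILINE)


def diagnose(log_text):
    """Return a likely cause for a QMAN open error, or None.

    None means the text contains no %QMAN-E-OPENERR message.
    """
    codes = STATUS.findall(log_text)
    if "QMAN-E-OPENERR" not in codes:
        return None
    for code in codes:
        if code in CAUSES:
            return CAUSES[code]
    return "open error with an unrecognized secondary status"


if __name__ == "__main__":
    sample = "\n".join([
        "%%%%%%%%%%% OPCOM 4-FEB-1996 15:06:25.21 %%%%%%%%%%%",
        "Message from user QUEUE_MANAGE on MANGLR",
        "%QMAN-E-OPENERR, error opening "
        "CLU$COMMON:[SYSEXE]SYS$QUEUE_MANAGER.QMAN$QUEUES;",
        "-RMS-E-FNF, file not found",
    ])
    print(diagnose(sample))
```

The sketch deliberately recognizes only the codes shown here; any other secondary status is reported as unrecognized rather than guessed at.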
The queuing system does not work correctly under the following circumstances:
In general, the queuing system shuts down completely if the queue manager encounters a serious error and forces a crash or failover twice in succession within two minutes on the same node. As a result, the queuing system might have stopped, or it might still be running if the queue manager moved to another node on which it can access the database after the original failed startup.
Perform the following steps:
The queue manager becomes unavailable if it does not start or has stopped running.
To investigate the problem, enter SHOW CLUSTER to see if the nodes on the list are available.
An insufficient failover node list might have been specified for the queue manager, so that none of the nodes in the failover list is available to run the queue manager.
Make sure the queue manager list contains a sufficient number of nodes by entering START/QUEUE/MANAGER with the /ON qualifier to specify a node list appropriate for your configuration.
If you are in doubt about what nodes to specify, Digital recommends that you specify an asterisk (*) wildcard character as the last node in the list; the asterisk indicates that any remaining node in the cluster can run the queue manager. Specifying the asterisk prevents your queue manager from becoming unavailable because of an insufficient node list.
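The effect of the asterisk can be illustrated with a short Python sketch. It models only the availability question raised above: whether at least one node permitted by the /ON list is currently in the cluster. The function name `can_run_somewhere` is hypothetical, and the sketch says nothing about which node the queue manager actually chooses.

```python
def can_run_somewhere(node_list, available_nodes):
    """Return True if some available node is allowed by the node list.

    node_list: entries from the /ON qualifier, for example
    ["JADE", "RUBY", "*"]. An asterisk stands for any remaining
    node in the cluster.
    available_nodes: names of nodes currently in the cluster.
    """
    available = {node.upper() for node in available_nodes}
    for entry in node_list:
        if entry == "*":
            if available:
                return True  # any available node qualifies
        elif entry.upper() in available:
            return True
    return False


if __name__ == "__main__":
    up = ["OPAL", "PEARL"]
    print(can_run_somewhere(["JADE", "RUBY"], up))       # list too narrow
    print(can_run_somewhere(["JADE", "RUBY", "*"], up))  # asterisk covers it
```

With JADE and RUBY both down, the first list leaves the queue manager with nowhere to run, while the second list still allows OPAL or PEARL to take over.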
Use this section if the queuing system does not work on a specific node when it starts up.
Perform the following steps:
%%%%%%%%%%% OPCOM 4-FEB-1996 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on ZNFNDL
%QMAN-E-COMMERROR, unexpected error #5 in communicating with node CSID 000000
%%%%%%%%%%% OPCOM 4-FEB-1996 15:36:49.15 %%%%%%%%%%%
Message from user QUEUE_MANAGE on ZNFNDL
-SYSTEM-F-WRONGACP, wrong ACP for device
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> PARAMETERS SHOW SCSSYSTEMID
Parameter Name   Current    Default    Min.       Max.       Unit        Dynamic
--------------   -------    -------    -------    -------    ----        -------
SCSSYSTEMID        19941          0         -1         -1    Pure-numbe
SYSMAN> PARAMETERS SHOW SCSNODE
Parameter Name   Current    Default    Min.       Max.       Unit        Dynamic
--------------   -------    -------    -------    -------    ----        -------
SCSNODE          "RANDY "   " "        " "        "ZZZZ"     Ascii
SYSMAN> EXIT
$ RUN SYS$SYSTEM:NCP
NCP> SHOW EXECUTOR SUMMARY

Node Volatile Summary as of 5-FEB-1996 15:50:36

Executor node = 19.45 (DREAMR)

State = on
Identification = DECnet for OpenVMS V7.1

NCP> EXIT
$ WRITE SYS$OUTPUT 19*1024+45
19501
If the DECnet node name and node ID do not match the SCSNODE and SCSSYSTEMID system address parameters, IPC (interprocess communication, an operating system internal mechanism) cannot work properly and the affected node will not be able to participate in the queuing system.
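The comparison can be expressed as a small Python sketch. It uses the same arithmetic as the WRITE SYS$OUTPUT command in the example (area number times 1024, plus node number) and compares the result, together with the node name, against the SCS parameters. The function names are hypothetical, and the values in the usage lines are copied from the example output, in which the two sets of values do not match.

```python
def expected_scssystemid(decnet_address: str) -> int:
    """Convert a DECnet address written as "area.node" to the
    value SCSSYSTEMID should hold: area * 1024 + node."""
    area, node = (int(part) for part in decnet_address.split("."))
    return area * 1024 + node


def parameters_match(scsnode, scssystemid, decnet_name, decnet_address):
    """Return True if SCSNODE and SCSSYSTEMID agree with DECnet.

    Names are compared after trimming padding and uppercasing,
    which is an assumption made for this sketch.
    """
    same_name = scsnode.strip().upper() == decnet_name.strip().upper()
    same_id = scssystemid == expected_scssystemid(decnet_address)
    return same_name and same_id


if __name__ == "__main__":
    print(expected_scssystemid("19.45"))
    # Values from the example: SCSNODE "RANDY ", SCSSYSTEMID 19941,
    # DECnet executor node 19.45 (DREAMR).
    print(parameters_match("RANDY ", 19941, "DREAMR", "19.45"))
```

For the example values, the expected identifier is 19501 while the node reports 19941, and the node names also differ, so the check fails on both counts.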
Perform the following steps:
Use this section if you see the following symptoms:
Perform the following steps:
%%%%%%%%%%% OPCOM 4-FEB-1996 14:41:20.88 %%%%%%%%%%%
Message from user JOB_CONTROL on MANGLR
%JBC-E-OPENERR, error opening BOGUS:[QUEUE_DIR]QMAN$MASTER.DAT;
%%%%%%%%%%% OPCOM 4-FEB-1996 14:41:21.12 %%%%%%%%%%%
Message from user JOB_CONTROL on MANGLR
-RMS-E-FNF, file not found
This problem can occur when the logical name QMAN$MASTER is defined differently on different nodes in the cluster, which creates multiple queuing environments. It typically appears in OpenVMS Cluster environments just after you add a system disk or move the queue database.
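Once you have noted how QMAN$MASTER is defined on each node, a few lines of Python can show which nodes share a definition. This sketch assumes you have already collected the definitions by hand into a dictionary; the function names and the sample values are hypothetical. It compares the definitions as text only, so two different strings that resolve to the same directory would still be reported as different.

```python
from collections import defaultdict


def group_by_definition(definitions):
    """Group node names by their QMAN$MASTER definition.

    definitions: dict mapping node name to the definition recorded
    for that node. Comparison is textual, after trimming whitespace
    and uppercasing.
    """
    groups = defaultdict(list)
    for node, value in definitions.items():
        groups[value.strip().upper()].append(node)
    return dict(groups)


def is_consistent(definitions):
    """Return True if every node has the same definition."""
    return len(group_by_definition(definitions)) <= 1


if __name__ == "__main__":
    # Hypothetical values recorded from two nodes.
    recorded = {
        "MANGLR": "BOGUS:[QUEUE_DIR]",
        "CATNIP": "SYS$COMMON:[SYSEXE]",
    }
    for value, nodes in group_by_definition(recorded).items():
        print(value, "<-", ", ".join(nodes))
    print("consistent:", is_consistent(recorded))
```

More than one group in the output suggests that the cluster is running more than one queuing environment.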
Perform the following steps:
STOP/QUEUE/MANAGER/CLUSTER/NAME_OF_MANAGER=name
If you encounter problems with the queuing system that you need to report to a Digital support representative, provide the information in the following table. This information will help Digital representatives diagnose your problem. Please provide as much of the information as possible.