Step | Task |
---|---|
1 | Specify SYS$INPUT as the input file on the SORT or MERGE command line. Use the input file qualifier /FORMAT to specify the size of the longest record, in bytes, and the approximate size of the input file, in blocks. |
2 | Enter the input records on successive lines. End each record by pressing Return. |
3 | Press Ctrl/Z to end the file. |

6489P018.HTM OSSG Documentation 22-NOV-1996 13:17:02.42 Copyright © Digital Equipment Corporation 1996. All Rights Reserved.
Example
The following example demonstrates a Sort operation in which the input records to be sorted are entered directly from the terminal:
    $ SORT/KEY=(POSITION:8,SIZE:15) -
    _$ SYS$INPUT/FORMAT=(RECORD_SIZE:24,FILE_SIZE:10) BYNAME.LST
    BST 7828 MCMAHON JANE [Return]
    ADM 7933 ROSENBERG HARRY[Return]
    COM 8102 KNIGHT MARTHA[Return]
    ANS 8042 BENTLEY PETER[Return]
    BIO 7951 LOWELL FRANK[Return]
    [Ctrl/Z]
This sequence of commands creates the output file BYNAME.LST, which contains the sorted records.
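To make the key specification concrete, the following Python sketch models what /KEY=(POSITION:8,SIZE:15) selects. It illustrates the behavior only and is not how Sort/Merge is implemented. The exact column layout of the sample records cannot be recovered from the listing above, so the records here assume a 3-character code in columns 1-3, a 4-digit number in columns 4-7, and the name starting in column 8. The helper `sort_by_field` is hypothetical.

```python
def sort_by_field(records, position, size):
    """Sort records on a character key; position is 1-based, as in /KEY."""
    start = position - 1
    # Pad short keys with spaces so that every key has the same width.
    return sorted(records, key=lambda rec: rec[start:start + size].ljust(size))


# Assumed layout (not from the manual): columns 1-3 code,
# columns 4-7 number, columns 8-22 name.
records = [
    "BST7828MCMAHON JANE",
    "ADM7933ROSENBERG HARRY",
    "COM8102KNIGHT MARTHA",
    "ANS8042BENTLEY PETER",
    "BIO7951LOWELL FRANK",
]

for rec in sort_by_field(records, position=8, size=15):
    print(rec)
```

Under these assumptions the records come out in name order, which is what the output file name BYNAME.LST suggests.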
Sort/Merge allows you to maintain sort definitions and to specify more complex sort criteria in specification files. (The high-performance Sort/Merge utility does not support specification files. Implementation of this feature is deferred to a future OpenVMS Alpha release.) You can use any standard editor or the DCL CREATE command to create a specification file.
A Sort/Merge specification file allows you to:
After you complete the specification file, specify the file name using the /SPECIFICATION qualifier. The default file type for a specification file is .SRT.
Each command in the specification file should start with a slash (/). Continuation characters are not required when a command spans more than one line.
Note
Many of the qualifiers used in the specification file are similar to the DCL qualifiers used in the Sort/Merge command line. Note, however, that the syntax of these qualifiers can be different. For example, the /KEY qualifier at DCL level has different syntax than the /KEY qualifier in the specification file. See Section 11.10.4 for a summary of the specification file qualifiers.
Any DCL command qualifiers that you specify on the command line override corresponding entries in the specification file. For example, if you specify the /KEY qualifier in the DCL command line, Sort/Merge ignores the /KEY clause in the specification file.
Generally, there is no required order in which you must specify the qualifiers in a specification file. However, the order becomes significant in the following cases:
When you specify the FOLD, MODIFICATION, and IGNORE keywords with the /COLLATING_SEQUENCE qualifier, you should specify all MODIFICATION and IGNORE clauses before any FOLD clauses. See Section 11.10.4 for more information about the /COLLATING_SEQUENCE qualifier.
You can include comments in your specification file by beginning each comment line with an exclamation point (!). Unlike DCL command lines, specification files do not need hyphens (-) to continue the line.
    ! Specification file for sorting negative and positive data
    ! in ascending order
    !
    /FIELD=(NAME=SIGN,POS:1,SIZ:1)
    /FIELD=(NAME=AMT,POS:2,SIZ:4)
    /CONDITION=(NAME=CHECK1,
                TEST=(SIGN EQ "-"))
    /CONDITION=(NAME=CHECK2,
                TEST=(SIGN EQ " "))
    /INCLUDE=(CONDITION=CHECK1,
              KEY=(AMT,DESCENDING),
              DATA=SIGN,
              DATA=AMT)
    /INCLUDE=(CONDITION=CHECK2,
              KEY=(AMT,ASCENDING),
              DATA=SIGN,
              DATA=AMT)
Figure 11-8 Output from Using a Specification File
    /FIELD=(NAME=RECORD_TYPE,POS:1,SIZ:1)  ! Record type, 1-byte
    /FIELD=(NAME=PRICE,POS:2,SIZ:8)        ! Price, both files
    /FIELD=(NAME=TAXES,POS:10,SIZ:5)       ! Taxes, both files
    /FIELD=(NAME=STYLE_A,POS:15,SIZ:10)    ! Style, format A file
    /FIELD=(NAME=STYLE_B,POS:20,SIZ:10)    ! Style, format B file
    /FIELD=(NAME=ZIP_A,POS:25,SIZ:5)       ! Zip code, format A file
    /FIELD=(NAME=ZIP_B,POS:15,SIZ:5)       ! Zip code, format B file
    /CONDITION=(NAME=FORMAT_A,             ! Condition test, format A
                TEST=(RECORD_TYPE EQ "A"))
    /CONDITION=(NAME=FORMAT_B,             ! Condition test, format B
                TEST=(RECORD_TYPE EQ "B"))
    /INCLUDE=(CONDITION=FORMAT_A,          ! Output format, type A
              KEY=ZIP_A,
              DATA=PRICE,
              DATA=TAXES,
              DATA=STYLE_A,
              DATA=ZIP_A)
    /INCLUDE=(CONDITION=FORMAT_B,          ! Output format, type B
              KEY=ZIP_B,
              DATA=PRICE,
              DATA=TAXES,
              DATA=STYLE_B,
              DATA=ZIP_B)

    |A|PRICE|TAXES|STYLE|ZIP|
     1 2     10    15    25

    |B|PRICE|TAXES|ZIP|STYLE|
     1 2     10    15  20

    /COLLATING_SEQUENCE=(SEQUENCE=
                         ("AN","EB","AR","PR","AY","UN","UL",
                          "UG","EP","CT","OV","EC","0"-"9"),
                         MODIFICATION=("'"="19"),
                         FOLD)

    16 NOV 1983 Communication Skills
    05 APR 1984 Coping with Alcoholism
    11 Jan '84 How to Be Assertive
    12 OCT 1983 Improving Productivity
    15 MAR 1984 Living with Your Teenager
    08 FEB 1984 Single Parenting
    07 Dec '83 Stress --- Causes and Cures
    14 SEP 1983 Time Management

    14 SEP 1983 Time Management
    12 OCT 1983 Improving Productivity
    16 NOV 1983 Communication Skills
    07 Dec '83 Stress --- Causes and Cures
    11 Jan '84 How to Be Assertive
    08 FEB 1984 Single Parenting
    15 MAR 1984 Living with Your Teenager
    05 APR 1984 Coping with Alcoholism

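The effect of the collating sequence above can be modeled with a short Python sketch. It is a hedged illustration, not the Sort/Merge algorithm. The SORT command that accompanies this specification file is not shown here, so the key fields are an assumption: the year at position 8 (size 4), then the second and third letters of the month at position 5 (size 2). The sketch also assumes that characters absent from the sequence are skipped. The names `collate` and `date_key` are hypothetical.

```python
# Two-letter month tokens first, then the digits "0"-"9",
# in the order given by SEQUENCE= in the specification file.
SEQUENCE = ["AN", "EB", "AR", "PR", "AY", "UN", "UL",
            "UG", "EP", "CT", "OV", "EC"] + [str(d) for d in range(10)]
RANK = {token: i for i, token in enumerate(SEQUENCE)}


def collate(text):
    """Translate a key field into a list of collating ranks."""
    text = text.upper()             # FOLD: lowercase collates as uppercase
    text = text.replace("'", "19")  # MODIFICATION=("'"="19")
    ranks = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in RANK:            # two-character token such as "EC"
            ranks.append(RANK[pair])
            i += 2
        elif text[i] in RANK:       # single digit
            ranks.append(RANK[text[i]])
            i += 1
        else:                       # assumption: skip anything else
            i += 1
    return ranks


def date_key(line):
    # Assumed keys: year (position 8, size 4), then month (position 5, size 2).
    return (collate(line[7:11]), collate(line[4:6]))


lines = [
    "16 NOV 1983 Communication Skills",
    "05 APR 1984 Coping with Alcoholism",
    "11 Jan '84 How to Be Assertive",
    "12 OCT 1983 Improving Productivity",
    "15 MAR 1984 Living with Your Teenager",
    "08 FEB 1984 Single Parenting",
    "07 Dec '83 Stress --- Causes and Cures",
    "14 SEP 1983 Time Management",
]

for line in sorted(lines, key=date_key):
    print(line)
```

With these assumed keys, `'83` collates exactly like `1983` and `Dec` like `DEC`, so the seminars come out in calendar order.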
    /FIELD=(NAME=AGENT,POSITION:20,SIZE:15)
    /CONDITION=(NAME=AGENCY,
                TEST=(AGENT EQ "Real-T Trust"
                      OR
                      AGENT EQ "Realty Trust"))
    /DATA=(IF AGENCY THEN "Realty Trust"
           ELSE AGENT)

    /FIELD=(NAME=ZIP,POSITION:60,SIZE:6)
    /CONDITION=(NAME=LOCATION,
                TEST=(ZIP EQ "01863"))
    /KEY=(IF LOCATION THEN 1
          ELSE 2)

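The conditional key in the preceding specification can be read as "records in ZIP code 01863 sort ahead of all others." The Python sketch below models that reading under stated assumptions: the records are hypothetical, `make_record` and `location_key` are illustrative names, and trailing spaces in the 6-character ZIP field are assumed not to affect the comparison with "01863".

```python
def location_key(record):
    """Return 1 for records in ZIP 01863, 2 otherwise."""
    zip_field = record[59:65]  # POSITION:60, SIZE:6 (1-based position)
    # Assumption: trailing spaces do not affect the comparison.
    return 1 if zip_field.rstrip() == "01863" else 2


def make_record(name, zip_code):
    """Hypothetical helper: pad so that the ZIP field starts in column 60."""
    return name.ljust(59) + zip_code.ljust(6)


records = [
    make_record("A", "01824"),
    make_record("B", "01863"),
    make_record("C", "03060"),
    make_record("D", "01863"),
]

# Python's sort is stable, so records with equal keys keep their input order.
# Sort/Merge guarantees that only when /STABLE is specified.
ordered = sorted(records, key=location_key)
print([rec[0] for rec in ordered])
```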
    /FIELD=(NAME=ZIP,POSITION:60,SIZE:6)
    /CONDITION=(NAME=LOCATION,
                TEST=(ZIP EQ "01863"))
    /DATA=(IF LOCATION THEN "NORTH CHELMSFORD"
           ELSE "Outside district")

    /FIELD=(NAME=FFLOAT,POS:1,SIZ:0,F_FLOATING)
    /CONDITION=(NAME=CFFLOAT,TEST=(FFLOAT GE 100))
    /OMIT=(CONDITION=CFFLOAT)

    /FIELD=(NAME=AGENT,POSITION:1,SIZE:5)
    /FIELD=(NAME=ZIP,POSITION:6,SIZE:3)
    /FIELD=(NAME=STYLE,POSITION:10,SIZE:5)
    /FIELD=(NAME=CONDITION,POSITION:16,SIZE:9)
    /FIELD=(NAME=PRICE,POSITION:26,SIZE:5)
    /FIELD=(NAME=TAXES,POSITION:32,SIZE:5)
    /DATA=PRICE
    /DATA=" "
    /DATA=TAXES
    /DATA=" "
    /DATA=STYLE
    /DATA=" "
    /DATA=ZIP
    /DATA=" "
    /DATA=AGENT

AGENT ZIP STYLE CONDITION PRICE TAXES
PRICE TAXES STYLE ZIP AGENT
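The record reformatting described by the /FIELD and /DATA qualifiers above can be sketched in Python as slicing fields out of the input record and reassembling them in the requested order. This is an illustration only. The sample record is hypothetical, and `reformat_record` is not part of Sort/Merge.

```python
# Field name -> (position, size), with 1-based positions as in /FIELD.
FIELDS = {
    "AGENT": (1, 5),
    "ZIP": (6, 3),
    "STYLE": (10, 5),
    "CONDITION": (16, 9),
    "PRICE": (26, 5),
    "TAXES": (32, 5),
}

# Output layout, in the order of the /DATA qualifiers.
# Items that are not field names are literal strings.
OUTPUT = ["PRICE", " ", "TAXES", " ", "STYLE", " ", "ZIP", " ", "AGENT"]


def reformat_record(record):
    """Build the output record from the input record."""
    parts = []
    for item in OUTPUT:
        if item in FIELDS:
            position, size = FIELDS[item]
            parts.append(record[position - 1:position - 1 + size])
        else:
            parts.append(item)  # literal, such as /DATA=" "
    return "".join(parts)


# Hypothetical input record laid out as AGENT ZIP STYLE CONDITION PRICE TAXES.
sample = "SMITH018 RANCH EXCELLENT 95000 01200"
print(reformat_record(sample))
```

The CONDITION field is defined but never named in a /DATA qualifier, so it does not appear in the output record.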
There are several ways in which you can improve the efficiency of a Sort or Merge operation, based on your sorting environment. Use the /STATISTICS qualifier with the SORT or MERGE command to get information about the variables in your sorting environment. (The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)
After you examine the statistics display, consider any of the optimization options presented in the following sections.
When you enter the SORT or MERGE command with the /STATISTICS qualifier, you see output similar to the following:
    $ SORT/STATISTICS PAGEANT.LIS DOCUMENT.LIS

                      OpenVMS Sort/Merge Statistics

    Records read:              3 (1)   Input record length:        26
    Records sorted:            3       Internal length:            28
    Records output:            3       Output record length:       26
    Working set extent:    16384 (2)   Sort tree size:             42
    Virtual memory:          392       Number of initial runs:      0
    Direct I/O:               10       Maximum merge order:         0
    Buffered I/O:             11       Number of merge passes:      0
    Page faults:             158 (3)   Work file allocation:        0 (4)
    Elapsed time:    00:00:00.54       Elapsed CPU:       00:00:00.03 (5)
As you examine the fields, note the following:
Sort defines four processes for sorting data internally: record, tag, address, and index. (The high-performance Sort/Merge utility supports only the record process. Implementation of the tag, address, and index processes is deferred to a future OpenVMS Alpha release.) RECORD is the default process. The type of process you choose affects the performance of the Sort operation as well as storage requirements. See Section 11.3.8 for information about the different sort processes.
Before you select a sorting process, consider the following:
From a specification file, you can improve Sort efficiency by using the /CONDITION, /INCLUDE, and /OMIT qualifiers to process only those records needed in the output file. (The high-performance Sort/Merge utility does not support specification files. Implementation of this feature is deferred to a future OpenVMS Alpha release.) You can also use specification file qualifiers to reformat records, omitting unnecessary fields from the output file. These qualifiers are not available as command line qualifiers.
During a Sort operation, records from the input file are read into memory. If the allocated memory cannot hold all the records, Sort transfers the sorted data to one or more temporary work files. Merge does not use work files.
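The relationship between memory, work files, initial runs, and merge passes can be illustrated with a generic external merge sort. The Python sketch below is a textbook model, not the algorithm Sort/Merge actually uses. `memory_limit` (records that fit in memory) and `merge_order` (runs merged at one time) are hypothetical parameters, and in-memory lists stand in for the work files. Reporting zero runs when everything fits in memory mirrors the sample statistics display, where a three-record file shows 0 initial runs.

```python
import heapq


def external_sort(records, memory_limit, merge_order=2):
    """Return (sorted_records, initial_runs, merge_passes)."""
    if len(records) <= memory_limit:
        # Everything fits in memory: no work files and no merging.
        return sorted(records), 0, 0

    # Phase 1: sort memory-sized chunks; each chunk becomes one run
    # that would be written to a work file.
    runs = [sorted(records[i:i + memory_limit])
            for i in range(0, len(records), memory_limit)]
    initial_runs = len(runs)

    # Phase 2: merge up to merge_order runs at a time until one remains.
    passes = 0
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + merge_order]))
                for i in range(0, len(runs), merge_order)]
        passes += 1
    return runs[0], initial_runs, passes


data = list(range(10, 0, -1))
result, initial_runs, passes = external_sort(data, memory_limit=3, merge_order=2)
print(result, initial_runs, passes)
```

In this model, raising `memory_limit` reduces the number of initial runs and raising `merge_order` reduces the number of merge passes, which is why both statistics serve as indicators of how well the data fits into memory.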
You can increase sort efficiency by changing the number of work files and by assigning them to specific devices:
ASSIGN device: SORTWORKn
    $ ASSIGN WORK$2: SORTWORK1
    $ ASSIGN WORK$3: SORTWORK2
Consider the following when you assign work files to devices:
If Sort requires work files (for example, if you are sorting a large file), a larger working set can increase sort efficiency. However, if your system is used heavily, it might be unable to allocate all the pages in the working set extent to your process. This can result in paging, which occurs when the operating system transfers parts of a process between physical memory and memory on a paging device; only the active part of the process remains in the physical memory. To avoid excessive paging, you can decrease the working set extent for your process. (Use the SET WORKING_SET command to decrease the working set extent.)
The following sections summarize information about Sort/Merge qualifiers.
This list describes command qualifiers used with the SORT and MERGE commands. To use a command qualifier, include the qualifier immediately after the SORT or MERGE command.
/[NO]CHECK_SEQUENCE
    $ MERGE/KEY=(SIZE:4,POSITION:3)/NOCHECK_SEQUENCE -
    _$ PRICE1.DAT,PRICE2.DAT PRICE.LIS
/COLLATING_SEQUENCE=sequence
    $ SORT/COLLATING_SEQUENCE=MULTINATIONAL -
    _$ NAMES.DAT,NOM.DAT LIST.LIS
/[NO]DUPLICATES
    $ SORT/KEY=(POSITION:3,SIZE:5,DECIMAL)/NODUPLICATES -
    _$ ACCT1,ACCT2 ACCT.LIS
/KEY=(POSITION:n,SIZE:n[,field,...])
/PROCESS=type
    $ SORT/KEY=(POS:40,SIZ:2,DESC)/PROCESS=TAG YRENDAVG.DAT -
    _$ DESCYRAVG.LIS
/SPECIFICATION=filespec
(The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)
/[NO]STABLE
    $ SORT/KEY=(POS:1,SIZ:5,DECIMAL)/STABLE PRICESA.DAT, -
    _$ PRICESB.DAT,PRICESC.DAT SUMMARY.LIS
/[NO]STATISTICS
(The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)
$ DEFINE/USER SYS$ERROR output-file
Statistic | Description |
---|---|
Records read | The number of records read by Sort or Merge. |
Records sorted | The number of records that have been processed using Sort. This number could be less than the number of records read if a specification file is used to select only certain records for the Sort or Merge operation. |
Records output | The number of records written to the output file. This number could be less than the number of records sorted if /NODUPLICATES was selected or if I/O errors occurred when the output records were being written. |
Working set extent | The number of pages in the process working set extent. This value is used as an upper limit on the size of the sort data structure. Adjusting this value is one way to improve the efficiency of a Sort operation. |
Virtual memory | The number of pages of virtual memory added to the Sort image to hold the data. |
Direct I/O + buffered I/O | This total is the number of I/O movements needed to read and write data. The lower this total value is, the more efficient the ordering operation. |
Page faults | Indicates how well the data fits into memory: the higher the number of page faults, the less efficient the ordering operation. |
Elapsed time | The total wall clock time used by the Sort or Merge operation in hours, minutes, seconds, and hundredths of seconds. |
Input record length | This value is obtained from the Record Management Services (OpenVMS RMS) unless the user supplies it. |
Internal length | The size in bytes of an internal format node. This includes any keys, data, a word to store the length, record file addresses (RFAs), and converted keys. |
Output record length | The length of the output record. The length is computed from the input record length, the sort process, and the record reformatting requested. |
Sort tree size | The number of records that fit in the Sort internal data structure. |
Number of initial runs | One indication of how well the data fits into memory. |
Maximum merge order | The maximum number of sorted strings that are merged at one time. |
Number of merge passes | The number of times the Sort utility merges strings until one sorted output string is produced. The number of initial runs and the number of merge passes indicate how well the data fits into memory. The higher these numbers, the further the working set size is from containing the data and the longer the sorting takes. |
Work file allocation | The number of blocks used for the work files. When more than one merge pass is needed, this size is approximately twice the size of the input file allocation. |
Elapsed CPU | The CPU time used by the ordering operation; it does not include time spent waiting for I/O operations to complete or time spent waiting while another process executes. |
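As the Records output entry notes, /NODUPLICATES can make the output count smaller than the sorted count. The Python sketch below models that effect for a decimal key. It is a hedged illustration: the records are hypothetical, `sort_noduplicates` is not a Sort/Merge routine, and keeping the first record of each group of equal keys is an assumption made only for this sketch.

```python
def sort_noduplicates(records, position, size):
    """Sort on a decimal key and keep one record per key value.

    Returns (output_records, records_sorted, records_output).
    """
    start = position - 1

    def key(rec):
        return int(rec[start:start + size])  # DECIMAL key

    ordered = sorted(records, key=key)
    seen = set()
    kept = []
    for rec in ordered:
        value = key(rec)
        if value not in seen:
            # Assumption: the first record with a given key is the one kept.
            seen.add(value)
            kept.append(rec)
    return kept, len(ordered), len(kept)


# Hypothetical account records with a 5-digit key in columns 3-7.
accounts = [
    "AA00300 rent",
    "AB00120 fuel",
    "AC00300 tax",
    "AD00045 misc",
]

output, records_sorted, records_output = sort_noduplicates(accounts, position=3, size=5)
print(output, records_sorted, records_output)
```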