
OpenVMS User's Manual


  6489P018.HTM
  OSSG Documentation
  22-NOV-1996 13:17:02.42

Copyright © Digital Equipment Corporation 1996. All Rights Reserved.

Step  Task
1     Specify SYS$INPUT as the input file on the SORT or MERGE command line.
      Use the input file qualifier /FORMAT to specify the size of the longest record, in bytes, and the approximate size of the input file, in blocks.
2     Enter the input records on successive lines.
      End each record by pressing Return.
3     Press Ctrl/Z to end the file.

Example

The following example demonstrates a Sort operation in which the input records to be sorted are entered directly from the terminal:

$ SORT/KEY=(POSITION:8,SIZE:15) -
_$ SYS$INPUT/FORMAT=(RECORD_SIZE:24,FILE_SIZE:10) BYNAME.LST
BST 7828 MCMAHON JANE [Return]
ADM 7933 ROSENBERG HARRY[Return]
COM 8102 KNIGHT MARTHA[Return]
ANS 8042 BENTLEY PETER[Return]
BIO 7951 LOWELL FRANK[Return]
[Ctrl/Z]

This sequence of commands creates the output file BYNAME.LST, which contains the sorted records.

11.8 Using a Sort/Merge Specification File

Sort/Merge allows you to maintain sort definitions and to specify more complex sort criteria in specification files. (The high-performance Sort/Merge utility does not support specification files. Implementation of this feature is deferred to a future OpenVMS Alpha release.) You can use any standard editor or the DCL CREATE command to create a specification file.

A Sort/Merge specification file allows you to:

After you complete the specification file, specify the file name using the /SPECIFICATION qualifier. The default file type for a specification file is .SRT.
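
As a minimal sketch of this workflow, the following commands create a small specification file with the DCL CREATE command and then name it on the SORT command line. The file names BYZIP.SRT, LISTINGS.DAT, and SORTED.DAT and the field layout are hypothetical, chosen only for illustration. The /KEY clause mirrors the KEY=(field,order) form used in the examples in Section 11.8.5; see Section 11.10.4 for the authoritative syntax.

```dcl
$ CREATE BYZIP.SRT
! Hypothetical specification file: sort on a zip code field
/FIELD=(NAME=ZIP,POSITION:60,SIZE:6)
/KEY=(ZIP,ASCENDING)
[Ctrl/Z]
$ SORT/SPECIFICATION=BYZIP LISTINGS.DAT SORTED.DAT
```

Because the default file type is .SRT, the file type is omitted on the /SPECIFICATION qualifier here.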

11.8.1 Format

Each command in the specification file should start with a slash (/). Continuation characters are not required if a command spans more than one line.


Note

Many of the qualifiers used in the specification file are similar to the DCL qualifiers used in the Sort/Merge command line. Note, however, that the syntax of these qualifiers can be different. For example, the /KEY qualifier at DCL level has different syntax than the /KEY qualifier in the specification file. See Section 11.10.4 for a summary of the specification file qualifiers.

11.8.2 Overriding a Qualifier

Any DCL command qualifiers that you specify on the command line override corresponding entries in the specification file. For example, if you specify the /KEY qualifier in the DCL command line, Sort/Merge ignores the /KEY clause in the specification file.
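
The following sketch illustrates the override. It assumes a hypothetical specification file, BYZIP.SRT, that contains its own /KEY clause; the input and output file names and the key position and size are also illustrative.

```dcl
$ ! BYZIP.SRT (hypothetical) contains its own /KEY clause.
$ ! The /KEY qualifier on the command line takes precedence over it.
$ SORT/SPECIFICATION=BYZIP/KEY=(POSITION:1,SIZE:5) LISTINGS.DAT SORTED.DAT
```

In this case the records would be ordered on the 5-byte field that begins in position 1, and the /KEY clause in the specification file would be ignored.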

11.8.3 Order of Qualifiers

Generally, there is no required order in which you must specify the qualifiers in a specification file. However, the order becomes significant in the following cases:

When you specify the FOLD, MODIFICATION, and IGNORE keywords with the /COLLATING_SEQUENCE qualifier, you should specify all MODIFICATION and IGNORE clauses before any FOLD clauses. See Section 11.10.4 for more information about the /COLLATING_SEQUENCE qualifier.

11.8.4 Including Comments

You can include comments in your specification file by beginning each comment line with an exclamation point (!). Unlike DCL command lines, specification files do not need hyphens (-) to continue the line.
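
As a short illustration, the following hypothetical specification file fragment combines full-line comments, a trailing comment, and a command that spans several lines without hyphens. The field name and layout are assumptions made for the sketch.

```dcl
! Full-line comments begin with an exclamation point.
/FIELD=(NAME=ZIP,          ! A comment can also follow part of a command
        POSITION:60,
        SIZE:6)
/KEY=(ZIP,ASCENDING)
```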

11.8.5 Examples: Specification File

  1. This is an example of a specification file that can be used to sort negative and positive data in ascending order:
    ! Specification file for sorting negative and positive data 
    ! in ascending order 
    ! 
    /FIELD=(NAME=SIGN,POS:1,SIZ:1)      ! (1)
    /FIELD=(NAME=AMT,POS:2,SIZ:4)       ! (2)
    /CONDITION=(NAME=CHECK1,            ! (3)
              TEST=(SIGN EQ "-")) 
    /CONDITION=(NAME=CHECK2,            ! (4)
              TEST=(SIGN EQ " ")) 
    /INCLUDE=(CONDITION=CHECK1,         ! (5)
              KEY=(AMT,DESCENDING), 
              DATA=SIGN, 
              DATA=AMT) 
    /INCLUDE=(CONDITION=CHECK2,         ! (6)
              KEY=(AMT,ASCENDING), 
              DATA=SIGN, 
              DATA=AMT) 
    

    As you examine the specification file, note the following:
    1. This command line defines a field that begins in byte 1 of the record and is 1 byte long. It assigns the field the name SIGN.
    2. This command line defines a field that begins in byte 2 of the record and is 4 bytes long. It assigns the field the name AMT.
    3. This is a condition statement. If there is a negative sign ( - ) in the SIGN byte, the CHECK1 condition is met.
    4. This is a condition statement. If the SIGN byte is blank, the CHECK2 condition is met.
    5. If the condition CHECK1 is met (a negative amount), the record is sorted in descending order of AMT, so that larger negative magnitudes come first.
    6. If the condition CHECK2 is met (a positive amount), the record is sorted in ascending order of AMT.

    Figure 11-8 shows the result of using the specification file on an input file named BALANCES.LIS.

    Figure 11-8 Output from Using a Specification File



  2.
    /FIELD=(NAME=RECORD_TYPE,POS:1,SIZ:1)   ! Record type, 1-byte 
    /FIELD=(NAME=PRICE,POS:2,SIZ:8)         ! Price, both files 
    /FIELD=(NAME=TAXES,POS:10,SIZ:5)        ! Taxes, both files 
    /FIELD=(NAME=STYLE_A,POS:15,SIZ:10)     ! Style, format A file 
    /FIELD=(NAME=STYLE_B,POS:20,SIZ:10)     ! Style, format B file 
    /FIELD=(NAME=ZIP_A,POS:25,SIZ:5)        ! Zip code, format A file 
    /FIELD=(NAME=ZIP_B,POS:15,SIZ:5)        ! Zip code, format B file 
    /CONDITION=(NAME=FORMAT_A,              ! Condition test, format A 
                TEST=(RECORD_TYPE EQ "A")) 
    /CONDITION=(NAME=FORMAT_B,              ! Condition test, format B 
                TEST=(RECORD_TYPE EQ "B")) 
    /INCLUDE=(CONDITION=FORMAT_A,           ! Output format, type A 
                KEY=ZIP_A, 
                DATA=PRICE, 
                DATA=TAXES, 
                DATA=STYLE_A, 
                DATA=ZIP_A) 
    /INCLUDE=(CONDITION=FORMAT_B,           ! Output format, type B 
                KEY=ZIP_B, 
                DATA=PRICE, 
                DATA=TAXES, 
                DATA=STYLE_B, 
                DATA=ZIP_B) 
    

    In this example, two input files from two different branches of a real estate agency are sorted according to the instructions specified in a specification file. The records in the first file that begin with an A in the first position have this format:
           |A|PRICE|TAXES|STYLE|ZIP| 
            1 2     10    15    25 
    

    The records in the second file begin with a B in the first position and have the style and zip code fields reversed, as follows:
           |B|PRICE|TAXES|ZIP|STYLE| 
            1 2     10    15  20 
    

    To sort these two files on the zip code field in the format of record A, first define the fields in both records with the /FIELD qualifiers. Then, specify a test to distinguish between the two types of records with the /CONDITION qualifiers. Finally, the /INCLUDE qualifiers change the record format of type B to record format of type A on output.
    Note that, if you specify either key or data fields in an /INCLUDE qualifier, you must explicitly specify all the key and data fields for the Sort operation in the /INCLUDE qualifier.
    Also note that records that are not type A or type B are omitted from the sort.
  3.
    /COLLATING_SEQUENCE=(SEQUENCE= 
    ("AN","EB","AR","PR","AY","UN","UL", 
    "UG","EP","CT","OV","EC","0"-"9"), 
    MODIFICATION=("'"="19"), 
    FOLD) 
    

    This /COLLATING_SEQUENCE qualifier specifies a user-defined sequence that gives each month a unique value in chronological order. For example, if you want to order a file called SEMINAR.DAT according to the date, the file SEMINAR.DAT would be set up as follows:
           16 NOV 1983   Communication Skills 
           05 APR 1984   Coping with Alcoholism 
           11 Jan '84    How to Be Assertive 
           12 OCT 1983   Improving Productivity 
           15 MAR 1984   Living with Your Teenager 
           08 FEB 1984   Single Parenting 
           07 Dec '83    Stress --- Causes and Cures 
           14 SEP 1983   Time Management 
    

    The primary key is the year field; the secondary key is the month field. Because the month field is not numeric and you want the months ordered chronologically, you must define your own collating sequence. You can do this by sorting on the second and third letters of each month, listed in chronological sequence, which gives each month a unique key value.
    The MODIFICATION option specifies that the apostrophe (') be equated to 19, thereby allowing a comparison of '83 and 1984. The FOLD option specifies that uppercase and lowercase letters are treated as equal.
    The output from this Sort operation appears as follows:
           14 SEP 1983   Time Management 
           12 OCT 1983   Improving Productivity 
           16 NOV 1983   Communication Skills 
           07 Dec '83    Stress --- Causes and Cures 
           11 Jan '84    How to Be Assertive 
           08 FEB 1984   Single Parenting 
           15 MAR 1984   Living with Your Teenager 
           05 APR 1984   Coping with Alcoholism 
    

    See Section 11.4 for other examples of creating user-defined collating sequences.
  4.
    /FIELD=(NAME=AGENT,POSITION:20,SIZE:15) 
    /CONDITION=(NAME=AGENCY, 
                TEST=(AGENT EQ "Real-T Trust" 
                OR 
                AGENT EQ "Realty Trust")) 
    /DATA=(IF AGENCY THEN "Realty Trust" ELSE AGENT) 
    

    In this example, two real estate files are being sorted. One file refers to an agency as Real-T Trust; the other refers to the same agency as Realty Trust. The /CONDITION and /DATA qualifiers instruct Sort to list the AGENT field in the sorted output file as Realty Trust.
  5.
    /FIELD=(NAME=ZIP,POSITION:60,SIZE:6) 
    /CONDITION=(NAME=LOCATION, 
                TEST=(ZIP EQ "01863")) 
    /KEY=(IF LOCATION THEN 1 
          ELSE 2) 
    

    In this example, all the records with a zip code of 01863 will appear at the beginning of the sorted output file. The conditional test is on the ZIP field, defined with the /FIELD qualifier; the condition is named LOCATION. The values 1 and 2 in this /KEY qualifier signify a relative order for those records that satisfy the condition and those that do not.
  6.
    /FIELD=(NAME=ZIP,POSITION:60,SIZE:6) 
    /CONDITION=(NAME=LOCATION, 
                TEST=(ZIP EQ "01863")) 
    /DATA=(IF LOCATION THEN "NORTH CHELMSFORD" 
           ELSE "Outside district") 
    

    In this example, the /CONDITION qualifier tests for the 01863 zip code. The /DATA qualifier adds a town-name field to the output record; its value depends on the result of the test.
  7.
    /FIELD=(NAME=FFLOAT,POS:1,SIZ:0,F_FLOATING) 
    /CONDITION=(NAME=CFFLOAT,TEST=(FFLOAT GE 100)) 
    /OMIT=(CONDITION=CFFLOAT) 
    

    In this example, the number 100 is considered to be an F_FLOATING data type because field FFLOAT is defined as F_FLOATING in the /FIELD qualifier.
  8.
    /FIELD=(NAME=AGENT,POSITION:1,SIZE:5) 
    /FIELD=(NAME=ZIP,POSITION:6,SIZE:3) 
    /FIELD=(NAME=STYLE,POSITION:10,SIZE:5) 
    /FIELD=(NAME=CONDITION,POSITION:16,SIZE:9) 
    /FIELD=(NAME=PRICE,POSITION:26,SIZE:5) 
    /FIELD=(NAME=TAXES,POSITION:32,SIZE:5) 
    /DATA=PRICE 
    /DATA="  " 
    /DATA=TAXES 
    /DATA="  " 
    /DATA=STYLE 
    /DATA="  " 
    /DATA=ZIP 
    /DATA="  " 
    /DATA=AGENT 
    

    The /FIELD qualifiers define the fields in the records from an input file that has the following format:
    AGENT ZIP STYLE CONDITION PRICE TAXES 
    

    The /DATA qualifiers, which use the field-names defined in the /FIELD qualifiers, reformat the records to create output records of the following format:
    PRICE TAXES STYLE ZIP AGENT 
    

11.9 Optimizing a Sort or Merge Operation

There are several ways in which you can improve the efficiency of a Sort or Merge operation, based on your sorting environment. Use the /STATISTICS qualifier with the SORT or MERGE command to get information about the variables in your sorting environment. (The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)

After you examine the statistics display, consider any of the optimization options presented in the following sections.

Example

When you enter the SORT or MERGE command with the /STATISTICS qualifier, you see output similar to the following:

$ SORT/STATISTICS PAGEANT.LIS DOCUMENT.LIS
                  OpenVMS Sort/Merge Statistics 
 
Records read:           3 (1)     Input record length:       26 
Records sorted:         3        Internal length:           28 
Records output:         3        Output record length:      26 
Working set extent: 16384 (2)     Sort tree size:            42 
Virtual memory:       392        Number of initial runs:     0 
Direct I/O:            10        Maximum merge order:        0 
Buffered I/O:          11        Number of merge passes:     0 
Page faults:          158 (3)     Work file allocation:       0 (4)
Elapsed time: 00:00:00.54        Elapsed CPU:      00:00:00.03 (5)

As you examine the fields, note the following:

  1. Records read
    Lists the number of records that were read during a Sort operation. See Section 11.9.2 for information on selectively omitting records from a Sort operation.
  2. Working set extent
    Shows how many blocks are reserved to perform the sort operation. See Section 11.9.4 for information on making your working set larger.
  3. Page faults
    Shows how many times the operating system has transferred parts of your process from physical memory to your paging device. See Section 11.9.4 for more information on preventing paging.
  4. Work file allocation
    Shows how much disk space is reserved for your work file. See Section 11.9.3 for more information on work files.
  5. Elapsed CPU
    Shows how much CPU time the operating system took to process the sort operation. See Section 11.9.1 for information on saving time by choosing different methods of sorting.

11.9.1 Sorting Process

Sort defines four processes for sorting data internally: record, tag, address, and index. (The high-performance Sort/Merge utility supports only the record process. Implementation of tag, address, and index processes is deferred to a future OpenVMS Alpha release.) RECORD is the default process. The type of process you choose affects the performance of the Sort operation as well as storage requirements. See Section 11.3.8 for information about the different sort processes.
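
As a hedged sketch, a process other than the default is selected with the /PROCESS qualifier listed in Section 11.10.1. The file names and the key position and size below are hypothetical.

```dcl
$ ! Use the tag process instead of the default record process.
$ SORT/PROCESS=TAG/KEY=(POSITION:1,SIZE:10) LISTINGS.DAT SORTED.DAT
```

Whether a process other than RECORD helps depends on the record size and on the available memory and disk space, so comparing /STATISTICS output for each choice is a reasonable way to decide.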

Before you select a sorting process, consider the following:

11.9.2 Omitting Records and Fields

From a specification file, you can improve Sort efficiency by using the /CONDITION, /INCLUDE, and /OMIT qualifiers to process only those records needed in the output file. (The high-performance Sort/Merge utility does not support specification files. Implementation of this feature is deferred to a future OpenVMS Alpha release.) You can also use specification file qualifiers to reformat records, omitting unnecessary fields from the output file. These qualifiers are not available as command line qualifiers.
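
The following hypothetical specification file sketches this idea by keeping only the records for one zip code. It reuses the ZIP field and LOCATION condition from the examples in Section 11.8.5. The /INCLUDE clause is written by analogy with the /OMIT=(CONDITION=...) form shown there, so check it against Section 11.10.4 before relying on it.

```dcl
! Hypothetical: process only the records in the 01863 zip code
/FIELD=(NAME=ZIP,POSITION:60,SIZE:6)
/CONDITION=(NAME=LOCATION,
            TEST=(ZIP EQ "01863"))
/INCLUDE=(CONDITION=LOCATION)
```

Records that do not satisfy the condition are left out of the sort, so less data has to be read into memory or written to work files.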

11.9.3 Work Files

During a Sort operation, records from the input file are read into memory. If the allocated memory cannot hold all the records, Sort transfers the sorted data to one or more temporary work files. Merge does not use work files.

You can increase sort efficiency by changing the number of work files and by assigning them to specific devices:

Consider the following when you assign work files to devices:
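
As a rough sketch of both adjustments, the commands below assume the /WORK_FILES qualifier and the SORTWORKn logical names of the Sort/Merge utility, neither of which is described in the text above; verify them against the Sort/Merge reference documentation. The device names, file names, and key values are hypothetical.

```dcl
$ ! Hypothetical devices; place each work file on a separate disk.
$ DEFINE SORTWORK0 DKA100:
$ DEFINE SORTWORK1 DKA200:
$ ! Ask Sort to use two work files for this operation.
$ SORT/WORK_FILES=2/KEY=(POSITION:1,SIZE:10) LISTINGS.DAT SORTED.DAT
```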

11.9.4 Working Set Extent

If Sort requires work files (for example, if you are sorting a large file), a larger working set can increase sort efficiency. However, if your system is used heavily, it might be unable to allocate all the pages in the working set extent to your process. This can result in paging, which occurs when the operating system transfers parts of a process between physical memory and memory on a paging device; only the active part of the process remains in the physical memory. To avoid excessive paging, you can decrease the working set extent for your process. (Use the SET WORKING_SET command to decrease the working set extent.)
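
A minimal sketch of this adjustment follows. The extent value is hypothetical and must be chosen for your own system; the file names and key values are also illustrative.

```dcl
$ ! Display the current working set limit, quota, and extent.
$ SHOW WORKING_SET
$ ! Lower the extent for this process (the value is illustrative only).
$ SET WORKING_SET/EXTENT=2048
$ ! Repeat the sort with /STATISTICS and compare the page fault count.
$ SORT/STATISTICS/KEY=(POSITION:1,SIZE:10) LISTINGS.DAT SORTED.DAT
```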

11.10 Summary of Sort/Merge Qualifiers

The following sections summarize information about Sort/Merge qualifiers.

11.10.1 Command Qualifiers

This list describes command qualifiers used with the SORT and MERGE commands. To use a command qualifier, include the qualifier immediately after the SORT or MERGE command.

/[NO]CHECK_SEQUENCE

/COLLATING_SEQUENCE=sequence

/[NO]DUPLICATES

/KEY=(POSITION:n,SIZE:n[,field,...])

/PROCESS=type

/SPECIFICATION=filespec

(The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)

/[NO]STABLE

/[NO]STATISTICS

(The high-performance Sort/Merge utility does not support this qualifier. Implementation of this feature is deferred to a future OpenVMS Alpha release.)