Chapter 12. Foreign File Conversion

This chapter describes data conversion, discusses how to move data between machines, and explains how implicit and explicit data conversion work. It also describes the support provided for reading and writing files in foreign formats, including record blocking and numeric and character conversion.

These routines convert data (primarily floating-point data, but also integer and character, as well as Fortran complex and logical data) from your system's native representation to a foreign representation, and vice versa. Most of the routines discussed in this chapter are not yet available on Cray MPP systems or on IRIX systems. For complete implementation details, see the individual man pages or the INTRO_CONVERSION(3f) man page.

Conversion Overview

Data can be transferred between IRIX/UNICOS systems and other computer systems in several ways. These methods include the use of stations supplied by Cray Research and online tapes and utilities built on TCP/IP (such as ftp).

Cray Research supports foreign data conversion to and from IBM, VAX/VMS, CDC NOS/VE, CYBER 205, and the Institute of Electrical and Electronics Engineers (IEEE) formats. For each foreign file type, several file and record formats are supported, and either explicit or implicit data conversion can be used.

When processing foreign data on IRIX or UNICOS systems, you must consider the interactions between the data formats and the chosen method of data transfer. This section describes, in broad terms, the techniques available to do these data conversions.

Explicit data conversion is the process by which the user performs calls to subroutines that convert the native data to and from the foreign data formats. These routines are provided for many data formats. This is discussed in more detail in “Explicit Data Item Conversion”.

Implicit data conversion is the process by which users declare that a particular file contains foreign data and/or record blocking and then request that the run-time library perform appropriate transformations on the data to make it useful to the program at I/O time. This method of record and/or data format conversion requires changes in command scripts. This is discussed in more detail in “Implicit Data Item Conversion”.

Transferring Data

This section describes several ways to transfer data, including using the fdcp tool, magnetic tape, and station conversion facilities.

Using fdcp to Transfer Files (Not Available on IRIX systems)

The fdcp(1) command can handle data that is not a simple disk-resident byte stream. The fdcp command assumes that both the data and any record, including EOF records, can be copied from one file to another. Record structures can be preserved or removed. EOF records can be preserved either as EOF records in the output file or used to separate the delimited data in the input file into separate files.

The fdcp command does not perform data conversion; the only transformations done are on the record and file structures (fdcp transforms block, record, and file control words from one format to another).

If no assign(1) information is available for a file, the system layer is used. This means that if the file being accessed is on disk and if no assign -F attribute is used, the syscall layer is used; if it is on a tape, the bmx layer is used. Therefore, each tape block is considered a record; user tape marks are mapped to EOF.

The following four examples show some uses of fdcp:

Example 12-1. Copy VAX/VMS tape file to disk

Copy a VAX/VMS tape file from labeled tape to disk, converting the logical records to native text records. The resulting file is a native text file that contains the VAX data.

assign -F vms.v.tape tapefile
assign -F text diskfile
fdcp tapefile diskfile

vi diskfile         # process the data


Example 12-2. Copy unknown tape type to disk

Copy a tape of unknown type to disk and preserve all tape blocks and tape marks in a COS blocked file (used by the UNICOS operating system). For this example, no assign command is necessary for the tape. If an assign command were needed, it would be the following:

assign -F bmx tapefile

The following assign and fdcp commands are required for disk files:

assign -F cos diskfile
fdcp tapefile diskfile

After examining the tape label, you discover that it is an IBM tape with fixed-length 240-byte records in 4800-byte tape blocks. Read this disk file by reassigning the file, using the following command:

assign -F ibm.fb:240:4800,cos -N ibm diskfile

Then execute the program using the following command:

./a.out diskfile

If this information was available when the tape was mounted, the tape could have been read directly by using the following command:

assign -F ibm.fb:240:4800,bmx -N ibm tapefile

Then the program could be executed with the following command:

./a.out tapefile
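
The ibm.fb:240:4800 specification in this example describes fixed-length 240-byte records packed into tape blocks of at most 4800 bytes. As a rough illustration of that deblocking step only (this is not the FFIO implementation, and the function name deblock_fixed is hypothetical), the following Python sketch splits a sequence of tape blocks into logical records. It assumes the blocks have already been separated from any control words.

```python
def deblock_fixed(blocks, lrecl=240, blksize=4800):
    """Yield fixed-length logical records from a sequence of tape blocks.

    blocks  -- iterable of bytes objects, one per tape block (assumed input form)
    lrecl   -- logical record length in bytes
    blksize -- maximum block size in bytes
    """
    if blksize % lrecl != 0:
        raise ValueError("block size must be a multiple of the record length")
    for block in blocks:
        # A block may be short, but it must hold a whole number of records.
        if len(block) > blksize or len(block) % lrecl != 0:
            raise ValueError("malformed fixed-blocked tape block")
        for start in range(0, len(block), lrecl):
            yield block[start:start + lrecl]


# Two full blocks (20 records each) followed by a short block of 3 records.
tape_blocks = [bytes(4800), bytes(4800), bytes(240 * 3)]
records = list(deblock_fixed(tape_blocks))
print(len(records))
```

Taking the blocks as separate items, rather than as one concatenated byte string, keeps a short block in the middle of the tape from shifting the record boundaries that follow it.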


Example 12-3. Creating files for other systems

Use fdcp to create files for use on other systems. To create a COS blocked blank-compressed file to send to a COS site on tape, do the following:

assign -F blankx,cos tapefile
assign -F text diskfile
fdcp diskfile tapefile

If this sample tape will contain more than one file, the assign -F text diskfile and fdcp commands can be replaced by the following:

assign -F text file1
assign -F text file2
assign -F text file3
assign -F text file4

fdcp file1,file2,file3,file4 tapefile 

This creates a COS transparent tape that contains four files.


Example 12-4. Copying to UNICOS text files

The same tape created in Example 12-3 (if created on the COS site) can be read on UNICOS in the same way. The following commands copy each of the files on the tape to a separate UNICOS text file.

assign -F blankx,cos tapefile
assign -F text file1
assign -F text file2
assign -F text file3
assign -F text file4

fdcp tapefile file1,file2,file3,file4


Moving Data between Systems

This section describes the following ways to move data between the UNICOS machine and other machines:

  • Station conversion facilities (not available on CRAY T3E systems or IRIX systems)

  • Magnetic tapes (not available on IRIX systems)

  • TCP/IP, ftp, and other networks

Station Conversion Facilities

When a station converts a front-end file to a UNICOS file or a UNICOS file to a front-end file, two basic conversion options are available:

  • Conversion of the file's internal block and record control structures

  • Conversion of character data such as IBM EBCDIC and CDC display code to and from ASCII; conversion of data types other than character is not available with station facilities.

Data conversion is done only with the knowledge of the front-end computer's data type formats. The station software performs character and control structure conversions.
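
The character conversion described above can be pictured with a short Python sketch. It is only an illustration: Python's cp037 codec is used as a stand-in for whichever EBCDIC code page a given front end actually uses (an assumption), and the function names are hypothetical rather than part of the station software.

```python
# Assumption: "cp037" (EBCDIC US/Canada) approximates the front-end character set.
EBCDIC_CODEC = "cp037"


def ebcdic_to_ascii(raw: bytes) -> str:
    """Decode EBCDIC character data into native text."""
    return raw.decode(EBCDIC_CODEC)


def ascii_to_ebcdic(text: str) -> bytes:
    """Encode native text as EBCDIC character data."""
    return text.encode(EBCDIC_CODEC)


record = bytes.fromhex("c8c5d3d3d6")  # the characters H, E, L, L, O in EBCDIC
print(ebcdic_to_ascii(record))
```

Only character data can be translated this way. Applying the same byte-for-byte mapping to binary integers or floating-point values would corrupt them, which is why the station facilities cannot convert other data types.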

When a front-end file is processed by using the fetch, acquire, or dispose commands, the -fparameter option to those commands determines whether the file is processed in COS blocked format and whether character conversion is performed. See the fetch(1), acquire(1), and dispose(1) man pages for more details.

The following are some of the options for the format parameter:

Option

Description

bb

Binary blocked. The station converts between COS file and front-end file record blocking structures. The data in the resulting file remains unchanged.

tr

Transparent. No blocking, deblocking, or character conversion is performed. The resulting unblocked file is a bit copy of the original file. The exact format of this data depends on the front-end operating system.

cb

Character blocked. The station converts between COS file and front-end file record-blocking structures. The file is assumed to consist of character data which is converted between ASCII and the front-end character set. Cray Research blank compression and expansion also occur.

ud

UNIX operating system data. This format is often used to transfer text files, and it is the default. The station converts between UNIX text file format and front-end record blocking structures.

All of the stations (including IBM MVS and VM, CDC, and VMS) support all of the preceding options. The bb file format conversion differs for each station.

Magnetic Tape

The simplest way to move data between machines is to carry a magnetic tape from one computer room to another. The following commands are most relevant to this process:

Command

Description

rsv

Reserves tape devices

tpmnt

Mounts the tape

assign

Declares Fortran FFIO processing options

rls

Releases the tape

Many options are available with these commands; only a subset is discussed in this manual. For complete details, see the Cray Research publication, the Tape Subsystem User's Guide. When mounting a tape, it is assumed that the rsv command was used to reserve a tape drive and the rls command will be used to release the tape and the drive when you finish.

The tpmnt -T option that allows user tape marks and the tpmnt -l option that specifies a label type are relevant to data conversion. For VAX/VMS data conversion, remember that tapes are written in entirely different formats depending on the tape labeling.

When writing a tape, this factor must be considered: if a tape will be read on a foreign system, the tape label should contain information about the record type in each file. The request to set up this information is made by specifying option(s) on the tpmnt command. These options do not affect the actual record format of the data in the file; they simply request the tape subsystem to place the information in the appropriate labels.

The following options are available on the tpmnt command:

  • -Frecord-format specifies the value of the record format field in the HDR2, EOV2, and EOF2 labels of the ANSI and standard IBM labels. The record format value is used when creating labels. If you do not specify this option when creating a new file, the default record format is U. The record-format option can have the following values:

    F

    Fixed length (for both ANSI label and IBM standard label). This corresponds to IBM F and FB formats and VMS F format.

    D

    Variable length with zoned decimal length indicator (for ANSI label). This corresponds to VMS D and S formats and NOS/VE D and S formats.

    U

    Undefined length (for both ANSI label and IBM standard label). This corresponds to IBM and NOS/VE U formats.

    V

    Variable length (for standard IBM label). This is appropriate for all IBM V, VB, and VBS files.

    An optional second character can also be specified. This indicates the attributes of the data and can be one of the following (you must determine if the target system requires these values):

    B

    Blocked records

    S

    Spanned or standard records

    R

    Blocked and spanned or standard records

  • -Lrecord-length specifies the maximum number of bytes in a record. It is used differently on various systems and usually corresponds to the logical record length. If this option is not present and a new file is being created, an installation default value is used.

  • -bblock-size is the same as the mbs parameter in many of the FFIO specifications for foreign record types. It is placed in the label of the tape when a file is created and is often unnecessary (but can be specified if a particular value is needed). It is checked at processing time. If it is too small, writing the tape fails. An installation default is used if this option is not present and a new file is being created.

When creating a tape to use on another system, you can place this information in the tape labels. The UNICOS operating system does not use it, but it can be important when the tapes are read on other systems.
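
To make the D record format listed above more concrete, the following Python sketch parses one tape block of variable-length records. It rests on assumptions not stated in this manual: that each record begins with a four-character ASCII decimal length field that counts itself, and that unused space at the end of a block is filled with circumflex (^) characters. The function name parse_d_records is hypothetical.

```python
def parse_d_records(block: bytes):
    """Split one tape block into variable-length D-format records.

    Assumed layout: 4 ASCII digits giving the record length (including the
    4-digit field itself), followed by the data; '^' pads the block tail.
    """
    records = []
    pos = 0
    while pos < len(block) and block[pos:pos + 1] != b"^":
        length = int(block[pos:pos + 4].decode("ascii"))
        if length < 4 or pos + length > len(block):
            raise ValueError("bad record length field at offset %d" % pos)
        records.append(block[pos + 4:pos + length])
        pos += length
    return records


block = b"0009HELLO0006AB^^^^"
print(parse_d_records(block))
```

The record format value written by tpmnt only labels the file. As noted above, it does not change how the data itself is laid out, so the writing program must still produce records in the format that the label claims.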

TCP/IP and Other Networks

Several network utilities allow users to move data between computer systems. In this manual, the rcp and ftp utilities are discussed. These utilities work very well transferring files between systems based on UNIX software.

When transferring a file to a foreign system, FFIO can create the file in the correct foreign format but ftp cannot establish the right attributes on the file so that the foreign operating system can handle it correctly. Therefore, ftp is not useful as a transfer agent on IBM and VMS systems for binary data. Its utility is limited to those systems that do not embed record attributes in the system file information.

Data Item Conversion

The IRIX and UNICOS operating systems provide both implicit and explicit conversion of data items. Explicit conversion means that the user's code must invoke the routines that convert between native and foreign representations.

Options to the assign(1) command control implicit conversion. The data types in the Fortran I/O lists direct implicit conversion. Implicit conversion is usually transparent to users and is available only to Fortran programmers. The following sections describe these data conversion types and provide direction in choosing a conversion type.

Explicit Data Item Conversion

The Cray Research Fortran library contains a set of subroutines that convert between Cray Research data formats and the formats of various vendors. These routines are callable from any programming language supported by Cray Research. The explicit conversion routines convert between IBM, VAX/VMS, CDC, NOS/VE, CYBER 205, or IEEE binary data formats and Cray Research binary data formats. For complete details, see the individual man pages for each routine. These subroutines provide an efficient way to convert data that was read into system central memory. These are the recommended routines, and they replace the older explicit routines described in Appendix A.

Table 12-1 lists subroutines that convert Cray Research PVP types. Table 12-2 lists subroutines that convert Cray Research MPP types. Table 12-3 lists subroutines that convert Cray T90/IEEE types. Table 12-4 lists SGI (MIPS) conversion routines.

Table 12-1. Conversion routines for Cray PVP systems

Cray PVP systems (non-IEEE)

Foreign type            Foreign -> Cray   Cray -> Foreign
IBM                     IBM2CRAY          CRAY2IBM
VAX/VMS                 VAX2CRAY          CRAY2VAX
CDC (NOS)               CDC2CRAY          CRAY2CDC
CDC (NOS/VE)            NVE2CRAY          CRAY2NVE
CDC CYBER 205           ETA2CRAY          CRAY2ETA
Generic IEEE (32-bit)   IEG2CRAY          CRAY2IEG
IEEE little-endian      IEU2CRAY          CRAY2IEU
Cray IEEE (64-bit)      CRI2CRAY          CRAY2CRI
SGI MIPS                MIPS2CRY          CRY2MIPS
User conversion         USR2CRAY          CRAY2USR
Site conversion         STE2CRAY          CRAY2STE


Table 12-2. Conversion routines for Cray MPP systems

Cray MPP systems

Foreign type            Foreign -> Native      Native -> Foreign
Cray PVP (non-IEEE)     CRAY2CRI and CRY2CRI   CRI2CRAY and CRI2CRY
IBM                     IBM2CRI                CRI2IBM
Generic IEEE (32-bit)   IEG2CRI                CRI2IEG
User conversion         USR2CRAY               CRAY2USR
Site conversion         STE2CRAY               CRAY2STE


Table 12-3. Conversion routines for CRAY T90 systems

Cray T90/IEEE

Foreign type            Foreign -> Native   Native -> Foreign
Cray PVP (non-IEEE)     CRY2CRI             CRI2CRY
IBM                     IBM2CRI             CRI2IBM
Generic IEEE (32-bit)   IEG2CRI             CRI2IEG
User conversion         USR2CRAY            CRAY2USR
Site conversion         STE2CRAY            CRAY2STE


Table 12-4. Conversion routines for SGI (MIPS) systems

SGI (MIPS)

Foreign type              Foreign -> Native   Native -> Foreign
Cray PVP (non-IEEE)       CRY2MIPS            MIPS2CRY
User conversion           USR2MIPS            MIPS2USR
Site conversion           STE2MIPS            MIPS2STE
IEEE Fortran conversion   IEG2MIPS            MIPS2IEG
VAX Fortran conversion    VAX2MIPS            MIPS2VAX


See the individual man pages for details about the syntax and arguments for each routine.
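
To give a feel for what an explicit conversion routine does internally, the following Python sketch converts 32-bit VAX F_floating values to native floating-point numbers. It is an illustration only, not the library's VAX2CRAY or VAX2MIPS code, and the function names are hypothetical. It relies on the VAX F_floating layout as commonly documented: two little-endian 16-bit words, with the sign and an excess-128 exponent in the first word and a hidden leading fraction bit.

```python
import struct


def vax_f_to_native(raw: bytes) -> float:
    """Convert one 4-byte VAX F_floating value to a Python float."""
    w0, w1 = struct.unpack("<HH", raw)       # two little-endian 16-bit words
    sign = -1.0 if w0 & 0x8000 else 1.0
    exponent = (w0 >> 7) & 0xFF              # excess-128 binary exponent
    if exponent == 0:
        return 0.0                           # zero (reserved operands are not handled here)
    fraction = ((w0 & 0x7F) << 16) | w1      # 23 stored fraction bits
    mantissa = (fraction | 0x800000) / float(1 << 24)  # restore the hidden bit
    return sign * mantissa * 2.0 ** (exponent - 128)


def vax_f_array(buf: bytes):
    """Convert a whole buffer of F_floating values in one call."""
    return [vax_f_to_native(buf[i:i + 4]) for i in range(0, len(buf), 4)]


print(vax_f_array(bytes.fromhex("80400000" "20c10000")))
```

Converting an entire buffer in one call mirrors the way the explicit routines are normally applied to large data areas after the data has been read into memory.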

Implicit Data Item Conversion

Implicit data conversion in Fortran requires no explicit action by the program to convert the data in the I/O stream other than using the assign command to instruct the libraries to perform conversion. For details, see the assign(1) man page.

The implicit data conversion process is performed in two steps:

  1. Record format conversion

  2. Data conversion

Record format conversion interprets or converts the internal record blocking structures in the data stream to gain record-level access to the data. The data contained in the records can then be converted.

Using implicit conversion, you can select record blocking or deblocking alone, or you can request that the data items be converted automatically. When enabled, record format conversion and data item conversion occur transparently and simultaneously. Changes are usually not required in your Fortran code.

To enable conversion of foreign record formats, specify the appropriate record type with the assign -F command. The -N (numeric conversion) and -C (character conversion) assign options control conversion of data contained in a record. If -F is specified, but -N and -C are not, the libraries interpret the record format, but they do not convert data. You can obtain information about the type of data that will be converted (and, therefore, the type of conversion that will be performed) from the Fortran I/O list.

If -N is used and -C is not, an appropriate character conversion type is selected by default, as shown in the following tables:

  • Table 12-5 lists conversion types on Cray PVP systems (non-IEEE)

  • Table 12-6 lists conversion types on Cray MPP systems

  • Table 12-7 lists conversion types on CRAY T90/IEEE systems

  • Table 12-8 lists conversion types on SGI MIPS systems

Table 12-5. Conversion types on Cray PVP systems

-N option    -C default   Meaning
none         none         No data conversion
default      default      No data conversion
cray         ASCII        No data conversion
ibm          EBCDIC       IBM data conversion
ibm_dp       EBCDIC       IBM data conversion; floating-point is 64 bits
cdc          CDC          CDC 60-bit conversion
nosve        ASCII        CDC NOS/VE data conversion
c205         ASCII        CDC CYBER 205 (ETA) data conversion
vms          ASCII        VAX/VMS data conversion
vms_dp       ASCII        VAX/VMS data conversion; floating-point is 64 bits
ieee         ASCII        Generic 32-bit IEEE data conversion
ieee_32      ASCII        Alias for ieee
ieee_dp      ASCII        IEEE data conversion; floating-point is 64 bits
mips         ASCII        SGI MIPS IEEE data conversion (128-bit floating-point is "double double" format)
ieee_64      ASCII        Cray 64-bit IEEE data conversion
ieee_le      ASCII        Little-endian 32-bit IEEE data conversion
ultrix       ASCII        Alias for ieee_le
ieee_le_dp   ASCII        Little-endian 32-bit IEEE data conversion; floating-point is 64 bits
ultrix_dp    ASCII        Alias for ieee_le_dp
t3e          ASCII        Cray 64-bit IEEE data conversion; denormalized numbers flushed to zero
t3d          ASCII        Alias for t3e
user         ASCII        User-defined data conversion
site         ASCII        Site-defined data conversion


Table 12-6. Conversion types on Cray MPP systems

-N option    -C default   Meaning
none         none         No data conversion
default      default      No data conversion
cray         ASCII        Cray PVP (non-IEEE) data conversion
ieee         ASCII        Generic 32-bit IEEE data conversion
ieee_32      ASCII        Alias for ieee
t3e          ASCII        No data conversion
t3d          ASCII        No data conversion
user         ASCII        User-defined data conversion
site         ASCII        Site-defined data conversion


Table 12-7. Conversion types on CRAY T90/IEEE systems

-N option    -C default   Meaning
none         none         No data conversion
default      default      No data conversion
cray         ASCII        Cray PVP (non-IEEE) data conversion
ibm          EBCDIC       IBM data conversion
ibm_dp       EBCDIC       IBM data conversion; floating-point is 64 bits
ieee         ASCII        Generic 32-bit IEEE data conversion
ieee_32      ASCII        Alias for ieee
ieee_64      ASCII        No data conversion
ieee_dp      ASCII        IEEE data conversion; floating-point is 64 bits
user         ASCII        User-defined data conversion
site         ASCII        Site-defined data conversion


Table 12-8. Conversion types on SGI IRIX (MIPS)

-N option    -C default   Meaning
none         none         No data conversion
default      default      No data conversion
cray         ASCII        Cray PVP (non-IEEE) data conversion
mips         ASCII        No data conversion
user         ASCII        User-defined data conversion
site         ASCII        Site-defined data conversion
ieee         ASCII        Generic 32-bit IEEE data conversion
ieee_32      ASCII        Alias for ieee
ieee_64      ASCII        Cray 64-bit IEEE data conversion
ieee_le      ASCII        Little-endian 32-bit IEEE data conversion
vax          ASCII        DEC VAX/VMS data conversion
vms          ASCII        Alias for vax


Cray Research supports the following implicit data conversion:

  • Conversion of the supported tape and disk formats and data types through standard Fortran formatted, unformatted, list-directed, and namelist I/O and through BUFFER IN and BUFFER OUT statements.

  • Conversion of the supported tape/disk record formats only for AQIO or CALL READ/WRITE. No data item conversion is performed.

Generally, read, write, and rewind are supported for all record formats. Other capabilities, such as backspace and GETPOS/SETPOS, are usually not available, but they can be made to work if a blocking type that supports them can be used. See the sections on the specific layers for complete details.

If you select the -N option, the libraries perform data conversion for Fortran unformatted I/O statements and for BUFFER IN and BUFFER OUT statements. Data is converted between its Cray Research representation and a foreign representation, according to its Fortran data type. Table 12-9 describes the conversion performed for each of the conversion types.

For numeric data conversions, most foreign data elements are defined with fewer bits than their corresponding Cray Research data elements. If the value in a Cray Research element is too large to fit in the foreign element, the foreign element is set to the largest or smallest possible value; no error is generated. When converting from a Cray Research element to a smaller foreign element, precision is also lost due to truncation of the floating-point mantissa.
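
The overflow and precision behavior described above can be demonstrated with a small Python sketch. It uses big-endian 32-bit IEEE as the "foreign" element and a 64-bit Python float as the "native" element, which is an assumption made purely for illustration. The function names are hypothetical, and struct rounds to nearest rather than truncating the mantissa, so the exact low-order bits may differ from what the library produces.

```python
import struct

# Largest finite value representable in a 32-bit IEEE element.
F32_MAX = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]


def to_foreign_f32(value: float) -> bytes:
    """Pack a 64-bit float into 32 bits, saturating instead of failing."""
    clamped = max(-F32_MAX, min(F32_MAX, value))  # out-of-range values are pinned
    return struct.pack(">f", clamped)


def from_foreign_f32(raw: bytes) -> float:
    """Unpack a big-endian 32-bit IEEE value into a 64-bit float."""
    return struct.unpack(">f", raw)[0]


# A value too large for the foreign element becomes the largest value; no error.
print(from_foreign_f32(to_foreign_f32(1.0e300)) == F32_MAX)
# A value that needs more mantissa bits than the foreign element has is altered.
print(from_foreign_f32(to_foreign_f32(0.1)) == 0.1)
```

Because no error is raised in either case, a program that writes foreign data may need its own range checks if silent saturation is unacceptable.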

If the assign -N user or assign -N site command is specified, the user or site must provide the numeric data conversion routines. These routines follow the same calling conventions as the other explicit routines.

Table 12-9. Supported foreign I/O formats and default data types

Vendor data type: IBM
Record formats: U, F, FB, V, VB, VBS

    Foreign data type          Cray Research data type
    INTEGER*2                  INTEGER(24/32)
    INTEGER*4                  INTEGER(64)
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX*4                  COMPLEX
    LOGICAL*4                  LOGICAL
    CHARACTER (EBCDIC)         CHARACTER (ASCII)

Vendor data type: VMS
Record formats: F, V, S for tape; bb or disk and tr types

    Foreign data type          Cray Research data type
    INTEGER*2                  INTEGER(24/32)
    INTEGER*4                  INTEGER(64)
    REAL*4                     REAL(64)
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX*4                  COMPLEX
    LOGICAL*4                  LOGICAL
    CHARACTER (ASCII)          CHARACTER (ASCII)

Vendor data type: CDC (60 bit)
Record formats: Subtype: DISK, I, SI; Block record: IW, CW, CZ, CS

    Foreign data type          Cray Research data type
    INTEGER                    INTEGER
    REAL                       REAL
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX                    COMPLEX
    LOGICAL                    LOGICAL
    CHARACTER (display code)   CHARACTER (ASCII)

Vendor data type: CDC NOS/VE
Record formats: F, S, V

    Foreign data type          Cray Research data type
    INTEGER                    INTEGER
    REAL                       REAL
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX                    COMPLEX
    LOGICAL                    LOGICAL
    CHARACTER                  CHARACTER (ASCII)

Vendor data type: CDC/ETA CYBER205
Record formats: W type

    Foreign data type          Cray Research data type
    INTEGER                    INTEGER
    REAL                       REAL
    REAL*4                     INTEGER(24/32) (see Note 1)
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX                    COMPLEX
    LOGICAL                    LOGICAL
    CHARACTER (display code)   CHARACTER (ASCII)

Vendor data type: IEEE
Record formats: None defined (often f77)

    Foreign data type          Cray Research data type
    INTEGER*2 (see Note 2)     INTEGER(24/32)
    INTEGER*4                  INTEGER(64)
    REAL*4                     REAL(64)
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX*4                  COMPLEX
    LOGICAL*4                  LOGICAL
    CHARACTER (ASCII)          CHARACTER (ASCII)

Vendor data type: ULTRIX
Record formats: f77.vax

    Foreign data type          Cray Research data type
    INTEGER*2                  INTEGER(24/32)
    INTEGER*4                  INTEGER(64)
    REAL*4                     REAL(64) (see Note 3)
    DOUBLE PRECISION           DOUBLE PRECISION
    COMPLEX*4                  COMPLEX
    LOGICAL*4                  LOGICAL
    CHARACTER (ASCII)          CHARACTER (ASCII)

Note 1: The CYBER 205 half-precision type maps to the Cray short integer (INTEGER*2) type

Note 2: On Cray MPP systems, INTEGER*2 data is not converted implicitly. Explicit conversion is supported.

Note 3: Special data conversion types ibm_dp , ieee_dp , ultrix_dp , and vms_dp are available. These types modify the conversion of real data if any of the following conditions apply to all real data items that are written or read to a unit with implicit data item conversion: the I/O list item is of type DOUBLE PRECISION and the -dp compiler option was specified when compiled on the Cray Research system; the I/O list item is of type REAL*8 and the other vendor supports REAL*8 as 8-byte real; or the I/O list item is of type REAL*8 or REAL and the program was compiled on the other (foreign) vendor system with an option which maps REAL*8 or REAL to 8-byte real.

For implicit conversion, specify format characteristics on an assign command.

The file can be any of the following:

  • A magnetic tape

  • A disk file

  • A file transferred from a front end with the station

When a Fortran I/O operation is performed on the file, the appropriate file format and data conversions are performed during the I/O operation. Data conversion is performed on each data item, based on the type of the Fortran variable in the I/O list.

For example, if the first read of a foreign format file is the following, the library interprets any blocking structures in the file that precede the first data record:

READ (10) INT,FLOAT1,FLOAT2

These vary depending on the file type and record format. The first 32 bits of data (in IBM format, for example) are extracted, sign-extended, and stored in the INT Fortran variable. The next 32 bits are extracted, converted to native floating-point format, and stored in the FLOAT1 Fortran variable.

The next 32 bits are extracted, converted, and stored into the FLOAT2 Fortran variable. The library then skips to the end of the foreign logical record. When writing from a native system to a foreign format (for example, if in the previous example WRITE(10) was used), precision is lost when converting from a 64-bit representation to 32-bit representation.
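
The READ example above can be traced by hand with the following Python sketch, which processes one already-deblocked IBM-format record. It is a simplified illustration rather than the library's code: the function names are hypothetical, and it assumes IBM single-precision hexadecimal floating-point (a sign bit, a 7-bit excess-64 base-16 exponent, and a 24-bit fraction).

```python
import struct


def ibm32_to_float(word: int) -> float:
    """Convert a 32-bit IBM hexadecimal floating-point word to a native float."""
    sign = -1.0 if word & 0x80000000 else 1.0
    exponent = (word >> 24) & 0x7F           # excess-64, base 16
    fraction = word & 0x00FFFFFF             # 24-bit fraction, no hidden bit
    return sign * (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)


def read_int_two_floats(record: bytes):
    """Mimic READ (10) INT,FLOAT1,FLOAT2 on one foreign logical record."""
    (int_val,) = struct.unpack(">i", record[0:4])   # signed unpack sign-extends
    w1, w2 = struct.unpack(">II", record[4:12])
    # Any bytes after the third item are skipped, as the library skips
    # to the end of the foreign logical record.
    return int_val, ibm32_to_float(w1), ibm32_to_float(w2)


record = bytes.fromhex("fffffffe" "42640000" "c276a000") + b"unread tail"
print(read_int_two_floats(record))   # prints (-2, 100.0, -118.625)
```

The native values here are exact because a 64-bit float has more mantissa bits than the 32-bit foreign element. The reverse direction is where precision is lost.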

Choosing a Conversion Method

As with any software process, the various options for data conversion have advantages and disadvantages, which are discussed in this section. As a set, various data conversion options provide choices in methods of file processing for front-end systems. No one option is best for all applications.

Station Conversion (Not Available on IRIX systems)

The following are some of the advantages of using the station software to convert data:

  • The system overhead associated with data conversion is placed on the front-end system rather than on the UNICOS system.

  • You do not have to change source code.

  • Your Cray Research job processes only Cray Research format data.

Some disadvantages of using the front-end station for conversion include the following:

  • Binary data cannot be converted.

  • Front-end systems have a relatively slow processing speed.

Explicit Conversion

Explicit data conversion has some distinct advantages over using station software, including the following:

  • Direct control over data conversion is provided (including some options not available through implicit conversion).

  • Programmers can control the conversion, and they can do the conversion at a convenient and appropriate time.

  • Conversion is usually performed on large data areas as vector operations, increasing performance.

One disadvantage of using explicit conversion is that explicit routines require changes to the source code.

Implicit Conversion

An advantage when using implicit conversion is that you do not have to change the source code.

The following are disadvantages when using implicit conversion:

  • Job Control Language (JCL) or script changes are required on the assign(1) or asgcmd(1) command (asgcmd is not available on IRIX systems).

  • Conversion is less efficient on a record-by-record basis.

  • Conversion is done at I/O time according to the declared data types, allowing little flexibility for nonstandard requirements.

Disabling Conversion Types (Not Available on IRIX systems)

The subroutines required to handle data conversion must be loaded into absolute binary files. By default, the run-time libraries include references to routines required to support the forms of implicit conversion enabled in the foreign data conversion configuration file, usually named <fdcconfig.h>.

If an application requires the use of a conversion routine that is not loaded by default, it can use the loader directives files to activate the routines that support that type of conversion.

It is possible to activate these conversion types for an entire site by using the UNICOS installation tool. Use the following nested menu options:

Configure system==>
  SEGLDR loader configuration==>
    Define optional SEGLDR HARDREF directives==>

This adds the needed directives in the site-configurable SEGLDR directives file.

Foreign Conversion Techniques

This section contains some tips and techniques for the following conversion types:

Conversion type 

Convert data to/from

CDC 60-bit conversion 

CDC CYBER 60-bit machines

COS files 

COS systems

CYBER 205 conversion 

CDC CYBER 205 and ETA machines

CTSS text files 

CTSS format text files

IBM conversion 

IBM machines

IEEE conversion 

Various types of workstations and different vendors that support IEEE floating-point format

NOS/VE conversion 

CDC CYBER machines that run NOS/VE

VAX/VMS conversion 

DEC VAX machines that run VMS

CDC CYBER NOS (VE and NOS/BE 60-bit) Conversion

Tape formats are physical structures that the operating system superimposes over the user-declared CYBER record manager file structure. The FFIO system supports I-format (internal) and SI-format (system or SCOPE internal) tape formats and files transmitted to the UNICOS operating system from a CYBER disk.

I-format tapes have block sizes that range from 0 to 512 words in exact multiples of 60-bit words. Each block includes a 48-bit block terminator.

SI-format tapes also have block sizes ranging from 0 to 512 words in exact multiples of 60-bit words. Any block smaller than the maximum size (512 words) contains a 48-bit block terminator. This terminator has the same format as that used for I-format tapes.

All CDC sequential tape files are blocked; a block may contain partial records or one or more records. The block structure is intertwined with the physical tape format. Fortran programmers do not use block boundaries. The translation routines construct blocks from records supplied by users and supply users with records as required.

Two CYBER blocking types are supported:

  • I (internal): I-blocking contains internal control words (ICWs), which are similar to Cray Research block control words (BCWs). Each block contains an ICW followed by maximum block size in words. The maximum block size (mbs) parameter specifies the maximum number of characters. Records can also span block boundaries. Only W-type records can be used with I-type blocks.

  • C (character count): C-blocking implies no blocking. Each block has a fixed number of characters with no special control words internally. The mbs parameter specifies the maximum number of characters. You can span records (except CYBER type S) across block boundaries.

A record is the unit of information that is processed on each call for reading or writing. Therefore, the CDC translation routines on UNICOS systems must be aware of the following record types and their analogous structures. Partial record input and output can be achieved using standard buffer I/O requests.

  • W-type records are prefixed with a CDC-supplied record control word. The processing of this control word is invisible to users. FFIO routines determine the record length by looking at the control words. This record type is comparable to COS blocked records.

  • Z-type records are card-image data. Each record is terminated by a 12-bit byte of zeros in the low-order position of the last 60-bit CYBER word in the record.

  • S-type records are system-logical records that contain fixed-size blocks of data terminated by a short block to which is appended a 48-bit level number. The RS parameter specifies the maximum number of characters in the record. The record length should be a multiple of ten 6-bit characters. The assign(1) command determines conversion characteristics.

Several NOS/VE record formats are supported. One noteworthy restriction is that the nosve.v format is not supported on tape.

COS Conversions

The UNICOS operating system uses COS blocking primarily for Fortran unformatted sequential files.

The COS operating system uses COS blocking for all blocked files. Because the data formats (floating point, character, logical, and integer) are the same as on UNICOS systems with CRI floating point format, and because COS blocking is the default blocking format on the UNICOS operating system, no conversion is necessary when moving unformatted blocked sequential files between the UNICOS operating system on Cray PVP systems with CRI floating point and the COS operating system. Two common file types on COS require some conversion to make them useful on the UNICOS operating system.

The first of these file types is COS blocked text files. To handle these, a combination of the cos and the blankx layers is necessary. The blankx layer must process the blank compression that is usually done on COS files. The cos layer processes the COS block and record control words that are present in COS text files. To read or write such a file, use the following command:

assign -F blankx,cos cosfile

With this command, a Fortran program can perform list-directed, formatted, and namelist I/O on cosfile as though it were in UNICOS text format.

On UNICOS systems, you can also use the fdcp command to convert such a file to UNICOS text format.

assign -F blankx,cos cosfile
assign -F text textfile
fdcp cosfile textfile

To create a COS blocked, blank compressed text file, use the following:

assign -F blankx,cos cosfile
assign -F text textfile
fdcp textfile cosfile

If the COS file contains more than one EOF, specify the textfile with assign -F text.eof. This directs the text layer to use the ~e marker in the text file to signify an EOF.

The second of these file types is the COS blocked direct-access file. Direct-access files on the UNICOS operating system do not contain any blocking information; they are fixed-length record files and rely on the record length for the record boundaries.

On the COS operating system, direct-access files are COS blocked. Unformatted direct-access files are indistinguishable from sequential files. Formatted direct-access files are distinguishable from sequential files only because they are not blank compressed.

Because the FFIO system does not support random positioning on COS blocked files, it is not possible to directly read and write COS direct-access files. You can use fdcp to convert the files to a format that can be directly used on the UNICOS operating system. The simplest way to do this is to remove the blocking control words (BCW) from the file. This results in a file that contains all of the fixed-length records in a directly usable format. An example of this follows:

assign -F cos cosfile
fdcp cosfile dafile
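Once the control words are removed, a record is located purely by arithmetic on the record length. The following sketch illustrates that idea outside of Fortran; the 800-byte record length and the helper name read_record are assumptions made for the example.

```python
import io

def read_record(f, recnum, reclen):
    """Return record number recnum (1-based) from an unblocked file.

    The file holds nothing but fixed-length records, so record n starts
    at byte offset (n - 1) * reclen.
    """
    if recnum < 1:
        raise ValueError("record numbers start at 1")
    f.seek((recnum - 1) * reclen)
    data = f.read(reclen)
    if len(data) != reclen:
        raise EOFError("record %d is past the end of the file" % recnum)
    return data

# Stand-in for a converted direct-access file: three 800-byte records
# filled with 'A', 'B', and 'C'.
dafile = io.BytesIO(b"".join(bytes([65 + i]) * 800 for i in range(3)))
third = read_record(dafile, 3, 800)
```

This is why a direct-access file needs no blocking information: the record length alone determines every record boundary.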

The converse operation requires one more command. You must borrow one of the fixed-length record types because a generic fixed-length record type does not exist. An example of this follows:

assign -F cos cosfile
assign -F vms.f.t:800 dafile
fdcp dafile cosfile

CDC CYBER 205 and ETA Conversion

The CYBER 205 layer is limited to the support of the W-type record. These records are not supported directly on tape; however, if tape files are copied to disk (preferably using fdcp), W-type records can be handled there.

A peculiar feature of the CYBER 205 conversion is its handling of half-precision floating-point numbers, which the numeric conversion routines map to short integers. Therefore, to perform I/O on such data, you must equivalence a short integer array to a real array.

Example:

    integer*2 ioarray(100)
    real fltnum(100)
    equivalence (fltnum,ioarray)
    read(1) ioarray
    do 10 i= 1,100
      call calc(fltnum(i))
C     deal with converted half-precision values
10  continue
    end

CTSS Conversion

The FFIO system includes two features that are embedded in the blankx and text layers to process CTSS files. CTSS uses its own text file format and uses blank compression on these files. To read and write most CTSS text files, use the following specification:

blankx.ctss,text.ctss

Because unformatted Fortran files (binary records) are in COS blocked format at CTSS sites running a UNICOS system, conversion is not necessary.

IBM Overview

To convert and transfer data between UNICOS systems and an IBM/MVS or VM system, you must understand the differences between the UNICOS file system and file formats, and those on the IBM system(s). On both VM and MVS, the file system is record oriented.

The most obvious form of data conversion is between the IBM EBCDIC character set and the ASCII character set used on UNICOS systems. Most of the utilities that transfer files to and from the IBM systems automatically convert both the record structures and character set to the UNICOS text format and to ASCII. For example, ftp performs these conversions and does not require any further conversion on UNICOS systems.
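As a rough illustration of the character conversion involved, the sketch below translates text between ASCII and one EBCDIC code page. It assumes code page 037 (Python's "cp037" codec); IBM sites use several EBCDIC variants, so the code page that actually applies is site dependent. The function names are hypothetical.

```python
EBCDIC_CODEC = "cp037"  # assumption: EBCDIC code page 037

def ascii_text_to_ebcdic(text):
    """Encode a text string as EBCDIC bytes."""
    return text.encode(EBCDIC_CODEC)

def ebcdic_to_ascii_text(raw):
    """Decode EBCDIC bytes into a text string."""
    return raw.decode(EBCDIC_CODEC)

ebcdic = ascii_text_to_ebcdic("HELLO 1")
roundtrip = ebcdic_to_ascii_text(ebcdic)
```

Note that this conversion is meaningful only for character data; applying it to binary records corrupts them, which is why binary files are treated separately.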

Binary data, however, is more complicated. You must first find a way to transfer the file and to preserve the record boundaries. If stations are available, this is simple (some examples are shown in the following sections). Few problems are caused by using tapes in transferring the file and preserving record boundaries.

Cray Research supports the following IBM record formats:

Format   Description
U        Undefined record format
F        Fixed-length records, one record per block
FB       Fixed-length, blocked records
V        Variable-length records
VB       Variable-length, blocked records
VBS      Variable-length, blocked, spanned records

For all IBM record formats, the data formats are the same whether you are processing a disk file or processing a tape file.

Using the MVS Station

To convert IBM foreign data that is transferred using the MVS station, you must perform two basic tasks:

  • Convert the data and record formats

  • Move the data between the MVS or VM system and UNICOS systems

An example that is common on IBM MVS systems follows. This example starts with a simple Fortran program that creates some data and moves the data to the MVS system. The following shell script shows the details:

cat > test.f <<EOF
    INTEGER IARR(10)
    REAL RARR(15)

    DO 10 I=1,10
      WRITE(1) IARR,RARR
10  CONTINUE
    STOP
    END

EOF
f90 test.f                  # compile and load program
segldr test.o               # load program
assign -R                   # reset assign
assign -F cos -N ibm fort.1 # select COS blocking (normal default)
                            # and IBM numeric data conversion
./a.out                     # execute
dispose fort.1 -f bb -t 'dsn=array.test,disp=shr'
                            # dispose file to MVS system

This program writes 10 records that contain 10 integer values and 15 real numbers. With the IBM numeric conversion enabled, there are ten 32-bit integers in IBM format and fifteen 32-bit IBM real numbers in these records.
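To make the record contents concrete, the following sketch decodes one such 100-byte record (ten 4-byte integers followed by fifteen 4-byte reals). It relies on the widely documented IBM single-precision layout (sign bit, 7-bit excess-64 base-16 exponent, 24-bit fraction) rather than on anything stated in this chapter, and the function names are hypothetical. It is not the conversion routine used by the library.

```python
import struct

def ibm32_to_float(raw):
    """Convert 4 bytes of IBM single-precision hex floating point."""
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F          # excess-64, base 16
    fraction = word & 0x00FFFFFF            # 24-bit fraction, 0 <= f < 1
    return sign * (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)

def unpack_record(record):
    """Split one record into its 10 integers and 15 reals."""
    if len(record) != 100:
        raise ValueError("expected 10*4 + 15*4 = 100 bytes")
    ints = list(struct.unpack(">10i", record[:40]))   # big-endian 32-bit
    reals = [ibm32_to_float(record[40 + 4 * i:44 + 4 * i]) for i in range(15)]
    return ints, reals

# A synthetic record: integers 0..9 followed by fifteen copies of 1.0.
sample = struct.pack(">10i", *range(10)) + bytes.fromhex("41100000") * 15
ints, reals = unpack_record(sample)
```

The record length of 100 bytes follows directly from the I/O list: 10 integers and 15 reals at 4 bytes each.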

During the dispose operation, the key parameter is -f bb. This directs the station to translate COS blocking to IBM blocking according to the file format defined in the file catalog on MVS. The part of this catalog that defines this file structure is called a DCB.

The -t argument specifies the operands of an IBM DD statement. This Job Control Language (JCL) information varies considerably from job to job and user to user.

A Fortran program can read the file that results on the MVS system without special handling.

An example of reading the file back to a UNICOS system follows. This option is usable for all IBM formats.

cat > test.f <<EOF
     INTEGER IARR(10)
     REAL RARR(15)
     DO 10 I=1,10
       READ(1) IARR,RARR
 10  CONTINUE
     STOP
     END

EOF
f90 test.f -o test.o        # compile and load program
assign -R                   # reset assign
assign -F cos -N ibm fort.1 # select cos blocking (normal default)
                            # and IBM numeric data conversion
fetch fort.1 -f BB -t 'dsname=array.test, disp=shr'
                            # fetch file from MVS system, and
                            # convert blocking from what is on
                            # MVS to COS blocked.
./a.out                     # run the program

The IBM record formats are not used with the assign command because the station can convert the record format for you. If you specify a particular record format instead, it is more difficult to get the station to transfer the data correctly in both directions without additional parameters.

In these examples, the station must interpret the blocking on the IBM/MVS side and translate it into a Cray Research COS blocking format. This is relatively slow and unnecessary because the UNICOS operating system can read the files directly.

The following is an example of a faster IBM MVS station transfer. Use the previous program and try to speed up the station file transfer by using the -f TR option on the fetch and dispose commands. This option causes the station to take the bytes from the IBM disk and to transfer them unchanged and untranslated to the UNICOS system. This produces faster wall-clock times for transferring the data. It costs slightly more in CPU time on a UNICOS system because a UNICOS system must perform the deblocking work previously performed by the station.

The following example assumes that the program that reads the data is compiled and loaded:

assign -R                      # reset assign
assign -F ibm.vbs -N ibm fort.1 # select IBM VBS record format
                               # and IBM numeric data conversion

fetch fort.1 -f TR -t 'dsn=array.test, disp=shr'
                               # fetch file from MVS system in
                               # 'transparent' mode.
./a.out                        # run the program

The example does not create the file on a UNICOS system and send it to MVS in transparent mode because this does not work: you cannot specify proper physical record boundaries in a transparent transfer, so the file that results cannot be used on the MVS system. When fetching and reading files from MVS, however, transparent mode is often the best method.

A third option exists to transfer data between the UNICOS operating system and MVS. It requires more knowledge of both the FFIO layers and the IBM disk formats.

The six supported record formats are basic, low-level record types on the MVS operating system. The record format is stored in a part of the file called a DCB. When accessing data, the MVS system invokes appropriate processing to interpret the data in the file so that the user sees only the data, and not the control information that determines logical record boundaries.

Using MVS JCL, a user can specify to the MVS system that the DCB on a given file should be ignored. For example, if you have a VBS format file and tell MVS to read it as though it were a U format file, the system does not interpret the block and segment information embedded in the data. You see all of the bits in the file on disk, including control words. This is similar to taking a Cray Research blocked file, assigning it as unblocked, and then reading it.

In the same way, when writing a file on the MVS system, you can declare it to be a U format file and the MVS system will not add any control information to your records. This is similar to creating a blocked file on the UNICOS system by inserting your own RCWs and BCWs in an unblocked file. When using the MVS station, the -t option on the fetch or dispose command provides a JCL to define the file and record access methods that MVS uses.

The following example assumes that the program that writes the data is compiled and loaded:

assign -R               # reset assign
assign -F ibm.vbs,cos -N ibm fort.1
                        # select IBM VBS record format with blocks
                        # delimited as COS blocked records. Also
                        # specify IBM numeric data conversion
./a.out                 # run the program
dispose fort.1 -f BB -t 'DCB=(RECFM=U),dsname=array.test,disp=shr'
                        # dispose file to MVS system

In this example, the assign -F spec command includes two layers. The dispose command requests that the COS blocked records on the Cray disk be converted on the MVS side. Using DCB=(RECFM=U) allows the station to write each block as a physical disk block on MVS. This allows the user on the UNICOS system to write the file in whatever format is desired. It also does not rely on the file catalog on the MVS system to determine the format.

An extra step is necessary to change the stored DCB on the MVS file so that programs on the MVS side can read the data correctly. Specifically, a VBS file was created on a UNICOS system and dispose -f bb was used to dispose of it. The station creates and writes the file as a U format file through the TEXT field on the dispose. The station avoids adding control words to the file that are already there. However, you must then correct the DCB for the file to match the real format. You can do an empty dispose that changes only the DCB and leaves the data unchanged. You can also do the empty dispose first, then do the dispose of the data that tells the station to ignore the DCB and to write the file as U format.

Other record formats perform like ibm.vbs. Any of these can be read or written with any IBM MVS station transfer method. If the faster method is used, you must add the requisite ,cos to the -F specification. A list of common record formats follows:

assign -F ibm.v:3280 fort.1    # V format, recsize 3280 bytes
assign -F ibm.vb::16000 fort.1 # VB format,
                               # max block size 16,000 bytes
assign -F ibm.f:1600 fort.1    # F format fixed-length records
                               # of 1600 bytes each.
assign -F ibm.fb:1600:32000 fort.1
                               # Fixed-length records 1600 bytes
                               # each, 20 records per block
assign -F ibm.vbs::1800 fort.1 # VBS format, block size 1800 bytes
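The relationship between the two numbers in the ibm.fb specification can be checked with a short sketch. It assumes an FB block is simply a whole number of fixed-length records laid end to end; deblock_fb is a hypothetical helper, not part of FFIO.

```python
def deblock_fb(block, recsize):
    """Split one FB block into its fixed-length records."""
    if recsize <= 0 or len(block) % recsize != 0:
        raise ValueError("block length must be a multiple of the record size")
    return [block[i:i + recsize] for i in range(0, len(block), recsize)]

# A full block for ibm.fb:1600:32000 holds 32000 / 1600 = 20 records.
full_block = bytes(32000)
records = deblock_fb(full_block, 1600)

# The last block of a file may be shorter, but it still holds whole records.
short_block = bytes(3 * 1600)
last_records = deblock_fb(short_block, 1600)
```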

The U format is the exception. Unlike all other IBM formats, it does not contain any IBM control words to delimit records, and it lacks a fixed record size. To delimit logical records, the U format relies completely on the physical blocks on an IBM disk, or on tape blocks.

These physical records must be translated into some other form to be preserved so that the file will be interpreted correctly on the UNICOS system. This is usually done with a lower-level layer.

Data Transfer between UNICOS and VM

The primary difference between the VM station and the MVS station is that the record types described here are not known to the VM system, but only to VM applications. The stations are VM applications. Files are stored in a manner similar to that of MVS, in that variable-length blocks exist on the disk. Each of these blocks appears the same as it does on MVS, with block descriptor words and segment descriptor words. The VM system, however, is unaware of the control words stored in the blocks; therefore, you do not have to fool the system into thinking that the file is in a format that it is not. VM has the same limitations as the MVS station with the usage of TR transfers.

The VM station also handles dispose commands differently than the other stations. The dispose -f command must be set properly to avoid record truncation and unwanted binary data conversion. When disposing directly to CMS minidisk, you must have a cooperating process running on your virtual machine. See the IBM VM Station Command and Reference for COS, publication SI-0160.

Two basic methods exist to fetch and read a VB format file on VM/CMS disk from a VM/CMS front end. The examples shown here are abbreviated and do not include all parameters:

  • You can fetch the file from the front end by using fetch -f bb. Each disk block from the VM system is changed to a COS record. This preserves the control words embedded in the VB data, as in the following:

    fetch DATA -f BB -t 'dsn=file.name,disp=shr'

    assign -F ibm.vb,cos -N ibm DATA

  • You can fetch the file from VM by using the fetch -f tr (transparent) mode. Use the assign command to declare a file to be read in VB format, and set up the -F specification appropriately, as in the following:

    fetch TRDATA -f tr -t 'dsname=file.name,disp=shr'

    assign -F ibm.vb -N ibm TRDATA
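The control words embedded in VB data can be pictured with the following sketch. It uses the commonly documented IBM layout, which this chapter does not spell out: each block starts with a 4-byte block descriptor word whose first two bytes give the block length, and each record starts with a 4-byte record descriptor word whose first two bytes give the record length, both lengths including the descriptor itself. Spanned (VBS) segments are not handled, and the function names are hypothetical.

```python
import struct

def build_vb_block(records):
    """Wrap records in RDWs and prepend a BDW (illustrative only)."""
    body = b"".join(struct.pack(">HH", len(r) + 4, 0) + r for r in records)
    return struct.pack(">HH", len(body) + 4, 0) + body

def parse_vb_block(block):
    """Return the logical records contained in one VB block."""
    block_len, _ = struct.unpack(">HH", block[:4])
    if block_len != len(block):
        raise ValueError("BDW length does not match the block size")
    records, pos = [], 4
    while pos < block_len:
        rec_len, _ = struct.unpack(">HH", block[pos:pos + 4])
        if rec_len < 4 or pos + rec_len > block_len:
            raise ValueError("bad RDW at offset %d" % pos)
        records.append(block[pos + 4:pos + rec_len])
        pos += rec_len
    return records

block = build_vb_block([b"abc", b"", b"hello world"])
parsed = parse_vb_block(block)
```

In a bb transfer each such block arrives as one COS record, which is why the ,cos layer is stacked under ibm.vb; in a tr transfer the blocks follow one another directly in the byte stream.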

Workstation and IEEE Conversion

IRIX systems use 32-bit IEEE standard floating point, as do many workstations and personal computers. These workstations often use a dialect of UNIX software as the operating system, with twos-complement arithmetic and the ASCII character set. The logical values in these implementations are usually the same for Fortran and C. They use zero for false and nonzero for true. It is also common to see the f77 record blocking used by the Fortran run-time library on unformatted sequential files.

No IEEE record format exists, but the IEEE implicit and explicit data conversion routine facilities are provided with the assumption that many of these things are true.

Most computer systems that use the IEEE data formats run operating systems based on UNIX software and use f77 record blocking. You can use the rcp or ftp commands to transfer files. In most cases, the following command should work for implicit conversion:

assign -F f77 -N ieee fort.1

When writing files in the f77 format, remember that you can gain a large performance boost by ensuring that the records being written fit in the working buffer of the f77 layer.
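For readers unfamiliar with f77 record blocking, the sketch below shows the convention used by many UNIX Fortran run-time libraries: each record is preceded and followed by a 4-byte count of the bytes in the record. This layout is not described in this chapter, and the big-endian byte order and 4-byte marker width are assumptions; both vary between systems. The function names are hypothetical.

```python
import io
import struct

def write_f77_record(f, payload):
    """Write one record with leading and trailing 4-byte length markers."""
    marker = struct.pack(">I", len(payload))
    f.write(marker + payload + marker)

def read_f77_record(f):
    """Read one record; return None at end of file."""
    head = f.read(4)
    if not head:
        return None
    if len(head) != 4:
        raise ValueError("truncated record marker")
    (nbytes,) = struct.unpack(">I", head)
    payload = f.read(nbytes)
    tail = f.read(4)
    if len(payload) != nbytes or tail != head:
        raise ValueError("record markers do not match")
    return payload

# One record holding a 32-bit integer and three 32-bit IEEE reals.
stream = io.BytesIO()
write_f77_record(stream, struct.pack(">i3f", 7, 1.0, 2.5, -3.0))
stream.seek(0)
values = struct.unpack(">i3f", read_f77_record(stream))
at_eof = read_f77_record(stream)
```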

Silicon Graphics MIPS systems use IEEE floating-point representation, so IEEE conversion is usually unnecessary when reading or writing IEEE data on these systems.

The Cray T90/IEEE and Cray MPP systems both use IEEE floating-point representations. However, they differ from most workstations in that the default data size is 64-bits instead of 32-bits.

On Cray T90/IEEE systems there are no 32-bit native data types, so any and all 32-bit IEEE data types must be read or written with an IEEE data conversion layer (for example, assign -N ieee, or assign -N ieee_dp).

On Cray MPP systems, data types can be declared as 32-bits in size and can then be read or written directly. This is the most direct and efficient method to read or write data files for IEEE workstations. The user can either alter the declarations of the variables used in the Fortran I/O list to declare them as KIND=4 or as REAL*4 (or INTEGER*4), or all the variables in the program can be resized by compiling with the -s default32 compiler option.

For example, to read a file on a Cray MPP system which has 32-bit integers and 64-bit IEEE floating-point numbers, consider the following code fragments. Existing program:

REAL         RVAL      ! Default size (64-bits)
INTEGER      IVAL      ! Default size (64-bits)
...
READ (1) IVAL, RVAL

This program will expect both the integer and floating-point data to be the same size (64-bits). However, it can be modified to declare the variables to be the same size as the expected data. Modified program (#1):

REAL    (KIND=8) RVAL      ! Explicit 64-bits
INTEGER (KIND=4) IVAL      ! Explicit 32-bits
...
READ (1) IVAL, RVAL

This program will correctly read the expected data. However, if this type of modification is too extensive, only the variables used in the I/O statement list need be modified. Modified program (#2):

REAL             RVAL      ! Default size (64-bits)
INTEGER          IVAL      ! Default size (64-bits)
REAL    (KIND=8) RTMP      ! Explicit 64-bits
INTEGER (KIND=4) ITMP      ! Explicit 32-bits
...
READ (1) ITMP, RTMP

Change explicitly sized data to default sized data:

RVAL = RTMP
IVAL = ITMP
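The size mismatch is easy to see by decoding the record payload by hand. The sketch below is only an illustration in Python: it assumes big-endian IEEE data and ignores any record blocking around the payload, and decode_record is a hypothetical name.

```python
import struct

RECORD_LAYOUT = ">id"  # one 32-bit integer, then one 64-bit IEEE real

def decode_record(payload):
    """Decode a payload written as INTEGER(KIND=4), REAL(KIND=8)."""
    ival, rval = struct.unpack(RECORD_LAYOUT, payload)
    return ival, rval

payload = struct.pack(RECORD_LAYOUT, 42, 3.5)
ival, rval = decode_record(payload)

# Treating both items as 64-bit values asks for 16 bytes, but the
# record holds only 12, so the unmodified program cannot read it.
expected_by_default_sizes = struct.calcsize(">qd")
actual_size = len(payload)
```

Like the ITMP and RTMP temporaries above, the explicitly sized fields are decoded first; the values can then be assigned to variables of any size.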

On MIPS systems, data types can be declared as 64-bits in size and can then be read or written directly. This is the most direct and efficient method to read or write data files for Cray IEEE systems. The user can either alter the declarations of the variables used in the Fortran I/O list to declare them as KIND=8 or as REAL*8 (or INTEGER*8), or all the variables in the program can be resized by compiling with the -r8 (or -i8) compiler option.

The following are other IEEE data conversion variants; not all variants are available on all systems:

ieee or ieee_32 

The default workstation conversion specification. Data sizes are based on 32-bit words.

ieee_64 

The default IEEE specification on Cray T90/IEEE and Cray MPP systems. Data sizes are based on 64-bit words.

ieee_dp 

Data sizes are based on 32-bit words except for floating-point data which is based on 64-bit words.

ieee_le or ultrix 

Data sizes are based on 32-bit words and are little-endian.

ieee_le_dp or ultrix_dp 

Data sizes are based on 32-bit words except for floating-point data which is based on 64-bit words. All data is little-endian.

mips 

Data sizes are based on 32-bit words except for 128-bit floating-point data which uses a "double double" format.
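The difference between the big-endian and little-endian variants is only the order of the bytes within each item. A small sketch, using a hypothetical helper float32_bytes, shows the same 32-bit IEEE value in both orders:

```python
import struct

def float32_bytes(value, little_endian=False):
    """Return the 4-byte IEEE single-precision encoding of value."""
    return struct.pack("<f" if little_endian else ">f", value)

big = float32_bytes(1.0).hex()            # order used by ieee / ieee_32
little = float32_bytes(1.0, True).hex()   # order used by ieee_le / ultrix
double_big = struct.pack(">d", 1.0).hex() # 64-bit floating point, as in ieee_dp
```

Here big is "3f800000", little is "0000803f", and double_big is "3ff0000000000000".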

VAX/VMS Conversion

Nine record types are supported for VAX/VMS record conversion. This includes a combination of three record types and the three types of storage medium, as defined in the following list:

Record type   Definition
f             Fixed-length records
v             Variable-length records
s             Segmented records

Media         Definition
tr            For transparent access to files
bb            For unlabeled tapes and bb station transfers
tape          For labeled tapes

Segmented records are mainly used by VAX/VMS Fortran. The following are examples of some combinations of segmented records in different types of storage media:

Example      Definition
vms.s.tr     Use as an FFIO specification to read or write a file containing segmented records with transparent access. In the fetch and dispose commands, specify the -f tr option for the file.
vms.s.tape   Use as an FFIO specification to read or write a file containing segmented records on a labeled tape.
vms.s.bb     Use as an FFIO specification to read or write a file containing segmented records on an unlabeled tape. In the fetch and dispose commands, specify the -f bb option for the file if it is not a tape.

The VAX/VMS system stores its data as a stream of bytes on various devices. UNICOS systems number their bytes from the most-significant bits to the least-significant bits, while the VAX system numbers the bytes from lowest-significance up. The station and a UNICOS system make this byte-ordering transparent when you use text files. When data conversion is used, byte swapping sometimes must be done.
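As an illustration of why byte swapping enters numeric conversion, the sketch below decodes a VAX single-precision (F_floating) value from its four bytes as stored on the VAX. It relies on the commonly documented F_floating layout, which this chapter does not describe: the value is held as two little-endian 16-bit words, with the sign, an excess-128 exponent, and the high fraction bits in the first word, and the fraction is normalized as 0.1f in binary. The function name is hypothetical, and reserved operands are not handled.

```python
import struct

def vax_f_to_float(raw):
    """Decode 4 bytes of VAX F_floating, given in VAX memory order."""
    if len(raw) != 4:
        raise ValueError("F_floating values are 4 bytes long")
    # Swap the bytes within each 16-bit word so that the sign and
    # exponent land in the most-significant byte.
    swapped = bytes((raw[1], raw[0], raw[3], raw[2]))
    (word,) = struct.unpack(">I", swapped)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 23) & 0xFF            # excess-128
    fraction = word & 0x007FFFFF              # 23 stored fraction bits
    if exponent == 0:
        return 0.0                            # zero (reserved operand ignored)
    mantissa = (0x00800000 | fraction) / float(1 << 24)   # 0.1f in binary
    return sign * mantissa * 2.0 ** (exponent - 128)

one = vax_f_to_float(bytes.fromhex("80400000"))
minus_one = vax_f_to_float(bytes.fromhex("80c00000"))
hundred = vax_f_to_float(bytes.fromhex("c8430000"))
```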

Character conversion is not necessary for text files transferred using the station because the VAX/VMS system uses the same ASCII character set as UNICOS systems. The station software correctly handles byte ordering for text files.

The process is similar for binary data. You can use the fetch and dispose commands to move data between the UNICOS system and the VAX. You can move data in several ways.

You can use the -f bb transfer format, which places the burden of blocking and unblocking the transferred data on the VAX station. The VAX station must convert VMS blocking to COS blocking when sending data to the UNICOS system, and COS blocking to VMS blocking in the other direction.

You can also use the -f tr transfer option, which results in record formats on UNICOS systems that use the tr subfield in the vms layer specification.

When using -f bb transfers, you must know the format of the file on the VAX side and specify the proper record format when reading/writing the data.

Most VAX/VMS users are aware of only two basic record types, which are V and F (variable length and fixed length). F format on the VAX maps directly to the vms.f record types on UNICOS systems.

fetch TESTIN -f bb -t 'CRAYMH"USER PASSWD"::SEG.DAT'
assign -a TESTIN -F vms.s.bb::10000,cos -N vms fort.1
assign -a TESTOUT -F vms.s.bb::10000,cos -N vms fort.22
f90 test.f
./a.out
dispose TESTOUT -f BB -t 'CRAYMH"USER PASSWD"::[]SEG2.DAT'

    PROGRAM FRED
    REAL REAL(100)
    INTEGER*2 SHORT(50)
C
C   NOTE THAT BECAUSE -N VMS IS IN USE, THE SHORT INTEGERS
C   BECOME 64 BITS ON THE CRAY RESEARCH SYSTEM
C
10  READ (1,END=99) SHORT, REAL
    CALL PROCESS (STAT,SHORT,REAL)
C
C   ...AND ARE CONVERTED BACK ON OUTPUT
C
    WRITE(22) SHORT, REAL
    GOTO 10
99  STOP
    END
    SUBROUTINE PROCESS (STAT,SHORT,REAL)
    REAL REAL(100)
    INTEGER SHORT(50)
C
C   PROCESS THE DATA
C
    RETURN
    END

Because the default record type produced on UNICOS systems is v, special work is not required to dispose vms.s.bb or vms.v.bb records. It is also easy to fetch data from the station because the station reads the data properly without user intervention. When using fixed-length record format, you must add the following information to the TEXT field on the dispose command:

-t 'CRAYMH"USER PASSWD"::SEG.DAT/RFM=FIX/MRS=256/NORAT'

This dispose command works properly for a file created with the vms.f.tr:256 specification.

The RFM, MRS, and NORAT parameters specify a fixed-length record format file, a maximum record size of 256 bytes, and no record attributes.

Implicit Numeric Conversions (Cray PVP Systems Only)

The following segldr HARDREF directives select optional implicit numeric conversions to include in the standard libraries compiled into user programs by default.

Directive                            FFIO option
HARDREF=CRAY2IBM HARDREF=IBM2CRAY    Cray<->IBM implicit numeric conversion
HARDREF=CRAY2VAX HARDREF=VAX2CRAY    Cray<->VAX/VMS implicit numeric conversion
HARDREF=CRAY2NVE HARDREF=NVE2CRAY    Cray<->NOS/VE implicit numeric conversion
HARDREF=CRAY2IEG HARDREF=IEG2CRAY    Cray<->IEEE implicit numeric conversion
HARDREF=CRAY2ETA HARDREF=ETA2CRAY    Cray<->ETA implicit numeric conversion
HARDREF=CRAY2CDC HARDREF=CDC2CRAY    Cray<->CDC 60-bit implicit numeric conversion
HARDREF=CRI2IBM HARDREF=IBM2CRI      Cray IEEE<->IBM
HARDREF=CRI2IEG HARDREF=IEG2CRI      Cray IEEE<->Generic IEEE