Chapter 7. File Structures

A file structure defines the way that records are delimited and how the end-of-file is represented.

Five distinct native file structures are used on UNICOS and UNICOS/mk systems: unblocked, pure, text, cos or blocked, and tape or bmx. On IRIX systems, the unblocked, pure, text, and F77 structures are used.

The assign -s ft command selects one of four forms of file processing for an unblocked file structure: unblocked (unblocked), standard binary (sbin), binary (bin), and undefined (u). These forms differ in the I/O package used to access the records of the file, in file truncation and data alignment, and in how an endfile record is recognized in a file.

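The options allowed with the assign -s ft command, as listed in Table 7-1 and described in this chapter, are the following:

bin (not recommended)
blocked or cos
bmx or tape
sbin (not recommended)
text
u
unblocked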

For more information about valid arguments to the assign -F command, see “File Structure Selection” in Chapter 6. Table 7-1 summarizes the Fortran access methods and options.

Table 7-1. Fortran access methods and options

Access and form                      assign -s ft defaults   assign -s ft options
-----------------------------------  ----------------------  ---------------------------------
Unformatted sequential               blocked / cos*          bin, sbin, u, unblocked, bmx/tape
BUFFER IN / BUFFER OUT
Unformatted direct                   unblocked               bin, sbin, u, unblocked
Formatted sequential                 text                    blocked, cos, sbin/text, bmx/tape
Formatted direct on UNICOS systems   text                    sbin/text
Formatted direct on IRIX systems     unblocked               u, unblocked
Any type of sequential, formatted,   bmx/tape                bmx/tape
unformatted, or buffer I/O to tape

* UNICOS systems only

On IRIX systems, you cannot specify the default for unformatted sequential access with assign -s. You must use assign -F f77.

Unblocked File Structure

A file with an unblocked file structure contains undelimited records. Because it does not contain any record control words, it does not have record boundaries. The unblocked file structure can be specified for a file that is opened with either unformatted sequential access or unformatted direct access. It is the default file structure for a file opened as an unformatted direct-access file.

Do not use a BACKSPACE statement to reposition a file that has an unblocked file structure. Because record boundaries do not exist, the file cannot be repositioned to a previous record.

BUFFER IN and BUFFER OUT statements can specify a file that has an unbuffered and unblocked file structure. If the file is specified with assign -s u, BUFFER IN and BUFFER OUT statements can perform asynchronous unformatted I/O.

You can specify the unblocked data file structure by using the assign(1) command in several ways. All methods result in a similar file structure but with different library buffering styles, use of truncation on a file, alignment of data, and recognition of an endfile record in the file. The following unblocked data file structure specifications are available:

Specification          Structure
---------------------  -----------------------------------------------
assign -s unblocked    Library-buffered
assign -F system       No library buffering
assign -s u            No library buffering
assign -s sbin         Standard-I/O-compatible buffering; for example,
                       both library and system buffering

The type of file processing for an unblocked data file structure depends on the assign -s ft option declared or assumed for a Fortran file.

assign -s unblocked File Processing

An I/O request for a file specified using the assign -s unblocked command does not need to be a multiple of a specific number of bytes. Such a file is truncated after the last record is written to the file. Padding occurs for files specified with the assign -s bin command and the assign -s unblocked command. Padding usually occurs when noncharacter variables follow character variables in an unformatted direct-access file.

No padding is done in an unformatted sequential-access file. An unformatted direct-access file created by a Fortran program on a UNICOS or UNICOS/mk system, or with the MIPSpro 7 Fortran 90 compiler on IRIX systems, contains records that are all the same length. The endfile record is recognized in sequential-access files.

assign -s sbin File Processing (Not Recommended)

You can use an assign -s sbin specification for a Fortran file that is opened with either unformatted direct access or unformatted sequential access. The file does not contain record delimiters. The file created for assign -s sbin in this instance has an unblocked data file structure and uses unblocked file processing.

The assign -s sbin option can be specified for a Fortran file that is declared as formatted sequential access. Because the file contains records that are delimited with the new-line character, it is not an unblocked data file structure. It is the same as a text file structure.

The assign -s sbin option is compatible with the standard C I/O functions. See Chapter 5, “System and C I/O”, for more details.


Note: Use of assign -s sbin is discouraged. Use assign -s text for formatted files, and assign -s unblocked for unformatted files.


assign -s bin File Processing (Not Recommended)

An I/O request for a file that is specified with assign -s bin does not need to be a multiple of a specific number of bytes. On UNICOS and UNICOS/mk systems, padding occurs when noncharacter variables follow character variables in an unformatted record.

The I/O library uses an internal buffer for the records. If opened for sequential access, a file is not truncated after each record is written to the file.

assign -s u File Processing

The assign -s u command specifies undefined or unknown file processing. It can be used for a Fortran file that is declared as unformatted sequential or direct access. Because the file does not contain record delimiters, it has an unblocked data file structure. Both synchronous and asynchronous BUFFER IN and BUFFER OUT processing can be used with u file processing.

For best performance, the size of a Fortran I/O request on a file assigned with the assign -s u command should be a multiple of the sector size. I/O requests are not library buffered; each one causes an immediate system call.

Fortran sequential files declared by using assign -s u are not truncated after the last word written. The user must execute an explicit ENDFILE statement on the file to get truncation.

Text File Structure

The text file structure consists of a stream of 8-bit ASCII characters. Every record in a text file is terminated by a newline character (\n, ASCII 012). Some utilities may omit the newline character on the last record, but the Fortran library will treat such an occurrence as a malformed record. This file structure can be specified for a file that is declared as formatted sequential access or formatted direct access. It is the default file structure for formatted sequential access files. On UNICOS and UNICOS/mk systems, it is also the default file structure for formatted direct access files.
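The record rule described above can be illustrated with a short Python sketch. The helper name split_text_records is hypothetical and is not part of the Fortran library; it only mimics the rule that every record ends with a newline character and that trailing bytes without a newline form a malformed record.

```python
def split_text_records(data: bytes):
    """Split a text-structured byte stream into records.

    Returns (records, malformed_tail). Each record is the bytes that
    precede one newline character. malformed_tail holds any trailing
    bytes that are not terminated by a newline, or None if the stream
    ends cleanly.
    """
    newline = b"\n"  # ASCII 012 (octal)
    pieces = data.split(newline)
    # split() always leaves one final piece after the last newline;
    # that piece is empty when the stream ends with a newline.
    tail = pieces.pop()
    return pieces, (tail if tail else None)


records, tail = split_text_records(b"alpha\nbeta\n\ngamma")
print(records)  # [b'alpha', b'beta', b'']
print(tail)     # b'gamma'
```

The empty third record shows that a record may have zero length; only the final, unterminated piece is reported as malformed.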

The assign -s text command specifies the library-buffered text file structure. Both library and system buffering are done for all text file structures (for more information about library buffering, see Chapter 8, “Buffering”).

An I/O request for a file using assign -s text does not need to be a multiple of a specific number of bytes.

You cannot use BUFFER IN and BUFFER OUT statements with this structure. Use a BACKSPACE statement to reposition a file with this structure.

COS or Blocked File Structure

The cos or blocked file structure uses control words to mark the beginning of each sector and to delimit each record. You can specify this file structure for a file that is declared as unformatted sequential access. Synchronous BUFFER IN and BUFFER OUT statements can create and access files with this file structure. This file structure is the default structure for files declared as unformatted sequential access on UNICOS and UNICOS/mk systems.

You can specify this file structure with one of the following assign(1) commands:

assign -s cos
assign -s blocked
assign -F cos
assign -F blocked

These four assign commands result in the same file structure.

An I/O request on a blocked file is library buffered. For more information about library buffering, see Chapter 8, “Buffering”.

In a COS file structure, one or more ENDFILE records are allowed. BACKSPACE statements can be used to reposition a file with this structure.

A blocked file is a stream of words that contains control words, called block control words (BCWs) and record control words (RCWs), to delimit records. Each record is terminated by an end-of-record (EOR) RCW. A BCW is inserted at the beginning of the stream and every 512 words thereafter (counting any RCWs). An end-of-file (EOF) control word marks a special record that is always empty; Fortran considers this empty record to be an endfile record. The end-of-data (EOD) control word is always the last control word in any blocked file. The EOD is always immediately preceded by an EOR, or by an EOF and a BCW.

Each control word contains a count of the number of data words to be found between it and the next control word. In the case of the EOD, this count is 0. Because there is a BCW every 512 words, these counts never point forward more than 511 words.

A record always begins at a word boundary. If a record ends in the middle of a word, the rest of that word is zero filled; the ubc field of the closing RCW contains the number of unused bits in the last word.
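The zero-fill rule above amounts to simple arithmetic, sketched here in Python. The function name words_and_ubc is hypothetical, and the 64-bit word size is an assumption taken from the 64-bit control word layouts in this chapter.

```python
WORD_BITS = 64  # assumption: 64-bit words, matching the control word layouts


def words_and_ubc(record_bits: int):
    """Return (words_used, unused_bit_count) for a record of record_bits bits.

    The record occupies whole words; the unused low-order bits of the
    last word are zero filled and counted in the ubc field.
    """
    if record_bits < 0:
        raise ValueError("record length must not be negative")
    words = -(-record_bits // WORD_BITS)  # ceiling division
    ubc = words * WORD_BITS - record_bits
    return words, ubc


# A 13-byte record (104 bits) needs two words and leaves 24 bits unused.
print(words_and_ubc(13 * 8))  # (2, 24)
```

Because the unused count is always less than one word, it never exceeds 63 and fits in the 6-bit ubc field.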

The following is a representation of the structure of a BCW:

Field   m     unused   bdf   unused   bn     fwi
Bits    (4)   (7)      (1)   (19)     (24)   (9)

Field   Bits    Description
-----   -----   ------------------------------------------------------------
m       0-3     Type of control word; 0 for BCW.
bdf     11      Bad data flag (1 bit).
bn      31-54   Block number (modulo 2^24).
fwi     55-63   Forward index; the number of words to the next control word.
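The BCW fields listed above can be packed and unpacked with shifts and masks. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that bit 0 is the most significant bit of the 64-bit word, which is consistent with the layout listing m first and fwi last.

```python
def field(word: int, first_bit: int, last_bit: int) -> int:
    """Extract bits first_bit..last_bit, where bit 0 is the most
    significant bit of a 64-bit word (an assumption of this sketch)."""
    width = last_bit - first_bit + 1
    shift = 63 - last_bit
    return (word >> shift) & ((1 << width) - 1)


def encode_bcw(bn: int, fwi: int, bdf: int = 0) -> int:
    """Build a BCW: m is 0, bn is stored modulo 2**24, fwi uses 9 bits."""
    return ((bdf & 1) << (63 - 11)) | ((bn % (1 << 24)) << (63 - 54)) | (fwi & 0x1FF)


def decode_bcw(word: int) -> dict:
    """Return the named BCW fields of a 64-bit word."""
    return {
        "m": field(word, 0, 3),
        "bdf": field(word, 11, 11),
        "bn": field(word, 31, 54),
        "fwi": field(word, 55, 63),
    }


print(decode_bcw(encode_bcw(bn=5, fwi=511)))
# {'m': 0, 'bdf': 0, 'bn': 5, 'fwi': 511}
```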

The following is a representation of the structure of an RCW:

Field   m     ubc   tran   bdf   srs   unused   pfi    pri    fwi
Bits    (4)   (6)   (1)    (1)   (1)   (7)      (20)   (15)   (9)

Field   Bits    Description
-----   -----   ------------------------------------------------------------
m       0-3     Type of control word; 10 (octal) for EOR, 16 (octal) for
                EOF, and 17 (octal) for EOD.
ubc     4-9     Unused bit count; number of unused low-order bits in the
                last word of the previous record.
tran    10      Transparent record field (unused).
bdf     11      Bad data flag (unused).
srs     12      Skip remainder of sector (unused).
pfi     20-39   Previous file index; offset modulo 2^20 to the block where
                the current file starts (as defined by the last EOF).
pri     40-54   Previous record index; offset modulo 2^15 to the block
                where the current record starts.
fwi     55-63   Forward index; the number of words to the next control word.
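The forward index makes it possible to step from one control word to the next without inspecting the data in between. The Python sketch below walks a small, hand-built stream this way. It is a simplified illustration, not the library's reader: the function names are hypothetical, bit 0 is assumed to be the most significant bit, the pfi and pri fields are left at zero, and the test stream is far shorter than a real 512-word block.

```python
BCW, EOR, EOF, EOD = 0, 0o10, 0o16, 0o17  # control word types (octal)


def field(word: int, first_bit: int, last_bit: int) -> int:
    """Extract bits first_bit..last_bit; bit 0 is the most significant."""
    width = last_bit - first_bit + 1
    return (word >> (63 - last_bit)) & ((1 << width) - 1)


def control(m: int, fwi: int, ubc: int = 0) -> int:
    """Build a control word that holds only the m, ubc, and fwi fields."""
    return (m << 60) | (ubc << 54) | fwi


def read_records(words):
    """Return a list of (data_words, ubc) tuples, one per record.

    Starts at the leading BCW and follows each forward index until EOD.
    A BCW inside a record does not end it; only an EOR does.
    """
    records, current, pos = [], [], 0
    while True:
        cw = words[pos]
        m = field(cw, 0, 3)
        if m == EOD:
            return records
        if m == EOR:
            records.append((current, field(cw, 4, 9)))
            current = []
        fwi = field(cw, 55, 63)
        # The fwi data words that follow belong to the next record piece.
        current.extend(words[pos + 1 : pos + 1 + fwi])
        pos += 1 + fwi


stream = [
    control(BCW, 2), 0x1111, 0x2222,   # first record: two data words
    control(EOR, 1, ubc=8), 0x3333,    # ends record 1; one more data word
    control(EOR, 0),                   # ends record 2
    control(EOD, 0),
]
print(read_records(stream))  # [([4369, 8738], 8), ([13107], 0)]
```

An EOF control word is passed over in this sketch; a fuller reader would report it as an endfile record.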

Tape/bmx File Structure (Not Available on IRIX systems)

The tape or bmx file structure is used for online tape access through the UNICOS tape subsystem. You can use any type of sequential, formatted, unformatted, or buffer I/O to read or write an online tape if this file structure was specified.

Each read or write request results in the processing of one tape block.

This file structure is the default option for doing any type of Fortran I/O to an online tape file. The file structure can be specified with one of the following commands:

assign -s bmx
assign -s tape
assign -F bmx
assign -F tape

These assign(1) commands result in the same file structure. This structure can be used only with online IBM-compatible tape files or with ER90 volumes mounted in blocked mode. For more information on library interfaces to ER90 volumes, see the Cray document Tape Subsystem User's Guide.

Library Buffers

When using Fortran I/O or FFIO for online tapes and the tape or bmx file structure, all of the user's data passes through a library buffer. The size and number of buffers can affect performance. Each of the library's buffers must be a multiple of the maximum block size (MBS) on the tape, as specified by the tpmnt -b command.

On IOS model D systems, one tape buffer is allocated by default. The buffer size is either MBS or (MBS × n), whichever is larger (n is the largest integer such that MBS × n ≤ 65536).
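The model D rule can be worked through in a few lines of Python. This is a sketch of the arithmetic only; the function name model_d_buffer_size is hypothetical, and MBS is assumed to be given in bytes.

```python
LIMIT = 65536  # bytes, the bound on MBS * n stated above


def model_d_buffer_size(mbs: int) -> int:
    """Return the default buffer size in bytes for a maximum block size."""
    if mbs <= 0:
        raise ValueError("MBS must be positive")
    n = LIMIT // mbs  # largest integer n such that mbs * n <= LIMIT
    # When mbs exceeds LIMIT, n is 0 and the buffer is simply one block.
    return max(mbs, mbs * n)


print(model_d_buffer_size(30000))   # 60000
print(model_d_buffer_size(100000))  # 100000
```

The result is always a whole multiple of MBS, as required of every library buffer.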

On IOS model E systems, the default is to allocate 2 buffers of 4 × MBS each, with a minimum of 65,536 bytes, provided that the total buffer size does not exceed a threshold defined within the library. If the MBS is too large to accommodate this formula, the size of the buffers is adjusted downward, and the number is adjusted downward to remain under the threshold.

In all cases, at least one buffer of at least the MBS in bytes is allocated.

During a write request, the library copies the user's data to its buffer. Each of the user's records must be placed on a 4096-byte boundary within the library buffer. After a user's record is copied to the library buffer, the library checks the remaining free buffer space. If it is less than the maximum block size specified with the tpmnt -b command, the library issues an asynchronous write (writea(2)) system call. If the user requests that a tape mark be written, this also causes the library to issue a writea system call.
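The write-side behavior described above can be simulated to see when a write would be issued. The Python sketch below is an illustration under stated assumptions, not the library's code: the function names are hypothetical, free space is measured from the next 4096-byte boundary, every record is assumed to fit in the buffer, and the writea(2) call is represented only by a counter.

```python
ALIGN = 4096  # each record starts on a 4096-byte boundary


def align_up(n: int) -> int:
    """Round n up to the next multiple of ALIGN."""
    return -(-n // ALIGN) * ALIGN


def count_writes(record_sizes, buffer_size: int, mbs: int) -> int:
    """Count how many asynchronous writes the copy loop would issue.

    After each record is copied, a write is counted when the remaining
    free space is smaller than the maximum block size (mbs).
    """
    writes = 0
    offset = 0  # next aligned free position in the buffer
    for size in record_sizes:
        offset = align_up(offset + size)
        if buffer_size - offset < mbs:
            writes += 1   # stands in for one writea(2) call
            offset = 0    # the buffer is free again after the write
    return writes


# Five 10000-byte records, a 65536-byte buffer, and MBS = 32768.
print(count_writes([10000] * 5, 65536, 32768))  # 1
```

In this run the third record leaves 28672 free bytes, which is below the maximum block size, so one write is counted; the last two records remain in the buffer until a later write or tape mark.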

When using Fortran I/O or FFIO to read online tapes, the system determines how much data can be placed in the user's buffers. Reading a user's tape mark stops all outstanding asynchronous I/O to that file.