Chapter 8. XDR Programming Notes

XDR is the backbone of Sun's RPC package—the data for remote procedure calls is transmitted using the XDR standard. This chapter is based on Sun's technical notes about the implementation of the XDR standard. (For a complete specification of the XDR protocol, see Appendix B, “XDR Protocol Specification”.)

Most programmers (especially RPC programmers) will only need the information in three sections of this chapter: “Number Filters”, “Floating-point Filters”, and “Enumeration Filters”.

Overview of XDR Programming

XDR's approach to standardizing data representations is canonical. That is, XDR defines a single byte order (big-endian), a single floating-point representation (IEEE), and so on. Any program running on any machine can use XDR to create portable data by translating its local data representations to the equivalent XDR standard representations; similarly, any program running on any machine can read portable data by translating the XDR standard representations to its local equivalents. The single standard completely decouples programs that create or send portable data from those that use or receive portable data.
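The canonical idea can be sketched for a single 32-bit integer. The helper names put_be32() and get_be32() are hypothetical and are not part of the XDR library; they only illustrate that shifting, rather than copying memory, yields the same external bytes on every host:

```c
#include <stdint.h>

/* Store a 32-bit value in canonical (big-endian) order, whatever the
 * byte order of the host happens to be. */
void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild the host value from four canonical bytes. */
uint32_t get_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Because the routines never inspect the host byte order, the same source serves both big-endian and little-endian machines.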

The advent of a new machine or a new language has no effect on the community of existing portable data creators and users. A new machine joins this community by being taught how to convert between the standard representations and its local representations; the local representations of other machines are irrelevant.

Conversely, the local representations of the new machine are also irrelevant to existing programs running on other machines; such programs can immediately read portable data produced by the new machine, because such data conforms to the canonical standard that they already understand.

There are strong precedents for XDR's canonical approach. For example, TCP/IP, UDP/IP, Ethernet, and indeed all protocols below layer five of the ISO model are canonical protocols. The advantage of any canonical approach is simplicity; in the case of XDR, a single set of conversion routines is written once and is never touched again. The canonical approach has a disadvantage, but this disadvantage is unimportant in real-world data transfer applications.

Suppose two little-endian machines are transferring integers according to the XDR standard. The sending machine converts the integers from little-endian byte order to XDR (big-endian) byte order; the receiving machine performs the reverse conversion. Because both machines observe the same byte order, their conversions are unnecessary. The point, however, is not necessity but cost, when compared to the alternative.

The time spent converting to and from a canonical representation is insignificant, especially in networking applications. Most of the time required to prepare a data structure for transfer is not spent in conversion but in traversing the elements of the data structure. To transmit an image of a tree, for example, each leaf must be visited and each element in a leaf record must be copied to a buffer and aligned there; storage for the leaf may have to be deallocated as well. Similarly, to receive a tree image, storage must be allocated for each leaf; data must be moved from the buffer to the leaf and properly aligned; and pointers must be constructed to link the leaves. Every machine pays the cost of traversing and copying data structures, regardless of whether conversion is required.

In networking applications, communication overhead—the time required to move the data down through the sender's protocol layers, across the network, and up through the receiver's protocol layers—dwarfs conversion overhead.

Consider the writer and reader programs.

The writer program looks like this:

#include <stdio.h>
#include <stdlib.h>    /* exit() */

int main(void)        /* writer.c */
{
    long i;

    for (i = 0; i < 8; i++) {
        if (fwrite((char *)&i, sizeof(i), 1, stdout) != 1) {
            fprintf(stderr, "failed!\n");
            exit(1);
        }
    }
    exit(0);
}

The reader program looks like this:

#include <stdio.h>
#include <stdlib.h>    /* exit() */

int main(void)        /* reader.c */
{
    long i, j;

    for (j = 0; j < 8; j++) {
        if (fread((char *)&i, sizeof (i), 1, stdin) != 1) {
            fprintf(stderr, "failed!\n");
            exit(1);
        }
        printf("%ld ", i);
    }
    printf("\n");
    exit(0); 
}

The writer and reader programs appear to be portable, because they pass lint checking, and they exhibit the same behavior when executed on different hardware architectures—an IRIS-4D and a VAX.

Piping the output of the writer program to the reader program produces identical results on both machines:

IRIS% writer | reader 
0 1 2 3 4 5 6 7 
VAX% writer | reader 
0 1 2 3 4 5 6 7 

With the advent of local area networks and Berkeley's 4.2BSD UNIX came the concept of “network pipes”—a process produces data on one machine, and a second process consumes data on another machine. You can construct a network pipe with writer and reader. The next example shows the results if writer produces data on an IRIS, and reader consumes data on a VAX:

IRIS% writer | rsh vax reader 
0 16777216 33554432 50331648 67108864 83886080 100663296 117440512
IRIS%

You can obtain identical results by executing writer on the VAX and reader on the IRIS. These results occur because the byte ordering of long integers differs between the VAX and the IRIS, even though word size is the same.


Note: The value 16777216 is 2^24. When the order of 4 bytes is reversed, the 1 that started in the zeroth bit winds up in the 24th bit.
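The arithmetic can be checked with a small byte-reversal routine. The name swap32() is hypothetical; it models what a reader sees when it interprets the writer's four bytes in the opposite byte order:

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}
```

Applied to the values 1, 2, and 7, the routine yields 16777216, 33554432, and 117440512, matching the output of the network pipe shown earlier.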

Whenever data is shared by two or more machine types, there is a need for portable data. Programs can be made data-portable by replacing the fread() and fwrite() calls with calls to the XDR routine xdr_long(), a filter that knows the standard representation of a long integer in its external form.

This is the revised version of the writer program:

#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <rpc/rpc.h>    /* xdr is a sub-library of rpc */

int main(void)    /* writer.c */
{
    XDR xdrs;
    long i;

    xdrstdio_create(&xdrs, stdout, XDR_ENCODE);
    for (i = 0; i < 8; i++) {
        if (! xdr_long(&xdrs, &i)) {
            fprintf(stderr, "failed!\n");
            exit(1);
        }
    }
    exit(0);
}

This is a revised version of the reader program:

#include <stdio.h>
#include <stdlib.h>     /* exit() */
#include <rpc/rpc.h>    /* xdr is a sub-library of rpc */

int main(void)    /* reader.c */
{
    XDR xdrs;
    long i, j;

    xdrstdio_create(&xdrs, stdin, XDR_DECODE);
    for (j = 0; j < 8; j++) {
        if (! xdr_long(&xdrs, &i)) {
            fprintf(stderr, "failed!\n");
            exit(1);
        }
        printf("%ld ", i);
    }
    printf("\n");
    exit(0);
}

When the revised programs are executed on an IRIS, on a VAX, and from an IRIS to a VAX, the results are:

IRIS% writer | reader 
0 1 2 3 4 5 6 7 

VAX% writer | reader 
0 1 2 3 4 5 6 7 

IRIS% writer | rsh vax reader 
0 1 2 3 4 5 6 7 

Dealing with integers is just the tip of the portable-data iceberg. Arbitrary data structures present portability problems, particularly with respect to alignment and pointers. Alignment on word boundaries may cause the size of a structure to vary from machine to machine. Pointers are convenient to use, but they have no meaning outside the machine where they are defined.
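The alignment problem can be made concrete with a sketch. The structure and the helpers sample_padding(), put_unit(), and encode_sample() are hypothetical. The in-memory size of the structure depends on the compiler's padding, whereas an encoding built field by field has the same size everywhere:

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    char    tag;     /* 1 byte, usually followed by padding */
    int32_t value;   /* typically aligned on a 4-byte boundary */
};

/* Bytes the compiler added beyond the members themselves
 * (machine and compiler dependent). */
size_t sample_padding(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int32_t));
}

/* Write one 32-bit value as four big-endian bytes. */
void put_unit(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Field-by-field encoding: each field takes one 4-byte unit, so the
 * external size is 8 bytes however the struct is padded in memory. */
size_t encode_sample(unsigned char out[8], const struct sample *s)
{
    put_unit(out, (uint32_t)(int32_t)s->tag);
    put_unit(out + 4, (uint32_t)s->value);
    return 8;
}
```

Writing the structure with a single fwrite() would instead copy the padding bytes, whose number and position are not guaranteed to match on the receiving machine.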


Note: On IRIX systems, C programs that want access to XDR routines should include the <rpc/rpc.h> header file, which contains all necessary interfaces to the XDR system. Since the default C DSO contains all the XDR routines, you don't need to indicate any special libraries on the compilation line in order to use XDR. See “Compiling BSD and RPC Programs” in Chapter 1 for additional compiling hints.


The XDR Library

The XDR library not only solves data portability problems, it also lets you write and read arbitrary C constructs in a consistent, specified, well-documented manner. Thus, it makes sense to use the XDR library, even when data is not shared among machines on a network.

The XDR library has filter routines for strings (null-terminated arrays of bytes), structures, unions, and arrays, to name a few. Using more primitive routines, you can write your own specific XDR routines to describe arbitrary data structures, including elements of arrays, arms of unions, or objects pointed at from other structures. The structures themselves may contain arrays of arbitrary elements or pointers to other structures.

Examine the reader and writer programs more closely. There is a family of XDR stream creation routines in which each member treats the stream of bits differently. In the example, data is manipulated using standard I/O routines; therefore, use xdrstdio_create(). The parameters to XDR stream creation routines vary according to their function. In our example, xdrstdio_create() takes a pointer to an XDR structure that it initializes, a pointer to a FILE that the input or output is performed on, and the operation. The operation may be XDR_ENCODE for serializing in the writer program or XDR_DECODE for deserializing in the reader program.


Note: RPC users never need to create XDR streams; the RPC system itself creates the streams, which are then passed to the users.

The xdr_long() primitive is characteristic of most XDR library primitives and all client XDR routines. First, the routine returns FALSE (that is, 0) if it fails and TRUE (1) if it succeeds. Second, for each data type xxx there is an associated XDR routine of the form shown in this example:

xdr_xxx(XDR *xdrs, xxx *xp)
{
}

In this case, xxx is long, and the corresponding XDR routine is a primitive, xdr_long(). The client could also define an arbitrary structure xxx, in which case the client would also supply the xdr_xxx() routine, describing each field by calling XDR routines of the appropriate type. In all cases, the first parameter, xdrs, can be treated as an opaque handle and passed to the primitive routines.

XDR routines are direction-independent; that is, the same routines are called to serialize or deserialize data. This feature is critical to the software engineering of portable data. The idea is to call the same routine for either operation—which almost guarantees that serialized data can also be deserialized. One routine is used by both producer and consumer of networked data. This direction independence is implemented by always passing the address of an object rather than the object itself—only in the case of deserialization is the object modified. This feature is not shown in our trivial example, but its value becomes obvious when nontrivial data structures are passed among machines. If needed, the user can obtain the direction of the XDR operation. (See “XDR Operation Directions” for details.)
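Direction independence can be sketched without the library. Everything in this fragment (enum op, struct stream, filter_i32(), filter_pair()) is a hypothetical stand-in and not the XDR implementation. It shows one routine serving both directions because it receives the address of the object and consults the operation stored in the stream:

```c
#include <stddef.h>
#include <stdint.h>

enum op { OP_ENCODE, OP_DECODE };   /* stand-ins for XDR_ENCODE, XDR_DECODE */

struct stream {                     /* stand-in for an XDR handle */
    enum op        op;
    unsigned char *buf;
    size_t         pos, len;
};

/* One routine for both directions; returns 1 on success, 0 on failure. */
int filter_i32(struct stream *s, int32_t *vp)
{
    unsigned char *p;
    uint32_t u;

    if (s->len - s->pos < 4)
        return 0;                   /* no room left, or no data left */
    p = s->buf + s->pos;
    if (s->op == OP_ENCODE) {
        u = (uint32_t)*vp;          /* the object is only read */
        p[0] = (unsigned char)(u >> 24);
        p[1] = (unsigned char)(u >> 16);
        p[2] = (unsigned char)(u >> 8);
        p[3] = (unsigned char)u;
    } else {
        u = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
            ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        *vp = (int32_t)u;           /* the object is modified only here */
    }
    s->pos += 4;
    return 1;
}

struct pair { int32_t a, b; };

/* A composite filter just forwards the handle; it never checks the direction. */
int filter_pair(struct stream *s, struct pair *pp)
{
    return filter_i32(s, &pp->a) && filter_i32(s, &pp->b);
}
```

The producer calls filter_pair() on a stream opened for encoding and the consumer calls the same routine on a stream opened for decoding, so the two sides cannot drift apart in field order.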

For a slightly more complicated example, assume that a person's gross assets and liabilities are to be exchanged among processes, and assume that these values are important enough to warrant their own data type:

struct gnumbers {
    long g_assets;
    long g_liabilities;
};

A corresponding XDR routine describing this structure is:

bool_t             /* TRUE is success, FALSE is failure */
xdr_gnumbers(xdrs, gp)
XDR *xdrs;
struct gnumbers *gp;
{
    if (xdr_long(xdrs, &gp->g_assets) &&
        xdr_long(xdrs, &gp->g_liabilities))
        return(TRUE);
    return(FALSE);
}

Note that the parameter xdrs is never inspected or modified; it is only passed on to the subcomponent routines. It is imperative to inspect the return value of each XDR routine call and to give up immediately and return FALSE if the subroutine fails.

This example also shows that the type bool_t is declared as an integer whose only values are TRUE (1) and FALSE (0). This chapter uses the following definitions:

#define bool_t   int
#define TRUE     1
#define FALSE    0

Keeping these conventions in mind, xdr_gnumbers() can be rewritten like this:

bool_t
xdr_gnumbers(XDR *xdrs, struct gnumbers *gp)
{
    return (xdr_long(xdrs, &gp->g_assets) &&
        xdr_long(xdrs, &gp->g_liabilities));
}

This chapter uses both coding styles.

XDR Library Primitives

This section gives a synopsis of each XDR primitive, including basic data types, constructed data types, and XDR utilities. The interface to these primitives and utilities is defined in the include file <rpc/xdr.h>, automatically included by <rpc/rpc.h>.

Number Filters

The XDR library provides primitives to translate between numbers and their corresponding external representations. Primitives cover the set of numbers in:

[signed, unsigned] [char, short, int, long]

The eight primitives are:

bool_t xdr_char(XDR *xdrs, char *cp);
bool_t xdr_u_char(XDR *xdrs, unsigned char *ucp);
bool_t xdr_int(XDR *xdrs, int *ip);
bool_t xdr_u_int(XDR *xdrs, unsigned *up);
bool_t xdr_long(XDR *xdrs, long *lip);
bool_t xdr_u_long(XDR *xdrs, u_long *lup);
bool_t xdr_short(XDR *xdrs, short *sip);
bool_t xdr_u_short(XDR *xdrs, u_short *sup);

The first parameter, xdrs, is an XDR stream handle. The second parameter is the address of the number that provides data to the stream or receives data from it. All routines return TRUE if they complete successfully, and FALSE otherwise.

Floating-point Filters

The XDR library also provides primitive routines for C's floating-point types:

bool_t xdr_float(XDR *xdrs, float *fp);
bool_t xdr_double(XDR *xdrs, double *dp);

The first parameter, xdrs, is an XDR stream handle. The second parameter is the address of the floating-point number that provides data to the stream or receives data from it. Both routines return TRUE if they complete successfully, and FALSE otherwise.


Note: Since the numbers are represented in IEEE floating point, routines may fail when decoding a valid IEEE representation into a machine-specific representation, or vice versa.
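On a host whose float is already IEEE 754 single precision, encoding reduces to emitting the four bytes most significant first. The following sketch makes that assumption explicit; put_float() and get_float() are hypothetical names, and a machine with a different native format (such as a VAX) would need a real conversion step, which is where the failures mentioned in the note can arise:

```c
#include <stdint.h>
#include <string.h>

/* Assumes sizeof(float) == 4 and IEEE 754 single precision on the host. */
void put_float(unsigned char out[4], float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);      /* reinterpret without aliasing issues */
    out[0] = (unsigned char)(bits >> 24);
    out[1] = (unsigned char)(bits >> 16);
    out[2] = (unsigned char)(bits >> 8);
    out[3] = (unsigned char)bits;
}

float get_float(const unsigned char in[4])
{
    uint32_t bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                    ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}
```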


Enumeration Filters

The XDR library provides a primitive for generic enumerations. The primitive assumes that a C enumeration has the same representation inside the machine as a C integer. The boolean type is an important instance of the enum. The external representation of a boolean is always one (TRUE) or zero (FALSE).

#define bool_t   int
#define FALSE    0
#define TRUE     1
#define enum_t   int
bool_t xdr_enum(XDR *xdrs, enum_t *ep);
bool_t xdr_bool(XDR *xdrs, bool_t *bp);

The second parameters ep and bp are addresses of the associated type that provides data to, or receives data from, the stream xdrs. The routines return TRUE if they complete successfully, and FALSE otherwise.

No Data

Occasionally, an XDR routine must be supplied to the RPC system, even when no data is passed or required. The XDR library provides this routine:

bool_t xdr_void(XDR *xdrs, void *vp); /*always returns TRUE*/ 

Constructed Data Type Filters

Constructed or compound data type primitives require more parameters and perform more complicated functions than the primitives already discussed. This section includes primitives for strings, arrays, unions, and pointers to structures.

Constructed data type primitives may use memory management. In many cases, memory is allocated when deserializing data with XDR_DECODE. Therefore, the XDR package must provide means to deallocate memory. The XDR operation XDR_FREE is used for this purpose.

To review, the three XDR directional operations are:

  • XDR_ENCODE

  • XDR_DECODE

  • XDR_FREE

Strings

In the C language, a string is defined as a sequence of bytes terminated by a null byte, which is not considered when calculating string length. However, when a string is passed or manipulated, a pointer to the string is employed. Therefore, the XDR library defines a string to be a char * and not a sequence of characters.

The external representation of a string is drastically different from its internal representation. Externally, strings are represented as sequences of ASCII characters, while internally they are represented with character pointers. Conversion between the two representations is accomplished with the xdr_string() routine:

bool_t xdr_string(XDR *xdrs, char **sp, u_int maxlength);

The first parameter, xdrs, is the XDR stream handle. The second parameter, sp, is a pointer to a string (type char *). The third parameter, maxlength, specifies the maximum number of bytes allowed during encoding or decoding. Its value is usually specified by a protocol. (For example, a protocol specification may say that a filename may be no longer than 255 characters.)

The routine returns FALSE if the number of characters exceeds maxlength, and TRUE if it doesn't.


Note: Keep maxlength small. If it is too big, you can overrun the heap, since xdr_string() will call malloc() for space.

The behavior of xdr_string() is similar to the behavior of other routines discussed in this chapter. The XDR_ENCODE operation is easiest to understand. The parameter sp points to a string of a certain length; if the string does not exceed maxlength, the bytes are serialized.

The effect of deserializing a string is subtle. First, the length of the incoming string is determined; it must not exceed maxlength. Next, sp is dereferenced; if the value is NULL, a string of the appropriate length is allocated, and *sp is set to this string. If the original value of *sp is non-NULL, the XDR package assumes that a target area has been allocated that can hold strings no longer than maxlength. In either case, the string is decoded into the target area. The routine then appends a null character to the string.

In the XDR_FREE operation, the string is obtained by dereferencing sp. If the string is not NULL, it is freed and *sp is set to NULL. In this operation, xdr_string() ignores the maxlength parameter.
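The encode and decode behavior described above can be sketched as follows. The routines encode_string() and decode_string() are hypothetical and only imitate xdr_string(). The external layout used here (a 4-byte length, the bytes, then zero padding to a multiple of four) follows the XDR protocol specification:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns the number of bytes written, or 0 if the string exceeds
 * maxlength or the buffer is too small. */
size_t encode_string(unsigned char *out, size_t cap,
                     const char *s, uint32_t maxlength)
{
    size_t n = strlen(s);
    size_t padded = (n + 3) & ~(size_t)3;

    if (n > maxlength || cap < 4 + padded)
        return 0;
    out[0] = (unsigned char)(n >> 24);
    out[1] = (unsigned char)(n >> 16);
    out[2] = (unsigned char)(n >> 8);
    out[3] = (unsigned char)n;
    memcpy(out + 4, s, n);
    memset(out + 4 + n, 0, padded - n);
    return 4 + padded;
}

/* If *sp is NULL, storage is allocated; otherwise the caller is assumed
 * to have supplied room for maxlength + 1 bytes. Returns 1 or 0. */
int decode_string(const unsigned char *in, size_t avail,
                  char **sp, uint32_t maxlength)
{
    uint32_t n;

    if (avail < 4)
        return 0;
    n = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
        ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (n > maxlength || avail - 4 < n)
        return 0;                        /* length is checked before allocating */
    if (*sp == NULL && (*sp = malloc((size_t)n + 1)) == NULL)
        return 0;
    memcpy(*sp, in + 4, n);
    (*sp)[n] = '\0';                     /* append the terminating null */
    return 1;
}
```

Checking the incoming length against maxlength before allocating is what keeps a corrupt or hostile length field from triggering a huge allocation.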

Byte Arrays

Variable-length byte arrays are often preferable to strings. Byte arrays differ from strings in several ways:

  • The length of the array (the byte count) is explicitly located in an unsigned integer.

  • The byte sequence is not terminated by a null character.

  • The external representation of the bytes in the array is the same as their internal representation.

The xdr_bytes() primitive converts byte arrays between their internal and external representations:

bool_t xdr_bytes(XDR *xdrs, char **bpp, u_int *lp,
                 u_int maxlength);

The usage of the xdrs, bpp, and maxlength parameters is identical to their usage in xdr_string(). The length of the byte area is obtained by dereferencing lp when serializing; *lp is set to the byte length when deserializing.

Arrays

The XDR library provides a primitive for handling arrays of arbitrary elements. xdr_bytes() treats a subset of generic arrays, in which the size of array elements is known to be 1, and the external description of each element is built-in. The generic array primitive, xdr_array(), requires parameters identical to those of xdr_bytes() plus two more: the size of array elements and an XDR routine to handle each of the elements. This routine is called to encode or decode each element of the array:

bool_t xdr_array(XDR *xdrs, char **ap, u_int *lp,
                 u_int maxlength, u_int elementsize,
                 xdrproc_t xdr_element);

The parameter ap is the address of the pointer to the array. If *ap is NULL when the array is being deserialized, XDR allocates an array of the appropriate size and sets *ap to that array. The element count of the array is obtained from *lp when the array is serialized; *lp is set to the array length when the array is deserialized. The parameter maxlength is the maximum number of elements that the array is allowed to have; elementsize is the byte size of each element of the array (the C sizeof operator can be used to obtain this value). The xdr_element() routine is called to serialize, deserialize, or free each element of the array.
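The role of elementsize and of the per-element routine can be illustrated with an encode-only sketch. The names struct cursor, elem_proc, put_u32(), put_int(), and put_array() are hypothetical; the real xdr_array() also decodes, allocates, and frees:

```c
#include <stddef.h>
#include <stdint.h>

struct cursor { unsigned char *p, *end; };   /* simple output buffer */

typedef int (*elem_proc)(struct cursor *c, void *elem);

/* Append one 4-byte big-endian unit; fails when the buffer is full. */
int put_u32(struct cursor *c, uint32_t u)
{
    if (c->end - c->p < 4)
        return 0;
    c->p[0] = (unsigned char)(u >> 24);
    c->p[1] = (unsigned char)(u >> 16);
    c->p[2] = (unsigned char)(u >> 8);
    c->p[3] = (unsigned char)u;
    c->p += 4;
    return 1;
}

/* Element routine for int. */
int put_int(struct cursor *c, void *elem)
{
    return put_u32(c, (uint32_t)*(int *)elem);
}

/* Encode the element count, then call proc once per element. The array
 * is walked in steps of elementsize bytes, which is why the caller must
 * supply that value. */
int put_array(struct cursor *c, void *array, uint32_t count,
              uint32_t maxlength, size_t elementsize, elem_proc proc)
{
    char *base = array;
    uint32_t i;

    if (count > maxlength || !put_u32(c, count))
        return 0;
    for (i = 0; i < count; i++)
        if (!proc(c, base + (size_t)i * elementsize))
            return 0;
    return 1;
}
```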

Examples of Constructed Data Types

Before defining more constructed data types, consider the examples in this section.

Example A

A user on a networked machine can be identified by:

  • the machine name, such as krypton; see gethostname(2) 

  • the user's user ID; see geteuid(2)

  • the group numbers to which the user belongs; see getgroups(2)

A structure with this information and its associated XDR routine could be coded like this:

struct netuser {
    char    *nu_machinename;
    int     nu_uid;
    u_int   nu_glen;
    int     *nu_gids;
};

#define NLEN 255     /* machine names must be < 256 chars */
#define NGRPS 20     /* user can't belong to > 20 groups */

bool_t
xdr_netuser(XDR *xdrs, struct netuser *nup)
{
    return (xdr_string(xdrs, &nup->nu_machinename, NLEN) &&
            xdr_int(xdrs, &nup->nu_uid) &&
            xdr_array(xdrs, &nup->nu_gids, &nup->nu_glen,
                      NGRPS, sizeof (int), xdr_int));
}

Example B

A party of network users could be implemented as an array of netuser structures. This is the declaration and its associated XDR routine:

struct party {
    u_int p_len;
    struct netuser *p_nusers;
};
#define PLEN 500     /* max number of users in a party */

bool_t
xdr_party(XDR *xdrs, struct party *pp)
{
    return (xdr_array(xdrs, &pp->p_nusers, &pp->p_len, PLEN,
                      sizeof (struct netuser), xdr_netuser));
}

Example C

The well-known parameters to main(), argc and argv, can be combined into a structure, and an array of instances of this structure can make up a history of commands. The declarations and XDR routines might look like this:

struct cmd {
    u_int c_argc;
    char **c_argv;
};

struct history {
    u_int h_len;
    struct cmd *h_cmds;
};
#define NCMDS  75   /* history is no more than 75 commands */
#define ALEN 1000   /* args cannot be > 1000 chars */
#define NARGC 100   /* command  cannot have > 100 args */

bool_t
xdr_wrap_string(XDR *xdrs, char **sp)
{
    return (xdr_string(xdrs, sp, ALEN));
}

bool_t
xdr_cmd(XDR *xdrs, struct cmd *cp)
{
    return (xdr_array(xdrs, &cp->c_argv, &cp->c_argc, NARGC,
                      sizeof (char *), xdr_wrap_string));
}

bool_t
xdr_history(XDR *xdrs, struct history *hp)
{
    return (xdr_array(xdrs, &hp->h_cmds, &hp->h_len, NCMDS,
                      sizeof (struct cmd), xdr_cmd));
}

The most confusing part of this example is that the xdr_wrap_string() routine is needed to package the xdr_string() routine, because the implementation of xdr_array() only passes two parameters to the array element description routine; xdr_wrap_string() supplies the third parameter to xdr_string().

Opaque Data

In some protocols, handles are passed from a server to a client; the client passes the handle back to the server at some later time. Handles are never inspected by clients; they are obtained and submitted. In other words, handles are opaque. The xdr_opaque() primitive is used to describe fixed-size opaque bytes.

bool_t xdr_opaque(XDR *xdrs, char *p, u_int len);

The parameter p is the location of the bytes; len is the number of bytes in the opaque object. By definition, the actual data contained in the opaque object is not machine portable.

Fixed-length Arrays

The XDR library provides the primitive xdr_vector() for fixed-length arrays; the primitive xdr_array() is for variable-length arrays.

Example A could be rewritten to use fixed-size arrays, as shown in this code:

#define NLEN 255
/* machine names must be shorter than 256 chars */
#define NGRPS 20
/* user cannot be a member of more than 20 groups */ 
struct netuser {
    char *nu_machinename;
    int nu_uid;
    int nu_gids[NGRPS];
};

bool_t
xdr_netuser(XDR *xdrs, struct netuser *nup)
{
    if (! xdr_string(xdrs, &nup->nu_machinename, NLEN))
        return (FALSE);
    if (! xdr_int(xdrs, &nup->nu_uid))
        return (FALSE);
    if (!xdr_vector(xdrs, nup->nu_gids, NGRPS, sizeof(int),
                    xdr_int)) {
        return (FALSE);
    }
    return (TRUE);
}

Discriminated Unions

The XDR library supports discriminated unions. A discriminated union is a C union and an enum_t value that selects an “arm” of the union:

struct xdr_discrim {
    enum_t value;
    bool_t (*proc)();
};

bool_t xdr_union(XDR *xdrs, enum_t *dscmp, char *unp,
                 struct xdr_discrim *arms,
                 xdrproc_t defaultarm);

First, the routine translates the discriminant of the union located at *dscmp. The discriminant is always an enum_t. Next, the union located at *unp is translated. The parameter arms is a pointer to an array of xdr_discrim structures. Each structure contains an ordered pair of [value, proc].

If the union's discriminant is equal to the associated value, the proc is called to translate the union. The end of the xdr_discrim structure array is denoted by a routine of value NULL (0). If the discriminant is not found in the arms array, the defaultarm procedure is called if it is non-NULL; otherwise, the routine returns FALSE.
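The arm lookup can be sketched independently of the library. The names struct arm, arm_proc, and dispatch_union() are hypothetical and mirror struct xdr_discrim and xdr_union() only in outline. The two sample arm routines merely record which arm ran:

```c
#include <stddef.h>

typedef int (*arm_proc)(void *unp);

struct arm {
    int      value;   /* discriminant value for this arm */
    arm_proc proc;    /* routine that translates the arm */
};

/* Search arms (terminated by a NULL proc) for the discriminant. Fall back
 * to defaultarm when one is supplied; otherwise fail. */
int dispatch_union(int discriminant, void *unp,
                   const struct arm *arms, arm_proc defaultarm)
{
    for (; arms->proc != NULL; arms++)
        if (arms->value == discriminant)
            return arms->proc(unp);
    return defaultarm != NULL ? defaultarm(unp) : 0;
}

/* Sample arms for illustration. */
int arm_int(void *unp)   { *(int *)unp = 1; return 1; }
int arm_other(void *unp) { *(int *)unp = 2; return 1; }
```

Because the search is linear and stops at the first match, the entries need not be sorted, and the terminating NULL routine is what bounds the search.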

Example D

Suppose the type of a union is integer, character pointer (a string), or a gnumbers structure. Also, assume the union and its current type are declared in a structure.

The declaration is:

enum utype { INTEGER=1, STRING=2, GNUMBERS=3 };

struct u_tag {
    enum utype utype;    /* the union's discriminant */
    union {
        int ival;
        char *pval;
        struct gnumbers gn;
    } uval;
};

The following constructs and XDR procedure (de)serialize the discriminated union:

struct xdr_discrim u_tag_arms[4] = {
    { INTEGER, xdr_int },
    { GNUMBERS, xdr_gnumbers },
    { STRING, xdr_wrap_string },
    { __dontcare__, NULL }
    /* always terminate arms with a NULL xdr_proc */
};

bool_t
xdr_u_tag(XDR *xdrs, struct u_tag *utp)
{
    return (xdr_union(xdrs, &utp->utype, &utp->uval,
                      u_tag_arms, NULL)); 
}

The routine xdr_gnumbers() was described in “The XDR Library”. The routine xdr_wrap_string() was described in Example C. The defaultarm parameter to xdr_union() (the last parameter) is NULL in this example. Therefore, the value of the union's discriminant may legally take on only values listed in the u_tag_arms array. This example also demonstrates that the elements of the arms array need not be sorted.

It is worth pointing out that the values of the discriminant may be sparse, although in this example they are not. It is always good practice to assign explicit integer values to each element of the discriminant's type. This practice both documents the external representation of the discriminant and guarantees that different C compilers emit identical discriminant values.

Pointers

In C, it is often convenient to put pointers to a structure within another structure. The xdr_reference() primitive makes it easy to serialize, deserialize, and free these referenced structures:

bool_t xdr_reference(XDR *xdrs, char **pp, u_int ssize,
                     xdrproc_t proc);

Parameter pp is the address of the pointer to the structure; parameter ssize is the size in bytes of the structure (use the C sizeof operator to obtain this value); and proc is the XDR routine that describes the structure. When you are decoding data, storage is allocated if *pp is NULL.

There is no need for a primitive xdr_struct() to describe structures within structures, because pointers are always sufficient.


Note: xdr_reference() and xdr_array() are not interchangeable external representations of data.

Example E

Suppose there's a structure containing a person's name and a pointer to a gnumbers structure containing the person's gross assets and liabilities. This example demonstrates this construct:

struct pgn {
    char *name;
    struct gnumbers *gnp;
};

The corresponding XDR routine for this structure is:

bool_t
xdr_pgn(XDR *xdrs, struct pgn *pp)
{
    if (xdr_string(xdrs, &pp->name, NLEN) &&
        xdr_reference(xdrs, &pp->gnp,
                      sizeof(struct gnumbers), xdr_gnumbers))
        return(TRUE);
    return(FALSE);
}

Pointer Semantics and XDR

In many applications, C programmers attach double meaning to the values of a pointer. Typically, the value NULL (or zero) means data is not needed, yet some application-specific interpretation applies. In essence, the C programmer is encoding a discriminated union efficiently by overloading the interpretation of the value of a pointer.

For instance, in Example E, a NULL pointer value for gnp could indicate that the person's assets and liabilities are unknown. That is, the pointer value encodes two things: whether or not the data is known; and if it is known, where it is located in memory. Linked lists are an extreme example of the use of application-specific pointer interpretation.

The primitive xdr_reference() cannot and does not attach any special meaning to a NULL-value pointer during serialization. That is, passing an address of a pointer whose value is NULL to xdr_reference() when you are serializing data will most likely cause a memory fault and, on the UNIX system, a core dump.

xdr_pointer() correctly handles NULL pointers. For more information about its use, see “Linked Lists”.
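The idea behind handling a NULL pointer is to make the implicit discriminated union explicit on the wire: a boolean that says whether data follows, then the data only when the pointer is non-NULL. The sketch below illustrates that idea; put_u32() and encode_optional() are hypothetical names, not library routines:

```c
#include <stddef.h>
#include <stdint.h>

/* Write v as a 4-byte big-endian unit. */
void put_u32(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Encode an optional value: a "data follows" flag, then the value only
 * when the pointer is non-NULL. Returns the number of bytes written. */
size_t encode_optional(unsigned char out[8], const int32_t *p)
{
    if (p == NULL) {
        put_u32(out, 0);        /* FALSE: nothing follows */
        return 4;
    }
    put_u32(out, 1);            /* TRUE: the value follows */
    put_u32(out + 4, (uint32_t)*p);
    return 8;
}
```

The NULL pointer is never dereferenced, which is the step that a routine expecting a valid reference cannot skip.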

Non-filter Primitives

XDR streams can be manipulated with the primitives discussed in this section.

u_int xdr_getpos(XDR *xdrs);
bool_t xdr_setpos(XDR *xdrs, u_int pos);
void xdr_destroy(XDR *xdrs);

The routine xdr_getpos() returns an unsigned integer that describes the current position in the data stream.


Note: In some XDR streams, the returned value of xdr_getpos() is meaningless; the routine returns -1 in this case (though -1 should be a legitimate value).

The xdr_setpos() routine sets a stream position to pos.


Note: In some XDR streams, setting a position is impossible; in such cases, xdr_setpos() will return FALSE. This routine will also fail if the requested position is out-of-bounds. The definition of bounds varies from stream to stream.
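A common use of position primitives is to reserve a slot, write a payload of unknown size, and then return to fill in the slot. The sketch below shows this pattern on a hypothetical memory-backed stream. The names struct memstream, ms_getpos(), ms_setpos(), ms_put_u32(), and ms_put_counted() are illustrative only:

```c
#include <stddef.h>
#include <stdint.h>

struct memstream {
    unsigned char *buf;
    size_t         pos, len;
};

size_t ms_getpos(const struct memstream *s) { return s->pos; }

/* Fails (returns 0) when pos lies outside the buffer. */
int ms_setpos(struct memstream *s, size_t pos)
{
    if (pos > s->len)
        return 0;
    s->pos = pos;
    return 1;
}

int ms_put_u32(struct memstream *s, uint32_t v)
{
    unsigned char *p;

    if (s->len - s->pos < 4)
        return 0;
    p = s->buf + s->pos;
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
    s->pos += 4;
    return 1;
}

/* Reserve a slot, write the payload, then patch the slot with the
 * payload size in bytes and restore the end position. */
int ms_put_counted(struct memstream *s, const uint32_t *vals, size_t n)
{
    size_t slot = ms_getpos(s), end, i;

    if (!ms_put_u32(s, 0))                   /* placeholder */
        return 0;
    for (i = 0; i < n; i++)
        if (!ms_put_u32(s, vals[i]))
            return 0;
    end = ms_getpos(s);
    if (!ms_setpos(s, slot) || !ms_put_u32(s, (uint32_t)(end - slot - 4)))
        return 0;
    return ms_setpos(s, end);
}
```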

The xdr_destroy() primitive destroys the XDR stream. Usage of the stream after calling this routine is undefined.

XDR Operation Directions

You can optimize XDR routines by taking advantage of the direction of the operation (XDR_ENCODE, XDR_DECODE, or XDR_FREE). The value xdrs->x_op always contains the direction of the XDR operation. Programmers are not encouraged to take advantage of this information. Therefore, no example is presented here. However, an example in “Linked Lists” demonstrates the usefulness of the xdrs->x_op field.

XDR Stream Access

An XDR stream is obtained by calling the appropriate creation routine. These creation routines take arguments that are tailored to the specific properties of the stream.

Streams currently exist for (de)serialization of data to or from standard I/O FILE streams, TCP/IP connections and UNIX files, and memory. “XDR Stream Implementation” documents the XDR object and how to make new XDR streams when they are required.

Standard I/O Streams

XDR streams can be interfaced to standard I/O using the xdrstdio_create() routine:

#include <stdio.h>
#include <rpc/rpc.h>    /* xdr streams part of rpc */

void
xdrstdio_create(XDR *xdrs, FILE *fp, enum xdr_op x_op);

The xdrstdio_create() routine initializes an XDR stream pointed to by xdrs. The XDR stream interfaces to the standard I/O library. Parameter fp is an open file, and x_op is an XDR direction.

Memory Streams

Memory streams allow the streaming of data into or out of a specified area of memory:

#include <rpc/rpc.h>

void
xdrmem_create(XDR *xdrs, char *addr, u_int len,
              enum xdr_op x_op);

The xdrmem_create() routine initializes an XDR stream in local memory. The memory is pointed to by parameter addr; len is the length in bytes of the memory. The parameters xdrs and x_op are identical to the corresponding parameters of xdrstdio_create(). Currently, the UDP/IP implementation of RPC uses xdrmem_create(). Complete call or result messages are built in memory before calling the sendto() system call.

Record (TCP/IP) Streams

A record stream is an XDR stream built on top of a record-marking standard that is built on top of the UNIX file or 4.3BSD connection interface.

#include <rpc/rpc.h>   /* xdr streams are part of the
                        * rpc library */

void
xdrrec_create(XDR *xdrs, u_int sendsize, u_int recvsize,
              void *iohandle,
              int (*readproc) (void *, void *, u_int),
              int (*writeproc) (void *, void *, u_int));

The routine xdrrec_create() provides an XDR stream interface that allows for bidirectional, arbitrarily long sequences of records. The contents of the records are meant to be data in XDR form. The stream's primary use is for interfacing RPC to TCP connections. However, it can be used to stream data into or out of normal UNIX files.

The parameter xdrs is similar to the corresponding parameter described above. The stream does its own data buffering similar to that of standard I/O. The parameters sendsize and recvsize determine the size in bytes of the output and input buffers, respectively; if their values are zero (0), predetermined defaults are used. When a buffer needs to be filled or flushed, the routine readproc() or writeproc() is called, respectively.

These routines are much like the read() and write() system calls. However, the first parameter to each routine is the opaque parameter iohandle. The other two parameters (buf and nbytes) and the results (byte count) are identical to the system routines.

If xxx is readproc() or writeproc(), it has this form:

/*
 * Returns the actual number of bytes transferred.
 * -1 is an error.
 */
int xxx(void *iohandle, void *buf, u_int nbytes);
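As a rough sketch of what such routines might look like, the following pair wraps read() and write() around a file descriptor. The struct fd_handle type and the names fd_readproc() and fd_writeproc() are hypothetical, chosen for this illustration; only the parameter order (handle, buffer, byte count) and the -1 error return come from the description above.

```c
#include <unistd.h>

/* Hypothetical handle type: whatever the caller passes as iohandle. */
struct fd_handle {
    int fd;     /* open descriptor for a file or connection */
};

/* Fill buf with up to nbytes bytes; returns bytes read, or -1 on error. */
int
fd_readproc(void *iohandle, void *buf, unsigned int nbytes)
{
    struct fd_handle *h = iohandle;

    return (int)read(h->fd, buf, nbytes);
}

/* Flush nbytes bytes from buf; returns bytes written, or -1 on error. */
int
fd_writeproc(void *iohandle, void *buf, unsigned int nbytes)
{
    struct fd_handle *h = iohandle;

    return (int)write(h->fd, buf, nbytes);
}
```

Under these assumptions, a pointer to a struct fd_handle would be passed as the iohandle argument of xdrrec_create(), with the two functions as readproc and writeproc.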

The XDR stream provides a means for delimiting records in the byte stream. The primitives specific to record streams are:

bool_t
xdrrec_endofrecord(XDR *xdrs, bool_t flushnow);

bool_t
xdrrec_skiprecord(XDR *xdrs);

bool_t
xdrrec_eof(XDR *xdrs);

(See “Advanced Topics” for the implementation details of delimiting records in a stream.)

The xdrrec_endofrecord() routine causes the current outgoing data to be marked as a record. If the parameter flushnow is TRUE, the stream's writeproc() will be called; otherwise, writeproc() will be called when the output buffer has been filled.

The xdrrec_skiprecord() routine causes an input stream's position to be moved past the current record boundary and onto the beginning of the next record in the stream.

If no data remains in the stream's input buffer, the xdrrec_eof() routine returns TRUE; that is, there is no more data in the underlying file descriptor.
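To make the idea of delimiting records more concrete, here is a minimal sketch of the fragment header used by the RPC record-marking standard. It assumes the layout given in that standard: a 4-byte big-endian word whose high-order bit flags the last fragment of a record and whose low-order 31 bits give the fragment length in bytes. The layout is not spelled out in this section, and the names frag_header_put() and frag_header_get() are hypothetical; they are not part of the XDR library.

```c
#include <stdint.h>

#define LAST_FRAG 0x80000000u   /* high-order bit: last fragment of record */

/* Store a fragment header in hdr[0..3], most significant byte first. */
void
frag_header_put(unsigned char hdr[4], uint32_t len, int is_last)
{
    uint32_t word = (len & ~LAST_FRAG) | (is_last ? LAST_FRAG : 0);

    hdr[0] = (unsigned char)(word >> 24);
    hdr[1] = (unsigned char)(word >> 16);
    hdr[2] = (unsigned char)(word >> 8);
    hdr[3] = (unsigned char)word;
}

/* Recover the fragment length and the last-fragment flag from a header. */
uint32_t
frag_header_get(const unsigned char hdr[4], int *is_last)
{
    uint32_t word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                    ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];

    *is_last = (word & LAST_FRAG) != 0;
    return word & ~LAST_FRAG;
}
```

In these terms, marking the end of a record amounts to writing a header with the last-fragment bit set in front of the buffered data, and skipping a record amounts to reading headers and discarding data until a header with that bit set has been passed.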

XDR Stream Implementation

This section provides the abstract data types needed to implement new instances of XDR streams.

The XDR Object

This structure defines the interface to an XDR stream:

enum xdr_op { XDR_ENCODE = 0, XDR_DECODE = 1, XDR_FREE = 2 };

typedef struct {
    enum xdr_op x_op;       /* operation; fast added param */
    struct xdr_ops {
        bool_t  (*x_getlong)();  /* get long from stream */
        bool_t  (*x_putlong)();  /* put long to stream */
        bool_t  (*x_getbytes)();  /* get bytes from stream */
        bool_t  (*x_putbytes)(); /* put bytes to stream */
        u_int   (*x_getpostn)(); /* return stream offset */
        bool_t  (*x_setpostn)(); /* reposition offset */
        caddr_t (*x_inline)();   /* ptr to buffered data */
        void    (*x_destroy)();  /* free private area */
    } *x_ops;
    caddr_t x_public;        /* users' data */
    caddr_t x_private;       /* pointer to private data */
    caddr_t x_base;          /* private for position info */
    int     x_handy;         /* extra private word */
} XDR;

The x_op field is the current operation being performed on the stream. This field is important to the XDR primitives but should not affect a stream's implementation. That is, a stream's implementation should not depend on this value.

The fields x_private, x_base, and x_handy are private to the particular stream's implementation. The field x_public is for the XDR client and should never be used by the XDR stream implementations or the XDR primitives.

The operations x_getpostn(), x_setpostn(), and x_destroy() are accessed through macros. The operation x_inline() takes two parameters: an XDR * and an unsigned integer, which is a byte count. The routine returns a pointer to a piece of the stream's internal buffer. The caller can then use the buffer segment for any purpose. From the stream's point of view, the bytes in the buffer segment have been consumed. The routine may return NULL if it cannot return a buffer segment of the requested size.


Note: The x_inline() routine is for cycle squeezers. Use of the resulting buffer is not data portable. Programmers should avoid using this feature.
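For stream implementors, the contract of x_inline() can be sketched against a simple memory buffer. The struct toy_mem type and the toy_inline() function below are hypothetical stand-ins; a real implementation would take an XDR * and keep its state in the private fields of that structure.

```c
#include <stddef.h>

/* Hypothetical memory-backed stream state, for illustration only. */
struct toy_mem {
    char        *next;      /* next unconsumed byte in the buffer */
    unsigned int remaining; /* bytes left between next and buffer end */
};

/*
 * Hand the caller a segment of nbytes bytes from the internal buffer
 * and mark those bytes as consumed. Returns NULL if fewer than nbytes
 * bytes remain.
 */
char *
toy_inline(struct toy_mem *tm, unsigned int nbytes)
{
    char *segment;

    if (nbytes > tm->remaining)
        return NULL;
    segment = tm->next;
    tm->next += nbytes;
    tm->remaining -= nbytes;
    return segment;
}
```

The caller reads or writes the returned segment directly, which is why the result is not portable: nothing in this path converts the bytes to or from the standard representation.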

The operations x_getbytes() and x_putbytes() blindly get and put sequences of bytes from or to the underlying stream; they return TRUE if they are successful, and FALSE otherwise. The routines have identical parameters (xxx stands for x_get or x_put):

bool_t xxxbytes(XDR *xdrs, char *buf, u_int bytecount);

The operations x_getlong() and x_putlong() receive and put long numbers from and to the data stream. It is the responsibility of these routines to translate the numbers between the machine representation and the (standard) external representation. The IRIX routines htonl() and ntohl() can be helpful in accomplishing this task. Appendix B, “XDR Protocol Specification”, defines the standard representation of numbers.

The higher-level XDR implementation assumes that signed and unsigned long integers contain the same number of bits and that nonnegative integers have the same bit representations as unsigned integers.

These routines return TRUE if they succeed, and FALSE otherwise. They have identical parameters:

bool_t xxxlong(XDR *xdrs, long *lp);
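The following sketch shows one way such a pair of routines might translate numbers with htonl() and ntohl() for a memory buffer. The struct toy_words type and the names toy_putlong() and toy_getlong() are hypothetical. The sketch also assumes a 32-bit quantity (int32_t) in place of long, because the external representation of an integer is four bytes and a C long may be wider on some machines.

```c
#include <arpa/inet.h>  /* htonl(), ntohl() */
#include <stdint.h>
#include <string.h>

/* Hypothetical memory-backed stream state, for illustration only. */
struct toy_words {
    unsigned char *next;      /* next unused byte in the buffer */
    unsigned int   remaining; /* bytes left in the buffer */
};

/* Append *lp in external (big-endian) form. Returns 1 on success, else 0. */
int
toy_putlong(struct toy_words *tw, const int32_t *lp)
{
    uint32_t wire = htonl((uint32_t)*lp);

    if (tw->remaining < sizeof wire)
        return 0;
    memcpy(tw->next, &wire, sizeof wire);
    tw->next += sizeof wire;
    tw->remaining -= sizeof wire;
    return 1;
}

/* Fetch the next value into *lp in local form. Returns 1 on success, else 0. */
int
toy_getlong(struct toy_words *tw, int32_t *lp)
{
    uint32_t wire;

    if (tw->remaining < sizeof wire)
        return 0;
    memcpy(&wire, tw->next, sizeof wire);
    *lp = (int32_t)ntohl(wire);
    tw->next += sizeof wire;
    tw->remaining -= sizeof wire;
    return 1;
}
```

Because the byte swapping is confined to these two routines, the primitives built on top of them never need to know the byte order of the local machine.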

Implementors of new XDR streams must make an XDR structure (with new operation routines) available to clients, using some kind of creation routine.

Advanced Topics

This section describes additional techniques for passing data structures; for example, linked lists (of arbitrary lengths). Unlike the simpler examples already presented in this chapter, the examples in this section are written using both the XDR C library routines and the XDR data description language.

Linked Lists

Example E (see “Pointers”) presented a C data structure and its associated XDR routines for an individual's gross assets and liabilities. The example is duplicated here:

struct gnumbers {
    long g_assets;
    long g_liabilities; 
};

bool_t
xdr_gnumbers(XDR *xdrs, struct gnumbers *gp)
{
    if (xdr_long(xdrs, &(gp->g_assets)))
        return (xdr_long(xdrs, &(gp->g_liabilities)));
    return (FALSE);
}

Now assume that you want to implement a linked list of such information. A data structure could be constructed as follows:

struct gnumbers_node {
    struct gnumbers gn_numbers;
    struct gnumbers_node *gn_next;
};

typedef struct gnumbers_node *gnumbers_list;

The head of the linked list can be thought of as the data object; that is, the head is not merely a convenient shorthand for a structure. Similarly, the gn_next field is used to indicate whether or not the object has terminated. Unfortunately, if the object continues, the gn_next field is also the address of where it continues. The link addresses do not carry any useful information when the object is serialized.

The XDR data description of this linked list is described by the recursive type declaration of gnumbers_list:

typedef union switch (boolean) {
    case TRUE: struct {
        struct gnumbers current_element;
        gnumbers_list rest_of_list;
    };
    case FALSE: struct {};
} gnumbers_list;

In this description, the boolean indicates whether there is more data following it. If the boolean is FALSE, then it is the last data field of the structure. If TRUE, it is followed by a gnumbers structure and (recursively) by a gnumbers_list (the rest of the object). Note that the C declaration has no boolean explicitly declared (although the gn_next field implicitly carries the information), while the XDR data description has no pointer explicitly declared.
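The byte layout implied by this description can be sketched without the XDR library. The code below is an illustration only: put_word() and list_to_wire() are hypothetical names, the fields are declared as int32_t rather than long so that each occupies exactly one 4-byte XDR word, and booleans are written as the integers 1 and 0, as the XDR standard encodes them.

```c
#include <arpa/inet.h>  /* htonl() */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct gnumbers {
    int32_t g_assets;
    int32_t g_liabilities;
};

struct gnumbers_node {
    struct gnumbers gn_numbers;
    struct gnumbers_node *gn_next;
};

/* Append one 32-bit word in big-endian order; returns the new offset. */
static size_t
put_word(unsigned char *buf, size_t off, uint32_t value)
{
    uint32_t wire = htonl(value);

    memcpy(buf + off, &wire, sizeof wire);
    return off + sizeof wire;
}

/*
 * Lay out a list as the union describes: TRUE plus one gnumbers for
 * each node, then a final FALSE. buf must hold 12 bytes per node plus
 * 4. Returns the number of bytes written.
 */
size_t
list_to_wire(const struct gnumbers_node *head, unsigned char *buf)
{
    const struct gnumbers_node *n;
    size_t off = 0;

    for (n = head; n != NULL; n = n->gn_next) {
        off = put_word(buf, off, 1);    /* TRUE: an element follows */
        off = put_word(buf, off, (uint32_t)n->gn_numbers.g_assets);
        off = put_word(buf, off, (uint32_t)n->gn_numbers.g_liabilities);
    }
    return put_word(buf, off, 0);       /* FALSE: end of list */
}
```

For a two-node list holding (10, 20) and (30, 40), this produces seven words, or 28 bytes: TRUE, 10, 20, TRUE, 30, 40, FALSE. No pointer value appears anywhere in the output.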

Hints for writing the XDR routines for a gnumbers_list follow easily from the XDR description above. Note how the primitive xdr_pointer() is used to implement the above XDR union:

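bool_t xdr_gnumbers_list(XDR *xdrs, gnumbers_list *gnp);

bool_t
xdr_gnumbers_node(XDR *xdrs, struct gnumbers_node *gn)
{
    /* XDR the data, then (recursively) the rest of the list */
    return(xdr_gnumbers(xdrs, &gn->gn_numbers) &&
           xdr_gnumbers_list(xdrs, &gn->gn_next));
}

bool_t
xdr_gnumbers_list(XDR *xdrs, gnumbers_list *gnp)
{
    /* xdr_pointer() XDRs the boolean, then the node if there is one */
    return(xdr_pointer(xdrs, (char **)gnp,
                       sizeof(struct gnumbers_node),
                       (xdrproc_t)xdr_gnumbers_node));
}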

The unfortunate side effect of XDRing a list with these routines is that the C stack grows linearly with respect to the number of nodes in the list due to the recursion. The following routine collapses the above two mutually recursive routines into a single, nonrecursive routine:

bool_t
xdr_gnumbers_list(XDR *xdrs, gnumbers_list *gnp)
{
    bool_t more_data;
    gnumbers_list *nextp = NULL;

    for (;;) {
        /* is there a node to XDR? (overwritten when decoding) */
        more_data = (*gnp != NULL);
        if (!xdr_bool(xdrs, &more_data)) {
            return(FALSE);
        }
        if (! more_data) {
            break;
        }
        /* when freeing, save the link before the node goes away */
        if (xdrs->x_op == XDR_FREE) {
            nextp = &(*gnp)->gn_next;
        }
        /* XDR the nonrecursive part of the node */
        if (!xdr_reference(xdrs, (char **)gnp,
                           sizeof(struct gnumbers_node),
                           (xdrproc_t)xdr_gnumbers)) {
            return(FALSE);
        }
        /* advance to the next link */
        gnp = (xdrs->x_op == XDR_FREE) ?
               nextp : &(*gnp)->gn_next;
    }
    *gnp = NULL;
    return(TRUE);
}

The first task is to find out whether there is more data so that the boolean information can be serialized. Notice that this statement is unnecessary in the XDR_DECODE case, since the value of more_data is not known until you deserialize it in the next statement.

The next statement XDR's the more_data field of the XDR union. If there isn't any more data, set this last pointer to NULL to indicate the end of the list, and return TRUE, because you are done. Note that setting the pointer to NULL is only important in the XDR_DECODE case, since it is already NULL in the XDR_ENCODE and XDR_FREE cases.

Next, if the direction is XDR_FREE, the value of nextp is set to indicate the location of the next pointer in the list. You set this value now because you need to dereference gnp to find the location of the next item in the list, and after the next statement, the storage pointed to by *gnp will be freed up and no longer valid. You can't save nextp in this way for all directions, though, because in the XDR_DECODE direction the value of *gnp won't be set until the next statement.

Next, XDR the data in the node using the xdr_reference() primitive. xdr_reference() is like xdr_pointer() (used earlier), but it does not send over the boolean indicating whether there is more data. Use it instead of xdr_pointer(), because you have already XDR'd this information.

Notice that the XDR routine passed is not the same type as an element in the list. The routine passed is xdr_gnumbers(), for XDR'ing gnumbers, but each element in the list is actually of type gnumbers_node. You don't pass xdr_gnumbers_node(), because it is recursive, but instead use xdr_gnumbers(), which XDR's all of the nonrecursive part. Note that this trick will work only if the gn_numbers field is the first item in each element, so that their addresses are identical when passed to xdr_reference().

Finally, update gnp to point to the next item in the list. If the direction is XDR_FREE, set it to the previously saved value; otherwise you can dereference gnp to get the proper value. Though harder to understand than the recursive version, this nonrecursive routine is less likely to overflow the C stack. It will also run more efficiently, since a lot of the procedure call overhead has been removed. Most lists are small, though (in the hundreds of items or fewer), and the recursive version should be sufficient for them.
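The pointer-to-pointer walk at the heart of the nonrecursive routine can be tried out in isolation. The sketch below covers only the decode direction and does not use the XDR library: get_word(), list_from_wire(), and list_free() are hypothetical names, the input is assumed to be a byte buffer of big-endian 4-byte words, and the fields are declared as int32_t rather than long. As in the routine above, gnp always holds the address of the link to be filled in next.

```c
#include <arpa/inet.h>  /* ntohl() */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct gnumbers {
    int32_t g_assets;
    int32_t g_liabilities;
};

struct gnumbers_node {
    struct gnumbers gn_numbers;
    struct gnumbers_node *gn_next;
};

/* Read one big-endian 32-bit word; returns 0 if the buffer is exhausted. */
static int
get_word(const unsigned char **p, const unsigned char *end, uint32_t *value)
{
    uint32_t wire;

    if ((size_t)(end - *p) < sizeof wire)
        return 0;
    memcpy(&wire, *p, sizeof wire);
    *p += sizeof wire;
    *value = ntohl(wire);
    return 1;
}

/* Free a list, saving each next pointer before its node is released. */
void
list_free(struct gnumbers_node *head)
{
    while (head != NULL) {
        struct gnumbers_node *next = head->gn_next;

        free(head);
        head = next;
    }
}

/*
 * Decode-only loop. Returns 1 on success. On failure, returns 0 and
 * leaves *headp holding a NULL-terminated partial list that the caller
 * should release with list_free().
 */
int
list_from_wire(const unsigned char *buf, size_t len,
               struct gnumbers_node **headp)
{
    const unsigned char *p = buf, *end = buf + len;
    struct gnumbers_node **gnp = headp;
    uint32_t more_data, assets, liabilities;

    *gnp = NULL;
    for (;;) {
        if (!get_word(&p, end, &more_data))
            return 0;
        if (!more_data)
            break;
        if (!get_word(&p, end, &assets) ||
            !get_word(&p, end, &liabilities))
            return 0;
        *gnp = malloc(sizeof **gnp);
        if (*gnp == NULL)
            return 0;
        (*gnp)->gn_numbers.g_assets = (int32_t)assets;
        (*gnp)->gn_numbers.g_liabilities = (int32_t)liabilities;
        (*gnp)->gn_next = NULL;
        gnp = &(*gnp)->gn_next;     /* advance to the next link */
    }
    *gnp = NULL;                    /* terminate the list */
    return 1;
}
```

The stack depth here stays constant no matter how long the list is, which is the property the nonrecursive XDR routine is after. The separate list_free() also shows, in its simplest form, why the next link must be saved before a node is released.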