Chapter 2. Using Multiple Page Sizes

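This chapter describes how to use the multiple page size support provided by the IRIX kernel to improve the performance of an application. After an introduction, it covers the user interface to multiple page sizes, recommended page sizes, tunable parameters, and caveats.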

Introduction

The IRIX operating system maps the virtual memory of a process into physical memory in chunks called pages. Whenever a process accesses its address space, the virtual memory address is translated to a physical memory address by the processor. The recently used translations are cached in a table inside the processor called the translation lookaside buffer (TLB).

Each TLB entry maps a page. The number of TLB entries for a processor is limited. If a translation is not found in the TLB, the processor raises a TLBMISS exception to the software. The number of TLBMISS exceptions a process incurs depends upon its working set.

The working set is the range of address space the process needs to run. If the working set is large or if the process has poor locality of reference, the process incurs more TLBMISS exceptions. Each TLBMISS exception has a small overhead, but if a process incurs many of them, the accumulated overhead can significantly affect its performance.

Tools such as perfex(1) can be used to measure the number of TLB misses a process incurs during its run. The range of memory that can be mapped by a TLB depends on the page size. By increasing the page size, a larger range of memory can be mapped by the TLB. This results in a reduction in TLB misses and improves the performance of an application.
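The relationship between page size and the range of memory a TLB can map is simple arithmetic. The following sketch computes that range (often called the TLB reach) from the number of entries and the page size. The helper name tlb_reach_bytes and its parameters are hypothetical and are not part of any IRIX interface; the entry count and pages per entry must be taken from the documentation for your processor.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the number of bytes that a fully
 * populated TLB can map.
 *
 *   entries          number of TLB entries (processor specific)
 *   pages_per_entry  pages mapped by one entry (1 on many processors;
 *                    assumed to be 2 on processors whose entries map
 *                    a pair of adjacent pages)
 *   page_size        page size in bytes
 */
unsigned long long
tlb_reach_bytes(unsigned entries, unsigned pages_per_entry, size_t page_size)
{
        return (unsigned long long)entries * pages_per_entry * page_size;
}
```

For example, assuming 64 entries that each map two pages, tlb_reach_bytes(64, 2, 16384) gives 2 MB, while the same TLB with 64 KB pages reaches 8 MB.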

User Interface to Multiple Page Sizes

The policy module (PM) interface can be used to set a page size for an address range in the address space of a process. For more information on policy modules, see Chapter 1, “Using Memory Management Policy Modules”. The following example illustrates how to set a page size for part of the address space of a process. The program sets a 64 KB page size for its text and for a buffer that it allocates in its BSS (the segment the kernel allocates for uninitialized data, historically called bss for “block started by symbol”). The program is as follows:


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/pmo.h>

#define PAGE_SIZE       65536

#define BUFSIZE         (6*PAGE_SIZE)

char    buf[BUFSIZE];

/*
 * Policy names are passed to pm_create() as strings.
 */
policy_set_t policy = {
        "PlacementDefault", (void *)1,
        "FallbackLargepage", NULL,
        "ReplicationDefault", NULL,
        "MigrationDefault", NULL,
        "PagingDefault", NULL,
        PAGE_SIZE
};


/*
 * Creates a PM with a particular page size and attaches it to a specific
 * address range.
 */

int
set_page_size(int size, char *vaddr, int len)
{
        pmo_handle_t    pm;

        /*
         * Set the page size.
         */
        policy.page_size = size;

        /*
         * Create a PM.
         */

        pm = pm_create( &policy);

        if ( pm < 0) {
                perror("pm_create");
                return -1;
        }

        /*
         * Attach the PM to the virtual address range.
         */
        if (pm_attach(pm, vaddr, len) < 0) {
                perror("pm_attach");
                return -1;
        }
        return 0;
}


int
main(void)
{
        extern  int     _ftext[];
        extern  int     etext[];
        int     len;
        char    *ftext;

        /*
         * Compute text start and length. The start address is
         * rounded down to a 16 KB (0x4000) boundary.
         */
        ftext = (char *)_ftext;
        ftext =  (char *)((long)ftext & (~(0x4000 -1)));
        len = ((char *)etext - ftext);

        /*
         * Set the page size as 64K for the process text and
         * the buffer buf.
         */

        if (set_page_size(PAGE_SIZE, ftext, len) == -1) {
                exit(1);
        }

        if (set_page_size(PAGE_SIZE, buf, sizeof(buf)) == -1) {
                exit(1);
        }
        return 0;
}

Recommended Page Sizes

The supported page sizes depend on the base page size of the system. The base page size can be obtained by using the getpagesize(2) system call. Currently, IRIX supports two base page sizes, 16 KB and 4 KB.

On systems with a 16 KB base page size, the following page sizes are supported: 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, and 16 MB.

On systems with a 4 KB base page size, the following page sizes are supported: 4 KB, 16 KB, 256 KB, 1 MB, 4 MB, and 16 MB.

In general, for most applications, 4 KB, 16 KB, and 64 KB page sizes are sufficient to eliminate TLBMISS overhead.
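Because the usable page sizes depend on the base page size, a program may want to query it before choosing a size to request. The following is a minimal sketch that uses getpagesize() for a quick sanity check. The helper name plausible_page_size is hypothetical, and the check is only an assumption about what makes a request reasonable; it does not replace consulting the lists of supported sizes above, and it is not the validation the kernel performs.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the requested page size is no
 * smaller than the base page size and is a whole multiple of it,
 * otherwise returns 0.
 */
int
plausible_page_size(size_t requested)
{
        size_t base = (size_t)getpagesize();

        return requested >= base && requested % base == 0;
}
```

A caller could reject a request early when the helper returns 0, and otherwise go on to check the size against the list for its base page size.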

Tunable Parameters

To adjust how the system provides large pages, set the following parameters:

Coalescing Parameters

The IRIX kernel attempts to keep a percentage of total free memory in the system at a certain page size. It periodically attempts to coalesce a chunk of adjacent pages to form a larger page. The following tunable parameters specify the upper limit for the number of free pages at a particular page size. If your system does not need large page sizes, you can set these tunable parameters to zero. The tunable parameters are as follows:

  • percent_totalmem_16k_pages

  • percent_totalmem_64k_pages

  • percent_totalmem_256k_pages

  • percent_totalmem_1m_pages

  • percent_totalmem_4m_pages

  • percent_totalmem_16m_pages

These parameters specify, as a percentage of total memory, an upper limit on the number of free pages of a given page size. For example, setting the percent_totalmem_64k_pages parameter to 20 means that the coalescing mechanism tries to limit the number of free 64 KB pages to 20% of the total memory in the system. The default value for all these parameters is zero.

These tunable parameters can be changed dynamically at run time. Note, however, that very large pages (1 MB or larger) are harder to coalesce at run time on a busy system, so it is recommended that the parameters for those sizes be set at boot time. Setting these tunable parameters to a high value can result in high coalescing activity. If the system runs low on memory, the large pages can be split into smaller pages as needed.
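To get a feel for what a given percentage means in pages, the limit can be worked out by hand. The following sketch does that arithmetic. The helper name max_free_pages is hypothetical, the 1 GB memory size used in the example is an assumption, and the rounding the kernel actually applies is not documented here, so treat the result as an estimate.

```c
/*
 * Hypothetical helper: estimates the upper limit on the number of
 * free pages of a given size for a percent_totalmem_*_pages setting.
 *
 *   total_mem_bytes  total physical memory in bytes
 *   percent          value of the tunable parameter (0 to 100)
 *   page_size        page size in bytes
 *
 * The result is rounded down to a whole number of pages.
 */
unsigned long long
max_free_pages(unsigned long long total_mem_bytes, unsigned percent,
               unsigned long long page_size)
{
        unsigned long long limit_bytes = total_mem_bytes * percent / 100;

        return limit_bytes / page_size;
}
```

On a system assumed to have 1 GB of memory, max_free_pages(1073741824ULL, 20, 65536) yields 3276, that is, roughly 205 MB held as free 64 KB pages.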

Reserving Large Pages

As mentioned earlier, very large pages (1 MB or larger) are hard to coalesce at run time because physical memory becomes fragmented. For applications that need such pages, tunable parameters can be set to reserve large pages at boot time. Each parameter is specified as a number of pages. The tunable parameters are as follows:

  • nlpages_64k

  • nlpages_256k

  • nlpages_1m

  • nlpages_4m

  • nlpages_16m

For example, setting nlpages_4m to 4 results in the system reserving four 4 MB pages at boot time. If the system runs low on memory, the reserved pages can be split into smaller pages for use by other applications. You can use the osview(1) command to view the number of free pages available at a particular page size. The default value for all these parameters is zero.
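Because each nlpages parameter is a page count, the memory set aside at boot time is the sum of each count multiplied by its page size. The following sketch adds that up so the cost of a planned configuration can be checked in advance. The array layout and the helper name reserved_memory_bytes are hypothetical; only the five page sizes come from the parameter list above.

```c
#define NUM_RESERVED_SIZES 5

/* Page sizes in bytes for nlpages_64k, _256k, _1m, _4m, and _16m. */
static const unsigned long long reserved_page_bytes[NUM_RESERVED_SIZES] = {
        64ULL * 1024,
        256ULL * 1024,
        1024ULL * 1024,
        4ULL * 1024 * 1024,
        16ULL * 1024 * 1024
};

/*
 * Hypothetical helper: returns the total number of bytes reserved for
 * a set of nlpages values, given in the same order as the table above.
 */
unsigned long long
reserved_memory_bytes(const unsigned long long nlpages[NUM_RESERVED_SIZES])
{
        unsigned long long total = 0;
        int i;

        for (i = 0; i < NUM_RESERVED_SIZES; i++)
                total += nlpages[i] * reserved_page_bytes[i];
        return total;
}
```

With nlpages_4m set to 4 and every other count left at zero, the helper returns 16777216, that is, 16 MB reserved.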

Caveats

If the kernel fails to allocate a large page for the process, it uses a page of the smallest page size instead. The same is true if the virtual address range is smaller than the page size. For the best performance, the starting virtual address should be aligned on a 2*page_size boundary and the length of the range should be a multiple of 2*page_size. This is mostly due to limitations of the R4000 and R10000 processors, in which each TLB entry maps a pair of adjacent pages.
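The alignment recommendation can be applied mechanically before a range is attached to a policy module. The following sketch trims an arbitrary range to the largest subrange that starts on a 2*page_size boundary and whose length is a multiple of 2*page_size. The type aligned_range_t and the helper align_for_large_pages are hypothetical names, and the sketch assumes that page_size is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
        uintptr_t start;        /* aligned start address */
        size_t    len;          /* aligned length in bytes, possibly 0 */
} aligned_range_t;

/*
 * Hypothetical helper: rounds the start of [addr, addr + len) up and
 * the end down to a multiple of 2*page_size. page_size is assumed to
 * be a power of two. If no aligned subrange fits, len is set to 0.
 */
aligned_range_t
align_for_large_pages(uintptr_t addr, size_t len, size_t page_size)
{
        uintptr_t unit = (uintptr_t)2 * page_size;
        uintptr_t start = (addr + unit - 1) & ~(unit - 1);
        uintptr_t end = (addr + len) & ~(unit - 1);
        aligned_range_t r;

        r.start = start;
        r.len = (end > start) ? (size_t)(end - start) : 0;
        return r;
}
```

For a 384 KB buffer at address 0x10004000 and a 64 KB page size, the helper returns a range that starts at 0x10020000 and is 256 KB long. A buffer shorter than 2*page_size yields a length of 0, which matches the caveat that such ranges do not benefit from the larger page size.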