Thursday Feb 07, 2008

Page size and memory layout

Support for large pages has been available since Solaris 9, and I've previously talked about the various ways that an application can be coaxed into using large pages. However, I wanted to quickly write up how the large pages are laid out in memory. Take the following code, which allocates a large chunk of memory and then iterates over it for long enough to run pmap -xs on it:

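#include <stdlib.h>

#define SIZE   300000000
/* Assumed value: the original loop bound was lost. Pick enough
   passes to keep the process alive while pmap is run. */
#define PASSES 1000

int main(void)
{
  int x, y;
  char *c;
  c = (char *)malloc(sizeof(char) * SIZE);
  if (c == NULL) { return 1; }   /* allocation failed */
  /* Touch every byte repeatedly so that the whole heap stays resident. */
  for (y = 0; y < PASSES; y++)
    for (x = 0; x < SIZE; x++) { c[x] = c[x] + y; }
  free(c);
  return 0;
}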

Compiling this code to use 4MB pages and then running the resulting executable produces a pmap output like:

% cc -xpagesize=4M t.c
% a.out&
[1] 15501
% pmap -xs 15501
15501:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00022000    3960    3960    3960       -   8K rwx--    [ heap ]
00400000  290816  290816  290816       -   4M rwx--    [ heap ]
...

Notice that the heap starts on 8KB pages and keeps using them until the address reaches a 4MB boundary (00400000), after which it switches to 4MB pages. In this case nearly 4MB of the heap (3960KB) is not using 4MB pages. If this happens to be where the majority of the program's active data resides, then there will still be plenty of TLB misses.

Fortunately, it is possible to tell the linker where to start the heap. There are some mapfiles provided in /usr/lib/ld/ for various scenarios; the one that we need is map.bssalign. Recompiling with this mapfile produces the following memory layout:

% cc -M /usr/lib/ld/map.bssalign -xpagesize=4M t.c
% a.out&
[1] 19077
% pmap -xs 19077
19077:  a.out
 Address  Kbytes     RSS    Anon  Locked Pgsz Mode   Mapped File
00010000       8       8       -       -   8K r-x--  a.out
00020000       8       8       8       -   8K rwx--  a.out
00400000  294912  294912  294912       -   4M rwx--    [ heap ]

With this change the heap now starts on a 4MB boundary and is entirely mapped with 4MB pages.

Wednesday Apr 18, 2007

Using large DTLB page sizes

The TLB (Translation Lookaside Buffer) is a structure on the chip that handles the mapping of virtual memory addresses (used by the application) into physical memory addresses (used by the hardware). It is a list of such mappings; each mapping covers a range of memory, and the size of that range is called the page size. The default on SPARC is an 8KB page size, but it can be configured up to impressively large sizes (e.g. 256MB for UltraSPARC T1). The command to display the page sizes that the hardware supports is pagesize:

pagesize -a

If the application requests a virtual to physical translation that is not mapped in the TLB, then there's a TLB miss. On UltraSPARC III/IV the process of fetching a TLB entry takes about a hundred cycles.

Using a larger page size will reduce the number of TLB misses, because each TLB entry then maps more memory. Of course a large page size requires a large chunk of contiguous physical memory, and it's not always possible to get this.

An application can request large pages in one of three ways:

  • Using the ppgsz command to set the preferred page sizes.
  • Using the compiler flag -xpagesize= to set the preferred page size at compile time.
  • Preloading the mpss.so.1 library and using the MPSSHEAP and MPSSSTACK environment variables to describe the page size.

When an application is running it is possible to inspect the page sizes of the allocated memory using the command:

pmap -xs <pid>

About

Darryl Gove is a senior engineer in the Solaris Studio team, working on optimising applications and benchmarks for current and future processors. He is also the author of the books:
Multicore Application Programming
Solaris Application Programming
The Developer's Edge
