Friday Feb 01, 2008

The meaning of -xmemalign

I made some comments on a thread on the forums about memory alignment on SPARC and the -xmemalign flag. I've talked about memory alignment before, but this time the discussion was more about how the flag works. In brief:

  • The flag has two parts: -xmemalign=[1|2|4|8][i|s]
  • The number specifies the alignment that the compiler should assume when compiling an object file. If the compiler cannot be certain that a variable is correctly aligned (say it is accessed through a pointer), it assumes the alignment given by the flag. Take a single-precision floating-point value, which occupies four bytes. Under -xmemalign=1[i|s] the compiler assumes that the value may be unaligned, so it issues four single-byte loads to load it. Under -xmemalign=2[i|s] the compiler assumes two-byte alignment, so it issues two two-byte loads to fetch the four-byte value.
  • The suffix [i|s] tells the compiler how to behave if there is a misaligned access. For 32-bit code the default is i, which fixes the misaligned access and continues. For 64-bit code the default is s, which causes the application to die with a SIGBUS error. This is the part of the flag that has to be specified at link time, because it causes different code to be linked into the binary depending on the desired behaviour. The C documentation captures this correctly, but the C++ and Fortran docs will be updated.
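To make the effect of the number concrete, here is a rough C sketch of the load sequences described above. It is an illustration only, not the code the compiler actually emits: the function names are made up for this example, a 32-bit integer bit pattern stands in for the four-byte float, and the pieces are combined most-significant first because SPARC is big-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 1-byte alignment (as with -xmemalign=1): four single-byte loads,
   combined most-significant byte first, as on big-endian SPARC. */
uint32_t load32_assume_align1(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Assumed 2-byte alignment (as with -xmemalign=2): two halfword loads,
   combined most-significant halfword first. */
uint32_t load32_assume_align2(const uint16_t *p)
{
    assert(((uintptr_t)p & 1u) == 0);   /* caller must honour the assumption */
    return ((uint32_t)p[0] << 16) | (uint32_t)p[1];
}
```

The trade-off is visible even in the sketch: a lower assumed alignment never traps, but every access of uncertain alignment costs more instructions.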

Tuesday Jun 12, 2007

Identifying misaligned loads in 32-bit code using dtrace

A previous blog entry talks about handling and detecting misaligned memory accesses. For 64-bit code this is easy to achieve using the Performance Analyzer; for 32-bit code the analysis is a bit more tricky. Fortunately it is possible to do the 32-bit analysis with dtrace.

Consider the following program, which has a misaligned memory access. In its default mode (since Sun Studio 9) the compiler builds the binary so that a misaligned access traps, is fixed up, and execution continues:

% more align.c
int main()
{
  volatile char a[10];
  int i;
  /* &a[1] is not 4-byte aligned, so each int load and store is misaligned */
  for (i=0; i<100000000; i++) {(*(int*)(&a[1]))++;}
  return 0;
}
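Why the access is misaligned can be checked directly. The sketch below uses a made-up helper, is_aligned, and assumes a flat address space in which converting a pointer to uintptr_t yields the byte address (true on SPARC and most other platforms, but not guaranteed by the C standard).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of align bytes, else 0. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p % align) == 0;
}
```

If a[0] sits on a four-byte boundary, is_aligned(&a[1], sizeof(int)) returns 0, because the address is one byte past that boundary.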

The following dtrace script instruments the misaligned data access trap handler and reports all the pids that trigger it:

% more tr.d
fbt::do_unaligned:entry
{
  @p[pid]=count();
}

It can be run with:

% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C


    14873           260932

The script returns the pid which is having misalignment issues. This information is useful, in that it is trivial to recompile the binary with a different setting for -xmemalign and avoid the behaviour. But it would be very useful to know where the traps are occurring in the binary. Perhaps most of the traps happen in only one place, and that place can be fixed in the source. Aggregating on the user stack rather than the pid gives this information:

% more tr.d
fbt::do_unaligned:entry
{
  @[ustack()]=count();
}

This script produces output that identifies the locations in the binary where the traps are being generated. For the simple test code there are two locations: the load and the store.

% sudo dtrace -s tr.d
dtrace: script 'tr.d' matched 1 probe
^C
              align`main+0x10
              align`_start+0x108
           130466

              align`main+0x18
              align`_start+0x108
           130466

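The disassembly for the loop is as follows. Since main starts at 0x10b80, main+0x10 is the load at 0x10b90 and main+0x18 is the store at 0x10b98: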

main()
        10b80:  9d e3 bf 90  save       %sp, -112, %sp
...
        10b90:  d0 07 bf f7  ld         [%fp - 9], %o0  <<<<<< misaligned
        10b94:  90 02 20 01  inc        %o0
        10b98:  d0 27 bf f7  st         %o0, [%fp - 9]  <<<<<< misaligned
        10b9c:  ba 07 60 01  inc        %i5
        10ba0:  80 a7 40 09  cmp        %i5, %o1
        10ba4:  06 bf ff fb  bl         main+0x10       ! 0x10b90
        10ba8:  01 00 00 00  nop
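
Once the load and the store have been traced back to the source, one way to fix that place is to stop dereferencing the misaligned pointer and copy the bytes instead. The following is a minimal sketch, not the only possible fix; increment_unaligned_int is a name made up for this example, and how the copy is turned into loads and stores is left to the compiler.

```c
#include <string.h>

/* Hypothetical helper: increment an int stored at a possibly misaligned
   address without ever performing a misaligned int load or store. */
void increment_unaligned_int(void *where)
{
    int v;
    memcpy(&v, where, sizeof v);   /* copy the bytes into an aligned int */
    v++;
    memcpy(where, &v, sizeof v);   /* copy the updated bytes back */
}
```

In the test program the loop body would become increment_unaligned_int((void *)&a[1]), with the cast needed only because a is declared volatile there.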
About

Darryl Gove is a senior engineer in the Solaris Studio team, working on optimising applications and benchmarks for current and future processors. He is also the author of the books:
Multicore Application Programming
Solaris Application Programming
The Developer's Edge
