Obtaining Function Arguments on AMD64

Now that you have experienced enough pain debugging on AMD64 platforms without arguments, you will be delighted to hear that there are options out there to help you!

The Studio 10 patch compilers (minimum patch number is 117846-03; use ube -V to verify) offer an option, -Wu,-save_args, on amd64 that saves INTEGER type function arguments passed via registers onto the stack. When this option is specified, up to 6 arguments are saved on the stack on function entry, and they are not modified throughout the life of the routine (the checkpoint effect we have all dreamed about). For example,
        void
        foo(int a1, int a2, int a3, int a4, int a5, int a6, int a7)
        {
        ...
        }
Disassembled code will look something like the following:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $0x30, %rsp                     **
        movq    %rdi, -0x8(%rbp)
        movq    %rsi, -0x10(%rbp)
        movq    %rdx, -0x18(%rbp)
        movq    %rcx, -0x20(%rbp)
        movq    %r8, -0x28(%rbp)
        movq    %r9, -0x30(%rbp)
        ...
**: The space being reserved is in addition to what the current function prolog already reserves.

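Stack layout after the prolog (highest address first):
        return PC       0x8(%rbp)
        %rbp            0x0(%rbp)
        %rdi            -0x8(%rbp)
        %rsi            -0x10(%rbp)
        %rdx            -0x18(%rbp)
        %rcx            -0x20(%rbp)
        %r8             -0x28(%rbp)
        %r9             -0x30(%rbp)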
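
With that layout, a debugger that knows the frame pointer can read the saved arguments straight out of the frame. The following is only a minimal sketch of the idea, not how any particular debugger does it: read_saved_args and MAX_SAVED_ARGS are hypothetical names, and the frame is passed in as a plain pointer to the saved %rbp slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the first 6 integer arguments travel in registers and get saved. */
#define MAX_SAVED_ARGS  6

/*
 * Hypothetical helper: copy up to nargs saved arguments into out[].
 * fp points at the saved %rbp slot, so argument k (0-based) sits at
 * fp[-(k + 1)], i.e. -0x8(%rbp) for %rdi, -0x10(%rbp) for %rsi, and so on.
 * Returns the number of arguments actually copied.
 */
static size_t
read_saved_args(const uint64_t *fp, size_t nargs, uint64_t *out)
{
        size_t k;

        assert(fp != NULL && out != NULL);

        if (nargs > MAX_SAVED_ARGS)
                nargs = MAX_SAVED_ARGS; /* arguments 7 and up are not saved here */

        for (k = 0; k < nargs; k++)
                out[k] = fp[-(ptrdiff_t)(k + 1)];

        return (nargs);
}
```

To try it without a real stack, fill an array in increasing address order (%r9 slot first, return PC last) and pass the address of the element that stands in for the saved %rbp.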

Nothing special is done for arguments beyond the first 6. If a function has an odd number of arguments, additional space is reserved on the stack to maintain 16-byte alignment. For example,
        argc == 0: no argument saving.
        argc == 3: save 3, but reserve space for 4 to maintain stack alignment.
        argc == 7: save 6.
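
The rule above boils down to a little arithmetic: cap the count at 6, round up to an even number of 8-byte slots, and multiply. Here is a rough sketch of that calculation; save_args_reserved_bytes is a made-up name for illustration and simply restates the rule in the text, it is not taken from the compiler.

```c
#include <assert.h>

#define MAX_SAVED_ARGS  6       /* integer arguments passed in registers */
#define SLOT_BYTES      8       /* each saved register takes one 8-byte slot */

/*
 * Hypothetical helper: extra stack bytes reserved for argument saving,
 * following the rule described in the text.
 */
static int
save_args_reserved_bytes(int argc)
{
        int saved, slots;

        assert(argc >= 0);

        /* Arguments beyond the first 6 are not saved. */
        saved = argc < MAX_SAVED_ARGS ? argc : MAX_SAVED_ARGS;

        /* Round up to an even slot count to keep 16-byte alignment. */
        slots = (saved + 1) & ~1;

        return (slots * SLOT_BYTES);
}
```

For 6 arguments this gives 48 bytes, which matches the subq $0x30, %rsp in the disassembly above.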
The -save_args flag has no direct association with the optimization level. In other words, you can use any optimization level along with -save_args.

A new Dwarf attribute has been introduced to indicate if a function has been compiled with -save_args:
        DW_AT_SUN_amd64_parmdump        = 0x2224
The attribute has a value of 1 or 0, and it is only added when the value is 1. It is attached to the DW_TAG_subprogram tag.

You might wonder about the following:
  • How does the extra argument saving affect performance?

    With a call stack 20 deep of small functions, each taking 6 arguments (to cause maximum argument saving), the extra saving costs 18 nanoseconds, around a 10% hit.
            #define FUNC(i, j) \
                    static int      \
                    func##i(int i1, int i2, int i3, int i4, int i5, int i6) \
                    {                                                       \
                            i3 = i1 + i2;                                   \
                            i4 = i2 + i3;                                   \
                            i5 = i3 + i4;                                   \
                            i6 = func##j(i1, i2, i3, i4, i5, i6);           \
                            return (i3 + i4 + i5 + i6);                     \
                    }
        
    This is with a hot cache, where the first store to the stack won't suffer a page fault. Since real functions do something more complicated, the actual hit should be much smaller. If it turns out that the -save_args option does affect the performance of your particular application, you can always turn it off in production code.

  • Why was it implemented as callee-saved instead of caller-saved?

    • Smaller code size when functions are called by many callers.
    • Avoids useless argument saving when calling assembly functions.
    • Can be enabled only on the module that's being debugged.


  • So what does the output look like?

    Ha, I thought you would never ask!
    
    stack pointer for thread fffffe8123debe80: fffffe80006296c0
    [ fffffe80006296c0 unix`_resume_from_idle+0xde() ]
      fffffe8000629700 unix`swtch+0x241()
      fffffe8000629730 genunix`cv_wait+0x83(ffffffff82a44ed8, ffffffff82a44ed0)
      fffffe80006297a0 ufs`ufs_check_lockfs+0x14c(ffffffff82a44e00, ffffffff82a44eb0, 80000030)
      fffffe8000629800 ufs`ufs_lockfs_begin+0x14e(ffffffff82a44e00, fffffe8000629840, 80000030)
      fffffe8000629920 ufs`ufs_readlink+0x7e(ffffffff90377300, fffffe8000629980, ffffffff832e9428)
      fffffe8000629950 genunix`fop_readlink+0x24(ffffffff90377300, fffffe8000629980, ffffffff832e9428)
      fffffe80006299d0 genunix`pn_getsymlink+0x66(ffffffff90377300, fffffe8000629b20, ffffffff832e9428)
      fffffe8000629bc0 genunix`lookuppnvp+0x3f5(fffffe8000629ca0, 0, 1, 0, fffffe8000629e10, ffffffff8c907b80)
      fffffe8000629c60 genunix`lookuppnat+0x13e(fffffe8000629ca0, 0, 1, 0, fffffe8000629e10, 0)
      fffffe8000629d40 genunix`lookupnameat+0x88(805bd38, 0, 1, 0, fffffe8000629e10, 0)
      fffffe8000629dd0 genunix`cstatat_getvp+0x17d(ffd19553, 805bd38, 1, 1, fffffe8000629e10, fffffe8000629e18)
      fffffe8000629e60 genunix`cstatat32+0x68(ffd19553, 805bd38, 1, fcfdbef8, 0, 10)
      fffffe8000629e80 genunix`stat32+0x33(805bd38, fcfdbef8)
      fffffe8000629eb0 genunix`xstat32+0x26(2, 805bd38, fcfdbef8)
      fffffe8000629f00 unix`sys_syscall32+0x1ff()
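
Going back to the performance question for a moment: the FUNC macro shown earlier is only the building block, and the harness around it isn't shown. One way it might be instantiated is sketched below. Everything beyond the macro body is an assumption: the chain is only 4 deep rather than 20 to keep it short, func4 is a made-up terminal function, and run_chain is a hypothetical driver.

```c
#include <assert.h>

/* Hypothetical terminal function that ends the call chain. */
static int
func4(int i1, int i2, int i3, int i4, int i5, int i6)
{
        return (i1 + i2 + i3 + i4 + i5 + i6);
}

/* Same macro body as in the text: func<i> does a little work, then calls func<j>. */
#define FUNC(i, j) \
        static int      \
        func##i(int i1, int i2, int i3, int i4, int i5, int i6) \
        {                                                       \
                i3 = i1 + i2;                                   \
                i4 = i2 + i3;                                   \
                i5 = i3 + i4;                                   \
                i6 = func##j(i1, i2, i3, i4, i5, i6);           \
                return (i3 + i4 + i5 + i6);                     \
        }

/* Each callee must be defined before its caller, so build from the bottom up. */
FUNC(3, 4)
FUNC(2, 3)
FUNC(1, 2)
FUNC(0, 1)

/* Hypothetical driver: run the whole chain repeatedly and accumulate the result. */
static long
run_chain(int iterations)
{
        long sum = 0;
        int n;

        assert(iterations > 0);

        for (n = 0; n < iterations; n++)
                sum += func0(1, 2, 0, 0, 0, 0);

        return (sum);
}
```

Timing run_chain once in a build with -Wu,-save_args and once without would give a per-chain comparison of the kind quoted above. Keep in mind that an optimizing compiler may inline small static functions like these, so check the disassembly before trusting the numbers.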
    
        