Monday Jan 24, 2011

8 gdb tricks you should know

Despite its age, gdb remains an amazingly versatile and flexible tool, and mastering it can save you huge amounts of time when trying to debug problems in your code. In this post, I'll share 8 tips and tricks for using gdb to debug more efficiently.

I'll be using the Linux kernel for examples throughout this post, not because these examples are necessarily realistic, but because it's a large C codebase that I know and that anyone can download and take a look at. Don't worry if you aren't familiar with Linux's source in particular -- the details of the examples won't matter too much.

  1. break WHERE if COND

    If you've ever used gdb, you almost certainly know about the "break" command, which lets you break at some specified point in the debugged program.

    But did you know that you can set conditional breakpoints? If you add if CONDITION to a breakpoint command, you can include an expression to be evaluated whenever the program reaches that point, and the program will only be stopped if the condition is fulfilled. Suppose I was debugging the Linux kernel and wanted to stop whenever init got scheduled. I could do:

    (gdb) break context_switch if next == init_task
    

    Note that the condition is evaluated by gdb, not by the debugged program, so you still pay the cost of the target stopping and switching to gdb every time the breakpoint is hit. As such, conditional breakpoints slow the target down in proportion to how often the target location is hit, not how often the condition is met.

  2. command

    In addition to conditional breakpoints, the command command (an abbreviation of gdb's commands) lets you specify commands to be run every time you hit a breakpoint. This can be used for a number of things, but one of the most basic is to augment points in a program to include debug output, without having to recompile and restart the program. I could get a minimal log of every mmap() operation performed on a system using:

    (gdb) b do_mmap_pgoff 
    Breakpoint 1 at 0xffffffff8111a441: file mm/mmap.c, line 940.
    (gdb) command 1
    Type commands for when breakpoint 1 is hit, one per line.
    End with a line saying just "end".
    >print addr
    >print len
    >print prot
    >end
    (gdb)
    
  3. gdb --args

    This one is simple, but a huge timesaver if you didn't know it. If you just want to start a program under gdb, passing some arguments on the command line, you can just build your command-line like usual, and then put "gdb --args" in front to launch gdb with the target program and the argument list both set:

    [~]$ gdb --args pizzamaker --deep-dish --toppings=pepperoni
    ...
    (gdb) show args
    Argument list to give program being debugged when it is started is
      " --deep-dish --toppings=pepperoni".
    (gdb) b main
    Breakpoint 1 at 0x45467c: file oven.c, line 123.
    (gdb) run
    ...
    

    I find this especially useful if I want to debug a project that has some arcane wrapper script that assembles lots of environment variables and possibly arguments before launching the actual binary (I'm looking at you, libtool). Instead of trying to replicate all that state and then launch gdb, simply make a copy of the wrapper, find the final "exec" call or similar, and add "gdb --args" in front.

  4. Finding source files

    I run Ubuntu, so I can download debug symbols for most of the packages on my system from ddebs.ubuntu.com, and I can get source using apt-get source. But how do I tell gdb to put the two together? If the debug symbols include relative paths, I can use gdb's directory command to add the source directory to my source path:

    [~/src]$ apt-get source coreutils
    [~/src]$ sudo apt-get install coreutils-dbgsym
    [~/src]$ gdb /bin/ls
    GNU gdb (GDB) 7.1-ubuntu
    (gdb) list main
    1192    ls.c: No such file or directory.
        in ls.c
    (gdb) directory ~/src/coreutils-7.4/src/
    Source directories searched: /home/nelhage/src/coreutils-7.4:$cdir:$cwd
    (gdb) list main
    1192        }
    1193    }
    1194    
    1195    int
    1196    main (int argc, char **argv)
    1197    {
    1198      int i;
    1199      struct pending *thispend;
    1200      int n_files;
    1201
    

    Sometimes, however, debug symbols end up with absolute paths, such as the kernel's. In that case, I can use set substitute-path to tell gdb how to translate paths:

    [~/src]$ apt-get source linux-image-2.6.32-25-generic
    [~/src]$ sudo apt-get install linux-image-2.6.32-25-generic-dbgsym
    [~/src]$ gdb /usr/lib/debug/boot/vmlinux-2.6.32-25-generic 
    (gdb) list schedule
    5519    /build/buildd/linux-2.6.32/kernel/sched.c: No such file or directory.
        in /build/buildd/linux-2.6.32/kernel/sched.c
    (gdb) set substitute-path /build/buildd/linux-2.6.32 /home/nelhage/src/linux-2.6.32/
    (gdb) list schedule
    5519    
    5520    static void put_prev_task(struct rq *rq, struct task_struct *p)
    5521    {
    5522        u64 runtime = p->se.sum_exec_runtime - p->se.prev_sum_exec_runtime;
    5523    
    5524        update_avg(&p->se.avg_running, runtime);
    5525    
    5526        if (p->state == TASK_RUNNING) {
    5527            /*
    5528             * In order to avoid avg_overlap growing stale when we are
    
  5. Debugging macros

    One of the standard reasons almost everyone will tell you to prefer inline functions over macros is that debuggers tend to be better at dealing with inline functions. And in fact, by default, gdb doesn't know anything at all about macros, even when your project was built with debug symbols:

    (gdb) p GFP_ATOMIC
    No symbol "GFP_ATOMIC" in current context.
    (gdb) p task_is_stopped(&init_task)
    No symbol "task_is_stopped" in current context.
    

    However, if you're willing to tell GCC to generate debug symbols specifically optimized for gdb, using -ggdb3, it can preserve this information:

    $ make KCFLAGS=-ggdb3
    ...
    (gdb) break schedule
    (gdb) continue
    (gdb) p/x GFP_ATOMIC
    $1 = 0x20
    (gdb) p task_is_stopped_or_traced(init_task)
    $2 = 0
    

    You can also use the macro and info macro commands to work with macros from inside your gdb session:

    (gdb) macro expand task_is_stopped_or_traced(init_task)
    expands to: ((init_task->state & (4 | 8)) != 0)
    (gdb) info macro task_is_stopped_or_traced
    Defined at include/linux/sched.h:218
      included at include/linux/nmi.h:7
      included at kernel/sched.c:31
    #define task_is_stopped_or_traced(task) ((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
    

    Note that gdb actually knows in which contexts macros are and aren't visible, so when you have the program stopped inside some function, you can only access macros visible at that point. (The "included at" lines above show exactly through which path the macro is visible.)

  6. gdb variables

    Whenever you print a variable in gdb, it prints this weird $NN = before it in the output:

    (gdb) p 5+5
    $1 = 10
    

    This is actually a gdb variable that you can use to refer to that same value any time later in your session:

    (gdb) p $1
    $2 = 10
    

    You can also assign your own variables for convenience, using set:

    (gdb) set $foo = 4
    (gdb) p $foo
    $3 = 4
    

    This can be useful to grab a reference to some complex expression or similar that you'll be referencing many times, or, for example, for simplicity in writing a conditional breakpoint (see tip 1).

  7. Register variables

    In addition to the numeric variables, and any variables you define, gdb exposes your machine's registers as pseudo-variables, including some cross-architecture aliases for common ones, like $sp for the stack pointer, or $pc for the program counter or instruction pointer.

    These are most useful when debugging assembly code or code without debugging symbols. Combined with a knowledge of your machine's calling convention, for example, you can use these to inspect function parameters:

    (gdb) break write if $rdi == 2
    

    will break on all writes to stderr on amd64, where the $rdi register is used to pass the first parameter.

  8. The x command

    Most people who've used gdb know about the print or p command, because of its obvious name, but I've been surprised how many don't know about the power of the x command.

    x (for "examine") is used to output regions of memory in various formats. It takes two arguments in a slightly unusual syntax:

    x/FMT ADDRESS
    

    ADDRESS, unsurprisingly, is the address to examine; it can be an arbitrary expression, like the argument to print.

    FMT controls how the memory should be dumped, and consists of (up to) three components:

    • A numeric COUNT of how many elements to dump
    • A single-character FORMAT, indicating how to interpret and display each element
    • A single-character SIZE, indicating the size of each element to display.

    x displays COUNT elements of length SIZE each, starting from ADDRESS, formatting them according to the FORMAT.

    There are many valid "format" arguments; help x in gdb will give you the full list, so here are my favorites:

    x/x displays elements in hex, x/d displays them as signed decimals, x/c displays characters, x/i disassembles memory as instructions, and x/s interprets memory as C strings.

    The SIZE argument can be one of: b, h, w, and g, for one-, two-, four-, and eight-byte blocks, respectively.

    If you have debug symbols so that GDB knows the types of everything you might want to inspect, p is usually a better choice, but if not, x is invaluable for taking a look at memory.

    [~]$ grep saved_command /proc/kallsyms
    ffffffff81946000 B saved_command_line
    
    
    (gdb) x/s 0xffffffff81946000
    ffffffff81946000 <>:     "root=/dev/sda1 quiet"
    

    x/i is invaluable as a quick way to disassemble memory:

    (gdb) x/5i schedule
       0xffffffff8154804a <schedule>:   push   %rbp
       0xffffffff8154804b <schedule+1>: mov    $0x11ac0,%rdx
       0xffffffff81548052 <schedule+8>: mov    %gs:0xb588,%rax
       0xffffffff8154805b <schedule+17>:    mov    %rsp,%rbp
       0xffffffff8154805e <schedule+20>:    push   %r15
    

    If I'm stopped at a segfault in unknown code, one of the first things I try is something like x/20i $pc-40, to get a look at what the code I'm stopped at looks like.

    A quick-and-dirty but surprisingly effective way to debug memory leaks is to let the leak grow until it consumes most of a program's memory, and then attach gdb and just x random pieces of memory. Since the leaked data is using up most of memory, you'll usually hit it pretty quickly, and can try to interpret what it must have come from.

~nelhage

