DTracing vim latency

I've just decreased my vim start up time by 340x, which makes for an interesting example of using DTrace:

I noticed that the vim text editor was taking longer and longer to fire up, and it was now taking up to 14 seconds for vim to initialise. Something is definitely wrong - and there should be some big game for DTrace to hunt down.

Syscall analysis is a good starting point, and so I run procsystime from the DTraceToolkit to show syscall timing information and counts:

# /opt/DTT/procsystime -a -n vim
Hit Ctrl-C to stop sampling...
\^C

Elapsed Times for processes vim,

         SYSCALL          TIME (ns)
     sigaltstack               1866
      sigpending               2046
[...]
           close           96958757
         pollsys         1694151254
       nanosleep        12098745228
          TOTAL:        14219906941

CPU Times for processes vim,

         SYSCALL          TIME (ns)
     sigaltstack                382
      sigpending                420
[...]
         connect            2644571
           xstat            3809476
            open            9690091
          TOTAL:           26863724

Syscall Counts for processes vim,

         SYSCALL              COUNT
          access                  1
           chmod                  1
[...]
            mmap                 67
             brk                188
           xstat                213
          TOTAL:               1041

This output has been cut to fit. Reading from bottom up:

  • Syscall Counts show nothing obvious - 1041 syscalls isn't a lot; what is more interesting is their elapsed and on-CPU times -- for example, those 213 xstat()s may have blocked on a slow NFS share.
  • CPU Times in total are 26 ms - so they don't explain our long start up times.
  • Elapsed Times show a total of 14 seconds, 12 seconds of which are during nanosleep. This corresponds to the long delay experienced as vim loaded.

Next is to examine why vim is calling nanosleep. We can start by pulling out user stack traces when nanosleep is called, either from a DTrace one-liner or using dtruss from the DTraceToolkit:

# dtrace -n 'syscall::nanosleep:entry /execname == "vim"/ { @[ustack()] = count(); }'
dtrace: description 'syscall::nanosleep:entry ' matched 1 probe
\^C


              libc.so.1`__nanosleep+0x7
              libX11.so.4`_X11TransConnectDisplay+0x6b8
              libX11.so.4`XOpenDisplay+0xbd
              libXt.so.4`XtOpenDisplay+0xa0
              vim`setup_term_clip+0x233
              vim`main+0xb5f
              vim`_start+0x80
               12

# /opt/DTT/dtruss -st nanosleep -n vim
PID/LWP   SYSCALL(args)                  = return
849553/1:  nanosleep(0x8046C50, 0x8046C58, 0xD9213FA8)           = 0 0

              libc.so.1`__nanosleep+0x7
              libX11.so.4`_X11TransConnectDisplay+0x6b8
              libX11.so.4`XOpenDisplay+0xbd
              libXt.so.4`XtOpenDisplay+0xa0
              vim`setup_term_clip+0x233
              vim`main+0xb5f
              vim`_start+0x80
[...]

This shows how nanosleep was being called by vim. It was being called from libX11 in functions such as _X11TransConnectDisplay() -- huh? I'm using vim in text mode, why would it be talking to X11?

Well, if I have an X11 issue, let me start by checking the DISPLAY variable:

$ echo $DISPLAY
:11.0

Ahh - that doesn't seem right. Maybe I'm connected via ssh, and we are doing X11 port forwarding?:

$ ptree $$
369710 screen
  369715 /bin/bash
    851597 ptree 369715

Rather than showing sshd in our ancestory, this has shown screen. The screen software provides text based multiple session recovery. When I initialised screen, I did so via an ssh session; now I'm connecting via the console, and that ssh session is long gone - as is the DISPLAY :11.0.

Setting DISPLAY to :0.0 fixes the issue, and now procsystime shows the following syscall elapsed times:

# /opt/DTT/procsystime -Te -n vim
Hit Ctrl-C to stop sampling...
\^C

Elapsed Times for processes vim,

         SYSCALL          TIME (ns)
     sigaltstack               1980
      systeminfo               2221
[...]
          access            4190720
           xstat            7960249
            open           16273686
          TOTAL:           42198388

A total elapsed time of 42 ms is 340 times faster than it was previously!

... I didn't answer why vim was talking to X11 in the first place: in the previous stack trace, the vim function that calls X is setup_term_clip(). There are numerous things we can do with DTrace to learn more about this, but first I'll check the vim source to see if a comment explains what is going on. This is what I found:

   /\* Start using the X clipboard, unless the GUI was started. \*/
[...]
      setup_term_clip()

Ahh - vim was trying to use the X clipboard, which was failing as our DISPLAY variable was corrupt.

Comments:

Post a Comment:
Comments are closed for this entry.
About

Brendan Gregg, Fishworks engineer

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today