Sunday Aug 04, 2013

Some times it’s a book sprint, other times it’s a marathon

One of the areas the X.Org Foundation has been working on in recent years is trying to bring new developers on board. Programs like Google Summer of Code and The X.Org Endless Vacation of Code (EVoC) help with this somewhat, but we’ve also been trying to find ways to lower the barrier to entry, both so the students in those programs can learn more and so that other developers have an easier time joining us.

Some of our efforts here have been technological - one of the driving efforts of the conversions from Imake to automake and from CVS to git was to make use of tools developers would already be familiar and productive with from other projects. The Modularization project, which broke up X.Org from one giant tree into over 200 small ones, had the goal of making it possible to fix a bug in a single library or driver without having to download and build many megabytes of software & fonts that were not being changed.

Another area of effort has been around the X.Org documentation, as I’ve written about before. Several years ago, Bart Massey suggested we try organizing a Book Sprint to write a guide for new X.Org developers, similar to the sprint he’d participated in to write a Google Summer of Code Mentoring Guide and a matching GSoC Student Guide. The Board agreed to let him try organizing one, but finding a time to bring everyone together was challenging.

We finally ended up holding a first session as a virtual gathering over the internet one weekend in March 2012. Bart, Matt Dew, Keith Packard, Stéphane Marchesin, and I used Gobby and Mumble to agree on an outline and write several chapters, along with creating some accompanying diagrams using graphviz dot. Unfortunately, the combination of the work from home setting, in which people kept dropping in and out as their home life intervened, the small number of people, and lack of an experienced organizer made this not as productive as other book sprints have been for other projects.

After the first weekend we had drafts of about 7 chapters and a preface, and outlines of several more. We also had a large chunk of prewritten material on graphics drivers that Stéphane had been working on already to link with. Over the next few months, Bart edited and formatted the chapters we had, while we got a little more written, including recruiting additional authors such as Peter Hutterer.

We then gathered again, this time in person, hosted by Matthias Hopf at the Georg-Simon-Ohm-University in Nürnberg, Germany, over the two days prior to the 2012 X.Org Developer’s Conference. We had some additional people join us this time, including Martin Peres, Lucas Stach, Ben Skeggs, and probably more I’ve forgotten. At this session, Bart, Matt & I worked on polsihing the rough edges on the guide we’d been creating, while the others worked more on the driver guide that Stéphane had started. Bart figured out the tools needed to generate ebook formats of the guide, such as epub, and made a cover and pulled together an overall structure for the guide, and by the end of that sprint, we had something we could read on the Android ebook reader he'd brought with him.

And then, we waited. Bart tried to figure out how to make the setup maintainable and reproducible, and how to make it available to our users, but his day job as a University professor kept taking time away from that. Bart also gave a lightning talk on our book sprint experience at the Write the Docs conference in Portland earlier this year covering what worked and what we learned didn’t work in our virtual sprint, and asking for ideas or help on how to go forward.

Meanwhile, the burden of fighting the spammers on the X.Org & FreeDesktop.Org wikis had gotten overwhelming on the current wiki software, so the freedesktop sitewranglers evaluated different solutions, and after looking at how well ikiwiki had been working for the past few years on the XCB wiki decided to move the rest of the FreeDesktop hosted wikis to ikiwiki as well. One major advantage is that it let us use the existing freedesktop authentication infrastructure for wiki accounts instead of having to find different ways to let our users in while keeping the spammers out. Another benefit is that it uses Markdown and git to update the wiki, so we can easily push files in Markdown format to the wiki to publish them now.

In a shocking coincidence, the geeks who wrote the X.Org Developer's Guide also used Markdown & git to author it, so with just a very little conversion to make links work between sections, it was possible to publish the guide to the wiki, where you can find it now:

It’s not perfect, it’s not even fully finished, but it’s better than nothing, and since it’s now on a wiki, it’s even easier for others to fill in the missing pieces or improve the existing bits.

Stephane is still working on the companion guide to graphics drivers, covering the stack from DRM and KMS in the kernel up to Xorg & Mesa. He posted the PDF of the draft as it was at the end of the March book sprint, but not yet a version with the contributions added from the September followup.

When working on applications higher up the stack, not hacking on the graphics stack itself, we refer developers to the documentation for their toolkits, or to general guides to OpenGL programming, as those are beyond what we can document ourselves in X.Org.

And for other open source projects, if you’d like a chance to have a professionally organized, in person doc sprint for your project, applications are being taken until August 7 for 2-4 projects to hold sprints as part of the Google Summer of Code Doc Camp in October, including travel expenses for up to 5 participants from the sponsors.

Monday Dec 19, 2011

Flame on: Graphing hot paths through X server code

Last week, Brendan Gregg, who wrote the book on DTrace, published a new tool for visualizing hot sections of code by sampling the process stack every few milliseconds and then graphing which stacks were seen the most during the run: Flame Graphs. This looked neat, but my experience has been that I get the most understanding out of things like this by trying them out myself, so I did.

Fortunately, it's very easy to setup, as Brendan provided the tools as two standalone perl scripts you can download from the Flame Graph github repo. Then the next step is deciding what you want to run it on and capturing the data from a run of that.

It just so turns out that a few days before the Flame Graph release, another mail had crossed my inbox that gave me something to look at. Several years ago, I added some Userland Statically Defined Tracing probes to the Xserver code base, creating an Xserver provider for DTrace. These got integrated first to the Solaris Xservers (both Xsun & Xorg) and then merged upstream to the X.Org community release for Xserver 1.4, where they're available for all platforms with DTrace support.

Earlier this year, Adam Jackson enabled the dtrace hooks in the Fedora Xorg server package, using the SystemTap static probe support for DTrace probe compatibility. He then noticed a performance drop, which isn't supposed to happen, as DTrace probes are simply noops when not being actively traced, and submitted a fix for it upstream.

This was due to two of the probes I'd defined - when the Xserver processes a request from a client, there's a request-start probe just before the request is processed, and a request-done probe right after the request is processed. If you just want to see what requests a client is making you can trace either one, but if you want to measure the time taken to run a request, or determine if something is happening while a request is being processed, you need both of the probes. When they first got integrated, the code was simple:

#ifdef XSERVER_DTRACE
                XSERVER_REQUEST_START(GetRequestName(MAJOROP), MAJOROP,
                              ((xReq *)client->requestBuffer)->length,
                              client->index, client->requestBuffer);
#endif
                // skipping over input checking and auditing, to the main event,
                // the call to the request handling proc for the specified opcode:
                    result = (* client->requestVector[MAJOROP])(client);
#ifdef XSERVER_DTRACE
                XSERVER_REQUEST_DONE(GetRequestName(MAJOROP), MAJOROP,
                              client->sequence, client->index, result);
#endif

The compiler sees XSERVER_REQUEST_START and XSERVER_REQUEST_DONE as simple function calls, so it does whatever work is necessary to set up their arguments and then calls them. Later, during the linking process, the actual call instructions are replaced with noops and the addresses recorded so that when a dtrace user enables the probe the call can be activated at that time. In these cases, that's not so bad, just a bunch of register access and memory loads of things that are going to be needed nearby. The one outlier is GetRequestName(MAJOROP) which looks like a function call, but was really just a macro that used the opcode as the index an array of strings and returned the string name for the opcode so that DTrace probes could see the request names, especially for extensions which don't have static opcode mappings. For that the compiler would just load a register with the address of the base of the array and then add the offset of the entry specified by MAJOROP in that array.

All was well and good for a bit, until a later project came along during the Xorg 1.5 development cycle to unify all the different lists of protocol object names in the Xserver, as there were different ones in use by the DTrace probes, the security extensions, and the resource management system. That replaced the simple array lookup macro with a function call. While the function doesn't do a lot more work, it does enough to be noticed, and thus the performance hit was taken in the hot path of request dispatching. Adam's patch to fix this simply uses is-enabled probes to only make those function calls when the probes are actually enabled. x11perf testing showed the win on a Athlon 64 3200+ test system running Solaris 11:

Before: 250000000 trep @   0.0001 msec ( 8500000.0/sec): X protocol NoOperation
After:  300000000 trep @   0.0001 msec (10400000.0/sec): X protocol NoOperation 

But numbers are boring, and this gave an excuse to try out Flame Graphs. To capture the data, I took advantage of synchronization features built into xinit and dtrace -c:

# xinit /usr/sbin/dtrace -x ustackframes=100 \
  -n 'profile-997 /execname == "Xorg"/ { @[ustack()] = count(); }' \
  -o out.user_stacks.noop -c "x11perf -noop" \
  -- -config xorg.conf.dummy
To explain this command, I'll start at the end. For xinit, everything after the double dash is a set of arguments to the Xserver it starts, in this case, Xorg is told to look in the normal config paths for xorg.conf.dummy, which it would find is this simple config file in /etc/X11/xorg.conf.dummy setting the driver to the Xvfb-like “dummy” driver, which just uses RAM as a frame buffer to take graphics driver considerations out of this test:
Section "Device"
	Identifier  "Card0"
	Driver      "dummy"
EndSection
Since I'm using a modern Xorg version, that's all the configuration needed, all the unspecified sections are autoconfigured. xinit starts the Xserver, waits for the Xserver to signal that it's finished its start up, and then runs the first half of the command line as a client with the DISPLAY set to the new Xserver. In this case it runs the dtrace command, which sets up the probes based on the examples in the Flame Graphs README, and then runs the command specified as the -c argument, the x11perf benchmark tool. When x11perf exits, dtrace stops the probes, generates its report, and then exits itself, which in turn causes xinit to shut down the X server and exit.

The resulting Flame Graphs are, in their native SVG interactive form:

Before

After


If your browser can't handle those as inline SVG's, or they're scrunched too small to read, try Before (SVG) or Before (PNG), and After (SVG) or After (PNG).

You can see in the first one a bar showing a little over 10% of the time was in stacks involving LookupMajorName, which is completely gone in the second patch. Those who saw Adam's patch series come across the xorg-devel list last week may also notice the presence of XaceHook calls, which Adam optimized in another patch. Unfortunately, while I did build with that patch as well, we don't get the benefits of it since the XC-Security extension is on by default, and those fill in the hooks, so it can't just bypass them as it does when the hooks are empty.

I also took measurements of what Xorg did as gdm started up and a test user logged in, which produced the much larger flame graph you can see in SVG or PNG. As you can see the recursive calls in the font catalogue scanning functions make for some really tall flame graphs. You can also see that, to no one's surprise, xf86SlowBCopy is slow, and a large portion of the time is spent “blitting” bits from one place to another. Some potential areas for improvement stand out - like the 5.7% of time spent rescanning the font path because the Solaris gnome session startup scripts make xset fp calls to add the fonts for the current locale to the legacy font path for old clients that still use it, and another nearly 5% handling the ListFonts and ListFontsWithInfo calls, which dtracing with the request-start probes turned out to be the Java GUI for the Solaris Visual Panels gnome-panel applet.

Now because of the way the data for these is gathered, from looking at them alone you can't tell if a wide bar is one really long call to a function (as it is for the main() function bar in all these) or millions of individual calls (as it was for the ProcNoOperation calls in the x11perf -noop trace), but it does give a quick and easy way to pick out which functions the program is spending most of its time in, as a first pass for figuring out where to dig deeper for potential performance improvements.

Brendan has made these scripts easy to use to generate these graphs, so I encourage you to try them out as well on some sample runs to get familiar with them, so that when you really need them, you know what cases they're good for and how to capture the data and generate the graphs for yourself. Trying really is the best method of learning.

Sunday Jan 23, 2011

X11R7.6 Documentation Improvements

Late last month (about three months late in fact, sorry about that), X.Org announced the release of X11R7.6. While the announcement, release notes, and changelog give details on all the changes that went into this release since the X11R7.5 release a year earlier, the one I worked on the most (aside from the work of producing and posting the modules to be released) was the modernization of the X documentation.

When the X.Org Foundation inherited stewardship of the X Window System, the documentation was in a wide variety of formats. The programs and most libraries included man pages in the traditional nroff format. The source tree also contained a large number of specifications of the protocols and libraries, and these were in a variety of formats, including troff, TeX, FrameMaker, and LinuxDoc. These were mostly optimized for print output and generated Postscript documents which were included in the releases since few people had all the tools necessary to process all of those, especially the commercial FrameMaker software. This also was a problem for developers who needed to update the docs, and either didn't know all the different formats or who didn't have tools like FrameMaker.

One of the early decisions of X.Org was that we wanted all our docs to be in open formats with open source toolchains. The format we decided to standardize on for the long form documents such as the specifications was DocBook, a open format that was already adopted by projects such as GNOME. Later, during the split of the monolithic X Window System, we decided that once the documents were converted, they would be moved to the modules that they documented, so that the documents would be updated and released in sync with the code. Over the following years, we made a little progress, converting a few documents here and there, but not making a serious dent.

In 2010 though, two volunteers really turned this around - Matt Dew converted the bulk of our specifications to DocBook/XML, and Gaetan Nadon integrated those into the modules and set up the autoconf macros and Makefile build rules to convert them to the various output formats in a standardized fashion. The xmlto tool is used as a front-end processor to drive backend tools including xsltproc and Apache FOP to generate output documents in text, html, PostScript, and/or PDF formats.

You can see the effects of this if you compare documents from the X11R7.5 release with some from the X11R7.6 release. For instance, in the html versions of the Xlib spec, not only has the formatting greatly improved from the 7.5 version to the 7.6 version, the new one also now has a hyperlinked table of contents and index, and many cross-reference links between individual sections. In the pdf version, the 7.5 version is a simple encapsulation of the postscript output that is at least nicely formatted since the original troff was designed for print output. In the 7.6 version, fop generated the output directly for PDF, and was able to produce output that took advantage of the additional features of PDF, such as including the same extended set of hyperlinks as the HTML version, and a set of "bookmarks" for quick navigation to sections listed in the table of contents.

If you're building the sources, there's a standardized set of configure flags to control the documentation tools used during the build, such as --with-fop to enable the use of fop to generate Postscript and PDF output or --without-fop to disable it. If you're installing prebuilt packages, you may find the documents in various formats provided in the module specific subdirectory of the documentation directory for your system, such as /usr/share/doc/libX11 for the documents provided with libX11. Even if the packager didn't preformat the html, text, or pdf for you, the default Makefiles install the xml versions there so you can format them when needed, or read online with a Docbook-viewing tool such as GNOME's Yelp.

While this is a huge step forward, we're not done yet. Many of the API docs still need to be converted from old-style K&R C prototypes to the ANSI C89 style used in the code base, and other formatting cleanups are needed. Matt is working on ways to not just have hyperlinked cross-references inside a single document, but across the documentation set. There's plenty of work to go around for additional people to help, such as improving the style-sheets, proofreading the documents to find more areas to fix, adding cross-references where they could be useful, so more help is always welcome. And of course, the huge pile of docs we have are almost exclusively developer focused - end user documentation is mainly limited to man pages, which aren't always as helpful as they could be, but we need user feedback to let us know the areas that need more help, since the developers don't have to rely on the man pages to figure out how to use the software, so don't notice the gaps.

But now at least we've got good momentum going, and I'm hopeful each future release will continue to show improvement.

Friday Nov 05, 2010

Is Wayland going to replace X?

The interwebs are full this morning of reports on Mark Shuttleworth’s announcement that someday he’d like to use Wayland instead of Xorg as Ubuntu’s primary display driver.

My favorite so far is Ryan Paul’s Ars Technica article for this dead-on estimate of the amount of effort it would take to completely replace X11:

Although the Linux ecosystem would benefit greatly from a lighter and more easily extensible alternative, a concerted effort to displace X11 on the Linux desktop hasn’t really emerged yet because the task of bringing drivers, third-party software, and all of the other layers of the stack into alignment with such a move would be prohibitively cumbersome. Like, in the sense that using only your toes to build a full-scale replica of the Statue of Liberty out of toothpicks is prohibitively cumbersome.

And that’s the important point - despite what some of the sensationalist headlines are saying, Ubuntu is not dropping X. They’re not even sure when it would be possible to start shipping Wayland - the announcement from Mark Shuttleworth was that this is the direction they want to go in hopes that a show of support will help get more people working on Wayland (which has mostly been a single person effort) in order to make it a usable solution in a year or more.

Even once they start shipping Wayland, since most applications will still be written for X, they’ll have to include an X server which now has to share the graphics devices with Wayland. (The Wayland site has architecture diagrams showing how this works.) Sure, their Unity desktop (the Ubuntu replacement for GNOME 3.0’s gnome-shell) will be written to Wayland, but more widespread adoption depends on how many other platforms also adopt Wayland, since then the creators of other applications will have to decide if its worth the effort of porting just for those distros when they can do nothing and continue to work on all X11 platforms, including those running Wayland.

One of the challenges in widespread Wayland adoption, especially on non-Linux kernels, is that Wayland requires a kernel Direct Rendering Module (DRM) with Kernel Modesetting Support (KMS) for every graphics device.

That’s especially challenging in Solaris, where even in the latest OpenSolaris and Solaris Express releases, we only have DRM drivers for Intel and ATI Radeon graphics (and the Radeon one currently doesn’t work, but that’s fixable).

That leaves these non-EOL graphics devices with no DRM/KMS support in Solaris:

  • AST (service processor/remote KVM in Sun servers)
  • ATI Rage and Mach64
  • Cirrus
  • Matrox
  • Nvidia
  • VIA Unichrome
  • Trident
  • VESA (fallback for hardware without specific driver support)
  • VirtualBox
  • VMWare
  • Sun XVR-50/100/300 (OEM ATI Radeon series, but with separate driver)
  • Sun XVR-2500
  • Sun Ray

Of course, that’s the list of devices Oracle is supporting in Solaris - other distros/forks from the OpenSolaris code base, such as OpenIndiana or Belenix, may choose to support more or less, but I’ve not yet seen any sign of them porting DRM drivers on their own.

For some of those, support could be provided by porting the existing open source dual-BSD/GPL-licensed DRM code - others require creating drivers where there are none.

Nvidia appears on their list since we ship their driver, which has their own kernel/driver architecture instead of DRI. Getting DRI/KMS support for nvidia either requires them to rearchitect their driver or moving from their driver to the open source / reverse engineered Nouveau project.

From Owain Ainsworth’s talk at this year’s X Developer Summit, we know that work is in progress to port KMS support to the BSD DRI, but again, that’s not yet done, though the BSDs do have a lot more of the device-specific DRM modules already ported than Solaris did.

So Xorg is in little danger of disappearing overnight - our desktop architecture continues to evolve, as it has over many years, and Wayland may play an increasingly larger role in it, or may end up an interesting side note, like Xgl before it, but either way it will be an evolutionary process, not a big bang flag day sudden change.

Friday Oct 08, 2010

Porting X apps to XCB

I pushed the release of xwininfo 1.1 to the X.Org release archives last week. xwininfo is a command-line utility to print information about windows on an X server. The major new feature of this release is the rewrite to use libxcb instead of libX11 for the connection to the X server.

For those who haven't heard, XCB is a new library (well, new compared to Xlib) to communicate to the X server over the X11 protocol. While Xlib is designed to look like a traditional library API, hiding the fact that calls result in protocol requests to a server, XCB makes the client-server nature of the protocol explicit in its design. For instance, to lookup a window property, the Xlib code is a single function call:

XGetWindowProperty(dpy, win, atom, 0, 0, False, AnyPropertyType,
                   &type_ret, &format_ret, &num_ret,
                   &bytes_after, &prop_ret);

Xlib generates the request to the X server to retrieve the property and appends it to its buffer of requests. Since this is a request that requires a response (many requests, such as those to draw something, do not generate responses from the server unless an error occurs), Xlib then flushes the buffer, sending the contents to the X server, and waits until the server processes all the requests in turn and then sends the response to the client with the property requested. Xlib also provides utility functions to wrap that to retrieve specific properties and decode them, knowing the details of each property and how to request and decode them, such as XGetWMName, XGetWMHints, and so on.

XCB on the other hand, provides functions generated directly from the protocol descriptions, so that they map directly onto the protocol, with separate functions to put requests into the outgoing buffer, and to read results back from them asynchronously later. The xcb version of the above code is:

prop_cookie = xcb_get_property (dpy, False, win, atom,
                                XCB_GET_PROPERTY_TYPE_ANY, 0, 0);
prop_reply = xcb_get_property_reply (dpy, prop_cookie, NULL);

The power of xcb is in allowing those two steps to have as much code as you want between them, letting the programmer decide when to wait for data instead of forcing you to wait everytime you make a request that returns data. For instance, xwininfo knows in advance from the command line options most of the data it needs to request from the server for each window, so it can request it all at once, and then wait for the results to start coming in. When using the -tree option to walk the window tree, it can request the data for all the children of the current window at once, batching even further. On a local connection on a single CPU server, this means less context switches between X client and server. On a multi-core/CPU server, it can allow the X server to be processing requests on one core while the client is handling the responses as they become available, better utilizing the multiprocessing nature of the system. And on remote connections, the requests can be grouped into packets closer to the MTU size of the connection, instead of whatever requests are in the buffer when one is made that needs a response.

For xwininfo, when I ported to xcb I tested with a GNOME desktop session on OpenSolaris with a few clients open and ran it as xwininfo -root -all starting at the root of the window hierarchy and climbing down the tree, requesting all the information available for each window along the way. In my sample session it found 114 windows (in X, a window is simply a container for drawing output and receiving events, and often windows in terms of the protocol are subsets of, or borders around, the objects users think of as windows). When running locally on my Ultra 27 with it's 4-core Nehalem CPU, both versions ran so fast (0.05 seconds or less) that the difference in time was too small to really accurately measure. So I used ssh -X to tunnel an X11 connection from my office in California to a server in the Sun office in Beijing, China, and from there back to my desktop, introducing a huge amount of latency. With this, the difference was dramatic between the two:

Xlib: 0.03u 0.02s 8:19.12 0.0% 
 xcb: 0.00u 0.00s 0:45.26 0.0% 

Of course, xwininfo is an unusual X application in a few ways:

  • It runs through the requests as fast as it can, then exits, not waiting for user input (unless you use the mode where you click on a window to choose it, after which it runs through as normal). Most X applications spend most of their time waiting for user input, so the overall runtime won't decrease as much by reducing the time spent communicating with the X server.
  • It uses only the core protocol and shape extension, none of the extensions like Render or Xinput that more complex, modern applications use. Xinput, XKB, and GLX are especially problematic, as those have not yet been fully supported in an XCB release, though support has been worked on through some Google Summer of Code projects.
  • It's small enough that a complete reworking to use xcb in one shot was feasible. I couldn't imagine trying that for something as complex as Firefox, Gimp, or Nautilus.
  • It used only raw Xlib - no toolkits. Any application programmer who wants to stay sane uses toolkits like Gtk or Qt and helper libraries such as Pango and Cairo to deal with all the things that all applications need to do, and shouldn't have to reinvent - widgets, input methods, accessibility support, interacting with window managers, complex text layout for non-Latin character sets, and so on.
  • It doesn't use much of the Xlib helper functions - not a whole lot more than the protocol calls, so is more directly mappable to xcb. Applications that rely on Xlib's input method framework, Compose key handling, character set conversion, or other functions, would be harder to port without duplicating all that work (though the modern toolkits handle most of that in the toolkit layer now anyway). It did rely on the helper functions for converting the window name property from other character sets, and the xcb version right now has a regression in that it only works for UTF-8 and Latin-1 window names, but since most modern toolkits use UTF-8, you may not notice unless you run older applications with localized window names.

Fortunately, XCB also provides a method for incremental conversion from Xlib to XCB, where you can use libX11 to open the display, and pass the Display pointer it returns to all your existing code, toolkits, and libraries, and then when you want to call an XCB function, convert it to an xcb_connection_t pointer for the same connection, allowing mixing calls to Xlib & XCB API's.

This is done by building libX11 as a layer on top of libxcb, so they share the same socket to the X server and pass control of it back and forth. That option was introduced in libX11 1.2, and is now always present (no longer optional) in the libX11 1.4 release coming out this fall (Release Candidate 2 is out now).

So as another example, xdpyinfo also prints a lot of information about the X server, but it calls many extensions, and a lot of its calls aren't blocking on a response from the server. If you add the -queryExt option though, then for every extension it calls XQueryExtension to print which request, event, and error ids are assigned to that extension in the currently running server. Since those are dynamically assigned, and vary depending on the set of extensions enabled in a given server build/configuration, this is critical information to have when debugging X error reports that reference these ids. Using xdpyinfo -queryExt is thus especially needed when reporting an X error message that come from custom error handlers like the one in the gtk toolkit that omit the extension information found in the default Xlib error handler, or the person reading the bug will have no idea which extension you hit the error in.

XQueryExtension takes one extension name at a time, sends a request to the X server for the id codes for that extension, and waits for a response so it can return those ids to the caller. On the Xorg 1.7 server on my Solaris 11 test system, there are currently 30 active X extensions, so that's 30 tiny packets sent to the X server, 30 times the xdpyinfo client blocks in poll() waiting for a response, and 30 times the X server goes through the client handling & request scheduling code before going back to block again on its own select() loop.

A simple patch to xdpyinfo replaced just that loop of calls to XQueryExtension with two loops - the first calling xcb_query_extension for each extension, and then when the entire batch was ready to go to the server, a second loop called xcb_query_extension_reply to start collecting the batched replies. Gathering system call counts with truss -c showed the expected reduction in a number of system calls made by the xdpyinfo client itself:

System callXlibxcb
writev4011
poll8022
recv11729
total\*464296

\* total includes all system calls, including many not shown since their count did not change significantly. There was one additional set of open/mmap/close etc. for loading the added libX11-xcb library.

Over a tcp connection, this reduced both the number of packets, and due to tcp packet header overhead, the overall amount of data:

 Xlibxcb
TCP packets9335
TCP bytes115547726

This sort of change is far more feasible for most applications - finding the hotspots where your user is waiting for data from the server, especially at the startup of the application when you're gathering the information about the server and session to initialize your application with, and converting just those to more efficient sets of xcb calls. This is much like earlier work to reduce latency by converting repeated calls to XInternAtom with a single call to fetch multiple atoms at once via XInternAtoms.

Of course, if you still have to maintain support for older platforms, such as Solaris 10 or RHEL 5, you'll have to keep this code #ifdef'ed, since those platforms won't have xcb support, but availablity is growing on newer systems. (We didn't get it into any of the OpenSolaris distro releases before the end unfortunately, nor the first Solaris 11 Express release, but are working on it for a future Solaris release.)

Monday Apr 26, 2010

Grabbing information from the X server

Jay came into my office last week complaining his mouse was grabbed by a client and wouldn't let him click in any windows, but he couldn't tell which one. He remembered I said I had a script to find this out, but hadn't told him where it was. So I logged into his workstation, su'ed to root and ran a couple commands:

root@overthere:~# ps -ef | grep Xorg
jacotton   881   880   0   Apr 08 vt/2      274:24 /usr/bin/Xorg :0 -nolisten tcp -br -auth /var/run/gdm/auth-for-gdm-O_aWTb/datab
root@overthere:~# /usr/demo/Xserver/mdb/list_Xserver_devicegrab_client -p 881
Device "Virtual core pointer" id 2: 
  -- active grab 3a00000 by client 29

Device "Virtual core keyboard" id 3: 
  -- active grab 3a00000 by client 29

Device "Virtual core XTEST pointer" id 4: 
  -- no active grab on device

Device "Virtual core XTEST keyboard" id 5: 
  -- no active grab on device

Device "mouse" id 6: 
  -- no active grab on device

Device "input" id 7: 
  -- no active grab on device

Device "hotkey" id 8: 
  -- no active grab on device

Device "keyboard" id 9: 
  -- no active grab on device

root@overthere:~# /usr/demo/Xserver/mdb/list_Xserver_clients -p 881
currentMaxClients = 41
CLIENT SEQUENCE #  FD  PIDS
    0           0 ??? - NULL ClientPtr->osPrivate
    1          22  33 880 
    2          20  35 912 
    3           9  36 915 
[...]
   29     6926083  59 15839 
[...]
15822 /bin/bash /usr/bin/firefox
  15835 /bin/bash /usr/lib/firefox/run-mozilla.sh /usr/lib/firefox/firefox-bin
    15839 /usr/lib/firefox/firefox-bin
[...]

So we knew it was firefox holding the grab, and killing that process released it.

The scripts are actually simple wrappers around a loadable module for the Solaris mdb modular debugger. While I prefer source level debuggers like dbx & gdb for interactive use, the mdb module system works well for this style of use. The mdb modules are compiled C code, so can use the headers from the X server to get the type information, instead of requiring debugging information such as stabs or DWARF be built into the X server binary.

These were originally written for Solaris 9, in the dark days before DTrace, but are still useful, because they get information you don't know you need until after it's happened - for instance, with the Xserver DTrace probes you can get the client information from the client-connect and client-auth probes when the client connects, or see which clients send grab requests from the request probes when they make the calls - but when your server is already grabbed it's too late to trace unless you know how to reproduce the issue.

These have been kept in our source tree for a while, but weren't built as part of the regular builds so were often out of date when the server internals changed how things like the devPrivate fields were stored, and thus needed porting to the new X server versions when you needed them. And even when they were up-to-date, you needed to have a full X server source tree to build them, since they used headers not included in the server SDK package.

To solve this, in Nevada build 135 (sorry, that was just after the stable branch for the 2010.03 release forked off, so it won't be in there) I integrated them into the normal build process and delivered them in the X server packages. The mdb module is delivered in /usr/lib/mdb/proc/ with links to it named Xorg.so, Xvfb.so, Xephyr.so and Xvnc.so so that it should be automatically loaded when mdb attaches to a process or core of any of those programs. The scripts were delivered in /usr/share/demo/Xserver/mdb, along with a README file explaining how to use either the direct mdb module or the wrapper scripts.

Porting these to other debuggers architectures for scripting should be possible, though you'd need to deal with either having the X server debugging information available or converting the C language headers & traversal algorithms to Studio dbx's ksh or gdb's Python or whatever scripting solution your debugger uses. The grab and client information should be identical across the X servers from the xorg-server sources on all platforms (though as noted above, varying by xorg-server version as the internals change), except for the client process id information. Most platforms get the client pid at connection time, so it can be logged in the client audit log if you run with -audit, or made available to the client-auth dtrace probe, but then discard that information. On Solaris & OpenSolaris however, we keep it in a client devPrivate for the SolarisIA extension to use to adjust the priority boost given to the process with focus, so the mdb module can look it up there. Of course, there's no guarantee that the client hasn't forked a child or somehow else passed control of its X connection in a way the server wasn't informed about, but it's accurate far more often than not. Since the file descriptor information is available on all platforms, you might be able to determine the client via that somehow, such as through lsof - for instance, I've implemented an enhancement to pfiles on Solaris to show peer processes for local IPC that hopefully I'll find time to get reviewed and integrated soon.

There's of course a lot more information in the X server that might be useful to find - these were each written to solve specific problems on machines we couldn't login to directly to debug ourselves. What other problems could be solved if we had similar modules to expose other information? We can certainly enhance these as we find more - let us know if you find some.

Wednesday Feb 18, 2009

Xorg 1.5.3 known issues in OpenSolaris 2009.06 & Solaris Express build 107

As Phoronix recently noticed, we upgraded the Xorg server from 1.3 to 1.5.3 in Nevada build 107 (currently available via Solaris Express Community Edition (SXCE) Install ISO’s and OpenSolaris IPS package updates in the /dev repo).

When we integrated, I sent out a heads up message with information about the driver compatibility changes and issues we knew about at that time.

So far it’s gone mostly smooth, thanks to the help of the people who tested the pre-integration builds I posted both internally and on the X community on opensolaris.org, but there’s a few issues that have hit some people so far, most with workarounds you can use to get past them:

6801598 Xorg 1.5 ignores kernel keyboard layout setting, always uses "us"

Fixed in build 109. Xorg on Solaris has always shipped with a patch to query the kernel for the console keyboard layout (which in turn either gets set automatically by the keyboard if, like Sun keyboards, it provides the layout in the optional USB HID layout identifier, or gets set by querying the user on initial installation), but the code we apply this patch to moved from the Xorg binary to the kbd_drv.so binary in Xorg 1.5 and our patch didn’t get migrated correctly. We didn’t notice in testing since our testing was done with input devices provided by HAL, which was handling this for us, but when that didn’t integrate as expected, this bug was exposed.

Workaround: Create an /etc/X11/xorg.conf file with a keyboard section specifying your desired layout, such as this one for a British layout:

Section "InputDevice"
	Identifier		"Keyboard0"
	Driver			"kbd"
	Option "XkbRules"	"xorg"
	Option "XkbModel"	"pc105"
	Option "XkbLayout"	"gb"
EndSection

6791361 SUNWxorg-xkb should deliver "base" rules

If you do need to set a localized keyboard layout, you may notice the default XKB rules file upstream changed from "xorg" (which is currently shipped in our packages) to "base" (which isn’t yet, though we’ve asked the localization teams who build our XKB packages to add it).

Workaround: Include an XkbRules option line in your keyboard settings as shown in the example above, or make a symlink from /usr/X11/lib/X11/xkb/rules/base to the xorg file in that directory. The fix for the above bug in build 109 should also set the default used by our Xorg builds back to "xorg" for now.

6801386 Xorg core dumps on startup if hald not running in snv_107

Fixed in build 109. Yet another instance of our old friend 6724478 libc printf should not SEGV when passed NULL for %s format. This was a design decision made in SVR4 and Solaris 2.0 many years ago, to force programmers to catch NULL pointer misuse - but since Linux & the BSD’s allow it, today it’s mostly become a source of pain in porting code that while technically incorrect, works on those platforms, so the decision was made last year to change Solaris libc to match. Unfortunately, that change is still undergoing standards conformance testing, so hasn’t integrated yet, and we still have to check for NULL first.

Workaround: Make sure hald starts before Xorg does. This shouldn’t be a problem for gdm users (default in OpenSolaris), since the gdm SMF service lists the hal service as a dependency that needs to start first, but may cause problems for dtlogin, xinit or startx users.

6806763 Xorg 1.5 doesn’t start if xorg.conf contains RgbPath entry

Fixed in 109, and in upstream head. This entry in the Files section of xorg.conf was obsoleted upstream, but became a parse error instead of just being ignored, which breaks people who upgraded a system with a working xorg.conf that happened to have this line.

Workaround: Remove RgbPath lines from /etc/X11/xorg.conf.

6799573 Metacity not starting on TX Xorg 1.5

Fixed in 108. The XACE framework used by extensions such as Xtsol & Xselinux added new permission types to check in the upstream release, but Xtsol doesn’t handle all of them yet. In this case, Metacity’s constant checking of round-trip time by appending to an existing property failed because Xtsol was only handling the WriteAccess check and not the BlendAccess check.

I don’t know of a workaround for this other than not running the desktop in multi-level mode, so users of the labeled/multi-level security desktop in Trusted Extensions may want to wait to upgrade until then.

6798452 X Server will not start on x86 systems with multiple graphics devices

Not yet fixed, either in our builds or upstream, but has a simple workaround of creating an xorg.conf file listing the devices you want to use. This may affect people with a single card if your motherboard has on-board graphics as well.

6797940 Can’t do gui installs on x86 systems with Nevada 107/Nevada 108

We haven’t yet tracked down this issue, which only affects the SXCE installer on x86/x64 platforms. Workarounds include using the OpenSolaris LiveCD installer or the text only mode of the SXCE installer instead. After installation, the installed Xorg works fine (modulo the above issues).

6686 SUNWefb[w] should be part of slim_install

Fixed in 108. The SPARC drivers added in OpenSolaris build 107 for Xorg on XVR-50, XVR-100, and XVR-300 graphics cards aren’t included in the default install. You can add them after the install by doing pfexec pkg install SUNWefb SUNWefbw and then rebooting. (These drivers are only available in OpenSolaris installs, not SXCE ones, since they conflict with the Xsun drivers for these cards included in SXCE.)

Saturday Jun 14, 2008

June 11, 2008 X Server security advisories

On June 11, iDefense & the X.Org Foundation released security advisories for a set of issues in extension protocol parsing code in the open source X server common code base that iDefense discovered and X.Org fixed.

Their advisories/reports are at:

Sun has released a Security Sun Alert for the X server versions in Solaris 8, 9, 10 and OpenSolaris 2008.05 at:

Preliminary T-patches are available for Solaris 8, 9, and 10 from the locations shown in the Sun Alert - these are not fully tested yet (hence the "T" in T-patch).

The fix for these issues has integrated into the X gate for Nevada in Nevada build 92, so users of SXCE or SXDE will get the fixes by upgrading to SXCE build 92 when it becomes available (probably in 3-4 weeks, though the first week of July is traditionally a holiday week in Sun's US offices, so may affect availability).

Fixes for OpenSolaris 2008.05 users following the development build trains will be available when the Nevada 92 packages are pushed to the pkg.opensolaris.org repo (also probably in about 3 weeks from now).

Fixes are planned for OpenSolaris 2008.05 users staying on the stable branch (i.e. nv_86 equivalent), but I do not have information yet on how or when those will be available.

Fixes for users building X from the OpenSolaris sources are currently available in the Mercurial repository of the FOX project in open-src/xserver/xorg/6683567.patch.

For users of all OS versions, the best defenses against this class of attacks is to never, ever, ever run “xhost +”, and if possible, to run X with incoming TCP connections disabled, since if the attacker can't connect to your X server in the first place, they can't cause the X server to parse the protocol stream incorrectly. This is not a complete defense, as anyone who can connect to the Xserver can still exploit it, so if you're in a situation where the X users don't have root access it won't protect you from them, but it is a strong first line of defense against attacks from other machines on the network.

Releases based on the Solaris Nevada train (including OpenSolaris & Solaris Express), default to “Secure by Default” mode, which disables incoming TCP connections to the X server. Current Solaris 10 releases offer to set the Secure by Default mode at install time. On both Solaris 10 & Nevada, the netservices command may be used to change the Secure by Default settings for all services, or the svccfg command may be used to disable listening for TCP connections for just X by running:

svccfg -s svc:/application/x11/x11-server setprop options/tcp_listen=false

and then restarting the X server (logout of your desktop session and log back in).

On older releases, the “-nolisten tcp” flag may be appended to the X server command line in /etc/dt/config/Xservers (copied from /usr/dt/config/Xservers if it doesn't exist) or in whatever other method is being used to start the X server.

See the Sun Alert for other prevention methods, such as disabling the vulnerable extensions if your applications can run effectively without them.

Thursday Oct 11, 2007

Stupid nv_74 cursor tricks

For those who have already grabbed Nevada build 74 from the Solaris Express Community Edition downloads, you can try out the new cursor code available with the integration of libXcursor (which most XFree86 and Xorg platforms have had for a while).

If your graphics card supports the Render extension & 32-bit alpha cursors, (i.e. most x86 graphics, but not Sun Rays nor SPARC graphics yet - I've mainly used it on nvidia cards and my laptop with a ATI Radeon chipset), you can see them in action by doing:

% su -
# mkdir -p /etc/dt/config/C/
# cat > /etc/dt/config/C/styleModern
#include "/usr/dt/config/C/styleModern"
Xcursor.theme: whiteglass
\^D
# exit

% echo 'Xcursor.theme: whiteglass' >> ~/.Xresources

and then logout & log back in.

Change 'whiteglass' to 'redglass' in the above if you want something that stands out more.

Thursday Sep 27, 2007

Happy Belated Birthday X11!

Just noticed that according to Wikipedia's X Window System Release History, X11 was first released on September 15, 1987 - 20 years old and I didn't even remember to get a birthday card...

Wednesday Jun 13, 2007

Xorg 7.2 Solaris status update

It's been a while since I've given an update on our Xorg 7.2 state in Solaris and OpenSolaris, and I was reminded by a new patch release today. For a summary of the new features included, and some bugs in the initial Nevada builds, see the Heads Up: Xorg 7.2 in Nevada build 58 message I sent to the OpenSolaris xwin-discuss list in February.

Besides the original release in Solaris Express: Community Edition and the OpenSolaris sources, Xorg 7.2 is now also available in:

For those on Solaris 10 with an older patch release, or the Solaris 10 7/07 Beta, patch 125720-08 was released today, which fixes several issues reported by early users, including:

  • A number of problems with auto-configuration of screen resolution
  • Xephyr's Caps Lock not being unlockable (bug 6539225)
  • A memory leak every time an X client connection was closed (6533650)
  • Crashes or failures using TrueType fonts in 64-bit Xorg servers (6535518)
  • xorgconfig listing the wrong keyboard driver in generated xorg.conf files (6559079)
  • Xorg -configure not including the bitstream TrueType font module in generated xorg.conf files (6546692)

Also included in the patch are:

  • A new version 2.0.2 of the “nv” open source driver for nVidia cards which adds support for the first set of GeForce 8000 series cards, though you can also get the updated nvidia closed-source/accelerated driver to support them as well. (A slightly older version of the nVidia closed driver is included in Solaris 10 7/07 already.)
  • Security fixes for two reported vulnerabilities: X Render Trapezoid divide-by-zero and XC-MISC memory corruption.
  • An update to the Radeon driver to check if you've plugged in an external monitor on your laptop (at least on Ferrari 3400's) when you run xrandr, and if so, activate the external display port without having to restart X. (A small step towards the more general and complete solution coming with X Resize-and-Rotate 1.2.)

There's a few more fixes in progress still (the -09 rev of the Solaris 10 patch is in QA now, and there will of course be more revisions in the future), including:

  • Fixing glxgears and glxinfo linking so they run (6560568)
  • Allowing Xorg to start with an old xorg.conf listing the "Keyboard" driver (capitalized) (6560332 - if you have this problem now, just change the driver name to "kbd" in your xorg.conf)
  • Adding the VMWare mouse driver, vmmouse_drv.so (6559114)
  • Fixing Xorg -configure in the 64-bit Xorg (6556115)
  • Making Xv playback work again in the ATI Radeon driver (6564910)
  • Making xorgcfg work again ( 6563653)

but the current state should be fairly usable by most people as it is today.

For those using Solaris Express/Nevada releases, or OpenSolaris sources, you can see which build the above changes went in at the X ChangeLogs page. (Solaris Express Developer Edition 5/07 corresponds to build 64 in that list.)

Removing XInitThreads from Totem

Bastien, you're not alone. If the JDS team complained to even Sun's X11 team about bugs after they applied that Totem patch, we'd tell them they're on their own and we'd never support an application that called X from multiple threads without calling XInitThreads first.

The Solaris Xlib is thread safe (though probably with a few bugs in corner cases still) based on the same X11R6 multi-thread code as everyone else, though we're currently working to reconcile the fixes from our fork with the current X.Org code as part of our ongoing project to un-fork Solaris X11 and bring it back to the X.Org code base. We're also looking at XCB in the future for a cleaner thread-safety model as well.

Update: As noted by trisk in the comments, this seems to have been caused by known Solaris libXi bug 6551484 - the engineer who had removed XInitThreads() from totem tried the test binary produced by the engineer working on the libXi bug and found it solved the hang problem she was seeing.

Monday Jun 11, 2007

The Irregular X11 DTrace Companion: Issue 2

In response to a Solaris 10 7/07 beta tester question last week, I updated the Xserver provider for DTrace web page to point out that besides the original method of patching the Xorg sources and rebuilding, pre-built binaries are now included directly in Solaris...

Getting an X server with DTrace probes built in

The Xserver dtrace providers are now included in the regular X server binaries (including Xorg, Xsun, Xvfb, Xephyr, Xnest, & Xprt) in recent Solaris releases, including:

For those who still want to build from source, the provider is integrated into the X.Org git master repository for the Xserver 1.4 release, planned to be released in August 2007 with X11R7.3, and into the OpenSolaris X code base for Nevada builds 53 and later.

Thursday May 17, 2007

The Irregular X11 DTrace Companion: Issue 1

All the cool kids are doing unscheduled update newsletters now, and I had a collection of X11 Dtrace things to post about piling up, so I figured I'd give it a whirl. (BTW, Nouveau guys - I'd read yours more if you had an RSS feed - having to go check the wiki on an irregular basis doesn't work so well.)

Doing newsletters on a regular basis is hard, as Glynn has learned, so don't expect updates too often...

Sample script for tracing events

Jeremy asked recently for an example DTrace script using the event probes in the Xserver DTrace provider. I had lost the ones I wrote when developing/testing that probe, but dug one out of my e-mail that was written by DTrace Master Guru Bryan Cantrill and posted it as ButtonPress-events.d.

He wrote this script when he'd upgraded to Nevada build 58 and found shift-click was no longer working. Using the script he was able to show that the Xserver was sending ButtonPress events without the ShiftMask set. I wrote a corresponding script to decode the VUID events the X server was getting from the kernel mouse and keyboard drivers to determine the keyboard wasn't giving us the shift key pressed events until another key was pressed, which wasn't happening for shift-clicks with the mouse. That script can be seen in the bug report I filed with the kernel keyboard team as Solaris bug 6526932.

Using DTrace to win a race

A bug was filed with X.Org recently stating “iceauth can dump a core in auth_initialize() if a signal is caught before iceauth_filename has been malloced.” As I was fixing bugs in iceauth anyway, I figured I'd look at it to see if I could roll in a fix to the new iceauth release I'd be making for those. Fortunately, iceauth is a small program, and I could see where I thought the race condition lied between the main code and the signal handler. Testing any race condition has always been a pain in the past, since if you don't get the timing just right you don't know if you really fixed it, or just changed the timing enough to not hit it.

So I pulled up the DTrace manual and DTrace wiki and created a one-liner to force the race condition by sending the signal when I knew it was in the right spot:

 # dtrace -w -c ./iceauth -n 'pid$target::IceLockAuthFile:return { raise(1); }'

For those who haven't memorized the DTrace manual, this can be broken down as:

  • -w - allows “destructive” actions - i.e. those that change the running of the program, and don't just monitor it - in this case, allows sending a signal to the process
  • -c ./iceauth - specifies the command to run with this script, and puts the process id of the command in $target in the script
  • pid$target::IceLockAuthFile:return - sets the probe we're looking for to be the return from any call to the function IceLockAuthFile in the process matching the pid $target
  • { raise(1); } - sets the action to be run when that probe is hit to send signal 1 (aka SIGHUP) to the target process

Ran that and a few seconds later had a nice core file to verify where the race was. Turned out I was right about where the signal had to come in for the race to cause a crash, but was slightly off about the point in the signal handler where it was crashing. The fix was easy, and thanks to DTrace, easy to verify, and is now in X.Org's git repository.

DTrace Wiki: Topics: X

After last month's Silicon Valley OpenSolaris User Group meeting, Brendan asked if I'd work on the “X” topic on the DTrace Wiki. I agreed to, and have put up a small placeholder while I figure out what to flesh it out with - if you have any suggestions or requests for what you'ld like to see in a short introductory article to using DTrace to probe X, especially using, but not limited to, the Xserver DTrace provider, send me e-mail or leave a comment here.

Monday Apr 16, 2007

X11 API's for querying multi-head screen layout

I should probably stick this in the X.Org wiki or OpenSolaris/GenUnix wiki, but I'm being lazy and just summarizing to here now from an e-mail thread last week on one of the Sun internal lists...

There are now 4 sets of API's to query the layout of display screens in a logical X screen when using a setup such as Xinerama, TwinView, or MergedFB:

  1. the original X11R6.4 XPanoramiX\* API
  2. Sun's Xinerama API
  3. XFree86's Xinerama API
  4. Xrandr 1.2

The first is pretty much universally available, but also universally deprecated. The functions for it are defined in libXinerama on Linux, libXext on Solaris, but I don't think we've ever shipped the header for it on Solaris, since it was deprecated long ago.

The second Sun created and proposed to the old X.Org industry consortium to become the standard, but they did not adopt it, crafted a replacement in committee that was universally incompatible with everything and never adopted that either. It's what the <X11/extensions/xinerama.h> header in Solaris represents, and was never documented, but you can find a description of the API in PSARC 2000/036. The functions from it are found in Solaris libXext, but nowhere else in the world that I know of.

The third, aka libXinerama, is available on most XFree86 & Xorg based systems, and finally came to Solaris in nv_62 and soon in Solaris 10 patches & the next update release. It uses the <X11/extensions/Xinerama.h> (note the capitalization compared to the other) header and is documented in the man page we created for the Solaris integration and contributed back to X.Org.

The fourth is a very recent development, having been released after X11R7.2's release in February. X.Org is currently in the process of releasing version 1.3 of the Xorg server package (release candidate 5 is out now) - this will be the first Xorg server feature release not tied to one of the whole X Window System releases. (X11R7.2 had Xorg server 1.2, X11R7.3 is expected to have Xorg server 1.4.) The xf86-video-intel driver 2.0 release candidate 4 is also out, it is the first driver to support the new Xrandr 1.2 additions, though work is in progress on other drivers, including the Radeon and nvidia drivers. We don't have a definite roadmap for including this in Solaris yet - we'll probably do it once the X.Org releases are finished and we're passed the upcoming milestone of the second Solaris Express Developer Edition release in May.

About

Engineer working on Oracle Solaris and with the X.Org open source community.

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle, the X.Org Foundation, or anyone else.

See Also
Follow me on twitter

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today