Testing Suspend and Resume for Driver Developers.
By randyf on Jan 01, 2008
Happy New Year!
But in the new year, you are a driver developer and need to support DDI_SUSPEND and DDI_RESUME. How do you do this, and then test if it is working?
First, read my weblog entry describing the Power Management Interfaces and the Power Management chapter (Ch. 12) of "Writing Device Drivers" on docs.sun.com, and understand how to use the Solaris Power Management APIs.
OK, so you have probably done this already, but it is always nice to have these references to go back and check. You also already know how to add, load, and unload the driver on your test machine (see add_drv(1m), modload(1m), and modunload(1m) for reference). But how do you go about testing the driver?
In How to Suspend and Resume, I described modifying power.conf(4) to enable S3 support on your machine, and using sys-suspend to suspend it. However, in the course of driver development you will need to do some partial testing. In the case of suspend and resume, other drivers might not be compliant, so a suspend might fail before it ever reaches your driver.
No problem! There are test points to help you out. To start, we need to understand the primary entry point to the cpr module (where most of the work is done). Note that these commands require privilege, so you will have to become root or otherwise obtain root privileges (see the RBAC related guides). The main command is uadmin(1m). This utility takes two required arguments and an optional third. The first is the primary command, the second is the function within that command, and the third is a string whose meaning depends on the first two.
For Suspend and Resume, the command will always be 3 (actually defined as A_FREEZE in the uadmin(2) man page). This is the same for all Suspend and Resume functions, including S3 (Suspend to RAM, or sleep) and S4 (Suspend to Disk, or hibernate). Note that, as of this writing, S3 is supported only on x86 systems, and S4 is supported only on SPARC systems. The second argument instructs the cpr module what it should do: sleep, hibernate, or do a test entry. The third is used by only one specific test point, and is described with that test point.
So let's start with the first four common functions: 0, 1, 20, and 21.
0 Suspend to Disk
1 Check Suspend to Disk
20 Suspend to RAM
21 Check Suspend to RAM
The "Check" functions simply verify that the feature is supported and enabled for your platform. This is useful for applications that only want to know whether an operation can be performed. They return 0 if supported, and non-0 if unsupported (the system call also returns an ENOTSUP error code). The "Suspend" functions actually perform the operation. They return 0 if successful, and non-0 if unsuccessful. So if a "Suspend to RAM" operation is attempted on a SPARC machine, it will return non-0 and not do a suspend (in this case, the system call also returns ENOTSUP, though other failures will return different error codes).
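As a minimal sketch of using a "Check" function from a script, the following wraps uadmin 3 21 and reports the result based only on the exit status described above. The UADMIN variable and the check_s3 name are my own illustrative additions (UADMIN just lets you point the function at a stub when trying it out); a POSIX-style shell is assumed.

```shell
#!/bin/sh
# Sketch: ask whether Suspend to RAM is supported and enabled, without suspending.
# UADMIN is a hypothetical override hook; by default the real uadmin is used.
UADMIN=${UADMIN:-uadmin}

# Print a one-line verdict for "check" function 21 and pass on the exit status.
check_s3() {
    if "$UADMIN" 3 21; then
        echo "S3 supported and enabled"
    else
        status=$?
        echo "S3 unavailable (uadmin exit status $status)"
        return "$status"
    fi
}
```

Running check_s3 as root before a test session is a cheap way to confirm that power.conf(4) is set up the way you expect.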
OK, we already know these, as this is what happens when the sys-suspend command is executed. But we also know that if there are non-compliant drivers, the uadmin command may never get to your specific driver, or may fail in other ways, which makes it hard to test a driver. This is where some other functions come into play. As this blog is more specific to x86 Suspend to RAM, I will only describe the Suspend to RAM test points.
22 Suspend to RAM, but don't power off and resume normally
23 Suspend to RAM, but don't power off and resume as a failure
24 Suspend to RAM, but don't power off and skip drivers that fail suspend
25 Suspend only a single driver - this takes the optional 'mdep' argument
The first two test points should be fairly self-explanatory. They go through the process of suspending, but instead of powering off, they just resume (the second sets a "failure" error that is mostly only useful for framework testing, as errors are not provided to the drivers on resume). This is a good way to verify the code paths of your driver: if your driver doesn't properly execute DDI_SUSPEND or DDI_RESUME, it will show up quickly. Just as importantly, the hardware state remains intact, as the hardware wasn't actually powered off. This is also useful if you are working on a system without a compliant framebuffer and had to bypass its suspend processing. Note, though, that this is an incomplete test for the same reason: the hardware is never actually powered off, so restoring hardware state on resume is not exercised.
The "25" function, though, can be used to suspend only your driver. It takes an optional 3rd (or mdep) argument that is the major number of your driver. This can be found by searching /etc/name_to_major for your driver name, and using that number. It will only suspend your driver, and isn't useful for nexus drivers, as it doesn't walk the device tree to suspend leaf drivers (and can have adverse effects on those drivers). This may change in the future, however. But it is very useful for test-suspending hardware drivers. The syntax is:
uadmin 3 25 14, where 14 is the driver major number.
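The lookup in /etc/name_to_major can be scripted. Here is a small sketch, assuming the file holds one "name major" pair per line and that a POSIX-style shell is in use; the driver_major function name and the optional file argument are my own illustrative choices.

```shell
#!/bin/sh
# Sketch: print the major number for a driver name.
# Usage: driver_major NAME [FILE]   (FILE defaults to /etc/name_to_major)
driver_major() {
    # Each line is expected to be "<driver name> <major number>".
    while read -r name major; do
        if [ "$name" = "$1" ]; then
            echo "$major"
            return 0
        fi
    done < "${2:-/etc/name_to_major}"
    # No matching driver name was found.
    return 1
}
```

With that in place, the single-driver test becomes uadmin 3 25 "$(driver_major mydrv)", where mydrv is a placeholder for your driver name.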
The last function, 24, should be used very cautiously! It walks the device tree as expected for a regular suspend and resume, but if it encounters a driver that returns DDI_FAILURE, instead of stopping and resuming, it continues to the next driver, ultimately walking through the entire device tree. It has no way of knowing whether a driver returned a failure because it actually failed, or because the driver doesn't support suspend and resume. In either case, the machine could be left in an undesirable state, and a reboot might be needed (or worse, it hangs, and you don't know what failed). As with all the other functions, /var/adm/messages will contain information about which drivers failed, or what other failures might have occurred.
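Since /var/adm/messages can be long, it helps to look only at what a given test run added. The sketch below records the line count before the test and prints anything appended afterwards. The log_mark and log_since names are hypothetical, a POSIX-style shell and tail are assumed, and no filtering is attempted because the exact wording of the messages is not covered here.

```shell
#!/bin/sh
# Sketch: show only the log lines written after a recorded mark.

# Print the current number of lines in the log (default /var/adm/messages).
log_mark() {
    echo "$(( $(wc -l < "${1:-/var/adm/messages}") ))"
}

# Print the lines that follow a previously recorded mark.
# Usage: log_since MARK [FILE]
log_since() {
    tail -n +"$(( $1 + 1 ))" "${2:-/var/adm/messages}"
}
```

A typical use would be mark=$(log_mark), then the uadmin test, then log_since "$mark". If the log is rotated between the two calls, the mark no longer applies.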
This leads us to a good way to test your driver: first try uadmin 3 25 [major_number], and see how your driver responds to a simple driver suspend. Then try a suspend without power off with uadmin 3 22. If that is still successful, it is time to try the real thing and execute uadmin 3 20, or /usr/openwin/bin/sys-suspend. If all goes well, your machine will suspend, and with the simple press of a key (or the power button), the machine will resume to its original state.
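The three stages can be strung together so that a later stage only runs if the earlier one succeeded. This is a rough sketch, not a polished tool: staged_suspend_test and the UADMIN override are names I made up for illustration, a POSIX-style shell is assumed, and it relies only on uadmin returning 0 on success and non-0 on failure.

```shell
#!/bin/sh
# Sketch: run the three test stages in order, stopping at the first failure.
# UADMIN is a hypothetical override hook; by default the real uadmin is used.
UADMIN=${UADMIN:-uadmin}

# Usage: staged_suspend_test MAJOR
# Returns 0 if every stage passed, otherwise the number of the failed stage.
staged_suspend_test() {
    major=$1
    echo "stage 1: suspend only the driver with major number $major"
    "$UADMIN" 3 25 "$major" || { echo "stage 1 failed"; return 1; }
    echo "stage 2: full suspend and resume without power off"
    "$UADMIN" 3 22 || { echo "stage 2 failed"; return 2; }
    echo "stage 3: real Suspend to RAM"
    "$UADMIN" 3 20 || { echo "stage 3 failed"; return 3; }
    echo "all stages passed"
}
```

Run it as root with your driver's major number, for example staged_suspend_test 14. Stage 3 really does suspend the machine, so the final message only appears after you wake it up again.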
This is a great time to have a "Chimay Blue Label"! A 'powerful' Belgian Trappist style ale with complex Belgian malt character, and hop bitterness to complement the malt body. A light fruitiness due to the hops and Trappist yeast, and a good warming level of alcohol, make for a strong and very enjoyable ale to celebrate the success of your driver!