Using DRM-Specific Functionality Via DRMAA
By templedf on Jul 05, 2007
This topic came up recently on the Grid Engine open source users mailing list. DRMAA is great for writing cross-DRM applications, but what if you need some specific functionality for a specific DRM? Fortunately, the DRMAA specification has an answer for that. Two, in fact.
Answer #1: the native specification
The simplest answer is to use the job template's native specification attribute. The native specification is an opaque string that will be passed directly to the underlying DRM system to modify the template for the job being submitted. In the Grid Engine implementation, the native specification accepts the same set of switches as the qsub command. (Sort of. See below.)
Let's take an example. If you want to specify the queue to which the job should be submitted as all.q, the qsub switch would be -q all.q. To do the same thing in DRMAA, since there isn't a job template attribute for setting the queue, you'd use the native specification. In this example, you'd set the native specification to "-q all.q". In C, that would look like:
drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, "-q all.q", error, DRMAA_ERROR_STRING_BUFFER - 1);
and in the Java™ language, it would be:
jt.setNativeSpecification("-q all.q");
If you wanted to apply more than one native switch, you'd just string them together, like "-q all.q -l h_cpu=600 -pe make 5". There are two caveats. First, a known issue, due to be fixed in an upcoming release, requires that the native specification not start with whitespace. Second, the following options are silently ignored: -cwd, -help, -sync, -t, -verify, -w w, and -w v. These options have no meaning in the context of a DRMAA application. See the drmaa_attributes(3) man page for more details.
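As a rough illustration of both caveats, here is a minimal C sketch of two helpers an application might keep next to its DRMAA code. Neither function is part of DRMAA or Grid Engine; build_native_spec and first_ignored_switch are hypothetical names. The first joins switches with single spaces and drops leading whitespace, and the second reports the first switch from the ignored list above. The token matching is deliberately naive and assumes no option value happens to look like an ignored switch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: joins qsub-style switches into one string separated
 * by single spaces. Leading whitespace on each piece is skipped, so the
 * result never starts with whitespace.
 * Returns 0 on success, -1 if the result does not fit in out. */
int build_native_spec(char *out, size_t out_len,
                      const char *const *switches, size_t count)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    for (i = 0; i < count; i++) {
        const char *s = switches[i];
        size_t len, need;

        while (*s != '\0' && isspace((unsigned char)*s)) {
            s++;
        }
        len = strlen(s);
        if (len == 0) {
            continue;               /* nothing left after trimming */
        }
        /* One separating space before every piece except the first. */
        need = len + (used > 0 ? 1 : 0);
        if (used + need >= out_len) {
            return -1;              /* out stays NUL-terminated */
        }
        if (used > 0) {
            out[used++] = ' ';
        }
        memcpy(out + used, s, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Hypothetical helper: returns the first silently ignored switch found in
 * spec, or NULL if none is found (or if spec is too long for this sketch). */
const char *first_ignored_switch(const char *spec)
{
    static const char *const ignored[] = {
        "-cwd", "-help", "-sync", "-t", "-verify"
    };
    char copy[512];
    char *tok;
    size_t i;

    if (strlen(spec) >= sizeof copy) {
        return NULL;
    }
    strcpy(copy, spec);             /* strtok modifies its input */

    tok = strtok(copy, " \t");
    while (tok != NULL) {
        for (i = 0; i < sizeof ignored / sizeof ignored[0]; i++) {
            if (strcmp(tok, ignored[i]) == 0) {
                return ignored[i];
            }
        }
        /* Only "-w w" and "-w v" are on the ignored list. */
        if (strcmp(tok, "-w") == 0) {
            tok = strtok(NULL, " \t");
            if (tok == NULL) {
                break;
            }
            if (strcmp(tok, "w") == 0) return "-w w";
            if (strcmp(tok, "v") == 0) return "-w v";
        }
        tok = strtok(NULL, " \t");
    }
    return NULL;
}
```

The string produced by build_native_spec is what you would pass as the value argument to drmaa_set_attribute with DRMAA_NATIVE_SPECIFICATION. Checking it with first_ignored_switch first lets the application log a warning instead of having the option vanish quietly.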
Answer #2: job categories
The other option is the job template's job category attribute. The job category is essentially the same thing as the native specification, except that there's a level of indirection thrown in. Instead of directly specifying the DRM-specific string, you specify a look-up name. The DRM then uses that look-up name to find the DRM-specific string to use. All of the same rules apply, but your code doesn't contain any DRM-specific code. When you release an application that uses job categories, you should include a list of the job categories that the administrator needs to define and what they should specify.
Let's revisit the previous example. First, you need to know where Grid Engine looks up the look-up name. Grid Engine uses the same files to resolve job categories that it uses to resolve command options for qtcsh, the qtask files. The qtask files have a very simple format: <command_name> <options>. In our example, we might add a line to the qtask file that looks like this:
main_queue -q all.q
In C, we would then set the job category like this:
drmaa_set_attribute(jt, DRMAA_JOB_CATEGORY, "main_queue", error, DRMAA_ERROR_STRING_BUFFER - 1);
In the Java language, it would look like:
jt.setJobCategory("main_queue");
As with the native specification, if you want to apply multiple native switches, you'd just string them all together. The leading-whitespace issue (Issue 2325) does not apply, but the list of ignored options does. Keep in mind that you can only specify one job category for a job template, so you can't rely on combining job categories.
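To make the level of indirection concrete, here is a minimal C sketch of resolving a look-up name against text in the <command_name> <options> layout described above. This is not how Grid Engine implements it; lookup_category is a hypothetical function, and the sketch assumes the simple two-field format with nothing else on the line.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: searches qtask-style text, one
 * "<command_name> <options>" entry per line, for the given category and
 * copies its options into out.
 * Returns 0 if found, -1 if not found or if out is too small. */
int lookup_category(const char *qtask_text, const char *category,
                    char *out, size_t out_len)
{
    size_t name_len = strlen(category);
    const char *line = qtask_text;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = (end != NULL) ? (size_t)(end - line) : strlen(line);

        /* A match is the category name followed by whitespace. */
        if (line_len > name_len &&
            strncmp(line, category, name_len) == 0 &&
            isspace((unsigned char)line[name_len])) {
            const char *opts = line + name_len;
            size_t opts_len;

            /* Skip the whitespace between the name and the options. */
            while (opts < line + line_len && isspace((unsigned char)*opts)) {
                opts++;
            }
            opts_len = (size_t)(line + line_len - opts);
            if (opts_len >= out_len) {
                return -1;
            }
            memcpy(out, opts, opts_len);
            out[opts_len] = '\0';
            return 0;
        }
        if (end == NULL) {
            break;                  /* last line had no trailing newline */
        }
        line = end + 1;
    }
    return -1;
}
```

With a line such as main_queue -q all.q in the text, looking up "main_queue" yields "-q all.q", which is the string the DRM would then treat much like a native specification. A name that is not defined simply fails to resolve, which is why the list of required categories matters to the administrator.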
When To Use Which
It's a trade-off, really. With the native specification, you make your code DRM-dependent. You could, of course, build in a switch statement that sets the correct options depending on the reported DRM, but that's still DRM-dependent. The up-side, though, is that you're in control. You don't have to hope that the grid administrator doesn't screw up the job category configuration. With the job category, your code is clean, but if the grid administrator misconfigures your job categories, your application could fail in interesting and unpredictable ways. One way to reduce that risk is to include the proper job category settings for the major DRMs in the release notes.
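The "switch on the reported DRM" idea might look something like the following C sketch. The DRM name would come from drmaa_get_DRM_system in a real application; here it is just a string parameter so the sketch stands alone. The identifier prefixes are assumptions, not documented values: "SGE" is a guess at how Grid Engine reports itself, and "EXAMPLE_DRM" with its --queue option is entirely made up. native_spec_for is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per supported DRM: an identifier prefix and the native
 * specification to use when the reported DRM starts with that prefix. */
struct drm_option {
    const char *drm_prefix;
    const char *native_spec;
};

/* Hypothetical helper: returns the native specification for the reported
 * DRM, or NULL if the DRM is not recognized. */
const char *native_spec_for(const char *drm_system)
{
    static const struct drm_option table[] = {
        { "SGE",         "-q all.q"     },  /* assumed Grid Engine prefix */
        { "EXAMPLE_DRM", "--queue main" }   /* made-up DRM and option */
    };
    size_t i;

    assert(drm_system != NULL);

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        size_t n = strlen(table[i].drm_prefix);

        if (strncmp(drm_system, table[i].drm_prefix, n) == 0) {
            return table[i].native_spec;
        }
    }
    return NULL;    /* unknown DRM: caller should skip the native spec */
}
```

A table keeps the DRM-specific strings in one place, but the code is still DRM-dependent: every new DRM means a new row and a new release.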
The intention of the DRMAA specification is that the job category should be used in most cases. The native specification is more of a last resort. Keep in mind that they are not mutually exclusive. For an explanation of how Grid Engine resolves conflicting sets of options, see this earlier post.