By meem on Sep 02, 2008
Being able to easily write scripts from the command-line has long been regarded as one of UNIX's core strengths. However, over the years, surprisingly little attention has been paid to writing CLIs whose output lends itself to scripting. Indeed, even modern CLIs often fail to consider parsable output as a distinct concept from human output, leading to overwrought and fragile scripts which inevitably break as the CLI is enhanced over time. Some recent CLIs have "solved" the parsable format problem by using popular formats such as XML and JSON. These are fine formats for sophisticated scripting languages, but a poor match for traditional UNIX line-oriented tools (e.g. grep, cut, head) that form the foundation of shell-scripting.
Even those CLIs that consider the shell when designing a parsable output format often fall short of the mark. For dladm(1M), it took us (Sun) three tries to create a format that can be easily parsed from the shell. So, while the final format we settled on may seem simple and obvious, as is often the case, making things simple can prove to be surprisingly hard. Further, there are a number of alternative output formats that seem compelling at first blush but ultimately prove to be unworkable.
So that others working on similar problems may benefit, below I've summarized our set of guidelines -- some obvious, some not -- that we arrived at while working on dladm. As each CLI has its own constraints, not all of them may prove applicable, but I'd urge anyone designing a CLI with parsable output to consider each one carefully.
To provide some specifics to hang our guidelines on, first, here's an example of the dladm human output format:
    # dladm show-link -o link,class,over
    LINK        CLASS    OVER
    eth0        phys     --
    eth1        phys     --
    eth2        phys     --
    default0    aggr     eth0 eth1 eth2
    cyan0       vlan     default0

... and here's the equivalent parsable output format:

    # dladm show-link -p -o link,class,over
    eth0:phys:
    eth1:phys:
    eth2:phys:
    default0:aggr:eth0 eth1 eth2
    cyan0:vlan:default0

Now, the guidelines:
- Design CLIs that output in a regular format -- even in human output mode.
Once your human output mode ceases to be regular (ifconfig(1M) output is a prime example of an irregular format), later adding a parsable output mode becomes difficult if not impossible. (As an aside, I've often found that irregular output suggests deeper design flaws, either in the CLI itself or the objects it operates on.)
- Prefer tabular formats in parsable output mode.
Because traditional shell scripting works best with lines of information, tabular formats where each line both identifies and describes a unique object are ideal. For example, above, the link field uniquely identifies the object, and the class and over fields describe that object. In some cases, multiple fields may be required to uniquely identify the object (e.g., with dladm show-linkprop, both the link and the property field are needed). As an aside: in the multiple-field case, the human output mode may choose to use visual nesting (e.g., by grouping all of a link's properties together on successive lines and omitting the link value entirely), but it's important this not be done in parsable output mode so that the shell script can remain simple.
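As a rough sketch of why one-object-per-line output suits line-oriented tools, the snippet below selects the physical links from the sample output. The `show_link` function is a hypothetical stand-in that only replays the parsable sample shown earlier (in place of running `dladm show-link -p -o link,class,over`), so the example runs on any system.

```shell
# Hypothetical stand-in for `dladm show-link -p -o link,class,over`;
# it replays the sample parsable output so the sketch runs anywhere.
show_link() {
    cat <<'EOF'
eth0:phys:
eth1:phys:
eth2:phys:
default0:aggr:eth0 eth1 eth2
cyan0:vlan:default0
EOF
}

# Every line identifies one link (field 1) and describes it (fields 2, 3),
# so selecting by class and extracting the name is a single awk filter.
phys_links=$(show_link | awk -F: '$2 == "phys" { print $1 }')
printf '%s\n' "$phys_links"
```

Because no line depends on the line before it, the same data can just as easily be fed to grep, cut, sort, or head.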
- Require output fields to be specified.
Unlike humans, scripts always invoke a CLI with a specific purpose in mind. Also unlike humans, scripts are not facile at adapting to change (e.g., the addition of new fields). Thus, it's imperative that scripts be forced to explicitly specify the fields they need (with dladm, attempting to use -p without -o yields an error message). With this approach, new fields can be added to a CLI without any risk of breaking existing consumers. Further, if a field used by a script is removed, the failure mode becomes hard (the CLI will produce an error), rather than soft (the consumer misparses the CLI's output and does something unpredictable). Note that for similar reasons, if your CLI provides a way to print field subsets that may change over time (e.g., -o all), those must also fail in parsable output mode.
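Seen from the CLI author's side, the rule might look something like the following minimal sketch. It assumes a hypothetical `show` command written as a shell function; the option letters mirror dladm's -p and -o, but the error text and the `fields:` output are invented for illustration and are not dladm's actual behavior.

```shell
# Hypothetical CLI front end (not dladm itself): -p selects parsable
# output and -o names the fields to print.
show() {
    parsable=0
    fields=
    OPTIND=1
    while getopts "po:" opt; do
        case $opt in
        p) parsable=1 ;;
        o) fields=$OPTARG ;;
        *) return 2 ;;
        esac
    done
    if [ "$parsable" -eq 1 ]; then
        case $fields in
        "")  echo "show: -p requires -o" >&2; return 1 ;;
        all) echo "show: -o all is not allowed with -p" >&2; return 1 ;;
        esac
    fi
    # A real CLI would print the requested fields here.
    echo "fields: ${fields:-default}"
}

show -p -o link,class          # accepted: the field list is explicit
show -p || echo "rejected"     # hard failure instead of a guessed field set
```

The point of the design is that a script gets a non-zero exit status up front, rather than output whose shape can silently change in a later release.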
- Leverage field specifiers to infer field names.
Because field names must be specified in an order, it's natural to use that same order as the output order, and thus avoid having to explicitly identify the field names in the parsable output format. That is, as shown above, dladm can omit indicating which field name corresponds with which value because the order is inferred from the invocation of -o link,class,over. This may seem a minor point, but in practice it saves a lot of grotty work in the shell to otherwise correlate each field name with its value.
- Omit headers.
Similarly, because the field order is known (and no human will be staring at the output) there is no utility in providing a header in parsable output mode, and indeed its presence would only complicate parsing. As shown above, dladm omits the header in parsable output mode.
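The last two guidelines can be sketched together: with the field order fixed by -o and no header to skip, the variable names handed to read simply mirror the field list. As before, `show_link` is a hypothetical stand-in that replays the sample output rather than invoking dladm, and `stacked_links` is just an illustrative helper name.

```shell
# Hypothetical stand-in for `dladm show-link -p -o link,class,over`.
show_link() {
    cat <<'EOF'
eth0:phys:
eth1:phys:
eth2:phys:
default0:aggr:eth0 eth1 eth2
cyan0:vlan:default0
EOF
}

# The names given to read follow the -o order (link, class, over), so
# there is no header line to discard and no name=value pairs to decode.
stacked_links() {
    show_link | while IFS=: read link class over; do
        if [ -n "$over" ]; then
            echo "$link ($class) runs over: $over"
        fi
    done
}

stacked_links
```

Note that the over value for default0 arrives intact, spaces and all, because only ":" is treated as a separator.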
- Do not adorn your field values.
In human output mode, it can be useful to give visual indications for specific field values. For instance, as shown above, dladm shows empty values as "--" in human output mode so that the table does not look malformed. In parsable output mode, such embellishments only complicate and confuse consumers of the data (and may in fact make it ambiguous), and thus should be avoided. As above, in parsable output format, empty values are shown as actually being empty.
- Do not use whitespace as a field separator.
Whitespace may seem like a natural field separator, but in practice it's problematic. Specifically, many shells treat whitespace separators specially by merging consecutive instances into a single instance. For example, consider representing three consecutive empty values. With a non-whitespace field separator such as ":", this would be output as "::" (empty value 1, : separator, empty value 2, :, empty value 3). With the shell's IFS variable set to ":", the shell will parse this as three separate empty values, as intended. With space as the field separator, this would be output as "  ", and with IFS set to " " the shell would misparse this as a single empty value.
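The merging behavior is easy to see with a small, self-contained experiment. This sketch uses made-up values (`x`, an empty middle field, and `z`) rather than real dladm output, since an empty field in the middle makes the misalignment visible.

```shell
# Three fields, the middle one empty, separated by ":".
IFS=: read a b c <<'EOF'
x::z
EOF
colon_result="[$a][$b][$c]"

# The same three fields separated by single spaces: the two adjacent
# spaces are merged into one separator, so "z" slides into field 2.
IFS=' ' read a b c <<'EOF'
x  z
EOF
space_result="[$a][$b][$c]"

printf '%s\n' "$colon_result"    # [x][][z]
printf '%s\n' "$space_result"    # [x][z][]
```

With ":" the empty field keeps its position; with a space the script silently reads the wrong column.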
- Do not restrict your allowed field values.
While some fields may be controlled directly by the CLI (e.g., the class field above), others are either outside of your direct control (e.g., the link field above), or outside of even your system's control (e.g., the essid field output by dladm show-wifi). As such, aside from ensuring the field value is printable ASCII (where newline is considered as unprintable), no values should be filtered out or forbidden.
Thus, any values that have special meaning should generally be escaped. For instance, with ":" as a field delimiter, IPv6 address "fe80::1" would become "fe80\:\:1" when displayed in parsable output mode. Thankfully, escaping does not complicate shell parsing because all popular scripting shells have read builtins that will automatically strip escapes. Thus, the common idiom of piping the output to a read/while loop works as expected without any special-purpose logic. For instance, even though the BSSID field will contain embedded colons, this will loop through each BSSID on each link, trying to connect to one until it succeeds:
    dladm scan-wifi -p -o link,bssid | while IFS=: read link bssid; do
        dladm connect-wifi -i $bssid $link && break
    done

That said, if only a single field has been requested, the field separator is not needed. Since no ambiguity exists in that case, there's no need to escape it -- and not doing so can make things more convenient for other shell idioms -- e.g., to collect all in-range SSIDs:

    ssids=`dladm scan-wifi -p -o bssid`
If unprintable ASCII values can legitimately occur in a given
field's output, you need to use another encoding format.
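To make the escaping round trip described above concrete, here is a minimal sketch under stated assumptions: `escape_field` is a hypothetical helper (not part of dladm) that backslash-escapes ":" and "\", and the link name and BSSID are made-up sample values. The BSSID is deliberately placed first, because read always assigns the unsplit remainder of the line to its last variable, which would hide the effect.

```shell
# Hypothetical producer-side helper: backslash-escape the ":" delimiter
# and the backslash itself within a single field value.
escape_field() {
    printf '%s\n' "$1" | sed 's/[\\:]/\\&/g'
}

# Build one parsable line in the order bssid,link (sample values).
line="$(escape_field '00:11:22:33:44:55'):wifi0"
printf '%s\n' "$line"            # 00\:11\:22\:33\:44\:55:wifi0

# Consumer side: read (without -r) treats "\:" as a literal colon and
# strips the backslash, so no extra unescaping logic is needed.
IFS=: read bssid link <<EOF
$line
EOF
printf '%s %s\n' "$bssid" "$link"    # 00:11:22:33:44:55 wifi0
```

The sketch relies on read being used without -r; with -r the backslashes would be kept and every escaped colon would split the field.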