Wednesday Sep 03, 2008

strsep() in libc

As of today, strsep() function lives in Nevada's libc (tracked by CR 4383867 and PSARC 2008/305). This constitutes another step in the quest for more feature-full (in terms of compatibility) libc in OpenSolaris. In binary form, the changes will be available in build 99. The documentation will be part of the string(3C) man page.

Here's a small example of how to use it:

#include <stdio.h>
#include <string.h>
#include <err.h>

int parse(const char \*str) {
        char \*p = NULL;
        char \*inputstring, \*origstr;
        int ret = 1;
        
        if (str == NULL)
                errx(1, "NULL string");

        /\*
         \* We have to remember original pointer because strsep()
         \* will change 'inputstr' pointer.
         \*/
        if ((origstr = inputstring = strdup(str)) == NULL)
                errx(1, "strdup() failed");

        printf("=== parsing '%s'\\n", inputstring);
        for ((p = strsep(&inputstring, ",")); p != NULL;
           (p = strsep(&inputstring, ","))) {
                if (p != NULL && \*p != '\\0')
                        printf("%s\\n", p);
                else if (p != NULL) {
                        warnx("syntax error");
                        ret = 0;
                        goto bad;
                }
        }
bad:
        printf("=== finished parsing\\n");
        free(origstr);
        return (ret);
}

int main(int argc, char \*argv[]) {
        if (argc != 2)
                errx(1, "usage: prog ");

        if (!parse(argv[1]))
                exit(1);

        return (0);
}

This example was actually used as a unit test (use e.g. "1,22,33,44" and "1,22,,44,33" as input string) and it also nicely illustrates important properties of strsep() behavior:

  • While searching for tokens, strsep() modifies the original string. This is shared property with strtok().
  • Unlike strtok(), strsep() is able to detect empty fields.

There is a function in Solaris' libc which can do token splitting and does not modify the original string - strcspn(). The other notable property of strsep() is that (unlike strtok()) it does not conform to ANSI-C. Time to draw a table:

 function(s)   ISO C90    modifies     detects
                           input     empty fields
-------------+----------+----------+--------------+
 strsep()        No          Yes         Yes
 strtok()        Yes         Yes         No
 strcspn()       Yes         No        Sort of

None of the above functions is bullet-proof. The bottom line is the user should decide which is the most suitable for given task and use it with its properties in mind.

About

blog about security and various tools in Solaris

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today