Sunday Dec 27, 2009

Five cscope Tips

As software becomes increasingly complex and codebases continue to sprawl, source code cross-reference tools have become a critical component of a software engineer's toolbox. Indeed, since most of us are tasked with enhancing an existing codebase (rather than writing from scratch), proficiency in use of a cross-reference tool can mean the difference between understanding the subtleties of a subsystem in an afternoon and spending weeks battling "unforeseen" complications.

At Sun, we primarily use a tweaked version of the venerable cscope utility, which has origins going back to AT&T in the 1980s (and is now freely available). As with many UNIX utilities, despite its age it has remained popular because of its efficiency and flexibility, which are especially important when understanding (and optionally modifying) source trees with several million lines of code.

Despite cscope's importance and popularity, I've been surprised to discover that few are familiar with anything beyond the basics. As such, in the interest of increasing cscope proficiency, here's my list of five features every cscope user should know:

  1. Display more than 9 search results per page with -r.

    Back in the 1980s the default behavior may have made sense, but with modern xterms often configured to have 50-70 rows the default is simply inefficient and tedious. By passing the -r option to cscope at startup (or including -r in the CSCOPEOPTIONS environment variable), cscope will display as many search results as will fit. The only caveat is that selecting an entry from the results must include explicitly pressing return (e.g., "3 [return]" instead of "3") so that entries greater than 9 can be selected. I find this tradeoff more than acceptable. (Apparently, the current open-source version of cscope uses letters to represent search results beyond 9 and thus does not require -r.)

  2. Display more pathname components in search results with -pN.

    By default, cscope only displays the basename of a given matching file. In large codebases, files in different parts of the source tree can often have the same name (consider main.c), which makes for confusing search results. By passing the -pN option to cscope at startup (or including -pN in the CSCOPEOPTIONS environment variable) -- where N is the number of pathname components to display -- this confusion can be eliminated. I've generally found -p4 to be a good middle ground. Note that -p0 will cause pathnames to be omitted entirely from search results, which can also be useful for certain specialized queries.

  3. Use regular expressions when searching.

    While it is clear that one can enter a regexp when using "Find this egrep pattern", it's less apparent that almost all search fields will accept regexps. For instance, to find all definitions starting with ipmp_ and ending with ill, just specify ipmp_.*ill to "Find this definition". In addition to allowing groups of related functions to be quickly found, I find this feature is quite useful when I cannot remember the exact name of a given symbol but can recall specific parts of its name. Note that this feature is not limited to symbols -- e.g., passing .*ipmp.* to "Find files #including this file" returns all files in the cscope database that #include a file with ipmp somewhere in its name.

  4. Use filtering to refine previous searches.

    cscope provides several mechanisms for refining searches. The most powerful is the ability to filter previous searches through an arbitrary shell command via ^. For instance, suppose you want to find all calls to GLDv3 functions (which all start with mac_) from the nge driver (which has a set of source files starting with nge). You might first specify a search pattern of mac_.* to "Find functions calling this function". With ON's cscope database, this returns a daunting 2400 matches; filtering with "^grep common/io/nge" quickly pares the results down to the 12 calls that exist within the nge driver. Note that this can be repeated any number of times -- e.g., "^sort -k2" alphabetizes the remaining search results by calling function.

  5. Use the built-in history mechanisms.

    You can quickly restore previous search queries by using ^b (control-b); ^f will move forward through the history. This feature is especially useful when performing depth-first exploration of a given function hierarchy. You can also use ^a to replay the most recent search pattern (e.g., in a different search field), and the > and < commands to save and restore the results of a given search. Thus, you could save search results prior to refining them using ^ (as per the previous tip) and restore them later, or restore results from a past cscope session.
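
To make tips 1 through 4 a bit more concrete, here is a minimal shell sketch that can be run without cscope installed. It rests on several assumptions: the -r option and the CSCOPEOPTIONS variable are taken from the description above of the cscope we use at Sun (other builds may ignore them), the result lines and symbol names are made up for illustration, and the "file function line text" layout is only my guess at what a filter command would see. The grep and sort commands are the ones you would type after ^ inside cscope; grep -x stands in for the way a search field is assumed to match a regexp against the whole symbol name.

```shell
#!/bin/sh

# Tips 1 and 2: set the startup options once instead of typing them each time.
# Assumption: your cscope honors CSCOPEOPTIONS and -r, as described in the post.
CSCOPEOPTIONS="-r -p4"
export CSCOPEOPTIONS

# Hypothetical search results, assumed layout: file function line text.
sample_results() {
    cat <<'EOF'
common/io/nge/nge_main.c nge_receive 1204 mac_rx(ngep->mh, NULL, mp);
common/io/bge/bge_main.c bge_attach 3120 mac_register(macp, &bgep->mh);
common/io/nge/nge_main.c nge_attach 2511 mac_register(macp, &ngep->mh);
common/io/e1000g/e1000g_main.c e1000g_attach 998 mac_register(macp, &adp->mh);
EOF
}

# Tip 4: keep only the nge driver's calls, then order them by calling function.
nge_calls=$(sample_results | grep common/io/nge | sort -k2)

# Tip 3: the regexp from the post, applied to made-up symbol names.
# -x requires the whole name to match, so only names ending in "ill" survive.
ipmp_matches=$(printf '%s\n' \
    ipmp_ill_refresh_active \
    ipmp_illgrp_add_ill \
    ipmp_grp_create |
    grep -x 'ipmp_.*ill')

printf '%s\n' "$nge_calls"
printf '%s\n' "$ipmp_matches"
```

Inside cscope itself there is no pipeline to type: you would run the search, enter ^grep common/io/nge at the prompt, and then ^sort -k2 on whatever remains.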

Of course, this is just my top-five list -- there are many other powerful features, such as the ability to make changes en masse, build custom cscope databases using the xref utility, embed command-line mode in scripts (mentioned in a previous blog entry), and employ numerous extensions that provide seamless interaction with popular editors such as XEmacs and vim. Along these lines, I'm eager to hear from others who have found ways to improve their productivity with this exceptional utility.



