Sometimes a slash is not a slash, as this quiz explains.
March 22, 2021 | Download a PDF of this article
More quiz questions available here
If you have worked on our quiz questions in the past, you know none of them is easy. They model the difficult questions from certification examinations. We write questions for the certification exams, and we intend that the same rules apply: Take words at their face value and trust that the questions are not intended to deceive you but to straightforwardly test your knowledge of the ins and outs of the language.
The objective of this quiz is to use the Path
interface to operate on files and directory paths. Given this code fragment:
Path defaultRoot =
Paths.get(System.getProperty("user.dir")).getRoot();
Path path = Paths.get("tmp", "john", "..", "doe");
path = defaultRoot.resolve(path);
System.out.print(path.getName(2)); // line n1
Assume that the file system containing the current working directory has an empty directory, tmp
, in its root.
What is the result? Choose one.
Answer. The Path
class represents the idea of a path on the file system—that is, an optional sequence of hierarchical directory names, perhaps ending in a filename. A Path
object itself is not directly linked with the physical file system. Such a connection is created when some action is taken using the Path
—for example, listing the contents of a directory or creating a file. The reason for avoiding such a hard connection is fairly compelling: If you could use a Path
only to represent something that already exists, a Path
could not be used to create a new directory or file.
Given this background, the question revolves around four inquiries. What do the Paths.get
operations do? What does the resolve
operation do? Can you extract getName(2)
from the path after resolve
has done its work? And if getName
doesn’t throw an exception, what value does it provide in this situation?
You could be forgiven for wondering what the first two lines do in general and, to be fair, some of that code is beyond the scope of the exam. That first line in particular is likely beyond the scope of the real exam, but we left it in to show how you can answer a question even if you don’t necessarily have a perfect understanding of everything. Given that none of the options admits the possibility of code other than the last line failing in any way, you can safely assume that these first lines compile and run without crashing. It’s also reasonable—and accurate—to assume that this opening code does what it seems to suggest.
Java’s APIs give you the tools to work with hierarchical directories without ever making explicit literal references to root directories and path separators.
The first line extracts a Path
object that represents the root of the file system that contains the user’s current working directory. So, for example, if the current working directory is C:\users\simon\javaprojects\examproj1
, the extracted Path
object represents C:\
.
The second line extracts another Path
—which is likely to be a relative path—representing a path hierarchy that might be represented in a UNIX-like format as either tmp/john/../doe
or tmp/doe
.
Which of those two relative paths do you get? From the perspective of Paths.get
, the ..
is simply an element of the path; it is not treated as anything special in the basic creation process. Therefore, the effective path has four elements: tmp
, john
, ..
, and doe
.
Why do you make the path this way rather than by simply specifying a String
such as tmp/john/../doe
? The reason is that Java seeks to allow you to create platform-independent programs. It’s important to be able to describe and manipulate paths on a file system without hardcoding things such as forward slashes or backslashes. Allowing a variable-length argument list of Strings
and joining them at runtime in whatever way is appropriate for the current execution platform is much more flexible than defining literal strings with separators. You can access the system property file.separator
to find the local character if you want, but why bother? So, yes, this approach does work, and it looks as if it should. Therefore, option D is incorrect.
As a side note, in literal path strings, forward slashes (UNIX style) work properly on Microsoft Windows–based JVMs, but backslashes (Windows style) fail on UNIX-based JVMs. The flexibility on Windows JVMs is possible because a forward slash is not a legal character at the operating system level in Windows paths or filenames. Therefore, if the JVM sees the forward slash, it can safely swap that character for a backslash before handing the path to the Windows system. But if a backslash shows up in a literal path string in UNIX, it’s a legal—if rather odd—character in a UNIX pathname, so simply swapping would change a valid meaning.
This entire question of forward slashes and backslashes is made even more interesting because there are other systems that use different path schemes entirely. On the OpenVMS operating system, local paths have a structure such as sys$disk:[dir.path.elements]myfile.txt;1
. Clearly, making simple assumptions about slashes would not work with this. The lesson here is that Java’s APIs give you the tools to work with hierarchical directories without ever making explicit literal references to root directories and path separators. Clearly, it’s a good habit to use them, because it will protect your code if it’s ever run on unfamiliar operating systems.
The next question is what does resolve
do? There are a couple of corner cases, but the most commonly used behavior is the one used here. If the invocation path is not empty and the argument path is not absolute, the effect is to concatenate the two paths. As a result, this takes the relative path (tmp/john/../doe
in UNIX-like format) and anchors it to the root of the current file system. Again in UNIX-like format, this becomes /tmp/john/../doe
. Note that resolve
still does not eliminate the ..
part; that is the job of the method normalize
.
So, now you need to know whether getName(2)
is successful and, if it is, what it returns. It turns out that Path
effectively treats the elements of a path (in this case, tmp
, john
, ..
, and doe
) like a list with a zero-based indexing system. Further, the root part of the path is not included in that list nor in its indexing scheme. Importantly, this exclusion of the root is consistent even if the path were on a Windows system where this example would likely represent C:\tmp\john\..\doe
. Thus, in a platform-independent way, the indexing starts at zero with tmp
(and never with C:\
or something similar). This again shows that option D is incorrect.
Further, you can see that the index 2 should be within the valid range, so option E is incorrect, because no exception is thrown. Further, you will get ..
regardless of the host operating system. This tells you that option B is correct, while options A and C are both incorrect.
Conclusion: The correct answer is option B.