Quiz Yourself: Functional Interfaces (Advanced)

The subtleties of boxing and unboxing in streams

October 4, 2019


If you have worked on our quiz questions in the past, you know none of them is easy. They model the difficult questions from certification examinations. The levels marked “intermediate” and “advanced” refer to the exams rather than to the questions, although in almost all cases “advanced” questions will be harder. We write questions for the certification exams, and we intend that the same rules apply: Take words at their face value and trust that the questions are not intended to deceive you, but straightforwardly test your knowledge of the ins and outs of the language.

The objective of this question is to develop code that uses primitive versions of functional interfaces. Given the following code:

DoubleStream ds = DoubleStream.of(1.0, 2.0, 3.0);
… // line n1
System.out.println(ds.map(fun.apply(1.0)).sum());

Which code added at line n1 will process the stream in the most efficient way and print 9.0 to the console? Choose one.

  A. Function<Double, DoubleUnaryOperator> fun = a -> d -> d + a;
  B. DoubleFunction<DoubleUnaryOperator> fun = a -> d -> d + a;
  C. DoubleFunction<DoubleFunction<Double>> fun = a -> d -> d + a;
  D. Function<Double, DoubleFunction<Double>> fun = a -> d -> d + a;

Answer. This question addresses several topics. One is the interaction of primitive data types with Java’s generics mechanism. Another is the matching of complex functional types.

Let’s start by looking at the first topic in a general way. The generics mechanism in Java works only with object types. That is to say, you can define a List<Something> for any Something if, but only if, Something is a reference type (that is, some kind of object). You cannot define a List<int> or a List of any other primitive type.

To help mitigate this inconvenience, Java’s wrapper types may be used. So you can, for example, declare something as List<Integer>. Further, the autoboxing and unboxing features of the compiler will often (but not always) allow you to process primitive types in these kinds of generic situations without explicitly writing the code that converts between objects and primitives. So, although the following code works:

List<Integer> li = new ArrayList<>();
li.add(99);
int ninetyNine = li.get(0);

the compiler actually emits code equivalent to this form:

List<Integer> li = new ArrayList<>();
li.add(Integer.valueOf(99));
int ninetyNine = ((Integer)li.get(0)).intValue();

Notice that there is quite a bit of hidden CPU load in this second form. Placing the value in the list requires the construction and initialization of an Integer object (or perhaps the finding of an existing Integer in a pool of preconstructed objects). Reading it back involves a cast, which is performed on the value returned by get(0), followed by the invocation of the intValue method to extract the primitive result.

All this CPU overhead is typically a reasonable trade-off for improved source-code readability (and, hence, maintainability). However, it’s important to realize that it happens, because in some cases, the trade-off is unacceptable due to the cumulative loss of performance. In any case, it’s always “less efficient”—in the wording of the question—even if the loss is acceptable.
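
To make the hidden work concrete, here is a minimal sketch that computes the same total twice, once through a List<Integer> and once through an int array. The class and method names are hypothetical and chosen only for illustration. The comments mark where the compiler inserts the conversions; the sketch makes no claim about how large the cost is on any particular JVM.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Sums through the generic list; each get() yields an Integer that must be unboxed.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (int i = 0; i < values.size(); i++) {
            total += values.get(i); // implicit values.get(i).intValue()
        }
        return total;
    }

    // Sums a primitive array; no wrapper objects are involved.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> boxed = new ArrayList<>();
        int[] primitive = new int[1000];
        for (int i = 0; i < 1000; i++) {
            boxed.add(i);      // implicit Integer.valueOf(i)
            primitive[i] = i;
        }
        System.out.println(sumBoxed(boxed));         // prints 499500
        System.out.println(sumPrimitive(primitive)); // prints 499500
    }
}
```

Both methods produce the same answer; the difference lies only in the conversions performed along the way.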

Now, when you define data structures such as List, it’s relatively hard to avoid this boxing/unboxing process, and the trade-off is usually worth it. However, if you’re performing bulk computations, there’s a related way you could run afoul of the inability of generics to handle primitives. Suppose you want to define a process that takes a single input and creates a single result. This is the classic java.util.function.Function<E,F> where the input to the operation is of type E and the result is of type F. Now, suppose you want to use this operation to perform arithmetic on a double value. You might try this:

Function<Double, Double> fdd = in -> in + 2.0;

Unfortunately, you have created a function that takes a Double object, and from that, the compiler generates code to extract the double primitive from the wrapper object before adding 2.0. Further, the compiler generates code to build a Double object wrapper around the result. The overhead of such a transformation is huge compared to simply adding 2.0 to a double primitive, and that trade-off is much less likely to be acceptable if the transformation function is used repeatedly.
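
The following sketch spells out those hidden steps by hand. The class and field names are hypothetical, and the explicit version is only an approximation of what the compiler arranges, written to show where the unboxing and boxing occur.

```java
import java.util.function.Function;

class BoxedFunctionDemo {
    // As written in the text: relies on automatic unboxing and boxing.
    static final Function<Double, Double> IMPLICIT = in -> in + 2.0;

    // Roughly the same behavior with the conversions written out: unbox, add, box again.
    static final Function<Double, Double> EXPLICIT =
            in -> Double.valueOf(in.doubleValue() + 2.0);

    public static void main(String[] args) {
        // The literal 1.0 is itself boxed to a Double for each call.
        System.out.println(IMPLICIT.apply(1.0)); // prints 3.0
        System.out.println(EXPLICIT.apply(1.0)); // prints 3.0
    }
}
```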

In view of these limitations, and with the express goal of allowing increased efficiency—specifically by avoiding all this boxing and unboxing—the functional interfaces and the Stream API (and some other elements of Java 8’s functional features) provide special case versions that work directly on primitives. In this example, the relevant features are the following:

  • DoubleStream—A version of the stream concept dedicated to working with primitive double data
  • java.util.function.DoubleFunction<E>—A functional interface defining a method that takes a single primitive double argument and returns an object of type E
  • java.util.function.DoubleUnaryOperator—A functional interface defining a method that takes a single primitive double argument and returns a primitive double result

Many more such features are provided, but these three are the ones directly relevant to this question.
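
As a quick illustration of these three types, consider the sketch below. The class name, field names, and lambda bodies are hypothetical; only the three library types come from the discussion above.

```java
import java.util.function.DoubleFunction;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.DoubleStream;

class PrimitiveInterfacesDemo {
    // DoubleFunction<E>: primitive double in, object of type E out.
    static final DoubleFunction<String> LABEL = d -> "value=" + d;

    // DoubleUnaryOperator: primitive double in, primitive double out.
    static final DoubleUnaryOperator HALVE = d -> d / 2.0;

    // DoubleStream: the elements stay primitive from source to sum.
    static double halvedSum() {
        return DoubleStream.of(2.0, 4.0, 6.0).map(HALVE).sum();
    }

    public static void main(String[] args) {
        System.out.println(LABEL.apply(1.5));         // prints value=1.5
        System.out.println(HALVE.applyAsDouble(3.0)); // prints 1.5
        System.out.println(halvedSum());              // prints 6.0
    }
}
```

Note that the single abstract method has a different name in each interface: apply for DoubleFunction and applyAsDouble for DoubleUnaryOperator.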

To satisfy the first important piece of the question, you need to find an answer that avoids, or at least minimizes, boxing and unboxing, because those processes reduce the efficiency required by the question. That is satisfied because the given code creates a DoubleStream directly, which is the primitive version of a stream that handles double values directly. Therefore, in the subsequent processing, you’ll want to ensure that the values remain primitive, and for that you’ll want to be sure you’re using primitive-oriented functional interfaces.

Now let’s look at the rest of the background. You can see that all the options seek to define a variable fun, and fun is used in the stream processing in the expression ds.map(fun.apply(1.0)).sum().

A map operation takes each item of the upstream type, runs it through the provided processing operation (which must create one result), and produces a stream of the type returned by that processing operation. From this, you can determine several requirements about the operation, and from those, you can determine requirements about the type and behavior of fun.

Notice that fun is not the operation applied in the map. Rather, evaluation of the expression fun.apply(1.0) creates the operation performed by the map operation. In other words, fun is not itself the operation but rather it is a kind of factory for the behavior.
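
The factory idea is independent of streams and primitives. The sketch below shows the same shape with String values; every name in it is hypothetical and it is not part of the quiz code. One call to the outer function builds an operation, and that operation can then be applied as often as needed.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

class FactoryDemo {
    // Given a prefix, build an operation that prepends that prefix.
    static final Function<String, UnaryOperator<String>> PREFIXER = p -> s -> p + s;

    public static void main(String[] args) {
        UnaryOperator<String> greet = PREFIXER.apply("Hello, "); // the factory call
        System.out.println(greet.apply("Duke")); // prints Hello, Duke
        System.out.println(greet.apply("Java")); // prints Hello, Java
    }
}
```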

Let’s just call the behavior used by map “the operation” for simplicity. The operation must take a primitive double as its input, and it must create a primitive double as the result. How can you accomplish this? From a purely logical deductive perspective, you know that the map operation input must be assignment-compatible with the upstream type. In this case, the upstream type is double. Therefore, double, or maybe Double, could work, but the latter would require autoboxing, so you should reject that for inefficiency.

How about the downstream type? Well, map operations always create a new stream, and only primitive streams have a sum() operation; after all, you can’t add up a list of Automobile objects to a single Automobile result. Therefore, you know that the resulting value must be a primitive type, and double is the obvious choice. If “obvious” isn’t rigorous enough for you, and you aren’t satisfied to infer the “doubleness” from the lack of other options in the question, consider this: Although the APIs allow you to map from a primitive double stream to a primitive int or long stream, that change requires the mapToInt or mapToLong method, not map. Because the code calls map, the result must be a DoubleStream.
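
A short sketch, using hypothetical class and method names, shows how the method that is called determines the downstream stream type and whether sum() is available.

```java
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class DownstreamTypeDemo {
    // map on a DoubleStream always yields another DoubleStream.
    static double sumAsDoubles() {
        return DoubleStream.of(1.0, 2.0, 3.0).map(d -> d * 2).sum();
    }

    // Changing the primitive type requires mapToInt (or mapToLong).
    static int sumAsInts() {
        IntStream ints = DoubleStream.of(1.0, 2.0, 3.0).mapToInt(d -> (int) d);
        return ints.sum();
    }

    // An object stream has no sum() method at all.
    static long countBoxed() {
        Stream<Double> boxed = DoubleStream.of(1.0, 2.0, 3.0).boxed();
        // boxed.sum() would not compile here.
        return boxed.count();
    }

    public static void main(String[] args) {
        System.out.println(sumAsDoubles()); // prints 12.0
        System.out.println(sumAsInts());    // prints 6
        System.out.println(countBoxed());   // prints 3
    }
}
```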

Therefore, the type of the operation must be a type that takes a single double argument and returns a double result. This is the behavior defined by the DoubleUnaryOperator interface.

Now, if you know your APIs in detail (and logic suffices for this question; you don’t have to learn this), you know that the argument type for map on a DoubleStream is precisely DoubleUnaryOperator. Importantly, however, this is one of the situations that autoboxing doesn’t handle. Although it might seem that Function<Double, Double>, DoubleFunction<Double>, and ToDoubleFunction<Double> should be compatible with DoubleUnaryOperator, that’s not the case; you must provide a DoubleUnaryOperator.
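
The sketch below illustrates that incompatibility, using hypothetical names. The rejected call is left as a comment so that the example compiles. The method reference is one possible way to adapt such a function, shown here only to make the point that the conversion must be requested explicitly; it still unboxes every result, so it does nothing for efficiency.

```java
import java.util.function.DoubleFunction;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.DoubleStream;

class MapArgumentDemo {
    // Takes a primitive double but returns a boxed Double.
    static final DoubleFunction<Double> BOXING_OP = d -> d + 1.0;

    static double sumWithAdapter() {
        // DoubleStream.of(1.0, 2.0, 3.0).map(BOXING_OP) would not compile:
        // a DoubleFunction<Double> is not a DoubleUnaryOperator.

        // A method reference creates a new DoubleUnaryOperator that calls
        // BOXING_OP.apply and unboxes each Double it returns.
        DoubleUnaryOperator adapted = BOXING_OP::apply;
        return DoubleStream.of(1.0, 2.0, 3.0).map(adapted).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumWithAdapter()); // prints 9.0
    }
}
```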

What about the type of fun? Whatever it is, it must embody a behavior that takes a double argument (or possibly a Double, by autoboxing, but you know you don’t want that version if you can avoid it) and returns a DoubleUnaryOperator—the value that will be used in the map operation.

A function that takes a primitive double as its input but returns an object type is called a DoubleFunction<E>, where E is the return type. (By contrast, a ToDoubleFunction<E> would be a function that takes an E as argument and returns a primitive double.) Given this, you know that the efficient type for fun would be DoubleFunction<DoubleUnaryOperator>. This happens to be the type declared in option B. Let’s consider how that would behave and see if it would give the right output.

The call to fun.apply(1.0), given the definition of fun shown in option B, would create a function that takes a double argument and returns that argument plus 1. If you run this code, you’ll find that the map operation results in stream data of 2.0, 3.0, and 4.0 (each being greater by 1 than the corresponding input) and the sum will be 9.0, as required. You already know that this declaration is as efficient as it could possibly be, so it’s safe to conclude that option B is the correct answer. Nevertheless, let’s investigate the others and verify that they’re either unworkable or relatively inefficient.
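
If you want to try this yourself, the following is a minimal complete program built around option B. The class name and the run method are hypothetical scaffolding; the stream, the declaration of fun, and the map expression are taken from the question.

```java
import java.util.function.DoubleFunction;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.DoubleStream;

class OptionBDemo {
    static double run() {
        DoubleStream ds = DoubleStream.of(1.0, 2.0, 3.0);
        // Option B: primitive double in, DoubleUnaryOperator out.
        DoubleFunction<DoubleUnaryOperator> fun = a -> d -> d + a;
        // fun.apply(1.0) runs once and yields the operation d -> d + 1.0,
        // which map then applies to each element without any boxing.
        return ds.map(fun.apply(1.0)).sum();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 9.0
    }
}
```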

Option A declares the same logical computation for fun but declares it as taking a Double object, rather than a primitive double, as its initial input. The resulting function is still a DoubleUnaryOperator that has the effect of incrementing its argument by 1.0. Therefore, the code would compile, and the result would be 9.0. However, that boxing operation is sufficient to show that it’s less efficient than option B and, therefore, option A must be incorrect.
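
For comparison, here is the same hypothetical scaffolding around option A. The comments point out where the wrapper object enters; nothing here measures the cost.

```java
import java.util.function.DoubleUnaryOperator;
import java.util.function.Function;
import java.util.stream.DoubleStream;

class OptionADemo {
    static double run() {
        DoubleStream ds = DoubleStream.of(1.0, 2.0, 3.0);
        // Option A: the factory's parameter a is a Double object.
        Function<Double, DoubleUnaryOperator> fun = a -> d -> d + a;
        // The literal 1.0 is boxed for the call to apply, and the captured
        // Double is unboxed each time d + a is evaluated.
        return ds.map(fun.apply(1.0)).sum();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 9.0
    }
}
```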

Option C and option D both define factories for operations of type DoubleFunction<Double>. Now, DoubleFunction<Double> declares a function that takes a primitive double as argument and returns a Double object as result, so you might expect that this would work with map. But even if it did, the boxing of every result would make it less efficient than option B; therefore, both options would be wrong on those grounds alone. However, as previously mentioned, the autoboxing mechanism can convert a double to a Double or vice versa, but it cannot convert a function that returns a Double into a function that returns a double. That would be a conversion of function types, not merely a boxing/unboxing operation. The declarations of fun themselves are valid, but the call to map, which requires a DoubleUnaryOperator, is rejected. As a result, code using option C or option D will fail to compile and, therefore, both are incorrect.
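
The sketch below, again with hypothetical class and field names, shows the declarations from options C and D on their own. They compile, and the operations they produce can be called directly, but the result of each call is a Double object. The map call that the compiler rejects is left as a comment.

```java
import java.util.function.DoubleFunction;
import java.util.function.Function;

class OptionsCAndDDemo {
    // In both lambdas, d + a is a double that is boxed to a Double on return.
    static final DoubleFunction<DoubleFunction<Double>> OPTION_C = a -> d -> d + a;
    static final Function<Double, DoubleFunction<Double>> OPTION_D = a -> d -> d + a;

    public static void main(String[] args) {
        DoubleFunction<Double> fromC = OPTION_C.apply(1.0);
        DoubleFunction<Double> fromD = OPTION_D.apply(1.0);
        // DoubleStream.of(1.0, 2.0, 3.0).map(fromC) would not compile,
        // because map on a DoubleStream expects a DoubleUnaryOperator.
        System.out.println(fromC.apply(2.0)); // prints 3.0
        System.out.println(fromD.apply(2.0)); // prints 3.0
    }
}
```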

The correct option is B.

Simon Roberts

Simon Roberts joined Sun Microsystems in time to teach Sun’s first Java classes in the UK. He created the Sun Certified Java Programmer and Sun Certified Java Developer exams. He wrote several Java certification guides and is currently a freelance educator who publishes recorded and live video training through Pearson InformIT (available direct and through the O’Reilly Safari Books Online service). He remains involved with Oracle’s Java certification projects.

Mikalai Zaikin

Mikalai Zaikin is a lead Java developer at IBA IT Park in Minsk, Belarus. During his career, he has helped Oracle with development of Java certification exams, and he has been a technical reviewer of several Java certification books, including three editions of the famous Sun Certified Programmer for Java study guides by Kathy Sierra and Bert Bates.
