Testing overload resolution

(Mostly) Understanding overload resolution

Overload resolution can be thought of a process that takes as input a method call - a receiver type R, a method name N and an actual argument type list A1, A2, ... An - and returns a suitable method M for such a method call. M must satisfy certain requirements; for instance, M's name will be identical to N, M will be a member of a supertype of R and, finally, actual arguments A1, A2, ... An must be convertible to M's formal argument types (via method invocation conversion). Let's start with a simple example:

class Simple {
   void m(Integer i);

Simple s = ...

In the above example, overload resolution takes the receiver type Simple, the method name m and the actual argument type int and returns the method m(Integer). This example is relatively straightforward, as Simple has only one method named m. Note that it is not necessary for actual arguments to be an exact match with the formal arguments - in this case we have that overload resolution picks a method accepting an Integer, while the call-site is supplying an argument of type int - this conversion is commonly known as boxing.

Of course, there are cases in which more than one method is found; we say that there are multiple resolution candidates. In such a setting, the compiler will try to determine the most specific candidate. In the example below there are 3 candidates, one of them being also the most specific:

class Multiple {
 void m(Object o);
   void m(Number i);
   void m(Integer i); //this is the most specific!

Multiple m = ...

When more than one candidate is found, their declarations are further examined by the compiler, which tries to determine if one method declaration can be subsumed by any of the other candidate declarations. In the above case, we have that m(Integer) is the most specific candidate because an argument of type Integer can be passed where an argument of type Number (resp. Object) is expected (but not vice-versa!). It is possible that the most-specific analysis fails, as in the case below:

class Ambiguous {
 void m(Object o, Integer i);
   void m(Integer i, Object o);

Ambiguous m = ...
m.m(1,2); //error
m.m((Object)1,2); //ok

Here the compiler finds two resolution candidates, but neither is more specific than the other; the only way out is for the compiler to issue an ambiguity error on the call-site. This problem can be easily resolved by adding a type-cast on one (or more) actual arguments, in order to help the compiler find the method that is to be called. Another problematic case, somewhat dual to the one we've just discussed, is when no available candidate in the receiver type can be found:

class Missing {
   void m(String s);

Missing m = ...
m.m(1); //error

In the above case, Missing does have a method named m; however, such method is expecting an argument of type String while the call-site supplies an argument of type int. Again, the compiler is forced to issue an error, as no conversion exists such that int can be converted to String.

I bet the basics of overload resolution are obvious to most developers --- this is essentially the machinery every programmer has to fight against in order to make their programs to compile without errors! But what is then that makes Java overload resolution so hard? Overload resolution used to be pretty straightforward before JDK 5.0; however, with the addition of boxing/unboxing, varargs and generics the complexity of this area went up to the roof. We now have three distinct overload resolution phases, a compatibility phase, a boxing phase and a varargs phase. Those phases are applied in sequence and each phase is only applied if the previous phase failed to find a resolution candidate. In addition, as method declarations can now have type-parameters (aka generic methods), overload resolution is now deeply tied with method type-inference (the process of finding an instantiation for method type-parameters given actual argument types and the target-type of an assignment context - where applicable). This means that the task of testing a compiler implementation has gotten significantly harder, due to the combinatorial explosion that arises when all the variants described above are taken into account.

(Mostly) Testing overload resolution

The most common approach to test the outcome of overload resolution is what I call the black-box approach. A black-box overload resolution test is a simple Java program containing one or more resolution candidates; the entry point of the test will contain at least one call-site. In the case where there is more than one resolution candidate, the body of the most specific candidate will execute some pre-determined instruction so that, when the test is executed, it will be easy to verify whether the test triggered a call to the most specific method. Here's an example:

class BlackBox1 {
   static boolean mostSpecific = false;

   static void m(Object o) { }
   static void m(Number i) { }
   static void m(Integer i) { mostSpecific = true; }

   public static void main(String[] args) {

The way the above test works is straightforward: there are three overload candidates, but only one of them sets the variable mostSpecific. The test main method contains a method call followed by an assertion statement that checks as to whether the mostSpecific flag has been set as a side-effect of the previous method call. Overload resolution is here treated as a black-box --- the test doesn't verify that all resolution steps are taken in a way that is compatible with the behavior described in the JLS - the only thing the test cares about is the outcome of overload resolution. It is thus possible for such a test to pass even though the compiler did not follow all the required steps - i.e. if some overload candidate (which later turned out to be non-influent) has been ignored during the overload process. That is, this way of testing overload resolution is inextricably fragile.

A slightly more efficient way to perform a black-box overload resolution test is to verify that the right method descriptor is found inside the compiler-generated bytecode, rather than relying on properties of test execution (i.e. such side-effect of a method call). This way of testing overload resolution leads to tests that are typically more compact, as those tests are not designed to be executed (they are compile-only tests). A bytecode variant of the test above would look like the following:

class BlackBox2 {
   void m(Object o) { }
   void m(Number i) { }
   void m(Integer i) { }

   void test() { m(1); }

This time, in order to perform the test we need to compile the test source first, and then analyze the generated bytecode; in this case, we would need to make sure that the generated bytecode contains an invokevirtual instruction whose method descriptor is the one for m(Integer). This can be achieved i.e. using a tools like javap, which would produce the following output:

void test();
   Stack=2, Locals=1, Args_size=1
   0: aload_0
   1: iconst_1
   2: invokestatic #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   5: invokevirtual #3; //Method m:(Ljava/lang/Integer;)V
   8: return

The line in red contains the information we were looking for: the code generated by the compiler does indeed contain a reference to the method descriptor for m(Integer), meaning that overload resolution terminated correctly. Again, the test only looks at the outcome of overload resolution by inspecting a compiler artifact. There is no way for the test to determine whether the right set of overload candidates were consulted by the compiler, which means checking the bytecode is by no mean a more complete way of testing overload resolution.

Instrumenting the compiler

In the previous section, we have shown how black-box tests cannot provide enough guarantees about the correct behavior of overload resolution implementation. If the test treats the compiler as a black box, and only focuses on analyzing the compiler output, it may be to late for the test to detect anomalies that occurred during the overload resolution process. Ultimately, in order to write better overload resolution tests, we needs some help from the compiler; for instance, the compiler should be able to tell, for any given call-site, what is the set of overload candidates that are being consulted and, possibly, which candidate has been selected as the most specific one. Dually, information about why a candidate has been discarded from the set might be useful too (i.e. if the method has a signature that is not compatible with the actual argument types in the call-site). The compiler should provide a debugging mode in which such info are being generated on-the-fly, while overload resolution is taking place. The information could be a simple compiler diagnostic, or, in a more sophisticated approach, the compiler could decorate the bytecode with special custom attributes that could be used to reconstruct the choices made by the compiler during the overload resolution process.

I recently worked at a patch to the javac compiler that enables advanced debugging diagnostic for overload resolution. Such diagnostics are turned off by default, and can be enabled using a non-standard compiler flag. Some of the things shown by these new diagnostics are:
  • set of methods considered during a given overload resolution phase
  • for each method that has been discarded from the candidate set, show why the method was deemed not applicable
  • show the most specific method (if any)
  • for each method, show the instantiated signature of that method (if applicable)

The verbose resolution diagnostic support also comes with a set of predefined filters that can be used to turn off verbose diagnostic in specific cases (i.e. when resolving superclass constructor). For instance, if we compile the above class BlackBox2 with the debugging diagnostics enabled, javac generates the following output:

BlackBox2.java:7: Note: erroneous resolution for method m in type BlackBox2
  phase: BASIC
  with actuals: int
  with type-args: no arguments
      #0 not applicable method found: m(Integer)
         (actual argument int cannot be converted to Integer by method invocation conversion)
      #1 not applicable method found: m(Number)
         (actual argument int cannot be converted to Number by method invocation conversion)
      #2 not applicable method found: m(Object)
         (actual argument int cannot be converted to Object by method invocation conversion)

BlackBox2.java:7: Note: resolving method m in type BlackBox2 to candidate 0
  phase: BOX
  with actuals: int
  with type-args: no arguments
      #0 applicable method found: m(Integer)
      #1 applicable method found: m(Number)
      #2 applicable method found: m(Object)

The diagnostic above can be a bit obscure and verbose - yet it is also very informative, as it essentially describes all the steps the compiler has taken in order to resolve the call-site m(1). We can see how the compiler first attempted to apply a basic (compatibility) resolution step, in which boxing is disabled - this resulted in all candidates being inapplicable. Since no result has been found, the compiler went on and enabled boxing - all methods suddenly becomes applicable, but the most specific is m(Integer), as pointed out by the compiler diagnostic. Now we can truly say the compiler is behaving according to the JLS!

Truly testing overload resolution

The verbose resolution diagnostics shown above are a very good tool to inspect the compiler behavior; however, given the large amount of information generated by the compiler, it is not be immediate to see how those diagnostics could be turned into overload resolution tests. Ultimately, we need a clean and compact way to write overload resolution tests. The solution is to start from a black-box approach and decorate the overload candidates with annotations - we can then write a test harness that process those annotations, and check their contents against the verbose resolution diagnostics generated by javac. What would such a test look like?

class BetterResolutionTest {
   void m(Object o) { }
   void m(Number i) { }
   @Candidate(applicable=BOX, mostSpecific=true)
   void m(Integer i) { }

   { m(1); }

Cool huh? The code for the test is very close to the one given above - the only difference being the @Candidate annotations added to the method declarations. Each @Candidate annotation is used to tell the test harness as to whether the method should be considered as applicable (and if so, in which phases); it is also used to indicate the expected most specific method. Thanks to Java annotations, everything can be achieved in a very natural and declarative way. The harness will compile the test, and check the compiler output against the contents of the above annotations. For instance, the harness will check that the most specific method found during overload resolution corresponds to the method declaration annotated with a @Candidate annotation whose mostSpecific attribute is set.

If you feel adventurous, you could have a look at the following tests, which unleash the full power of the overload resolution test harness:

This new overload resolution testing methodology will help us (and you?) write better, more complete overload resolution tests with just minimum effort. I think this is a great step towards improving our test coverage in such a tricky and complex area that lies within the very core of any Java compiler implementation.


Post a Comment:
  • HTML Syntax: NOT allowed

Maurizio Cimadamore is a member of the langtools team based in Santa Clara, CA. His efforts are mainly focused on the type-system area of the Java compiler.


« October 2016