Compatibly Evolving BigDecimal
By Darcy-Oracle on Apr 21, 2008
Back in JDK 5, JSR 13 added true floating-point arithmetic to BigDecimal, which involved many new methods and constructors along with new supporting classes in the java.math package. I was actively involved in the JSR 13 expert group and integrated the code into the JDK. These changes had some surprising compatibility impacts, which can be classified according to their binary, source, and behavioral effects.
The numerical values representable in BigDecimal are (unscaledValue × 10^{-scale}) where unscaledValue is a BigInteger and scale is a 32-bit integer. Before Java SE 5, scale was constrained to be positive or zero (in other words, the exponent, -scale, had to be zero or negative) and JSR 13 removed this restriction to allow any integer exponent. Consequently, prior to JSR 13, integral BigDecimal values with trailing zeros had to have those zeros explicitly represented; for example, the value one million had to be stored as (1,000,000 × 10^{0}) rather than (1 × 10^{6}) or (10 × 10^{5}), etc. One behavioral consequence of JSR 13 was that all the methods operating on BigDecimal values understand and accept numbers without the old exponent restriction.
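The difference in representation can be sketched with the two-argument BigInteger/scale constructor; the variable names here are mine:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class ScaleDemo {
    public static void main(String[] args) {
        // Pre-JSR-13 representation of one million: all digits explicit, scale 0
        BigDecimal explicit = new BigDecimal(new BigInteger("1000000"), 0);
        // Post-JSR-13: a negative scale encodes the exponent directly, 1 x 10^6
        BigDecimal compact = new BigDecimal(BigInteger.ONE, -6);

        System.out.println(explicit); // 1000000
        System.out.println(compact);  // 1E+6

        // Same numerical value, but different representations
        System.out.println(explicit.compareTo(compact) == 0); // true
        System.out.println(explicit.equals(compact));         // false
    }
}
```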
The new API elements added by JSR 13 are listed in table 1; the additions will be examined under each kind of compatibility.
Table 1: New API elements added by JSR 13

New fields:

    public static final BigDecimal ZERO
    public static final BigDecimal ONE
    public static final BigDecimal TEN

New constructors:

    public BigDecimal(char[] in, int offset, int len)
    public BigDecimal(char[] in, int offset, int len, MathContext mc)
    public BigDecimal(char[] in)
    public BigDecimal(char[] in, MathContext mc)
    public BigDecimal(String val, MathContext mc)
    public BigDecimal(double val, MathContext mc)
    public BigDecimal(BigInteger val, MathContext mc)
    public BigDecimal(BigInteger unscaledVal, int scale, MathContext mc)
    public BigDecimal(int val)
    public BigDecimal(int val, MathContext mc)
    public BigDecimal(long val)
    public BigDecimal(long val, MathContext mc)

New methods:

    public static BigDecimal valueOf(double val)
    public BigDecimal add(BigDecimal augend, MathContext mc)
    public BigDecimal subtract(BigDecimal subtrahend, MathContext mc)
    public BigDecimal multiply(BigDecimal multiplicand, MathContext mc)
    public BigDecimal divide(BigDecimal divisor, int scale, RoundingMode roundingMode)
    public BigDecimal divide(BigDecimal divisor, RoundingMode roundingMode)
    public BigDecimal divide(BigDecimal divisor)
    public BigDecimal divide(BigDecimal divisor, MathContext mc)
    public BigDecimal divideToIntegralValue(BigDecimal divisor)
    public BigDecimal divideToIntegralValue(BigDecimal divisor, MathContext mc)
    public BigDecimal pow(int n)
    public BigDecimal pow(int n, MathContext mc)
    public BigDecimal abs(MathContext mc)
    public BigDecimal negate(MathContext mc)
    public BigDecimal plus()
    public BigDecimal plus(MathContext mc)
    public int precision()
    public BigDecimal round(MathContext mc)
    public BigDecimal setScale(int newScale, RoundingMode roundingMode)
    public BigDecimal scaleByPowerOfTen(int n)
    public BigDecimal stripTrailingZeros()
    public String toEngineeringString()
    public String toPlainString()
    public BigInteger toBigIntegerExact()
    public long longValueExact()
    public int intValueExact()
    public short shortValueExact()
    public byte byteValueExact()
    public BigDecimal ulp()
Binary Compatibility
Adding new public methods and constructors, even ones that overload existing names, is binary compatible. Adding public static final fields is also binary compatible, meaning existing clients of the library will continue to link. However, there is a possible complication here: since BigDecimal is not final and has public constructors, it can be subclassed. (As discussed in Effective Java, Item 13, "Favor immutability," this was a design oversight when the class was written.) Adding fields to classes can be binary incompatible, but the needed combination of circumstances does not arise in this case. Therefore, individually and as a whole, the BigDecimal API additions are binary compatible.
Source Compatibility
For source compatibility, we can distinguish between clients of a type and extenders/implementors of a type; certain changes can inconvenience extenders/implementors but not clients.
Adding the public static final fields is binary-preserving source compatible. If a subclass, say MyDecimal, already has a field with the same name as a field being added to BigDecimal, the existing declaration in MyDecimal hides the new declaration in the parent class BigDecimal. Therefore, existing uses of, say, MyDecimal.TEN, would continue to resolve to the same binary name.
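The hiding behavior can be demonstrated with the hypothetical MyDecimal subclass from the text, imagined as having declared its own TEN constant before BigDecimal gained one in Java SE 5:

```java
import java.math.BigDecimal;

// MyDecimal is the hypothetical subclass discussed in the text; it declared
// its own TEN field before BigDecimal gained one in Java SE 5.
class MyDecimal extends BigDecimal {
    public static final MyDecimal TEN = new MyDecimal("10");
    public MyDecimal(String val) { super(val); }
}

public class FieldHidingDemo {
    public static void main(String[] args) {
        // MyDecimal.TEN resolves to the subclass's own field, which hides the
        // inherited BigDecimal.TEN; existing code keeps its old meaning.
        System.out.println(MyDecimal.TEN.getClass().getSimpleName()); // MyDecimal
        System.out.println(BigDecimal.TEN);                           // 10
    }
}
```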
Since constructors are not inherited and all the new constructors are public rather than protected, only the uses of constructors in clients need to be considered; there are no distinct special issues for subclasses. The constructors in BigDecimal during Java SE 1.4.x, the platform version immediately predating JSR 13, are listed in table 2.
Table 2: Existing constructors in Java SE 1.4.x

    BigDecimal(BigInteger val)
    BigDecimal(BigInteger unscaledVal, int scale)
    BigDecimal(double val)
    BigDecimal(String val)
To assess the source compatibility impact, we can compare the new constructors with the old constructors and see whether any possible overload resolutions would change, including the possibility of breaking an existing compilation by removing the unique most specific method. Of the twelve new constructors, ten are clearly not problematic and are binary-preserving source compatible; these ten either have more parameters than the existing constructors or are not applicable to the same invocations (see table 3). For example, eight of the new constructors have the new type MathContext as a parameter. Because of primitive subtyping, the other two new constructors, BigDecimal(int val) and BigDecimal(long val), are both applicable to, and more specific than, invocations that would previously resolve to BigDecimal(double val). Therefore, adding these two constructors is not binary-preserving source compatible because a different constructor can be resolved for the same existing source code, namely code with one-argument calls to a BigDecimal constructor where the argument has an integral primitive type. These two constructors need a secondary screening to assess their behavioral equivalence.
Table 3: Source compatibility impact of the new constructors

New Constructor | Source Compatibility Impact
---|---
public BigDecimal(char[] in, int offset, int len) | Binary preserving; more parameters than existing constructors.
public BigDecimal(char[] in, int offset, int len, MathContext mc) | Binary preserving; more parameters than existing constructors.
public BigDecimal(char[] in) | Binary preserving; disjoint with existing one-parameter constructors.
public BigDecimal(char[] in, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(String val, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(double val, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(BigInteger val, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(BigInteger unscaledVal, int scale, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(int val) | Warning: not binary preserving since it is more specific than an existing one-parameter constructor; behavioral equivalence must be assessed.
public BigDecimal(int val, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
public BigDecimal(long val) | Warning: not binary preserving since it is more specific than an existing one-parameter constructor; behavioral equivalence must be assessed.
public BigDecimal(long val, MathContext mc) | Binary preserving; disjoint with existing two-parameter constructors.
Before JDK 5, the expressions new BigDecimal(123) and new BigDecimal(123L) in source code would resolve to a call to BigDecimal(double); as part of that resolution, primitive widening conversion converts the argument expression to double before the constructor is invoked. All int values are exactly representable as double, and the double constructor, when given an integral value, returns a BigDecimal with the numerical value in question and a scale of zero. The new int constructor also returns a BigDecimal with the numerical value of the argument and a scale of zero. Therefore, adding the int constructor results in behaviorally equivalent programs; although the new constructor causes some invocations to resolve to a different constructor, calling the other constructor still always produces an equivalent (bd1.equals(bd2) == true) BigDecimal. However, the new long constructor does not have behavioral equivalence for all values: some long values are not exactly representable in double, and the old long → double conversion can silently lose precision. For example, printing the value of (new BigDecimal(Long.MAX_VALUE)) gives
9223372036854775808
under JDK 1.4.2 but
9223372036854775807
under JDK 5. More dramatically, printing (new BigDecimal(0x4000000000000200L))
gives
4611686018427387904
under JDK 1.4.2 but
4611686018427388416
under JDK 5. While the new behavior is "better" in the sense of exactly capturing the long argument value, it is a subtle change to existing source code.
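The difference is easy to reproduce on JDK 5 or later; casting the argument to double forces the old overload resolution and recovers the 1.4-era result:

```java
import java.math.BigDecimal;

public class LongConstructorDemo {
    public static void main(String[] args) {
        // Old resolution: the long argument was widened to double first,
        // losing precision for values not exactly representable in double.
        BigDecimal viaDouble = new BigDecimal((double) Long.MAX_VALUE);
        // New resolution: the long constructor captures the value exactly.
        BigDecimal viaLong = new BigDecimal(Long.MAX_VALUE);

        System.out.println(viaDouble); // 9223372036854775808
        System.out.println(viaLong);   // 9223372036854775807

        // The second example from the text, 2^62 + 512:
        System.out.println(new BigDecimal((double) 0x4000000000000200L)); // 4611686018427387904
        System.out.println(new BigDecimal(0x4000000000000200L));          // 4611686018427388416
    }
}
```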
Strictly speaking, among the spectrum of different source compatibility levels, adding this constructor only preserves the weakest property, maintaining the ability to compile. Since the resolution of constructors in existing code is changed, adding this constructor is not binary-preserving source compatible, nor is it behaviorally equivalent since a different BigDecimal will be returned for some inputs.
Since the class already had a static factory method with a long parameter that converts values exactly, the long constructor did not need to be added in order to get a BigDecimal with a long's exact value in a single operation.
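That pre-existing factory is BigDecimal.valueOf(long); a quick check shows it produces the same result as the new long constructor:

```java
import java.math.BigDecimal;

public class ValueOfDemo {
    public static void main(String[] args) {
        // valueOf(long) converts the value exactly, like the new long constructor
        BigDecimal a = BigDecimal.valueOf(Long.MAX_VALUE);
        BigDecimal b = new BigDecimal(Long.MAX_VALUE);

        System.out.println(a.equals(b)); // true: same value, same representation
        System.out.println(a);           // 9223372036854775807
    }
}
```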
Partially because of the unintentional, if beneficial, change in source meaning, as well as some of the usual reasons (the possibility of caching, etc.), in retrospect I think it would have been preferable for the functionality of all twelve new constructors to be provided through static factories instead. (While not directly applicable in BigDecimal, in general even if constructors aren't considered harmful, static factories can have better generics support.)
A similar analysis can be undertaken for all the new methods. Additionally, since subclasses are possible, inheritance conflicts need to be considered too. Note that the new methods taking MathContext and RoundingMode parameters cannot conflict with existing methods in subclasses, so all those additions are binary-preserving source compatible. However, if all the parameters of a new method are existing types, a subclass could potentially have a conflicting method with an unrelated return type. For example, MyDecimal could have a (strange) public double divide(BigDecimal divisor) method, which would conflict with the addition of public BigDecimal divide(BigDecimal divisor). While BigDecimal generally shouldn't be subclassed, the addition of some of these new methods could prevent existing subclasses from compiling, yet another reason to favor composition over inheritance.
Behavioral Compatibility
In terms of evolving the behavior of existing methods after introducing the expanded exponent range, the main issues were the behavior of arithmetic operations and text ↔ BigDecimal conversion operations; the latter would prove to be unexpectedly troublesome.
As summarized in table 4, the behavior of the arithmetic operations was quite compatible, with a number of strong invariants. Given input values a1 and b1 representable under the old system and an existing method, say add, with result c1: in the old and new BigDecimal, if the inputs to an operation are equals (same numerical value and same representation), then the outputs are exactly equivalent too, same numerical value with the same representation. More generally, if the inputs satisfy the weaker property of being compareTo() == 0 (same numerical value but possibly different representations), then the outputs will be numerically equal, but possibly with different representations.
Table 4: Behavioral invariants of an existing arithmetic operation

    Old: c1 = a1.add(b1);
    New: c2 = a2.add(b2);

    If a1.equals(a2) and b1.equals(b2), then c1.equals(c2).
    If a1.compareTo(a2) == 0 and b1.compareTo(b2) == 0, then c1.compareTo(c2) == 0.
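The weaker invariant can be illustrated with inputs that are compareTo-equal but not equals-equal; the particular values below are my own:

```java
import java.math.BigDecimal;

public class InvariantDemo {
    public static void main(String[] args) {
        // Same numerical value, different representations (scale 2 vs. scale 1)
        BigDecimal a1 = new BigDecimal("2.50");
        BigDecimal a2 = new BigDecimal("2.5");
        BigDecimal b  = new BigDecimal("1.2");

        BigDecimal c1 = a1.add(b); // 3.70, scale 2
        BigDecimal c2 = a2.add(b); // 3.7,  scale 1

        System.out.println(a1.compareTo(a2) == 0); // true: inputs numerically equal
        System.out.println(c1.compareTo(c2) == 0); // true: results numerically equal
        System.out.println(c1.equals(c2));         // false: representations differ
    }
}
```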
A main advantage of decimal arithmetic over binary arithmetic is what-you-see-is-what-you-get for input and output values; the complicated vagaries of binary ↔ decimal conversion can be avoided and exact computation can be straightforward. Therefore, when removing the restriction on exponent values, having a textual representation that readily mapped to all possible unscaled value and exponent pairs was paramount to making the new arithmetic usable. Before JSR 13, the toString method did not use exponential notation; all leading and trailing zeros were explicit. For fractional values, the length of the output grew linearly with the size of the exponent as well as with the number of digits of precision. Conversely, without negative exponents, the internal representation and string output of integer-valued BigDecimal numbers grew with the magnitude of the number, even when it was inherently low-precision. To take advantage of the new unrestricted exponent range, a textual notation was needed that allowed the positive or negative exponent to be recovered; this was accomplished by changing the toString output to use scientific notation. When converting from text to BigDecimal, a positive exponent could be reconstructed from integer values that previously would have been forced to have a zero exponent. However, the new output was legal input to the old constructors, so properties similar to those of the old and new arithmetic behavior applied:
- Within a given release, BigDecimal(bd.toString()).equals(bd) == true, meaning converting to and from a string preserves both numerical value and representation.
- toString output from the old BigDecimal, converted by the new BigDecimal, yields a result equivalent to the old value.
- New toString output converted by the old BigDecimal yields:
  - an equivalent result when the exponent is negative;
  - a numerically equal result when the exponent is positive (the representation may differ).
If needed, the old semantics on exponents for textual input is easy to restore in the new BigDecimal:

    BigDecimal bd = new BigDecimal(myString);
    if (bd.scale() < 0)
        bd = bd.setScale(0);

In addition, a toPlainString method was added to provide the old-style output when needed.
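The two output styles can be compared directly; the example values here are my own:

```java
import java.math.BigDecimal;

public class ToStringDemo {
    public static void main(String[] args) {
        BigDecimal million = new BigDecimal("1E+6"); // unscaled value 1, scale -6

        System.out.println(million.toString());      // 1E+6    (new JSR 13 output)
        System.out.println(million.toPlainString()); // 1000000 (old-style output)

        // Small fractions keep plain notation down to an adjusted exponent of -6
        System.out.println(new BigDecimal("1E-6").toPlainString()); // 0.000001
    }
}
```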
Staying within the realm of old and new BigDecimal versions, these arrangements solidly preserve a very reasonable kind of behavioral compatibility: numerical value and representation are kept constant when possible; otherwise, numerical value is preserved, possibly with a different representation. Backwards serial compatibility is slightly weaker; rather than being converted to exponent-zero values, as is done for textual inputs, new serial streams holding positive exponents are rejected by old BigDecimal implementations. Unfortunately, despite these consistencies across JDK versions, some users of BigDecimal still ran into compatibility issues from the textual output changes made by JSR 13.
A common use for BigDecimal is interfacing with databases, and while the new scientific notation was legal input to the old BigDecimal string constructor, it was not legal input for databases. The addition of the toPlainString method did not help the situation without recompiling the source of the application in question; such recompilation could be unwanted since the new method would tie the application to JDK 5. Other unpalatable workarounds include subclassing BigDecimal to restore the old toString behavior or using reflection to check whether the toPlainString method is available, thereby avoiding a hard dependency on the new method.
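The reflective workaround can be sketched as follows; this is my own illustration, not code from the era, but the approach runs on both old and new releases because the method lookup fails gracefully:

```java
import java.lang.reflect.Method;
import java.math.BigDecimal;

public class PlainStringCompat {
    // Returns old-style (plain) output on any JDK: uses toPlainString when it
    // is present (JDK 5 and later); otherwise falls back to toString, which
    // was already plain notation before JSR 13.
    static String plainString(BigDecimal bd) {
        try {
            Method m = BigDecimal.class.getMethod("toPlainString");
            return (String) m.invoke(bd);
        } catch (NoSuchMethodException e) {
            return bd.toString(); // pre-JDK 5: toString never uses exponents
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(plainString(new BigDecimal("1E+6"))); // 1000000
    }
}
```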
While the changes in textual input and output of BigDecimal were reasonable in the context of direct Java compatibility, the expert group underestimated the behavioral compatibility impact of these changes when dealing with databases. While the changes remain justifiable in terms of supporting the new values, had the compatibility cost been known, the expert group could have, and should have, worked with database vendors to mitigate the migration cost associated with this change.
Conclusion
Fully understanding the compatibility impact of changes is subtle, and shortcomings quickly lead to user frustration. Merely maintaining binary compatibility is not sufficient for many purposes. Following good coding guidelines from the beginning can pay silent rewards when later evolving a class by reducing the space of possible concerns.
Acknowledgments
Alex provided helpful comments on a draft of this entry.
Further Reading
Joseph D. Darcy and Mike Cowlishaw, JavaOne 2004 BOF 1638, Big News for BigDecimal.
Now all we need is operator overloading.
Posted by guest on April 26, 2008 at 12:26 AM PDT #
Operator overloading would be a fine topic for a future blog entry...
Posted by Joe Darcy on April 28, 2008 at 02:51 AM PDT #