value types and struct tearing

value types and struct tearing

This note explains how a notion of “value types” for the VM should protect the integrity of that value type’s invariants, and points out important differences, in memory effects, between “struct-like” and “persistent” designs for values.

First, the running example

Consider this two-variable data structure, a pair of encapsulated integral coordinates, with a couple of access functions:

``````class XY {
private int x = 1, y = 0;
public XY() { }  // make a blank one
public boolean step(int p, int q) {
int x1 = x + p, y1 = y + q;
if (x1 == 0 && y1 == 0)  return false;
x = x1; y = y1;
return true;
}
public double measure() {  // inverse radius
return Math.pow((double)x*x + (double)y*y, -0.5);
}
public void copyFrom(XY that) {
this.x = that.x; this.y = that.y;
}
}
``````

The reader function `measure` produces a real number, but will fail to do so if both values `x`, `y` are zero at the same time. The writer function `step` updates the two coordinates (incrementally, as if in a random walk) except when they would both be zero.

The details are not very important; the main point is that the two values are mostly independent, but are coupled by an invariant that can be violated by an uncoordinated change to either value alone.

Off to the races

What could go wrong? Why, nothing, if one of the following conditions is satisfied:

1. If the object is confined to one thread only.

2. If multiple threads are updating the object, but some sort of synchronization is preventing calls to `step` and `measure` from overlapping.

3. If there is some assurance that an `XY` cannot change and has been safely published. Stashing a copy of the object is the basic idea; that is why we might need methods like `copyFrom`.

Side note 1: Even a fully confined `XY` object could in principle be broken by an asynchronous interrupt which cancels an assignment to the `y` field, but such interrupts are not now part of the Java landscape.

Side note 2: General descriptions of these sharing options, and more, may be found in Chapter 3 of Java Concurrency in Practice.

The easy way to cover all the cases is to mark all the methods as synchronized. This is the design of the old Java class `java.util.Vector`, which synchronizes all of its access functions. Since synchronization is usually not free, this design choice pushes a cost onto all users. This cost can sometimes be reduced by using transactions under the hood, such as Intel supports, but since the language and bytecode set do not directly express the transactions, there is always a possibility that the JVM will have to create a real critical section, so the optimization is not reliable.

This is why many programmers (including those who designed newer standard classes like `ArrayList`) omit the synchronization and instead push the responsibility for locking onto the user. This allows users to balance the cost of locking against the danger of races.

The user of an `ArrayList` must take responsibility to avoid publishing a reference to the list without proper mutual exclusion between threads. Often this is easy enough. If the data is shared, it is shared between threads that are programmed together under common design rules and tested well enough to catch simple bugs.

Insecurity complex

The responsibility of the user increases if the list will contain sensitive data, such as parameters to a privileged operation. In that case, the programmer must ensure that a reference to the shared list cannot leak to uncontrolled code, even if the program is abused in some way. Otherwise, an attacker may create or obtain a shared list reference, hand it to privileged code, and then at the same time mutate it. The mutations will appear to the victim code via data races. Most such mutations will be harmless, since the privileged code will validity-check the arguments and throw an exception. But with enough attempts from the attacker, a race can happen which puts the list into a state which confuses the privileged code into doing something unpredicted—except by the attacker. This kind of weakness, known as a TOCTTOU bug, is a risk whenever privilege checks are based on mutable data objects.

It is important to note that marking a variable as `private` does not protect against inconsistent updates from data races. In the simple example above, where all the state is private, a data race can create a disallowed state as follows:

``````final XY victim = new XY();
Thread racer = new Thread() { public void run() {
for (int i=0;i<1e6;i++) { victim.step(-1, 1); victim.step(1, -1); } } };
racer.start();
for (int i=0;i<1e6;i++) {
assert(victim.measure() <= 1e12) : i;
}
``````

This code can fail its assertion quickly, often in the first 100 tries. The victim is observed in a state where both fields are zero. Locally valid updates to single fields of the victim can cause this violation. The sequence of updates that leads the victim to that disallowed state, plus the existence of additional surprising states, is an exercise for the reader.

Side note: Since `measure` reads each field twice, it is allowed to pick up different values for the two reads of one field. This is rare, because optimizers tend to merge the reads, but it cannot be excluded. Thus, inconsistent reads of the same variable can therefore be a source of bugs, and this is true even for one-field objects. We could call this a single-variable-double-read hazard.

The invariant on `XY` fails because the two fields lose their mutual coherence because of race conditions. Let’s call this loss of coherence structure tearing, because it looks like the victim has been torn in parts and reassembled from other parts. In the analogous case of (non-volatile) 64-bit primitives on old 32-bit JVMs, the 64-bit value can come apart into independent 32-bit halves, which can race apart. This is not exactly the same as “word tearing” but is close enough (I think) to merit the term.

Persistence pays off

In our simple example, another way to protect the data is to formulate it using final variables, changing the design to be persistent.

Side note: These days we are recycling the term “persistent” as an alternative and refinement to the older term “immutable”. A problem with the older term is evident whenever you have a conversation that stalls on the audible similarity of phrases like “immutable object” and “a mutable object”. Mumblable “immutable” is semblable to “mutable”, say that ten times fast.

``````class XY {
final private int x, y;
private XY(int x, int y) { this.x = x; this.y = y; }
public static final BLANK = new XY(1, 0);
public XY step(int p, int q) {
int x1 = x + p, y1 = y + q;
if (x1 == 0 && y1 == 0)  return null;
return new XY(x1, y1);
}
public double measure() {  // inverse radius
return Math.pow((double)x*x + (double)y*y, -0.5);
}
//XY copyFrom(XY that) { return that; }  // no special copy
}
``````

In contrast to this persistent version of `XY`, we can call the first version struct-like. Despite the encapsulation of the fields, it interacts with memory like a C struct.

The persistent version does not suffer from races on its `x` and `y` variables. A persistent `XY` object state can be captured or published simply by moving a reference. There only needs to be one public “blank” uninitalized value, and the constructor itself is hidden inside the capsule.

In exchange for this new stability, users must update their references whenever they call the updater function `step`. Crucially, it is now impossible to observe an `XY` object in a disallowed state. The class has full control over its internal invariants, even in the face of deviously attacking racer threads. There are still races possible, but they are on the references. And since updating a reference is one memory operation, there is no issue of multiple fields parting ways. The single-variable-double-read bug can still happen, rarely.

It seems there is always a cost for stability and security. In this case, the cost is allocating a new `XY` object for each distinct state. This is required because the class insists on creating a new `XY` object to represent each new position in the coordinate space. Put another way, the user is forbidden to use the optimization of re-using `XY` objects to represent multiple values over time. This is a reasonable restriction, since that optimization can lead to race conditions, as described above.

Side note: Since the `XY` constructor is private, the class might cache values like `java.lang.Integer` does, but this has its own sometimes surprising costs, since the caching logic can interfere with optimizations such as escape analysis. In any case, making the constructor private is a helpful move.

Where flattery gets us

But, suppose the JVM had an optimization to flatten instances of `XY`, in either the mutable or the persistent form. A flattened instance of `XY` would have both of its fields stored directly in some containing object (or array). There would be no need to have a separate `XY` object on the heap to hold the fields. Of course, if references to `XY` were required for some uses, these references could be created temporarily and then discarded.

Obviously, this sort of thing is what we are calling value types. In a nutshell, a value type encapsulates a group of component values, and can be efficiently stored in any type of Java variable.

Would value types this provide significantly better options for designing `XY` and similar classes, than the current ones sketched above? The answer is a qualified “yes”.

Perhaps the biggest advantage would be better use of memory. Overhead for the distinct `XY` object (header+padding) would vanish. And code would access the `x` and `y` fields directly within a containing object, using static offset arithmetic rather than a dynamic and cache-busting pointer chase. This is most evident if many pairs are stored in an array: Hardware can get clever about sequential accesses to the fields.

The other main advantage would be better support for methods which pass and return composite values. A method could receive or return two or more values inside a single flattened value “on the stack” (or in registers, usually). Of course, locals could also contain flattened values. So complex numbers and vectors get practical, along with a host of other small but useful types.

There are more advantages that may accrue from value types, but this is not the place to go into the details.

Memory vs. method

The two main advantages—better use of memory and better method types—are at some tension with each other. If value types are conceived as a memory layout mechanism only, then their primary usage will be via a Java pointer. The flattened realization in registers (in method types) will be a fiction to be uneasily maintained, at the cost of confusing various JVM optimizations.

On the other hand, value types will fail to deliver effective use of memory if they are conceived as only a clever way of loading registers (again, holding method arguments, locals, and return values). This is because their representation in other variables (fields and array elements) must live in memory, and as such is sensitive to quality of memory layout.

Naturally, we want both advantages, with good representations in both register-based and memory-based variables. To the extent the system materializes Java references that point to flattened values, those references must be easily and routinely optimized away. And such materialized references should be rare. Therefore, a good design for value types needs a primary representation that is pointer-free, even if there is a secondary “boxed” representation which uses references.

Climbing into the capsule

We also want encapsulation, which means privacy of component fields and of methods. Any credible value type design for the JVM must allow a value type to restrict access and enforce invariants.

But encapsulation is incomplete—and therefore a dangerous illusion—if it the protected invariants can be subverted by race conditions, since those race conditions are available to anyone who can perform a value type assignment.

This brings us back to the defects of the struct-like programming style for `XY` in the first example above. As seen above, the Java memory model permits structure tearing, and requires the author of a class to clearly state its contract, allowing the user to take defensive action if the class does not adequately defend itself against races.

But I wish to use value types, someday, to contain security critical values, such as 96-bit timestamps or encapsulated native pointers. The persistent-style design of `String` is integral to its usability, outside of any container, for carrying security critical values. Struct-like value types will, I think, be difficult or impossible to secure to the same degree, because of their weaker encapsulation. They will have to be put inside a container to manage safe access (a bit like an `ArrayList` inside `Collections.synchronizedList`), and this will tend to cancel the advantages of flattening them in the first place.

Assignment as an act of violence

There are a few specific reasons the persistent design provides better encapsulation and safer APIs than the struct-like design.

First, since value types can be stored in all sorts of variables, not just memory locations, it will be extremely common to assign them from place to place. Note that assignment is a JVM primitive (in the current JVM design) and cannot be customized by a class. This is of no consequence for existing Java APIs, but if it is simply generalized to a racy componentwise copy (as in C), then every assignment statement which operates on a value type will be at risk of structure tearing.

But, Java programmers do not expect assignments to corrupt any internal structure of the assigned quantity. (The sole exception of 64-bit primitives on 32-bit machines is widely neglected.) Allowing a value type assignment to racily disturb the underlying value’s invariants is likely to introduce a new and persistent family of bugs to Java programs.

Fixing this problem for struct-like types would require reifying the assignment operation as an explicit method which could then perform synchronization on both the source and destination of the copied value. The code to do this would be complex and prone to errors and deadlocks. In practice, designers of value types would punt on the problem, pushing it (with wide-eyed trust and hope, doubtless) to their users. Their users, meanwhile, would have to learn the conventional distinctions between safe and unsafe versions of types, for example choosing between `FastString` and `ThreadSafeString`.

Publishing via finals

Second, and more subtly, safe publication is one of the least well understood aspects of the Java memory model, but it is crucial to the safe sharing of any non-persistent type. Safe publication requires either accurate synchronization on the object containing the mutable state, or publishing the state via a final variable. Both of these options require levels of indirection which are convenient enough today, but would tend to vanish as data structures are flattened.

The simplest (and thus safest) extension of safe publication patterns to value types is to declare that their component fields are final, so that when a value is assigned, its components are automatically published individually. This pattern comes out naturally from persistent-style values, but must be imposed on struct-like values.

In a sense, we are revisiting the old design decision in Java to make variables mutable by default. Recall that blank final fields were added in Java 1.1, enabling persistent-style types, and even today we don’t yet have frozen (immutable, persistent) arrays. It would be consistent with the oldest versions of Java to define the new composite types to be mutable unless requested otherwise. But more modern forms of inter-thread communication have been created outside of that original model, avoiding synchronization and adding (via the JMM) safe publication semantics to final variables. Although it would be a stretch to say, “Java should have been designed with immutability as the default”, I believe that consistency with Java 1.0 conventions for mutability could reasonably be traded away in order to gain modern thread safety.

Persistence costs

Those are the reasons why I think the persistent design pattern is preferable over the struct-like pattern. That is true despite the difficulties of the persistent design, which I will describe next.

First, the stability provided by final variables is not free. In the current JVM, they may require memory fences in constructors. In a JVM supporting persistent-style value types, every assignment to a memory location (not to registers) may require a similar handshake with the memory system, to ensure that writes of the component values are safe publications.

Correctness vs. throughput?

Second, the atomicity provided by single-reference updates goes away if objects are flattened, and it must be recovered some other way. A store to a persistent-style value type must be all or nothing: It must ensure that other threads either see the whole store, or none of it. This will require another kind of handshake with the memory system, along the lines of the Intel transactions mentioned above, or an atomic multi-word store instruction. Note that the example type `XY` fits in 64 bits, and so would be supported at no extra cost by all 64-bit processors. Processors which provide larger atomic vector store instructions will cheaply support any value that fits in their vectors. In the case of a jumbo value type which spills over multiple hardware vectors and/or cache lines, the JVM’s software will need to perform additional handshakes, perhaps including old fashioned boxing.

For sophisticated users who know what they are doing, a non-transactional store operation, with the possibility of structure tearing, needs to be provided. Because of tearing, the library would have to provide additional interlocks to ensure proper confinement, immutability, and so forth.

In other words, the struct-like component-wise assignment operator will need to be made available as a privileged operation derived from the persistent pattern, and used only inside carefully designed concurrency-safe libraries.

This is the flip side of the necessity for providing persistent containers for struct-like values, for safe publication. The key question, I think, is which should be the default (mutable or persistent) and under what circumstances the non-default mode be supplied.

Notation, notation, notation

A third downside to the persistent style of values is notational. It seems there are times when you want to say “just change the imaginary component of this number to zero, please”. If a value type is willing to expose its components and accept component-wise updates, it seems harsh to require the user to create a new value from scratch:

``````Complex c = ...;
c = new Complex(c.re, 0.0);  // rebuild from scratch
c = c.changeIm(0.0);  // maybe use a helper method
c.im = 0.0;  // but what I meant was this
``````

This can be viewed mainly a matter of syntax sugar, but there is also a deep connection to the hardware here. A multi-component value type is realized either as bits in a block of memory words or as as bits in a collection of live registers. It is fundamentally reasonable for the user to ask to change just one of those low-level components in isolation, assuming the library designer allows the operation.

At the JVM level, it should be possible to render the bytecodes of a component-wise update fairly directly and without confusion to a register or memory write (adding memory handshakes as needed). Reconstructing a new value from the ground up, just because one component shifted, might create a bunch of noisy intermediate representation which could distract the JIT compiler from more important optimization work.

Our race is now run

To conclude, I believe that implementation challenges of persistent-style values are manageable, and that the corresponding opposing trade-offs for struct-like values are much more difficult to control. In the end it comes down to safe and clean user model versus direct compilation to memory instructions. And when safety seems opposed to speed, I think we can agree that JVM needs to lean towards designs that are safe by default, in the expectation that slowness is easier to fix than insecurity.

And in the end, I think we will get both safety and speed.

Comments:

Minor comment on "side note 1": I believe that a StackOverflowException can also cause invariants to be violated.

Posted by Jeroen Frijters on March 06, 2014 at 04:18 AM PST #

@Jeroen: That is true, generally, along with OutOfMemoryError. But in the XY case users probably don't need to worry about it happening between the two field assignments. Unless a de-opt happens somehow and tips over the stack.

Edits to blog post: fix broken links, a couple minor rephrasings. H/T to Markdown & Smartypants.

Posted by John Rose on March 06, 2014 at 12:37 PM PST #

I'm confused. How would persistent containers work as array elements/fields? It seems like they would need to be boxed, which defats the original purpose. And if persistent containers are only used as locals, like in the example, those aren't shared, and so aren't subject to tearing anyway.

Posted by guest on March 06, 2014 at 10:11 PM PST #

I'm confused. It seems to me you avoid structure-tearing in the persistent form through the atomicity of a pointer update. However, if the plan is to flatten the persistent object into its containing object, wouldn't you lose that atomicity i.e. wouldn't it eventually turn into two writes in the case of the XY example?

Posted by Sebastian Fernandez on March 06, 2014 at 10:24 PM PST #

Did you look at how C# does it? In .NET, value types are well-designed. To a .NET dev it is obvious that they should exist as a feature. In the Java world this is still being debated. Just ask a few .NET guys and they tell you from 10 years of experience with them.

For example, value types can tear in .NET. If you don't want that just don't create a race. That design trade-off works well.

Posted by guest on March 06, 2014 at 10:55 PM PST #

"I'm confused. How would persistent containers work as array elements/fields? It seems like they would need to be boxed, which defats the original purpose. "

Not necessarily, a 'box' that is really only a pointer to a place in an array is much more efficient than a full object box. The locality of reference in the array of values is significantly higher (imagine a Map, with keys and values interleved in the same array, insted of indrection to an Entry object followed by a second indirection to the key and value objects).

So although pointers are still necessary, multiple levels of indirection can be avoided in real world use cases (along with object header overhead).

Posted by Scott C on March 09, 2014 at 12:38 PM PDT #

I wish java had sturcts in EXACTLY the same way as in .NET. There is no need to reinvent the wheel. They are ultimate solution for simple garbage free code!

1. they can be mutable or immutable
2. mutable are ideal for garbage free iterators and alike data structures
3. the only minor issue with structs in .NET is boxing if you try to use them in polymorphic way (ie as reference type):

let's say we have struct A : ISomething {} .. to pass it to some method w/o boxing it has to be declared like:
void method<T>(T data) where T : ISomething {}
instead of:
void method(ISomethig data) {}

this is THE ONLY thing that PROBABLY needs an improvement... Just import the good things as it is! Please don't screw up everyones mind :)

Posted by Nick Evgeniev on March 17, 2014 at 03:24 PM PDT #

Post a Comment:
• HTML Syntax: NOT allowed
About

John R. Rose

Java maven, HotSpot developer, Mac user, Scheme refugee.

Once Sun and present Oracle engineer.

Archives
Sun Mon Tue Wed Thu Fri Sat « April 2015 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 Today
RSS Atom