The SPARC architecture has an interesting feature called Register Windows. The idea is that the processor should contain multiple sets of registers on chip. When a new routine is called, the processor can give a fresh set of registers to the new routine, preserving the value of the old registers. When the new routine completes and control returns to the calling routine, the register values for the old routine are also restored. The idea is for the chip not to have to save and load the values held in registers whenever a routine is called; this reduces memory traffic and should improve performance.
The trouble with register windows, is that each chip can only hold a finite number of them. Once all the register windows are full, the processor has to spill a complete set of registers to memory. This is in contrast with the situation where the program is responsible for spilling and filling registers - the program only need spill a single register if that is all that the routine requires.
Most SPARC processors have about seven sets of register windows, so if the program remains in a call stack depth of about seven, there is no register spill/fill cost associated with calls of other routines. Beyond this stack depth, there is a cost for the spills and fills of the register windows.
The SPARC architecture book contains a more detailed description of register windows in section 5.2.2.
Most of the time software is completely unaware of this architectural decision, in fact user code should never have to be aware of it. There are two situations where software does need to know about register windows, these really only impact virtual machines or operating systems:
The SPARC V9 instruction
flushw will cause the processor to store all the register windows in a thread to memory. For SPARC V8, the same effect is attained through
trap x03. Either way, the cost can be quite high since the processor needs to store up to about 7 sets of register windows to memory Each set is 16 8-byte registers, which results in potentially a lot of memory traffic and cycles.