macros and powers of two

Unlike many userland applications, there are large portions of the
kernel that need to do bit-level manipulations. These generally fall
into a few distinct areas:

  • Drivers communicating with hardware
  • Hash functions, bitmasks, and other high-performance algorithms
  • Memory allocators, page manipulators, etc.

In this note, I'm going to peek into the last group, which
typically needs to do a lot of power-of-2 based alignment work, and look
at a set of macros used to abstract away some of the bit-fiddling.

The main low-level allocators in Solaris are
kmem handles your standard page-backed memory allocation, while
vmem handles the more abstract concept of allocating from spaces
of integers. kmem actually layers on top of vmem, but
the details are beyond the scope of this blog entry.

While implementing vmem,
Jeff Bonwick introduced a
set of handy macros for dealing with things relative to powers of two.
The macros live in uts/common/sys/sysmacros.h,
and are defined thusly:

\* Macros for various sorts of alignment and rounding when the alignment
\* is known to be a power of 2.

P2ALIGN(x, align)

((x) & -(align))

P2PHASE(x, align)

((x) & ((align) - 1))

P2NPHASE(x, align)

(-(x) & ((align) - 1))

P2ROUNDUP(x, align)

(-(-(x) & -(align)))

P2END(x, align)

(-(~(x) & -(align)))

P2PHASEUP(x, align, phase)

((phase) - (((phase) - (x)) & -(align)))

P2CROSS(x, y, align)

(((x) \^ (y)) > (align) - 1)
\* Determine whether two numbers have the same high-order bit.


(((x) \^ (y)) < ((x) & (y)))

The purpose of this blog entry is to talk about what they do and how
they work; if you don't want the challenge spoiled, you should stop
reading now.

Alright, now that they've all left, we can dive in. First, remember
the basic equation of two's complement arithmetic:
-x = ~x + 1

That is, to negate a number, complement all of its bits and add one.
If I assume that the number has a lowest set
bit,[1] and capital
Xs represent complements of the corresponding xs, here
are some relations for any non-zero binary number:
        x = ...xxxxx10000...
x-1 = ...xxxxx01111...
~x = ...XXXXX01111...
-x = ~x+1 = ...XXXXX10000...

Now, for powers of two, there is only one set bit, so all of the
xs are 0. This means that (x-1) is a mask for the offset
modulo x, and -x is a mask which clears the offset. In addition,
all numbers are unsigned, align is a power-of-2, and
offset is less than align. These relations in mind,
we can look at what each of the above macros do:

P2ALIGNx & -align
clears all bits below align, rounding x down to the
next lower multiple of align
P2PHASEx & (align - 1)
clears all bits above or equal to align, getting(x % align), the phase of x with regards toalign
P2NPHASE(-x) & (align - 1)
This is the first tricky formula. We negate x, and get
its phase. The result is the same as (align - P2PHASE(x,
align)) % align
, but is much faster to compute.
P2ROUNDUP-((-x) & -align)
This can be re-written as -P2ALIGN(-x, align), which shows how
the operation works; "negate, round down, negate" is the same as
"round up".
So we round x up to the next align boundary.
P2END-((~x) & -align)This can be re-written as-((-x-1) & -align) = P2ROUNDUP(x+1, align); round up(x+1) to the next align boundary. This gets
the end address of the current align block.
(phase -

 ((phase - x) &


This can be re-written as (phase + P2ROUNDUP(x - phase, align)).
So what we've done is rounded up x to have a specificphase with regards to align. (If you've got
some linear algebra under your belt, this is similar to
transforming f(y) = P2ROUNDUP(y, align) into a
coordinate system with x = y + phase)
P2CROSS(x \^ y) > (align-1)(x \^ y) has bits set wherever x and y
differ. This just tests to see if they differ in any position
above the alignment, which will only be the case if [x, y]
crosses an align boundary.
P2SAMEHIGHBIT(x \^ y) < (x & y)
Similar to P2CROSS; here, we're checking that the first bit that
differs is less than the highest set bit the two have in common.

From the above, here are some trivial relations between the various

P2ALIGN(x, align)   = x - P2PHASE(x, align)
P2ROUNDUP(x, align) = x + P2NPHASE(x, align)
P2END(x, align) = P2ALIGN(x, align) + align

Finally, the one drawback of these macros is the fact that they
are macros. This means that (as usual) the argument parsing
can leave much to be desired. The biggest problem is that there is no
type enforcement; if x, y, or align are not
unsigned integers of the same width, bad results can occur. Hence the
(recently added) macros, P2\*_TYPED, defined
the originals, which allow an explicit type to be specified.

Despite this drawback, these macros tend to be much clearer than the same
things done by hand in the code. They probably aren't used enough, in fact.


[1] Of course, zero doesn't have a
lowest set bit, but you can treat it as being 2bits, and get
the same answer.

Technorati Tag: href="http://www.technorati.com/tag/OpenSolaris" rel="tag">OpenSolaris

Technorati Tag: href="http://www.technorati.com/tag/Solaris" rel="tag">Solaris

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.