Tuesday Jun 14, 2005

User Credentials and all that

Peter Harvey's story reminds me of the unforeseen consequences of creating the ucred in Solaris 10. The ucred was motivated by two factors: the introduction of privileges and a way to propagate information about process credentials through the system in userland.

Before Solaris 10, we had several mechanisms, some internal, some public, all propagating a subset of that information.

in sys/door.h:

/*
 * Structure used to return info from door_cred
 */
typedef struct door_cred {
        uid_t   dc_euid;        /* Effective uid of client */
        gid_t   dc_egid;        /* Effective gid of client */
        uid_t   dc_ruid;        /* Real uid of client */
        gid_t   dc_rgid;        /* Real gid of client */
        pid_t   dc_pid;         /* pid of client */
        int     dc_resv[4];     /* Future use */
} door_cred_t;

in sys/tl.h:

#define TL_OPT_PEER_CRED 10
typedef struct tl_credopt {
        uid_t   tc_uid;         /* Effective user id */
        gid_t   tc_gid;         /* Effective group id */
        uid_t   tc_ruid;        /* Real user id */
        gid_t   tc_rgid;        /* Real group id */
        uid_t   tc_suid;        /* Saved user id (from exec) */
        gid_t   tc_sgid;        /* Saved group id (from exec) */
        uint_t  tc_ngroups;     /* number of supplementary groups */
} tl_credopt_t;

in rpc/svc.h:

/*
 * Obtaining local credentials.
 */
typedef struct __svc_local_cred_t {
        uid_t   euid;   /* effective uid */
        gid_t   egid;   /* effective gid */
        uid_t   ruid;   /* real uid */
        gid_t   rgid;   /* real gid */
        pid_t   pid;    /* caller's pid, or -1 if not available */
} svc_local_cred_t;

and in the project I missed this one in sys/stropts.h:

struct k_strrecvfd {    /* SVR4 expanded syscall interface structure */
        struct file *fp;
        uid_t uid;
        gid_t gid;
        char fill[8];
};

There was also the need to be able to enquire about other processes and perhaps network connections and packets; a getpeereid interface was requested.

Now, what information should such an interface return? Network interfaces often only allow you to shape requests as a blob of bytes. And that blob needs to have a predictable maximum size too. As you can see from the above examples, even declaring a number of filler elements is not sufficient; none of the above structures which include a filler have space for the full complement of 16 groups, let alone Pete's proposed 65536 maximum number of groups.
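To make the sizing problem concrete, here is a small sketch of the arithmetic. It is not taken from the Solaris headers: `NGROUPS_WANTED`, `gids_that_fit`, `filler_is_big_enough` and `check_quoted_fillers` are names invented for this illustration, and the only inputs are the filler declarations quoted above (`int dc_resv[4]` and `char fill[8]`).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Illustrative constant: the 16 supplementary groups mentioned above. */
#define NGROUPS_WANTED  16

/* How many gid_t values fit in a filler of the given size in bytes? */
size_t
gids_that_fit(size_t filler_bytes)
{
        return (filler_bytes / sizeof (gid_t));
}

/* Does a filler of this size have room for the full group list? */
int
filler_is_big_enough(size_t filler_bytes)
{
        return (gids_that_fit(filler_bytes) >= NGROUPS_WANTED);
}

/* Check the two filler fields quoted above; neither one is big enough. */
void
check_quoted_fillers(void)
{
        /* int dc_resv[4] in door_cred_t */
        assert(!filler_is_big_enough(4 * sizeof (int)));
        /* char fill[8] in struct k_strrecvfd */
        assert(!filler_is_big_enough(8 * sizeof (char)));
}
```

Even before the limit is raised, a fixed filler of a few words cannot carry a group list, which is why a fixed-layout public structure is a dead end here.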

The most natural way of implementing a blob with such restrictions is using an opaque data structure with accessor functions (in <ucred.h>):

extern ucred_t *ucred_get(pid_t pid);

extern void ucred_free(ucred_t *);

extern uid_t ucred_geteuid(const ucred_t *);
extern uid_t ucred_getruid(const ucred_t *);
extern uid_t ucred_getsuid(const ucred_t *);
extern gid_t ucred_getegid(const ucred_t *);
extern gid_t ucred_getrgid(const ucred_t *);
extern gid_t ucred_getsgid(const ucred_t *);
extern int   ucred_getgroups(const ucred_t *, const gid_t **);

extern const priv_set_t *ucred_getprivset(const ucred_t *, priv_ptype_t);
extern uint_t ucred_getpflags(const ucred_t *, uint_t);

extern pid_t ucred_getpid(const ucred_t *); /* for door_cred compatibility */

extern size_t ucred_size(void);

extern int getpeerucred(int, ucred_t **);

extern zoneid_t ucred_getzoneid(const ucred_t *);

extern projid_t ucred_getprojid(const ucred_t *);

The ucred_t itself is defined in sys/ucred.h, a header which isn't installed on the system because programs are not supposed to use it; it is a private interface between the kernel and the library.
One function of note is perhaps ucred_size() which returns the maximum size of a credential on the system; it can be used to size credentials allocated on the stack or embedded in structures.
In many cases, the system will just allocate one for you and return the allocated one, but the interfaces have been structured so you can reuse ones returned earlier or ones you allocated yourself.
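The general pattern (an opaque type, accessor functions, a size function, and "allocate for me or reuse mine" semantics) can be sketched in portable C. This is not the Solaris implementation: every `xcred_*` name and the `XCRED_MAXGROUPS` limit are invented for the illustration, and in a real library the structure definition would sit in a private header that callers never see.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define XCRED_MAXGROUPS 16      /* illustrative limit; free to grow later */

/* "Private header": only the library knows the layout. */
struct xcred {
        uid_t   xc_euid;
        gid_t   xc_egid;
        int     xc_ngroups;
        gid_t   xc_groups[XCRED_MAXGROUPS];
};

/* "Public header": callers only see an opaque type and functions. */
typedef struct xcred xcred_t;

/* Maximum size of a credential; callers use it to size their own buffers. */
size_t
xcred_size(void)
{
        return (sizeof (struct xcred));
}

/*
 * Fill in a credential.  If *xcp is NULL a new one is allocated;
 * otherwise the caller's buffer (at least xcred_size() bytes) is reused.
 * Returns 0 on success, -1 on failure.
 */
int
xcred_fill(xcred_t **xcp, uid_t euid, gid_t egid, const gid_t *groups,
    int ngroups)
{
        xcred_t *xc;

        assert(xcp != NULL);
        xc = *xcp;
        if (ngroups < 0 || ngroups > XCRED_MAXGROUPS)
                return (-1);
        if (xc == NULL && (xc = malloc(xcred_size())) == NULL)
                return (-1);
        xc->xc_euid = euid;
        xc->xc_egid = egid;
        xc->xc_ngroups = ngroups;
        if (ngroups > 0)
                memcpy(xc->xc_groups, groups, (size_t)ngroups * sizeof (gid_t));
        *xcp = xc;
        return (0);
}

uid_t
xcred_geteuid(const xcred_t *xc)
{
        return (xc->xc_euid);
}

/* Returns the number of groups and points *groups at the internal list. */
int
xcred_getgroups(const xcred_t *xc, const gid_t **groups)
{
        *groups = xc->xc_groups;
        return (xc->xc_ngroups);
}

void
xcred_free(xcred_t *xc)
{
        free(xc);
}
```

Because callers never touch the fields directly, the library can raise `XCRED_MAXGROUPS` or add members without breaking them, as long as they size their buffers with `xcred_size()` at run time rather than with a compile-time constant.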

By now you may be asking yourself where you get creds; well, here are some examples in the OpenSolaris source code: nscd getting a door cred, rpcbind getting an rpc caller credential and the use of the TL option by RPC.

And your typical use of the function in an inetd started daemon:

#include <ucred.h>

int
main(int argc, char **argv)
{
	ucred_t *uc = NULL;

	if (getpeerucred(0, &uc) == 0) {
		/* we know something about the caller */
		ucred_free(uc);
	}

	return (0);
}

And a slightly bigger example where we use XPG4 recvmsg to receive UCRED control messages:

/*
 * Send a 1 byte UDP packet; print the response packet if one is
 * received.
 */

#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/signal.h>
#include <netinet/in.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <netdb.h>
#include <stdlib.h>
#include <arpa/inet.h>
#include <ucred.h>

int
main(int argc, char **argv)
{
        struct sockaddr_storage stor;
        struct sockaddr_in *sin = (struct sockaddr_in *)&stor;
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&stor;
        ssize_t bytes;
        union {
                struct cmsghdr hdr;
                unsigned char buf[2048];
                double align;
        } cbuf;
        unsigned char buf[2048];
        struct msghdr msg;
        struct cmsghdr *cmsg;
        struct iovec iov;
        int one = 1;

        msg.msg_name = &stor;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        iov.iov_base = buf;

        /* Ask for the destination address and the sender's ucred. */
        setsockopt(0, IPPROTO_IP, IP_RECVDSTADDR, &one, sizeof (one));
        setsockopt(0, IPPROTO_IPV6, IPV6_RECVPKTINFO, &one, sizeof (one));
        setsockopt(0, SOL_SOCKET, SO_RECVUCRED, &one, sizeof (one));

        while (1) {
                char abuf[256];

                /* recvmsg overwrites the lengths, so reset them each time. */
                msg.msg_control = &cbuf;
                msg.msg_controllen = sizeof (cbuf);
                msg.msg_namelen = sizeof (stor);
                iov.iov_len = sizeof (buf);

                bytes = recvmsg(0, &msg, 0);

                if (bytes >= 0) {

                        if (msg.msg_namelen != 0 &&
                            connect(0, (struct sockaddr *)&stor,
                            msg.msg_namelen) != 0)
                                perror("connect");
                        printf("you connected from %s with the credential\n",
                            inet_ntop(stor.ss_family,
                                    stor.ss_family == AF_INET ?
                                        (void *)&sin->sin_addr :
                                        (void *)&sin6->sin6_addr,
                                        abuf, sizeof (abuf)));
                        /* Walk the control messages looking for the ucred. */
                        for (cmsg = CMSG_FIRSTHDR(&msg); cmsg;
                            cmsg = CMSG_NXTHDR(&msg, cmsg)) {
                                if (cmsg->cmsg_level == SOL_SOCKET &&
                                    cmsg->cmsg_type == SCM_UCRED) {
                                        ucred_t *uc = (ucred_t *)
                                            CMSG_DATA(cmsg);

                                        /* We have a ucred here !! */
                                }
                        }
                        /* Undo the connect so the next sender can be anyone. */
                        if (msg.msg_namelen != 0)
                                (void) connect(0, NULL, 0);
                } else {
                        perror("recvmsg");
                        exit(1);
                }
        }
}

But thinking back to Pete's problem, we see a problem when increasing the maximum number of groups. Even worse, this libnsl private data structure is abused, and multiple copies exist which need to be kept in sync (so parts of the system broke when I changed it in this one place). The bug illustrates why cut & paste programming doesn't work and why, even when you share a private definition, you must use a proper header file. I filed the bug as soon as I did the quick fix for the Solaris Express respin; the bug is 4994017.


Tuesday May 31, 2005

Southpark Stdio

I guess we all have to do this now, so here's my self-portrait. After I pointed my kids to this, they and their friends spent a whole afternoon creating images of themselves, their mothers and fathers. Well, I preempted them and did myself before they had a chance.

Saturday May 07, 2005

Open Solaris Release Date Set

Well, the Open Solaris "vaporware" release date is now set; and as the end of Q2 draws near this should be no surprise. It's only a few days after my dad's 73rd birthday, so I have two things to celebrate that week.

Wednesday Apr 27, 2005

The End of Realmode Boot

I've already mentioned two great new features in our current development release; ACPICA and USB hotplug.

But there's one change that's much more far reaching than that: Newboot.

Most Solaris x86 users will be familiar with the blue screen/device configuration assistant/boot sequence and how ancient some of that feels. Perhaps few are aware that the DCA is actually a realmode DOS-like environment where each boot device requires its own realmode driver. These drivers needed to be compiled with a 16 bit compiler and 16 bit MASM, not available for ready money anywhere. While the official build environment required NT, I managed to build it on environments ranging from MS Windows 98 and 2000 on actual PCs to Caldera DOS 7 on a SunPCi card (which allowed for automatic building, which was great fun). Now that this piece of shameful history lies in the past, I am not afraid to confess.

But as of last Sunday, April 17th, 2005, we have "legacy free" newboot. Newboot uses grub with ufs support, so we now have native grub support and a menu we can edit from inside Solaris. Device enumeration is done completely using ACPI.

Because it skips the device configuration assistant and boots a single large file with all kernel device drivers, startup is quite a bit quicker, and we can boot from any bootable device as long as we also support it in the kernel so we can mount root.

And we've reverted to white on black consoles; this again takes some getting used to, surprisingly enough.

One thing to note is that before, you may have had to disable ACPI in the kernel and the BIOS; with Newboot + ACPICA, you actually stand a much better chance of the system working with all the default settings: ACPI on, ACPI 2.0 enabled. Even legacy USB enabled now has a much better chance of working than before.

But this is a radical change and, PC BIOSes and hardware being what they are, there are interesting times ahead. So please test drive it when this hits Solaris Express in a few months' time.

As of this writing, it's a bit in the balance whether you'll get to see the source first as part of OpenSolaris or the binaries as part of a Solaris Express.

Tuesday Apr 26, 2005

Netherlands/Benelux OpenSolaris/Solaris Usergroup

A few of our customers approached me to start a Solaris user group in the Netherlands (or perhaps a somewhat larger area)

Any takers? Perhaps offers of venues, talks wanted?

For those who don't know, I am based in the Netherlands.

Yet Another Desktop/Laptop Usability Step

"Solaris Nevada" build 14 is proving to be another quantum leap for Solaris desktop usability.

I discussed the new USB hotplug support in vold before, but in the last few days we've also gotten the virtual keyboard/mouse driver in the next Solaris release. People often complained about the fact that their laptop keyboard died until the next reboot when they plugged in a USB or other keyboard. Well, not anymore! We now have virtualized keyboard and mouse drivers which collect events from all available keyboards and mice and present them through a single virtual keyboard and mouse. It is also still possible to use the devices as separate devices in case you have a multi-head/multi-user environment, but for the common case of a single system with multiple keyboards (laptop + keyboard) this is another big step.

You can plug in the other keyboard at any time, running under X or the commandline, it just works.

Solaris FAQ Updated

For the first time in many years (2.5 years) I've updated the Solaris FAQ

Much more work is needed on it but at least this is a start. I'm hoping to update it more regularly now. It's also still here but it seems to be doing fine there.

Wednesday Apr 20, 2005

ACPICA in Solaris

With the long history of neglect that Solaris on x86 endured, quite a few components got to be extremely stale and fragile. And this wasn't just a lack of device drivers but also a lack of basic new functionality in the core OS.

This week saw another quantum leap; the induction of Intel's ACPI reference implementation (ACPICA) into the next Solaris release.

For years I wanted to have battery support on my old VAIO and later on my Ferrari. And I wanted a power button that did something, etc. I tried to make do with the old "acpi_intp" interpreter which was part of Solaris; but it leaked memory like a sieve and was limited in functionality. Integrating ACPICA looked daunting but fortunately someone made an actual project out of this and the end result is that we now have a state of the art ACPI interpreter in Solaris.

There are basically only two ACPI interpreters in widespread use: the Windows one and the Intel one; by leveraging Intel's source, we stand a fair chance of having Solaris work with more ACPI BIOSes. If your system required ACPI to be turned off for Solaris to work, you may find yourself forced to switch it on when you upgrade later this year.

I've been distributing acpica and a number of other useful Solaris binaries in a single internal kit called "frkit" (originally aimed at Ferraris but now running on countless systems); frkit includes acpica, a powerbutton/battery handler, an AMD PowerNOW! powermanagement module, a GNOME battery monitor, and our development cardbus and wireless drivers + tools.

One of the more interesting parts of that is possibly the "NDISulator" port from FreeBSD which allows the Sun Ferraristi to use the builtin Broadcom wireless on their Acer Ferraris in 32 and 64 bit mode.

ACPICA is just phase one of a larger project; we have not yet bothered much with the "P" (for power) from ACPI; but we hope to leverage the new implementation to provide the necessary "S3" and "S4" sleep state support.

The speed at which new features work on my Ferrari, which I've had now for 4 months, is in stark contrast with my Vaio, which I got not too long before S9 for x86 was postponed. It's clear that we needed a ramp-up after the wind-down, but it seems to be going more quickly than ever before.

Tuesday Apr 19, 2005

USB hotplug finally works

Today I was pleasantly surprised to see the latest putback to SNV, our next Solaris release (soon on an OpenSolaris source server near you).

Before this putback you could hotplug/eject devices into devices (SD cards and such in card readers, floppies in floppy drives) but you had to restart vold to mount USB pen drives. But now, you can insert them, remove them, etc, and they're mounted and unmounted automatically.

This was really one of my most wanted features and it's great to finally have it. Will it make an update? I certainly hope so; but you can always try Solaris Express.

Thursday Apr 14, 2005

Solaris 10 Encryption Supplement download

The Solaris 10 encryption supplement is available for download here. Apparently, it's difficult to find going through the Sun download sites, so I give a link here.

The supplement adds 256 bit AES and 448 bit Blowfish; DES, 3DES, 128 bit AES, blowfish and RC4 are already in the standard release. In other words, the standard release gets you what you would have gotten with the encryption supplement for older releases. And the S10 Encryption Supplement takes it one step further. The main reason for not making this part of the Solaris CDs is import restrictions, rather than export restrictions.

Tuesday Apr 12, 2005

Timezones and multi boot.

One of the things that has always been bothering me is the fact that on x86 systems you cannot really run multiple operating systems and survive the timezone change. That's because the clock runs in local time, and local time is ambiguous. The system cannot tell whether the DST change has happened or not, so it needs to record this fact in the filesystem (that's why Solaris on x86 has the "rtc -c" cronjob). If you boot all your OSes in turn after the changeover, your system will be N hours off once they're all done adjusting time. The problem is probably best summarized here

On Unix this was long solved by running the clock in the UTC or GMT timezone; that clock is unambiguous, give or take a leap second, and allows multiple versions of the OS to coexist.

Last week, it was pointed out to me in comp.unix.solaris that there is a hidden registry key in later releases of MS-Windows. I already knew how to fix up Solaris, so I combined this to:

       Set the following registry key (it does not exist!)


       In the control panel with Day&Time settings, check the "automatically adjust" check box.

        Boot into Solaris and run:

              rtc -c -z UTC

        then correct the clock with date/rdate
        (if you use liveupgrade, lumount your other partition(s) and copy the
        /etc/rtc_config file to all of them)

        In Linux, you'll need to run "timeconfig" and select "RTC set to GMT".

Note that if you don't multiboot, it's probably also a good idea to run "rtc -c -z UTC" and then correct the date; for one you won't be bitten by the AMD64 timezone bug we had in Solaris 10.

Saturday Apr 09, 2005

Open Solaris CAB

It has been a busy week flying to SFO and having our first CAB meeting. The first good thing that happened was that KLM had finally changed the aging and horrible MD11 for shiny new Boeing 777s with personal video. I have a bit of a problem with the new immigration procedures, and I like the Brazilian government's stand on this.
I had met Rich Teer fleetingly before in the hallways of the Menlo Park campus, so he recognized me when we checked in at the same time; we probably were on the same BART train from the airport. But I had never met the others. I feel we have a great team with many different competences: Roy Fielding with his experience of Apache's governance model, Simon Phipps with his tireless evangelism, and Al Hopper with his tireless Solaris on x86 enthusiasm. Rich Teer, a SPARC fan and accomplished author, and myself as Solaris engineering representative cover the more technical side of things.
Are we just marketing as the Register would have it? No, we're very serious about it. Is the CAB just a bunch of YES-men? Can we get the respect of our community if we are?
Sun takes both Open Solaris and its independence from Sun seriously; Jonathan Schwartz came to meet us but none of the other executives were allowed at our meeting. He talked to us at length and was very serious about clearing up the roadblocks that we had already determined to be on the path to OpenSolaris. It is clear that they want us to succeed and want us to be independent. Jonathan even stayed for lunch. [Photo: CAB with Jonathan]
The Sun press conference we took part in was a first for me. The press was not hostile and mostly asked questions which were to the point; some more than others.
We have a lot of work to do and will do most of it in email on a publicly readable mailing list.
The second day we listened to Jonathan's keynote at the OSBC conference and spent the afternoon doing interviews with the press, followed by a press reception and a Sun engineering dinner/Open Solaris launch party at Lulu's. And guess what, we were able to make the Americans walk all over town; the itinerary was all "5 min cab ride" and some such nonsense.
On the final day I took Ben Rockwood's advice and tried out "Clam chowder in sourdough" after taking the cable car to lombard street and walking down to the harbor. The weather was gorgeous, the same cannot be said of the weather in Amsterdam which is now unseasonably cold.

Friday Dec 10, 2004


Sometimes completely unexpected events take place, such as this little mishap with our car. Modern MPVs are pretty forgiving and you might not immediately notice that you've got a flat, especially not if the driver hasn't driven the particular vehicle for a while, you have the fans blowing and are playing music at the same time. Until you hit the highway, that is. So, what happened next? A lot of noise and a rather pungent smell of burning rubber before we even hit 50 or 60, and in the few seconds it took us to figure out that we had to pull over, the tire gave way; it had half left the rim before we came to a full stop on the hard shoulder. And here's what's left of it:
[Photo: blown out tire]

With cars zooming past at 120 km/h (75 mph) on a fairly narrow stretch of highway on a cold dark night with wife and kids in the car, you feel like someone has just painted a target sign on you and you anxiously wait until help arrives. And it did, within 5 or 10 minutes.

Monday Nov 29, 2004

Fujitsu Lifebook B112 running Solaris 10

As one of two resident Solaris Engineers in Holland, you sometimes get strange requests such as one from an IT operations person who was given this really old laptop: a Fujitsu Lifebook B112 with 96MB of memory and a 3GB harddisk. It was running Solaris 7 and he had no root password.

Hacking it was not too difficult; installing Solaris 10, of course, would be more fun.

There are a few challenges getting a vintage laptop up and running; it has no onboard networking, no bootable CD. And Solaris 10 does not support booting from the Xircom PE3 parallel port ethernet card anymore. And where would you even find such a device?

Well, turns out that we were indeed able to locate a Xircom PE3 adaptor, and armed with "perl" we could make the S9 "pe" driver run under S10; the "pe" driver is no longer supported because we obsoleted GLDv0, but by making PE load "misc/GLD" rather than "misc/gld" and installing the old GLD driver we added the device to the miniroot on the S10 install server. Armed with an S9 boot floppy we then booted S10 from the server and started the install. It couldn't fit all of Solaris 10, so we removed a few components to cut it down so it would fit in the 2.8GB partition (the installer is a bit generous in allocating space, so at the end we only had 2.0 GB installed).

After churning for around 3 hours, Solaris had been installed (PE ethernet is very slow and a 233MHz Pentium is not very fast). I had to go home (Friday) and resolved to see how far we'd get on Monday. But I couldn't wait.

The next hack was attacking a system sitting idle with install finished (we didn't dare reboot because it'd come up w/o ethernet if we did) from the install server. After looking with snoop I found that there are actually two ways of doing that: the first one is the new "eventhook" mechanism we have for dhcp; whenever a dhcp event occurs, the eventhook script is run. The second method was even simpler; it turns out that Solaris 10 init stat's inittab every 5 minutes. So I added a line to inittab which popped an xterm up over my VPN tunnel to my house. Added the "PE" ethernet driver; finished some other config stuff by hand and rebooted. And it came up, still with PE but I soon killed it remotely fiddling with cardbus and the PCMCIA Ethernet card.

With my own modified PCIC driver I was able to get the Lifebook to use "pcelx0", with all supported devices. Xorg came up without a hitch too, with just a little bit of fiddling to get it to use the external monitor; no luck with the touchscreen yet. Here's what it looked like: [Photo: Lifebook B112 running the Solaris desktop login]

Even sound was simple; ``update_drv -a -i '"ESS1879"' sbpro'' and the sound driver attached.

Thursday Aug 12, 2004

The Solaris BOF

Yesterday's Solaris BOF at the Usenix Security symposium was well attended (about 70 people); there were a great many questions about zones, which shows there is great interest in the topic. (I did both a zones and a privilege application.)

Talking to many people in the hallways of the conference is great, and I'm still being tackled with questions the day after the BOF (we ran out of time during the questions).


