By user13277689 on Apr 07, 2009
Each build of (Open)Solaris is tested with a variety of test suites on a variety of platforms,
and I wanted the nc test suite to take part in these runs.
Eoin Hughes from the PIT team (which runs those tests) was kind enough to work around a couple of bugs in the test suite (which are fixed now) so that it could be run in the PIT environment. Later on, I got a report from Eoin that a run of the nc test suite had caught CR 6793191 (watchmalloc triggers system panic on sockfs copyin). The bug manifests itself as a panic:
Panic message (this particular panic is on a DomU, although this happens across the board):

panic[cpu0]/thread=ffffff0150ce1540: copyin_noerr: argument not in kernel address space
ffffff000416dcf0 unix:bcopy_ck_size+102 ()
ffffff000416ddb0 genunix:watch_xcopyin+151 ()
ffffff000416dde0 genunix:watch_copyin+1d ()
ffffff000416de50 sockfs:copyin_name+91 ()
ffffff000416deb0 sockfs:bind+90 ()
ffffff000416df00 unix:brand_sys_syscall32+328 ()
The bug is actually a regression caused by CR 6292199 (bcopy and kcopy should'nt use rep, smov) and was fixed by an engineer from Intel in the OpenSolaris/Nevada code base.
This is an instance of something I like very much: an unintended positive consequence elsewhere. In contrast with so-called collateral damage, this is a side effect that benefits other areas. I wrote the nc test suite primarily to test the nc(1) command, but here it proved useful for testing other parts of the system as well. In this case it was thanks to the fact that the test suite runs with memory leak checking by default (see the NC_PRELOADS variable in the src/suites/net/nc/include/vars file).
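To illustrate the general idea of running a command with a malloc-checking library preloaded, here is a minimal Python sketch. It is not how the test suite itself does it. I am assuming that the preload list ends up in LD_PRELOAD and that it contains watchmalloc.so.1, which the CR synopsis suggests; the helper names preload_env and run_preloaded are made up for this example.

```python
import os
import subprocess


def preload_env(preloads, base_env=None):
    """Return a copy of the environment with LD_PRELOAD set to `preloads`.

    `preloads` is a list of library names, for example ["watchmalloc.so.1"]
    (an assumed value, not taken from the test suite's vars file).
    """
    env = dict(os.environ if base_env is None else base_env)
    if preloads:
        # The runtime linker accepts a whitespace-separated list of objects.
        env["LD_PRELOAD"] = " ".join(preloads)
    return env


def run_preloaded(argv, preloads):
    """Run `argv` with the given libraries preloaded; return its exit status."""
    result = subprocess.run(argv, env=preload_env(preloads), check=False)
    return result.returncode
```

With something like run_preloaded(["nc", "-l", "4444"], ["watchmalloc.so.1"]) every nc invocation goes through the checking allocator, which is how a test aimed at a userland command can end up exercising an unusual kernel code path.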
And yes, CR 6793191 is fixed by now.