Wednesday Jan 29, 2014

Ksplice SNMP Plugin

The Ksplice team is happy to announce the release of an SNMP plugin for Ksplice, available today on the Unbreakable Linux Network. The plugin will let you use Oracle Enterprise Manager to monitor the status of Ksplice on all of your systems, and it will also work with any monitoring solution that is SNMP compatible.

Installation

You'll find the plugin on the Ksplice channel for your distribution and architecture. For Oracle Linux 6 on x86_64, that's ol6_x86_64_ksplice. Install the plugin by running (as root):

yum install ksplice-snmp-plugin

Configuration

If you haven't set up SNMP before, you'll need to do a little bit of configuration. Included below is a sample /etc/snmp/snmpd.conf file to try out the plugin:


# Setting up permissions
# ======================
com2sec local localhost public
com2sec mynet 10.10.10.0/24 public

group local v1 local
group local v2c local
group local usm local
group mynet v1  mynet
group mynet v2c mynet
group mynet usm mynet

view all included .1 80

access mynet "" any noauth exact all none none
access local "" any noauth exact all all none

syslocation Oracle Linux 6
syscontact sysadmin <root@localhost>

# Load the plugin
# ===============
dlmod kspliceUptrack /usr/lib64/ksplice-snmp/kspliceUptrack.so


You'll want to replace the IP address next to com2sec with the address of your local network, or the address of your SNMP monitoring software. If you are running on a 32-bit architecture (x86), replace lib64 in the dlmod path with lib. The above configuration is an example, intended only for testing purposes. For more information about configuring SNMP, check out the SNMP documentation, including the man pages for snmpd and snmpd.conf.
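
Once snmpd.conf is in place, restart the daemon so it picks up the plugin; assuming the standard Oracle Linux 6 service tooling, that's (as root):

service snmpd restart
chkconfig snmpd on    # optional: have snmpd start at boot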

Examples

You can test out your configuration by using the snmpwalk command and verifying the responses. Some examples:

Displaying the installed version of Ksplice:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceVersion
KSPLICE-UPTRACK-MIB::kspliceVersion.0 = STRING: 1.2.12

Checking if a kernel has all available updates installed:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceStatus
KSPLICE-UPTRACK-MIB::kspliceStatus.0 = STRING: outofdate

Displaying and comparing the kernel installed on disk with the Ksplice effective version:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceBaseKernel
KSPLICE-UPTRACK-MIB::kspliceBaseKernel.0 = STRING: 2.6.18-274.3.1.el5

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceEffectiveKernel
KSPLICE-UPTRACK-MIB::kspliceEffectiveKernel.0 = STRING: 2.6.18-274.3.1.el5

Displaying a list of all installed updates:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::ksplicePatchTable

In this case, there are none. This is why the base kernel version and effective kernel version are the same, and why this kernel is out of date.

Displaying a list of updates that can be installed right now, including their description:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceAvailTable
KSPLICE-UPTRACK-MIB::kspliceavailIndex.0 = INTEGER: 0
KSPLICE-UPTRACK-MIB::kspliceavailIndex.1 = INTEGER: 1
KSPLICE-UPTRACK-MIB::kspliceavailIndex.2 = INTEGER: 2
KSPLICE-UPTRACK-MIB::kspliceavailIndex.3 = INTEGER: 3
KSPLICE-UPTRACK-MIB::kspliceavailIndex.4 = INTEGER: 4
KSPLICE-UPTRACK-MIB::kspliceavailIndex.5 = INTEGER: 5
KSPLICE-UPTRACK-MIB::kspliceavailIndex.6 = INTEGER: 6
KSPLICE-UPTRACK-MIB::kspliceavailIndex.7 = INTEGER: 7
KSPLICE-UPTRACK-MIB::kspliceavailIndex.8 = INTEGER: 8
KSPLICE-UPTRACK-MIB::kspliceavailIndex.9 = INTEGER: 9
KSPLICE-UPTRACK-MIB::kspliceavailIndex.10 = INTEGER: 10
KSPLICE-UPTRACK-MIB::kspliceavailIndex.11 = INTEGER: 11
KSPLICE-UPTRACK-MIB::kspliceavailIndex.12 = INTEGER: 12
KSPLICE-UPTRACK-MIB::kspliceavailIndex.13 = INTEGER: 13
KSPLICE-UPTRACK-MIB::kspliceavailIndex.14 = INTEGER: 14
KSPLICE-UPTRACK-MIB::kspliceavailIndex.15 = INTEGER: 15
KSPLICE-UPTRACK-MIB::kspliceavailIndex.16 = INTEGER: 16
KSPLICE-UPTRACK-MIB::kspliceavailIndex.17 = INTEGER: 17
KSPLICE-UPTRACK-MIB::kspliceavailIndex.18 = INTEGER: 18
KSPLICE-UPTRACK-MIB::kspliceavailIndex.19 = INTEGER: 19
KSPLICE-UPTRACK-MIB::kspliceavailIndex.20 = INTEGER: 20
KSPLICE-UPTRACK-MIB::kspliceavailIndex.21 = INTEGER: 21
KSPLICE-UPTRACK-MIB::kspliceavailIndex.22 = INTEGER: 22
KSPLICE-UPTRACK-MIB::kspliceavailIndex.23 = INTEGER: 23
KSPLICE-UPTRACK-MIB::kspliceavailIndex.24 = INTEGER: 24
KSPLICE-UPTRACK-MIB::kspliceavailIndex.25 = INTEGER: 25
KSPLICE-UPTRACK-MIB::kspliceavailName.0 = STRING: [urvt04qt]
KSPLICE-UPTRACK-MIB::kspliceavailName.1 = STRING: [7jb2jb4r]
KSPLICE-UPTRACK-MIB::kspliceavailName.2 = STRING: [ot8lfoya]
KSPLICE-UPTRACK-MIB::kspliceavailName.3 = STRING: [f7pwmkto]
KSPLICE-UPTRACK-MIB::kspliceavailName.4 = STRING: [nxs9cwnt]
KSPLICE-UPTRACK-MIB::kspliceavailName.5 = STRING: [i8j4bdkr]
KSPLICE-UPTRACK-MIB::kspliceavailName.6 = STRING: [5jr9aom4]
KSPLICE-UPTRACK-MIB::kspliceavailName.7 = STRING: [iifdtqom]
KSPLICE-UPTRACK-MIB::kspliceavailName.8 = STRING: [6yagfyh1]
KSPLICE-UPTRACK-MIB::kspliceavailName.9 = STRING: [bqc6pn0b]
KSPLICE-UPTRACK-MIB::kspliceavailName.10 = STRING: [sy14t1rw]
KSPLICE-UPTRACK-MIB::kspliceavailName.11 = STRING: [ayo20d8s]
KSPLICE-UPTRACK-MIB::kspliceavailName.12 = STRING: [ur5of4nd]
KSPLICE-UPTRACK-MIB::kspliceavailName.13 = STRING: [ue4dtk2k]
KSPLICE-UPTRACK-MIB::kspliceavailName.14 = STRING: [wy52x339]
KSPLICE-UPTRACK-MIB::kspliceavailName.15 = STRING: [qsajn0ce]
KSPLICE-UPTRACK-MIB::kspliceavailName.16 = STRING: [5tx9tboo]
KSPLICE-UPTRACK-MIB::kspliceavailName.17 = STRING: [2nve5xek]
KSPLICE-UPTRACK-MIB::kspliceavailName.18 = STRING: [w7ik1ka8]
KSPLICE-UPTRACK-MIB::kspliceavailName.19 = STRING: [9ky2kan5]
KSPLICE-UPTRACK-MIB::kspliceavailName.20 = STRING: [zjr4ahvv]
KSPLICE-UPTRACK-MIB::kspliceavailName.21 = STRING: [j0mkxnwg]
KSPLICE-UPTRACK-MIB::kspliceavailName.22 = STRING: [mvu2clnk]
KSPLICE-UPTRACK-MIB::kspliceavailName.23 = STRING: [rc8yh417]
KSPLICE-UPTRACK-MIB::kspliceavailName.24 = STRING: [0zfhziax]
KSPLICE-UPTRACK-MIB::kspliceavailName.25 = STRING: [ns82h58y]
KSPLICE-UPTRACK-MIB::kspliceavailDesc.0 = STRING: Clear garbage data on the kernel stack when handling signals.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.1 = STRING: CVE-2011-1160: Information leak in tpm driver.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.2 = STRING: CVE-2011-1585: Authentication bypass in CIFS.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.3 = STRING: CVE-2011-2484: Denial of service in taskstats subsystem.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.4 = STRING: CVE-2011-2496: Local denial of service in mremap().
KSPLICE-UPTRACK-MIB::kspliceavailDesc.5 = STRING: CVE-2009-4067: Buffer overflow in Auerswald usb driver.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.6 = STRING: CVE-2011-2695: Off-by-one errors in the ext4 filesystem.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.7 = STRING: CVE-2011-2699: Predictable IPv6 fragment identification numbers.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.8 = STRING: CVE-2011-2723: Remote denial of service vulnerability in gro.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.9 = STRING: CVE-2011-2942: Regression in bridged ethernet devices.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.10 = STRING: CVE-2011-1833: Information disclosure in eCryptfs.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.11 = STRING: CVE-2011-3191: Memory corruption in CIFSFindNext.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.12 = STRING: CVE-2011-3209: Denial of Service in clock implementation.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.13 = STRING: CVE-2011-3188: Weak TCP sequence number generation.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.14 = STRING: CVE-2011-3363: Remote denial of service in cifs_mount.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.15 = STRING: CVE-2011-4110: Null pointer dereference in key subsystem.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.16 = STRING: CVE-2011-1162: Information leak in TPM driver.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.17 = STRING: CVE-2011-2494: Information leak in task/process statistics.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.18 = STRING: CVE-2011-2203: Null pointer dereference mounting HFS filesystems.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.19 = STRING: CVE-2011-4077: Buffer overflow in xfs_readlink.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.20 = STRING: CVE-2011-4132: Denial of service in Journaling Block Device layer.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.21 = STRING: CVE-2011-4330: Buffer overflow in HFS file name translation logic.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.22 = STRING: CVE-2011-4324: Denial of service vulnerability in NFSv4.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.23 = STRING: CVE-2011-4325: Denial of service in NFS direct-io.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.24 = STRING: CVE-2011-4348: Socking locking race in SCTP.
KSPLICE-UPTRACK-MIB::kspliceavailDesc.25 = STRING: CVE-2011-1020, CVE-2011-3637: Information leak, DoS in /proc.

And here's what happens after you run uptrack-upgrade -y, using Ksplice to fully upgrade your kernel:

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceStatus
KSPLICE-UPTRACK-MIB::kspliceStatus.0 = STRING: uptodate

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::kspliceAvailTable

$ snmpwalk -v 1 -c public -O e localhost KSPLICE-UPTRACK-MIB::ksplicePatchTable
KSPLICE-UPTRACK-MIB::ksplicepatchIndex.0 = INTEGER: 0
KSPLICE-UPTRACK-MIB::ksplicepatchIndex.1 = INTEGER: 1
KSPLICE-UPTRACK-MIB::ksplicepatchIndex.2 = INTEGER: 2
KSPLICE-UPTRACK-MIB::ksplicepatchIndex.3 = INTEGER: 3
[ . . . ]

The plugin displays that the kernel is now up-to-date.

SNMP and Enterprise Manager

Once the plugin is up and running, you can monitor your system using Oracle Enterprise Manager. Specifically, you can create an SNMP Adapter to allow Enterprise Manager Management Agents to query the status of Ksplice on each system with the plugin installed. Check out our documentation on SNMP support in Enterprise Manager to get started, including section 22.6, "About Metric Extensions".

This plugin is the first step toward deeper integration between Ksplice and Enterprise Manager, and we're excited about what is coming up. If you have any questions about the plugin or suggestions for future development, leave a comment below or drop us a line at ksplice-support_ww@oracle.com.

Monday Oct 14, 2013

Best Practice: Ksplice Deployment

Ksplice is designed to work in many different computing environments. Because upgrading the kernel on any running system is a hassle, we want you to be able to deploy Ksplice as widely as possible. As a consequence of this, there are a number of ways to set up a Ksplice installation for your Oracle Linux infrastructure.

As the first part of a series of articles on Ksplice best practices, we've published a guide to deploying Ksplice on the Oracle Technology Network. This guide explains the different types of Ksplice deployments, helps you choose which one is right for your environment, and shows you how to get started. We've also produced a video describing these deployment options, as well as explaining how you can use Ksplice to diagnose kernel issues.

Once you've decided on how you'll install Ksplice, you can find more information in the Ksplice User's Guide or through your normal Oracle support channels. Ksplice is available exclusively as part of Oracle Linux Premier Support.

Happy Ksplicing!


Was this best practice article helpful? Do you have a question about Ksplice, or an idea for a future best practice post? Let us know by leaving a comment below.

Thursday Aug 08, 2013

CVE-2013-2224: Denial of service in sendmsg().

In September 2012, CVE-2012-3552 was reported: an attacker could corrupt slab memory, leading to a denial of service or possible privilege escalation depending on the target machine's workload. The bug had originally been fixed in the mainline kernel in April 2011, with a fairly large patch for a security fix. The Red Hat backport of that fix introduced a new bug, assigned CVE-2013-2224, which again could allow a denial of service or possible privilege escalation. Rack911 & Tortoiselabs created a reproducer in June 2013 that allows an unprivileged user to cause a denial of service.

Red Hat has not yet released a kernel with this CVE fixed, but CentOS has released a custom kernel with the vendor fix for CentOS 6.

We have just released a Ksplice update to address this issue for releases 5 and 6 of Oracle Linux, Red Hat Enterprise Linux, CentOS, and Scientific Linux. We recommend that all users of Ksplice on these distributions install this zero-downtime update.

[Read More]

Friday May 31, 2013

CVE-2013-2850: Remote heap buffer overflow in iSCSI target subsystem.

We have just released a rebootless update to deal with a critical security vulnerability:

CVE-2013-2850: Remote heap buffer overflow in iSCSI target subsystem.

If an iSCSI target is configured and listening on the network, a remote attacker can corrupt heap memory and gain kernel code execution on the system.

As this vulnerability is exploitable by remote users, Ksplice is issuing an update for all affected kernels immediately.

This update was embargoed for release until today (May 30th), when the information regarding this vulnerability was made public. We are pushing updates for Ubuntu Precise, Quantal, and Raring, as well as for Debian Wheezy, Fedora 17, and Fedora 18. This bug was introduced in version 3.1 of the Linux kernel and so does not affect Oracle UEK kernels, or any Red Hat Enterprise Linux 6 derivatives or earlier.

We recommend Oracle Linux Premier Support for receiving rebootless kernel updates via Ksplice.

Wednesday May 15, 2013

Ksplice update for CVE-2013-2094

This is a 0-day local privilege escalation found by Tommi Rantala while fuzzing the kernel using Trinity. The underlying bug was patched in 3.8.10 in commit 8176cced706b5e5d15887584150764894e94e02f.

'spender' on Reddit has an interesting writeup on the details of this exploit.

We've already shipped this for Fedora 17 and 18 for the 3.8 kernel, and an update for Ubuntu 13.04 will ship as soon as Canonical releases their kernel.

We have a policy of only shipping updates that the vendor has shipped, but in this case we are shipping an update for this CVE for Oracle's UEK2 kernel early. Oracle is in the process of preparing an updated UEK2 kernel with the same fix, which will be released through the normal channels.

All customers with Oracle Linux Premier Support should use Ksplice to update their kernel as soon as possible.

[EDITED 2013-05-15]: We have now released an early update for Oracle RHCK 6, Red Hat Enterprise Linux 6, Scientific Linux 6, and CentOS 6.

[EDITED 2013-05-15]: We have released an early update for Wheezy. Additionally, Ubuntu Raring, Quantal and Precise have released their kernel, so we have released updates for them.

Tuesday Apr 02, 2013

Ksplice Inspector

With so many kernel updates released, it can be difficult to keep track. At Oracle, we monitor kernels on a daily basis and provide bug and security updates administrators can apply without a system reboot. To help out, the Ksplice team at Oracle has produced the Ksplice Inspector, a web tool to show you the updates Ksplice can apply to your kernel with zero downtime.

The Ksplice Inspector is freely available to everyone. If you're running any Ksplice supported kernel, whether it is Oracle's Unbreakable Enterprise Kernel, a Red Hat compatible kernel, or the kernel of one of our supported desktop distributions, visit https://www.ksplice.com/inspector, follow the instructions, and you'll see a list of all the available Ksplice updates for your kernel.

But what if you're running a system without a browser, or one without a GUI at all? We've got you covered: you can get the same information from our API through the command line. Just run the following command:

(uname -s; uname -m; uname -r; uname -v) | \
curl https://uptrack.api.ksplice.com/api/1/update-list/ \
-L -H "Accept: text/text" --data-binary @-

Once you've seen all the updates available for your kernel, you can quickly patch them all with Ksplice. If you're an Oracle Linux Premier Support customer, access to Ksplice is included with your subscription and available through the Unbreakable Linux Network. If you're running Red Hat Enterprise Linux and you would like to check it out, you can try Ksplice free for 30 days.
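
If you're already running Ksplice Uptrack, installing everything the Inspector lists takes a single command (run as root):

uptrack-upgrade -y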

Let us know what you think by commenting below or sending us an email at ksplice-support_ww@oracle.com.

Friday Nov 09, 2012

Introducing RedPatch

The Ksplice team is happy to announce the public availability of one of our git repositories, RedPatch. RedPatch contains the source for all of the changes Red Hat makes to their kernel, one commit per fix, and we've published it on oss.oracle.com/git. With RedPatch, you can access the broken-out patches using git, browse them online via gitweb, and freely redistribute the source under the terms of the GPL. This is the same policy we provide for Oracle Linux and the Unbreakable Enterprise Kernel (UEK). Users can freely access the source, view the commit logs and easily identify the changes that are relevant to their environments.
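
For example, to start exploring the broken-out patches locally (the clone URL below is illustrative; see oss.oracle.com/git for the exact path):

$ git clone git://oss.oracle.com/git/redpatch.git
$ cd redpatch
$ git log --oneline | head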

To understand why we've created this project we'll need a little history. In early 2011, Red Hat changed how they released their kernel source, going from a tarball that had individual patch files to shipping the kernel source as one giant tarball with a single patch for all Red Hat-introduced changes. For most people who work in the kernel this is merely an inconvenience; driver developers and other out-of-kernel module developers can see the end result to make sure their module still performs as expected.

For Ksplice, we build individual updates for each change and rely on source patches that are broken-out, not a giant tarball. Otherwise, we wouldn’t be able to take the right patches to create individual updates for each fix, and to skip over the noise — like a change that speeds up bootup — which is unnecessary for an already-running system. We’ve been taking the monolithic Red Hat patch tarball and breaking it into smaller commits internally ever since they introduced this change.

At Oracle, we feel everyone in the Linux community can benefit from the work we already do to get our jobs done, so now we’re sharing these broken-out patches publicly. In addition to RedPatch, the complete source code for Oracle Linux and the Oracle Unbreakable Enterprise Kernel (UEK) is available from both ULN and our public yum server, including all security errata.

Check out RedPatch and subscribe to redpatch-users@oss.oracle.com for discussion about the project. Also, drop us a line and let us know how you're using RedPatch!

Tuesday Oct 18, 2011

The Ksplice Pointer Challenge

Back when Ksplice was just a research project at MIT, we all spent a lot of time around the student computing group, SIPB. While there, several precocious undergrads kept talking about how excited they were to take 6.828, MIT's operating systems class.

"You really need to understand pointers for this class," we cautioned them. "Reread K&R Chapter 5, again." Of course, they insisted that they understood pointers and didn't need to. So we devised a test.

Ladies and gentlemen, I hereby do officially present the Ksplice Pointer Challenge, to be answered without the use of a computer:

What does this program print?

#include <stdio.h>
int main() {
  int x[5];
  printf("%p\n", x);
  printf("%p\n", x+1);
  printf("%p\n", &x);
  printf("%p\n", &x+1);
  return 0;
}

This looks simple, but it captures a surprising amount of complexity. Let's break it down.

To make this concrete, let's assume that x is stored at address 0x7fffdfbf7f00 (this is a 64-bit system). I'd encourage you to think about what each line should output before reading on to the answer.

printf("%p\n", x);
What will this print?

Well, x is an array, right? You're no stranger to array syntax: x[i] accesses the ith element of x.

If we search back in the depths of our memory, we remember that x[i] can be rewritten as *(x+i). For that to work, x must be the memory location of the start of the array.

Result: printf("%p\n", x) prints 0x7fffdfbf7f00. Alright.

printf("%p\n", x+1);
What will this print?

So, x is 0x7fffdfbf7f00, and therefore x+1 should be 0x7fffdfbf7f01, right?

You're not fooled. You remember that in C, pointer arithmetic is special and magical. If you have a pointer p to an int, p+1 actually adds sizeof(int) to p. It turns out that we need this behavior if *(x+i) is properly going to end up pointing us at the right place -- we need to move over enough to pass one entry in the array. In this case, sizeof(int) is 4.

Result: printf("%p\n", x+1) prints 0x7fffdfbf7f04. So far so good.

printf("%p\n", &x);
What will this print?

Well, let's see. & basically means "the address of", so this is like asking "Where does x live in memory?" We answered that earlier, didn't we? x lives at 0x7fffdfbf7f00, so that's what this should print.

But hang on a second... if &x is 0x7fffdfbf7f00, that means that it lives at 0x7fffdfbf7f00. But when we print x, we also get 0x7fffdfbf7f00. So x == &x.

How can that possibly work? If x is a pointer that lives at 0x7fffdfbf7f00, and also points to 0x7fffdfbf7f00, where is the actual array stored?

Thinking about that, I draw a picture like this:


That can't be right.

So what's really going on here? Well, first off, anyone who ever told you that a pointer and an array were the same thing was lying to you. That's our fallacy here. If x were a pointer, and x == &x, then yes, we would have something like the picture above. But x isn't a pointer -- x is an array!

And it turns out that in certain situations, an array can automatically "decay" into a pointer. Into &x[0], to be precise. That's what's going on in examples 1 and 2. But not here. So &x does indeed print the address of x.

Result: printf("%p\n", &x) prints 0x7fffdfbf7f00.

Aside: what is the type of &x[0]? Well, x[0] is an int, so &x[0] is "pointer to int". That feels right.

printf("%p\n", &x+1);
What will this print?

Ok, now for the coup de grace. x may be an array, but &x is definitely a pointer. So what's &x+1?

First, another aside: what is the type of &x? Well... &x is a pointer to an array of 5 ints. How would you declare something like that?

Let's fire up cdecl and find out:

cdecl> declare y as array 5 of int;
int y[5]
cdecl> declare y as pointer to array 5 of int;
int (*y)[5]

Confusing syntax, but it works:
int (*y)[5] = &x; compiles without error and works the way you'd expect.

But back to the question at hand. Pointer arithmetic tells us that &x+1 is going to be the address of x + sizeof(x). What's sizeof(x)? Well, it's an array of 5 ints. On this system, each int is 4 bytes, so it should be 20 bytes, or 0x14.

Result: &x+1 prints 0x7fffdfbf7f14.

And thus concludes the Ksplice pointer challenge.

What's the takeaway? Arrays are not pointers (though they sometimes pretend to be!). More generally, C is subtle. Oh, and 6.828 students, if you're having trouble with Lab 5, it's probably because of a bug in your Lab 2.


P.S. If you're interested in hacking on low-level systems at a place where your backwards-and-forwards knowledge of C semantics will be more than just an awesome party trick, we're looking to hire kernel hackers for the Ksplice team.

We're based in beautiful Cambridge, Mass., though working remotely is definitely an option. Send me an email at waseem.daher@oracle.com with a resume and/or a github link if you're interested!

Monday Jun 27, 2011

Building a physical CPU load meter

I built this analog CPU load meter for my dev workstation:

Physical CPU load meter

All I did was drill a few holes into the CPU and probe the power supply lines...

Okay, I lied. This is actually a fun project that would make a great intro to embedded electronics, or a quick afternoon hack for someone with a bit of experience.

The parts

The main components are:

  • Current meter: I got this at MIT Swapfest. The scale printed on the face is misleading: the meter itself measures only about 600 microamps in each direction. (It's designed for use with a circuit like this one). We can determine the actual current scale by connecting (in series) the analog meter, a variable resistor, and a digital multimeter, and driving them from a 5 volt power supply. This lets us adjust and reliably measure the current flowing through the analog meter.

  • Arduino: This little board has a 16 MHz processor, with direct control of a few dozen input/output pins, and a USB serial interface to a computer. In our project, it will take commands over USB and produce the appropriate amount of current to drive the meter. We're not even using most of the capabilities of the Arduino, but it's hard to beat as a platform for rapid development.

  • Resistor: The Arduino board is powered over USB; its output pins produce 5 volts for a logic 'high'. We want this 5 volt potential to push 600 microamps of current through the meter, according to the earlier measurement. Using Ohm's law we can calculate that we'll need a resistance of about 8.3 kilohms. Or you can just measure the variable resistor from earlier.

We'll also use some wire, solder, and tape.

Building it

The resistor goes in series with the meter. I just soldered it directly to the back:

Back of the meter

Some tape over these components prevents them from shorting against any of the various junk on my desk. Those wires run to the Arduino, hidden behind my monitor, which is connected to the computer by USB:

The Arduino

That's it for hardware!

Code for the Arduino

The Arduino IDE will compile code written in a C-like language and upload it to the Arduino board over USB. Here's our program:

#define DIRECTION 2
#define MAGNITUDE 3

void setup() {
    Serial.begin(57600);
    pinMode(DIRECTION, OUTPUT);
    pinMode(MAGNITUDE, OUTPUT);
}

void loop() {
    int x = Serial.read();
    if (x == -1)
        return;

    if (x < 128) {
        digitalWrite(DIRECTION, LOW);
        analogWrite (MAGNITUDE, 2*(127 - x));
    } else {
        digitalWrite(DIRECTION, HIGH);
        analogWrite (MAGNITUDE, 255 - 2*(x - 128));
    }
}

When it turns on, the Arduino will execute setup() once, and then call loop() over and over, forever. On each iteration, we try to read a byte from the serial port. A value of -1 indicates that no byte is available, so we return and try again a moment later. Otherwise, we translate a byte value between 0 and 255 into a meter deflection between −600 and 600 microamps.

Pins 0 and 1 are used for serial communication, so I connected the meter to pins 2 and 3, and named them DIRECTION and MAGNITUDE respectively. When we call analogWrite on the MAGNITUDE pin with a value between 0 and 255, we get a proportional voltage between 0 and 5 volts. Actually, the Arduino fakes this by alternating between 0 and 5 volts very rapidly, but our meter is a slow mechanical object and won't know the difference.

Suppose the MAGNITUDE pin is at some intermediate voltage between 0 and 5 volts. If the DIRECTION pin is low (0 V), conventional current will flow from MAGNITUDE to DIRECTION through the meter. If we set DIRECTION high (5 V), current will flow from DIRECTION to MAGNITUDE. So we can send current through the meter in either direction, and we can control the amount of current by controlling the effective voltage at MAGNITUDE. This is all we need to make the meter display whatever reading we want.

Code for the Linux host

On Linux we can get CPU load information from the proc special filesystem:

keegan@lyle$ head -n 1 /proc/stat
cpu  965348 22839 479136 88577774 104454 5259 24633 0 0

These numbers tell us how much time the system's CPUs have spent in each of several states:

  1. user: running normal user processes
  2. nice: running user processes of low priority
  3. system: running kernel code, often on behalf of user processes
  4. idle: doing nothing because all processes are sleeping
  5. iowait: doing nothing because all runnable processes are waiting on I/O devices
  6. irq: handling asynchronous events from hardware
  7. softirq: performing tasks deferred by irq handlers
  8. steal: not running, because we're in a virtual machine and some other VM is using the physical CPU
  9. guest: acting as the host for a running virtual machine

The numbers in /proc/stat are cumulative totals since boot, measured in arbitrary time units. We can read the file twice and subtract, in order to get a measure of where CPU time was spent recently. Then we'll use the fraction of time spent in states other than idle as a measure of CPU load, and send this to the Arduino.

We'll do all this with a small Python script. The pySerial library lets us talk to the Arduino over USB serial. We'll configure it for 57,600 bits per second, the same rate specified in the Arduino's setup() function. Here's the code:

#!/usr/bin/env python

import serial
import time

# Open the USB serial connection to the Arduino, matching the 57600
# baud rate set in the Arduino's setup() function.
port = serial.Serial('/dev/ttyUSB0', 57600)

old = None
while True:
    # Read the cumulative per-state CPU times, dropping the "cpu" label.
    with open('/proc/stat') as stat:
        new = map(float, stat.readline().strip().split()[1:])
    if old is not None:
        # Subtract the previous sample to see where time was spent recently.
        diff = [n - o for n, o in zip(new, old)]
        idle = diff[3] / sum(diff)
        # Send the non-idle fraction to the Arduino as a byte from 0 to 255.
        port.write(chr(int(255 * (1 - idle))))
    old = new
    time.sleep(0.25)

That's it!

That's all it takes to make a physical, analog CPU meter. It's been done before and will be done again, but we're interested in what you'd do (or have already done!) with the concept. You could measure website hits, or load across a whole cluster, or your profits from trading Bitcoins. One standard Arduino can run at least six meters of this type (being the number of pins which support analogWrite), and a number of switches, knobs, buzzers, and blinky lights besides. If your server room has a sweet control panel, we'd love to see a picture!

~keegan

Saturday May 07, 2011

Improving your social life with git

I've used RCS, CVS, Subversion, and Perforce. Then I discovered distributed version control systems. Specifically, I discovered git. Lightweight branching? Clean histories? And you can work offline? It seemed to be too good to be true.

But one important question remained unanswered: Can git help improve my social life? Today, we present to you the astonishing results: why yes, yes it can. Introducing gitionary. The brainchild of Liz Denys and Nelson Elhage, it's what you get when you mash up Pictionary, git, and some of your nerdiest friends.

Contestants get randomly assigned git commands and are asked to illustrate them. They put this bold idea to the test quite some time ago in an experiment/party known only as git drunk (yes, some alcohol may have been involved), and we've reproduced selected results below. Each drawing is signed with the artist's username, and the time it took our studio audience to correctly guess the git command.

Suffice it to say, git is complicated. Conceptually, git is best modeled as a set of transformations on a directed acyclic graph, and sometimes it's easiest to illustrate it that way, as with this illustration of git rebase:

On other occasions, a more literal interpretation works best:

But not always:

A little creativity never hurts, though, which is why this is my personal favorite from the evening:

You can find more details (and the full photoset from the evening) at Liz's writeup of the event. Download the gitionary cards, print them double-sided on card stock, and send us pictures of your own gitionary parties!

~wdaher

Thursday Mar 31, 2011

Security Advisory: Plumber Injection Attack in Bowser's Castle

Advisory Name:
Plumber Injection Attack in Bowser's Castle
Release Date:
2011-04-01
Application:
Bowser's Castle
Affected Versions:
Super Mario Bros., Super Mario Bros.: The Lost Levels
Identifier:
SMB-1985-0001
Advisory URL:
https://blogs.oracle.com/ksplice/entry/security_advisory_plumber_injection_attack

Vulnerability Overview


Multiple versions of Bowser's Castle are vulnerable to a plumber injection attack. An Italian plumber could exploit this bug to bypass security measures (walk through walls) in order to rescue Peach, to defeat Bowser, or for unspecified other impact.

Exploit


This vulnerability is demonstrated by "happylee-supermariobros,warped.fm2". Attacks using this exploit have been observed in the wild, and multiple other exploits are publicly available.

Affected Versions


Versions of Bowser's Castle as shipped in Super Mario Bros. and Super Mario Bros.: The Lost Levels are affected.

Solution


An independently developed patch is available:

--- a/smb.asm   1985-09-13 12:00:00.000000000 +0900
+++ b/smb.asm   2011-04-01 12:00:00.000000000 -0400
@@ -12009,12015 +12009,12015 @@
         ldy $04
         cpy #$05
         bcc *+$09
-        lda $45
+        lda #$01
         sta $00
         jmp $df4b
         jsr $dec4

A binary hot patch to apply the update to an existing version is also available.

All users are advised to upgrade.

Mitigations


For users unable to apply the recommended fix, a number of mitigations are possible to reduce the impact of the vulnerability.

NOTE THAT NO MITIGATION IS BELIEVED TO BE COMPLETELY EFFECTIVE.

Potential mitigations include:

Credit


The vulnerability was originally discovered by Mario and Luigi, of Mario Bros. Security Research.

The provided patch and this advisory were prepared by Lakitu Cloud Security, Inc. The hot patch was developed in collaboration with Ksplice, Inc.

Product Overview


Bowser's Castle is King Bowser's home and the base of operations for the Koopa Troop. Bowser's Castle is the final defense against assaults by Mario to kidnap Princess Peach, and is guarded by Bowser's most powerful minions.

~nelhage

Wednesday Mar 16, 2011

disown, zombie children, and the uninterruptible sleep

PID 1 Riding, by Albrecht Dürer

It's the end of the day on Friday. On your laptop, in an ssh session on a work machine, you check on long.sh, which has been running all day and has another 8 or 9 hours to go. You start to close your laptop.

You freeze for a second and groan.

This was supposed to be running under a screen session. You know that if you kill the ssh connection, that'll also kill long.sh. What are you going to do? Leave your laptop for the weekend? Kill the job, losing the last 8 hours of work?

You think about what long.sh does for a minute, and breathe a sigh of relief. The output is written to a file, so you don't care about terminal output. This means you can use disown.

How does this little shell built-in let your jobs finish even when you kill the parent process (the shell inside the ssh connection)?

Dissecting disown

As we'll see, disown synthesizes 3 big UNIX concepts: signals, process states, and job control.

The point of disowning a process is that it will continue to run even when you exit the shell that spawned it. Getting this to work requires a prelude. The steps are:

  1. suspend the process with Ctrl-Z.
  2. background it with bg.
  3. disown the job.

What does each of these steps accomplish?

First, here's a summary of the states that a process can be in, from the ps man page:

PROCESS STATE CODES
       Here are the different values that the s, stat and state output specifiers (header "STAT" or "S")
       will display to describe the state of a process.
       D    Uninterruptible sleep (usually IO)
       R    Running or runnable (on run queue)
       S    Interruptible sleep (waiting for an event to complete)
       T    Stopped, either by a job control signal or because it is being traced.
       W    paging (not valid since the 2.6.xx kernel)
       X    dead (should never be seen)
       Z    Defunct ("zombie") process, terminated but not reaped by its parent.

       For BSD formats and when the stat keyword is used, additional characters may be displayed:
       <    high-priority (not nice to other users)
       N    low-priority (nice to other users)
       L    has pages locked into memory (for real-time and custom IO)
       s    is a session leader
       l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
       +    is in the foreground process group

And here is a transcript of the steps to disown long.sh, with the commands run in shell 1 and, after each step, some useful ps output from a second monitoring shell: in particular the parent process id (PPID), what process state our long job is in (STAT), and the controlling terminal (TT):
1. Start program
$ sh long.sh
$ ps -o pid,ppid,stat,tty,cmd $(pgrep -f long)
  PID  PPID STAT TT       CMD
26298 26145 S+   pts/0    sh long.sh
2. Suspend program with Ctl-z
^Z
[1]+  Stopped     sh long.sh
$ ps -o pid,ppid,stat,tty,cmd $(pgrep -f long)
  PID  PPID STAT TT       CMD
26298 26145 T    pts/0    sh long.sh
3. Resume program in background
$ bg
[1]+ sh long.sh &
$ ps -o pid,ppid,stat,tty,cmd $(pgrep -f long)
  PID  PPID STAT TT       CMD
26298 26145 S    pts/0    sh long.sh
4. disown job 1, our program
$ disown %1
$ ps -o pid,ppid,stat,tty,cmd $(pgrep -f long)
  PID  PPID STAT TT       CMD
26298 26145 S    pts/0    sh long.sh
5. Exit the shell
$ exit
logout
$ ps -o pid,ppid,stat,tty,cmd $(pgrep -f long)
  PID  PPID STAT TT       CMD
26298     1 S    ?        sh long.sh

Putting this information together:

  1. When we run long.sh from the command line, its parent is the shell (PID 26145 in this example). Even though it looks like it is running as we watch it in the terminal, it mostly isn't; long.sh is waiting on some resource or event, so it is in process state S for interruptible sleep. It is in fact in the foreground, so it also gets a +.
  2. First, we suspend the program with Ctrl-Z. By "suspend", we mean send it the SIGTSTP signal, which is like SIGSTOP except that you can install your own signal handler for it, or ignore it. We see proof in the state change: it's now in T for stopped.
  3. Next, bg sets our process running again, but in the background, so we get the S for interruptible sleep, but no +.
  4. Finally, we can use disown to remove the process from the jobs list that our shell maintains. Our process has to be active when it is removed from the list, or it'll be killed when we kill the parent shell, which is why we needed the bg step.
  5. When we exit the shell, we are sending it a SIGHUP, which it propagates to all children in the jobs table**. By default, a SIGHUP will terminate a process. Because we removed our job from the jobs table, it doesn't get the SIGHUP and keeps on running (STAT S). However, since its parent the shell died, and the shell was the session leader in charge of the controlling tty, it doesn't have a tty anymore (TT ?). Additionally, our long job needs a new parent, so init, with PID 1, becomes the new parent process.
**This is not always true, as it turns out. In the bash shell, for example, there is a huponexit shell option. If this option is disabled, a SIGHUP to the shell isn't propagated to the children. This means if you have a backgrounded, active process (you followed steps 1, 2, and 3 above, or you started the process backgrounded with "&") and you exit the shell, you don't have to use disown for the process to keep running. You can check or toggle the huponexit shell option with the shopt shell built-in.
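
For example, a quick sketch in bash:

$ shopt huponexit          # check the current setting
huponexit      off
$ shopt -s huponexit       # enable: exiting the shell will now SIGHUP background jobs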

And that is disown in a nutshell.

What else can we learn about process states?

Dissecting disown presents enough interesting tangents about signals, process states, and job control for a small novel. Focusing on process states for this post, here are a few such tangents:

1. There are a lot of process states and modifiers. We saw some interruptible sleeps and suspended processes with disown, but what states are most common?

Using my laptop as a data source and taking advantage of ps format specifiers, we can get counts for the different process states:

jesstess@aja:~$ ps -e h -o stat | sort | uniq -c | sort -rn
     90 S
     31 Sl
     17 Ss
      9 Ss+
      8 Ssl
      4 S<
      3 S+
      2 SNl
      1 S<sl
      1 S<s
      1 SN
      1 SLl
      1 R+

So the vast majority are in an interruptible sleep (S), and a few processes are extra nice (N) and extra mean (<).

We can drill down on process "niceness", or scheduling priority, with the ni format specifier to ps:

jesstess@aja:~$ ps -e h -o ni | sort -n | uniq -c
      1 -11
      2  -5
      1  -4
      2  -2
      4   -
    156   0
      1   1
      1   5
      1  10

The numbers range from 19 (super friendly, low scheduling priority) to -20 (a total bully, high scheduling priority). The 6 processes with negative numbers are the 6 with a < process state modifier in the "ps -e h -o stat" output, and the 3 with positive numbers have the Ns. Most processes don't run under a special scheduling priority.

Why is almost nothing actually running?

In the "ps -e h -o stat" output above, only 1 process was marked R, running or runnable. This is a multi-processor machine, and there are over 150 other processes, so why isn't something running on the other processor?

The answer is that on an unloaded system, most processes really are waiting on an event or resource, so they can't run. On the laptop where I ran these tests, uptime tells us that we have a load average under 1:

jesstess@aja:~$ uptime
 13:09:10 up 16 days, 14:09,  5 users,  load average: 0.92, 0.87, 0.82

So we'd only expect to see 1 process in the R state at any given time for that load.

If we hop over to a more loaded machine -- a shell machine at MIT -- things are a little more interesting:

dr-wily:~> ps -e -o stat,cmd | awk '{if ($1 ~/R/) print}'
R+   /mit/barnowl/arch/i386_deb50/bin/barnowl.real.zephyr3
R+   ps -e -o stat,cmd
R+   w
dr-wily:~> uptime
 23:23:16 up 22 days, 20:09, 132 users,  load average: 3.01, 3.66, 3.43
dr-wily:~> grep processor /proc/cpuinfo
processor	: 0
processor	: 1
processor	: 2
processor	: 3

The machine has 4 processors. On average, 3 or 4 processors have processes running (in the R state). To get a sense of how the running processes change over time, throw the ps line under watch:

watch -n 1 "ps -e -o stat,cmd | awk '{if (\$1 ~/R/) print}'"

We get something like:

watching the changing output of ps

2. What about the zombies?

Noticeably absent in the process state summaries above are zombie processes (STAT Z) and processes in uninterruptible sleep (STAT D).

A process becomes a zombie when it has completed execution but hasn't been reaped by its parent. If a program produces long-lived zombies, this is usually a bug; zombies are undesirable because they take up process IDs, which are a limited resource.
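
Just how limited is easy to check: the kernel's default ceiling on PIDs lives in procfs (and can be raised there):

$ cat /proc/sys/kernel/pid_max
32768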

I had to dig around a bit to find real examples of zombies. The winners were old barnowl zephyr clients (zephyr is a popular instant messaging system at MIT):

jesstess@linerva:~$ ps -e h -o stat,cmd | awk '{if ($1 ~/Z/) print}'
Z+   [barnowl] <defunct>
Z+   [barnowl] <defunct>

However, since all it takes to produce a zombie is a child exiting without the parent reaping it, it's easy to construct our own zombies of limited duration:

jesstess@aja:~$ cat zombie.c 
#include <sys/types.h>
#include <unistd.h>

int main () {
    pid_t child_pid = fork();
    if (child_pid > 0) {
        /* Parent: sleep while the child exits unreaped. */
        sleep(60);
    }
    /* Child: exit immediately, becoming a zombie until reaped. */
    return 0;
}
jesstess@aja:~$ gcc -o zombie zombie.c
jesstess@aja:~$ ./zombie
^Z
[1]+  Stopped                 ./zombie
jesstess@aja:~$ ps -o stat,cmd $(pgrep -f zombie)
T    ./zombie
Z    [zombie] <defunct>

When you run this program, the parent dies after 60 seconds, init becomes the zombie child's new parent, and init quickly reaps the child by making a wait system call on the child's PID, which removes it from the system process table.

3. What about the uninterruptible sleeps?

A process is put in an uninterruptible sleep (STAT D) when it needs to wait on something (typically I/O) and shouldn't be handling signals while waiting. This means you can't kill it, because all kill does is send it signals. This might happen in the real world if you unplug your NFS server while other machines have open network connections to it.
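
Once you have a process stuck in D (like the example below), you can see for yourself that signals are useless against it -- even SIGKILL just sits pending until the process leaves the state (the PID here is illustrative):

$ kill -9 1972
$ ps -o stat,cmd -p 1972
STAT CMD
D+   ./uninterruptible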

We can create our own uninterruptible processes of limited duration by taking advantage of the vfork system call. vfork is like fork, except the address space is not copied from the parent into the child, in anticipation of an exec which would just throw out the copied data. Conveniently for us, when you vfork the parent waits uninterruptibly (by way of wait_for_completion) on the child's exec or exit:

jesstess@aja:~$ cat uninterruptible.c 
#include <unistd.h>

int main() {
    /* The child sleeps interruptibly (state S); the parent waits
       uninterruptibly (state D) until the child execs or exits. */
    vfork();
    sleep(60);
    return 0;
}
jesstess@aja:~$ gcc -o uninterruptible uninterruptible.c
jesstess@aja:~$ echo $$
13291
jesstess@aja:~$ ./uninterruptible

and in another shell:

jesstess@aja:~$ ps -o ppid,pid,stat,cmd $(pgrep -f uninterruptible)
 PPID   PID STAT CMD
13291  1972 D+   ./uninterruptible
 1972  1973 S+   ./uninterruptible

We see the child (PID 1973, PPID 1972) in an interruptible sleep and the parent (PID 1972, PPID 13291 -- the shell) in an uninterruptible sleep while it waits for 60 seconds on the child.

One neat (mischievous?) thing about this program is that processes in an uninterruptible sleep contribute to the load average for a machine. So you could run it 100 times to temporarily give a machine a load average elevated by 100, as reported by uptime.
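
For example (a sketch; best saved for a machine nobody else is using -- the one-minute load average climbs toward 100 over the following minute):

$ for i in $(seq 100); do ./uninterruptible & done
$ uptime
 13:27:42 up 16 days, 14:27,  5 users,  load average: 84.61, 38.04, 15.20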

It's a family affair

Signals, process states, and job control offer a wealth of opportunities for exploration on a Linux system: we've already disowned children, killed parents, witnessed adoption (by init), crafted zombie children, and more. If this post inspires fun tangents or fond memories, please share in the comments!


*Albrecht had some help from Adam and Photoshop Elements. Larger version here.
Props to Nelson for his boundless supply of sysadmin party tricks, which includes this vfork example.

~jesstess

Monday Feb 21, 2011

Mapping your network with nmap

If you run a computer network, be it home WiFi or a global enterprise system, you need a way to investigate the machines connected to your network. When ping and traceroute won't cut it, you need a port scanner.

nmap is the port scanner. It's a powerful, sophisticated tool, not to mention a movie star. The documentation on nmap is voluminous: there's an entire book, with a free online edition, as well as a detailed manpage. In this post I'll show you just a few of the cool things nmap can do.

The law and ethics of port scanning are complex. A network scan can be detected by humans or automated systems, and treated as a malicious act, resulting in real costs to the target. Depending on the options you choose, the traffic generated by nmap can range from "completely innocuous" to "watch out for admins with baseball bats". A safe rule is to avoid scanning any network without the explicit permission of its administrators — better yet if that's you.

You'll need root privileges on the scanning system to run most interesting nmap commands, because nmap likes to bypass the standard network stack when synthesizing esoteric packets.

A firm handshake

Let's start by scanning my home network for web and SSH servers:

root@lyle# nmap -sS -p22,80 192.168.1.0/24
Nmap scan report for 192.168.1.1
PORT   STATE    SERVICE
22/tcp filtered ssh
80/tcp open     http

Nmap scan report for 192.168.1.102
PORT   STATE    SERVICE
22/tcp filtered ssh
80/tcp filtered http

Nmap scan report for 192.168.1.103
PORT   STATE  SERVICE
22/tcp open   ssh
80/tcp closed http

Nmap done: 256 IP addresses (3 hosts up) scanned in 6.05 seconds

We use -p22,80 to ask for a scan of TCP ports 22 and 80, the most popular ports for SSH and web servers respectively. If you don't specify a -p option, nmap will scan the 1,000 most commonly-used ports. You can give a port range like -p1-5000, or even use -p- to scan all ports, but your scan will take longer.

We describe the subnet to scan using CIDR notation. We could equivalently write 192.168.1.1-254.

The option -sS requests a TCP SYN scan. nmap will start a TCP handshake by sending a SYN packet. Then it waits for a response. If the target replies with SYN/ACK, then some program is accepting our connection. A well-behaved client should respond with ACK, but nmap will simply record an open port and move on. This makes an nmap SYN scan both faster and more stealthy than a normal call to connect().

If the target replies with RST, then there's no service on that port, and nmap will record it as closed. Or we might not get a response at all. Perhaps a firewall is blocking our traffic, or the target host simply doesn't exist. In that case the port state is recorded as filtered after nmap times out.

You can scan UDP ports by passing -sU. There's one important difference from TCP: Since UDP is connectionless, there's no particular response required from an open port. Therefore nmap may show UDP ports in the ambiguous state open|filtered, unless you can prod the target application into sending you data (see below).

To save time, nmap tries to confirm that a target exists before performing a full scan. By default it will send ICMP echo (the ubiquitous "ping") as well as TCP SYN and ACK packets. You can use the -P family of options to customize this host-discovery phase.

Weird packets

nmap has the ability to generate all sorts of invalid, useless, or just plain weird network traffic. You can send a TCP packet with no flags at all (null scan, -sN) or one that's lit up "like a Christmas tree" (Xmas scan, -sX). You can chop your packets into little fragments (--mtu) or send an invalid checksum (--badsum). As a network administrator, you should know if the bad guys can confuse your security systems by sending weird packets. As the manpage advises, "Let your creative juices flow".

There's a second benefit to sending weird traffic: We can identify the target's operating system by seeing how it responds to unusual situations. nmap will perform this OS detection if you specify the -O flag:

root@lyle# nmap -sS -O 192.168.1.0/24
Nmap scan report for 192.168.1.1
Not shown: 998 filtered ports
PORT     STATE  SERVICE
23/tcp   closed telnet
80/tcp   open   http
MAC Address: 00:1C:10:33:6B:99 (Cisco-Linksys)
Device type: WAP|broadband router
Running: Linksys embedded, Netgear embedded, Netgear VxWorks 5.X
...
Nmap scan report for 192.168.1.100
Not shown: 998 filtered ports
PORT      STATE SERVICE
139/tcp   open  netbios-ssn
445/tcp   open  microsoft-ds
MAC Address: 00:1F:3A:7F:7C:26 (Hon Hai Precision Ind.Co.)
Warning: OSScan results may be unreliable because we could not find
  at least 1 open and 1 closed port
Device type: general purpose
Running (JUST GUESSING) : Microsoft Windows Vista|2008|7 (98%)
...
Nmap scan report for 192.168.1.104
All 1000 scanned ports on 192.168.1.104 are closed
MAC Address: 7C:61:93:53:9F:E5 (Unknown)
Too many fingerprints match this host to give specific OS details
TCP/IP fingerprint:
SCAN(V=5.21%OT=%CT=1%CU=42921%PV=Y%DS=1%DC=D%G=N%M=7C6193%TM=4D6079CD)
SEQ(CI=Z%II=I)
T5(R=Y%DF=Y%T=40%W=0%S=Z%A=S+%F=AR%O=%RD=0%Q=)
...

Since the first target has both an open and a closed port, nmap has many protocol corner cases to explore, and it easily recognizes a Linksys home router. With the second target, there's no port in the closed state, so nmap isn't as confident. It guesses a Windows OS, which seems especially plausible given the open NetBIOS ports. In the last case nmap has no clue, and gives us some raw findings only. If you know the OS of the target, you can contribute this fingerprint and help make nmap even better.

Behind the port

It's all well and good to discover that port 1234 is open, but what's actually listening there? nmap has a version detection subsystem that will spam a host's open ports with data in hopes of eliciting a response. Let's pass -sV to try this out:

root@lyle# nmap -sS -sV 192.168.1.117
Nmap scan report for 192.168.1.117
Not shown: 998 closed ports
PORT     STATE SERVICE VERSION
443/tcp  open  ssh     OpenSSH 5.5p1 Debian 6 (protocol 2.0)
8888/tcp open  http    thttpd 2.25b 29dec2003

nmap correctly spotted an HTTP server on non-standard port 8888. The SSH server on port 443 (usually HTTPS) is also interesting. I find this setup useful when connecting from behind a restrictive outbound firewall. But I've also had network admins send me worried emails, thinking my machine has been compromised.
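
If you run a similar setup, reaching the SSH server from behind that firewall is just a matter of pointing your client at the non-standard port (the user name here is illustrative):

$ ssh -p 443 alice@192.168.1.117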

nmap also gives us the exact server software versions, straight from the server's own responses. This is a great way to quickly audit your network for any out-of-date, insecure servers.

Since a version scan involves sending application-level probes, it's more intrusive and can cause more trouble. From the book:

In the nmap-service-probes included with Nmap the only ports excluded are TCP port 9100 through 9107. These are common ports for printers to listen on and they often print any data sent to them. So a version detection scan can cause them to print many pages full of probes that Nmap sends, such as SunRPC requests, help statements, and X11 probes.

This behavior is often undesirable, especially when a scan is meant to be stealthy.

Trusting the source

It's a common (if questionable) practice for servers or firewalls to trust certain traffic based on where it appears to come from. nmap gives you a variety of tools for mapping these trust relationships. For example, some firewalls have special rules for traffic originating on ports 53, 67, or 20. You can set the source port for nmap's TCP and UDP packets by passing --source-port.
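
For example, to repeat our earlier SYN scan with probes that appear to originate from the DNS port (whether this helps depends entirely on the target's firewall rules):

root@lyle# nmap -sS --source-port 53 -p22,80 192.168.1.0/24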

You can also spoof your source IP address using -S, and the target's responses will go to that fake address. This normally means that nmap won't see any results. But these responses can affect the unwitting source machine's IP protocol state in a way that nmap can observe indirectly. You can read about nmap's TCP idle scan for more details on this extremely clever technique. Imagine making any machine on the Internet — or your private network — port-scan any other machine, while you collect the results in secret. Can you use this to map out trust relationships in your network? Could an attacker?
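
nmap exposes the idle scan through the -sI option, which names the unwitting "zombie" host to bounce probes off of (both host names here are illustrative):

root@lyle# nmap -sI zombie.example.com -p22,80 target.example.com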

Bells and whistles

So that's an overview of a few cool nmap features. There's a lot we haven't covered, such as performance tuning, packet traces, or nmap's useful output modes like XML or ScRipT KIdd|3. There's even a full scripting engine with hundreds of useful plugins written in Lua.

~keegan

Tuesday Feb 15, 2011

Happy Birthday Ksplice Uptrack!

One year ago, we announced the general availability of Ksplice Uptrack, a subscription service for rebootless kernel updates on Linux. Since then, a lot has happened!

Adoption

Over 600 companies have deployed Ksplice Uptrack on more than 100,000 production systems, on all 7 continents (Antarctica was the last hold-out). More than 2 million rebootless updates have been installed!

Ksplice at the South Pole

You see the greatest system administration time and cost savings when you use Ksplice Uptrack on all of your machines, and we've designed and priced Uptrack with this large-scale usage in mind. We've been very happy to see that our customers share this view: most of our customers either rolled out Ksplice across the board upon sign-up or have substantially expanded their usage after first purchasing for a particular environment.

We've introduced several features this year to make managing large Uptrack installations easier:

  • autoinstall: the Uptrack client can be configured to install rebootless updates on its own as they become available, making kernel updates fully automatic. Autoinstall is now our most popular configuration.
  • the Uptrack API: programmatically query the state of your machines through our RESTful API.
  • Nagios plugins: easily integrate Uptrack monitoring into your existing Nagios infrastructure with plugins that monitor for out of date, inactive, and unsupported machines.

Supported kernels

We continue to expand the distributions and flavors we support. You can now use Ksplice Uptrack on (deep breath) RHEL 4 & 5, CentOS 4 & 5 (and CentOS Plus), Debian 5 & 6, Ubuntu 8.04, 9.10, 10.04, & 10.10, Virtuozzo 3 & 4, OpenVZ for EL5 and Debian, CloudLinux 5, and Fedora 13 & 14. Within these distributions, we continue to support more flavors and older kernels; our launch may have been last year, but we support kernels back to 2007 and earlier!

You've always been able to use Ksplice in virtualized environments like VMware, Xen, and Virtuozzo. This year, we've given you even more virtualization options by adding support for virtualization kernels like Debian OpenVZ and Debian Xen. Starting with Ubuntu Maverick, we support all Ubuntu kernel flavors, including the Maverick private cloud/EC2 kernel.

Thanks to some changes at Amazon and Rackspace, you can now even use Ksplice Uptrack on stock Linux kernels in Amazon EC2 and Rackspace Cloud.

As always, you can roll out a free trial on any of these distributions, for an unlimited number of systems.

This year, we also added Fedora Desktop to our free version options, so now you can use Ksplice Uptrack on both Ubuntu Desktop and Fedora Desktop completely for free.

Reboots saved

Did you know that, across RHEL 4 and RHEL 5, Red Hat released 22 new kernels this past year?

chart: reboots on RHEL this past year

Without Ksplice, that's coordination and downtime for 22 reboots, including the hat trick of 3 kernels in 3 weeks for RHEL 5. Gartner estimates that 90% of exploited systems are exploited using known, patched vulnerabilities, so if you're not rebooting and you're not using Ksplice, you are putting your servers (and your customers) at risk.

Looking forward

What do you want to see from Ksplice this year? What features can we add to help you deploy and monitor your Uptrack installations? Tell us what you want, and we'll do our best to deliver!

~jesstess

Monday Jan 24, 2011

8 gdb tricks you should know

Despite its age, gdb remains an amazingly versatile and flexible tool, and mastering it can save you huge amounts of time when trying to debug problems in your code. In this post, I'll share 8 tips and tricks for using gdb as efficiently as possible.

I'll be using the Linux kernel for examples throughout this post, not because these examples are necessarily realistic, but because it's a large C codebase that I know and that anyone can download and take a look at. Don't worry if you aren't familiar with Linux's source in particular -- the details of the examples won't matter too much.

  1. break WHERE if COND

    If you've ever used gdb, you almost certainly know about the "break" command, which lets you stop at some specified point in the debugged program.

    But did you know that you can set conditional breakpoints? If you add if CONDITION to a breakpoint command, you can include an expression to be evaluated whenever the program reaches that point, and the program will only be stopped if the condition is fulfilled. Suppose I was debugging the Linux kernel and wanted to stop whenever init got scheduled. I could do:

    (gdb) break context_switch if next == init_task
    

    Note that the condition is evaluated by gdb, not by the debugged program, so you still pay the cost of the target stopping and switching to gdb every time the breakpoint is hit. As such, conditional breakpoints still slow the target down in proportion to how often the breakpoint location is hit, not how often the condition is met.

  2. command

    In addition to conditional breakpoints, the command command lets you specify commands to be run every time you hit a breakpoint. This can be used for a number of things, but one of the most basic is to augment points in a program to include debug output, without having to recompile and restart the program. I could get a minimal log of every mmap() operation performed on a system using:

    (gdb) b do_mmap_pgoff 
    Breakpoint 1 at 0xffffffff8111a441: file mm/mmap.c, line 940.
    (gdb) command 1
    Type commands for when breakpoint 1 is hit, one per line.
    End with a line saying just "end".
    >print addr
    >print len
    >print prot
    >end
    (gdb)
    
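    If you want that log without gdb pausing at each hit, you can start the command list with silent, which suppresses the usual breakpoint banner, and end it with continue; gdb's printf gives more compact output than separate print commands. A minimal sketch:

    (gdb) command 1
    >silent
    >printf "mmap: addr=%lx len=%lu prot=%lx\n", addr, len, prot
    >continue
    >end
    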
  3. gdb --args

    This one is simple, but a huge timesaver if you didn't know it. If you just want to start a program under gdb, passing some arguments on the command line, you can just build your command-line like usual, and then put "gdb --args" in front to launch gdb with the target program and the argument list both set:

    [~]$ gdb --args pizzamaker --deep-dish --toppings=pepperoni
    ...
    (gdb) show args
    Argument list to give program being debugged when it is started is
      " --deep-dish --toppings=pepperoni".
    (gdb) b main
    Breakpoint 1 at 0x45467c: file oven.c, line 123.
    (gdb) run
    ...
    

    I find this especially useful if I want to debug a project that has some arcane wrapper script that assembles lots of environment variables and possibly arguments before launching the actual binary (I'm looking at you, libtool). Instead of trying to replicate all that state and then launch gdb, simply make a copy of the wrapper, find the final "exec" call or similar, and add "gdb --args" in front.

  4. Finding source files

    I run Ubuntu, so I can download debug symbols for most of the packages on my system from ddebs.ubuntu.com, and I can get source using apt-get source. But how do I tell gdb to put the two together? If the debug symbols include relative paths, I can use gdb's directory command to add the source directory to my source path:

    [~/src]$ apt-get source coreutils
    [~/src]$ sudo apt-get install coreutils-dbgsym
    [~/src]$ gdb /bin/ls
    GNU gdb (GDB) 7.1-ubuntu
    (gdb) list main
    1192    ls.c: No such file or directory.
        in ls.c
    (gdb) directory ~/src/coreutils-7.4/src/
    Source directories searched: /home/nelhage/src/coreutils-7.4:$cdir:$cwd
    (gdb) list main
    1192        }
    1193    }
    1194    
    1195    int
    1196    main (int argc, char **argv)
    1197    {
    1198      int i;
    1199      struct pending *thispend;
    1200      int n_files;
    1201
    

    Sometimes, however, debug symbols end up with absolute paths, such as the kernel's. In that case, I can use set substitute-path to tell gdb how to translate paths:

    [~/src]$ apt-get source linux-image-2.6.32-25-generic
    [~/src]$ sudo apt-get install linux-image-2.6.32-25-generic-dbgsym
    [~/src]$ gdb /usr/lib/debug/boot/vmlinux-2.6.32-25-generic 
    (gdb) list schedule
    5519    /build/buildd/linux-2.6.32/kernel/sched.c: No such file or directory.
        in /build/buildd/linux-2.6.32/kernel/sched.c
    (gdb) set substitute-path /build/buildd/linux-2.6.32 /home/nelhage/src/linux-2.6.32/
    (gdb) list schedule
    5519    
    5520    static void put_prev_task(struct rq *rq, struct task_struct *p)
    5521    {
    5522        u64 runtime = p->se.sum_exec_runtime - p->se.prev_sum_exec_runtime;
    5523    
    5524        update_avg(&p->se.avg_running, runtime);
    5525    
    5526        if (p->state == TASK_RUNNING) {
    5527            /*
    5528             * In order to avoid avg_overlap growing stale when we are
    
  5. Debugging macros

    One of the standard reasons almost everyone will tell you to prefer inline functions over macros is that debuggers tend to be better at dealing with inline functions. And in fact, by default, gdb doesn't know anything at all about macros, even when your project was built with debug symbols:

    (gdb) p GFP_ATOMIC
    No symbol "GFP_ATOMIC" in current context.
    (gdb) p task_is_stopped(&init_task)
    No symbol "task_is_stopped" in current context.
    

    However, if you're willing to tell GCC to generate debug symbols specifically optimized for gdb, using -ggdb3, it can preserve this information:

    $ make KCFLAGS=-ggdb3
    ...
    (gdb) break schedule
    (gdb) continue
    (gdb) p/x GFP_ATOMIC
    $1 = 0x20
    (gdb) p task_is_stopped_or_traced(init_task)
    $2 = 0
    

    You can also use the macro and info macro commands to work with macros from inside your gdb session:

    (gdb) macro expand task_is_stopped_or_traced(init_task)
    expands to: ((init_task->state & (4 | 8)) != 0)
    (gdb) info macro task_is_stopped_or_traced
    Defined at include/linux/sched.h:218
      included at include/linux/nmi.h:7
      included at kernel/sched.c:31
    #define task_is_stopped_or_traced(task) ((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0)
    

    Note that gdb actually knows in which contexts macros are and aren't visible, so when you have the program stopped inside some function, you can only access macros visible at that point. (The "included at" lines above show you the exact include path through which the macro is visible.)

  6. gdb variables

    Whenever you print a variable in gdb, it prints this weird $NN = before it in the output:

    (gdb) p 5+5
    $1 = 10
    

    This is actually a gdb convenience variable that you can use to refer to that same value any time later in your session:

    (gdb) p $1
    $2 = 10
    

    You can also assign your own variables for convenience, using set:

    (gdb) set $foo = 4
    (gdb) p $foo
    $3 = 4
    

    This can be useful to grab a reference to some complex expression or similar that you'll be referencing many times, or, for example, for simplicity in writing a conditional breakpoint (see tip 1).
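
    For example, combining this with tip 1, you can stash a pointer you care about and use it in a breakpoint condition (the address here is a made-up example):

    (gdb) set $victim = (struct task_struct *)0xffff88003c8e0000
    (gdb) break context_switch if next == $victim
    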

  7. Register variables

    In addition to the numeric variables, and any variables you define, gdb exposes your machine's registers as pseudo-variables, including some cross-architecture aliases for common ones, like $sp for the stack pointer, or $pc for the program counter or instruction pointer.

    These are most useful when debugging assembly code or code without debugging symbols. Combined with a knowledge of your machine's calling convention, for example, you can use these to inspect function parameters:

    (gdb) break write if $rdi == 2
    

    will break on all writes to stderr on amd64, where the $rdi register is used to pass the first parameter (here, write's file descriptor).
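
    Relatedly, when you're stopped in code without symbols, you can dump just the registers the calling convention cares about:

    (gdb) info registers rdi rsi rdx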

  8. The x command

    Most people who've used gdb know about the print or p command, because of its obvious name, but I've been surprised how many don't know about the power of the x command.

    x (for "examine") is used to output regions of memory in various formats. It takes two arguments in a slightly unusual syntax:

    x/FMT ADDRESS
    

    ADDRESS, unsurprisingly, is the address to examine; it can be an arbitrary expression, like the argument to print.

    FMT controls how the memory should be dumped, and consists of (up to) three components:

    • A numeric COUNT of how many elements to dump
    • A single-character FORMAT, indicating how to interpret and display each element
    • A single-character SIZE, indicating the size of each element to display.

    x displays COUNT elements of length SIZE each, starting from ADDRESS, formatting them according to the FORMAT.

    There are many valid "format" arguments; help x in gdb will give you the full list, so here are my favorites:

    x/x displays elements in hex, x/d displays them as signed decimals, x/c displays characters, x/i disassembles memory as instructions, and x/s interprets memory as C strings.

    The SIZE argument can be one of: b, h, w, and g, for one-, two-, four-, and eight-byte blocks, respectively.
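
    Putting the pieces together (the address is whatever you happen to be inspecting; $sp works anywhere):

    (gdb) x/16xb $sp
    (gdb) x/4dw $sp

    The first dumps 16 bytes starting at the stack pointer in hex; the second displays the same region as four 4-byte signed decimals.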

    If you have debug symbols so that GDB knows the types of everything you might want to inspect, p is usually a better choice, but if not, x is invaluable for taking a look at memory.

    [~]$ grep saved_command /proc/kallsyms
    ffffffff81946000 B saved_command_line
    
    
    (gdb) x/s 0xffffffff81946000
    ffffffff81946000 <>:     "root=/dev/sda1 quiet"
    

    x/i is invaluable as a quick way to disassemble memory:

    (gdb) x/5i schedule
       0xffffffff8154804a <schedule>:   push   %rbp
       0xffffffff8154804b <schedule+1>: mov    $0x11ac0,%rdx
       0xffffffff81548052 <schedule+8>: mov    %gs:0xb588,%rax
       0xffffffff8154805b <schedule+17>:    mov    %rsp,%rbp
       0xffffffff8154805e <schedule+20>:    push   %r15
    

    If I'm stopped at a segfault in unknown code, one of the first things I try is something like x/20i $pc-40, to get a look at the code around where I'm stopped.

    A quick-and-dirty but surprisingly effective way to debug memory leaks is to let the leak grow until it consumes most of a program's memory, and then attach gdb and just x random pieces of memory. Since the leaked data is using up most of memory, you'll usually hit it pretty quickly, and can try to figure out what it is and where it came from.
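
    A minimal sketch of that workflow, with a made-up pid and heap address:

    [~]$ gdb -p 12345
    (gdb) info proc mappings
    ...
    (gdb) x/32s 0x01c43000

    info proc mappings shows you where the heap lives, and x/32s dumps candidate leaked buffers as strings.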

~nelhage

Ksplice is hiring!

Do you love tinkering with, exploring, and debugging Linux systems? Does writing Python clones of your favorite childhood computer games sound like a fun weekend project? Have you ever told a joke whose punch line was a git command?

Join Ksplice and work on technology that most people will tell you is impossible: updating the Linux kernel while it is running.

Help us develop the software and infrastructure to bring rebootless kernel updates to Linux, as well as new operating system kernels and other parts of the software stack. We're hiring backend, frontend, and kernel engineers. Say hello at jobs@ksplice.com!

About

Tired of rebooting to update systems? So are we -- which is why we invented Ksplice, technology that lets you update the Linux kernel without rebooting. It's currently available as part of Oracle Linux Premier Support, Fedora, and Ubuntu desktop. This blog is our place to ramble about technical topics that we (and hopefully you) think are interesting.
