<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
   <title>Linux &apos;n Stuff</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/" />
   <link rel="self" type="application/atom+xml" href="http://blogs.oracle.com/linuxnstuff/xml/rss.xml" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479</id>
   <updated>2009-08-28T19:05:57Z</updated>
   <subtitle>Andy Grover&apos;s work blog.</subtitle>
   <generator uri="http://www.sixapart.com/movabletype/">Movable Type Enterprise 4.23-en</generator>


<entry>
   <title>GSOC Mentoree interviewed</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/08/gsoc_mentoree_interviewed.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.14088</id>
   
   <published>2009-08-28T19:05:56Z</published>
   <updated>2009-08-28T19:05:57Z</updated>
   
   <summary>My GSOC 2009 mentoree, Kanika Vats, shares some thoughts on her experiences, and tech careers for women. I think her take on open source technology and its community is incredibly positive. I have no doubt at some point the bloom will...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p><a href="http://www.linuxpromagazine.com/Online/Blogs/ROSE-Blog-Rikki-s-Open-Source-Exchange/ROSE-Blog-Interviews-GSoC-participant-Kanika-Vats">My GSOC 2009 mentoree, Kanika Vats, shares some thoughts on her experiences, and tech careers for women.</a><br /><br />I think her take on open source technology and its community is incredibly positive. I have no doubt at some point the bloom will come off the rose a bit, but this is certainly a good first impression for a newcomer to have, and hopefully will make it easier to put future negativity in perspective.<br /><br />I was the sole non-female in the <a href="http://systers.org">Systers</a> development effort to customize Mailman to their community's requirements. (I was recruited by my SO, <a href="http://buunabet.org/">Jen</a>, who was also a mentor as well as overall organizer for Systers' GSOC effort.) As a first-time mentor, this was an extremely rewarding experience. <br /><br />First, it made clear that being a mentor/teacher/manager-type was <i>work</i>, even though someone else was doing the heavy lifting. It really helps to know what you're talking about, which I didn't always, such as giving advice on how to use the Storm ORM, when my experience was with SQLObject. 
(An ORM's an ORM, right?)<br /><br />Second, working with a new contributor was a chance to give exposure to the things that school doesn't seem to teach but that are very important in the real world, and especially around open-source development: coding standards; understanding and enhancing other people's code, and writing code so that other people can understand it; using source control and branches to <i>help</i> one's productivity; bug tracking; and online collaboration, especially across time zones.<br /><br />Finally, I will have the opportunity to attend the annual <a href="http://gracehopper.org/2009/">Grace Hopper Women in Computing</a> conference, where I am led to believe people of my gender will be in the extreme minority. Will it be <i>weird</i>? Too early to say -- although I have heard the conference schwag may include soaps and candles instead of 2XL-sized t-shirts. Should be interesting!</p>]]>
      
   </content>
</entry>

<entry>
   <title>RHEL and Emacs 23</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/08/rhel_and_emacs_23.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.13859</id>
   
   <published>2009-08-17T05:41:48Z</published>
   <updated>2009-08-17T05:41:49Z</updated>
   
   <summary>Dear Lazyweb, Is there an RPM (or repo) available that has Emacs 23 for RHEL5? Unfortunately the link to Chip Coldwell&apos;s RPM is now dead. Thanks -- Andy...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p>Dear Lazyweb,<br /><br />Is there an RPM (or repo) available that has Emacs 23 for RHEL5?<br /><br />Unfortunately the link to <a href="http://people.redhat.com/coldwell/emacs/repo/rhel/emacs-release-23-1.el5.noarch.rpm">Chip Coldwell's RPM</a> is now dead.<br /><br />Thanks -- Andy</p>]]>
      
   </content>
</entry>

<entry>
   <title>Drobo and Linux</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/04/drobo_and_linux.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.11774</id>
   
   <published>2009-04-24T23:13:10Z</published>
   <updated>2009-04-24T23:13:20Z</updated>
   
   <summary>The Path to Drobo: I just got a Drobo to replace an aging NAS appliance. (Note I am not a storage developer, either at the filesystem or block level, so I am just talking about this as a guy who needs...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p><big>The Path to Drobo</big><br /><br />I just got a <a href="http://drobo.com/">Drobo</a> to replace an aging NAS appliance. (Note: I am not a storage developer, either at the filesystem or block level, so I am just talking about this as a guy who needs to store stuff safely.)<br /><br />When the Drobo first came out, I was not that impressed. It was like an external USB-attached hard drive, but you could hotswap drives? It seemed overpriced. But I took another look at it recently because we want to reduce the number of machines in our home network, for power, noise, and simplicity's sake.<br /><br />Our mythtv box has to be on all the time, and because our <a href="http://www.intel.com/support/motherboards/server/ss4000-e/index.htm">NAS box</a> is not one of the <a href="http://www.synology.com/enu/index.php">current</a> <a href="http://www.readynas.com/">generation</a> that contains a wealth of services, the myth box has already been doing a lot of mounting NAS shares and re-exporting services (e.g. iTunes) anyway. So by switching to the direct-attached Drobo, not too much changes (except the myth box becomes a Samba server, which it is way better at than the NAS's puny CPU!). This eliminates one loud machine that never spun down its disks in favor of a faster, quieter, more flexible one. Or at least that was the plan.<br /><br /><big>Considering the Drobo</big><br /><br />Drobo has a number of things I like:<br /><ul><li>Storage is redundant, can use different sizes of disk, and new disk capacity is used seamlessly (i.e. don't have to break the RAID5 array)</li><li>Takes bare 3.5" SATA disks, no hotswap caddies needed. 
Why aren't other companies doing this?<br /></li><li>USB and Firewire 800 (400 works too)</li><li>Shorter times to re-establish redundancy on a disk removal/failure due to being filesystem-aware</li><li>Disks spin down, and it's generally quiet</li><li>Supports ext3 and Linux (in development)<br /></li><li>A speed upgrade from what I had (18MB/sec vs old NAS's 8MB/sec)</li></ul>It also has some things I don't like:<br /><ul><li>The <a href="http://www.drobospace.com/">Drobo user forum</a> requires a Drobo serial number (i.e. customers only) to access. This is a pretty uncommon restriction and a bit worrisome to prospective customers.<br /></li><li>Drobo doesn't appear to report capacity above 2TB. READ_CAPACITY(16) fails? Maybe this is a Linux or FW issue, because larger LUNs work on other OSes from what I hear.</li><li>I configured Drobo with a 2TB LUN. Once I add enough HD capacity to pass that threshold, it looks like I'm back into less-fun, more complicated scenarios to use its full space. This could involve either a destructive reformat of the Drobo, or multiple LUNs.</li><li>Linux support is still under development. So now I'm becoming a drobo-utils contributor, which looks like it will be fun to hack, but really I'd rather not have to write code to get the hw I paid for to work on Linux.</li><li>The flipside of being FS-aware: can't do its magic (skip re-layout of empty space, do proper storage virtualization) on anything but ext3, NTFS, HFS+, and FAT32.<br /></li><li>The OS thinks Drobo is larger than it actually is. One must refer to lights on the front (or drobo-utils) to determine actual space remaining. 
This means there will be issues using Drobo for apps like MythTV that want to fill the volume and then expire the oldest data.<br /></li></ul><big>Installing Drobo</big><br /><br />It didn't go as smoothly as I hoped, but in the end it wasn't the Drobo's fault.<br /><br />Basically the first three adapters I used to connect it caused varying degrees of flakiness. First I tried the mythtv's built-in 1394a connector. It formatted the Drobo ok, but I was seeing errors in dmesg. So I put in an additional 1394 card. That worked better, but there were long pauses, and throughput was averaging 3MB/sec. iotop showed bursts of 30MB/sec activity followed by long stretches of nothing.<br /><br />What was going on? Nothing in dmesg. drobo-utils diagnostics was broken so it was no help. Was its little ARM brain stressed? Impossible. Benchmarks on the web were in the 15MB/sec range. Before giving up, I tried the USB connection, and was gratified to see a much more consistent 18MB/sec rate.<br /><br />Problem solved (I thought), I started copying all my stuff onto the Drobo. After a few minutes it would start getting USB errors and vanish.<br /><br />ARRG.<br /><br />One last thing to try: pull out the known-good, based-on-TI-chipset FW card from my desktop and try that. That did it: both copying a 2GB iso and bonnie++ ran with no issues. All my problems were caused by flaky 1394 or USB adapters.<br /><br />So lessons learned. Make sure the Drobo-connected 1394 card has a TI chipset (both mobo and 1st 1394 card had older VIA chips on them). Motherboard USB can also be flaky. 
Drobo seems to have an above-average sensitivity to these issues.<br /><br />I'm really glad I was able to get it sorted because I really couldn't see any device, even more expensive ones, that fulfilled my desire for an easy-to-use, non-destructively-expandable, redundant filesystem.<br /><br /><big>How Drobo Works</big><br /><br />Much of Drobo's unique featureset is made possible by combining filesystem-awareness with block-awareness. Drobo, as a block device, is just getting commands like "write this 512-byte chunk of data to sector X". Normal block devices have no knowledge about what is in the data to write. Drobo does.<br /><br />The first thing it must do is sparsely map the sectors in its 2TB virtual LUN to the actual sectors it has available on its disks. It only has to allocate a real sector when the OS's filesystem code first writes to a given sector. This is equivalent to what VMware and VirtualBox do when you allocate a dynamically-expanding disk image. At this level, the actual storage required for a large virtual volume will start small and grow. It will never shrink, because even if a sector contains data for a file that is deleted, the sector has still been written to, and there's no way for the block layer to know it can reclaim those now-unused sectors.<br /><br />The second thing Drobo does is look <i>inside</i> the block-based data it's being sent to decode the blocks that the filesystem is using for its metadata. For example, if a large file is deleted on an ext3 filesystem, there are updates to the filesystem's on-disk structures to record this. By noticing those metadata block writes, Drobo can determine when the filesystem is no longer referencing a bunch of other sectors. 
Now Drobo has the correct map of sectors the filesystem is using, instead of all the sectors the filesystem has ever touched.<br /><br />The benefit of this should be clear to anyone who has ever inserted a new disk into a RAID array and had to wait a whole day for it to be ready. A conventional rebuild takes the same amount of time no matter how much data is on the array's volumes, because it has to make every single sector redundant again. Drobo knows how much data you actually have, so its sync times are based on how much data is in the filesystem, not how long it takes to write to every sector. Kinda nice, BUT I'd imagine a filesystem-aware block device makes filesystem developers a little queasy.<br /><br />What about new filesystems? BtrFS? Ext4? ZFS? You can't put these on a Drobo and have it work right.<br /><br /><big>Drobo and TRIM</big><br /><br />How can Drobo handle these new filesystems? One way is for a firmware update to add support for them, as it does for the ones it currently supports. However...<br /><br />At the most recent <a href="http://lwn.net/Articles/327740/">Linux Storage and Filesystem Workshop</a>, there was some discussion about a TRIM command, to tell a block device the filesystem is not using a region anymore. This helps SSDs manage themselves better, and also could help Drobo and others:<br /><blockquote>From there, the discussion went to the seemingly unrelated topic of "thin provisioning."  This is an offering from certain large storage array vendors; they will sell an array which claims to be much larger than the amount of storage actually installed.  When the available space gets low, the customer can buy more drives from the vendor.  
Meanwhile, from the point of view of the system, the (apparently) large array has never changed.<br /></blockquote>Drobo's maker Data Robotics is probably not one of the "large" vendors referred to here, but just like for the big guys' arrays, Drobo could cease its layer-violating-yet-awesome hackery if a) new filesystems used TRIM and b) Drobo understood what it meant. No doubt Drobo would want to keep its current code in place for its use with TRIM-unaware systems, but I think a TRIM-aware BtrFS on a TRIM-aware Drobo would be great -- let Drobo do its magic but still keep that nice clean filesystem and block separation.</p>]]>
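      <![CDATA[<p>The allocate-on-first-write mapping and the TRIM idea discussed in this post can be sketched in a few lines of C. This is a toy model under stated assumptions, not Drobo's firmware: the names <code>thin_dev</code>, <code>thin_write</code> and <code>thin_trim</code>, the free-list layout, and the one-sector granularity are all made up for illustration.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define UNMAPPED UINT32_MAX

/* Toy thin-provisioned device (hypothetical, for illustration only). */
struct thin_dev {
    uint32_t *map;       /* virtual sector -> physical sector, or UNMAPPED */
    uint32_t *free_list; /* stack of unused physical sectors */
    uint32_t nvirt;      /* sectors the OS believes exist */
    uint32_t nphys;      /* sectors actually installed */
    uint32_t nfree;      /* entries currently on free_list */
};

static int thin_init(struct thin_dev *d, uint32_t nvirt, uint32_t nphys)
{
    d->map = malloc((size_t)nvirt * sizeof(*d->map));
    d->free_list = malloc((size_t)nphys * sizeof(*d->free_list));
    if (!d->map || !d->free_list) {
        free(d->map);
        free(d->free_list);
        return -1;
    }
    for (uint32_t v = 0; v < nvirt; v++)
        d->map[v] = UNMAPPED;
    for (uint32_t p = 0; p < nphys; p++)
        d->free_list[p] = p;
    d->nvirt = nvirt;
    d->nphys = nphys;
    d->nfree = nphys;
    return 0;
}

static void thin_destroy(struct thin_dev *d)
{
    free(d->map);
    free(d->free_list);
}

/* A write only consumes real storage the first time a sector is touched.
 * Returns 0 on success, -1 if out of range or out of physical space. */
static int thin_write(struct thin_dev *d, uint32_t vsec)
{
    if (vsec >= d->nvirt)
        return -1;
    if (d->map[vsec] != UNMAPPED)
        return 0;                       /* overwrite in place */
    if (d->nfree == 0)
        return -1;                      /* time to buy another disk */
    d->map[vsec] = d->free_list[--d->nfree];
    return 0;
}

/* TRIM: the filesystem says it no longer uses this range, so the
 * backing sectors can go back on the free list. */
static void thin_trim(struct thin_dev *d, uint32_t start, uint32_t count)
{
    for (uint32_t v = start; v < d->nvirt && count > 0; v++, count--) {
        if (d->map[v] == UNMAPPED)
            continue;
        assert(d->nfree < d->nphys);    /* a mapped sector implies room on the stack */
        d->free_list[d->nfree++] = d->map[v];
        d->map[v] = UNMAPPED;
    }
}

/* Physical sectors currently backing data. */
static uint32_t thin_used(const struct thin_dev *d)
{
    return d->nphys - d->nfree;
}
```

<p>Without <code>thin_trim</code>, <code>thin_used</code> can only grow, which is the "never shrinks" problem from the post; with it, the device learns about freed space without having to parse filesystem metadata.</p>]]>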
      
   </content>
</entry>

<entry>
   <title>Converged fabric does not mean TCP/IP everywhere</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/03/converged_fabric_does_not_mean.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.10990</id>
   
   <published>2009-03-25T06:33:43Z</published>
   <updated>2009-03-25T15:54:09Z</updated>
   
   <summary>While I was not able to attend the OFA workshop in person this year, I have been making use of the provided teleservices to follow it. The big issue this year is whether it&apos;s preferable to have the RDMA protocol used...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p>While I was not able to attend the OFA workshop in person this year, I have been making use of the provided teleservices to follow it.<br /><br />The big issue this year is whether it's preferable to have the RDMA protocol used in a datacenter over Ethernet be encapsulated in TCP/IP, or just on Ethernet itself. While the trend over the past 20 years has been to use TCP/IP even for LAN traffic, the reasons for this may not apply to the two possible examples of going the <i>other</i> way, namely Fibre Channel over Ethernet (FCoE) and what was called RDMA over Converged Enhanced Ethernet (RoCEE or RoE).<br /><br />Both FCoE and RoE have TCP/IP-based work-alikes -- iSCSI and iWARP respectively. The latter technologies face the problem that processing received packets <i>cannot</i> be made efficient! People have tried. A lot. Netchannels anyone? TCP's functionality (reordering, demultiplexing, etc.) means that each packet is going to get copied in its entirety at least once, in addition to the NIC's copy. You can try to coalesce this by moving the TCP stack into the NIC with a TOE, but that <a href="http://www.linuxfoundation.org/en/Net:TOE">sucks for about 14 reasons</a>. You can try to hide it with an asynchronously-operating <a href="http://www.intel.com/go/ioat/">DMA engine</a> to do the copy for you, but that has issues of its own. Finally, you can add a custom TCP stack in the kernel for your special protocol. This is a flagrant layering violation and the Linux netdev crew will also put a hit out on you.<br /><br />Encapsulating in only Ethernet makes things a lot more palatable, with the two caveats that you have to do without the niceties that TCP gives you (congestion control) as well as IP (routing). In a datacenter you don't need routing, and if CEE does congestion control then you're covered. 
It is now possible to <i>sanely</i> build an adapter that can do 0-copy FCoE and RoE -- just bolt your IB^H^HRDMA macrocell and your FC macrocell next to the regular stuff on your Ethernet silicon and divert to them based on ethertype. Send/recv rings all look kinda the same, don't they? Your card now shows up as three separate devices, and all three (fc, rdma, net) perform at their max efficiency.<br /><br />Internet protocols have pushed far into the LAN/datacenter environment, displacing almost all equivalent LAN protocols. TCP/IP's ubiquity is a virtue above almost all others, and that same virtue is also behind the drive towards Ethernet as a common fabric. But performance and clean implementation outweigh ubiquity. Convergence on TCP/IP in the datacenter is not the big win; it's a nice-to-have. iWARP had potential downsides that people thought could be worked around...but the workarounds needed workarounds and now its costs totally outweigh its benefits. The big win is converged Ethernet, not converged TCP, and I think that's what the debate should not lose sight of.</p>]]>
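      <![CDATA[<p>The "divert based on ethertype" step is simple enough to sketch in C. This is only an illustration of the idea, not any adapter's actual receive path: the enum and function names are hypothetical, the frame is assumed to be untagged (real converged-Ethernet traffic usually carries an 802.1Q tag, which this sketch does not parse), and the two ethertype constants are the values I understand to be registered for FCoE and for RDMA over Ethernet, so treat them as assumptions.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN 14      /* dst MAC (6) + src MAC (6) + ethertype (2) */
#define ET_FCOE        0x8906  /* assumed: FCoE ethertype */
#define ET_RDMA        0x8915  /* assumed: RDMA-over-Ethernet ethertype */

/* Which on-chip engine gets the frame (hypothetical names). */
enum rx_engine { RX_NET, RX_FC, RX_RDMA, RX_DROP };

/* Classify an untagged Ethernet frame by its ethertype field.
 * Anything that is not FC or RDMA goes to the ordinary network stack. */
static enum rx_engine classify_frame(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETH_HEADER_LEN)
        return RX_DROP;                 /* runt frame, no full header */

    /* The ethertype is big-endian on the wire, at byte offset 12. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    switch (ethertype) {
    case ET_FCOE:
        return RX_FC;
    case ET_RDMA:
        return RX_RDMA;
    default:
        return RX_NET;
    }
}
```

<p>The point of the design is that the decision needs only two bytes at a fixed offset, with no reassembly or per-connection state, which is what makes doing it in silicon plausible.</p>]]>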
      
   </content>
</entry>

<entry>
   <title>Filesystem testing</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/03/filesystem_testing.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.10594</id>
   
   <published>2009-03-05T03:36:22Z</published>
   <updated>2009-03-05T03:36:29Z</updated>
   
   <summary>SSD’s, Journaling, and noatime/relatime | Thoughts by Ted. The chief mechanism for benchmarking Linux filesystems seems to be operations on the Linux kernel&apos;s source tree. This seems to me a dangerous over-optimization for kernel developer-centric usage. What if people want to put...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p><a href="http://thunk.org/tytso/blog/2009/03/01/ssds-journaling-and-noatimerelatime/">SSD’s, Journaling, and noatime/relatime | Thoughts by Ted</a><br /><br />The chief mechanism for benchmarking Linux filesystems seems to be operations on the Linux kernel's source tree.<br /><br />This seems to me a dangerous over-optimization for kernel developer-centric usage. What if people want to put files <i>other than the Linux kernel source</i> on their disk? Other use cases are completely untested!<br /><br />OK I'm being facetious.<br /><br />Dumb question: Are snapshotted filesystems like btrfs, and version control systems like git two things that could converge, or do they just <i>look</i> like they could converge and really never will?</p>]]>
      
   </content>
</entry>

<entry>
   <title>RDS queued for Linux kernel inclusion!</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/02/rds_queued_for_linux_kernel_in.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.10493</id>
   
   <published>2009-02-28T03:15:24Z</published>
   <updated>2009-02-28T03:15:30Z</updated>
   
   <summary>RDS is queued to be included in Linux 2.6.30. RDS is a protocol initially designed to enable Oracle RAC clustered database nodes to exchange datagrams rapidly and reliably. It has been developed and maintained out-of-tree for a number of years. With this...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p>RDS is queued to be included in Linux 2.6.30.<br /><br />RDS is a protocol initially designed to enable <a href="http://www.oracle.com/technology/products/database/clustering/">Oracle RAC clustered database nodes</a> to exchange datagrams rapidly and reliably. It has been developed and maintained out-of-tree for a number of years.<br /><br />With this will come all the usual benefits that <a href="http://en.wikipedia.org/wiki/Greg_Kroah-Hartman">GregKH</a> has enumerated many times: more eyes fixing bugs, a broader spectrum of users, fewer compatibility hacks, and it's the Right Thing to do if at all possible, natch.<br /><br />One interesting angle is that Linux already <i>has</i> a reliable-datagram protocol, <a href="http://en.wikipedia.org/wiki/SCTP">SCTP</a>, and of course it's also not too hard to implement reliable datagrams on top of TCP, so why another? RDS has grown from its roots as a "thin sockets layer over Infiniband" to have its own set of distinct and unique features tailored to its particular niche.<br /><br />Working on RDS has been a very positive experience for me. It's been the right balance of exposure to new stuff and also a return to the world of Linux networking development, which I find rewarding. There's still much to do to achieve further performance gains and higher code quality, so still plenty to keep me busy.<br /><br />A good week.</p>]]>
      
   </content>
</entry>

<entry>
   <title>I think git just won</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2009/01/i_think_git_just_won.html" />
   <id>tag:blogs.oracle.com,2009:/linuxnstuff//479.9671</id>
   
   <published>2009-01-14T19:06:57Z</published>
   <updated>2009-01-14T19:07:02Z</updated>
   
   <summary>Qt is moving to LGPL and a more open development process. The article notes: In addition to adopting the LGPL license for Qt, Nokia will also be completely changing Qt&apos;s development model to make it more inclusive and transparent. The source code...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p><a href="http://arstechnica.com/news.ars/post/20090114-nokia-qt-lgpl-switch-huge-win-for-cross-platform-development.html">Qt is moving to LGPL and a more open development process.</a><br /><br />The article notes:<br /><blockquote>In addition to adopting the LGPL license for Qt, Nokia will also be completely changing Qt's development model to make it more inclusive and transparent. The source code will be moved to a publicly-accessible Git repository so that the latest changes will always be visible. The use of Git, a distributed version control system, will make it easier for third-party developers to participate directly in the process of improving Qt. To further reduce the barrier to participation, Nokia plans to accept code from contributors without requiring copyright assignment.</blockquote><br />Aside from how great this is for developers using Qt, I think this pretty much indicates the DSCM race has been won. Aside from the kernel and X using Git (and Rails), both KDE and Gnome camps appear to be moving towards Git.<br /><br />The one thing that Git still seems to be lacking is a nice libgit library for applications to access it. The common approach for language wrappers for Git in Ruby and Python is to shell out to the git executable to do stuff. It works, but yeesh, cmon. I want git embedded <em>everywhere</em>. I want Firefox to use git to store my bookmarks. That's not gonna happen until other applications can easily use Git's functionality.</p>]]>
      
   </content>
</entry>

<entry>
   <title>Adventures in Netlink</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2008/12/adventures_in_netlink.html" />
   <id>tag:blogs.oracle.com,2008:/linuxnstuff//479.8951</id>
   
   <published>2008-12-04T07:55:27Z</published>
   <updated>2008-12-04T07:55:50Z</updated>
   
   <summary>Ever spent hours writing code and then found out it&apos;s not needed? I just had the pleasure. I wrote some code using netlink (via libnl) that takes an interface name (&quot;eth0&quot;) and returns the ipv4 addresses associated with it, but...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p>Ever spent hours writing code and then found out it's not needed? I just had the pleasure. I wrote some code using netlink (via libnl) that takes an interface name ("eth0") and returns the first ipv4 address associated with it, but the bug I thought I was fixing was not there, so I'm making a blog entry out of it :)<br /><br />Libnl has documentation. Autogenerated documentation -- which is better than nothing, but it's really an API reference, not a tutorial. These days everything is in a git repo, so I was able to muddle through by looking at the source, but in an ideal world this shouldn't be needed...<br /><br />First, include some headers. This assumes you have the libnl-devel RPM installed, and don't forget to link against libnl with "-lnl":<br /><br /><code>#include &lt;netlink/route/link.h&gt;<br />#include &lt;netlink/route/addr.h&gt;</code><br /><br />Here's the main function:<br /><code><br />static uint32_t parse_iface(char *ptr)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct nl_handle *sock;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct nl_cache *link_cache;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct nl_cache *addr_cache;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int ifindex;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; uint32_t result = 0;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct rtnl_addr *addr;<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sock = nl_handle_alloc();<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_connect(sock, NETLINK_ROUTE);<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; link_cache = rtnl_link_alloc_cache(sock);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; addr_cache = rtnl_addr_alloc_cache(sock);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ifindex = rtnl_link_name2i(link_cache, ptr);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; addr = rtnl_addr_alloc();<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rtnl_addr_set_family(addr, AF_INET);<br 
/>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rtnl_addr_set_ifindex(addr, ifindex);<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_cache_foreach_filter(addr_cache, (struct nl_object *)addr, nl_cache_callback, &amp;result);<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* release the filter object, the caches, and the netlink handle */<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; rtnl_addr_put(addr);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_cache_free(addr_cache);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_cache_free(link_cache);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_close(sock);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; nl_handle_destroy(sock);<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return ntohl(result);<br />}<br /></code><br />This code is connecting to the netlink socket to receive route-related information. The <code>rtnl_*_alloc_cache()</code> calls basically fill those caches with all the information from the kernel related to links (aka net interfaces) and addresses (all of them, whether ipv4, ipv6, appletalk, or whatever). So now all the info we need is in our address space, and we just need to sift through it!<br /><br />Luckily, we can cross-reference the two separate caches because ifindex refers to the same interface in either cache. The code then gets the ifindex for the passed-in name (e.g. "eth0").<br /><br />The next part was kind of weird, due to being so flexible that simple things like what I wanted to do become nonobvious. The way you filter a cache is by creating an instance of the thing in the cache, and then only filling in the fields that you want the results to match. That's what the next few lines do: alloc an addr and fill in the limiters I wanted, ifindex (so I just get addresses for the one interface) and the family (limit to ipv4 addresses only).<br /><br />libnl is designed to handle multiple returns for everything. This isn't dumb -- an interface can have more than one ipv4 address, and all of them will get returned, so it has a callback interface. You give it a function and it calls it once for each result. 
I named my function <code>nl_cache_callback</code>; let's take a look at it:<br /><code><br />static void nl_cache_callback(struct nl_object *obj, void *arg)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; struct nl_addr *addr;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; uint32_t *ipv4_addr = arg;<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* only get addr in 1st callback */<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (*ipv4_addr)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; addr = rtnl_addr_get_local((struct rtnl_addr *)obj);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (!addr)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br />&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *ipv4_addr = *(uint32_t *)nl_addr_get_binary_addr(addr);<br />}<br /></code><br />Like I said, this could be called more than once but I really just want the first one, so I am passing an initially-zero variable in the context argument, and if it's already set I bail out. The last wrinkle is that a <code>struct rtnl_addr</code> can hold a variety of different address types, so if I didn't already know I'd asked for only 32-bit ipv4 addresses, I'd have to use the accessor methods to find out the address's length and type. But since I do, I just copy the addr's 4 bytes into my 4 bytes, and that's it. Back in the first function, it's a quick <code>ntohl(result)</code>[1] and <a href="http://en.wikipedia.org/wiki/Bob%27s_your_uncle">Bob's your uncle</a>[2]!<br /><br />Have fun!<br /><br />[1] libnl also has a helper function that fills in a sockaddr directly, pretty nice<br />[2] I actually <i>have</i> an uncle named Bob! Your results may vary.</p>]]>
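      <![CDATA[<p>If the filter-by-example plus callback-with-context pattern still seems odd, here it is boiled down to plain C with no libnl at all. Everything in this sketch is hypothetical (<code>addr_rec</code>, <code>foreach_filter</code>, <code>take_first</code>, the wildcard values); it only mimics the shape of the libnl calls described in this post, and addresses are kept as host-order integers to keep it short.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the cache contents. */
enum { FAM_ANY = 0, FAM_V4, FAM_V6 };
#define IF_ANY 0                /* wildcard; real ifindex values start at 1 */

struct addr_rec {
    int family;
    int ifindex;
    uint32_t addr;              /* host byte order in this toy */
};

typedef void (*addr_cb)(const struct addr_rec *rec, void *arg);

/* A filter is just another record: fields left at their wildcard
 * value match anything, filled-in fields must match exactly. */
static int matches(const struct addr_rec *rec, const struct addr_rec *filter)
{
    return (filter->family == FAM_ANY || filter->family == rec->family) &&
           (filter->ifindex == IF_ANY || filter->ifindex == rec->ifindex);
}

/* Call cb once for every record in the cache that matches the filter. */
static void foreach_filter(const struct addr_rec *cache, size_t n,
                           const struct addr_rec *filter,
                           addr_cb cb, void *arg)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++)
        if (matches(&cache[i], filter))
            cb(&cache[i], arg);
}

/* Keep only the first result: arg points at an initially-zero slot. */
static void take_first(const struct addr_rec *rec, void *arg)
{
    uint32_t *out = arg;

    if (*out)
        return;                 /* already have one, ignore the rest */
    *out = rec->addr;
}
```

<p>One caveat of using zero as the "not set yet" marker, here and in the callback above: an all-zero address cannot be told apart from "nothing found", which was fine for my purposes but may not be for yours.</p>]]>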
      
   </content>
</entry>

<entry>
   <title>Hello!</title>
   <link rel="alternate" type="text/html" href="http://blogs.oracle.com/linuxnstuff/2008/10/hello.html" />
   <id>tag:blogs.oracle.com,2008:/linuxnstuff//479.7994</id>
   
   <published>2008-10-08T18:27:32Z</published>
   <updated>2008-10-08T18:48:48Z</updated>
   
   <summary>Hi, I&apos;m Andy Grover, and I&apos;m a Linux kernel developer. I&apos;m currently working on a network protocol called RDS, which is used (mostly on Infiniband today) for Oracle RAC IPC messages as well as by the recently-announced Exadata technology. RDS...</summary>
   <author>
      <name>andy.grover</name>
      
   </author>
   
   <category term="epel" label="epel" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="exadata" label="exadata" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="git" label="git" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="hello" label="hello" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="infiniband" label="infiniband" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="linux" label="linux" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="oel" label="oel" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="rds" label="rds" scheme="http://www.sixapart.com/ns/types#tag" />
   <category term="rhel" label="rhel" scheme="http://www.sixapart.com/ns/types#tag" />
   
   <content type="html" xml:lang="en" xml:base="http://blogs.oracle.com/linuxnstuff/">
      <![CDATA[<p>Hi, I'm Andy Grover, and I'm a Linux kernel developer. I'm currently working on a network protocol called RDS, which is used (mostly on Infiniband today) for Oracle RAC IPC messages as well as by the recently-announced Exadata technology. RDS is poised for wider adoption, so I hope to talk more about its growth in the future, as well as technical issues around it, and Linux in general.</p>

<p>A tip: Sometimes enterprise distros can be a little short on having all the packages you need, if you're using them for development instead of actually serving anything. For example, git is the source control tool of choice for most people doing Linux kernel-related work. RHEL and OEL don't have it. However, the <a href="http://fedoraproject.org/wiki/EPEL">EPEL</a> (Extra Packages for Enterprise Linux) repository does! It's run by the Fedora project and does a good job of providing a way to have the stability of an enterprise distro, and the package selection of a community distro.</p>]]>
      
   </content>
</entry>

</feed>
