The next step for packet interception in [Open]Solaris
By avalon on Jun 15, 2007
The packet filtering hooks project (PSARC/2005/334) delivered a first take on a framework for intercepting packets inline with the processing flow, without requiring any additional STREAMS modules to be inserted between IP and the network device driver. Whilst this has been successful in improving performance for IPFilter, the other primary goal of exposing an interface for external use has not yet been achieved. The rest of this blog entry summarises the current state of play and what we need to do in the future. The design document for PSARC/2005/334 can be found at: Packet Filtering Hooks Design, 3 Sep 2006
The goal of software engineering for the Solaris kernel is to have all of the Committed public APIs become stable, so that code can ostensibly be recompiled, or even reused as binary blobs, on any later release. This means that it is necessary to understand in great depth how an interface is going to be used before it is published.
For the packet filtering hooks, the design presented for PSARC/2005/334 was for a world without IP Instances (PSARC/2006/366). IP Instances introduced something very important that was previously absent: a pointer that supplies the context in which all of the networking code needs to run. For packet filtering, this impacts the interface at which packets are delivered to IPFilter. It also impacts initialisation and teardown, as IPFilter was changed to provide a separate set of rules for each IP Instance. This one change, introducing context for IP, would have required an incompatible change to the published kernel interfaces, which is not a position we would ever want to be in by design.
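To make the point about context concrete, here is a minimal user-space sketch of a hook callback that receives a per-instance context pointer. None of these names (`filter_ctx_t`, `pkt_t`, `hook_fn_t`, `filter_hook`, `deliver`) come from the PSARC/2005/334 interfaces; they are hypothetical and exist only to show why adding a context argument changes the callback signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance state: each IP Instance owns its own rules. */
typedef struct {
    int           instance_id;
    unsigned      blocked_port;  /* stand-in for a full rule set */
    unsigned long dropped;       /* per-instance statistics */
} filter_ctx_t;

/* Hypothetical, heavily simplified view of a packet. */
typedef struct {
    unsigned dst_port;
} pkt_t;

/*
 * Callback signature with an explicit context argument.
 * Returns 1 to drop the packet and 0 to let it pass.
 */
typedef int (*hook_fn_t)(const pkt_t *pkt, void *ctx);

static int filter_hook(const pkt_t *pkt, void *ctx)
{
    filter_ctx_t *fc = ctx;

    assert(pkt != NULL && fc != NULL);
    if (pkt->dst_port == fc->blocked_port) {
        fc->dropped++;
        return 1;
    }
    return 0;
}

/* The caller hands the callback the context of the instance it runs in. */
static int deliver(hook_fn_t fn, void *ctx, const pkt_t *pkt)
{
    return fn(pkt, ctx);
}
```

A callback written as `int (*)(const pkt_t *)` would have had to find its rules through global state, which is why retrofitting the extra argument is an incompatible change for every existing consumer.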
Life being what it is, the existing documentation for PSARC/2005/334 was written without taking into account the changes from PSARC/2006/366, so at this point in time the documentation of the interfaces is not 100% correct in describing the function calls and parameter lists. Looking forward, we need to revisit this area and update the internal drafts of the documentation to reflect these changes, so that they can be included in the distribution and published at the appropriate point in time.
The biggest problem with PSARC/2005/334 as it existed going into PSARC review was how to manage multiple hooks for the same event, especially when hooks are given license to change the contents of a packet. The first proposal was to use a numbering system that implemented a priority mechanism, similar to what Linux uses today, but this was rejected. The current thoughts on how to proceed with this are to allow two different mechanisms:
- When registering a callback for a particular hook event, allow one of the following attributes to be specified:
  - first - put the callback at position #1 in the list
  - last - put the callback at the end of the list
  - before - require that this callback be before another callback
  - after - require that this callback be after another callback
- Allow a command line utility to specify a more absolute ordering, such as "a,ipf,*,y", that overrides the above
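As a rough illustration of the first mechanism, the sketch below keeps an ordered list of callback names and places each new one according to a hint. This is not the design under review: `hint_t`, `hook_list_t`, `find_hook` and `hook_insert` are made-up names, and the hint is only honoured at the moment of insertion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOOKS 8

/* Hypothetical placement hints, one per attribute in the list above. */
typedef enum { HINT_FIRST, HINT_LAST, HINT_BEFORE, HINT_AFTER } hint_t;

typedef struct {
    const char *names[MAX_HOOKS];  /* callbacks in invocation order */
    size_t      count;
} hook_list_t;

/* Returns the index of name in the list, or -1 if it is not registered. */
static int find_hook(const hook_list_t *l, const char *name)
{
    for (size_t i = 0; i < l->count; i++) {
        if (strcmp(l->names[i], name) == 0)
            return (int)i;
    }
    return -1;
}

/*
 * Insert name according to hint. other names the reference callback and is
 * only consulted for HINT_BEFORE and HINT_AFTER. Returns 0 on success and
 * -1 if the list is full, name is already present, or other is unknown.
 */
static int hook_insert(hook_list_t *l, const char *name, hint_t hint,
                       const char *other)
{
    size_t pos;

    assert(l != NULL && name != NULL);
    if (l->count >= MAX_HOOKS || find_hook(l, name) >= 0)
        return -1;

    switch (hint) {
    case HINT_FIRST:
        pos = 0;
        break;
    case HINT_LAST:
        pos = l->count;
        break;
    case HINT_BEFORE:
    case HINT_AFTER: {
        int ref = (other != NULL) ? find_hook(l, other) : -1;
        if (ref < 0)
            return -1;
        pos = (size_t)ref + (hint == HINT_AFTER ? 1 : 0);
        break;
    }
    default:
        return -1;
    }

    /* Shift the tail up by one slot and drop the new entry into place. */
    memmove(&l->names[pos + 1], &l->names[pos],
            (l->count - pos) * sizeof(l->names[0]));
    l->names[pos] = name;
    l->count++;
    return 0;
}
```

The sketch also exposes the awkward cases a real design has to settle: two callbacks that both ask for "first" end up ordered by registration time, and a "before" or "after" request fails outright if the other callback has not registered yet.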
Until a final design for ordering multiple callbacks has been accepted, it is not possible to promote the current API to being suitable for public use. This is one of the major issues outstanding today: only one callback can be registered for a hook event such as inbound IP packets.
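The absolute ordering string from the list above could be interpreted along the following lines. This is only a guess at the semantics: the sketch assumes that "*" stands for every registered callback the string does not name (kept in registration order), that names in the string which are not registered are skipped, and that each name appears at most once. `spec_names` and `apply_order` are hypothetical helpers, not part of any utility that exists today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if name appears as a whole token in the comma separated spec. */
static int spec_names(const char *spec, const char *name)
{
    size_t n = strlen(name);
    const char *p = spec;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");
        if (len == n && strncmp(p, name, n) == 0)
            return 1;
        p += len;
        if (*p == ',')
            p++;
    }
    return 0;
}

/*
 * Reorder the n registered callbacks in in[] (registration order) into
 * out[] following spec. out[] must have room for n entries. Returns the
 * number of entries written. Callbacks that are neither named nor covered
 * by a "*" token are left out in this sketch.
 */
static int apply_order(const char *spec, const char *const in[], size_t n,
                       const char *out[])
{
    size_t written = 0;
    const char *p = spec;

    assert(spec != NULL && in != NULL && out != NULL);
    while (*p != '\0') {
        size_t len = strcspn(p, ",");

        if (len == 1 && *p == '*') {
            /* Wildcard: everything the spec does not mention by name. */
            for (size_t i = 0; i < n; i++) {
                if (!spec_names(spec, in[i]))
                    out[written++] = in[i];
            }
        } else {
            /* Named token: emit the matching registered callback, if any. */
            for (size_t i = 0; i < n; i++) {
                if (strlen(in[i]) == len && strncmp(in[i], p, len) == 0)
                    out[written++] = in[i];
            }
        }
        p += len;
        if (*p == ',')
            p++;
    }
    return (int)written;
}
```

With callbacks registered in the order ipf, x, y, a, z, the string "a,ipf,*,y" yields a, ipf, x, z, y under these assumptions.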
Kernel Socket Interface
At the time the first design was put together, there was no work being done on an API for kernel socket programming, so some functionality that exists in user space via ioctls was implemented internally with direct function calls instead. An easy example of this is obtaining an IP address from a network interface. Fast forward to today, when we do have a project underway to deliver Kernel Sockets, and we need to reconsider whether it is necessary to retain that direct function call interface or to discard it because it duplicates functionality.
Summary of the state of play today
So where are we at today?
Today we have a programming interface in [Open]Solaris whose primary user is IPFilter. The API is considered to be private (or internal), so strictly speaking it is not available to programmers outside of Solaris development. However, because various header files are shipped and the source code can be easily observed, it looks as though it is there, ready and waiting to be used.
At this point, the best we can say is that people should look at the PDF above, look at the header files and start to experiment: get a feel for how the interface works, how to integrate it into your greater source code tree, and so on. Do so with the knowledge that whilst it may compile and work cleanly for you now, the programming interface will change, and your code will fail if another hook beats you to registering. Most importantly, it is unsupported, meaning that if someone outside of Sun has a problem (e.g. a panic) using it, Sun is not in any way obliged to either fix the problem or provide a patch.
Invitation to ISVs
If you're reading this blog entry and you work for an ISV that is intending to target Solaris for your product or you're working to build a product on top of Solaris and wish to make use of this interface at some point in the future, the best advice is to contact your local Sun sales office and get in contact with someone from our MDE team.
Discussion on changes
If you would like to comment on the above design document for PSARC/2005/334, have requirements that are not being met by this or by planned future work, or have anything else you'd like to discuss relating to this topic, please join us in the OpenSolaris networking community in the networking-discuss forum.