TCP/IP overview on Solaris
By sameers on May 10, 2006
This write-up gives an overview of the TCP/IP stack implementation on Solaris. The discussion starts with some STREAMS concepts such as queues and messages, and then explains how TCP/IP is built as modules that fit into the STREAMS framework. After that we will see how packets traverse the stack upstream and downstream as messages. There is a brief discussion of the sync queue (syncq, part of the STREAMS framework) and how it is used to process TCP/IP messages asynchronously, followed by a short discussion of the service routines that process the messages held on a queue's message queue when the queue is not able to pass a message on to the next module. There is also a short discussion of squeue processing in the FireEngine design. Finally we will see how normal data and TCP urgent data (OOB data) are processed at the TCP, stream head and sockfs levels. The write-up is based entirely on knowledge gained from experience, code browsing and reading some documents. I'd say it gives a fair idea of the TCP/IP implementation on Solaris, though there may be errors in my understanding of the subject. There is plenty of scope for improvement, and comments are always welcome. For a better understanding of STREAMS concepts, please refer to the STREAMS Programming Guide available at docs.sun.com. For a better understanding of TCP/IP concepts, please refer to TCP/IP Illustrated, Vol. 1 by W. Richard Stevens.
Please find the complete writeup here...
The important kernel data structures related to TCP/IP, kernel STREAMS and squeues, and the links between them, can be found here...
Some commands for debugging TCP/IP on Solaris using the 'scat' crash analyser
On Solaris 10, scat is a debugger used to analyse live kernel memory or a crash dump. I'd like to introduce some scat commands that can be used to debug STREAMS and the TCP/IP stack.
scat> stream -s
The output of this command lists all the streams currently active on the system. Each entry shows the stream pointer, the modules stacked on the stream, the messages in the queue/syncq, and QFULL information.
scat> stream findproc [stream address]
From the 'stream -s' output we can get the stream address, and then use it to find the process to which the stream belongs. The output contains the process name, the file descriptor associated with the stream, and the vnode address associated with the stream.
scat> sdump [stream vnode address] sonode
From the vnode address associated with the stream (taken from the 'stream findproc' output), we can get to the sonode structure. The sonode contains all the information about the socket.
scat> stream -l [stream address]
We can use the stream address to find out the details of the stream. This command gives detailed information about all the modules stacked on the stream, the details of the queue_t structure for each module (including the stream head), a complete list of the messages queued on each queue, and the state of every module's queue and of the stream head. The q_ptr field of the queue_t structure points to the module's private data. For the TCP module, q->q_ptr points to a conn_s structure, the connection structure for TCP, which contains all the connection-specific information. The conn_tcp field of the conn_s structure points to the tcp_t structure for the TCP connection. For all of these we need the address of TCP's queue_t, which we can get from the 'stream -l' output. So we can use scat's sdump command to examine the queue_t, conn_s and tcp_t structures in the following way:
for queue_t structure
for TCP conn_s structure
scat> sdump [TCP queue address] queue_t q_ptr
This gives us the address of the conn_s structure for the TCP connection.
scat> sdump [conn_s address] conn_s
Here we get a dump of the conn_s structure for the TCP connection.
for TCP tcp_t structure
scat> sdump [conn_s address] conn_s conn_tcp
This gives us the address of the tcp_t structure for the TCP connection.
scat> sdump [tcp_t address] tcp_t
Here we get a dump of the tcp_t structure for the TCP connection.
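The sdump steps above follow a chain of pointers: queue_t, then q_ptr, then conn_s, then conn_tcp, then tcp_t. The C fragment below is only a sketch of that chain. The three struct types are simplified stand-ins, not the real Solaris definitions; only the field names q_ptr and conn_tcp are taken from the text above, and demo_state and both helper functions are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins, NOT the real Solaris structures. Only the
 * q_ptr and conn_tcp field names come from the write-up.
 */
struct tcp_sketch {
    int demo_state;                 /* hypothetical field, illustration only */
};

struct conn_sketch {
    struct tcp_sketch *conn_tcp;    /* link from the connection to its TCP state */
};

struct queue_sketch {
    void *q_ptr;                    /* module-private data; a conn for the TCP module */
};

/* Step 1: queue -> conn, mirroring 'sdump [TCP queue address] queue_t q_ptr'. */
struct conn_sketch *conn_from_queue(const struct queue_sketch *q)
{
    assert(q != NULL);
    return (struct conn_sketch *)q->q_ptr;
}

/* Step 2: conn -> tcp, mirroring 'sdump [conn_s address] conn_s conn_tcp'. */
struct tcp_sketch *tcp_from_queue(const struct queue_sketch *q)
{
    struct conn_sketch *connp = conn_from_queue(q);

    /* A queue with no private data has no TCP state to return. */
    return (connp != NULL) ? connp->conn_tcp : NULL;
}
```

Each helper corresponds to one address lookup done by hand in scat: the first reads q_ptr out of the queue, and the second reads conn_tcp out of the connection that q_ptr points to.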
Note: throughout the write-up, queue maps to queue_t, message block maps to mblk_t, and data block maps to dblk_t. I have not used the actual field names of the data structures, only names that denote the fields. For example, readp and writep are mentioned for a message block; these correspond to the read pointer and write pointer of the mblk_t. For the actual field names, opensolaris.org is the best place to browse the source code.
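To make the read pointer and write pointer idea concrete, here is a minimal C sketch of a message block that points into a data block. The struct types are simplified stand-ins rather than the real mblk_t and dblk_t. The names readp and writep follow the write-up's naming. The other names (cont, datap, base, lim, mblk_len, msg_len) are assumptions introduced for illustration, as is the idea of chaining blocks through a continuation pointer. The bytes between readp and writep are treated as the valid data in a block.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a data block: it describes the buffer itself. */
struct dblk_sketch {
    unsigned char *base;            /* first byte of the buffer */
    unsigned char *lim;             /* one past the last byte of the buffer */
};

/* Simplified stand-in for a message block: a window onto a data block. */
struct mblk_sketch {
    struct mblk_sketch *cont;       /* next block of the same message (assumed) */
    unsigned char *readp;           /* first unread byte */
    unsigned char *writep;          /* one past the last written byte */
    struct dblk_sketch *datap;      /* data block that owns the buffer */
};

/* Number of valid bytes in a single message block. */
size_t mblk_len(const struct mblk_sketch *mp)
{
    assert(mp->writep >= mp->readp);
    return (size_t)(mp->writep - mp->readp);
}

/* Total valid bytes in a message made of chained blocks. */
size_t msg_len(const struct mblk_sketch *mp)
{
    size_t total = 0;

    for (; mp != NULL; mp = mp->cont)
        total += mblk_len(mp);
    return total;
}
```

In this sketch, consuming data would mean advancing readp and appending data would mean advancing writep, while the data block's base and lim stay fixed.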