
News, tips, partners, and perspectives for the Oracle Linux operating system and upstream Linux kernel work

Using RDS with IPv6 Networks

RDS is the open source Reliable Datagram Sockets protocol developed by Oracle and contributed to the Linux kernel. Kernel developer Ka-Cheong Poon writes the following about his work to make the RDS code IPv6 capable.

 

Previously, RDS could only work between peers using IPv4 addresses. As IPv6 deployment increases around the world, it is becoming more and more important for RDS to work between peers using IPv6 addresses. We have updated Oracle Linux RDS to support IPv6 to meet this need. This article explains how an existing application written in C that uses RDS can be changed to support IPv6. RFC 3493 describes the basic IPv6 socket API, and RFC 4291 explains the IPv6 addressing architecture. We have also updated the rds-tools package to support IPv6; some usage examples are given below.

RDS IPv6 support is designed so that only minimal changes are needed for an application to support IPv6. As before, the following call creates an RDS socket:

        int sd;
        sd = socket(AF_RDS, SOCK_SEQPACKET, 0);

The socket created can be used to communicate with a peer using either an IPv4 or an IPv6 address. This is a bit different from, say, a TCP socket, which requires the address family to be specified when the socket is created. So when does this RDS socket become an IPv6 RDS socket? Before explaining that, let's define the following union to make the address handling examples easier to follow:

union sockaddr_ip {
        struct sockaddr_in      addr4;
        struct sockaddr_in6     addr6;
};

This union can store either an IPv4 or an IPv6 address. Suppose an application needs to use a user-supplied local address to communicate with a user-supplied peer address. It can use the following code to parse the supplied buffers:

        char *user_supplied_laddr, *user_supplied_paddr;
        struct addrinfo *ainfo;
        union sockaddr_ip local_addr, peer_addr;
        socklen_t local_addrlen, peer_addrlen;

 

        if (getaddrinfo(user_supplied_laddr, NULL, NULL, &ainfo) != 0) {
                /* Error handling code */
        } else {
                /* Just use the first one returned. */
                local_addrlen = ainfo->ai_addrlen;
                memcpy(&local_addr, ainfo->ai_addr, local_addrlen);
                freeaddrinfo(ainfo);
                ...
        }
        if (getaddrinfo(user_supplied_paddr, NULL, NULL, &ainfo) != 0) {
                /* Error handling code */
        } else {
                /* Just use the first one returned. */
                peer_addrlen = ainfo->ai_addrlen;
                memcpy(&peer_addr, ainfo->ai_addr, peer_addrlen);
                freeaddrinfo(ainfo);
                ...
        }
        /* The following checks for an address family mismatch.  Note that the
         * address family field of every socket address structure is at the
         * same position.  Hence we can use sin_family to do the check.
         */
        if (local_addr.addr4.sin_family != peer_addr.addr4.sin_family) {
                /* Error handling code */
        }

The getaddrinfo(3) function understands both IPv4 and IPv6 addresses and fills in the address with correct information.
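
As a hedged illustration of the parsing step above, the sketch below wraps getaddrinfo(3) in a small helper. The name parse_ip and the AI_NUMERICHOST hint (which restricts the input to address literals and skips name resolution) are assumptions made for this example; they are not part of the original code.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

union sockaddr_ip {
        struct sockaddr_in      addr4;
        struct sockaddr_in6     addr6;
};

/* Hypothetical helper: parse a numeric IPv4 or IPv6 string into *out.
 * Returns 0 on success, or a getaddrinfo() error code otherwise.
 */
static int parse_ip(const char *text, union sockaddr_ip *out, socklen_t *outlen)
{
        struct addrinfo hints, *ainfo;
        int rc;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;            /* accept IPv4 or IPv6 */
        hints.ai_flags = AI_NUMERICHOST;        /* literals only, no lookup */

        rc = getaddrinfo(text, NULL, &hints, &ainfo);
        if (rc != 0)
                return rc;

        /* Just use the first one returned, but never overflow the union. */
        if (ainfo->ai_addrlen > sizeof(*out)) {
                freeaddrinfo(ainfo);
                return EAI_FAMILY;
        }
        memset(out, 0, sizeof(*out));
        memcpy(out, ainfo->ai_addr, ainfo->ai_addrlen);
        *outlen = ainfo->ai_addrlen;
        freeaddrinfo(ainfo);
        return 0;
}
```

For instance, parse_ip("2010:211::12", &local_addr, &local_addrlen) would leave local_addr.addr6 filled in with sin6_family set to AF_INET6, ready to be passed to bind().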

Before communicating with the peer, an RDS socket is required to be bound first.  This can be done by the following

        if (bind(sd, (struct sockaddr *)&local_addr, local_addrlen) != 0) {
                /* Error handling code */
        }

If the user has supplied an IPv4 address, the RDS socket becomes an IPv4 RDS socket.  If the user has supplied an IPv6 address, it becomes an IPv6 RDS socket.  So an RDS socket's family is fixed at binding time.  Note also that an RDS socket can only be bound once; a second bind() will fail.  This means that once the address family of an RDS socket is set, it cannot be changed, and the socket can only communicate with a peer in the same family.  This is the reason for the family mismatch check above.

The application can now talk to the peer as follows

        char *msg;
        size_t msg_len;

 

        if (sendto(sd, msg, msg_len, 0, (struct sockaddr *)&peer_addr, peer_addrlen) < 0) {
                /* Error handling code */
        }

If the application wants to use the same socket to talk to another peer, it can do so.  Just make sure that the new peer address is in the same family as the bound address.  As before, getsockname() can be used to obtain the address the socket is bound to:

        union sockaddr_ip ipaddr;
        /* addrlen is a value-result argument: it must hold the buffer size on entry. */
        socklen_t addrlen = sizeof(ipaddr);

        if (getsockname(sd, (struct sockaddr *)&ipaddr, &addrlen) < 0) {
                /* Error handling code */
        }

If the socket is not yet bound, the returned address family is set to AF_UNSPEC.  This can be used to determine whether the socket is bound or not.  Note the use of union sockaddr_ip here, as the socket can be bound to either an IPv4 or an IPv6 address.
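
The check described above can be sketched as follows. The helper names bound_family_name and rds_bound_state are hypothetical and exist only for this illustration; the AF_UNSPEC result for an unbound socket is the RDS behaviour stated in this article, not something other socket types necessarily share.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

union sockaddr_ip {
        struct sockaddr_in      addr4;
        struct sockaddr_in6     addr6;
};

/* Hypothetical helper: map the family stored in the union to a label.
 * The family field is at the same position in both members, so
 * sin_family can be read regardless of which member was filled in.
 */
static const char *bound_family_name(const union sockaddr_ip *ip)
{
        switch (ip->addr4.sin_family) {
        case AF_UNSPEC:
                return "unbound";
        case AF_INET:
                return "IPv4";
        case AF_INET6:
                return "IPv6";
        default:
                return "unknown";
        }
}

/* Hypothetical helper: report the bound state of socket sd. */
static const char *rds_bound_state(int sd)
{
        union sockaddr_ip ip;
        socklen_t len = sizeof(ip);     /* buffer size on entry */

        memset(&ip, 0, sizeof(ip));
        if (getsockname(sd, (struct sockaddr *)&ip, &len) < 0)
                return "error";
        return bound_family_name(&ip);
}
```

An application could call rds_bound_state(sd) before sending and refuse to continue unless the result matches the family of the peer address it is about to use.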

In summary, changing an RDS application to use IPv6 addresses only requires changing the socket address structure used to store an IP address and the code used to fill in such structures.  Once the address structure is changed, all the other socket calls are pretty much the same as before.

rds-tools package

The rds-tools package includes three tools: rds-info, rds-ping and rds-stress.  All of them have been updated to support IPv6.  The following are examples of their usage.

rds-ping

Suppose in host A,

[host-a]> ip addr show ib0
9: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 4096
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0a:79:fd brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet6 2010:211::12/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::202:c903:a:79fd/64 scope link
       valid_lft forever preferred_lft forever

And in host B,

[host-b]> ip addr show ib0
6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 4096
    link/infiniband 80:00:02:08:fe:80:00:00:00:00:00:00:00:02:c9:03:00:0a:75:a5 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet6 2010:211::22/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::202:c903:a:75a5/64 scope link
       valid_lft forever preferred_lft forever

We can use rds-ping to test RDS IPv6 reachability between them by doing

[host-a]# rds-ping fe80::202:c903:a:75a5%ib0
   1: 160 usec
   2: 155 usec
   ...

fe80::202:c903:a:75a5 is host B's IPv6 link-local address.  Because a link-local address is only meaningful on a particular link, we append %ib0 (or %9, the interface index of ib0 on host A) to the address to specify that ib0 should be used to reach it.  From host B, we can also do

[host-b]# rds-ping 2010:211::12
   1: 162 usec
   2: 153 usec
   ...

In this case, we test host A's global address.  There is no need to specify the interface for an IPv6 global address.
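
For an application, the interface suffix used with the link-local address above ends up in the sin6_scope_id field of struct sockaddr_in6. The sketch below shows one way to observe this with getaddrinfo(3); the helper name ipv6_scope_id is hypothetical, and the example assumes a C library that accepts a numeric suffix such as %9 in address literals, as glibc does.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: parse an IPv6 literal, optionally followed by
 * "%<interface name or index>", and report its scope ID.
 * Returns 0 on success, or a getaddrinfo() error code otherwise.
 */
static int ipv6_scope_id(const char *text, unsigned int *scope_id)
{
        struct addrinfo hints, *ainfo;
        struct sockaddr_in6 sin6;
        int rc;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_INET6;             /* IPv6 literals only */
        hints.ai_flags = AI_NUMERICHOST;        /* no name resolution */

        rc = getaddrinfo(text, NULL, &hints, &ainfo);
        if (rc != 0)
                return rc;

        /* Copy out the result to avoid aliasing the generic sockaddr. */
        memcpy(&sin6, ainfo->ai_addr, sizeof(sin6));
        freeaddrinfo(ainfo);
        *scope_id = sin6.sin6_scope_id;
        return 0;
}
```

With this helper, "fe80::202:c903:a:75a5%9" yields a scope ID of 9, while a global address such as "2010:211::12" yields 0, which matches the observation that no interface needs to be given for global addresses.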

rds-info

There is no change in the default rds-info output.  This means that no IPv6 connection information is displayed by default to ensure backward compatibility.  To display IPv6 connection information, we can use the "-a" option. This option shows both IPv4 and IPv6 information. We continue to use the rds-ping example.  After doing the above in host A, we can use rds-info to check RDS connection status

[host-a]# /tmp/rds-info -Ina

RDS IB Connections:
                            LocalAddr                            RemoteAddr  Tos  SL                         LocalDev                        RemoteDev
                         2010:211::12                          2010:211::22    0   0              fe80::2:c903:a:79fd              fe80::2:c903:a:75a5
                fe80::202:c903:a:79fd                 fe80::202:c903:a:75a5    0   0              fe80::2:c903:a:79fd              fe80::2:c903:a:75a5

 
RDS Connections:
                            LocalAddr                            RemoteAddr  Tos           NextTX           NextRX Flgs
                         2010:211::12                          2010:211::22    0                5                5 --C-
                fe80::202:c903:a:79fd                 fe80::202:c903:a:75a5    0               11               11 --C-

The above shows that there are two RDS connections in host A.  One is the connection between host A and B's global addresses and the other is between host A and B's link local addresses. And because they are using the InfiniBand interfaces, there are also two IB connections. Without the "-a" option, nothing will be shown as there is only IPv6 information.

rds-stress

The address options "-s" and "-r" can now take an IPv6 address. For example,

[host-a]> rds-stress -r 2010:211::12
waiting for incoming connection on 2010:211::12:4000
accepted connection from 2010:211::22::41735
negotiated options, tasks will start in 2 seconds
Starting up....
  ...
[host-b]> rds-stress -r 2010:211::22 -s 2010:211::12
connecting to 2010:211::12:4000
negotiated options, tasks will start in 2 seconds
Starting up....
  ...

The updated rds-stress can communicate with both the old and the updated versions of rds-stress.
