Wednesday Jan 16, 2008

DSCP IP Addresses

On the subject of Sun SPARC Enterprise M-Class server Domain-to-SCF Communication Protocol (I wrote about it yesterday), I received two questions about the IP addresses reserved for the internal DSCP network. The first has to do with reusing the IP addresses in multiple machines; the second has to do with the DSCP netmask.

DSCP IP Address Reuse

One customer emailed me to ask if it was OK to use the same DSCP IP address range on multiple machines in the same datacenter. The short answer is yes.

The only requirement for the DSCP addresses is that they not be in use elsewhere on the external networks -- that is, the networks used by the SCF or by the Solaris domains. If the DSCP addresses overlap with addresses belonging to other machines in the datacenter, the SCF and the Solaris domains will not be able to connect to those machines; requests to those IP addresses would be routed to the internal DSCP network rather than out the external Ethernet ports.

It is perfectly acceptable to use the same DSCP addresses on every SPARC Enterprise machine in the datacenter. In our development lab, we had well over a dozen machines on a single subnet (both the SCF and the Solaris domains), and they all shared a common set of DSCP IP addresses. We operated that way for over a year without a problem.

We could have hard-coded the DSCP addresses, but it always seems that whatever address you select, at least one customer is already using it in their datacenter. So after consulting with manufacturing and Sun Service, we decided to leave the choice of DSCP IP addresses entirely up to the customer.

DSCP Network Address and Netmask

Another customer raised an interesting question. They had a SPARC Enterprise machine capable of supporting multiple Solaris domains, but they were configuring it as a single domain. When they tried to use setdscp on the SCF, they provided a network address and netmask that contained only two IP addresses, one for the SCF and one for the Solaris domain. setdscp doesn't allow that; it insists that the netmask be large enough to handle the maximum number of domains that the chassis can support, so that it can compute IP addresses for the SCF and all possible Solaris domains.

But this customer was adamant that they could only spare two IP addresses for DSCP.

In actuality, setdscp can be invoked in three ways. The first method:

        setdscp -i address -m netmask
is the one recommended by the user documentation. It takes an IP network address and a netmask and computes IP addresses for the SCF and all possible domains. This method is provided purely as a convenience; the network address and netmask are not used at all by DSCP, other than to compute the individual IP addresses.

You can also invoke setdscp with no arguments, and it will prompt you for a network address and netmask and compute IP addresses for the SCF and all domains. In effect, this is the same as the first method, except it's interactive.

The third method of using setdscp is to manually assign the SCF and domain DSCP IP addresses one at a time using the format:

        setdscp -s -i address
        setdscp -d domain_id -i address
The first form sets the SCF's DSCP interface to a specific IP address. The second form sets a single domain's DSCP interface IP address; invoke it once for each domain you plan to configure.
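
For example, to configure just the SCF and domain 0 -- the single-domain case this customer asked about -- the sequence would look something like this (a sketch; the two addresses are hypothetical, so substitute whatever pair you can spare):

        XSCF> setdscp -s -i 10.1.1.1
        XSCF> setdscp -d 0 -i 10.1.1.2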

Caveat: With this manual approach, you can configure just some of the domains you plan on creating. For example, you may have an M9000, capable of 24 domains, but you only plan on using it as a single system, so you configure DSCP IP addresses only for the SCF and domain 0. Later, though, you may decide to create a new domain 1. You can add boards to domain 1, but when you try to power it on, the poweron command will fail, because there's no DSCP address for domain 1. Last time I checked (this was in XCP 1040), the failure occurred late in the boot process, after the poweron command had already returned to the user saying the domain was being powered on. As a result, the only way to see why the domain didn't power on was to inspect the error logs. (This behavior may have changed in later versions of the SCF firmware.)
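
If you do hit that situation, the place to look is the SCF error log, along the lines of (assuming the standard showlogs command; the exact operands vary by firmware version):

        XSCF> showlogs error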

Summary

In summary, you can use the same DSCP IP addresses for all Sun SPARC Enterprise M-Class machines in your datacenter. Given that, it should be rare that you run low on IP addresses for DSCP, but if you do, know that you can manually configure only the DSCP IP addresses you really need.

Tuesday Jan 15, 2008

DSCP: Policy Failure for the incoming packet

On my post DSCP: Domain to Service Processor Communication Protocol, Mike Beach left the following comment:
        We are seeing ipsec messages in the system log regarding the DSCP addresses.    
        Is there an ipsec configuration option on the host that we should check?

        # dmesg |tail -1
        Jan 14 14:38:42 sco01b ip: [ID 372019 kern.error] ipsec_check_inbound_policy:   
        Policy Failure for the incoming packet (not secure); Source 192.168.037.026,    
        Destination 192.168.037.028.
My response to Mike was, "I think, in general, you can ignore the ipsec_check_inbound_policy messages. They are probably happening whenever the SCF is rebooted." But I wanted to provide a more complete response, and the tiny comment block wasn't the right place.

On a SPARC Enterprise M-class server, the SCF and Solaris use IPsec to authenticate each end of the connection when they set up a domain-to-SCF communication (DSCP) link. If for some reason the SCF reboots (during an SCF failover, when the SCF firmware is upgraded, or when it is manually rebooted, for example), the SCF sends a reset message to Solaris to reset the TCP connection. The reset message is sent in the clear. When Solaris sees the message in the clear (that is, "not secure"), it logs the policy failure.

If you see this message only when Solaris or the SCF reboots, then it can safely be ignored. If you're seeing it at other times, or if it is being logged continuously every second, then contact your service engineer and escalate the problem to Sun.

There was a bug filed against Solaris about these messages (technically it was an RFE -- request for enhancement). In part, the bug says:

        The cause of the messages is IPsec: this system received a clear-text packet,
        but the IPsec policy on this system only allows IPsec-encrypted packets, so the
        system discards the packet. To stop the system logging itself to death, there
        is a rate-limiting function which will only log a message every
        ipsec_policy_log_interval milliseconds.

        The default value for ipsec_policy_log_interval is 1000 (one second).

        In certain configurations, messages like this are expected, and after a while
        too many messages start to become annoying and fill up the messages file.

        This value can be tuned with ndd up to 999999 milliseconds (just over 16 minutes)
        but can't actually be disabled. This is a request to allow the system administrator
        to turn off these messages should they wish.
This was fixed in OpenSolaris build 37 and Solaris 10 Update 4, and allows you to specify an ipsec_policy_log_interval of 0 to turn off the logging altogether. (Caveat: I haven't actually tried the fix myself.)
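
With that fix, turning the logging off entirely should look something like this on the domain (a sketch I haven't tried myself, assuming the tunable is exposed through ndd on /dev/ip like the other IP tunables):

        # ndd -set /dev/ip ipsec_policy_log_interval 0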

Hope this provides enough information for you to understand what's going on when you see these messages.

Friday Jul 06, 2007

Disks, Disks, Everywhere

The Sun SPARC Enterprise M9000-64 supports up to 64 internal 2.5" SAS disks -- more disks than the Sun Fire X4500 (although the X4500 does take up a lot less space). We were concerned that the system administrator might find it difficult to locate a disk, or to map c0t0d0 to a physical disk.

To make service a bit easier, each physical disk bay in the Sun SPARC-Enterprise M-class line of servers has two LEDs: one for power, the other for fault indications. We used the fault LED in blinking mode to act as an indicator so the system administrator can identify and locate a specific disk.

The disk LEDs are accessed using cfgadm(1M). The cfgadm PCI plugin has always supported LED manipulation through a -x option, so we expanded the cfgadm SCSI plugin to use the same syntax. To allow direct control of the disk LEDs, we added "-x led=LED[,mode=MODE]" (where LED can be power, fault, active or attn, just like PCI, and mode can be on, off or blink).
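
For example, assuming a disk at attachment point c0::dsk/c0t0d0, blinking its fault LED directly with the new syntax looks like this:

    # cfgadm -x led=fault,mode=blink c0::dsk/c0t0d0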

But we also felt there was a more general problem being solved here: that of a "locator" indication. On the M-class servers, the "locator" indication happens to be blinking the fault LED, but future platforms may use some other method (for example, a separate locator LED, an annoying warble sound like a car alarm coming from the disk, or an LCD display on the front of the server that draws a hand pointing to the disk you want to remove). Our solution should not presuppose that the fault LED, or any LED for that matter, must be used as the locator.

So another -x option was added to the cfgadm SCSI plugin: "-x locator[={on|off}]". This allows a user to turn on or turn off the locator indication, regardless of the underlying implementation.

Here's a snippet from the new cfgadm_scsi(1M) man page showing the two new -x options:

 -x hardware_function   Some of the following  commands  can 
                        only  be  used with SCSI controllers
                        and some only with SCSI devices.

                        In the  following,  controller_ap_id
                        refers  to  an ap_id for a SCSI con-
                        troller,    for     example,     c0.
                        device_ap_id  refers to an ap_id for
                        a   SCSI   device,   for    example:
                        c0::dsk/c0t3d0.

                        The  following   hardware   specific
                        functions are defined:

                        locator [=on|off] device_ap_id
                            Sets or gets the hard disk loca-
                            tor  LED,  if  it is provided by
                            the platform.  If  the  [on|off]
                            suboption  is not set, the state
                            of  the  hard  disk  locator  is
                            printed.

                        led[=LED,mode=on|off|blink]
                        device_ap_id
                            If  no  sub-arguments  are  set,
                            this  function  prints a list of
                            the  current  LED  settings.  If
                            sub-arguments   are   set,  this
                            function  sets  the  mode  of  a
                            specific LED for a slot.

To give you an idea of how this works, here's some annotated output from a Solaris session:

    Check the current state of the fault LED for disk c0t0d0; it is now off:
    # cfgadm -x led=fault c0::dsk/c0t0d0
    Disk                    Led
    c0t0d0                  fault=off
    
    Turn on the locator indication for disk c0t0d0:
    # cfgadm -x locator=on c0::dsk/c0t0d0
    # cfgadm -x locator c0::dsk/c0t0d0
    Disk                    Led
    c0t0d0                  locator=on
    
    Check the current state of the fault LED; it is now blinking:
    # cfgadm -x led=fault c0::dsk/c0t0d0
    Disk                    Led
    c0t0d0                  fault=blink
    
    Turn off the fault LED:
    # cfgadm -x led=fault,mode=off c0::dsk/c0t0d0
    # cfgadm -x led=fault c0::dsk/c0t0d0
    Disk                    Led
    c0t0d0                  fault=off
    
    And the locator indication is also off:
    # cfgadm -x locator c0::dsk/c0t0d0
    Disk                    Led
    c0t0d0                  locator=off
    

The intention is certainly that the system administrator would use '-x locator' to locate and replace internal SAS disks. But once SCSI FMA is supported in Solaris, we'll have the ability to set the fault LED on disks, and administrators can use '-x led' to view the state of the fault LED.
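
Per the man page excerpt above, you can also list all of a disk's LED settings at once by giving the led function with no sub-arguments:

    # cfgadm -x led c0::dsk/c0t0d0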

Thursday Jul 05, 2007

IO Box Attachment Point IDs

The new Sun External I/O Expansion Unit, or IO Box, supports PCI-X and PCI-Express hot plug. However, early on we realized that with the IO Box being remote from the host, it would be a challenge figuring out what the hot plug attachment points would be.

Normally, PCI card AP IDs (attachment point IDs) are labeled based on the physical location of the card -- its I/O Unit (IOU) and slot. For example, on Sun SPARC Enterprise M-class servers, the AP ID "iou#0-pci#1" is the slot PCI#1 on I/O Unit IOU#0.

With IO Box, however, there is no fixed physical location for the IO Box slots. The IO Box does connect to a host slot (like iou#0-pci#1), so one could label the AP IDs something like "iou#0-pci#1:iob.pci3" to show that it's PCI slot 3 in an IO Box attached to IOU#0-PCI#1.

On the other hand, this introduces issues when the IO Box is physically remote from the server -- it might not be obvious where a given box connects to the host. We don't want customers tracing cables, and a simple mistake could cause you to power off or power on the wrong slot. Something better, something more reliable, was needed.

So we augmented the AP ID to include the serial id of the IO Box boat. With this approach, someone can look at an IO Box, write down the serial id and slot they want to power off, then go back to Solaris and power off that slot based on the serial id. Similarly, if you power off a slot and want to go remove the card, you can write down the serial id and slot, then find the IO Box boat with the matching serial id, and have confidence that you're removing the right card.

The resulting AP ID is a combination of the physical location of the host slot and the serial id of the IO Box boat. An example of the AP ID format looks like this: "iou#0-pci#4:iobE00E7.pcie1", which shows PCI-Express slot 1 in the IO Box boat with the serial id ending in "E00E7". From the AP ID, it is also clear that the IO Box boat is connected to the host using a link card in host slot IOU#0-PCI#4.

I should also note that the IO Box boat serial id is prominently featured on the boat's handle, in plain view. There's no need to remove the boat to get to the product nameplate.

Here's some sample output from 'cfgadm -a' showing just the PCI slots:

    # cfgadm -a
    Ap_Id                          Type         Receptacle   Occupant     Condition
    ...
    iou#0-pci#0                    unknown      empty        unconfigured unknown
    iou#0-pci#1                    unknown      empty        unconfigured unknown
    iou#0-pci#2                    unknown      disconnected unconfigured unknown
    iou#0-pci#3                    pci-pci/hp   connected    configured   ok
    iou#0-pci#3:iobX00FC.pci1      unknown      empty        unconfigured unknown
    iou#0-pci#3:iobX00FC.pci2      fibre/hp     connected    configured   ok
    iou#0-pci#3:iobX00FC.pci3      scsi/hp      connected    configured   ok
    iou#0-pci#3:iobX00FC.pci4      unknown      empty        unconfigured unknown
    iou#0-pci#3:iobX00FC.pci5      fibre/hp     connected    configured   ok
    iou#0-pci#3:iobX00FC.pci6      unknown      empty        unconfigured unknown
    iou#0-pci#4                    pci-pci/hp   connected    configured   ok
    iou#0-pci#4:iobE00E7.pcie1     unknown      empty        unconfigured unknown
    iou#0-pci#4:iobE00E7.pcie2     etherne/hp   connected    configured   ok
    iou#0-pci#4:iobE00E7.pcie3     etherne/hp   connected    configured   ok
    iou#0-pci#4:iobE00E7.pcie4     pci-pci/hp   connected    configured   ok
    iou#0-pci#4:iobE00E7.pcie5     pci-pci/hp   connected    configured   ok
    iou#0-pci#4:iobE00E7.pcie6     unknown      empty        unconfigured unknown
    
In the above output, IOU#0-PCI#0 through IOU#0-PCI#4 are the host slots; IOU#0-PCI#3 is connected to a PCI-X IO Box boat, while IOU#0-PCI#4 is connected to a PCI-Express IO Box boat.
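
Once you know the AP ID, the usual cfgadm hot plug operations apply. For example, to take the fibre channel card in the PCI-X boat offline and power down its slot before removing it, you would do something like this (a sketch using one of the AP IDs above):

    # cfgadm -c unconfigure iou#0-pci#3:iobX00FC.pci2
    # cfgadm -c disconnect iou#0-pci#3:iobX00FC.pci2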

Tuesday Jul 03, 2007

Building a better IO Box

One of the most unique features of the Sun External I/O Expansion Unit, or IO Box, is the way it is managed.

We knew customers would not want an unmanaged, "black box" for external I/O. The system administrator needs access to the status of the IO Box. They also need to know when the IO Box fails, and why it is failing (and how that failure is affecting the rest of the system). And being able to light "locator" indicators to find one of hundreds of IO Boxes located remotely from the host server was critical for service. Providing a managed IO Box, with fault and status information available using standard protocols, and LEDs controlled by the host, is essential to provide a highly available system.

On the other hand, we didn't want the IO Box to be another system that the customer had to manage. The IO Box does not have a service processor; it doesn't have its own MIB; it doesn't need software upgrades; it doesn't have an ethernet port.

The IO Box is just a bunch of I/O slots available to the host; it shouldn't matter if they are located in the same chassis as, or several meters (or several dozen meters) away from, the CPUs and memory.

The IO Box is fully managed, as a part of the host, reporting status and faults back to the host to which it is connected. The host SP includes IO Box status as part of the overall system status. The IO Box does not require any special cables between the host and the box, other than the PCI-Express cables. Separate cabling would introduce the risk that a customer could cross-wire a box to the wrong host. And we wanted to make the wiring as simple as possible.

This is accomplished by using extra signals in the PCI-Express cable to implement a management connection between the host service processor (SP) and the IO Box. The standard PCI-Express connector has two pins for SMBus: SMCLK (B5) and SMDAT (B6). Using those two pins, the host (SP) is able to talk to a "slave" microcontroller on the link card. The microcontroller then uses spare signals in the cable to communicate in a reliable fashion with the microcontroller on the link card in the IO Box to forward requests from the host SP. The link card in the IO Box then uses SMBus on those same two pins B5 and B6, this time as "master", to access devices in the IO Box, reading or writing any device that the host SP requests. In effect, the I2C devices in the IO Box become "local" to the host service processor, even though the IO Box itself is remote. The microcontrollers on the link cards act as proxies.

With this arrangement, the host SP can retrieve environmental information about the IO Box: temperatures, fan speeds, voltages, currents, switch positions, etc. In addition, the host SP can control the IO Box, turning off power to a power supply unit so it can be removed, lighting the "locator" indicator so the IO Box can be found in the datacenter, etc. And when the IO Box experiences an error, the host SP can gather error information, and factor it into other host-detected errors to diagnose the fault.

IO Box management is entirely optional -- the IO Box as a standalone unit can function with no host management. But management by the host SP provides an added dimension of availability and serviceability which is not found on low-end I/O expansion units.

Friday Jun 29, 2007

IO Box Has Shipped!

As I mentioned in my blog XCP 1041 Now Available, Sun SPARC Enterprise M-class servers support an External I/O Expansion Unit, or IO Box. This week, IO Box started shipping to customers! IO Box is one of my pet projects.

The IO Box addresses a critical problem with previous generations of enterprise servers: the I/O-to-CPU ratio was too low for some customer applications. On a Sun Fire 25K fully populated with I/O boards and CPU boards, there are 72 PCI slots, which is plenty for most customers. But with 72 CPU sockets filled with dual-core UltraSPARC IV CPUs, that yields a PCI-slot-to-CPU-core ratio of 1/2 -- one PCI slot for every two CPU cores. The SPARC Enterprise M9000-64 has 128 PCI-Express slots, and while that may seem like a lot, with 64 dual-core SPARC64-VI CPU chips, that's still just one PCI-Express slot per core (and it will get worse as we pack more cores into each CPU chip). Some customers really care about I/O, and a higher I/O-to-CPU ratio is important.

The IO Box allows you to connect one PCI-Express slot in the host to an IO Boat, which has six additional, hot-plug-capable PCI-Express or PCI-X slots. Each IO Box can support up to two IO Boats (either PCI-Express or PCI-X), independently connected to the same host. The host-to-box link can either be copper (low cost, but short and bulky, so only really applicable if the server and the IO Box are in the same cabinet) or fibre optic (higher cost, but with 25m cable lengths you can locate the IO Boxes together in a separate cabinet from your servers).

There are other IO expansion units on the market, but the Sun version is really designed for the enterprise-class environment, with features like:

  • Fully redundant, hot-swappable power supplies.
  • Support for Sun's indicator standard, with LEDs for locating field replaceable units (FRUs), showing FRU power state, identifying FRUs that are ready to remove, and faults.
  • Ability to monitor IO Box internal voltages, currents, temperatures, LEDs, power state and switch settings from the host's Service Processor.
  • Host's ability to detect and diagnose IO Box faults seamlessly with other errors and faults in the host.
In short, the IO Box is less a peripheral and more a part of the system; it just happens to be located several meters away. Disconnecting compute power and I/O only makes sense if you reconnect them virtually, so there's only one system to manage.

While the IO Box is initially supported on the M-class servers, its history predates the Sun/Fujitsu APL agreements, and it will almost certainly be supported on other Sun products in the future. I've got some neat things I'd like to share about the IO Box in future posts...

Wednesday May 16, 2007

DSCP: Domain to Service Processor Communication Protocol

The Sun SPARC Enterprise M-series servers feature a new approach to Solaris/Service Processor communication.

Shared memory has long been a common method for inter-processor communication. Back in the 1980's I worked on embedded systems that used shared RAM to allow a host processor to communicate with digital signal processors (DSPs). The shared RAM was partitioned into mailboxes, with a pair of mailboxes per "application". The host processor would place a command in the incoming mailbox for a given DSP application and signal an interrupt to that DSP. The DSP interrupt service routine would check the mailboxes, find the new command, give it to the application for processing, and place the response in the outgoing mailbox, then interrupt the host processor.

Sun Fire 6800/6900: Shared RAM Mailbox

Years later, when I came to work at Sun I found the same approach was used to allow the embedded service processor (aka, system controller) to send commands to Solaris and get responses back. For example, the Sun Fire 6800/6900 family of servers used this approach. Each domain (a group of UltraSPARC CPUs running a single instance of Solaris) had a separate shared memory with the service processor (SP). The interface between the SP and one Solaris domain looked something like this:


        SP                      Domain(s)

    Application(s)            Application(s)
          |                         |
        ioctls                    ioctls
          |                         |
    Mailbox Driver            Mailbox Driver
          |                         |
          +------ Shared RAM -------+
           (hardware, plus interrupts)

While the architecture is efficient and effective, one weakness is that it doesn't scale with applications. As new applications are introduced (for example, Dynamic Reconfiguration or Fault Management), a new mailbox needs to be carved out and a new protocol invented.

Sun Fire 15K/25K: Internal Ethernet

The Sun Fire 15000/25000 servers improved on the basic mailbox approach by running a separate Ethernet connection from the SP to each and every domain (actually, to every expander). The mailbox was still used for low-level operations (running POST, booting to OpenBoot, simulating a serial console, etc.). But the Ethernet connection, called the Maintenance Area Network, was used for application-level communication. This allowed applications such as Dynamic Reconfiguration to be developed using standard APIs such as sockets and multiplexed using TCP/IP, which helped ease their development. When a new application was rolled out, we didn't need to carve out a new mailbox; we could just use a different TCP port. The one downside is the added cost and complexity of having a separate Ethernet subnet within the Sun Fire 15K chassis.

Sun SPARC Enterprise Approach

For the Sun SPARC Enterprise M-series, we decided the application scalability of the Maintenance Area Network was a benefit worth keeping, but we wanted to achieve it without having to run a separate Ethernet network. The result is what we call DSCP -- the Domain to Service Processor Communication Protocol. DSCP provides IP communication between the SP and the domain without any new hardware: it uses a single mailbox in the shared RAM, and a pseudo serial driver layered on top of that mailbox lets us run PPP (the Point-to-Point Protocol). The DSCP stack looks like this:

        SP                      Domain(s)

    Application(s)            Application(s)
          |                         |
      Sockets API              Sockets API
          |                         |
         ipv4                      ipv4
          |                         |
          ppp                       ppp
          |                         |
      tty Driver              tty Driver (dm2s)
          |                         |
    Mailbox Driver        Mailbox Driver (scfd)
          |                         |
          +------ Shared RAM -------+
           (hardware, plus interrupts)

Configuring DSCP

The Sun SPARC Enterprise Server Administration Guide explains how to set up DSCP, but it is really quite simple. The easiest method is using the syntax:
    setdscp -i NETWORK -m NETMASK
Choose a network address (be sure to pick a subnet that is not in use at your facility) and the corresponding netmask, and setdscp will do the rest. For example, in my lab the subnet 192.168.224.0 is unused, so I do:
    XSCF> setdscp -i 192.168.224.0 -m 255.255.255.0
There are other ways to set up the DSCP network addresses, but this is really the best approach.

setdscp will assign an IP address to the SP and reserve one IP address for every possible domain (the M9000-64 supports 24 domains, so a maximum of 25 IP addresses are reserved). A common question is: if you're running PPP between the SP and each domain, don't you need two addresses per link, one for the domain and one for the SP? No, not really. Since routing is done based on the destination address, we can get away with using the same IP address for the SP end of every PPP link. So technically speaking, the NETWORK and NETMASK do not define a DSCP subnet; they define a range of IP addresses from which DSCP selects endpoint addresses. A subtle difference, but still a difference.

On the SP, showdscp will display the IP addresses assigned to each domain and the SP, for example:

    XSCF> showdscp

    DSCP Configuration:

    Network: 192.168.224.0
    Netmask: 255.255.255.0

     Location     Address
    ----------   ---------
    XSCF         192.168.224.1
    Domain #00   192.168.224.2
    Domain #01   192.168.224.3
    Domain #02   192.168.224.4
    Domain #03   192.168.224.5
In Solaris, the prtdscp(1M) command will display the IP address of that domain and the SP (prtdscp is located in /usr/platform/SUNW,SPARC-Enterprise/sbin). You can get the same basic information from ifconfig sppp0:
    % /usr/platform/SUNW,SPARC-Enterprise/sbin/prtdscp
    Domain Address: 192.168.224.2
    SP Address: 192.168.224.1

    % ifconfig sppp0
    sppp0: flags=10010008d1 mtu 1500 index 3
            inet 192.168.224.2 --> 192.168.224.1 netmask ffffff00

Benefits

Plumbing IP between the Solaris domain and the SP brings the obvious benefit of standards-based communication -- networking applications "just work". For example, you can configure the SP as an NTP server and configure the Solaris domains to use NTP to synchronize their time with the SP, all using the internal DSCP network. You can even use ssh to connect to the SP from a Solaris domain using the DSCP network. Since the SP does not have a hostname on the DSCP network, you need to get the IP address using prtdscp, for example
    ssh `/usr/platform/SUNW,SPARC-Enterprise/sbin/prtdscp -s`
Personally, I create an alias sshsp with the above line.
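
Setting up that alias is a one-liner; in ksh or bash it would look something like this (adjust the quoting for your shell of choice):

    alias sshsp='ssh `/usr/platform/SUNW,SPARC-Enterprise/sbin/prtdscp -s`'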

On the SP side, you can't use ssh or scp directly -- they're not available in the XSCF shell. But you can use them indirectly. You can configure log archiving (see the setarchiving man page) to use one of the domains as an archive host:

    XSCF> setarchiving -t rjh@192.168.224.2:/home/rjh/archive
    XSCF> setarchiving enable
[I'm not sure it makes sense to use a domain as a log archive host -- a catastrophic failure with the system means you also lose your log archive host -- but it is technically possible.]

And when you need to take a snapshot of the system for diagnosis purposes (see snapshot man page), you can specify one of the domains as the snapshot host using the -t option, for example:

    XSCF> snapshot -l -t rjh@192.168.224.2:/home/rjh/snap
    Downloading Public Key from '192.168.224.2'...
    Public Key Fingerprint: 44:9a:ad:55:2e:33:99:2e:fd:b7:47:74:de:ad:be:ef
    Accept this public key (yes/no)? yes
    Enter ssh password for user 'rjh' on host '192.168.224.2':
    Setting up ssh connection to rjh@192.168.224.2...
    Collecting data into rjh@192.168.224.2:/home/rjh/snap/mymachine_10.4.55.144_2007-05-07T19-39-40.zip
    Data collection complete
If your domain has internet access or a DVD burner, this might be the easiest way to get a snapshot back to a Sun Service Engineer.

Security

One of the most important security goals of the DSCP design was this: ensure that if one Solaris domain is compromised, an attacker cannot affect the SP or another domain in the same chassis. This primary security requirement drove most of the DSCP design approach.

Using PPP provides an added security benefit. Each shared RAM mailbox represents a single PPP connection between Solaris and the SP. This means there is no opportunity for one domain to snoop the traffic between another domain and the SP, and no way for one domain to directly attack another domain using the DSCP network. There is also no routing between DSCP networks (or from DSCP to Ethernet, or vice versa) on the SP. The communication paths of each domain are physically isolated.

Most of the protocols used on the Sun SPARC Enterprise servers place the client on the SP and the server on the domain. This means that the SP does not need to open up well-known ports for incoming connections, reducing the opportunity for attacks. Furthermore, the servers running in Solaris use IPsec to authenticate that incoming connections are coming from the SP.

To prevent the domain from attacking the SP, several methods are used. First, all of the authentication and authorization protocols employed for Ethernet users are in place for the DSCP networks. There is no DSCP "back door", so to speak. Further, the SP employs a firewall that blocks all the ports on the DSCP networks except a couple -- ssh and ntp. There are additional features in place, for example, bandwidth limiting to prevent denial-of-service attacks.

Summary

The Domain to Service Processor Communication Protocol enables IP-based communication between the Solaris domain and the SP, in a secure fashion, which enables standards-compliant applications such as ssh and ntp to "just work" between the SP and Solaris domains.