I ran into an interesting problem when installing Oracle RAC in a system that included SPARC Enterprise M5000 Servers and a Sun ZFS Storage 7x20 Appliance, illustrated in the following diagram:
When configuring the shared storage for Oracle RAC, you may decide to use NFS for Data Files. In this case, you must set up the NFS mounts on the storage appliance to allow root access from all of the RAC clients. This allows files created by the RAC nodes to be owned by root on the mounted NFS filesystems, rather than by an anonymous user, which is the default behavior.
In a default configuration, a Solaris NFS server maps "root" access to "nobody". This can be overridden as stated on the share_nfs(1M) man page:
Only root users from the hosts specified in access_list will have root access... By default, no host has root access, so root users are mapped to an anonymous user ID...
Example: The following will give root read-write permissions to hostb:

share -F nfs -o ro=hosta,rw=hostb,root=hostb /var
The Sun ZFS Storage 7x20 Appliance features a browser user interface (BUI), "a graphical tool for administration of the appliance. The BUI provides an intuitive environment for administration tasks, visualizing concepts, and analyzing performance data." The following clicks can be used to allow root on the clients mounting to the storage to be considered root:
Go to the "Shares" page
Repeat steps 3-8 for each RAC node. Repeat steps 1-8 for every share that will be used for RAC shared storage.
More intuitive readers, after reviewing the network diagram and the screenshot of the S7420 NFS exceptions screen, may immediately observe that it was a mistake to enter the hostnames of the RAC nodes associated with the gigabit WAN network. In hindsight, this was an obvious mistake, but at the time I was entering the data, I simply entered the names of the machines; it did not strike me as a "trick question".
The next step is to configure the RAC nodes as NFS clients by mounting the shares that were set up on the Sun ZFS Storage Appliance. On each RAC node, update the /etc/vfstab file with an entry similar to the following:
nfs_server:/vol/DATA/oradata - /u02/oradata nfs - yes rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,forcedirectio,vers=3,suid
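Before attempting the mount, it can help to sanity-check the shape of the new entry; a Solaris vfstab line has exactly seven whitespace-separated fields (device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, mount options). A quick sketch, using the illustrative server name and paths from the entry above:

```shell
# Sanity-check a candidate vfstab entry: a well-formed Solaris vfstab
# line word-splits into exactly seven fields.  Server name and paths
# here are the illustrative ones from the example entry.
entry='nfs_server:/vol/DATA/oradata - /u02/oradata nfs - yes rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,forcedirectio,vers=3,suid'
set -- $entry
echo "fields: $#"    # a malformed entry will not show 7 here
```

A stray space inside the option string (a common copy-paste artifact) will show up immediately as an eighth field.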
Here's a tip of the hat to Pythian's "Installing 11gR2 Grid Infrastructure in 5 Easy Lessons":
Lesson #3: Grid is very picky and somewhat uninformative about its NFS support
Like an annoying girlfriend, the installer seems to say “Why should I tell you what’s the problem? If you really loved me, you’d know what you did wrong!”
You need to trace the installer to find out what exactly it doesn’t like about your configuration.
Running the installer normally, the error message is:
[FATAL] [INS-41321] Invalid Oracle Cluster Registry (OCR) location.
CAUSE: The installer detects that the storage type of the location (/cmsstgdb/crs/ocr/ocr1) is not supported for Oracle Cluster Registry.
ACTION: Provide a supported storage location for the Oracle Cluster Registry.
OK, so Oracle says the storage is not supported, but I know that ... NFS is supported just fine. This means I used the wrong parameters for the NFS mounts. But when I check my vfstab and /etc/mnttab, everything looks A-OK. Can Oracle tell me what exactly bothers it?
It can, if you run the silent install with the following flags added to the command line:
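The Pythian article traces the OUI by passing JVM system properties through with -J. A hedged sketch of the invocation (the response-file path is a placeholder, not from the original post):

```shell
# Hedged sketch: enable Oracle Universal Installer tracing, per the
# Pythian article referenced above.  -J passes a system property to the
# installer's JVM; the response-file path is a placeholder for your own.
./runInstaller -silent -responseFile /path/to/grid.rsp \
    -J-DTRACING.ENABLED=true \
    -J-DTRACING.LEVEL=2
```

With tracing on, the installer logs the exact NFS mount options it inspects, which is what finally exposes an unsupported option string.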
If you get past this stage, it is clear sailing until you run "root.sh" near the end of the Grid installation; that is the stage that will fail if the root user's files are mapped to anonymous.
So now, I will finally get to the piece of the puzzle that I found to be perplexing. Remember that in my configuration (see diagram, above) each RAC node has two potential paths to the Sun ZFS Storage Appliance: one via the router that is connected to the corporate WAN, and one via the private 10 gigabit storage network. When I accessed the NAS storage via the storage network, root was always mapped to nobody, despite my best efforts. While trying to debug, I discovered that when I accessed the NAS storage via the corporate WAN network, root was mapped to root:
# ping -s s7420-10g0 1 1
PING s7420-10g0: 1 data bytes
9 bytes from s7420-10g0 (192.168.42.15): icmp_seq=0.
# ping -s s7420-wan 1 1
PING s7420-wan: 1 data bytes
9 bytes from s7420-wan (10.1.1.15): icmp_seq=0.
# nfsstat -m
/S7420/OraData_WAN from s7420-wan:/export/OraData
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
/S7420/OraData_10gbe from s7420-10g0:/export/OraData
Attr cache: acregmin=3,acregmax=60,acdirmin=30,acdirmax=60
# touch /S7420/OraData_10gbe/foo1
# touch /S7420/OraData_WAN/foo2
# touch /net/s7420-10g0/export/OraData/foo3
# touch /net/s7420-wan/export/OraData/foo4
# touch /net/192.168.42.15/export/OraData/foo5
# touch /net/10.1.1.15/export/OraData/foo6
# ls -l /S7420/OraData_10gbe/foo*
-rw-r--r-- 1 nobody nobody 0 Sep 20 12:54 /S7420/OraData_10gbe/foo1
-rw-r--r-- 1 root root 0 Sep 20 12:54 /S7420/OraData_10gbe/foo2
-rw-r--r-- 1 nobody nobody 0 Sep 20 12:55 /S7420/OraData_10gbe/foo3
-rw-r--r-- 1 root root 0 Sep 20 12:56 /S7420/OraData_10gbe/foo4
-rw-r--r-- 1 nobody nobody 0 Sep 20 12:58 /S7420/OraData_10gbe/foo5
-rw-r--r-- 1 root root 0 Sep 20 13:04 /S7420/OraData_10gbe/foo6
Having discovered that root was mapped to nobody over the storage network but mapped to root over the corporate WAN, I investigated the mounts on the S7420:
# ssh osm04 -l root
Last login: Thu Sep 15 22:46:46 2011 from 192.168.42.11
Executing shell commands may invalidate your service contract. Continue? (Y/N)
Executing raw shell; "exit" to return to appliance shell ...
| You are entering the operating system shell. By confirming this action in |
| the appliance shell you have agreed that THIS ACTION MAY VOID ANY SUPPORT |
s7420# showmount -a | grep OraData
When I saw the "showmount" output, the lightbulb in my brain turned on and I understood the problem: I had entered the node names associated with the WAN, rather than the node names associated with the private storage network. When NFS packets arrived from the corporate WAN, the S7420 used DNS to resolve the WAN IP addresses into the WAN hostnames, which matched the hostnames that I had entered into the S7420 NFS Exceptions form. In contrast, when NFS packets arrived from the 10 gigabit private storage network, the system was not able to resolve the IP addresses into hostnames, because the private storage network addresses did not exist in DNS. Even if the name resolution had been successful, it would have been necessary to enter the node names associated with the private storage network into the S7420 NFS Exceptions form.
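This failure mode can be checked up front from any host that shares the appliance's name service: try to reverse-resolve a client's storage-network address. A small sketch, using 192.168.42.11 (a RAC node's private storage-network IP in this setup):

```shell
# If the storage-network IP of a client cannot be reverse-resolved to a
# hostname, the NFS exception's hostname match can never succeed.
# 192.168.42.11 is a RAC node's storage-network address in this setup.
out=$(getent hosts 192.168.42.11 || echo "no reverse mapping: root= hostname match will fail")
echo "$out"
```

Empty DNS output here is exactly the condition that silently turned root into nobody on the storage path.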
Several solutions spring to mind:

1. On a typical Solaris NFS server, I would have enabled name resolution of the 10 gigabit private storage network addresses by adding entries to /etc/hosts, and used those node names for the NFS root access. This was not possible because on the appliance, /etc is mounted as read-only.

2. It occurred to me to enter the IP addresses into the S7420 NFS exceptions form, but the BUI would only accept hostnames.

3. One potential solution is to put the private 10 gigabit IP addresses into the corporate DNS server.

4. Instead, I chose to give root read-write permissions to all clients on the 10 gigabit private storage network:
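On the appliance this is done through the BUI. For comparison, on a stock Solaris NFS server the same grant can be expressed with share_nfs(1M)'s "@network" access-list form; a hedged sketch, assuming the 192.168.42.0/24 storage subnet visible in the transcripts above:

```shell
# Hedged equivalent on a plain Solaris NFS server (not the appliance BUI):
# share_nfs(1M) access lists accept "@network" entries, so read-write and
# root access can be granted to every client on the private storage subnet.
# The subnet 192.168.42.0/24 is inferred from the addresses shown earlier.
share -F nfs -o rw=@192.168.42.0/24,root=@192.168.42.0/24 /export/OraData
```

The network form avoids name resolution entirely, which is why it works where hostname-based exceptions failed.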
Now, the RAC installation will be able to complete successfully with RAC nodes accessing the Sun ZFS Storage 7x20 Appliance via the private 10 gigabit storage network.