Saturday Apr 23, 2016

Solaris 11: Few Random Commands

Logical Domains
Dependencies & Dependents
Network & Power Consumption Stats
HBA Listing

Dependencies

In the following example, dummy domain depends on two domains -- primary domain for virtual disks and virtual network devices, and dom2 for virtual disks and SR-IOV virtual functions (VF).

# ldm list-dependencies dummy
DOMAIN            DEPENDENCY        TYPE
dummy             primary           VDISK,VNET
                  dom2              VDISK,IOV

-l option displays the actual devices.

Dependents

-r option in list-dependencies sub-command shows the dependents for the logical domain(s). -l option displays the actual devices in here too.

eg.,

# ldm list-dependencies -r -l dom2
DOMAIN            DEPENDENT         TYPE           DEVICE
dom2              dummy             VDISK          primary-vds0/vdisk0:vol_dummy_disk0
                                    VDISK          service-vds0/vdisk0:vol_dummy_disk0
                                    IOV            /SYS/IOU1/PCIE14/IOVIB.PF0.VF1
                                    IOV            /SYS/IOU1/EMS4/CARD/NET0/IOVNET.PF0.VF1

Network Statistics

list-netstat sub-command in ldm utility displays the statistics for all the network devices that are configured in the domain(s).

eg.,

# ldm ls-netstat dom2
DOMAIN
dom2

NAME               IPACKETS     RBYTES       OPACKETS     OBYTES
----               --------     ------       --------     ------
net4               444.42M      61.27G       444.42M      7.12G
net1               13.20M       4989.22M     0            0
net0               0            0            0            0
..
ldoms-net0.vf1     37.62M       6.44G        49.15K       3.97M
ldoms-net0.vf0     523.34K      90.15M       62.85K       14.67M
ldoms-net1.vf15    0            0            0            0
..

List HBAs

list-hba sub-command in ldm utility lists out the physical SCSI HBA initiator ports in the target domain. -t option shows SCSI transport medium type such as FC or SAS.

eg.,

# ldm ls-hba -t  primary
NAME                                                 VSAN
----                                                 ----
/SYS/IOU0/EMS1/CARD/SCSI/HBA0/PORT1
        init-port w5080020000016331
        Transport Protocol SAS
/SYS/IOU0/EMS1/CARD/SCSI/HBA0/PORT2
        init-port w5080020000016331
        Transport Protocol SAS
...

Check the latest man page for ldm(1M) for the complete list of options for those sub-commands.

Power Consumption Statistics

ldmpower command shows the breakdown of power consumed by processors and memory in different domains or in a given domain. Note that ldmpower is an independent command, not part of ldm utility.

eg.,

# ldmpower -c processors -c memory -l primary
Processor Power Consumption in Watts
DOMAIN          15_SEC_AVG      30_SEC_AVG      60_SEC_AVG
primary         313             315             317

Memory Power Consumption in Watts
DOMAIN          15_SEC_AVG      30_SEC_AVG      60_SEC_AVG
primary         594             595             595

As usual, check the man page ldmpower(1M) for details.


PCI Devices

Scan & List

scanpci utility scans PCI buses and reports configuration settings for each PCI device that it finds.

eg.,

# scanpci -v

pci bus 0x0002 cardnum 0x06 function 0x00: vendor 0x111d device 0x80ba
 Integrated Device Technology, Inc. [IDT] Device unknown
 CardVendor 0x0000 card 0x0000 (Card unknown)
  STATUS    0x0010  COMMAND 0x0147
  CLASS     0x06 0x04 0x00  REVISION 0x03
  BIST      0x00  HEADER 0x01  LATENCY 0x00  CACHE 0x10
  MAX_LAT   0x00  MIN_GNT 0x03  INT_PIN 0x00  INT_LINE 0x00
  Bus: primary=02, secondary=03, subordinate=72, sec-latency=0
  I/O behind bridge: 00000000-03ffffff
  Memory behind bridge: 00100000-2000ffff
  Prefetchable memory behind bridge: 100000000-777f0ffff

pci bus 0x0003 cardnum 0x00 function 0x00: vendor 0x8086 device 0x1521
 Intel Corporation I350 Gigabit Network Connection
 CardVendor 0x108e card 0x7b18 (Oracle/SUN, Quad Port GbE PCIe 2.0 Low Profile Adapter, UTP)
  STATUS    0x0010  COMMAND 0x0146
  CLASS     0x02 0x00 0x00  REVISION 0x01
  BIST      0x00  HEADER 0x80  LATENCY 0x00  CACHE 0x10
  BASE0     0x00100000 SIZE 1048576  MEM
  BASE3     0x00200000 SIZE 16384  MEM
  BASEROM   0x00280000 SIZE 524288
  MAX_LAT   0x00  MIN_GNT 0x00  INT_PIN 0x01  INT_LINE 0x00

...

SRU
IDRs

Detect & Show Installed

Identifying Solaris 11 SRU level is one of the popular topics and it was covered in Solaris 11 documentation, My Oracle Support (MOS) and elsewhere by numerous bloggers. After all this, we still have to wonder why there isn't a conscious effort by the development team that owns pkg utility to make this a bit easier for normal human beings. In any case, pkg list entire or pkg info entire commands show a cryptic version string with a bunch of information encoded including the SRU level.

Treating the leading "0.175" as the first field of the "branch" version, the second field represents the Solaris update whereas the third field is the SRU.

eg.,

In the following example, Solaris update is 3 (running Solaris 11 - hence it is a Solaris 11.3 system) and current SRU is 5.

# pkg info entire | grep -i branch
        Branch: 0.175.3.5.0.6.0
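
The same fields can be picked out programmatically. Here is a minimal sketch in C, assuming the branch string always has the 0.175.<update>.<SRU>... layout shown above. The function name parse_branch() is made up for illustration and is not part of any Solaris API.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the Solaris update and SRU numbers from a
 * branch string such as "0.175.3.5.0.6.0".
 * Returns 0 on success, -1 if the string does not have the assumed layout. */
int parse_branch(const char *branch, int *update, int *sru)
{
        /* Skip the first two numbers ("0" and "175"), keep the next two. */
        if (sscanf(branch, "%*d.%*d.%d.%d", update, sru) != 2)
                return -1;
        return 0;
}
```

Calling parse_branch("0.175.3.5.0.6.0", &update, &sru) stores 3 in update and 5 in sru.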

IDRs

IDRs (Interim Diagnostic Reliefs) are in reality package updates whose names usually start with "idr" - therefore, just checking for the string "idr" in the list of installed packages on the system is sufficient to list out what IDRs were installed on the target system.

# pkg list idr*
pkg list: No packages matching 'idr*' installed <-- no IDRs installed


Saturday Mar 26, 2016

Programming in C: Few Tidbits #6

[1] Case-insensitive String Comparison

strcasecmp() is the case-insensitive version of strcmp(). When building, make sure to include strings.h (note the plural form).

strncasecmp() is the counterpart to strncmp().
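
A small sketch of strncasecmp() in use: a case-insensitive prefix test. The helper name starts_with_nocase() is hypothetical, chosen only for this illustration.

```c
#include <string.h>
#include <strings.h>    /* strncasecmp() */

/* Hypothetical helper: returns 1 if s begins with prefix, ignoring case,
 * and 0 otherwise. Only the first strlen(prefix) characters are compared. */
int starts_with_nocase(const char *s, const char *prefix)
{
        return strncasecmp(s, prefix, strlen(prefix)) == 0;
}
```

For instance, starts_with_nocase("NeXTSTEP", "next") returns 1 whereas starts_with_nocase("nest", "next") returns 0.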

eg.,
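..
#include <stdio.h>
#include <strings.h>    /* strcasecmp(), strncasecmp() */

int main(int argc, char** argv)
        ..
        printf("\n\"%s\" and \"%s\" are ", argv[1], argv[2]);
        /* strcasecmp() returns 0 when the strings match, ignoring case */
        strcasecmp(argv[1], argv[2]) ? printf(" .. not identical") : printf(" .. identical");
        ..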

% ./strcompare next NeXT
"next" and "NeXT" are  .. identical

[2] Initializing a Variable Length Array

Since the length of a variable length array is not known to the compiler at compile time, initializing a variable length automatic array with some default value usually results in a compilation error.

eg.,
% cat -n varlen.c
     1  #include <stdlib.h>
     2
     3  void main(int argc, char** argv) {
     4          int size=atoi(argv[1]);
     5          int array[size] = { 0 };
     6  }
% cc -o varlen varlen.c
"varlen.c", line 5: variable length array can not be initialized: array
cc: acomp failed for varlen.c

One option is to explicitly initialize each element in the array to the desired default value.

eg.,
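% cat varlen.c
#include <stdlib.h>

int main(int argc, char** argv) {
        int size = (argc > 1) ? atoi(argv[1]) : 0;
        if (size <= 0) return 1;        /* a VLA needs a positive length */
        int array[size];
        /* a VLA cannot take an initializer, so assign each element */
        for (int i = 0; i < size; ++i)
                array[i] = 0;
        return 0;
}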

% cc -o varlen varlen.c
%

Another option is to rely on memset().

eg.,
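% cat varlen.c
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
        int size = (argc > 1) ? atoi(argv[1]) : 0;
        if (size <= 0) return 1;        /* a VLA needs a positive length */
        int array[size];
        /* zero every byte of the array in one call */
        memset(array, 0, sizeof(array));
        return 0;
}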

% cc -o varlen varlen.c
%

In this case, the sizeof operator evaluates its operand (the variable length array) at run time, so sizeof(array) yields size * sizeof(int) bytes.
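
To see the run-time size for yourself, here is a minimal sketch. The function name vla_bytes() is hypothetical. Keep in mind that memset() fills bytes, not elements, so it suits a default value of 0 but not, say, 1.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: creates a variable length array of n ints (n must
 * be greater than 0), zero-fills it and returns its size in bytes. */
size_t vla_bytes(int n)
{
        int array[n];

        memset(array, 0, sizeof(array));
        /* sizeof is computed at run time here because n is not a constant */
        return sizeof(array);
}
```

vla_bytes(5) returns 5 * sizeof(int), which is 20 on systems where int is 4 bytes wide.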


[3] Check if a Process is Alive

kill() function can be used to send a null signal (signal 0) to check the validity of a process.

eg.,
% cat chkproc.c
#include <sys/types.h>
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv) {
        if (argc < 2) return 1;
        pid_t pid = atoi( argv[1] );
        /* signal 0 performs the error checking only; nothing is delivered */
        int rc = kill( pid, 0 );
        if ( !rc ) {
                printf("process with pid ( %d ) is alive", (int) pid);
        } else {
                printf("process with pid ( %d ) is not found", (int) pid);
        }
        return 0;
}

% cc -o chkproc chkproc.c

% ps
  PID TTY         TIME CMD
19846 pts/25      0:00 ps
28978 pts/25      0:00 bash

% ./chkproc 28978
process with pid ( 28978 ) is alive

% ./chkproc 28979
process with pid ( 28979 ) is not found

pid_t is a signed integer type, commonly defined as int.
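
One caveat: kill() also fails when the process exists but belongs to another user, in which case errno is set to EPERM rather than ESRCH. A minimal sketch that tells those cases apart follows. The function name process_exists() is hypothetical.

```c
#include <sys/types.h>
#include <signal.h>
#include <errno.h>

/* Hypothetical helper: returns 1 if a process with this pid exists,
 * 0 if it does not, and -1 on any other error. */
int process_exists(pid_t pid)
{
        if (kill(pid, 0) == 0)
                return 1;       /* exists, and we are allowed to signal it */
        if (errno == EPERM)
                return 1;       /* exists, but we lack permission */
        if (errno == ESRCH)
                return 0;       /* no such process */
        return -1;              /* unexpected error such as EINVAL */
}
```

Pass a positive pid only; zero and negative values make kill() address process groups instead of a single process.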


[4] Evaluating "( integer )" Expression

This is a simple one but probably an important one to remember. When an expression is tested as a condition, only a value that compares equal to zero evaluates to false: integer 0, floating point 0.0, the NUL character ('\0') and a null pointer. Everything else, including negative numbers, non-zero floating point numbers, other characters and non-null pointers such as string literals, evaluates to true. In other words, anything non-zero is true.

eg.,
% cat ifexpr.c
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

int main(int argc, char **argv) {
        printf("\n[ %10d ] : %s",  0, (0) ? "true" : "false" );
        printf("\n[ %10d ] : %s",  1, (1) ? "true" : "false" );
        printf("\n[ %10d ] : %s",  -2, (-2) ? "true" : "false" );
        printf("\n[ %10s ] : %s",  "dummy", ("dummy") ? "true" : "false" );
        printf("\n[ %10c ] : %s",  'C', ('C') ? "true" : "false" );
        printf("\n[ %10f ] : %s",  2.34, (2.34) ? "true" : "false" );
        printf("\n[ %10f ] : %s\n",  0.0000, (0.0000) ? "true" : "false" );
        return 0;
}

% cc -o ifexpr ifexpr.c
% ./ifexpr

[          0 ] : false
[          1 ] : true
[         -2 ] : true
[      dummy ] : true
[          C ] : true
[   2.340000 ] : true
[   0.000000 ] : false

[5] Variable Declaration in a switch { case: } Statement

Let's start with an example that shows compilation failure.

% cat -n varincase.c
     1  #include <stdio.h>
     2
     3  void main() {
     4          switch ( 1 ) {
     5                  case 2:
     6                          int i = 15;
     7                          break;
     8                  default:
     9                          printf("default\n");
    10          }
    11  }

% cc -o varincase varincase.c
"varincase.c", line 6: syntax error before or at: int
"varincase.c", line 7: undefined symbol: i
cc: acomp failed for varincase.c

In this example, identifier i is declared and initialized right after the case label in the switch statement. Since it has no linkage and no static storage class specifier, identifier i has automatic storage duration and its scope is the enclosing block (switch() { .. }, that is).

The problem with this code is that case <expression> and default are labels, and the controlling expression in the switch statement causes control to jump to the appropriate label. In C (before C23), a label must be followed by a statement, and a declaration is not a statement, which is what the syntax error on line 6 reports. In addition, when control is transferred to the default label, it enters the scope of identifier i while bypassing its initialization [as it skips the case where the identifier is declared and initialized]; C++ rejects such a jump, and in C the value of i would be indeterminate. Therefore the fix is to declare the identifier (if at all needed) in its own compound statement so that its scope is limited to that compound statement. (A compound statement is a block of code.)

Adding curly braces around case 2: statements will make the code compile and run as shown below.

% cat varincase.c
#include <stdio.h>

int main() {
        switch ( 1 ) {
                case 2:
                {
                        int i = 15;
                        break;
                }
                default:
                        printf("default\n");
        }
        return 0;
}

% cc -o varincase varincase.c

% ./varincase
default

PS:
Legible version of this post @http://technopark02.blogspot.com/2016/03/programming-in-c-few-tidbits-6.html

Wednesday Dec 30, 2015

[Solaris] Memory Blacklisting, Duplicate IP Address & Recovery, Group Package Installations, ..

-1-


Memory blacklist operation

To check if memory blacklist operation by LDoms Manager (ldm) is in progress, run:

echo "zeus ::print -a zeus_t mem_blacklist_inprog" | mdb -p `pgrep ldmd`

If no blacklist operation is in progress, the above may return output that is similar to:

<hex-address> mem_blacklist_inprog = 0 (false)

When a memory blacklist operation is in progress, the above may return output that is similar to:

<hex-address> mem_blacklist_inprog = 0x1 (true)

In such a situation, any attempt to run ldm commands related to memory management may fail with error:

A memory blacklist operation is being processed. Other memory operations are disabled until it completes

Sometimes a power cycle may clear the blacklist operation. In not-so-lucky situations, the affected DIMM(s) may have to be serviced.

(Credit: Eunice M.)

-2-

Duplicate IP Address & Recovery

If two nodes [running Solaris] on a network share the same IP address, Solaris kernel detects the duplicate address, marks it as duplicate and eventually disables and turns off the IP interface if the problem persists. These actions are typically recorded in the system log with warnings such as the following.

eg.,
Dec 23 16:46:18 some-host ip: [ID 759807 kern.warning] WARNING: net0 has duplicate address xx.xx.xx.xx (in use by 00:10:e0:5d:9c:83); disabled
Dec 23 16:46:18 some-host in.routed[737]: [ID 238047 daemon.warning] interface net0 to xx.xx.xx.xx turned off

When the IP interface is disabled/turned off, ipadm show-if shows the down state for that interface.

Once the problem is discovered and fixed [by the administrator or user] so that the IP address is no longer duplicated, the Solaris kernel enables and brings up the IP interface that it turned off earlier. This action too is recorded in the system log.

Dec 23 16:51:18 some-host ip: [ID 636139 kern.notice] NOTICE: recovered address ::ffff:xx.xx.xx.xx on net0

Once the system marks an interface down due to the conflicting IP address in a remote system, the local system periodically checks to see if the conflict persists. In Solaris 11.3, the time between the checks is 300,000 milliseconds (300 seconds or 5 minutes). However in some cases waiting for 5 minutes might not be desirable. In such cases, the time between the duplicate IP address checks can be tuned by modifying the IP tunable parameter, _dup_recovery, to appropriate value.

eg.,

Reduce the _dup_recovery value to 90 seconds.

  • Temporary change (non-persistent)

    # ndd -get /dev/ip ip_dup_recovery
    300000
    
    # ndd -set /dev/ip ip_dup_recovery 90000
    
    # ndd -get /dev/ip ip_dup_recovery
    90000
    
  • Permanent change (persistent across reboots)

    # ipadm show-prop -p _dup_recovery ip
    PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
    ip    _dup_recovery         rw   300000       --           300000       0-3600000
    
    # ipadm set-prop -p _dup_recovery=90000 ip
    
    # ipadm show-prop -p _dup_recovery ip
    PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
    ip    _dup_recovery         rw   90000        90000        300000       0-3600000
    

Notice the slight difference in parameter names: ndd uses ip_dup_recovery whereas ipadm uses _dup_recovery to tune the same value.


-3-

Solaris OS: What Group Package was Installed?

pkg list on target system shows this information.

eg.,
# pkg list | grep "group/system/solaris" | grep server
group/system/solaris-minimal-server               0.5.11-0.175.3.1.0.5.0     i--

List all packages that are part of the group package that was installed by running:

pkg list -as `pkg contents -r -H -o fmri -t depend -a type=group <group-package>`

List available group packages to install Solaris server:

# pkg search solaris-*-server | awk '{ print $3 "\t" $4}'
VALUE                                           PACKAGE
solaris/group/system/solaris-large-server       pkg:/group/system/solaris-large-server@0.5.11-0.175.3.1.0.5.0
solaris/group/system/solaris-minimal-server     pkg:/group/system/solaris-minimal-server@0.5.11-0.175.3.1.0.5.0
solaris/group/system/solaris-small-server       pkg:/group/system/solaris-small-server@0.5.11-0.175.3.1.0.5.0

  • solaris-large-server provides an Oracle Solaris large server environment that contains all of the common network services that an enterprise server must provide. Hardware drivers that are required for servers, such as InfiniBand drivers, are also part of this group package.

  • solaris-small-server installs a smaller set of packages on a server and provides a command-line environment

  • solaris-minimal-server installs the smallest possible set of Solaris packages, which provides a minimal command-line environment

In addition to the above, solaris/group/system/solaris-desktop group package provides Solaris desktop environment. This package contains the GNOME desktop environment that includes GUI applications such as web browsers and mail clients, and drivers for graphics and audio devices.

Keep in mind that solaris-desktop group package has a lot more packages compared to the other three group packages outlined above.


-4-

Solaris OS: What AI Manifest was Used?

On target system, locate the AI manifest file that was used to perform the Solaris installation at:

  • /var/log/install/ai.xml

Installation log can also be found in the same directory.


-5-

Package History

pkg history shows command history related to a specific package or all packages installed on the system. This includes information such as who initiated the package operation, the complete command, how long it took to complete the operation, whether a new boot environment (BE) was created and the errors encountered, if any.

Refer to pkg(1) man page for the options and description.


PS:
Legible version of this post with full outputs @ [Solaris] Memory Blacklisting, Duplicate IP Address & Recovery, Group Package Installations, ..




Saturday Feb 28, 2015

Programming in C: Few Tidbits #5

1. Splitting a long string into multiple lines

Let's start with an example. Here is a sample string. It'd be nice to improve readability by splitting it into multiple lines.

const char *quote = "The shepherd drives the wolf from the sheep's for which the sheep thanks the shepherd as his liberator, while the wolf denounces him for the same act as the destroyer of liberty. Plainly, the sheep and the wolf are not agreed upon a definition of liberty.";

Couple of ideas.

  • Line continuation: split the string anywhere at white space, and end the line with a backslash (\). Repeat until done.

    Backslash (\) is the continuation character; the sequence is often referred to as backslash-newline.

    eg.,

    const char *quote = "The shepherd drives the wolf from the sheep's for which \
                            the sheep thanks the shepherd as his liberator, while \
                            the wolf denounces him for the same act as the destroyer \
                            of liberty. Plainly, the sheep and the wolf are not agreed \
                            upon a definition of liberty.";
    

    The C preprocessor removes the backslash and joins the following line with the current one. This is repeated until all lines are joined. However in the above example, indentation becomes part of the actual string thus a bunch of unwanted whitespaces appear in the final string. Besides, it is not possible to include comments at the end of any of those lines [after the line continuation character] if you ever wanted to. Both of these minor issues can be avoided with string literal concatenation (discussed next).

    Compiling the above with the Solaris Studio C compiler and printing the string results in the following output.

    The shepherd drives the wolf from the sheep's for which                         the sheep thanks the shepherd as his liberator, while                   the wolf denounces him for the same act as the destroyer             of liberty. Plainly, the sheep and the wolf are not agreed                      upon a definition of liberty.
    
  • String literal concatenation: split the string anywhere at white space, and end the line with a closing double quote ("). Start the next line with an opening double quote, and repeat until done.

    eg.,

    const char *quote = "The shepherd drives the wolf from the sheep's for which "         /* dummy comment */
                           "the sheep thanks the shepherd as his liberator, while "        // another dummy comment
                           "the wolf denounces him for the same act as the destroyer "
                           "of liberty. Plainly, the sheep and the wolf are not agreed "
                           "upon a definition of liberty.";
    

    Adjacent string literals are concatenated at compile time. Comments outside the string literals are ignored, and the concatenated string will not include indented whitespaces unless they are within the string literals (delimited by quotes).

    Printing the above results in the following output.

    The shepherd drives the wolf from the sheep's for which the sheep thanks the shepherd as his liberator, while the wolf denounces him for the same act as the destroyer of liberty. Plainly, the sheep and the wolf are not agreed upon a definition of liberty.
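
A short side-by-side sketch of the two techniques, using a shortened string. The variable names are made up for illustration. With backslash-newline, the stray whitespace disappears only if the continuation line starts at column 0; with adjacent literals, indentation never matters.

```c
/* Backslash-newline: the continuation line starts at column 0, so no
 * indentation leaks into the string. */
static const char *by_continuation = "the sheep and the wolf \
are not agreed";

/* Adjacent string literals: whitespace between the literals is ignored,
 * so the second line can be indented freely. */
static const char *by_concatenation = "the sheep and the wolf "
                                      "are not agreed";
```

Both pointers refer to identical text: "the sheep and the wolf are not agreed".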

2. Simultaneous writing to multiple streams

The straight forward approach is to make multiple standard I/O library function calls to write to desired streams.

eg.,

..
fprintf(stdout, "some formatted string");
fprintf(stderr, "some formatted string");
fprintf(filepointer, "some formatted string");
..

This approach may work well as long as there are only a few occurrences of such writing. However if there is a need to repeat it many times over the lifetime of a process, it can be simplified by wrapping all those function calls that write to different streams into a function so a single call to the wrapper function takes care of writing to different streams. If the number of arguments in the formatted string is not constant or not known in advance, one option is to make the wrapper function a variadic function so that it accepts a variable number of arguments. Here is an example.

The following example writes all messages to the log file, writes only informative messages to standard output (stdout) and writes only fatal errors to the high priority log. Without the logfmtstring() wrapper function, the same code would have had five different standard I/O library calls rather than just three that the sample code has.

% cc -o mstreams multstreams.c

% ./mstreams
[info] successful entries: 52. failed entries: 7

% cat app.log
[info] successful entries: 52. failed entries: 7
[fatal] billing system not available
[error] unable to ping internal system at 10.135.42.36

% cat app_highpriority.log
[fatal] billing system not available
%

Web search keywords: C Variadic Functions
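
For reference, here is a minimal sketch of what a variadic wrapper of this kind might look like. It is not the logfmtstring() function referred to above; the name log_to_streams() and its signature are assumptions made for this illustration. It relies on vfprintf() and va_copy() from the standard library.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: writes one formatted message to each non-NULL
 * stream in the array. Returns the number of streams written to. */
int log_to_streams(FILE *streams[], int nstreams, const char *fmt, ...)
{
        va_list args;
        int written = 0;

        va_start(args, fmt);
        for (int i = 0; i < nstreams; ++i) {
                va_list copy;

                if (streams[i] == NULL)
                        continue;
                /* A va_list cannot be reused once vfprintf() has consumed
                 * it, so each stream gets its own copy. */
                va_copy(copy, args);
                if (vfprintf(streams[i], fmt, copy) >= 0)
                        ++written;
                va_end(copy);
        }
        va_end(args);
        return written;
}
```

A caller would build the list of targets once per message, for example FILE *out[] = { stdout, logfp }; followed by log_to_streams(out, 2, "[info] successful entries: %d\n", count); where logfp and count are whatever the application already has.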


3. Declaring variables within the scope of a CASE label in a SWITCH block

It is possible to declare and use variables within the scope of a case label with one exception -- what immediately follows a case label must be a statement, not a declaration (this restriction applies to C before C23). Otherwise the compiler reports a syntax error at the declaration.

This failure can be fixed by either moving the variable declaration to any place after a valid statement if possible, or by adding a dummy or null statement right after the case label.

eg.,

1. Move the declaration from right after the case label to any place after a valid statement.

/* works */
switch(0) {
        default:
                printf("\nin default ..");
                int cyear = 2015;
                printf("\nyear = %d", cyear);
}

2. Add a dummy or null statement right after the case label.

/* works */
switch(0) {
        default:
                ; // NULL statement
                int cyear = 2015;
                printf("\nin default ..");
                printf("\nyear = %d", cyear);
}

3. Yet another option is to define or create scope using curly braces ({}) for the case where variables are declared.

/* works too */
switch(0) {
        default:
        {
                int cyear = 2015;
                printf("\nin default ..");
                printf("\nyear = %d", cyear);
        }
}

Also see: Keyword – switch, case, default

(Full copy of the same blog post with complete examples can be found at:
technopark02.blogspot.com/2015/02/programming-in-c-few-tidbits-5.html)

Saturday Jan 31, 2015

Programming in C: Few Tidbits #4

1. Using Wildcards in Filename Pattern Matching

Relying on *stat() API is not much of an option when using wildcards to match a filename pattern. Some of the options involve traversing a directory checking each file for a match using fnmatch(), or to use system() function to execute an equivalent shell command. Another option that is well suited for this task is the glob*() API, which is part of Standard C Library Functions. (I believe glob() depends on fnmatch() for finding matches).

Here is a simple example that displays the number of matches found for pattern "/tmp/lint_*" along with the listing of matches.

% ls -1 /tmp/lint_*
/tmp/lint_AAA.21549.0vaOfQ
/tmp/lint_BAA.21549.1vaOfQ
/tmp/lint_CAA.21549.2vaOfQ
/tmp/lint_DAA.21549.3vaOfQ
/tmp/lint_EAA.21549.4vaOfQ
/tmp/lint_FAA.21549.5vaOfQ
/tmp/lint_GAA.21549.6vaOfQ


% cat match.c
#include <stdio.h>
#include <glob.h>

...
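glob_t buf;

if (argc == 1) return 0;

/* glob() returns 0 on success and GLOB_NOMATCH when nothing matches */
if (glob(argv[1], 0 , NULL , &buf) != 0) return 1;

/* gl_pathc is a size_t, hence %zu */
printf("\nNumber of matches found for pattern '%s': %zu\n",
      argv[1], buf.gl_pathc);

for (size_t i = 0; i < buf.gl_pathc; ++i) {
    printf("\n\t%zu. %s", (i + 1), buf.gl_pathv[i]);
}

globfree(&buf);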
...


% ./<executable> /tmp/lint_\*

Number of matches found for pattern '/tmp/lint_*': 7

        1. /tmp/lint_AAA.21549.0vaOfQ
        2. /tmp/lint_BAA.21549.1vaOfQ
        3. /tmp/lint_CAA.21549.2vaOfQ
        4. /tmp/lint_DAA.21549.3vaOfQ
        5. /tmp/lint_EAA.21549.4vaOfQ
        6. /tmp/lint_FAA.21549.5vaOfQ
        7. /tmp/lint_GAA.21549.6vaOfQ

Please check the man page out for details -- glob(3C).
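
glob() reports "no match" through its return value rather than through gl_pathc, so it is worth checking. Here is a minimal sketch that wraps the call and distinguishes the outcomes. The function name count_matches() is hypothetical.

```c
#include <glob.h>
#include <string.h>

/* Hypothetical helper: returns the number of paths matching pattern,
 * 0 when nothing matches, or -1 on any other glob() error. */
long count_matches(const char *pattern)
{
        glob_t buf;
        long count;
        int rc;

        memset(&buf, 0, sizeof(buf));   /* keeps globfree() safe on every path */
        rc = glob(pattern, 0, NULL, &buf);

        if (rc == 0)
                count = (long)buf.gl_pathc;
        else if (rc == GLOB_NOMATCH)
                count = 0;
        else
                count = -1;             /* GLOB_NOSPACE or GLOB_ABORTED */

        globfree(&buf);
        return count;
}
```

With the files listed earlier in place, count_matches("/tmp/lint_*") would return 7.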


2. Microtime[stamp]

One of the old blog posts has an example to extract the current timestamp using time API. It shows the timestamp in standard format month-date-year hour:min:sec. In this post, let's add microseconds to the timestamp.

Here is the sample code.

% cat microtime.c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>   /* gettimeofday(), struct timeval */

...
char timestamp[80], etimestamp[80];
struct timeval tmval;
struct tm *curtime;

gettimeofday(&tmval, NULL);

curtime = localtime(&tmval.tv_sec);
if (curtime == NULL) return 1;

/* "%%06ld" survives strftime() as "%06ld", which snprintf() then fills in */
strftime(timestamp, sizeof(timestamp), "%m-%d-%Y %X.%%06ld", curtime);
snprintf(etimestamp, sizeof(etimestamp), timestamp, (long)tmval.tv_usec);

printf("\ncurrent time: %s\n", etimestamp);
...

% ./<executable>
current time: 01-31-2015 15:49:26.041111

% ./<executable>
current time: 01-31-2015 15:49:34.575214

One major change from old approach is the reliance on gettimeofday() since it returns a structure [timeval] with a member variable [tv_usec] to represent the microseconds.

strftime() fills up the date/time data in timestamp variable as per the specifiers used in time format (third argument). By the time strftime() completes execution, timestamp will have month-date-year hr:min:sec filled out. Subsequent snprintf fills up the only remaining piece of time data - microseconds - using the tv_usec member in timeval structure and writes the updated timestamp to a new variable, etimestamp.

Credit: stackoverflow user unwind.
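
A variation on the same idea, sketched under a few assumptions: instead of passing the strftime() output to snprintf() as a format string, it appends the microseconds to the end of the buffer. That sidesteps any surprise if the formatted time ever contains a '%' character. The function name format_microtime() is hypothetical, and the sketch uses %H:%M:%S in place of the locale dependent %X.

```c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

/* Hypothetical helper: writes tv as "MM-DD-YYYY HH:MM:SS.uuuuuu" into buf.
 * Returns 0 on success, -1 if the time cannot be converted or buf is
 * too small. */
int format_microtime(const struct timeval *tv, char *buf, size_t size)
{
        time_t secs = tv->tv_sec;
        struct tm tmbuf;
        size_t len;
        int n;

        if (localtime_r(&secs, &tmbuf) == NULL)
                return -1;

        /* strftime() returns 0 when the result does not fit in buf */
        len = strftime(buf, size, "%m-%d-%Y %H:%M:%S", &tmbuf);
        if (len == 0)
                return -1;

        /* append the microseconds right after the seconds */
        n = snprintf(buf + len, size - len, ".%06ld", (long)tv->tv_usec);
        if (n < 0 || (size_t)n >= size - len)
                return -1;
        return 0;
}
```

Typical use: call gettimeofday(&tmval, NULL) and then format_microtime(&tmval, etimestamp, sizeof(etimestamp)).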


3. Concatenating Multi-Formatted Strings

I have my doubts about this header - so, let me show an example first. The following rudimentary example attempts to construct a sentence that is something like "value of pi = (22/7) = 3.14". In other words, the sentence has a mixture of character strings, integers, floating point number and special characters.

% cat fmt.c
#include <stdio.h>
#include <string.h>

...
char tstr[48];
char pistr[] = "value of pi = ";
int num = 22, den = 7;
float pi = ((float)num/den);

char snum[8], sden[8], spi[8];

sprintf(sden, "%d", den);
sprintf(snum, "%d", num);
sprintf(spi, "%0.2f", pi);

strcpy(tstr, pistr);
strcat(tstr, "(");
strcat(tstr, snum);
strcat(tstr, "/");
strcat(tstr, sden);
strcat(tstr, ") = ");
strcat(tstr, spi);

puts(tstr);
...

% ./<executable>
value of pi = (22/7) = 3.14

Nothing seriously wrong with the above code. It is just that it uses a bunch of sprintf(), strcpy() and strcat() calls to construct the target string. Also it overallocates the memory required for the actual string.

The same effect can be achieved by using asprintf(), and the resulting code is much smaller and easier to maintain. This function also relieves the developer of the burden of allocating memory of the appropriate size. In general, overallocation leads to memory wastage and underallocation likely leads to buffer overflows posing unnecessary security risks. When relying on asprintf(), developers are still responsible for two things though -- checking the return value to see if the call succeeded, and freeing the buffer when done with it. Ignoring the first can lead to program failures in the worst case, and ignoring the second almost guarantees memory leaks.

Here is the alternate version that achieves the desired effect by making use of asprintf().

% cat ifmt.c
#include <stdio.h>
#include <stdlib.h>

...
char *tstr;
int num = 22, den = 7;
float pi = ((float)num/den);

int ret = asprintf(&tstr, "value of pi = (%d/%d) = %0.2f", num, den, pi);

if (ret == -1) return 1;

puts(tstr);
free(tstr);
...

% ./<executable>
value of pi = (22/7) = 3.14

Also see: snprintf()
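
asprintf() is not part of standard C, so it may be missing on some platforms. A rough equivalent can be put together with two snprintf() calls: the first measures the required length, the second does the formatting. This is only a sketch; the function name format_pi() is made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc()ed string that the caller must
 * free(), or NULL on failure. den must not be 0. */
char *format_pi(int num, int den)
{
        const char *fmt = "value of pi = (%d/%d) = %0.2f";
        double pi = (double)num / den;
        /* With a NULL buffer and size 0, snprintf() only reports how many
         * characters the formatted string needs. */
        int len = snprintf(NULL, 0, fmt, num, den, pi);
        char *str;

        if (len < 0)
                return NULL;

        str = malloc((size_t)len + 1);  /* +1 for the terminating NUL */
        if (str != NULL)
                snprintf(str, (size_t)len + 1, fmt, num, den, pi);
        return str;
}
```

format_pi(22, 7) yields "value of pi = (22/7) = 3.14", the same string as the asprintf() version.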

(Full copy of the same blog post with complete examples can be found at:
http://technopark02.blogspot.com/2015/01/programming-in-c-few-tidbits-4.html)

Thursday Jul 31, 2014

Programming in C: Few Tidbits #2

(1) ceil() returns an incorrect value?

ceil() rounds the argument upward to the nearest integer value in floating-point format. For example, calling ceil() with an argument (2/3) should return 1.

printf("\nceil(2/3) = %f", ceil(2/3));

results in:

ceil(2/3) = 0.000000

.. which is not the expected result.

However:

printf("\nceil((float)2/3) = %f", ceil((float)2/3));

shows the expected result.

ceil((float)2/3) = 1.000000

The reason for the incorrect result in the first attempt can be attributed to the integer division. Since both operands in the division operation are integers, it resulted in an integer division which discarded the fractional part.

Desired result can be achieved by casting one of the operands to float or double as shown in the subsequent attempt.

One final example for the sake of completeness.

printf("\nceil(2/(float)3) = %f", ceil(2/(float)3));
..
ceil(2/(float)3) = 1.000000
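
When both operands are integers and an integer result is wanted, the ceiling can also be computed without floating point at all. A minimal sketch, assuming a non-negative dividend and a positive divisor (the function name ceil_div() is hypothetical):

```c
/* Hypothetical helper: ceiling of a / b for a >= 0 and b > 0, using
 * integer arithmetic only. a + b - 1 must not overflow an int. */
int ceil_div(int a, int b)
{
        /* Adding b - 1 pushes any non-zero remainder into the next
         * multiple of b before the truncating division. */
        return (a + b - 1) / b;
}
```

ceil_div(2, 3) returns 1, ceil_div(6, 3) returns 2 and ceil_div(7, 3) returns 3.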

(2) Main difference between abort() and exit() calls

On a very high level: abort() sends SIGABRT signal causing abnormal termination of the target process without calling functions registered with atexit() handler, and results in a core dump. Some cleanup activity may happen.

exit() causes normal process termination after executing functions registered with the atexit() handler, and after performing cleanup activity such as flushing and closing all open streams.

If it is desirable to bypass atexit() registered routine(s) during a process termination, one way is to call _exit() rather than exit().

Of course, this is all high level and the information provided here is incomplete. Please check relevant man pages for detailed information.


(3) Current timestamp

The following sample code shows the current timestamp in two different formats. Check relevant man pages for more information.

#include <time.h>
..
char timestamp[80];
time_t now;
struct tm *curtime;

now = time(NULL);
curtime = localtime(&now);

strftime(timestamp, sizeof(timestamp), "%m-%d-%Y %X", curtime);

printf("\ncurrent time: %s", timestamp);
printf("\ncurrent time in a different format: %s", asctime(curtime));
..

Executing this code produces output similar to the following:

current time: 07-31-2014 22:05:42
current time in a different format: Thu Jul 31 22:05:42 2014

Monday Jun 30, 2014

Programming in C: Few Tidbits

.. with little commentary aside. Target audience: new programmers. These tips are equally applicable in C and C++ programming environments.


1. Duplicating a file pointer

Steps: find the integer file descriptor associated with the file stream using fileno() call, make a copy of the file descriptor using dup() call, and finally associate the file stream with the duplicated file descriptor by calling fdopen().

eg.,
FILE *fptr = fopen("file", "mode");

FILE *fptrcopy = fdopen( dup( fileno(fptr) ), "mode");

2. Capturing the exit code of a command that was invoked using popen()

Using pipes is one way of executing commands programmatically that are otherwise invoked from a shell. While pipes are useful in performing tasks other than executing shell commands, this tip is mainly about the exit code of a command (to figure out whether it succeeded or failed) that was executed using popen() API.

To capture the exit code, use the value returned by pclose(). This call returns the termination status of the command that was executed as a child process, encoded the same way as the status reported by waitpid(). On most implementations the exit code occupies bits 8-15 of that value, which is why dividing the pclose() return value by 256 appears to work; the portable approach is to check WIFEXITED() and extract the exit code with WEXITSTATUS(), both defined in <sys/wait.h>.

eg.,
#include <stdio.h>
#include <sys/wait.h>
...
FILE *ptr;
int status;

if ((ptr = popen("ls", "r")) != NULL) {
	status = pclose(ptr);
	if (status != -1 && WIFEXITED(status))
		printf("\nls: exit code = %d", WEXITSTATUS(status));
}

if ((ptr = popen("ls -W", "r")) != NULL) {
	status = pclose(ptr);
	if (status != -1 && WIFEXITED(status))
		printf("\nls -W: exit code = %d", WEXITSTATUS(status));
}
...

% ./<executable>

ls: exit code = 0
ls: illegal option -- W
ls -W: exit code = 2

3. Converting an integer to a string

The standard C library provides functions for converting a string to an integer (atoi(), strtol()), but no dedicated function for converting an integer to a string. One way to achieve the desired result is by using the sprintf() family of functions, which write formatted data to a string. snprintf() is preferable because it never writes more than the given buffer size.

eg.,
int weight = 30;
char wtstr[12];	/* room for any 32-bit int: sign, 10 digits and '\0' */

snprintf(wtstr, sizeof(wtstr), "%d", weight);
...

The same approach can be used to convert values of other data types such as float and double to a string. Also see: man page for snprintf().
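
When the result has to live on the heap, the required buffer size does not need to be guessed. The sketch below relies on the C99 behavior of snprintf() with a NULL buffer and a size of 0, which only reports the length of the formatted result. int_to_string() is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated string holding the decimal
   representation of value, or NULL on failure. The caller must free() it. */
char *int_to_string(int value)
{
    char *str;
    /* With a NULL buffer and size 0, snprintf() only reports the number of
       characters that would be written, excluding the terminating '\0'. */
    int len = snprintf(NULL, 0, "%d", value);

    if (len < 0)
        return NULL;

    str = malloc((size_t)len + 1);
    if (str == NULL)
        return NULL;

    snprintf(str, (size_t)len + 1, "%d", value);
    return str;
}
```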


4. Finding the length of a statically allocated array

When size was not specified explicitly, simply divide the total size of the array by the size of the first array element.

eg.,
static const char *greeting[] = { "Hi", "Hello", "Hola", "Bonjour", \
                                    "Namaste", "Ciao", "Ni Hao" };
int numgreetings = sizeof(greeting)/sizeof(greeting[0]);

After execution, numgreetings variable holds a value of 7. Note that sizeof(greeting[0]) is the size of one array element, which here is a pointer to char (const char *).

  • sizeof is not a function, but an operator -- hence parentheses are not required when the operand is an expression (they are required only when the operand is a type name, as in sizeof(int)).
  • Though not so useful, this is applicable even when the size was explicitly specified.
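
One caveat deserves a short illustration: the division only works where the array itself is in scope. Once an array is passed to a function it decays to a pointer, and sizeof then yields the size of the pointer. In the sketch below, the ARRAY_LEN macro and both function names are made up for this example.

```c
#include <stddef.h>

/* Hypothetical convenience macro: number of elements in a true array.
   It must not be applied to a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

const char *greeting[] = { "Hi", "Hello", "Hola", "Bonjour",
                           "Namaste", "Ciao", "Ni Hao" };

/* Works: greeting is an array in this scope. */
size_t count_greetings(void)
{
    return ARRAY_LEN(greeting);
}

/* Pitfall: an array parameter is really a pointer, so sizeof(list) would be
   the size of a pointer and the division would no longer yield the element
   count. Pass the length explicitly instead. */
size_t count_nonempty(const char *const list[], size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        if (list[i][0] != '\0')
            n++;
    return n;
}
```

A caller would typically write count_nonempty(greeting, ARRAY_LEN(greeting)).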

Monday Mar 31, 2014

[Solaris] ZFS Pool History, Writing to System Log, Persistent TCP/IP Tuning, ..

.. with plenty of examples and little comments aside.

[1] Check existing DNS client configuration

Solaris 11 and later:

% svccfg -s network/dns/client listprop config
config                      application        
config/value_authorization astring     solaris.smf.value.name-service.dns.client
config/options             astring     "ndots:2 timeout:3 retrans:3 retry:1"
config/search              astring     "sfbay.sun.com" "us.oracle.com" "oraclecorp.com" "oracle.com" "sun.com"
config/nameserver          net_address xxx.xx.xxx.xx xxx.xx.xxx.xx xxx.xx.xxx.xx

Solaris 10 and prior:

Check the contents of /etc/resolv.conf

% cat /etc/resolv.conf
search  sfbay.sun.com us.oracle.com oraclecorp.com oracle.com sun.com
options ndots:2 timeout:3 retrans:3 retry:1
nameserver      xxx.xx.xxx.xx
nameserver      xxx.xx.xxx.xx
nameserver      xxx.xx.xxx.xx

Note that /etc/resolv.conf file exists on Solaris 11.x releases too as of today.

[2] Logical domains: finding out the hostname of control domain

Use virtinfo(1M) command.

root@ppst58-cn1-app:~# virtinfo -a
Domain role: LDoms guest I/O service root
Domain name: n1d2
Domain UUID: 02ea1fbe-80f9-e0cf-ecd1-934cf9bbeffa
Control domain: ppst58-01
Chassis serial#: AK00083297

The above output shows that the n1d2 domain is a guest domain that is also an I/O domain, a service domain and a root I/O domain. The control domain is running on host ppst58-01.

Output from control domain:

root@ppst58-01:~# ldm list
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
primary          active     -n-cv-  UART    64    130304M  0.1%  0.1%  243d 2h 
n1d1             active     -n----  5001    448   916992M  0.2%  0.2%  3d 15h 26m
n1d2             active     -n--v-  5002    512   1T       0.0%  0.0%  3d 15h 29m

root@ppst58-01:~# virtinfo -a
Domain role: LDoms control I/O service root
Domain name: primary
Domain UUID: 19337210-285a-6ea4-df8f-9dc65714e3ea
Control domain: ppst58-01
Chassis serial#: AK00083297

[3] Administering NFS configuration

Solaris 11 and later:

Use sharectl(1M) command. Solaris 11.x releases include the sharectl administrative tool to configure and manage file-sharing protocols such as NFS, SMB, autofs.

eg.,
Display all property values of NFS:

# sharectl get nfs
servers=1024
lockd_listen_backlog=32
lockd_servers=1024
grace_period=90
server_versmin=2
server_versmax=4
client_versmin=2
client_versmax=4
server_delegation=on
nfsmapid_domain=
max_connections=-1
listen_backlog=32
..
..

# sharectl status
autofs  online client
nfs     disabled

eg.,
Modifying the nfs v4 grace period from the default 90s to 30s:

# sharectl get -p grace_period nfs
grace_period=90
# sharectl set -p grace_period=30 nfs
# sharectl get -p grace_period nfs
grace_period=30

Solaris 10 and prior:

Edit /etc/default/nfs file, and restart NFS related service(s).

[4] Examining ZFS Storage Pool command history

Solaris 10 8/07 and later releases log successful zfs and zpool commands that modify the underlying pool state. All those executed commands can be examined by running the zpool history command. Because this command shows the zfs and zpool commands exactly as they were executed, the 'history' feature is really useful in troubleshooting an error scenario that resulted from executing some zfs command.

# zpool list
NAME       SIZE  ALLOC  FREE  CAP  DEDUP   HEALTH  ALTROOT
rpool      416G   152G  264G  36%  1.00x   ONLINE  -
zs3actact  848G  17.4G  831G   2%  1.00x   ONLINE  -

# zpool history -l zs3actact
History for 'zs3actact':
2014-03-19.22:02:32 zpool create -f zs3actact c0t600144F0AC6B9D2900005328B7570001d0 [user root on etc25-appadm05:global]
2014-03-19.22:03:12 zfs create zs3actact/iscsivol1 [user root on etc25-appadm05:global]
2014-03-19.22:03:33 zfs set recordsize=128k zs3actact/iscsivol1 [user root on etc25-appadm05:global]

Note that this log is enabled by default, and cannot be disabled.

[5] Modifying TCP/IP configuration parameters

Using ndd(1M) is the old way of tuning TCP/IP parameters, and it is still supported as of today (in Solaris 11.x releases). However, using the ipadm(1M) command is the recommended way to modify or retrieve TCP/IP protocol properties on Solaris 11.x and later releases.

# ipadm show-prop -p max_buf tcp
PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
tcp   max_buf               rw   1048576      --           1048576      128000-1073741824

# ipadm set-prop -p max_buf=2097152 tcp

# ipadm show-prop -p max_buf tcp
PROTO PROPERTY              PERM CURRENT      PERSISTENT   DEFAULT      POSSIBLE
tcp   max_buf               rw   2097152      2097152      1048576      128000-1073741824

ndd style (still valid):

# ndd -get /dev/tcp tcp_max_buf
1048576

# ndd -set /dev/tcp tcp_max_buf 2097152

# ndd -get /dev/tcp tcp_max_buf
2097152

One of the advantages of using ipadm over ndd is that the configured/tuned non-default values are persistent across reboots. In case of ndd, we have to re-apply those values either manually or by creating a Run Control script (/etc/rc*.d/S*) to make sure that the intended values are set automatically during a reboot of the system.

[6] Writing to system log from a shell script

Use logger(1) command as shown in the following example.

eg.,

# logger -p local0.warning Big Brother is watching you

# dmesg | tail -1
Mar 30 18:42:14 etc27zadm01 root: [ID 702911 local0.warning] Big Brother is watching you

Check syslog.conf(4) man page for the list of available system facilities and the severity of the condition being logged (levels).

BONUS:

[*] Forceful NFS unmount on Linux

Try the lazy unmount option (-l) on systems running Linux kernel 2.4.11 or later to forcefully unmount a filesystem that keeps throwing Device or resource busy and/or device is busy error(s).

eg.,

# umount -f /bkp
umount2: Device or resource busy
umount: /bkp: device is busy
umount2: Device or resource busy
umount: /bkp: device is busy

# umount -l /bkp
#

Friday Feb 28, 2014

[Solaris] Changing hostname, Parallel Compression, pNFS, Upgrading SRUs and Clearing Faults

[1] Solaris 11+ : changing hostname

Starting with Solaris 11, a system's identity (nodename) is configured through the config/nodename service property of the svc:/system/identity:node SMF service. Solaris 10 and prior versions have this information in the /etc/nodename configuration file.

The following example demonstrates the commands to change the hostname from "ihcm-db-01" to "ehcm-db-01".

eg.,
# hostname
ihcm-db-01

# svccfg -s system/identity:node listprop config
config                       application        
config/enable_mapping       boolean     true
config/ignore_dhcp_hostname boolean     false
config/nodename             astring     ihcm-db-01
config/loopback             astring     ihcm-db-01
#

# svccfg -s system/identity:node setprop config/nodename="ehcm-db-01"

# svccfg -s system/identity:node refresh  -OR- 
	# svcadm refresh svc:/system/identity:node
# svcadm restart system/identity:node

# svccfg -s system/identity:node listprop config
config                       application        
config/enable_mapping       boolean     true
config/ignore_dhcp_hostname boolean     false
config/nodename             astring     ehcm-db-01
config/loopback             astring     ehcm-db-01

# hostname
ehcm-db-01

[2] Parallel Compression

This topic is not Solaris specific, but certainly helps Solaris users who are frustrated with the single threaded implementation of all officially supported compression tools such as compress, gzip, zip.

pigz (pig-zee) is a parallel implementation of gzip that is well suited to the latest multi-processor, multi-core machines. By default, pigz breaks up the input into multiple chunks of size 128 KB, and compresses each chunk in parallel with the help of light-weight threads. The number of compression threads is set by default to the number of online processors. The chunk size and the number of threads are configurable.

Compressed files can be restored to their original form using -d option of pigz or gzip tools. As per the man page, decompression is not parallelized out of the box, but may show some improvement compared to the existing old tools.

The following example demonstrates the advantage of using pigz over gzip in compressing and decompressing a large file.

eg.,

Original file, and the target hardware.

$ ls -lh PT8.53.04.tar 
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

$ psrinfo -pv
The physical processor has 8 cores and 64 virtual processors (0-63)
  The core has 8 virtual processors (0-7)
	...
  The core has 8 virtual processors (56-63)
    SPARC-T5 (chipid 0, clock 3600 MHz)

gzip compression.

$ time gzip --fast PT8.53.04.tar 

real    3m40.125s
user    3m27.105s
sys     0m13.008s

$ ls -lh PT8.53*
-rw-r--r--   1 psft     dba         3.1G Feb 28 14:03 PT8.53.04.tar.gz

/* the following prstat, vmstat outputs show that gzip is compressing the 
	tar file using a single thread - hence low CPU utilization. */

$ prstat -p 42510

   PID USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
 42510 psft     2616K 2200K cpu16    10    0   0:01:00 1.5% gzip/1

$ prstat -m -p 42510

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP  
 42510 psft      95 4.6 0.0 0.0 0.0 0.0 0.0 0.0   0  35  7K   0 gzip/1

$ vmstat 2

 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 776242104 917016008 0 7 0 0 0  0  0  0  0 52 52 3286 2606 2178  2  0 98
 1 0 0 776242104 916987888 0 14 0 0 0 0  0  0  0  0  0 3851 3359 2978  2  1 97
 0 0 0 776242104 916962440 0 0 0 0 0  0  0  0  0  0  0 3184 1687 2023  1  0 98
 0 0 0 775971768 916930720 0 0 0 0 0  0  0  0  0 39 37 3392 1819 2210  2  0 98
 0 0 0 775971768 916898016 0 0 0 0 0  0  0  0  0  0  0 3452 1861 2106  2  0 98

pigz compression.

$ time ./pigz PT8.53.04.tar 

real    0m25.111s	<== wall clock time is 25s compared to gzip's 3m 40s
user    17m18.398s
sys     0m37.718s

/* the following prstat, vmstat outputs show that pigz is compressing the 
        tar file using many threads - hence busy system with high CPU utilization. */

$ prstat -p 49734

   PID USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
49734 psft       59M   58M sleep    11    0   0:12:58  38% pigz/66

$ vmstat 2

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 778097840 919076008 6 113 0 0 0 0 0  0  0 40 36 39330 45797 74148 61 4 35
 0 0 0 777956280 918841720 0 1 0 0 0  0  0  0  0  0  0 38752 43292 71411 64 4 32
 0 0 0 777490336 918334176 0 3 0 0 0  0  0  0  0 17 15 46553 53350 86840 60 4 35
 1 0 0 777274072 918141936 0 1 0 0 0  0  0  0  0 39 34 16122 20202 28319 88 4 9
 1 0 0 777138800 917917376 0 0 0 0 0  0  0  0  0  3  3 46597 51005 86673 56 5 39

$ ls -lh PT8.53.04.tar.gz 
-rw-r--r--   1 psft     dba         3.0G Feb 28 14:03 PT8.53.04.tar.gz

$ gunzip PT8.53.04.tar.gz 	<== shows that the pigz compressed file is 
                                         compatible with gzip/gunzip

$ ls -lh PT8.53*
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

Decompression.

$ time ./pigz -d PT8.53.04.tar.gz 

real    0m18.068s
user    0m22.437s
sys     0m12.857s

$ time gzip -d PT8.53.04.tar.gz 

real    0m52.806s <== compare gzip's 52s decompression time with pigz's 18s
user    0m42.068s
sys     0m10.736s

$ ls -lh PT8.53.04.tar 
-rw-r--r--   1 psft     dba         4.8G Feb 28 14:03 PT8.53.04.tar

Of course, other tools such as Parallel BZIP2 (PBZIP2), a parallel implementation of the bzip2 tool, are worth a try too. The idea here is to highlight the fact that there are better tools out there to get the job done quickly compared to the existing/old tools that are bundled with the operating system distribution.


[3] Solaris 11+ : Upgrading SRU

Assuming the package repository is set up already to do the network updates on a Solaris 11+ system, the following commands are helpful in upgrading a SRU.

  • List all available SRUs in the repository.

    # pkg list -af entire
  • Upgrade to the latest and greatest.

    # pkg update

    To find out what changes will be made to the system, try a dry run of the system update.

    # pkg update -nv
  • Upgrade to a specific SRU.

    # pkg update entire@<FMRI>

    Find the Fault Management Resource Identifier (FMRI) string by running the pkg list -af entire command.

Note that it is not so easy to downgrade an SRU to a lower version as it may break the system. Should there be a need to downgrade or switch between different SRUs, relying on Boot Environments (BE) might be a good idea. Check the Creating and Administering Oracle Solaris 11 Boot Environments document for details.


[4] Parallel NFS (pNFS)

Just a quick note: RFC 5661, Network File System (NFS) Version 4.1, introduced a new feature called "Parallel NFS" or pNFS, which allows NFS clients to access storage devices containing file data directly. When file data for a single NFS v4 server is stored on multiple and/or higher-throughput storage devices, using pNFS can result in significant improvement in file access performance. However, Parallel NFS is an optional feature in NFS v4.1. Though there was a prototype made available a few years ago when OpenSolaris was still alive, as of today, Solaris has no support for pNFS. Stay tuned for any updates from Oracle Solaris teams.

Here is an interesting write-up from one of our colleagues at Oracle|Sun (dated 2007) -- NFSv4.1's pNFS for Solaris.

(Credit to Rob Schneider and Tom Gould for initiating this topic)


[5] SPARC hardware : Check for and clear faults from ILOM

There are a couple of ways to check for faults using the ILOM command line interface.

By running:

  1. show faulty command from ILOM command prompt, or
  2. fmadm faulty command from within the ILOM faultmgmt shell

Once found, use the clear_fault_action property with the set command to clear the fault for a FRU.

The following example checks for faulty FRUs from the ILOM faultmgmt shell, then clears the fault.

eg.,

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty

------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2014-02-26/16:17:11 18c62051-c81d-c569-a4e6-e418db2f84b4 PCIEX-8000-SQ  Critical
        ...
        ...
Suspect 1 of 1
   Fault class  : fault.io.pciex.rc.generic-ue
   Certainty    : 100%
   Affects      : hc:///chassis=0/motherboard=0/cpuboard=1/chip=2/hostbridge=4
   Status       : faulted

   FRU
      Status            : faulty
      Location          : /SYS/PM1
      Manufacturer      : Oracle Corporation
      Name              : TLA,PM,T5-4,T5-8
        ...

Description : A fault has been diagnosed by the Host Operating System.

Response    : The service required LED on the chassis and on the affected
              FRU may be illuminated.

        ...

faultmgmtsp> exit

-> set /SYS/PM1 clear_fault_action=True
Are you sure you want to clear /SYS/PM1 (y/n)? y
Set 'clear_fault_action' to 'True'

Note that this procedure clears the fault from the SP but not from the host.

Friday Jan 31, 2014

Solaris Tips : Automounted NFS, ZFS metaslabs, utility to manage F40 cards, powertop, ..

[1] Mounting NFS on Solaris 10 and later

With a relevant entry in /etc/vfstab, the general expectation is that Solaris automatically mounts the NFS shares upon a system reboot. However users may find that NFS shares are not being auto-mounted on some of the systems running the latest update of Solaris 10 or 11. One reason for this behavior could be the use of the Secure By Default network profile, which was introduced in Solaris 10 11/06. When this networking profile is in use, numerous services including the NFS client service are disabled. For the automounting of NFS shares, we will need the NFS client service running.

The fix is to enable NFS client service along with its dependencies.

# svcs -a | grep nfs\/client
disabled       Jan_17   svc:/network/nfs/client:default

# svcadm  enable -r svc:/network/nfs/client

# svcs -a | grep nfs\/client
online         Jan_20   svc:/network/nfs/client:default

On a similar note, if you want all default services to be enabled as they were in previous Solaris releases, run the following command as privileged user. Then use svcadm(1M) to disable unwanted services.

# netservices open

To switch back to the secure by default profile, run:

# netservices limited

[2] Utility to manage Sun Flash Accelerator F40 PCIe card(s) .. ddcli

The Sun Flash Accelerator F40 PCIe Card has two sets of firmware — NAND flash controller firmware, and SAS controller firmware (host PCIe to SAS controller). Both firmware sets are updated as a single F40 firmware package using the ddcli utility. This utility can be used to locate and display information about the cards in the system, format the cards, monitor the health and extract smart logs (to assist Oracle support in debugging and resolution) for a selected F40 card.

If ddcli utility is not available on systems where the F40 PCIe cards are installed, install patch "16005846: F40 (AURA 2) SW1.1 Release fw (08.05.01.00) and cli utility update" or later version, if available. This patch can be downloaded from support.oracle.com

Note that ddcli utility can be used to service and monitor the health of Sun Flash Accelerator F80 PCIe cards too. Install patch "Patch 17860600: SW1.0 for Sun Flash Acccelerator F80" to get access to the F80 card software package.

[3] Permission denied error when changing a password

An attempt to change the password for a local user 'XYZ' fails with Permission denied error.

# passwd XYZ
New Password: ********
Re-enter new Password: ********
Permission denied

# grep passwd /etc/nsswitch.conf
passwd: files ldap

Users have the flexibility to include and access password information in/from multiple repositories such as files and nis or ldap. Per the man page of passwd(1), when a user has a password stored in one of the name services as well as a local files entry, the passwd command tries to update both. It is possible to have different passwords in the name service and local files entry. Use passwd -r to change a specific password repository.

Hence the fix is to use the -r option in this case to ignore the nsswitch.conf file sequence and update the password information in local /etc files — /etc/passwd and /etc/shadow files.

# passwd -r files XYZ
New Password: ********
Re-enter new Password: ********
passwd: password successfully changed for XYZ

[4] Microstate statistics for any process

ptime -m shows the full set of microstate accounting statistics for the lifetime of a given process. prstat -m also reports microstate accounting information, but the displayed statistics cover only the time since the previous display, that is, the most recent sampling interval.

# prstat -p 39235

   PID USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
 39235 psft     3585M 3320M sleep    59    0   2:23:11 0.0% java/257

# prstat -mp 39235

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP  
 39235 psft     0.0 0.0 0.0 0.0 0.0  87  13 0.0   0   0   1   0 java/257


# ptime -mp 39235

real 428:31:25.902644700
user  2:06:32.283801209
sys     16:37.056999418
trap        2.250539737
tflt        0.000000000
dflt        2.018347218
kflt        0.000000000
lock 96013:52:37.184929717
slp  14349:50:02.286168683
lat      3:11.510473038
stop        0.002468763

In the above example, java process with pid 39235 spent most of its time sleeping waiting to acquire locks in user space (ref: 'lock' field). It also spent a lot of time in just sleeping waiting for some work (ref: 'slp' field). User CPU time is the next major one (ref: 'user' field). The process spent a little bit of time in system space (ref: 'sys' field), waiting for CPU (ref: 'lat' field) and almost negligible amount of time in processing system traps (ref: 'trap' field) and in servicing data page faults (ref: 'dflt' field).

[5] ZFS : metaslab utilization

ZFS divides the space on each device (virtual or physical) into a number of smaller, manageable regions called metaslabs. Each metaslab is associated with a space map that holds information about the free space in that region by keeping track of space allocations and deallocations.

The following sample outputs show that the pool u01, whose single mirrored virtual device is made up of two physical disks, has 139 metaslabs. The number of segments and free/available space in each metaslab is also shown in those outputs.

# zpool list u01
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
u01   1.09T   133G  979G  11%  1.00x  ONLINE  -

# zpool status u01
  pool: u01
 state: ONLINE
  scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        u01                        ONLINE       0     0     0
          mirror-0                 ONLINE       0     0     0
            c0t5000CCA01D1DD4A4d0  ONLINE       0     0     0
            c0t5000CCA01D1DCE88d0  ONLINE       0     0     0

errors: No known data errors

# zdb -m u01

Metaslabs:
        vdev          0   ms_array         27
        metaslabs   139   offset                spacemap          free      
        ---------------   -------------------   ---------------   -------------
        metaslab      0   offset            0   spacemap     30   free    4.65M
        metaslab      1   offset    200000000   spacemap     32   free     698K
        metaslab      2   offset    400000000   spacemap     33   free    1.25M
        metaslab      3   offset    600000000   spacemap     35   free     588K
	..
	..
        metaslab     62   offset   7c00000000   spacemap      0   free       8G
        metaslab     63   offset   7e00000000   spacemap     45   free    8.00G
        metaslab     64   offset   8000000000   spacemap      0   free       8G
	...
	...
        metaslab    136   offset  11000000000   spacemap      0   free       8G
        metaslab    137   offset  11200000000   spacemap      0   free       8G
        metaslab    138   offset  11400000000   spacemap      0   free       8G

# zdb -mm u01   

Metaslabs:
        vdev          0   ms_array         27
        metaslabs   139   offset                spacemap          free      
        ---------------   -------------------   ---------------   -------------
        metaslab      0   offset            0   spacemap     30   free    4.65M
                          segments       1136   maxsize    103K   freepct    0%
        metaslab      1   offset    200000000   spacemap     32   free     698K
                          segments         64   maxsize    118K   freepct    0%
        metaslab      2   offset    400000000   spacemap     33   free    1.25M
                          segments        113   maxsize    104K   freepct    0%
        metaslab      3   offset    600000000   spacemap     35   free     588K
                          segments        109   maxsize   28.5K   freepct    0%
	...
	...

What is the purpose of this topic? Just to introduce the ZFS debugger, zdb (check the man page zdb(1M)), to power users who would like to dig a little deeper to find answers to tough questions such as whether a ZFS filesystem is fragmented.
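
As a cross-check on how to read the zdb -m output above: the offset column is printed in hexadecimal, so the consecutive offsets 0 and 200000000 imply a metaslab size of 0x200000000 bytes (8G), and 139 metaslabs of that size add up to roughly the 1.09T that zpool list reports. The arithmetic is sketched below in C; the function names are made up for illustration and are not part of any ZFS interface.

```c
#include <stdint.h>

/* Size of one metaslab, taken as the distance between two consecutive
   offsets reported by "zdb -m" (the offsets are printed in hexadecimal). */
uint64_t metaslab_size(uint64_t offset_a, uint64_t offset_b)
{
    return offset_b - offset_a;
}

/* Total space covered by count metaslabs of the given size, in units of
   2^30 bytes. */
uint64_t covered_gb(uint64_t count, uint64_t size_bytes)
{
    return (count * size_bytes) >> 30;
}
```

covered_gb(139, metaslab_size(0x0, 0x200000000ULL)) evaluates to 1112, and 1112/1024 is approximately 1.09.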

Keywords: ZFS zdb metaslab "space map"

[6] Roles can not login directly error on Solaris 11 and later

The root account in Solaris 11 is a role. A role is just like any other user account with the exception that a role account cannot be used to log in directly. Here is an example that shows the failure when attempting to log in directly as root.

login: root
Password: ********
Roles can not login directly

In this example, connecting as a normal user (an account that is not itself a role, and that has been assigned the root role) and then using su to assume the root role would succeed. This additional step is to prevent malevolent users from getting away with no accountability. Check Bart's blog post SPOTD: The Guide Book to Solaris Role-Based Access Control for some relevant information.

If security is not a primary concern, and if connecting directly as root user is desirable, simply change the root role into a user.

# rolemod -K type=normal root

This change does not affect users who are currently assigned the root role; they retain it. Other users who have root access can su to root or log in to the system as the root user. To remove the root role assignment from other local users, set the role to an empty string using the usermod command as shown in the following example.


/* assign root role to user 'giri' */
# usermod -R root giri

# roles giri
root

/* remove the role from user 'giri' */
# usermod -R "" giri
#

Keywords: RBAC, roles

[7] Large volume sizes (> 2 TB), and maximum size of UFS filesystem

As per the Solaris System Administration Guide, the maximum size of a UFS filesystem is ~16 TB.

To create a UFS file system greater than 2 TB, use EFI disk label. The EFI label provides support for physical disks and virtual disk volumes that are greater than 2 TB in size. Refer to the disk management section in Solaris System Administration Guide to find out the advantages and limitations of EFI.

Note that ZFS labels disks with an EFI label when creating a ZFS storage pool (zpool). And users in general need not be too concerned about the maximum size of a ZFS filesystem as it is several times larger than the maximum size supported by the UFS filesystem.

[8] powertop to observe the CPU power management

Although powertop was ported to Solaris and has been available as an add-on package from unofficial sources for the past few years, recent releases of Solaris bundle this tool with the core distribution. powertop can be used to monitor the effectiveness of CPU power management features on systems running Solaris. It also displays the clock frequency at which the CPU is operating along with the top events that are causing the CPU to wake up and use more energy.

Be aware that when CPU power management is enabled with the elastic policy in effect (default on Solaris 11 and later), the CPUs on the system are susceptible to CPU throttling under certain conditions either to conserve power or to reduce the amount of heat generated by the chip. In other words, based on the load on the system, the frequency of a microprocessor can be automatically adjusted on the fly. This is referred to as "CPU dynamic voltage and frequency scaling" (DVFS). Monitoring the output of powertop is one way to monitor the frequency levels of the processor on a busy system in order to minimize any performance related surprises. Set the power management policy to performance if letting CPUs run at full speed all the time is desired. The performance policy effectively disables CPU power management.

Power management settings can be controlled from the Service Processor's (SP) Integrated Lights Out Manager (ILOM) command line interface or browser user interface.

The following sample is gathered from an idle SPARC T5-8 server where the CPU power management was disabled.

                                                    Solaris PowerTOP version 1.3

Idle Power States       Avg     Residency             	Frequency Levels
C0 (cpu running)                (0.1%)                	 500 Mhz        0.0%
C1                      4.7ms   (99.9%)               	 800 Mhz        0.0%
                                                      	 933 Mhz        0.0%
                                                      	1067 Mhz        0.0%
                                                      	1200 Mhz        0.0%
							  ..
							  ..
                                                	3200 Mhz        0.0%
                                                	3333 Mhz        0.0%
                                                	3467 Mhz        0.0%
                                                	3600 Mhz      100.0%

Wakeups-from-idle per second: 109818.7  interval: 5.0s
no power usage estimate available

Top causes for wakeups:
94.4% (103630.7)               sched :  <xcalls> unix`dtrace_sync_func
 3.1% (3352.8)              OPMNPing :  <xcalls> unix`setsoftint_tl1
 1.1% (1155.6)                 sched :  <xcalls> unix`setsoftint_tl1
 0.4% (401.2)               <kernel> :  genunix`pm_timer
 0.3% (317.0)                  sched :  <xcalls> 
 0.2% (251.8)               <kernel> :  genunix`lwp_timer_timeout
 0.2% (204.4)                  sched :  <xcalls> unix`null_xcall
 0.1% (100.2)               <kernel> :  genunix`clock
 0.1% ( 65.6)               <kernel> :  genunix`cv_wakeup
 0.0% ( 50.2)               <kernel> :  SDC`sysdc_update
 0.0% ( 46.8)            <interrupt> :  mcxnex#0 
 0.0% ( 39.6)                   opmn :  <xcalls> unix`setsoftint_tl1
 0.0% ( 36.6)                   opmn :  <xcalls> 
 0.0% ( 36.4)                   opmn :  <xcalls> unix`vtag_flushrange_group_tl1
 0.0% ( 21.6)            <interrupt> :  ixgbe#0
	...
	...

Suggestion: enable CPU power management using poweradm(1m)

Q - Quit R - Refresh (CPU PM is disabled)