PeopleSoft: Fixing "msgget: No space left on device" Error on Solaris 10

When a high number of application server processes are configured across one or more PeopleSoft application server domains, the domain boot process is likely to fail with errors similar to the following:


Booting server processes ...
exec PSSAMSRV -A -- -C psappsrv.cfg -D CS90SPV -S PSSAMSRV :
Failed.
113954.ben15!PSSAMSRV.29746.1.0: LIBTUX_CAT:681: ERROR: Failure to create message queue
113954.ben15!PSSAMSRV.29746.1.0: LIBTUX_CAT:248: ERROR: System init function failed, Uunixerr = :
msgget: No space left on device

113954.ben15!tmboot.29708.1.-2: CMDTUX_CAT:825: ERROR: Process PSSAMSRV at ben15 failed with /T
tperrno (TPEOS - operating system error)

In this example, the PeopleSoft application server is running on a Solaris 10 system. Fortunately, the error message is clear: the failure is related to message queues. During the domain boot process, there is a call to msgget() to create a message queue. If the call succeeds, it returns a non-negative integer that serves as the identifier for the newly created message queue. On failure, it returns -1 and sets errno to EACCES, EEXIST, ENOENT or ENOSPC, depending on the underlying cause.

From the error messages above, it is evident that msgget() failed with errno set to ENOSPC (No space left on device). The msgget(2) man page on Solaris has the following explanation for the ENOSPC error code:

ERRORS
The msgget() function will fail if:
...
...
ENOSPC A message queue identifier is to be created but
the system-imposed limit on the maximum number of
allowed message queue identifiers system wide
would be exceeded. See NOTES.

NOTES
...
...

The system-imposed limit on the number of message queue
identifiers is maintained on a per-project basis using the
project.max-msg-ids resource control.

That is enough of a clue to suspect the configured limit on the number of message queue identifiers.
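As a rough illustration of reading the boot log this way, the sketch below scans log text for an ENOSPC message and prints the resource control worth checking. Only the msgget case comes from the error shown above. The helper name suggest_rctl is hypothetical, and the semget and shmget branches assume that those failures are logged in the same "call: message" form.

```shell
# suggest_rctl (hypothetical helper): read boot log text on stdin and print
# the Solaris 10 resource control that caps the identifier type involved.
suggest_rctl() {
    while IFS= read -r line; do
        case "$line" in
            *"No space left on device"*)
                case "$line" in
                    # Case taken from the log in this post.
                    msgget:*|*" msgget:"*) echo "project.max-msg-ids" ;;
                    # Assumed analogues for semaphores and shared memory.
                    semget:*|*" semget:"*) echo "project.max-sem-ids" ;;
                    shmget:*|*" shmget:"*) echo "project.max-shm-ids" ;;
                esac
                ;;
        esac
    done
}
```

It could be fed the Tuxedo log captured during the failed boot, for example with suggest_rctl < boot.log, where boot.log stands for whatever file holds that output.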

Prior to the release of Solaris 10, the System V IPC tunable msgsys:msginfo_msgmni in /etc/system could be configured to control the maximum number of message queues that can be created. The default value on pre-Solaris 10 systems was 50.

To reduce administrative overhead, the majority of the System V IPC tunables were made obsolete in the Solaris 10 operating system, and equivalent resource controls were created for the remaining ones. In Solaris 10 and later versions, System V IPC can be tuned on a per-project basis using these resource controls.

In Solaris 10, the resource control project.max-msg-ids replaced the old /etc/system tunable msginfo_msgmni, and the default value was raised to 128.

Now back to the failure in the PeopleSoft environment. Let's first check the current value of project.max-msg-ids.

  1. Get the project ID.

    % id -p
    uid=222227(psft) gid=2294(dba) projid=3(default)

  2. Using the prctl utility, examine the project.max-msg-ids resource control for the project with ID 3.

    % prctl -n project.max-msg-ids -i project 3
    project: 3: default
    NAME    PRIVILEGE       VALUE    FLAG   ACTION    RECIPIENT
    project.max-msg-ids
            privileged        128       -   deny              -
            system          16.8M     max   deny              -

Alternatively, run the command ipcs -q to check the number of active message queues. Note that the project with ID 3 is limited to the default maximum of 128 message queues. When the boot fails with this error, the number of active message queues in the ipcs -q output is likely to be at or near the configured value of project.max-msg-ids.
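The comparison between active queues and the limit can be sketched with a few small helpers. This is only an illustration under stated assumptions: the function names are hypothetical, count_queues assumes the Solaris ipcs layout in which each queue line starts with a lone "q", and privileged_limit assumes the prctl layout shown above. Keep in mind that ipcs -q lists queues system wide, so the count may include queues owned by other projects.

```shell
# Count message queue entries in `ipcs -q` output read from stdin.
# Assumes Solaris formatting, where every queue line begins with "q".
count_queues() {
    awk '$1 == "q" { n++ } END { print n + 0 }'
}

# Print the privileged value from `prctl -n project.max-msg-ids ...`
# output read from stdin.
privileged_limit() {
    awk '$1 == "privileged" { print $2; exit }'
}

# Summarize usage; both arguments must be plain integers.
report_headroom() {
    used=$1
    limit=$2
    echo "message queues: $used of $limit in use, $((limit - used)) free"
}
```

On the affected host, something like the following would tie the helpers together:

    % report_headroom "$(ipcs -q | count_queues)" "$(prctl -n project.max-msg-ids -i project 3 | privileged_limit)"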

Since the configured PeopleSoft application server domains appear to need more than 128 message queues to bring up all the application server processes, the solution is to raise the resource control project.max-msg-ids above 128. For the sake of simplicity, let's increase it to 256, which is twice the default value. The prctl utility can again be used to set the new value for the resource control.

  1. Log in as the 'root' user.

    % su
    Password:

  2. Increase the maximum value for the message queue identifiers to 256 using the prctl utility.

    # prctl -n project.max-msg-ids -r -v 256 -i project 3

  3. Verify the new maximum value for the message queue identifiers.

    # prctl -n project.max-msg-ids -i project 3
    project: 3: default
    NAME    PRIVILEGE       VALUE    FLAG   ACTION    RECIPIENT
    project.max-msg-ids
            privileged        256       -   deny              -
            system          16.8M     max   deny              -

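A value set with prctl does not survive a reboot. To make the change persistent, create a Solaris project with the desired limit and attach it to the OS user, as shown below. Both commands require root privileges.

    # projadd -p 100 -c "PeopleSoft App Server IPC Tuning" -K "project.max-msg-ids=(priv,256,deny)" psftappipc
    # usermod -K project=psftappipc psft

In the above example, "psftappipc" is the name of the Solaris project and "psft" is the OS user who manages the PeopleSoft application server. Processes that are already running stay in their current project, so the user needs to log in again before booting the domain under the new project.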
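To confirm that the limit really was recorded, one option is to read the entry back from /etc/project. The sketch below is a hypothetical helper, not part of Solaris: project_msg_limit takes a project name and reads project(4)-format lines (name:projid:comment:user-list:group-list:attributes) on stdin. It uses awk -v, so on Solaris it would probably need nawk or /usr/xpg4/bin/awk rather than the default /usr/bin/awk.

```shell
# project_msg_limit (hypothetical helper): print the project.max-msg-ids
# threshold recorded for the project named in $1.
# Input on stdin: lines in /etc/project format,
#   name:projid:comment:user-list:group-list:attributes
project_msg_limit() {
    awk -F: -v name="$1" '
        $1 == name {
            # The sixth field holds semicolon-separated attributes.
            n = split($6, attrs, ";")
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^project\.max-msg-ids=/) {
                    # The value looks like (priv,256,deny); keep the middle part.
                    split(attrs[i], parts, ",")
                    print parts[2]
                }
            }
        }'
}
```

With the project created above, project_msg_limit psftappipc < /etc/project would be expected to print the configured threshold, and empty output would suggest the attribute is missing from the entry.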

That's all there is to it. With the above change, the PeopleSoft application server domain(s) should boot without the "Failure to create message queue" and "msgget: No space left on device" errors.

[Original blog post is at:
http://technopark02.blogspot.com/2008/03/peoplesoft-fixing-msgget-no-space-left.html]
