« Rman Behaviour with Transported Tablespaces Plugged -In Read-Only | Main | RAC with ASM on Linux, Crash Scenario: Data Disk Group Loss »

Cleanup Zombies Consuming CPU

I've got the following request:

Our CRM database is getting closer to hang, idle CPU is consistently close to 0

This database is well known to suffer from sessions that after finishing its work remain open and sometimes consuming a lot of resources.

To check zombies we look at v$session "where last_call_et /60/60 > 1" , that is sessions that are doing nothing for more than 1 hour. You need to take care to filter out sessions where username is null, because they are the background processes, and will stand out whit big numbers on last_call_et.

To check out use this script:

col sid for 99999
col username for a15
col last_call_et for 9999999
col et2 for 999999

select     sid,
    username,
    logon_time,
    last_call_et,
    last_call_et/60/60 et2,
    program
from     v$session
where     username is not null
and     last_call_et/60/60>1
order     by 4
/

To kill the processes on Unix you may generate a script using this select:

set echo off head off veri off pages 50000 lines 120
spool killzombies.sh

select 'kill -9 '||b.spid ||' # '||a.username||' '||a.program
from v$session a, v$process b
where a.program like '&1' and a.username ='&2'
and last_call_et/60/60>1
and a.paddr=b.addr
/

spool off

To execute the script change permissions:

chmod 700 killzombies.sh
./killzombies.sh

After killing the sessions idle times returned to normal:

{oracle} /cxt/u01/user CXT-PROD > sar -u 1 100

SunOS xtdrsd 5.8 Generic_117350-38 sun4u    04/30/07

18:07:56    %usr    %sys    %wio   %idle
18:07:57      51       9       4      36
18:07:58      48      13       5      34
18:07:59      53       8       3      36
18:08:00      50      10       3      38
18:08:01      55      16       3      26
18:08:02      47      13       5      35
18:08:03      49       7       4      41
18:08:04      46       7       6      42
18:08:05      36       5       7      52
18:08:06      51       9       7      33
18:08:07      47       7       8      39
18:08:08      31       6       8      55
18:08:09      54      10       7      30
18:08:10      63       8       5      24
18:08:11      50       7       5      38
18:08:12      46       6       4      44
18:08:14      55       9       4      33








TrackBack

TrackBack URL for this entry:
http://blogs.oracle.com/mte1521/mt-tb.cgi/4330

Comments (1)

Ofir Manor:

did you consider using a profile? you can limit idle_time per session to one hours, so these session will just time out. If it is generic, maybe you could alter the default profile.
http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_6010.htm
"If a user exceeds the CONNECT_TIME or IDLE_TIME session resource limit, then the database rolls back the current transaction and ends the session. When the user process next issues a call, the database returns an error."
I worked with it a few times, but it was a while ago... It might be interesting to hear if it works nicely with RAC

Post a comment

(If you haven't left a comment here before, you may need to be approved by the site owner before your comment will appear. Until then, it won't appear on the entry. Thanks for waiting.)

About This Entry

This page contains a single entry from the blog posted on April 30, 2007 10:12 AM.

The previous post in this blog was Rman Behaviour with Transported Tablespaces Plugged -In Read-Only.

The next post in this blog is RAC with ASM on Linux, Crash Scenario: Data Disk Group Loss.

Many more can be found on the main index page or by looking through the archives.

Top Tags

Powered by
Movable Type and Oracle