More on TCP Timeouts in CORBA

I previously blogged here about how we configure TCP timeouts in the GlassFish ORB.  Scott Oaks recently discovered some cases where the default configuration needs to be changed. He also pointed out that the blog entry was missing some details about exactly HOW to set the appropriate TCP timeouts.

My previous blog entry referred to several properties that are defined in the class com.sun.corba.ee.impl.orbutil.ORBConstants.  In particular:

  • TRANSPORT_TCP_TIMEOUTS_PROPERTY is com.sun.corba.ee.transport.ORBTCPTimeouts
  • TRANSPORT_TCP_CONNECT_TIMEOUTS_PROPERTY is com.sun.corba.ee.transport.ORBTCPConnectTimeouts
  • WAIT_FOR_RESPONSE_TIMEOUT is com.sun.corba.ee.transport.ORBWaitForResponseTimeout

Any of these can be set with the corresponding -D option on the JVM command line, e.g. -Dcom.sun.corba.ee.transport.ORBTCPTimeouts=500:30000:20.
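If you launch the JVM from your own tooling, it can help to build that argument in one place rather than typing it by hand. Here is a minimal sketch: the class OrbTimeoutArgs and its jvmArg method are hypothetical names of mine, not part of the ORB. Only the property name comes from the list above.

```java
// Hypothetical helper, not part of the GlassFish ORB: it only formats the
// -D argument described above so that launch scripts share one definition.
final class OrbTimeoutArgs {
    // Property name as listed above (ORBConstants.TRANSPORT_TCP_TIMEOUTS_PROPERTY).
    static final String TCP_TIMEOUTS = "com.sun.corba.ee.transport.ORBTCPTimeouts";

    // Builds "-D<property>=<initial>:<max>:<backoff>" from the three values.
    static String jvmArg(int initialMs, int maxMs, int backoffPercent) {
        if (initialMs <= 0 || maxMs < initialMs || backoffPercent < 0) {
            throw new IllegalArgumentException("invalid timeout values");
        }
        return "-D" + TCP_TIMEOUTS + "=" + initialMs + ":" + maxMs + ":" + backoffPercent;
    }

    public static void main(String[] args) {
        // Prints: -Dcom.sun.corba.ee.transport.ORBTCPTimeouts=500:30000:20
        System.out.println(jvmArg(500, 30000, 20));
    }
}
```

The resulting string is what you would pass to the java command (or add to the server's JVM options) before the ORB is initialized.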

This is particularly important when running a very busy app server on a large machine (like a T2000). On a large request, the default 6-second timeout may be exceeded while the ORB waits for more data to be read. In this case, you may see errors logged like:

 

java.rmi.MarshalException: CORBA COMM_FAILURE 1398079696 Maybe; nested
exception is: org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 208 completed: Maybe

 or

java.rmi.MarshalException: CORBA MARSHAL 1398079699 Maybe; nested exception is:
org.omg.CORBA.MARSHAL: vmcid: SUN minor code: 211 completed: Maybe

Errors with completion status Maybe cannot be retried, because the client ORB cannot assume that the request has not already executed on the server side.

In this case, the ORBTCPTimeouts value needs to be increased, say to something like 500:30000:20 (times are in milliseconds). This means:

  • The first timeout is 0.5 seconds (500)
  • Each subsequent retry increases the timeout by 20% (20)
  • The maximum time we will wait is 30 seconds (30000). Due to some implementation details, the actual maximum is closer to double the configured value, or 60 seconds in this case.

We will probably increase the default in a future release, or possibly make it adapt to the observed load.
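To get a feel for how a setting like 500:30000:20 plays out, here is a rough sketch that lists the successive waits under a simplified model: start at the initial value, grow each wait by the backoff percentage, and stop once the accumulated time reaches the maximum. This is my own approximation of the three points above, not the ORB's actual code. The class TcpTimeoutSchedule is a hypothetical name, and the real implementation may behave differently near the limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: models "initial:max:backoff" as a growing
// sequence of waits whose running total is capped by max. Not the ORB's code.
final class TcpTimeoutSchedule {
    // Returns the successive waits (in ms) for a spec such as "500:30000:20".
    static List<Long> waits(String spec) {
        String[] parts = spec.split(":");
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected initial:max:backoff");
        }
        long wait = Long.parseLong(parts[0]);           // first timeout, ms
        long max = Long.parseLong(parts[1]);            // total time budget, ms
        long backoffPercent = Long.parseLong(parts[2]); // growth per retry, %
        if (wait <= 0 || max <= 0 || backoffPercent < 0) {
            throw new IllegalArgumentException("values must be positive");
        }
        List<Long> result = new ArrayList<>();
        long total = 0;
        while (total < max) {
            result.add(wait);
            total += wait;
            // Integer arithmetic: fractions of a millisecond are dropped.
            wait = wait * (100 + backoffPercent) / 100;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> schedule = waits("500:30000:20");
        // The schedule starts 500, 600, 720 and keeps growing by 20%.
        System.out.println(schedule);
    }
}
```

Under this model the last wait can push the total past the configured maximum, which is one simple way a real implementation could end up waiting longer than the number you configured.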


Comments:

Excellent article. We were having a horrible time trying to figure this out for returning large data sets. We ran into one issue: -Dcom.sun.corba.ee.transport.ORBTCPTimeouts=500:30000:20 would not work; we needed to add a fourth parameter (set to Integer.MAX_VALUE), and then everything seemed to work. I filed a bug report on the glassfish-corba project about not being able to set three parameters.

Posted by Jim McCollom on May 14, 2008 at 04:21 AM PDT #

About

kcavanaugh
