When looking at the documentation on how to configure your applications for continuous availability you come across the recommended connection string in the form of:
Alias = (DESCRIPTION =
(CONNECT_TIMEOUT= 90)(RETRY_COUNT=100)(RETRY_DELAY=3)
(TRANSPORT_CONNECT_TIMEOUT=1000ms)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=clu_site1-scan)(PORT=1521)))
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=clu_site2-scan)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME = my_service)))
This recommended connection string is widely used, but many references, like the High Availability 26ai documentation, don’t spell out how each parameter changes client connection behavior or what you should observe immediately after tuning. This blog post explains each parameter and shows how changing it affects connection establishment time.
To make the impact of each parameter visible, end-to-end connection time is measured with Linux “time” command. SQLcl is used because it runs on JDBC/Java, which mirrors many application stacks.
To simulate failures (no connection), I use two scenarios:
- Service unavailable: I don’t start the application service on the destination host, which mimics a Data Guard switchover/failover while the service relocates.
- Network timeout: I block port 1521 in the firewall to force a TCP connect timeout, which makes timeout settings easier to observe.
Additionally, SCAN IPs (not the SCAN name) are used to remove DNS lookup time and to control exactly which addresses the client tries. So my connection string essentially looks like this:
SSPDB=
(DESCRIPTION =
(CONNECT_TIMEOUT=1000ms)(RETRY_COUNT=10)(RETRY_DELAY=1000ms)
(TRANSPORT_CONNECT_TIMEOUT=1000ms)(LOAD_BALANCE=off)(FAILOVER=on)
(ADDRESS_LIST =
(LOAD_BALANCE=on)(FAILOVER=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=on)(FAILOVER=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
Timeouts are specified in milliseconds (for example, 1000ms). Millisecond units were added in later 12.2 clients; if your client rejects ms, you’re likely on an older, thereby unsupported, and unpatched client.
Recommendation: Always use the latest Oracle client available (26ai at the time of writing) and keep it patched, even when connecting to older database versions.
For each test, I start SQLcl with /nolog and run a script that establishes a single connect and then exits:
connect system/<password>@sspdb
exit;
To baseline the SQLcl startup overhead, I run the script with /nolog and exit only:
[oracle@noncdb bin]$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 18:29:50 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
real 0m1.656s
user 0m3.148s
sys 0m0.189s
In the best-case scenario (service registered and immediately reachable), the result of the script and duration of the test should be within range of the times shown below:
$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 18:27:44 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
Connected.
Disconnected from Oracle AI Database 26ai EE Extreme Perf Release 23.26.0.0.0 - for Oracle Cloud and Engineered Systems
Version 23.26.0.0.0
real 0m2.406s
user 0m5.049s
sys 0m0.236s
Estimated connect time ≈ (connect run “real”) − (baseline “real”) = 2.406s − 1.656s ≈ 0.75s.
This is an approximation because it includes script and client-side variance, but it’s consistent enough to compare parameter changes. In my environment, I don’t see a consistent timing difference between connecting via SCAN IPs and connecting directly to a local listener IP.
Next, let’s walk through the parameters one by one. I’ll start with a simple but important setting that isn’t even shown in the “recommended” string because it defaults to ON — yet it still affects how the client cycles through addresses:
1. FAILOVER=ON/OFF
FAILOVER controls client-side address failover (whether the client will try alternate addresses when a connect attempt fails). You can set it at two scopes:
- at the DESCRIPTION level (applies to the whole connect descriptor)
- inside each ADDRESS_LIST (applies only to that list)
SSPDB=
(DESCRIPTION =
(FAILOVER=off)
(ADDRESS_LIST =
(FAILOVER=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(ADDRESS_LIST =
(FAILOVER=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
If you set FAILOVER=off at the DESCRIPTION level, the client will stop after the first failing destination instead of moving on to the next ADDRESS_LIST. This means that address-list order can determine whether the client reaches the site where the service is running.
To make the effect visible, I put the standby SCAN first. The connect attempt returns ORA-12514, which indicates the service isn’t registered with that listener.
$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 18:41:16 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
Error starting at line : 1 File @ /home/oracle/sqlcl/bin/test.sql
In command -
connect ...
Error report -
Connection Failed
Connection failed
USER = system
URL = jdbc:oracle:thin:@sspdb4
Error Message = ORA-12514: Cannot connect to database. Service SSOCDB_SSPDB.paas.oracle.com is not registered with the listener at host 10.46.100.15 port 1521. (CONNECTION_ID=Kck7VnDZTGmr4uv/jdyoiQ==)
https://docs.oracle.com/error-help/db/ora-12514/
real 0m1.982s
user 0m4.168s
sys 0m0.208s
With FAILOVER=on in the ADDRESS_LIST, the client can cycle through the SCAN IPs until it either connects or receives a definitive error. In this case, the listener responds quickly with ORA-12514 (“service not registered”), so the failure returns fast. You’ll see larger delays when the target host/port is not reachable (TCP connect timeout), which is covered later.
If you also set FAILOVER=off inside the ADDRESS_LIST, the client returns the error after the first SCAN IP and doesn’t even try the other two (because LOAD_BALANCE is off; see the next section). In my test, the difference isn’t measurable because ORA‑12514 returns so quickly.
Takeaway: keep FAILOVER=on unless you have a specific reason to restrict connection failover attempts. Turning it off can make connections order-dependent and reduce the client’s ability to reach an available destination. (FAILOVER is on by default).
2. LOAD_BALANCE = ON/OFF
Like FAILOVER, LOAD_BALANCE can be set at two levels: at the DESCRIPTION level and inside an ADDRESS_LIST. Unlike FAILOVER, LOAD_BALANCE is off by default in common connect descriptors.
With LOAD_BALANCE=off inside an ADDRESS_LIST, the client tries addresses in a stable order (for example, the order returned by DNS or the order listed in the descriptor). That can create two issues:
- One SCAN IP can receive a disproportionate share of connection attempts (especially with connection pools that reuse the same resolved order).
- If the first address is unreachable, each connection attempt can incur a TCP connect timeout (TRANSPORT_CONNECT_TIMEOUT) before the client moves on.
With LOAD_BALANCE=on, the starting address is randomized, so the “slow first hop” is spread across attempts instead of hitting the same address every time.
SSPDB=
(DESCRIPTION =
(TRANSPORT_CONNECT_TIMEOUT=2000ms)(LOAD_BALANCE=off)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=off)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
For this test, I make the first SCAN IP in the primary ADDRESS_LIST unreachable and configure the connection try to give up on that IP after 2 seconds. Three connection attempts are run per setting. (With LOAD_BALANCE=on, I occasionally needed extra runs to capture a “slow first hop,” because the starting address is randomized.) The table below only shows the connect times:
| LOAD_BALANCE | 1st Try | 2nd Try | 3rd Try | Median |
|---|---|---|---|---|
| OFF | 0m4.512s | 0m4.420s | 0m4.469s | 0m4.467 |
| ON | 0m2.447s | 0m2.404s | 0m4.425s | 0m3.092 |
In these tests, enabling LOAD_BALANCE=on inside the ADDRESS_LIST reduced the median connect time when one SCAN IP was unreachable. If you want more consistent connection establishment time under partial network failures, LOAD_BALANCE=on is a strong default.
LOAD_BALANCE at the DESCRIPTION level
Setting LOAD_BALANCE above the ADDRESS_LISTs controls whether the client distributes initial attempts across sites or prefers the first site listed. Which setting is “best” depends on how often the active service moves:
- If you strongly prefer one site (and switch over rarely), keep LOAD_BALANCE=off so normal connects are fast and predictable; after a switchover, connects may take longer until you update the order or fall back.
- If roles change frequently (for example, with Fast-Start Failover), consider LOAD_BALANCE=on to reduce the penalty when the service is active at the non-preferred site.
- If you fail over application hosts as well as the database (DR with app relocation), prefer the local site first (LOAD_BALANCE=off) and change the site order as part of the DR procedure.
To illustrate this behavior, consider the following example. Assume the primary host is completely unavailable and that each of the three configured IP addresses is attempted with a connection timeout of two seconds.
The service “Primary” represents a service running on the primary database. The service “Standby” represents the same service after a switchover, where the former standby database has assumed the primary role. For simplicity, the example assumes four connection attempts in each scenario.
| Service | LOAD_BALANCE | 1st | 2nd | 3rd | 4th | Median |
|---|---|---|---|---|---|---|
| Primary | OFF | 0m2.498s | 0m2.446s | 0m2.448s | 0m2.417s | 0m2.452s |
| Standby | OFF | 0m8.660s | 0m8.449s | 0m8.435s | 0m8.436s | 0m8.495s |
| Primary | ON | 0m8.463s | 0m2.435s | 0m2.451s | 0m8.489s | 0m5.456s |
| Standby | ON | 0m8.464s | 0m8.462s | 0m2.427s | 0m2.406s | 0m5.439s |
In this example, LOAD_BALANCE=on produces more consistent connect times regardless of where the service is active. If you strongly prefer one site and switch over rarely, LOAD_BALANCE=off typically gives faster connects during normal operation—at the cost of slower connects immediately after a switchover.
3. RETRY_COUNT
With FAILOVER=on, the client can attempt multiple addresses from the connect descriptor before it gives up. LOAD_BALANCE only influences the order of those address attempts.
In my descriptor, there are six SCAN IPs. One full pass through those six addresses is one “round.” When the hosts are reachable and the listener responds immediately with “service not registered,” a round completes quickly (under ~300 ms in my tests).
During a service relocation (RAC) or a switchover (Data Guard), the service may be unavailable for a short window. A single round through the address list can complete before the service is active again, so the client can fail “too early.” Retries matter because the client typically reports a final error only after it has exhausted its configured attempts.
RETRY_COUNT is the number of additional rounds the client performs after the first round.
Total rounds = 1 + RETRY_COUNT. Examples:
- RETRY_COUNT=0 → 1 round (default)
- RETRY_COUNT=1 → up to 2 rounds
SSPDB=
(DESCRIPTION =
(RETRY_COUNT=10)(RETRY_DELAY=0ms)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
To show the impact, Ilet’s compare the timing for a single round (RETRY_COUNT=0) versus multiple retries. Because the service isn’t running, each run ends with an error. The output below shows an example with RETRY_COUNT=10.
Caution: avoid RETRY_DELAY=0 in production. Tight retry loops can create unnecessary load on listeners and networks. I use RETRY_DELAY=0ms here only to show the upper bound on retry speed (I’ll cover RETRY_DELAY next).
$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 21:23:06 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
Error starting at line : 1 File @ /home/oracle/sqlcl/bin/test.sql
In command -
connect ...
Error report -
Connection Failed
Connection failed
USER = system
URL = jdbc:oracle:thin:@sspdb
Error Message = ORA-12514: Cannot connect to database. Service SSOCDB_SSPDB.paas.oracle.com is not registered with the listener at host 10.46.100.128 port 1521. (CONNECTION_ID=wzeNBdjKRA+U3JYyp5DQIA==)
https://docs.oracle.com/error-help/db/ora-12514/
real 0m2.165s
user 0m4.407s
sys 0m0.224s
Here a small table to show how the time increases.
| RETRY_COUNT | Time |
|---|---|
| 0 | 0m2.036s |
| 10 | 0m2.165s |
| 50 | 0m2.225s |
| 100 | 0m2.484s |
| 1000 | 0m6.222s |
4. RETRY_DELAY
RETRY_DELAY gives the system a few seconds between retries so the service has time to start (or to complete a switchover/failover). Retrying in a tight loop (for example, RETRY_DELAY=0) can create avoidable load on listeners and networks because the client repeatedly triggers fast failures such as “service not registered”. When you set RETRY_COUNT, also set RETRY_DELAY. The 1-second delay is often too short for planned role transitions, which is why MAA commonly recommends 3 seconds.
RETRY_DELAY is the time the client waits between rounds.
SSPDB=
(DESCRIPTION =
(RETRY_COUNT=10)(RETRY_DELAY=1)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=on)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
This can directly be seen on the connection timings, if the service isn’t started:
$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 21:36:06 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
Error starting at line : 1 File @ /home/oracle/sqlcl/bin/test.sql
In command -
connect ...
Error report -
Connection Failed
Connection failed
USER = system
URL = jdbc:oracle:thin:@sspdb
Error Message = ORA-12514: Cannot connect to database. Service SSOCDB_SSPDB2.paas.oracle.com is not registered with the listener at host 10.46.100.152 port 1521. (CONNECTION_ID=q6g58oH2SZejJ1FGDX7Fbg==)
https://docs.oracle.com/error-help/db/ora-12514/
real 0m12.137s
user 0m4.243s
sys 0m0.269s
So here also a table with RETRY_DELAY of 1s:
| RETRY_COUNT | Time |
|---|---|
| 0 | 0m2.036s |
| 10 | 0m12.137s |
| 20 | 0m22.145s |
| 50 | 0m52.320s |
| 100 | 1m42.673s |
If all SCAN IPs are reachable, you can estimate total wait time from RETRY_COUNT and RETRY_DELAY as documented here. Using the “rounds” definition above: the client performs 1 + RETRY_COUNT rounds, but the delay only applies between rounds. A practical estimate is:
Total time ≈ initial round time + (RETRY_COUNT × RETRY_DELAY)
Example: RETRY_COUNT=100, RETRY_DELAY=3s, initial round ≈ 2s → total ≈ 2s + 300s = 302s.
When all SCAN IPs are reachable, this configuration can make the client wait roughly 300 seconds before giving up. Next, we’ll look at what changes when an IP address is unreachable (for example, during a failover or when a SCAN IP is blocked).
This is the task of the next parameter:
5. TRANSPORT_CONNECT_TIMEOUT
When a host or port is unreachable, a connect attempt can block until the underlying TCP stack times out, which can be far longer than an application can tolerate. TRANSPORT_CONNECT_TIMEOUT caps how long the client waits for the TCP connect to complete for each address.
This matters both for SCAN addresses and for the redirected local listener address returned by SCAN.
TRANSPORT_CONNECT_TIMEOUT defaults to 60 seconds, so it’s usually worth setting explicitly for HA connection strings. Choose a value that reflects worst-case network latency (don’t set it so low that normal jitter causes ORA-12170).
TRANSPORT_CONNECT_TIMEOUT provides similar control as TCP.CONNECT_TIMEOUT in sqlnet.ora; however, it will override it for the connect descriptor.
For these tests, I set LOAD_BALANCE=off to make address order deterministic and use TRANSPORT_CONNECT_TIMEOUT=1000ms so an unreachable IP adds ~1 second of delay. Most cases are run with RETRY_COUNT=0, and RETRY_COUNT=2 is only used where retries change the outcome.
SSPDB=
(DESCRIPTION =
(RETRY_COUNT=0)(RETRY_DELAY=2)(TRANSPORT_CONNECT_TIMEOUT=1000ms)
(ADDRESS_LIST =
(LOAD_BALANCE=off)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=off)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
Let’s investigate how different failure scenarios are affected by TRANSPORT_CONNECT_TIMEOUT. For the first three cases, we use a staggered SCAN IP unavailability scenario as follows:
- One SCAN IP unreachable
- Two out of 3 SCAN IPs inaccessible
- All Three SCAN IPs unreachable
In these cases the effects are as follows:
| Case 1: One SCAN IP unreachable | Case 2: Two out of 3 SCAN IPs inaccessible | Case 3: All Three SCAN IPs unreachable |
| After waiting one second on the first IP address, it connects to the next. RETRY_COUNT is irrelevant because the first pass succeeds: real 0m3.435s user 0m5.177s sys 0m0.273s | Overall, the connection time increases by another second, as the second SCAN IP is also not available: real 0m4.432s user 0m5.084s sys 0m0.255s | This is the typical scenario after a primary failure and standby failover, with the service already running on the standby. RETRY_COUNT has no effect because the service is available immediately: real 0m5.421s user 0m4.963s sys 0m0.233s |
Next we investigate a few other standard cases and one experimental case:
- Primary Cluster down and Service not yet started on the standby cluster
- Primary and Standby Cluster down
- Primary Cluster local IPs not available
| Case 4: Primary Cluster down no Service yet | Case 5: Primary and Standby Cluster down | Case 6: Primary local IPs not available | |
| Error Message | ORA-12514: Cannot connect to database. Service SSOCDB_SSPDB.paas.oracle.com is not registered with the listener at host 10.46.100.15 port 1521 | ORA-12170: Cannot connect. TCP connect timeout of 1000ms for host 10.46.100.15 port 1521. | ORA-12514: Cannot connect to database. Service SSOCDB_SSPDB.paas.oracle.com is not registered with the listener at host 10.46.100.15 port 1521. |
| RETRY_COUNT 0 | real 0m5.006s user 0m4.021s sys 0m0.228s | real 0m8.051s user 0m4.174s sys 0m0.212s | |
| RETRY_COUNT 2 | real 0m13.104s user 0m4.325s sys 0m0.249s | real 0m21.988s user 0m4.116s sys 0m0.222s | real 0m10.022s user 0m4.045s sys 0m0.228s |
| Remark | The connection fails, with the information that the service is not available on the Standby. This indicates the connection hasn’t waited long enough for the failover to finish and RETRY_COUNT should be increased. | As all 6 SCAN IPs are not reachable overall connection timeout changes. Unlike in Case 4 where the reply from the Standby is pretty much instant. | The client connects to SCAN successfully and receives a redirect to a local listener. As the local listener is unreachable, the TCP connect attempt times out (TRANSPORT_CONNECT_TIMEOUT) for each local listener hosting the service. |
| Max Connection Wait Time Formula | 3 SCANs x TRANSPORT_CONNECT_TIMEOUT) + (RETRY_COUNT x (RETRY_DELAY + (3 SCANS x TRANSPORT_CONNECT_TIMEOUT))) | (6 SCANs x TRANSPORT_CONNECT_TIMEOUT) + (RETRY_COUNT x (RETRY_DELAY + (6 SCANS x TRANSPORT_CONNECT_TIMEOUT))) | (# Local Listeners the service runs on x TRANSPORT_CONNECT_TIMEOUT) + (RETRY_COUNT x (RETRY_DELAY + (# Local Listeners the service runs on x TRANSPORT_CONNECT_TIMEOUT))) |
| Max Connection Wait Time for Recommended Connection String | (3 x 1) + (100 x (3 + (3 x 1)) = 3 + 600 = 603 sec | 6 x 1) + (100 x (3 + (6 x 1)) = 6 + 900 = 906 sec |
TRANSPORT_CONNECT_TIMEOUT – Summary
- Set TRANSPORT_CONNECT_TIMEOUT low to speed up failover from unreachable addresses, but not so low that healthy connections trigger an ORA-12170.
- Choose a value above the worst-case TCP connect latency in your environment.
- Check whether SCAN resolves to both IPv4 and IPv6, as additional addresses per round can increase worst-case connection time.
6. CONNECT_TIMEOUT
CONNECT_TIMEOUT is often misunderstood because it limits the duration of a single connection attempt, including server-process creation. If the timeout is reached, the client returns an error immediately, regardless of RETRY_COUNT or RETRY_DELAY.
Set it high enough for connections to succeed under peak load and significantly above TRANSPORT_CONNECT_TIMEOUT. Since it applies per attempt, it does not need to cover retry cycles. The commonly recommended 90 seconds is conservative, as CONNECT_TIMEOUT mainly protects against slow connection completion rather than unreachable hosts.
Example: what happens when CONNECT_TIMEOUT is hit. To force this in a lab, I intentionally set TRANSPORT_CONNECT_TIMEOUT higher than CONNECT_TIMEOUT (don’t do this in production):
SSPDB=
(DESCRIPTION =
(CONNECT_TIMEOUT=2000ms)(RETRY_COUNT=100)(RETRY_DELAY=1)
(TRANSPORT_CONNECT_TIMEOUT=5000ms)
(ADDRESS_LIST =
(LOAD_BALANCE=off)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.124)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.225)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.102)(PORT=1521))
)
(ADDRESS_LIST =
(LOAD_BALANCE=off)
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.152)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.128)(PORT=1521))
(ADDRESS = (PROTOCOL = TCP)(HOST=10.46.100.15)(PORT=1521))
)
(CONNECT_DATA=(SERVICE_NAME = SSOCDB_SSPDB.paas.oracle.com)
)
)
Error ORA-18953 is returned:
$ time sql /nolog @test.sql
SQLcl: Release 25.4 Production on Fri Mar 13 23:16:41 2026
Copyright (c) 1982, 2026, Oracle. All rights reserved.
Error starting at line : 1 File @ /home/oracle/sqlcl/bin/test.sql
In command -
connect ...
Error report -
Connection Failed
Connection failed
USER = system
URL = jdbc:oracle:thin:@sspdb
Error Message = ORA-18953: Socket connect timed out. (CONNECTION_ID=2qKjy28+RgWOi8MMqugW3A==)
https://docs.oracle.com/error-help/db/ora-18953/
real 0m4.003s
user 0m4.032s
sys 0m0.265s
Interesting additional observations
- Throughout my tests, I used values in milliseconds to highlight the impact. E.g. for:
- RETRY_COUNT, RETRY_DELAY, and TRANSPORT_CONNECT_TIMEOUT
- While a JDBC client has no issues working with these settings, I recognized a few issues by using ms settings with TNSPING and SQL*Plus:
- TNSPING won’t operate if TRANSPORT_CONNECT_TIMEOUT is shorter than one second
- SQL*Plus seems to round up the RETRY_DELAY and TRANSPORT_CONNECT_TIMEOUT to full seconds (in my tests), which is the reason I used SQLcl to showcase the scenarios.
Summary
This article explained how to help applications to benefit from the Oracle built-in capability to wait for a connection until the service is up and running. This approach applies to both Oracle RAC and Data Guard and typically results in cleaner behavior during relocations, switchovers, and failovers. It is better for the Oracle client to wait for the service to come up than to return a connection error to the application and rely on application-level retries.
You should now be able to tune these parameters beyond the generic recommendation and understand why RAC and Data Guard often benefit from different connection profiles.
If you use a connection pool, align its timeouts with the Oracle client’s worst-case wait time: the pool’s acquisition timeout should be longer than the client’s maximum connection wait. Otherwise, the pool may start new connection attempts while earlier ones are still in progress, causing a connection storm and increasing load during an outage.
References:
- High Availability Overview and Best Practices – Recommended Connection String – https://docs.oracle.com/en/database/oracle/oracle-database/26/haovw/configuring-level-1-basic-application-high-availability.html#GUID-526F4E44-1F2B-427B-A96F-3243CEF3DA17
- High Availability Overview and Best Practices – Oracle Net TNS String Parameters – https://docs.oracle.com/en/database/oracle/oracle-database/26/haovw/oracle-net-tns-string-parameters.html
- High Availability Overview and Best Practices – Connection Time Estimates During Data Guard Switchover or Failover – https://docs.oracle.com/en/database/oracle/oracle-database/26/haovw/connection-time-estimates-data-guard-switchover-or-failover.htm
- Database Net Services Reference – Parameters for sqlnet.ora – https://docs.oracle.com/en/database/oracle/oracle-database/26/netrf/parameters-for-the-sqlnet.ora.html
