Monday Feb 29, 2016

node-oracledb 1.7.0 has a connection pool queue (Node.js add-on for Oracle Database)

Node-oracledb 1.7.0, the Node.js add-on for Oracle Database, is on NPM.

Top features: a new connection pool queue to make apps more resilient, and "Bind by position" syntax for PL/SQL Index-by array binds.

This release has a couple of interesting changes as well as some small bind fixes. A few reported build warnings with some compilers were also squashed.

Extended PL/SQL Index-by Array Bind Syntax

To start with, a followup PR from @doberkofler completes his PL/SQL Index-by array binding support project. In node-oracledb 1.7 he has added "bind by position" syntax to the already existing "bind by name" support. Thanks Dieter! The "bind by position" syntax looks like:

connection.execute(
  "BEGIN mypkg.myinproc(:id, :vals); END;",
  [
    1234,
    { type: oracledb.NUMBER,
       dir: oracledb.BIND_IN,
       val: [1, 2, 23, 4, 10]
    }
  ],
  function (err) { . . . });

Personally I'd recommend using bind by name for clarity, but this PR makes the feature congruent with binding scalar values, which is always a good thing.

Documentation is at PL/SQL Collection Associative Array (Index-by) Bind Parameters.

New Transparent JavaScript Wrapper for Existing Classes

The other major change in 1.7 is a new JavaScript wrapper over the current node-oracledb C++ API implementation, courtesy of some community discussion and the direction that users seemed to have been heading in: creating similar wrappers. It was also the result of some 'above and beyond' overtime from Dan McGhan who did the project. This wrapper should be transparent to most users. It gives a framework that will make it easier to extend node-oracledb in a consistent way and also let developers who know JavaScript better than C++ contribute to node-oracledb.

New Connection Pool Queue Enabled by Default

The layer has let Dan add his first new user feature: a request queue for connection pooling. It is enabled by a new Boolean pool attribute queueRequests. If a pool.getConnection() request is made but there are no free connections (aka sessions) in the pool, the request will now be queued until an in-use connection is released back to the pool. At this time the first request in the queue will be dequeued, and the underlying C++ implementation of pool.getConnection() will be called to return the now available connection to the waiting requester.

A second new pool attribute queueTimeout uses setTimeout to automatically dequeue and return an error for any request that has been waiting in the queue too long. The default value is 60000 milliseconds, i.e. 60 seconds. In normal cases, when requests are dequeued because a connection does become available, the timer is stopped before the underlying C++ layer gets called to return the connection.

The pool queue is enabled by default. If it is turned off, you get pre-1.7 behavior. For example if more requests are concurrently thrown at an app than the poolMax value, then some of the pool.getConnection() calls would likely return an error ORA-24418: Cannot open further sessions. When enabled, the new queue nicely stops this error occurring and lets apps be more resilient.

The pool option attribute _enableStats turns on lightweight gathering of basic pool and queue statistics. It is false by default. If it is enabled, applications can output stats to the console by calling pool._logStats() whenever needed. I think it will be wise to monitor the queue statistics to make sure your pool configuration is suitable for the load. You don't want the queue to be an invisible bottle neck when too many pool.getConnection() requests end up in the queue for too long. Statistics and the API may change in future, so the attribute and method have an underscore prefix to indicate they are internal.

Connection Queue Example

To look at an example, I used ab to throw some load at an app based on examples/webapp.js I used a load concurrency of 25 parallel requests. The pool had a maximum of 20 sessions in its pool. The extra load was nicely handled by the connection queue without the application experiencing any connection failures.

I'd modified the app to check for a particular URL and dump statistics on request:

    . . .
    var hs = http.createServer (
      function(request, response)
      {
        var urlparts = request.url.split("/");
        var arg = urlparts[1];
        if (arg === 'stats') {
          pool._logStats();
        }
    . . .

Here is snapshot of the output from _logStats() at one point during the test:

Pool statistics:
...total connection requests: 26624147
...total requests enqueued: 5821874
...total requests dequeued: 5821874
...total requests failed: 0
...total request timeouts: 0
...max queue length: 6
...sum of time in queue (milliseconds): 13920717
...min time in queue (milliseconds): 0
...max time in queue (milliseconds): 1506
...avg time in queue (milliseconds): 2
...pool connections in use: 12
...pool connections open: 20
Related pool attributes:
...queueRequests: true
...queueTimeout (milliseconds): 0
...poolMin: 10
...poolMax: 20
...poolIncrement: 10
...poolTimeout: 0
...stmtCacheSize: 30
Related environment variables:
...process.env.UV_THREADPOOL_SIZE: undefined

The connection pool was semi-arbitrarily configured for testing. It started out with 10 sessions open (poolMin) and as soon as they were in use, the pool would have grown by another 10 sessions (poolIncrement) to the maximum of 20 (poolMax).

What the stats show is that not all pool.getConnection() requests could get a pooled connection immediately. About 20% of requests ended up waiting in the queue. The connection pool poolMax is smaller than optimal for this load.

The queue was never large; it never had more than 6 requests in it. This is within expectations since there are at least 5 more concurrent requests at a time than there are connections available in the pool.

If this were a real app, I might decide to increase poolMax so no pool.getConnection() call ever waited. (I might also want to set poolTimeout so that when the pool was quiet, it could shrink, freeing up DB resources.) However the average wait time of 2 milliseconds is small. If I don't have DB resources to handle the extra sessions from a bigger pool, I might decide that a 2 millisecond wait is OK and that the pool size is fine.

At least one connection spent 1.5 seconds in the queue. Since I know my test infrastructure I'm guessing this was when the pool ramped up in size and my small, loaded DB took some time to create the second set of 10 sessions. Maybe I should experiment with a smaller poolIncrement or bigger poolMin?

Another important variable shown in the stats is UV_THREADPOOL_SIZE. I'd not set it so there were the default four worker threads in the Node process. Blindly increasing poolMax may not always help throughput. If DB operations take some time, you might find all threads get blocked waiting for their respective DB response. Increasing UV_THREADPOOL_SIZE may help improve application throughput.

The best settings for pool configuration, UV_THREADPOOL_SIZE, and any DRCP pool size will depend on your application and environment.

Connection Pool Queue Statistics

The table below shows the node-oracledb 1.7 pool statistics descriptions. These stats and the APIs to enable and log them may change in future versions of node-oracledb. I look forward to getting some PRs, for example to add a standard logging capability which the stats generation can be part of.

Connection Pool MetricDescription
total connection requests

Number of pool.getConnection() calls made for this pool

total requests enqueued

Number of connections that couldn't be satisfied because every session in the the pool was already being used, and so they had to be queued waiting for a session to be returned to the pool

total requests dequeued

Number of connection requests that were removed from the queue when a free connection has become available. This is triggered when the app has finished with a connection and calls release() to return it to the queue

total requests failed

Number of connection calls that invoked the underlying C++ pool.getConnection() callback with an error state. Does not include queue request timeout errors.

total request timeouts

Number of connection requests that were waiting in the queue but exceeded the queueTimeout setting. The timeout is triggered with a JavaScript setTimeout call

max queue length

Maximum number of connection requests that were ever waiting at one time

sum of time in queue

Total sum of time that connection requests have been waiting in the queue

min time in queue

Smallest amount of time that any request waited in the queue

max time in queue

Longest amount of time that any request waited in the queue

avg time in queue

Derived from the sum of time value divided by the number of requests enqueued

pool connections in use

A metric returned by the underlying Oracle C client session pool implementation. It is the number of currently active connections in the connection pool

pool connections open

Also returned by the underlying library. It shows the number of currently open sessions in the underlying connection pool

Note that the sum of time in queue, the min time in queue and the max time in queue values are calculated when queued requests are removed from the queue, so they don't record the amount of time for requests still waiting in the queue.

Resources

Issues and questions about node-oracledb can be posted on GitHub. We value your input to help prioritize work on the add-on. Drop us a line!

node-oracledb installation instructions are here.

Node-oracledb documentation is here.

Thursday Mar 21, 2013

Python cx_Oracle and Oracle 11g DRCP Connection Pooling

The topic of Oracle 11g DRCP connection pooling in Python cx_Oracle came up twice this week for me. DRCP is a database tier connection pooling solution which is great for applications run in multiple processes. There is a whitepaper on DRCP that covers a lot of background and talks about configuration. This whitepaper is ostensibly about PHP but is good reading for all DRCP users.

The first DRCP and cx_Oracle scenario I dealt with was a question about mod_python.

To cut a long story short, I created a handler and requested it 1000 times via Apache's 'ab' tool. In my first script, and despite having increased the default pool parameters, there were a high number of NUM_WAITS. Also NUM_AUTHENTICATIONS was high. Performance wasn't the best. Querying V$CPOOL_CC_STATS showed:

CCLASS_NAME  NUM_REQUESTS   NUM_HITS NUM_MISSES  NUM_WAITS NUM_AUTHENTICATIONS
------------ ------------ ---------- ---------- ---------- -------------------
HR.CJDEMO1           1000        992          8        478                1000

At least the session information in each DRCP server was reused (shown via a high NUM_HITS).

Results were better after fixing the script mptest.py to look like:

from mod_python import apache
import cx_Oracle
import datetime

# mptest.py
# Example: Oracle 11g DRCP with cx_Oracle and mod_python

# These pool params are suitable for Apache Pre-fork MPM
mypool = cx_Oracle.SessionPool(user='hr', password='welcome',
         dsn='localhost/orcl:pooled', min=1, max=2, increment=1)

def handler(req):
    global mypool

    req.content_type = 'text/html'
    n = datetime.datetime.now()
    req.write (str(n) + "<br>");

    db = cx_Oracle.connect(user='hr', password='welcome',
            dsn='localhost/orcl:pooled', pool=mypool, cclass="CJDEMO1",
            purity=cx_Oracle.ATTR_PURITY_SELF)

    cur = db.cursor()
    cur.execute('select * from locations')
    resultset = cur.fetchall()
    for result in resultset:
        for item in result:
            req.write (str(item) + " ")
        req.write ("<br>")
    cur.close()
    mypool.release(db)

    return apache.OK

The 'ab' benchmark on this script ran much faster and the stats from V$CPOOL_CC_STATS looked much better. The number of authentications was right down about to about 1 per Apache (ie. mod_python) process:

CCLASS_NAME  NUM_REQUESTS   NUM_HITS NUM_MISSES  NUM_WAITS NUM_AUTHENTICATIONS
------------ ------------ ---------- ---------- ---------- -------------------
HR.CJDEMO1           1000        977         23         13                  26

The NUM_HITS was high again, because the DRCP purity was ATTR_PURITY_SELF. If I hadn't wanted session information to be reused each time the handler was executed, I could have set the purity to ATTR_PURITY_NEW. If I'd done this then NUM_HITS would have been low and NUM_MISSES would have been high.

If you're testing this yourself, before restarting the DRCP pool don't forget to shutdown Apache to close all DB connections. Otherwise restarting the pool will block. Also, if you're interpreting your own V$CPOOL_CC_STATS stats don't forget to account for the DRCP "dedicated optimization" that retains an association between clients (mod_python processes) and the DB. The whitepaper previously mentioned discusses this.

The second place where DRCP and python came up this week was on the cx_Oracle mail list. David Stanek posed a question. He was seeing application processes blocking while waiting for a DRCP pooled server to execute a query. My variant of David's script is drcp2.py:

import os
import time
import cx_Oracle

# drcp2.py
# Example: Sub-optimal connection pooling with Oracle 11g DRCP and cx_Oracle

def do_connection():
    print 'Starting do_connection ' + str(os.getpid())
    con = cx_Oracle.connect(user=user, password=pw, dsn=dsn, cclass="CJDEMO2",
           purity=cx_Oracle.ATTR_PURITY_SELF)
    cur = con.cursor()
    print 'Querying ' + str(os.getpid())
    cur.execute("select to_char(systimestamp) from dual")
    print cur.fetchall()
    cur.close()
    con.close()
    print 'Sleeping ' + str(os.getpid())
    time.sleep(30)
    print 'Finishing do_connection ' + str(os.getpid())
 
user = 'hr'
pw = 'welcome'
dsn = 'localhost/orcl:pooled'
for x in range(100):
    pid = os.fork()
    if not pid:
        do_connection()
        os._exit(0)

This script forks off a bunch of processes - more than the number of pooled DRCP servers (see MAXSIZE in the DBA_CPOOL_INFO view). The first few processes grab a DRCP server from the pool and do their query. But they don't release the DRCP server back to the DRCP pool until after the sleep() when the process ends. The other forked processes are blocked waiting for those DRCP servers to become available. This isn't optimal pool sharing.

My suggestion was to use an explicit cx_Oracle session pool like drcp3.py:

import os
import time
import cx_Oracle

# drcp3.py
# Example: Connection pooling with Oracle 11g DRCP and cx_Oracle
 
def do_connection():
    print 'Starting do_connection ' + str(os.getpid())
    mypool = cx_Oracle.SessionPool(user=user,password=pw,dsn=dsn,min=1,max=2,increment=1)
    con = cx_Oracle.connect(user=user, password=pw,
          dsn=dsn, pool = mypool, cclass="CJDEMO3", purity=cx_Oracle.ATTR_PURITY_SELF)
    cur = con.cursor()
    print 'Querying ' + str(os.getpid())
    cur.execute("select to_char(systimestamp) from dual")
    print cur.fetchall()
    cur.close()
    mypool.release(con)
    print 'Sleeping ' + str(os.getpid())
    time.sleep(30)
    print 'Finishing do_connection ' + str(os.getpid())

user = 'hr'
pw = 'welcome'
dsn = 'localhost/orcl:pooled'
for x in range(100):
    pid = os.fork()
    if not pid:
        do_connection()
        os._exit(0)

The mypool.release(con) call releases the DRCP server back to the DRCP pool prior to the sleep. When this second script is run, there is a smoothness to the output. The queries happen sequentially without noticeably being blocked.

Like with any shared resource, it is recommended to release DRCP pooled servers back to the pool when they are no longer needed by the application.

Update August 2014: Oracle submitted a cx_Oracle patch which Anthony has merged but not yet released. This patch now allows drcp2.py to work efficiently. For the fork scenario the non-explicit session pool solution in drcp2.py is likely to be better than drcp3.py, since it does fewer DRCP requests. For the mod_python example, continuing to use a session pool as shown in mptest.py can minimize the number of authentications, which helps scalability.

Tuesday Dec 11, 2012

Excitement! Updated Underground PHP and Oracle Manual is Available for Download

We're thrilled to have a major update of the free Underground PHP and Oracle Manual released on OTN.

The Underground PHP and Oracle Manual is designed to bridge the gap between the many PHP scripting language and the many Oracle Database books available. It contains unique material about PHP's OCI8 extension for Oracle Database, and about other components in the PHP-Oracle ecosystem. It shows PHP developers how to use PHP and Oracle together, efficiently and easily.

The book has been completely refreshed. It has been updated for Oracle XE 11g and the latest PHP OCI8 extension. There are new chapters about using PHP with Oracle TimesTen, NetBeans and Oracle Tuxedo. There is also a new chapter about installing PHP on Oracle Solaris. The book now clocks in at 347 pages of great content.

Acknowledgements are due to all those who have helped with this and previous editions of the book. Thanks to the product teams that assisted with brand new content. In particular Craig Mohrman contributed the chapter about PHP on Solaris. Jeffry Rubinoff contributed the base text for the chapter on PHP and NetBeans.

About

Tourists looking out over an Opal mine
I'm a Product Manager in Server Technologies, working on scripting languages and developer-access.
Mail: christopher.jones@oracle.com
Twitter: @ghrd
OTN: Scripting Languages
Book: Free PHP Oracle book

Follow:
Blaine Carter
Dan McGhan

Search

Archives
« July 2016
SunMonTueWedThuFriSat
     
1
2
3
4
5
6
7
9
10
11
12
13
14
15
16
17
18
19
21
22
23
24
25
26
27
28
29
30
31
      
Today