Node-oracledb: NJS-032 Under Load

Created on 8 Feb 2016  路  22Comments  路  Source: oracle/node-oracledb

We're running into a NJS-032 (Connection cannot be released because a database call is in progress) issue when running Oracledb under load in production. We have been able to reproduce the problem using JMeter on a test environment as well (although making it reproducible outside our own environment will probably not be easy).

Our setup is as follows:

  • Node.js version 5.4.0
  • Node.js server running node-oracledb 1.6.0 against an 11g database on a remote server.
  • Connection pool size of 100 connections. Reproduced on test environment with a connection pool size of 10 connections.
  • Single query that executes a custom PL/SQL function. This is the only thing executed by the server. The function executes and query returns in less than 1 second. autoCommit is implicitly set.
  • Function takes a 200 character string (VARCHAR2) as a parameter, and returns a 200 character string (VARCHAR2) as output.
  • All node-oracledb interaction (oracle.createPool, pool.getConnection, connection.execute, and connection.release) wrapped with Bluebird promises.
  • Production OS is CentOS 6.6, also reproduced on Windows.

Generally, the flow is:

  1. Retrieve connection.
  2. Execute function.
  3. Release connection.

The problem is that some condition is causing connection.release to return NJS-032 after step #2 has been successfully completed. If I ignore the problem, I eventually run out of connections. Of course, if I throw the error, the process terminates, only to recur an hour, two, or three in the future.

Any hints?

question

All 22 comments

I know this is asking a lot, but can you give is code to reproduce?

maybe that plsql is still doing something after your callback is called? which is a bit strange.
I modified the connection.release in simple-oracledb to enable automatic retries, so it might workaround your issue in case of any latency problem.
take a look at:

https://github.com/sagiegurari/simple-oracledb#usage-release
https://github.com/sagiegurari/simple-oracledb/blob/master/docs/api.md#Connection+release

//do this one time
var oracledb = require('oracledb');
var SimpleOracleDB = require('simple-oracledb');
SimpleOracleDB.extend(oracledb);

//... create pool, get connection, do some stuff...
connection.release({
  retryCount: 10,
  retryInterval: 250
}, function onRelease(error) {
  if (error) {
    //now what?
  }
});

@dmcghan, we're working on reproducing it in a vanilla test environment, but that might take us a bit. It is killing us on a critical project in production right now--because it happens only under load, we never caught it in routine testing. Our work-around for now is periodic restarts to "refresh" the pool. If we can't get a fix in a day or two, we'll probably attempt to either:

  1. Implement retries (thanks, @sagiegurari)
  2. Check the in use connections against the pool max and either restart the server or terminate and recreate the pool.

I'd rather get the problem fixed.

@sagiegurari, the plsql is simply performing a number of queries and returning a result. The failure occurs after the function has returned, when the NJS-032 is returned in the callback to connection.release.

@jeffm13 we discussed this today and came to the conclusion it really needs a testcase.

With the JS layer that @dmcghan and @sagiegurari have been discussing it should be easier to add custom logging.

@sagiegurari would it be better to (optionally?) log a warning if a connection can't be immediately released? This situation could indicated a problem affecting scalability.

Ya, I could use node debuglog to write it.
I'll add it later today.

We've created our own wrapper for oracledb that uses Bluebird promises, very similar to orawrap, that simplifies use of the driver in a hapi.js server. So logging isn't a problem. We're working on a test case now to demonstrate the problem, but since we exclusively work with existing legacy databases, it's taking a bit longer than desired to make it repeatable.

The question is: are there cases where it is valid for a connection release to fail, even though the connection itself is valid, and the statement executing on a connection completed? And if an "immediate" release fails, should we assume that we can simply attempt to release the connection again? If so, we'll just add retries to our wrapper (probably with bluebird-retry). But this should probably be documented.

added debug logs, just need to set the env var:
NODE_DEBUG=simple-oracledb

https://github.com/sagiegurari/simple-oracledb#debug

any connection.release error would be printed there with full stack.

Hi All,

We were able to reproduce the error in an isolated database. Steps are below.

Database setup

--Create Table
create table testTable (
  val1  VARCHAR2(10) primary key,
  val2  VARCHAR2(10)
);

--Populate Table
DECLARE j NUMBER;
BEGIN
  FOR i IN 1..500 LOOP
j := i + 1;
INSERT INTO testTable (val1, val2)
  VALUES ('string' || TO_CHAR(i), 'string' || TO_CHAR(j));
  END LOOP;
END;

--Allow execute of DBMS_LOCK and DBMS_RANDOM
grant execute on dbms_lock to public;
grant execute on dbms_random to public;

--Create Function
create or replace function testFunction (input1 VARCHAR2, input2 VARCHAR2)
RETURN VARCHAR2
AS
BEGIN
  declare randTime number;
  BEGIN
SELECT sys.dbms_random.value(4, 10) INTO randTime FROM DUAL;
randTime := randTime / 10;
sys.dbms_lock.sleep(randTime);
  END;
  declare returnVal2 varchar2(10);
  begin
  SELECT val2 into returnVal2 FROM testTable WHERE val1 = input1 || input2;
  RETURN returnVal2;
end;
END;

Node API code

package.json

{
  "name": "test-server",
  "version": "1.0.0",
  "description": "test server",
  "dependencies": {
    "bluebird": "^3.1.1",
    "oracledb": "^1.5.0",
    "hapi": "^12.0.0",
    "good": "^6.4.0",
    "good-console": "^5.2.0"
  }
}

server.js

'use strict';

var Oracle = require('oracledb');
var Promise = require('bluebird');
var using = Promise.using;
var then = Promise.then;

var Hapi = require('hapi');
var server = new Hapi.Server();
server.connection({ port: 3000 });

server.register([{
  register: require('good'),
  options: {
    "opsInterval": 600000,
    "reporters": [{
      "reporter": "good-console",
      "events": {"log": "*", "ops": "*", "response": "*", "error": "*", "request": "*"}
    }]
  }
}], (err) => {
});

function createPool(config) {
  return new Promise(function(resolve, reject) {
    console.log('creating pool ' + JSON.stringify(config));
    Oracle.createPool(config, function(err, pool) {
      if (err) {
        return reject(err);
      }
      resolve(pool);
    });
  });
}
function _getConnection(pool) {
  return new Promise(function(resolve, reject) {
    pool.getConnection(function(err, connection) {
      console.log(" OPEN CONNECTIONS: " + pool.connectionsOpen + ' IN USE: ' + pool.connectionsInUse);
      if (err) {
        return reject(err);
      }
      resolve(connection);
    });
  });
}

function getConnection(pool) {
  return _getConnection(pool).disposer(function(connection, promise) {
    connection.release(function(err) {
      console.log("CLOSE CONNECTIONS: " + pool.connectionsOpen + ' IN USE: ' + pool.connectionsInUse);
      if (err) {
        console.log(err);
        console.log("----------CLOSING ERROR-----");
        throw err;
      }
    });
  });
}

function execute(sql, bindParams, options, connection) {
  return new Promise(function(resolve, reject) {
    connection.execute(sql, bindParams, options, function(err, results) {
      if (err) {
          reject(err);
      }
      resolve(results);
    });
  });
}

function pooledExecute(sql, bindParams, options, pool) {
  return using (getConnection(pool), function(connection) {
    return execute(sql, bindParams, options, connection);
  });
}

server.route({
  method: 'GET',
  path: '/val1/{val1}/val2/{val2}',
  handler: function (request, reply) {
    server.connectionCount++
    console.log('CONNECTION: #'+server.connectionCount)
    var pool = server.connectionPool;
    var statement = "select testFunction(:param1, :param2) as testVal from dual";
    var params = [request.params.val1, request.params.val2];
    pooledExecute(statement, params, {}, pool)
    .then(function(results) {
      reply(results.rows);
    })
    .catch(function(error){
      reply(error);
    });
  }
});

server.start(() => {
  server.connectionCount = 0;
  var serverSettings = {
    name: "testServer",
    connectString: "10.1.1.125:1521/XE",
    user: "system",
    password: "oracle",
    poolMin: 0,
    poolMax: 5,
    poolIncrement: 1,
    poolTimeout: 15000
  };

  createPool(serverSettings)
  .then(function(pool) {
    server.connectionPool = pool;
  })
  .catch(function(err) {
    console.log(err);
  });
});

We were able to reproduce the problem using Apache jMeter and reading is a csv file with the numbers 1-500 (one per line). We did 20 threads with a ramp up period of 5 seconds and a 5 second break between loops and ran that forever. Took us anywhere from 300 - 1100 requests to reproduce the error.

@jeffm13 I've only had time to do a quick read through the code so far. The one thing I noticed is that the first "reject", in _getConnection has a return in front of it. But the second, in execute, is missing the return.

I'll be able to run these tests tomorrow.

Thanks @dmcghan

@dmcghan. Thanks (although it's a little embarrassing). We're re-running tests now.

@jeffm13 - I'm seeing the exact same thing with a wrapper I wrote that uses async.queue to throttle requests. Just like you're seeing, it looks like the current connection is made available before the callback from connection.release() completes. I figured I was calling back early somewhere, but I can not find the problem so please let me know if you have any luck with adding that return

@dmcghan, thanks for debugging our code for us. That was it :) We ran tests for hours, and never lost a connection. @niallosul, the return did it.

Glad to have @dmcghan on the team!

I'm open to user ideas about how improve the debugging experience when NJS-030, NJS-031 or NJS-032 occur.

@jeffm13 I'd love to hear more about your app and how you are using node-oracledb. My email is in my profile, if that's easier.

@jeffm13 I'm glad it's working for you now!

For those arriving here because you used Promise.all, and had query 3 of 5 fail, the last 2 don't get resolved, and you try to call connection.release too early, here's one weird trick to solve that.

  1. Wrap the promise in a 'sureThing' (quoting a co-worker), a Promise that always resolves with success
  2. inspect the success method to identify if all the queries worked or not before deciding to call connection.commit(), and then connection.release().

So, when writing code like this that runs multiple queries:

function doABunchOfQueries(connection)
{
    return new Promise((success, failure)=>
    {
        Promise.all([query1(connection), query2(connection), query3(connection)])
        .then((results)=>
        {
            connection.release();
            success();
        })
        .catch(failure);
    });
}

Don't do this because query2 will fail, and query3 will never resolve, and your oracle.release will fail with a call in progress error.

function query1(connection)
{
    return new Promise((success)=>
    {
        // ... 
        connection.execute(...)
        // ...
        success(queryResult);
        // ...
        failure(error);
    });
}

Instead, wrap it in a sureThing:

function query1(connection)
{
    return new Promise((success)=>
    {
        // ... 
        connection.execute(...)
        // ...
        success({success: true, result: queryResult});
        // ...
        success({success: false, error: error});
    });
}

@JesterXL Interesting finding, thanks for sharing. Some of the code looks a little off. For example, node-oracledb doesn't have a oracle.release(conn) method - we do conn.release(). Is it possible this code is related to the joeferner driver?

Also, while Promise.all does have something to offer in terms of flow control, it's important to keep in mind that all execute invocations will be serialized by the connection itself if you don't do it in your own code.

@dmcghan Nope, just bad copy pasta, fixed in original post, thanks for the catch! Regarding execute invocations, what are those?

More learnings, again, by using Promise.all and thinking things will turn out ok, but don't.

If you use Promise.all with multiple queries, and 1 fails, it'll fire an error, and ignore the other calls. However, if you try to be responsible and connetion.close, you'll probably get a Error: NJS-032: connection cannot be released because a database call is in progress] error.

Again, you'll want to use the sureThing above to ensure that you have a chance for all transactions to be completed before you attempt to close the connection. #asyncIsT3hFunDOTTHEN

Just I have the same issue, I tried to put a secure Promise.all that always resolve the promise but the issue is still happening... 馃槄

I share my Oracle service:

import config from '../config/application';

const oracledb = require('oracledb');

oracledb.poolTimeout = 0; // Never terminate
oracledb.maxRows = 50000;

/**
 * Schema
 */
export const schema = config.oracle.schema ? `${config.oracle.schema}.` : '';

/**
 * getConnection
 */
const doConnection = () => new Promise((resolve, reject) => {
    /* eslint no-console: ["error", { allow: ["warn", "error", "log"] }] */
    oracledb.getConnection(config.oracle, (err, connection) => {
        if (err) {
            console.error('Not possible to connect to Oracle:', err.message);
            reject(err.message);
        } else {
            console.log('Connected to Oracle');
            resolve(connection);
        }
    });
});

/**
 * Close connection
 */
const closeConnection = (conn) => {
    conn.close()
        .then(() => console.log('Oracle connection closed'))
        .catch(() => console.error('Error closing the Oracle connection'));
};

/**
 * execute Query (is secure, the promise is always resolved)
 */
const executeQuery = (...args) => conn =>
    new Promise(resolve =>
        conn.execute(...args, (err, result) => resolve({ err, result, conn })));

/**
 * Query into oracledb
*/
export const dbQuery = (...args) =>
    new Promise((resolve, reject) => {
        doConnection()
            .then(executeQuery(...args))
            .then(({ err, result, conn }) => {
                if (err) {
                    reject(err.message);
                } else {
                    closeConnection(conn);
                    resolve(result.rows || result.outBinds);
                }
            })
            .catch(reject);
    });

/**
 * Query into oracledb expecting clob in the return
 */
export const dbClobQuery = (...args) =>
    new Promise((resolve, reject) => {
        doConnection()
            .then(executeQuery(...args))
            .then(({ err, result, conn }) => {
                if (err) {
                    reject(err.message);
                } else {
                    const clob = result.rows[0][0];
                    let data = '';
                    if (typeof clob.on === 'function') {
                        clob.on('data', (c) => { data += c; });
                        clob.on('end', () => {
                            closeConnection(conn);
                            resolve(JSON.parse(data));
                        });
                        clob.on('error', () => reject);
                    } else {
                        closeConnection(conn);
                        reject();
                    }
                }
            })
            .catch(reject);
    });

/**
 * Multi query with the same connection
 */
export const dbMultiQuery = (...queries) =>
    new Promise((resolve, reject) => {
        doConnection()
            .then(conn =>
                Promise.all(queries.map(q => executeQuery(...q)(conn)))
                    .then((r) => {
                        closeConnection(conn);
                        if (r.some(x => x.err)) {
                            reject(r.map(x => x.err));
                        }
                        resolve(r.map(x => x.result.rows || x.result.outBinds));
                    })
                    .catch(reject))
            .catch(reject);
    });

The methods that I'm using in my project are:

  • dbQuery : Used to query (opening connection and close)
  • dbClobQuery : The same, but expecting a stream in the response (Clob)
  • dbMultiQuery: To do multi-queries (oppening one connection and closing)

I'll appreciate if someone can check what is wrong in my code, because I'm going crazy with this issue, hehe 馃槄

Thanks 馃檪

@aralroca can I suggest you open a new issue?

Ok @cjbj 馃憤 I put here 馃檪

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ChrisHAdams picture ChrisHAdams  路  3Comments

JocelynDalle picture JocelynDalle  路  3Comments

chsnt picture chsnt  路  3Comments

sanfords picture sanfords  路  3Comments

cheslyn picture cheslyn  路  3Comments