Node-oracledb: NJS-018: invalid ResultSet, ORA-12537: TNS:connection closed, DPI-1010: not connected errorNum: 0, offset: 0 and NJS-003: invalid connection

Created on 1 Mar 2018 · 21Comments · Source: oracle/node-oracledb

What I did and what happend

Well, I've updated to v2 and then I am getting error messages randomly, but always at the same context. Some kind of promise rejection.

FAQ

What is your Node.js version?
local: v9.6.1, 64-bit
on server: v8.9.4, 64-bit
What is your node-oracledb version?
2.1.2
What is your Oracle client (e.g. Instant Client) version? Is it 64-bit or 32-bit? How was it installed? Where is it installed?
instantclient 12.2, 64-bit
It was installed following installation instructions
Installed at /opt/oracle/instantclient_12_2/
What is your Oracle Database version?
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
What is your OS and version?
MacOS High Sierra (latest) and Linux (probably Ubuntu/Debian running on AWS)
What is your compiler version?
I don't have access to this information
What Oracle environment variables did you set? How exactly did you set them?
Well, on server, Docker runs this:

# INSTALL LIBAIO1 & UNZIP (NEEDED FOR STRONG-ORACLE)
RUN apt-get update \
  && apt-get install -y libaio1 \
  && apt-get install -y build-essential \
  && apt-get install -y unzip \
  && apt-get install -y curl

# ADD ORACLE INSTANT CLIENT
RUN mkdir -p opt/oracle
ADD ./oracle/linux/ .

RUN unzip instantclient-basic-linux.x64-12.2.0.1.0.zip -d /opt/oracle
RUN sh -c "echo /opt/oracle/instantclient_12_2 > /etc/ld.so.conf.d/oracle-instantclient.conf" \
  && ldconfig

And UV_THREADPOOL_SIZE is set to the same value as max pools.

What error(s) you are seeing?

Unhandled Rejection at:  Promise { <rejected> Error: NJS-018: invalid ResultSet }  reason:  Error: NJS-018: invalid ResultSet
Unhandled Rejection at:  Promise { <rejected> Error: NJS-018: invalid ResultSet }  reason:  Error: NJS-018: invalid ResultSet

ORA-12537: TNS:connection closed
"Error: ORA-12537: TNS:connection closed"
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor
"Error: ORA-12514: TNS:listener does not currently know of service requested in connect descriptor"



md5-eb3baa45913030ee9cc277ed4fb46ecb



Unhandled Rejection at:  Promise {
  <rejected> { Error: DPI-1010: not connected errorNum: 0, offset: 0 } }  reason:  { Error: DPI-1010: not connected errorNum: 0, offset: 0 }
Unhandled Rejection at:  Promise {
  <rejected> { Error: DPI-1010: not connected errorNum: 0, offset: 0 } }  reason:  { Error: DPI-1010: not connected errorNum: 0, offset: 0 }
Segmentation fault: 11



md5-eb3baa45913030ee9cc277ed4fb46ecb



Unhandled Rejection at:  Promise { <rejected> Error: NJS-003: invalid connection }  reason:  Error: NJS-003: invalid connection
ERROR:ORACLE:EXECUTE Error: NJS-003: invalid connection
Unhandled Rejection at:  Promise { <rejected> Error: NJS-003: invalid connection }  reason:  Error: NJS-003: invalid connection
ERROR:ORACLE:CLOSE:  2018-03-01T04:18:04.223Z Error: NJS-003: invalid connection

Important

All this errors started after migrating to v2

question

Source

jgcmarins

👍3

Most helpful comment

@jpollard-cs I know you directed this to @cjbj, but I have to take a shot at it...

obviously besides Promise.all you're risking quite easily clogging your connection pool

You risk clogging the connection pool and the thread pool depending on the statements (I'll demonstrate below).

Not only that, but what should happen if something fails? If each iteration of promise.all gets its own connection, then you'd have multiple transactions which would make recovery from error unnecessarily difficult to fix.

for the sake of the example say we have created a version of Promise.all that waits for all promises to complete even if there is an error

I don't see how this would help. What if you have a million elements in the array, and you use Promise.all to insert them all (using the subsequent single connection example)? Imagine the first insert fails and you need all to succeed or fail as a unit. Now you have to wait for 999,999 round trips just to get the error notification.

In that case would there be any benefit to doing something like [followed by Promise.all example]

I can't think of a benefit, but the same problem described above would apply here too. With Promise.all, you lose control.

my understanding is that a connection is not associated with a single thread

That's true, there can be many threads in the thread pool and many connections in the connection pool and if you do multiple operations with a connection, then the thread used to perform the operation could vary.

However, imagine you had 4 (the default) threads in the thread pool. Then you get a single connection and use it to execute 4 statements with Promise.all (or standard/sync loop). The first statement is a long-running query and the last three are insert statements. What do you think will happen?

All of the calls to execute will go through libuv's thread pool queue. Because there are sufficient threads, libuv will allow them all through the queue (they will get threads). But, the first execute (the long-running query) will block the subsequent inserts. In this case, the threads that are blocked can't be used for other work.

Here's a test that allows you to see this...

Given the following table and procedure (proc requires execute grant on dbms_lock):

create table t (
  c number
);

create or replace function seconds_to_wait (
  p_seconds in number
)

  return number

as

begin

  dbms_lock.sleep(p_seconds);

  return 1;

end;

What output would you expect from the following?

const oracledb = require('oracledb');
const config = require('./db-config.js');

// Just so we can see it...
config.poolMax = 4;
config.poolMin = 4;
config.poolIncrement = 0;

// UV_THREADPOOL_SIZE is default (4)

async function runTest() {
  let pool;
  let conn1;
  let conn2;

  try {
    pool = await oracledb.createPool(config);
    conn1 = await oracledb.getConnection(); // from default pool

    console.log('got conn1');

    conn1.execute(
      'select seconds_to_wait(10) from dual'
    );

    console.log('execute 1 running for 10 seconds');

    for (let x = 0; x < 3; x += 1) {
      conn1.execute(
        'insert into t (c) values (:c)',
        [x]
      );

      console.log('queued execute ', x);
    }

    console.log('attempting to get conn2');

    let start = Date.now();

    conn2 = await oracledb.getConnection(config);

    console.log('got conn2 after ' + (Date.now() - start) + ' ms');
  } catch (err) {
    console.error(err);
  } finally {
    // would normally try to close conns here
  }
}

runTest();

Here's the output I got, I added a comment where the 10-second delay occurred.

got conn1
execute 1 running for 10 seconds
queued execute  0
queued execute  1
queued execute  2
attempting to get conn2
# 10 second delay
got conn2 after 10035 ms

The important thing to note is that I was unable to get conn2 because the other inserts had obtained threads (emptying the thread pool) but were blocked from using them (which shows a connection can only do one thing at a time) - so all async work was blocked.

Compare that to what I get when I maintain control in the JS layer. I'll move the inserts to the select callback so the logic can fall through to second getConnection call.

const oracledb = require('oracledb');
const config = require('./db-config.js');

// Just so we can see it...
config.poolMax = 4;
config.poolMin = 4;
config.poolIncrement = 0;

// UV_THREADPOOL_SIZE is default (4)

async function runTest() {
  let pool;
  let conn1;
  let conn2;

  try {
    pool = await oracledb.createPool(config);
    conn1 = await oracledb.getConnection(); // from default pool

    console.log('got conn1');

    conn1.execute(
      'select seconds_to_wait(10) from dual',
      [], // no binds
      async function(err) {
        if (err) throw err;

        for (let x = 0; x < 3; x += 1) {
          await conn1.execute(
            'insert into t (c) values (:c)',
            [x]
          );

          console.log('finished execute ', x);
        }

        // would close conn1 here
      }
    );

    console.log('execute 1 running for 10 seconds');

    console.log('attempting to get conn2');

    let start = Date.now();

    conn2 = await oracledb.getConnection(config);

    console.log('got conn2 after ' + (Date.now() - start) + ' ms');
  } catch (err) {
    console.error(err);
  } finally {
    // would close conn2 here
  }
}

runTest();

This gave me:

got conn1
execute 1 running for 10 seconds
attempting to get conn2
got conn2 after 34 ms
# 10 second delay
finished execute  0
finished execute  1
finished execute  2

In this case, I was able to get conn2 because I didn't send things to libuv too early - I kept the control flow/queuing in the JS layer.

dmcghan on 14 Mar 2018

👍3

All 21 comments

Can you provide some code that demonstrates the issues you are seeing? Without that it is going to be very hard to help you!

anthony-tuininga on 1 Mar 2018

where can we get more information based on this error codes?

sibelius on 1 Mar 2018

All of those are likely due to the same cause: the fact that you have closed the connection and are still trying to use it in other places in your code. It may be a timing thing. One possibility is if you are using querystream and using the "end" event to close the connection. You should instead be using the close event. Code samples would definitely help!

You can get debugging information by setting the following environment variable DPI_DEBUG_LEVEL to the value 4. That will tell you about all of the ODPI-C function calls that are being made. That may be overkill, but it can be quite helpful, too.

anthony-tuininga on 1 Mar 2018

Although @anthony-tuininga's comments are most likely (since you say it occurs in the same place), I reviewed causes for ORA-12537 errors which is not so common. There were some really old reports in other tools about incorrect network configuration. Also it has been seen when the firewall disconnects idle connections. These are just things to keep in mind.

cjbj on 2 Mar 2018

@anthony-tuininga and @cjbj thanks for your reply. I'll try to collect more information/logs/code examples.

jgcmarins on 2 Mar 2018

@jgcmarins I was seeing this problem too and I believe the cause to be that someone had used Promise.all and their error handler was attempting to release the connection. Since Promise.all returns on the first error encountered this was attempting to release the connection before all work had completed. I believe this was previously showing up in node-oracledb v1 as a NJS-032 "Connection cannot be released" error.

jpollard-cs on 9 Mar 2018

👍2

@jpollard-cs thanks for your reply. I've found this pattern that you mentioned.
@cjbj can we still let this issue opened?
I am debugging it constantly. My ideia is to bring more details and a few solutions, soon.

jgcmarins on 10 Mar 2018

@jgcmarins I'm sure many people would learn from what you share. The tip from @jpollard-cs is great: I should find a way to get it inserted into the doc.

cjbj on 10 Mar 2018

@cjbj is there a way to improve this error message or is this dependent on behavior from https://github.com/oracle/odpi? I would be happy to help document.

In general what are the recommended best practices here?

I suspect the recommended practice is to avoid doing something like this:

Promise.all(foos, ({ id }) => pool.getConnection().execute(sql, { id }));

obviously besides Promise.all you're risking quite easily clogging your connection pool, but for the sake of the example say we have created a version of Promise.all that waits for all promises to complete even if there is an error

In that case would there be _any_ benefit to doing something like

Promise.all(foos, ({ id }) => connection.execute(sql, { id }));

I know here @dmcghan says that execution over a connection happens serially, but I was curious if this was really said for simplicity and if there's a bit more going on here as my understanding is that a connection is not associated with a single thread (from what I understand issue that can lead to thread contention if you don't have more threads in your thread pool than you have connections in your connection pool). When executing a query does the thread wait for a response from Oracle or is all work done by the connection effectively suspended until there is a response?

jpollard-cs on 13 Mar 2018

@jpollard-cs
The initial example in #871 did use a single connection object for both promises to resolve, which is why Dan mentioned that the work would be serially processed in the end.

Initially I had wrote the code as

get connection from pool
use the said connection for two separate executes

In the example from #871 , the process will be done serially, because a connection object cannot process several execute statements, it has to do them one by one.

danilohgds on 13 Mar 2018

@jpollard-cs I know you directed this to @cjbj, but I have to take a shot at it...

obviously besides Promise.all you're risking quite easily clogging your connection pool

You risk clogging the connection pool and the thread pool depending on the statements (I'll demonstrate below).

for the sake of the example say we have created a version of Promise.all that waits for all promises to complete even if there is an error

In that case would there be any benefit to doing something like [followed by Promise.all example]

I can't think of a benefit, but the same problem described above would apply here too. With Promise.all, you lose control.

my understanding is that a connection is not associated with a single thread

Here's a test that allows you to see this...

Given the following table and procedure (proc requires execute grant on dbms_lock):

create table t (
  c number
);

create or replace function seconds_to_wait (
  p_seconds in number
)

  return number

as

begin

  dbms_lock.sleep(p_seconds);

  return 1;

end;

What output would you expect from the following?

const oracledb = require('oracledb');
const config = require('./db-config.js');

// Just so we can see it...
config.poolMax = 4;
config.poolMin = 4;
config.poolIncrement = 0;

// UV_THREADPOOL_SIZE is default (4)

async function runTest() {
  let pool;
  let conn1;
  let conn2;

  try {
    pool = await oracledb.createPool(config);
    conn1 = await oracledb.getConnection(); // from default pool

    console.log('got conn1');

    conn1.execute(
      'select seconds_to_wait(10) from dual'
    );

    console.log('execute 1 running for 10 seconds');

    for (let x = 0; x < 3; x += 1) {
      conn1.execute(
        'insert into t (c) values (:c)',
        [x]
      );

      console.log('queued execute ', x);
    }

    console.log('attempting to get conn2');

    let start = Date.now();

    conn2 = await oracledb.getConnection(config);

    console.log('got conn2 after ' + (Date.now() - start) + ' ms');
  } catch (err) {
    console.error(err);
  } finally {
    // would normally try to close conns here
  }
}

runTest();

Here's the output I got, I added a comment where the 10-second delay occurred.

got conn1
execute 1 running for 10 seconds
queued execute  0
queued execute  1
queued execute  2
attempting to get conn2
# 10 second delay
got conn2 after 10035 ms

Compare that to what I get when I maintain control in the JS layer. I'll move the inserts to the select callback so the logic can fall through to second getConnection call.

const oracledb = require('oracledb');
const config = require('./db-config.js');

// Just so we can see it...
config.poolMax = 4;
config.poolMin = 4;
config.poolIncrement = 0;

// UV_THREADPOOL_SIZE is default (4)

async function runTest() {
  let pool;
  let conn1;
  let conn2;

  try {
    pool = await oracledb.createPool(config);
    conn1 = await oracledb.getConnection(); // from default pool

    console.log('got conn1');

    conn1.execute(
      'select seconds_to_wait(10) from dual',
      [], // no binds
      async function(err) {
        if (err) throw err;

        for (let x = 0; x < 3; x += 1) {
          await conn1.execute(
            'insert into t (c) values (:c)',
            [x]
          );

          console.log('finished execute ', x);
        }

        // would close conn1 here
      }
    );

    console.log('execute 1 running for 10 seconds');

    console.log('attempting to get conn2');

    let start = Date.now();

    conn2 = await oracledb.getConnection(config);

    console.log('got conn2 after ' + (Date.now() - start) + ' ms');
  } catch (err) {
    console.error(err);
  } finally {
    // would close conn2 here
  }
}

runTest();

This gave me:

got conn1
execute 1 running for 10 seconds
attempting to get conn2
got conn2 after 34 ms
# 10 second delay
finished execute  0
finished execute  1
finished execute  2

In this case, I was able to get conn2 because I didn't send things to libuv too early - I kept the control flow/queuing in the JS layer.

dmcghan on 14 Mar 2018

👍3

Closing - I think this has run its course.

cjbj on 27 Mar 2018

👍2

tldr solution:

[ ] get 1 connection pool per database (no need to release it, only when server is killed or shutdown)
[ ] get 1 connection from connection pool per sql command
[ ] always use resultSet option on connect.execute(sql, binding, { resultSet: true })
[ ] use process.nextTick to close resultSet and connection after sql (like this https://github.com/coreyc/oracledb-promise/blob/master/src/oracle.js#L10)

So the problem is to do more than 1 sql query per connection

sibelius on 13 Jun 2018

👍1

@cjbj, @dmcghan and @anthony-tuininga, as @sibelius said, the main reason was related to close resultSet. We found that when we take a deep look on database and could see that the driver was keeping lots of cursors.
Maybe we can improve docs at this point.

jgcmarins on 13 Jun 2018

If we add a note to this section (https://oracle.github.io/node-oracledb/doc/api.html#-42638-resultset) indicating that the result set that is returned should be closed, would that help? Note that the result set would eventually be closed -- but only when the garbage collector got around to doing so, which usually isn't terribly helpful!

anthony-tuininga on 13 Jun 2018

@anthony-tuininga adding a note could help a lot

we could add a note that connection should be used to do only 1 query (not shared queries), maybe this problem is only related when you have many parallel queries when using GraphQL or async await

sibelius on 13 Jun 2018

We could also add a note here (https://oracle.github.io/node-oracledb/doc/api.html#-426-connectionexecute) to indicate that only one statement can be executed at a time. You can certainly execute multiple statements, they just have to be done serially; otherwise, the second statement execution will block until the first one is complete -- something you want to avoid! Let's see what @cjbj thinks.

anthony-tuininga on 13 Jun 2018

👍1

@sibelius on your checklist you should have this too:

[ ] Increase UV_THREADPOOL_SIZE to at least the max size of your connection pool (or sum of, if you have more than one pool in the same process).

@jgcmarins Doc can be improved. I'll take a look. Currently there is doc about connections being able to execute one statement at a time in the UV_THREAD_POOL section and also here:

Connections can handle one database operation at a time. Other database operations will block. Structure your code to avoid starting parallel operations on a connection. For example, instead of using async.parallel or async.each() which calls each of its items in parallel, use async.series or async.eachSeries().

The note about closing ResultSets is here:

When all rows have been fetched, or the application does not want to continue getting more rows, then the ResultSet should be freed using close(). The ResultSet should also be explicitly closed in the cases where no rows will be fetched from it.

cjbj on 14 Jun 2018

@sibelius

Here are my thoughts on the checklist:

get 1 connection pool per database (no need to release it, only when server is killed or shutdown)

Correct, if that's all you need. In some cases, you might need multiple connection pools, such as when connecting to multiple databases. Node-oracledb 2.3 added support for heterogeneous connections pools which alleviated the need for creating multiple connection pools to the same database with different users.

get 1 connection from connection pool per sql command

No. It's better to think along the lines of "get 1 connection from the connection pool for a given user's unit of work". Generally, "unit of work" is going to be anything from a single query to a complex transaction with multiple statements being executed.

Once the unit of work is done, the connection should be closed/released back to the pool for others to use.

always use resultSet option on connect.execute(sql, binding, { resultSet: true })

I agree with this for the majority of cases. However, I think using a regular execute call is fine for single row fetches (say when querying by primary key) and queries that you know will always return a small number of rows.

use process.nextTick to close resultSet and connection after sql (like this https://github.com/coreyc/oracledb-promise/blob/master/src/oracle.js#L10)

This isn't needed. When you call close on either a connection or result set, the real work is scheduled to be run asynchronous to the main thread, so process.nextTick doesn't really do much but delay the call to close a bit.

Later you said:

we could add a note that connection should be used to do only 1 query (not shared queries), maybe this problem is only related when you have many parallel queries when using GraphQL or async await

I'm not sure I understand what you mean here. GraphQL isn't all that different from REST (or even apps renderd server-side) in terms of the number of queries being issued to the database. Gernerally, the more users, the more queries.

Also, async/await doesn't do parallel really. You're waiting for one thing to finish before going to the next. You could use await on the result of calling Promise.all to do multiple async operations at the same time. However, this will not work if the operations depend on the same connection because a connection can only do one thing at a time anyway.

dmcghan on 25 Jun 2018

@dmcghan good points.

Since everyone likes optimizations, when using execute() for single row fetches then decrease fetchArraySize from the default to be nice to the system - there's no point each layer in the client and DB allocating space that won't be used.

In case people haven't seen it, check out https://blogs.oracle.com/opal/node-oracledb-v2-query-methods-and-fetch-tuning

cjbj on 26 Jun 2018

🎉1 👍1

@dmcghan thank you for your reply. Lot's of information that also makes sense for our scenario.