What versions are you using?
Database: 11.2.0.4 + 18XE
Platform: linux
Node version: v12.18.0
Architecture: x64
node-oracledb version: 5.0.0
Oracle client version: 19.6.0.0.0
Is it an error or a hang or a crash?
Code hangs on oracledb/lib/pool.js line 156:
const conn = await this._getConnection(options);
Steps to reproduce:
curl localhost:3000/ (should work) const express = require('express')
const app = express()
const port = 3000
const oracledb = require('oracledb');
const endpoint =
{
namespace: 'ora18xe',
description: 'XE18 database',
connect: {
poolAlias: 'ora18xe',
user: 'system',
password: 'manager',
connectString: '34.60.229.255:51521/XEPDB1',
poolMin: 4,
poolMax: 4,
poolIncrement: 0,
_enableStats: true,
poolPingInterval: 0
}
}
function printEnv() {
console.log(`Platform: ${process.platform}`);
console.log(`Node version: ${process.version}`);
console.log(`Architecture: ${process.arch}`);
console.log(`node-oracledb version: ${oracledb.versionString}`);
console.log(`Oracle client version: ${oracledb.oracleClientVersionString}`);
}
async function initialize() {
printEnv();
process.env.UV_THREADPOOL_SIZE = 8;
try {
await oracledb.createPool(endpoint.connect);
} catch (err) {
console.error(err);
}
}
async function pingConnection(req, res) {
try {
const pool = oracledb.getPool(endpoint.connect.poolAlias);
pool._logStats()
const connection = await pool.getConnection();
await connection.close();
res.status(200).json(endpoint);
} catch (err) {
console.log(`Connection failed ${err}`);
res.status(503).send(`Connection failed ${err}`);
}
}
async function shutdown() {
try {
const pool = oracledb.getPool(endpoint.connect.poolAlias);
await pool.close(0);
process.exit(0);
} catch (err) {
console.error(err);
process.exit(1);
}
}
async function run() {
initialize()
app.get('/', (req, res) => pingConnection(req, res));
app.listen(port, () => console.log(`Listening on http://localhost:${port}`))
// Trap Ctrl-C and force clean shutdown
process.on('SIGTERM', () => {
console.log('Received SIGTERM');
shutdown();
});
process.on('SIGINT', () => {
console.log('Received SIGINT');
shutdown();
});
}
run();
Script hangs on const connection = await pool.getConnection(); if the IP address of the VM does not match the IP address used to establish the connection pool. Stepping into the code takes you to oracledb/lib/pool.js line 156.
I would expect the script to error and return a 503.
node oracledb's sqlnet.ora file:
SQLNET.OUTBOUND_CONNECT_TIMEOUT=3
TCP.CONNECT_TIMEOUT = 10
DISABLE_OOB=ON
SQLNET.DOWN_HOSTS_TIMEOUT = 0
This is out of control of node-oracledb, since it is handled at the lower network layer of the Oracle stack. You can twiddle the SQL*Net parameters and tune your Network TCP timeouts. FAN may be useful.
@onmp may be able to comment on the SQL*Net layer.
Setting SQLNET.RECV_TIMEOUT in sqlnet.ora resolved the issue. File contents:
SQLNET.OUTBOUND_CONNECT_TIMEOUT=15
TCP.CONNECT_TIMEOUT = 10
DISABLE_OOB=ON
SQLNET.DOWN_HOSTS_TIMEOUT = 0
SQLNET.RECV_TIMEOUT=3
@pgoldtho If you could get a 'pstack' trace when the getConnection() is hanging we'd like to take a look.
Which process do you want me to trace? I got an error when I tried to pstack the node process:
pgoldtho@pgoldtho-desktop:~$ ps -ef|grep app.js
pgoldtho 184719 184708 0 12:07 pts/4 00:00:00 sh -c node app.js
pgoldtho 184720 184719 2 12:07 pts/4 00:00:00 node app.js
pgoldtho 184734 16248 0 12:08 pts/2 00:00:00 grep --color=auto app.js
pgoldtho@pgoldtho-desktop:~$ sudo pstack 184720
184720: node app.js
'linux-vdso.so.1': opening object file: No such file or directory
Could not open object file.
I've never used pstack so it's quite possible I doing something wrong.
Can you run pstack without sudo, Thanks
Try setting environment variable DPI_DEBUG_LEVEL=92" , and upload the logs created. This could give some idea. Thanks.
Can you run pstack without sudo, Thanks
I get a different error:
$ pstack 273349
Could not attach to target 273349: Operation not permitted.
detach: No such process
Can you run pstack without sudo, Thanks
I get a different error:$ pstack 273349 Could not attach to target 273349: Operation not permitted. detach: No such process
This means the process is no longer available (dead). Can you try the environment variable suggestion and upload the log files. Thanks
@pgoldtho thanks for the log files, will get back to you. Thanks
You have reached the max connections available in the pool (check the line # 98-104 in the log file)
...pool connections open: 4
Pool attributes:
...poolAlias: ora18xe
...queueMax: 500
...queueTimeout (milliseconds): 60000
...poolMin: 4
...poolMax: 4
In this case, the node application is waiting for any connections to become available in the pool. You can increase the poolMax and try again.
Thx.
I don't think that's true. My reading of line 95 ( ...pool connections in use: 0) is that none of the connections are in use so it should be able to pickup a connection immediately.
Incidentally, I'm following these guidelines for connection pool sizing: http://oracle.github.io/node-oracledb/doc/api.html#conpoolsizing
Everything is operating as expected up to line 121, after retrieving an idling connection lasting more than 60 seconds an ping operation is executed to validate the connection. The ping operation fails as indicated by line 121. Did you kill the process at this point? Or did the process core dump? If you kill the process, please let process continue and generate another log file. You can kill the process if the hang is tasting longer than a few minutes.
I waited a couple of minutes (18:58:41 -> 19:00:36) then killed the process. The core dumped when I sent a couple of additional Ctrl-c's because the process didn't want to die. I can run the test again and let it hang longer if that would help.
Do you have the log files for your last two runs? Please upload them. If you don't have them please generate log file for your runs.
@onmp I've updated the gist https://gist.github.com/pgoldtho/92de69d56a75f1932c199d0584efd368 with the output from 2 additional test runs. There are 3 runs in total.
The first no-RECV_TIMEOUT.txt is the original run. It was generated with a client sqlnet.ora file (also included in the gist) with the RECV_TIMEOUT parameter commented out. I renamed the sqlnet.ora file for the second run no-SQLNET.ora to show the behavior with default sqlnet settings. The final run with-SQLNET.txt used the sqlnet.ora file included in the gist.
In each run a request was made using curl to verify the connection, then the ip address of the VM running the database was changed to emulate a network outage and additional requests were made using curl.
Line 36 in no-SQLNET.ora shows a successful request prior to the network outage. Line 79 was made after the outage. It resulted in a curl: (52) Empty reply from server. A follow-up request at line 118 did not respond. The curl request did not complete. I waited a few minutes before cancelling it. The node process did not respond to Ctrl-C interupts. It had to be killed.
Line 36 in with-SQLNET.ora also shows a successful request prior to the network outage and line 79 was made after the IP address was changed. Line 119 shows the desired behavior. The process returned a 503 from the catch block of the pingConnection function
@pgoldtho If you don't mind can you please generate sqlnet trace for the case with-SQNET.ora. You can end your test after line 161. You can turn on non-ADR tracing by adding the following parameters to the sqlnet.ora file:
DIAG_ADR_ENABLED = OFF
TRACE_LEVEL_CLIENT = 16
TRACE_DIRECTORY_CLIENT = /tmp
You can upload the .trc file. This trace file can be large.
@onmp added to gist
can you please add expire_time in connect string as documented at
https://docs.oracle.com/en/database/oracle/oracle-database/20/netrf/local-naming-parameters-in-tns-ora-file.html#GUID-A92344BD-2A8B-445D-BEF7-82141A523C3A
If expire_time is set to 1 (1 minute), then the client side should be able to detect this condition in 2 minutes.
Thx
@pvenkatraman where are you at with reproducing this?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We have this in mind but I will close this issue since there is nothing to comment on.