We use oracledb in an application which is deployed in a Docker container. Most of the time, we use Knex to build the query, but I also did some tests using oracledb directly.
The problem is that after a while, sometimes after 2-3 days, sometimes before that, the requests become very slow (40-50 seconds instead of 0.5 second, for example). I did a lot of tests to try to see what's going on when this occures, but without success. It doesn't seem to be a memory/disk or CPU issue. As I said, we use Knex, but I also added an endpoint (it's a REST API) that uses oracledb directly to see if the problem was Knex's connection pool. But this didn't help : once the requests made using Knex become slow, the requests using oracledb directly also become slow!
At that point, I was suspecting a problem with the Docker container itself. Maybe the network or the IO was responsible of the slowness, for some reason? So I tried installing a SQLPlus command line client in a container which was in that "slow" state... I made some requests using this command line client are *they were fast!!!
So the problem doesn't seem to be related to the Docker container. It also doesn't seem to be related to Knex's connection pool since even requests made using oracledb directly become slow. The only thing I can think of is that it's an issue with the oracledb library itself.
Any idea?
Any tests I could run that would help you understand the root of the issue?
_1. What is your version of Node.js? Run examples/version.js to find versions._
7.7.3
_2. What version of node-oracledb are you using?_
1.13.1
_3. What is the version of your Oracle client (e.g. Instant Client)? How was it installed? Where it is installed?_
instantclient-basic-linux.x64-12.2.0.1.0
instantclient-sdk-linux.x64-12.2.0.1.0
From the Dockerfile:
ENV LD_LIBRARY_PATH="/mtl/oracle/instantclient:${LD_LIBRARY_PATH}"
ENV OCI_LIB_DIR="/mtl/oracle/instantclient"
ENV OCI_INC_DIR="/mtl/oracle/instantclient/sdk/include"
_4. What is the version of Oracle Database?_
11c
_5. What is your OS and version?_
The Docker base image is "node:7.7.3-slim". We install "python make g++ libaio1" to compile the oracledb dependency.
_6. What compiler version did you use? For example, with GCC, run gcc --version_
gcc (Debian 4.9.2-10) 4.9.2
_7. What environment variables did you set? How exactly did you set them?_
From the Dockerfile:
ENV LD_LIBRARY_PATH="/mtl/oracle/instantclient:${LD_LIBRARY_PATH}"
ENV OCI_LIB_DIR="/mtl/oracle/instantclient"
ENV OCI_INC_DIR="/mtl/oracle/instantclient/sdk/include"
_8. What exact command caused the problem (e.g. what command did you try to install with)? Who were you logged in as?_
A query is launched the same way it always is, by passing from Express to a repository which uses a Knex or oracledb connection pool.
_9. What error(s) you are seeing?_
Instead of taking 1 second, the query takes 40-50 seconds.
Thread starvation? See https://github.com/oracle/node-oracledb/blob/v1.13.1/doc/api.md#numberofthreads
Add some timing code to see what bits of the app are taking time. Also check that your SQL*Plus tests are running equivalent SQL statements to your app.
_> Thread starvation? See https://github.com/oracle/node-oracledb/blob/v1.13.1/doc/api.md#numberofthreads_
Ideas like that are very appreciated! I'll report back.
_> Add some timing code to see what bits of the app are taking time._
Alreay done : it's clearly the db request itself. I added a log just before and just after. I didn't modify Knex or oracledb code itself to add more precise timers though...
_> Also check that your SQL*Plus tests are running equivalent SQL statements to your app._
Both were simple select * from dual requests, and we're talking "0.005 seconds vs 40 seconds" order of magnitude in term of difference!
I have the same problems
@muxuehen then the solution might be same.
@cjbj
Our current configurations, for a dev environment, are :
I plan on adding UV_THREADPOOL_SIZE=12 to test your idea.
Testing this issue is not easy because it can't be reproduced on demand (for us at least), it takes time for the slowness to appear (but it then seems to appear in one shot, not gradually)... So would you say those configurations would be a good test?
Would you change/add something?
Any other configuration that could potentially have an impact?
@electrotype a poolMax of 10 with only the 4 (default) threads would be a definite source of random lock ups and slowdowns.
Make sure you are releasing connections back to the pool when they no longer needed.
Consider setting poolMin = poolMax (and poolIncrement = 0) to avoid connection storms.
Keep an eye on the pool stats.
@cjbj
So I'll try:
poolMin : 10
poolMax : 10
poolIncrement : 0
and
UV_THREADPOOL_SIZE=12
Your help is very appreciated.
@electrotype
My problem is still not resolved, I follow the steps you gave, but the program started error
Thank you!
@muxuehen Can you open a new issue and fill the template? And describe the exact behavior too - thank you!
@electrotype how are you going with this?
@cjbj I will be back with results once I am 100% sure the issue is fixed using UV_THREADPOOL_SIZE. It takes time because :
What I can tell you though, is that it currenly _does_ seem to work! :-)
I didn't experience the slowness issue since I added the configurations listed in my last message...
I haven't experienced slowness since the modification following your suggestion! :-)
You can close the issue if you want.
Thanks a bunch for your help @cjbj !!