var isReadable require('isstream').isReadable;
var client = new sql.Connection(this._config);
var query = '';
return client.connect()
.then(function() {
var request = new sql.Request(client);
request.stream = true;
var reqStream = request.query(query);
assert(isReadable(reqStream), true, 'Request Stream should be a readable stream')
return reqStream;
});
As verified by https://github.com/rvagg/isstream
It seems that realistically, only the ability to pipe works... and will pass objects... Also of note is the streams don't support backpressure... in fairness even Tedious doesn't support a way to handle cursors, so this likely won't happen...
Though most of the rest of a readable stream implementation could be done.
This is a pretty substantial limitation, if you cannot pause() the stream, it will easily overfill any sink you might have on your pipe. The only reasonable workaround is to write to a csv write stream (this is only reasonable if your disk can keep up with your sql server...) and then readstream that csv into your actual pipeline... this way you are using disk instead of memory to buffer the select query.
Needless to say, this defeats the purpose of stream based programming.
with a 200m record table and a raised 16GB memory limit on node:
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
Additionally, the reader will exhaust a lot of node resources just processing the stream, and can choke the event handler even with a medeocre sql server (~20k records/s with ~800 byte rows).
I've seen the same issue, and for my case, I wound up running in batches of 1k records for my pipeline (replicating data from sql to elasticsearch), And you won't get to 16gb, you'll flood out around 2-4gb long before, unless you do a custom build of node.js
I also found, that running through a RabbitMQ server couldn't keep up either with the larger loads... Didn't try redis though, which may have worked out better.
Hi!
I have trouble with pipe data from MSSQL to Sphinx. There are ~ 3M rows (~ 20 columns) - and when I pipe this data ~ 8gb of RAM spent.
Is there any chanse to implement pause() method on mssql stream ?
Thanks!
It is work in progress in Tedious - https://github.com/tediousjs/tedious/pull/518
Hello @patriksimek
It seems to have already been implemented and merged. Do you have an estimate of when will be included here?
Thank you
+1
@patriksimek any eta for using actual streams? pause/resume got merged into tedious in October. Appreciate it
Hello!
I've create small repo with true readable stream based on node-mssql and tedious: tediousStreaming. It is possible to control memory usage with this method.
May be it will be useful.
When use node-mssql make sure this commit is in node_modules
Is there something to be expected on this front?
@ilusharulkov does your repo provide a way to solve this issue or is it simply demonstrating the problem?
@willmorgan do we want to add this to 5.0?
Right, #775 adds pause/resume but it will not satisfy the "true node stream" question because it is still not a subclass of any of the stream classes.
However it does provide the ability to manually pause the stream as required to stop memory issues, which is a step in the right direction
see #776 for an RFC on streams
I'm going to close this as the RFC is probably a better place for any further discussion on this
Most helpful comment
+1
@patriksimek any eta for using actual streams? pause/resume got merged into tedious in October. Appreciate it