Hi, there is a problem I can't resolve.
I'm trying to insert data from a text file into a Postgres DB.
The description and code are here: link.
The point is that when the file contains about 50 000 lines the code works fine, but when it contains more than 1 000 000 lines the problem appears.
I did some research with the debugger and found out that the queries are queued and only executed once the whole file has been read, so with a large file the memory overflows.
So is there a way to force the inserts in the queue to be executed every, let's say, 10 000 lines?
I found the answer (or a workaround) and posted it on Stack Overflow (link above).
We have to watch PgClient.queryQueue, and when there are a lot of elements in it, pause the file reading.
And here is the code of the changed section:
```js
lineReader.on('line', function (line) {
  n++;
  // "insert" is the row parsed from "line"; JSON is assumed here,
  // the original parsing may differ
  var insert = JSON.parse(line);
  PgClient.query('INSERT INTO HUMANS (firstname, lastname) VALUES ($1, $2)',
    [insert.firstname, insert.lastname], function (err, result) {
      if (err) console.log(err);
      // Throttle: pause the reader while the client's internal queue is long
      if (PgClient.queryQueue.length > 15000) {
        lineReader.pause();
      } else {
        lineReader.resume();
      }
    });
});
```
I think you should find another solution on your end. For example, why not put everything into a single query (or a couple of queries)? You don't seem to need multiple queries.
```sql
INSERT INTO HUMANS (firstname, lastname) VALUES
($1::text, $2::text), ($3::text, $4::text)
```
You could do this by building the query string with appends.
I am not sure this will work, but it is worth trying. Round trips are expensive, and the data needs to be transferred anyway.
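Roughly like this, as a sketch only: it reuses PgClient and lineReader from the snippet above, and the batch size and the comma-separated line format are assumptions for illustration.

```js
// Accumulate rows and flush them as one multi-row parameterized INSERT.
const BATCH_SIZE = 10000;
let batch = [];

function flushBatch(callback) {
  if (batch.length === 0) return callback();
  const params = [];
  const placeholders = batch.map(function (row, i) {
    params.push(row.firstname, row.lastname);
    return '($' + (2 * i + 1) + '::text, $' + (2 * i + 2) + '::text)';
  });
  const sql = 'INSERT INTO HUMANS (firstname, lastname) VALUES ' + placeholders.join(', ');
  batch = [];
  PgClient.query(sql, params, callback);
}

lineReader.on('line', function (line) {
  // Assumes "firstname,lastname" per line
  const parts = line.split(',');
  batch.push({ firstname: parts[0], lastname: parts[1] });
  if (batch.length >= BATCH_SIZE) {
    lineReader.pause();
    flushBatch(function (err) {
      if (err) console.log(err);
      lineReader.resume();
    });
  }
});

lineReader.on('close', function () {
  // Flush whatever is left after the last full batch
  flushBatch(function (err) {
    if (err) console.log(err);
  });
});
```

Keep in mind Postgres caps a single statement at 65535 bind parameters, so keep the batch size well below that for wide rows.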
I wrote code using a Node transform stream and a writable stream. The transform stream reads the file and passes JSON objects (in object mode) to the writable stream, which then writes them to the DB.
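Something along these lines, as a rough sketch (the comma-separated line format, the file name, and the HUMANS table are assumptions carried over from this thread):

```js
const { Transform, Writable } = require('stream');
const fs = require('fs');

// Transform stream: splits the incoming file into lines and pushes one
// object per line (assumes "firstname,lastname" per line).
const toObjects = new Transform({
  readableObjectMode: true,
  transform(chunk, encoding, callback) {
    this._buffer = (this._buffer || '') + chunk.toString();
    const lines = this._buffer.split('\n');
    this._buffer = lines.pop(); // keep the trailing partial line
    for (const line of lines) {
      const [firstname, lastname] = line.split(',');
      this.push({ firstname, lastname });
    }
    callback();
  },
  flush(callback) {
    if (this._buffer) {
      const [firstname, lastname] = this._buffer.split(',');
      this.push({ firstname, lastname });
    }
    callback();
  }
});

// Writable stream: inserts one row at a time; back-pressure means the
// next chunk is only read after the previous INSERT's callback fires,
// so memory stays flat.
const toDb = new Writable({
  objectMode: true,
  write(row, encoding, callback) {
    PgClient.query(
      'INSERT INTO HUMANS (firstname, lastname) VALUES ($1, $2)',
      [row.firstname, row.lastname],
      callback
    );
  }
});

fs.createReadStream('humans.txt').pipe(toObjects).pipe(toDb);
```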
Perhaps you should use something more specialised, like https://github.com/brianc/node-pg-copy-streams. But then, why not use a proper tool for the job? https://github.com/lukasmartinelli/pgfutter works amazingly well with CSV (we load millions of rows with it), and you can easily use it from your Node.js app via spawn or exec.
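For example, a minimal sketch of calling pgfutter through spawn (the CSV file name is a placeholder, and connection settings are left to pgfutter's own flags or environment):

```js
const { spawn } = require('child_process');

// Import the CSV with pgfutter; it infers the table from the file name
const pgfutter = spawn('pgfutter', ['csv', 'humans.csv']);

pgfutter.stdout.on('data', function (data) {
  console.log(data.toString());
});
pgfutter.stderr.on('data', function (data) {
  console.error(data.toString());
});
pgfutter.on('close', function (code) {
  console.log('pgfutter exited with code ' + code);
});
```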