I originally opened this as part of a knex issue: https://github.com/tgriesser/knex/issues/2052 but was directed here instead.
I have a simple query:
SELECT glb FROM models where assets_id = 16 LIMIT 1;
glb is a bytea column and the value I'm retrieving is 28630077 bytes (~27 MB); models contains a single row in this example. The query takes 13305 ms to run, and the Node process (not the DB process) maxes out the CPU while the query is running. If I query for the assets_id column instead of the glb column, it takes only 2 ms.
Running the same query with the same data from the psql command line completes almost immediately:
time psql -A -c "SELECT glb FROM models where assets_id = 16 LIMIT 1;" master postgres > out.glb
real 0m0.679s
user 0m0.000s
sys 0m0.031s
I also tested the same query in pg-native and it completed in ~450ms, but using pg-native isn't an option for me at this time (though I might have to re-evaluate that depending on where this issue goes).
Here's the table definition, for completeness:
CREATE TABLE public.models
(
    assets_id integer NOT NULL,
    glb bytea NOT NULL,
    CONSTRAINT models_pkey PRIMARY KEY (assets_id),
    CONSTRAINT models_assets_id_foreign FOREIGN KEY (assets_id)
        REFERENCES public.assets (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE CASCADE
)
WITH (
    OIDS=FALSE
);
ALTER TABLE public.models
    OWNER TO postgres;
Finally, I thought maybe it was a performance issue in the type parser, but all of the time is taken up by the query and then the typeParser completes almost instantly at the end.
Am I doing something wrong? Or is there a performance issue with large bytea values? I'd be happy to debug this further myself if someone can point me in the correct direction. Thanks in advance.
I'd be happy to debug this further myself if someone can point me in the correct direction
Please do, many people would want to know :wink: Your own test is the best direction so far; add logs within the Connection and pg-types and see where the main delay happens ;)
UPDATE
On second thought, since using pg-native made such a huge difference, it means this is not a pg-types issue; it's more likely inside the Connection object. There may be something wrong with the read operation.
This caught my interest, so I did some quick local testing, and I can confirm the following results...
Reading a single bytea field that contains a 16 MB file:
That's a 20x difference - huge!!!!
On second thought, since using pg-native made such a huge difference, it means this is not a pg-types issue; it's more likely inside the Connection object.
Thanks, I'll start there and see what I can find.
Reading is faster for you in JavaScript? Or is that a typo?
Just saw your edit. Glad you can reproduce. I'll keep digging but let me know if you find something on your end.
I did some profiling and a huge amount of time (~10 seconds of the 13 for my test case) is spent in Reader.addChunk from packet-reader: https://github.com/brianc/node-packet-reader/blob/master/index.js#L18
More specifically, it looks like lastChunk is set to a buffer instead of false most of the time, causing tons of Buffer.concat calls (874 in my case), which kills performance and probably creates a lot of memory pressure as well. I'm not exactly sure why this is happening, but I'm guessing that refactoring the code to avoid all of the extra allocations would go a long way toward fixing this. In my own memory streams, I usually keep an array of buffers and only concat once at the end, but I haven't learned enough about the code here yet to know if that's a possible solution.
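The array-of-buffers approach I mean looks roughly like this (a hypothetical sketch with a made-up `ChunkCollector` name, not the actual packet-reader API):

```javascript
// Sketch of the buffering strategy described above: accumulate incoming
// chunks in an array and concatenate only once at the end, instead of
// calling Buffer.concat on every chunk.
class ChunkCollector {
  constructor() {
    this.chunks = [];
    this.length = 0;
  }

  addChunk(chunk) {
    // O(1) per chunk: just remember it, no copying yet.
    this.chunks.push(chunk);
    this.length += chunk.length;
  }

  // Single concat once the full message has arrived: O(total bytes)
  // overall, versus O(n^2) for concatenating on every chunk.
  finish() {
    return Buffer.concat(this.chunks, this.length);
  }
}

// Usage: feed it the same data split into many small chunks.
const collector = new ChunkCollector();
for (let i = 0; i < 100; i++) {
  collector.addChunk(Buffer.alloc(1024, i));
}
const message = collector.finish();
console.log(message.length); // 102400
```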
Thoughts?
I'm not sure yet. All I can see is that all the time is being spent inside the on data handler:
https://github.com/brianc/node-postgres/blob/master/lib/connection.js#L128
Which line in there results in the longest delay is harder to tell.
Yeah, most of the time is being spent on this line: https://github.com/brianc/node-postgres/blob/master/lib/connection.js#L129
Of my total 4s, it eats about 3.2s; that's definitely bad!
Yep, that's exactly what I'm seeing. I think I know what the problem is and I'm trying to rewrite addChunk to be more performant.
@brianc It seems that this line kills the performance when dealing with large bytea columns: https://github.com/brianc/node-packet-reader/blob/master/index.js#L22
It slows down the library 20 times (from my tests), compared to the Native Bindings.
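To make the cost concrete, here's a standalone micro-benchmark (my own illustration, not code from either library) contrasting per-chunk concatenation with a single concat at the end:

```javascript
// Illustration of why per-chunk Buffer.concat is so costly: each call
// re-copies everything received so far, so n chunks cost O(n^2) bytes
// of copying, versus one copy of the total payload with a single concat.
function concatPerChunk(chunks) {
  let buffer = Buffer.alloc(0);
  for (const chunk of chunks) {
    buffer = Buffer.concat([buffer, chunk]); // re-copies all prior bytes
  }
  return buffer;
}

function concatOnce(chunks) {
  return Buffer.concat(chunks); // one copy of the total payload
}

// 500 chunks of 16 KB = 8 MB of payload, but ~2 GB of copying
// for the per-chunk version.
const chunks = [];
for (let i = 0; i < 500; i++) chunks.push(Buffer.alloc(16 * 1024));

console.time('concat per chunk');
const a = concatPerChunk(chunks);
console.timeEnd('concat per chunk');

console.time('concat once');
const b = concatOnce(chunks);
console.timeEnd('concat once');

console.log(a.equals(b)); // true; same bytes, very different cost
```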
And it may be worthwhile revisiting the entire on-data handler: https://github.com/brianc/node-postgres/blob/master/lib/connection.js#L128
Unfortunately, I cannot be more specific at present, those things require a bit of a closer look.
@vitaly-t, I have a fix for the packet-reader problem that I mentioned in https://github.com/brianc/node-postgres/issues/1286#issuecomment-300565369.
I've gone from ~13 seconds to ~450ms, matching the native time!
@mramato Wow, magical! I want to see that! Will you do a PR? ;)
I was, in the meantime, trying to ask a question here: http://stackoverflow.com/questions/43900530/slow-buffer-concat
@mramato If you are doing a PR, keep an eye on this question on StackOverflow, it might offer even a better idea, as there is already one answer there ;)
@vitaly-t see https://github.com/brianc/node-packet-reader/pull/3, should be easy enough for you to patch your local copy and verify my claims.
Not sure of the full extent of this improvement, did I just make every node-based Postgres app in the world significantly faster? :smile:
@mramato Well done! I can confirm that the change in https://github.com/brianc/node-packet-reader/pull/3 in fact improves the performance really well, close to what we get with the Native Bindings.
I cannot however confirm that the change guarantees the data integrity. New tests towards that are a must-have for such a change.
I cannot however confirm that the change guarantees the data integrity. New tests towards that are a must-have for such a change.
Thanks. Can you be more specific regarding exactly what you would like to see that isn't covered by the existing tests? Are you talking about adding tests in node-packet-reader or tests in node-postgres?
I'm saying that we need to make sure that node-packet-reader tests cover this change. And I'm not saying they don't already, I haven't checked them yet.
OK, thanks. I'll do a pass myself and see if there is anything obvious I can add. Thanks for the help on this by the way.
@vitaly-t I used istanbul to verify coverage, and the new buffer compaction was not tested at all (but the rest of the changes were adequately covered). I added a new test to verify the behavior is as expected, and the file is back to 100% coverage. Hopefully the PR is now good to go, but please let me know if there's anything else you need me to do.
@mramato Good by me! But I'm not the one to approve the PR :wink:
Thanks for your help y'all! Much appreciated!
Isn't a similar bug potentially affecting the text type as well? I haven't tried it or experienced any problem with it; I'm just wondering, because reading the text type (also of "infinite" length) is likely to involve mechanisms similar to what is used for bytea (heavy copying across buffers or the like).
@pdkovacs The packet-reader package that was patched reads entire messages; parsing individual fields wasn't the problem. The fix applies to all types.
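To illustrate why the fix is type-agnostic, here's a simplified sketch of length-prefixed message framing (not the actual packet-reader code; the real Postgres protocol also carries a 1-byte message type code, and its 4-byte length includes itself):

```javascript
// Messages are assembled from arbitrary socket chunks before any
// per-type parsing happens, so faster message assembly helps bytea,
// text, and every other type equally.
function* splitMessages(buffer) {
  let offset = 0;
  while (offset + 4 <= buffer.length) {
    const length = buffer.readUInt32BE(offset); // body length
    if (offset + 4 + length > buffer.length) break; // incomplete message
    yield buffer.subarray(offset + 4, offset + 4 + length);
    offset += 4 + length;
  }
}

// Frame two bodies and split them back out.
const frame = (body) => {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
};
const stream = Buffer.concat([
  frame(Buffer.from('hello')),
  frame(Buffer.from('bytea-or-text')),
]);
console.log([...splitMessages(stream)].map((b) => b.toString()));
// [ 'hello', 'bytea-or-text' ]
```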
Ah, I see, thanks!