Hey there :-),
I'm trying to render a lot of images, each composed of about 256 smaller images, using sharp's composite.
I'm collecting all the input images and their top/left coordinates into an array, and passing that into _sharp.composite_ afterwards.
The problem is that sharp seems to be _very_ slow when compositing. Again, it's about 256 input images composited into a 256x256px PNG (every base image is 16x16px).
I tried some different image libraries.
Using Jimp I got about 8 images/second, Mapbox got me about 50 images/second, but sharp only gives me 1 to 2 images/second. I'm using the same base code to collect the images for each image library.
I was wondering, is sharp's strength only in resizing images or am I doing something wrong? Is there something I can configure to improve performance?
Otherwise, sharp is more than perfect and does everything I want, so I would like to reduce the list of dependencies and just stick to sharp...
I hope you have an amazing week,
clarkx86 :)
Hi, what you describe here sounds more like stitching images without overlap than compositing, so the proposed feature of #1580 might better provide what you're looking for.
Hello, the x/y offset feature was added to composite in 8.7, and implemented in the simplest way you can imagine. It's been rewritten for 8.8 (due in a few weeks) and should be quicker now.
I made a tiny benchmark in Python:
#!/usr/bin/python3
import sys
import random
import pyvips
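# First argument is the background, the rest are the images to paste on top
bg = pyvips.Image.new_from_file(sys.argv[1])
fg = [pyvips.Image.new_from_file(filename) for filename in sys.argv[2:]]
# Pick a random position for each 16x16 overlay that keeps it inside the background
xes = [random.randint(0, bg.width - 16) for _ in range(len(fg))]
yes = [random.randint(0, bg.height - 16) for _ in range(len(fg))]
# Composite everything 100 times, forcing each result to be rendered to memory
for _ in range(100):
    bg.composite(fg, "over", x=xes, y=yes).copy_memory()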
That's compositing args 2+ on top of arg1 100 times. I can run it like this:
$ vips crop ~/pics/wtc.jpg base.jpg 0 0 256 256
$ vips crop ~/pics/PNG_transparency_demonstration_1.png x.png 150 150 16 16
$ for i in {0..256}; do cp x.png $i.png; done
$ time ../composite-many.py base.jpg *.png
real 0m9.067s
user 0m15.710s
sys 0m0.407s
So about 11 per second. With 8.7, I see:
$ time ../composite-many.py base.jpg *.png
real 0m17.563s
user 0m42.677s
sys 0m0.768s
About 6 per second.
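For anyone checking the arithmetic, here is a small sketch of how those per-second figures follow from the `real` times above. It assumes the 100-iteration loop from the script and treats the whole wall-clock time as compositing time, so startup and image loading are included and the real rates are probably slightly higher. The function name is made up for illustration.

```python
def composites_per_second(iterations, wall_seconds):
    """Throughput from the number of composite() calls and the `real` time."""
    return iterations / wall_seconds


ITERATIONS = 100  # loop count in the benchmark script

# `real` times copied from the two runs quoted above
rate_new = composites_per_second(ITERATIONS, 9.067)   # rewritten composite
rate_old = composites_per_second(ITERATIONS, 17.563)  # 8.7
speedup = rate_new / rate_old

print(f"rewritten: {rate_new:.1f}/s, 8.7: {rate_old:.1f}/s, speedup: {speedup:.2f}x")
```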
You'll see a larger speedup with larger images -- 256x256 is too small for the libvips threading system to be very effective.
I remembered one more possibility: composite uses g++ vector arithmetic to generate SIMD code. If your gcc is too old or if your CPU does not support 4xfloat SIMD, it may not work and it'll fall back to a slower vanilla C path.
Check for this in configure output:
checking for gcc with working vector support... yes
checking for C++ vector shuffle... yes
checking for C++ vector arithmetic... yes
checking for C++ signed constants in vector templates... yes
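If you have the configure output saved somewhere, a rough way to check those four probes is sketched below. The probe names are copied from the lines above. The helper name and the idea of feeding it the captured text are assumptions for illustration, not anything libvips ships.

```python
import re

# Probe names as they appear in the configure output quoted above.
VECTOR_CHECKS = [
    "gcc with working vector support",
    "C++ vector shuffle",
    "C++ vector arithmetic",
    "C++ signed constants in vector templates",
]


def failed_vector_checks(configure_output):
    """Return the probes that did not end in 'yes'; missing probes count as failed."""
    results = {}
    for line in configure_output.splitlines():
        match = re.match(r"checking for (.+?)\.\.\.\s*(.+)$", line.strip())
        if match:
            # Keep only the last word so "(cached) yes" is also read as "yes".
            results[match.group(1)] = match.group(2).split()[-1]
    return [name for name in VECTOR_CHECKS if results.get(name) != "yes"]
```

An empty list means all four probes passed. Any name it returns is a hint that composite may be running on the slower C path.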
And @lovell is correct of course, arrayjoin will be much faster if you are simply making a grid of small images.
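For reference, a minimal pyvips sketch of the arrayjoin approach, assuming all tiles are the same size (16x16 here) and should fill a grid row by row. `grid_size` and `join_tiles` are names invented for this example. `pyvips.Image.arrayjoin` and its `across` argument are the real API. I have not benchmarked this against the composite script above.

```python
import math


def grid_size(tile_count, across, tile_width=16, tile_height=16):
    """Expected pixel size of the joined grid, assuming tile_count >= across."""
    rows = math.ceil(tile_count / across)
    return across * tile_width, rows * tile_height


def join_tiles(filenames, across=16):
    """Join equally sized tiles into one image, `across` tiles per row."""
    # Imported here so grid_size() can be used without libvips installed.
    import pyvips

    tiles = [pyvips.Image.new_from_file(name) for name in filenames]
    return pyvips.Image.arrayjoin(tiles, across=across)
```

With 256 tiles and `across=16`, `grid_size(256, 16)` gives `(256, 256)`, matching the output size in the question. Something like `join_tiles(sys.argv[1:]).write_to_file("grid.png")` would then write the result.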