Julia's conv2
is much slower than Matlab's (~100x):
Julia v0.5-rc3
julia> img = rand(1024,1024);
julia> @benchmark conv2(img, [-1 0 1; -2 0 2; -1 0 1.])
BenchmarkTools.Trial:
samples: 8
evals/sample: 1
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 136.54 mb
allocs estimate: 235
minimum time: 613.07 ms (0.99% GC)
median time: 664.07 ms (8.41% GC)
mean time: 669.47 ms (8.98% GC)
maximum time: 749.12 ms (17.76% GC)
julia> versioninfo()
Julia Version 0.5.0-rc3+0
Commit e6f843b (2016-08-22 23:43 UTC)
Platform Info:
System: Darwin (x86_64-apple-darwin13.4.0)
CPU: Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.7.1 (ORCJIT, haswell)
Matlab 2015b
>> img = rand(1024, 1024);
>> tic(); conv2(img, [-1 0 1; -2 0 2; -1 0 1.]); toc();
Elapsed time is 0.005330 seconds.
This might be due to https://github.com/JuliaLang/julia/issues/17000~~
Even with bumping up the number of threads:
julia> img = rand(1024, 1024);
julia> FFTW.set_num_threads(8)
julia> @benchmark conv2(img, [-1 0 1; -2 0 2; -1 0 1.])
BenchmarkTools.Trial:
samples: 24
evals/sample: 1
time tolerance: 5.00%
memory tolerance: 1.00%
memory estimate: 136.54 mb
allocs estimate: 235
minimum time: 167.53 ms (5.64% GC)
median time: 234.68 ms (27.24% GC)
mean time: 218.57 ms (20.45% GC)
maximum time: 267.15 ms (33.01% GC)
Still 50x slower than Matlab
Maybe https://github.com/JuliaLang/julia/issues/18245 is related?
binary or source build?
Binary from http://julialang.org
Even if this is just #18245, I would guess this particular case would be faster with na茂ve convolution vs. FFT. According to the docs, that's what MATLAB conv2
does. The FFT-based conv2
algorithm would probably also be faster if it used overlap-save instead of padding both of the signals to the same size.
Interesting. I would argue that convoluting with something like a Sobel operator is a pretty common operation in image analysis so it might be good to optimize for this case.
The FFT-based conv2 algorithm would probably also be faster if it used overlap-save instead of padding both of the signals to the same size.
I didn't realize it was doing this, but that could be very expensive due to the huge size difference between the two matrices here.
The Sobel operator is also separable, so the na茂ve convolution can be implemented even more efficiently. We have a special form of conv2
for separable filters, but it's also in need of optimization. It seems as though some MATLAB functions automatically test for filter separability when given a filter as a matrix.
It looks like the separable version is also padding: https://github.com/JuliaLang/julia/blob/06c0ba6233258dbfd9cbb301e3fa115ee1b8de49/base/dsp.jl#L148-L167
So maybe we could do an automatic check for separability for the potential speedup? Something like this: http://blogs.mathworks.com/steve/2006/11/28/separable-convolution-part-2/
You may want to look at imfilter in the Images package.
I'd benchmark before assuming separability is a gain for a 3x3 kernel. Two passes through the memory might be too costly.
Rather than Images per se, the state-of-the-art (which will soon be the default in Images) is https://github.com/JuliaImages/ImagesFiltering.jl/pull/5. This now uses a cache-efficient (tiled) algorithm (somewhat similar to http://halide-lang.org/), tricks to avoid explicit padding, and optionally Julia's native multithreading capabilities. These optimizations turn out to make a pretty large difference in performance.
EDIT: I didn't mention Matlab in that PR, but the version in ImagesFiltering is about 3x faster than Matlab, even though Matlab is using multithreading. Even the single-threaded Julia version is faster than multithreaded Matlab on my 2-core machine.
(The tiled algorithm avoids two passes through memory, so the separable algorithm is a win.)
It would be great to port some of that high performance code over to conv2
in Base since having a slow function to do a subset of what imfilter
does seems like a bad idea.
no. better remove conv2
from Base and point to Images
as the package to be used when using filtering.
It seems useful to have the mathematical method (2d convolution) separated from the application (image filtering) because somebody might want to run a 2d convolution on something that is an image but I don't know the application well enough to tell if that would hurt.
@andreasnoack: This is a naming confusion. The scope of the Images package is actually nd-arrays. So the algorithms in that package are pretty application free and work on any multi-dimensional data.
Do we have a package for nd-array operations already? This is part of the problem with moving everything out Base, IMO.
The problem is not moving things out of Base but clever organization of the package landscape.
It's worth pointing out that imfilter
isn't a drop-in replacement for conv2
; conv2
is more symmetric with respect to its two input arrays, and the returned array is larger than either of the inputs because it returns all nonzero values. In contrast, imfilter
makes a strong distinction between the input "image" and the "kernel", and the output has indices that match those of the "image." You can obtain the same result by padding the input "image" with zeros, of course, but it's a noteworthy difference in API.
I do think it makes sense to have separate APIs for image filtering and general convolution, but ideally they should share as much as possible. If they can't share code directly, then they should at least use the same techniques.
@timholy: Absolutely. My comment should not mean that imfilter can be used as is but that I question the above proposal to pull in advanced filtering/convolution algorithm into Base. Instead better use a high level package like Images
. The issue is as Jeff said that the high level interfaces should share infrastructure and core implementation. And that is exactly how you designed the algorithms in Images
, no?
It should not be at all hard to write a convn
on top of the existing code. ImagesFiltering isn't an enormous package, but it uses a lot of machinery that's not in Base: for example, it depends on all the packages in JuliaArrays (with the exception of EndpointArrays), plus OffsetArrays and ComputationalResources.
@timholy: And what would be your suggestion for conv2
in Base? I see three possibilities
Either 2 or 3. When the kernel is small, it definitely makes no sense to use FFT. Fortunately, it's not hard to write a generic nd convolution. This is basically the whole thing:
fill!(out, zero(eltype(out)))
for I in CartesianRange(indices(A))
a = A[I]
for J in CartesianRange(indices(B))
out[I+J] += a*B[J]
end
end
except really the LHS has to be out[I+J+offset]
because for historical reasons we implicitly shift the indices of the output so that indexing starts at 1. (I must say, having arrays with indexing that doesn't start at 1 makes all kinds of filtering operations a whole lot cleaner---all your code looks just like the mathematics, and you no longer have to mentally translate everything for the padding.)
This won't have the fancier features of ImagesFiltering (tiling i.e. support for efficient "cascades" of multiple filters, multithreading, support for IIR filtering, etc), but still it should be pretty decent, and it's a tiny amount of code all of which is supported by Base.
Quite frankly, for the purpose of image filtering, the only reason to use conv2 is that you're running Matlab and can't afford an Image Processing Toolbox license. In all other respects imfilter is the superior option and in Julia it's only a Pkg.add away.
So the question is what other applications have use of conv2 and to what extent they benefit from optimizations. Or maybe the question is why the DSP module is still in Base and not a package (and even so, conv2 looks rather out of place there).
Or maybe the question is why the DSP module is still in Base and not a package (and even so, conv2 looks rather out of place there).
Yes, I think it should go. Most of it is already implemented in packages anyway.
conv
.deconv
.conv2
.I've changed my tune, I think we should remove conv2
from Base. imfilter
in ImageFiltering
is much superior to this. This function was thrown together as a convenience for people migrating from matlab, but is both slow and doesn't belong. I think it's presence is misleading and people might use it instead of searching for a better alternative (like imfilter
).
Needs conv2 be deprecated (in 0.6 or just removed there)? Can it be in 0.5.x? Even added to 0.5.0 NEWS?
@tknopp "remove conv2 from Base and point to Images as the package to be used when using filtering."
Point from NEWS or from an actual deprecate warning message? Both?
[Seems drastic to just remove slow stuff if slower than MATLAB (or in case of Python), but my understanding from scanning is replacement is fast [enough] as MATALB? E.g. not asked to reimplement or move elsewhere.]
@simonster "ImagesFiltering can own conv2", did you mean Images.jl? I see no ImagesFiltering.jl.
I'm looking conv2 (or related) in Images.jl and now DSP.jl and actually only found (so far):
https://github.com/JuliaDSP/DSP.jl/blob/master/src/util.jl#L216
"unsafe_dot" is it because of @inbounds? (or @simd for i in 1:cLastIdx ?) I thought this was supposed to be safe, up until new 0-based.. First of, is "unsafe_" a recommended prefix; and here now error_ a better one.. as both functions broken for 0-based?
https://groups.google.com/forum/#!topic/julia-dev/MdTBxgVoab0
We are talking about https://github.com/JuliaImages/ImageFiltering.jl, which hasn't been officially released yet, but should be soon. It has an extremely fast imfilter
routine which can do 2D convolutions.
Independently of the question if a convolution function belongs to base the function conv2
is absolutely unjulian since it encodes the dimension in the function name.
Ok, let's plan to remove dsp.jl from Base:
conv
function?conv2
?crosscor
is not the same as xcorr
. Need some advice on this.Deconvolution.jl seems to contain exactly one function, which is great, but makes me wonder whether it makes sense to add to it.
We'll also want lucyrichardson
, nearestneighbor
, blind
, etc. It just takes someone to write them :smile:. I don't think it would make sense to have a package per method, so I think a general Deconvolution package makes sense.
Can we add a wrapper in ImageFiltering.jl that behaves like conv2?
Should be doable. Maybe file an issue over at ImageFiltering so I don't forget.
Should be doable. Maybe file an issue over at ImageFiltering so I don't forget.
Done.
Any progress here?
This is done.
Most helpful comment
Rather than Images per se, the state-of-the-art (which will soon be the default in Images) is https://github.com/JuliaImages/ImagesFiltering.jl/pull/5. This now uses a cache-efficient (tiled) algorithm (somewhat similar to http://halide-lang.org/), tricks to avoid explicit padding, and optionally Julia's native multithreading capabilities. These optimizations turn out to make a pretty large difference in performance.
EDIT: I didn't mention Matlab in that PR, but the version in ImagesFiltering is about 3x faster than Matlab, even though Matlab is using multithreading. Even the single-threaded Julia version is faster than multithreaded Matlab on my 2-core machine.