ORB_SLAM2: Is there any difference between the ORB detector provided by OpenCV and the one in ORB-SLAM?

Created on 17 Jul 2017 · 8 Comments · Source: raulmur/ORB_SLAM2

Is there any difference between the ORB detector provided by OpenCV and the ORB detector implemented in ORB-SLAM?
After running ORB-SLAM's detection, I detected features and computed descriptors again with OpenCV's ORB. Surprisingly, the ORB detector in ORB-SLAM performed better, especially when shadowed regions were present. In addition, the features obtained with ORB-SLAM's detector are more widely distributed over the image than those obtained with OpenCV's ORB.

Once again, my question is: is there a difference?

hyunhoJeon

👍1

Most helpful comment

@hyunhoJeon

Yes, they are different. ORB stands for Oriented FAST and Rotated BRIEF. The "oriented" part is the same. But BRIEF, the binary descriptor, is different. And the one in ORB-SLAM is different from the one in ORB-SLAM2.

The BRIEF paper says that a random distribution of pixel coordinates works well, so each BRIEF implementation can use a different one.
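
To illustrate the point about random sampling patterns, here is a minimal sketch of a BRIEF-style descriptor in plain Python. It is not taken from OpenCV or ORB-SLAM: the function names, the 9x9 patch, the uniform sampling, and the synthetic image are all assumptions made for illustration, and the smoothing step that real BRIEF applies is left out. The only point is that two implementations drawing their pixel pairs differently produce different bit strings for the same pixel.

```python
import random

def make_pattern(n_bits, patch_size, seed):
    """Draw n_bits random pixel pairs inside a patch_size x patch_size patch.

    Hypothetical helper: offsets are sampled uniformly around the patch centre.
    """
    rng = random.Random(seed)
    half = patch_size // 2

    def offset():
        return (rng.randint(-half, half), rng.randint(-half, half))

    return [(offset(), offset()) for _ in range(n_bits)]

def brief(image, cx, cy, pattern):
    """One bit per pair: 1 if the first pixel is darker than the second."""
    bits = []
    for (x1, y1), (x2, y2) in pattern:
        first = image[cy + y1][cx + x1]
        second = image[cy + y2][cx + x2]
        bits.append(1 if first < second else 0)
    return bits

def hamming(a, b):
    """Number of differing bits, the usual distance for binary descriptors."""
    return sum(x != y for x, y in zip(a, b))

# Synthetic, deterministic 32x32 image (made-up data, not a real frame).
image = [[(x * 7 + y * 13 + (x * y) % 5) % 256 for x in range(32)]
         for y in range(32)]

pattern_a = make_pattern(256, 9, seed=1)  # "implementation A"
pattern_b = make_pattern(256, 9, seed=2)  # "implementation B"
desc_a = brief(image, 16, 16, pattern_a)
desc_b = brief(image, 16, 16, pattern_b)

print("bits differing between the two patterns:", hamming(desc_a, desc_b))
```

Descriptors are only comparable when both sides use the same pattern, which is why it matters whether two implementations share it.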

AlejandroSilvestri on 17 Jul 2017

👍3

All 8 comments

I always appreciate your help. Thanks @AlejandroSilvestri
I have modified the contents of this article. There are still some problems, but I am trying to solve them myself. Thank you for your help.

hyunhoJeon on 19 Jul 2017

👍1

@AlejandroSilvestri

Please correct me if I am wrong, but the question here was about the difference between the ORB descriptor and the modified version of ORB in ORB-SLAM, wasn't it?

Yes, ORB is based on BRIEF. However, ORB addresses detection, orientation assignment, and descriptor extraction, rather than only the descriptor (i.e. BRIEF). ORB (Oriented FAST and Rotated BRIEF) adopts a multi-scale approach in which the FAST detector is run independently on each layer of a Gaussian pyramid. Then, for each detected key-point, the Harris response (instead of the FAST response) and the orientation angle are computed. The orientation is estimated using the intensity centroid method. The Harris response is used to retain the best N key-points adaptively across the scale layers. In BRIEF, the sampling pattern (the pairs of pixels compared by intensity within the patch) used to build the descriptor is randomly drawn from a Gaussian distribution (although the original paper investigates 5 types of selection). Unlike BRIEF, ORB learned its sampling pattern to achieve high variance and low correlation between the tests, for a patch size of 31x31. If a different patch size is used, the ORB sampling pattern will be random. Before the descriptor is extracted, the sampling pattern is rotated by the estimated orientation angle.
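
The orientation step described above can be sketched in a few lines. This is a simplified illustration, not the OpenCV or ORB-SLAM code: it uses a square patch given as a list of rows (ORB uses a circular one), and both function names are made up for this example. The angle points from the patch centre towards the intensity centroid, and each sampling-pattern point is then rotated by that angle.

```python
import math

def intensity_centroid_angle(patch):
    """Angle (radians) from the patch centre to its intensity centroid.

    patch: square list of rows with an odd side length.
    Uses the first-order moments m10 and m01, with x to the right
    and y downwards, as in image coordinates.
    """
    half = len(patch) // 2
    m10 = 0.0
    m01 = 0.0
    for row_idx, row in enumerate(patch):
        for col_idx, value in enumerate(row):
            m10 += (col_idx - half) * value
            m01 += (row_idx - half) * value
    return math.atan2(m01, m10)

def rotate_point(x, y, angle):
    """Rotate one sampling-pattern offset by the keypoint orientation."""
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)

# A patch that is brighter on its right side: the centroid lies to the right.
bright_right = [[0, 0, 1],
                [0, 0, 1],
                [0, 0, 1]]
# A patch that is brighter at the bottom: the centroid lies below the centre.
bright_bottom = [[0, 0, 0],
                 [0, 0, 0],
                 [1, 1, 1]]

angle_right = intensity_centroid_angle(bright_right)
angle_bottom = intensity_centroid_angle(bright_bottom)
rotated = rotate_point(1.0, 0.0, angle_bottom)
```

Because the pattern is rotated with the patch, the same physical corner yields comparable bits when the camera rolls.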

@hyunhoJeon
Regarding the difference between ORB and the version in ORB-SLAM: the latter introduces a grid-based mechanism so that key-points are detected evenly over the whole image while targeting a user-defined number of features. Basically, the main modification is at the detection level, in how the points are made sparse over the image. Moreover, to obtain more points, the FAST threshold is lowered from 20 to 5 if no points are detected in a cell of the grid. (The full procedure is more detailed than this, but I hope this helps.) The sampling pattern, instead, is the same.
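
A rough sketch of the grid idea with the threshold fallback, under stated assumptions: this is not the ORB-SLAM source, `select_per_cell` and its parameters are hypothetical, and each candidate's score stands in for the FAST test (a candidate is treated as detected at threshold t when its score is at least t). The thresholds 20 and 5 are the values quoted in the comment above.

```python
def select_per_cell(candidates, cell, high, low, per_cell):
    """Keep the strongest candidates in each grid cell.

    candidates: list of (x, y, score) tuples.
    cell: side length of a square grid cell, in pixels.
    high, low: strict threshold and relaxed fallback threshold.
    per_cell: maximum number of keypoints kept per cell.
    """
    cells = {}
    for x, y, score in candidates:
        cells.setdefault((x // cell, y // cell), []).append((x, y, score))

    kept = []
    for key in sorted(cells):
        points = cells[key]
        passing = [p for p in points if p[2] >= high]
        if not passing:
            # Nothing passes the strict threshold in this cell: relax it.
            passing = [p for p in points if p[2] >= low]
        passing.sort(key=lambda p: p[2], reverse=True)
        kept.extend(passing[:per_cell])
    return kept

# Made-up candidates: a textured corner, a dim (shadowed) area, a flat area.
candidates = [
    (5, 5, 30), (8, 3, 25), (12, 4, 22),  # strong corners in cell (0, 0)
    (40, 40, 8), (45, 42, 6),             # weak corners in cell (1, 1)
    (70, 10, 3),                          # below both thresholds
]
kept = select_per_cell(candidates, cell=32, high=20, low=5, per_cell=2)
```

In this toy input the weak corners in the second cell survive only because of the fallback, which is consistent with the better behaviour in shadowed regions reported in the question.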

@AlejandroSilvestri
My question now: why does a feature-based vSLAM system such as ORB-SLAM require the features to be sparse over the whole image? More than once I have come across the statement that the detector/descriptor needs to be adapted for SLAM applications (I believe I read it in one of Mikolajczyk's lessons, but I need to find the reference again). I believe this is often connected with forward motion, as in KITTI, and thus helps to estimate short-baseline motions. Does this sound familiar to you too?

kerolex on 9 Jan 2018

@kerolex

Hi, kerolex.

Thanks for the explanation. I have confirmed the part about dividing the image into a grid, but I do not know why this is necessary. Is there any paper suggesting that spread-out feature points are better than non-spread ones? I have not seen this in the ORB-SLAM paper.

hyunhoJeon on 12 Jan 2018

@kerolex ,

Nicely put. Very well explained.

@hyunhoJeon , @kerolex

About feature sparseness: the frame pose is triangulated from map point observations, which are features. Each observation in a bunch piled up in the same area adds little valuable information for triangulation. For a given number of observations, they give more information when they are far from one another. That means evenly distributed observations are more valuable than piled-up ones.
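
A simplified way to see this is a 2D analogy, which is an assumption of this sketch and not ORB-SLAM's actual pose optimization. When a rigid 2D motion is estimated from point matches with isotropic noise of standard deviation `noise_std` per point, the best achievable standard deviation of the rotation angle is approximately `noise_std / sqrt(sum of squared distances to the centroid)`. The function name and the pixel coordinates below are made up for illustration.

```python
import math

def rotation_std(points, noise_std):
    """Approximate lower bound on the rotation-angle std (radians).

    points: list of (x, y) observation positions in pixels.
    noise_std: per-point measurement noise in pixels.
    The bound shrinks as the points move away from their centroid.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    spread = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return noise_std / math.sqrt(spread)

# Four observations piled up in a 10x10 pixel area.
clustered = [(0, 0), (10, 0), (0, 10), (10, 10)]
# Four observations spread over a 600x400 pixel area.
distributed = [(0, 0), (600, 0), (0, 400), (600, 400)]

std_clustered = rotation_std(clustered, noise_std=1.0)
std_distributed = rotation_std(distributed, noise_std=1.0)
print("clustered:", std_clustered, "distributed:", std_distributed)
```

With the same number of points and the same pixel noise, the spread-out set constrains the rotation far more tightly than the piled-up one.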

AlejandroSilvestri on 12 Jan 2018

Thanks @AlejandroSilvestri .
You confirmed what I was guessing and had understood from different papers, such as Nister's VO, SVO, and DSO, which always mention that it is desirable to have features spatially well distributed over the image. And indeed it makes sense.

However, I would also be interested in references in the literature that explain this motivation. So far, I have found these two papers [1][2], which discuss the sparseness of features, but not specifically for visual SLAM/odometry.

I hope you or someone else can provide a good reference. Thanks.

[1] T. Tuytelaars, _Dense interest points_ (CVPR 2010).
[2] S. Gauglitz, L. Foschini, M. Turk and T. Höllerer, _Efficiently selecting spatially distributed keypoints for visual tracking_ (ICIP 2011).

kerolex on 13 Jan 2018

@kerolex

Marked to read, thank you.

Beware: do not confuse "spatially distributed" with "sparse". The term "sparse" often refers to the "dense vs. sparse" debate.

AlejandroSilvestri on 13 Jan 2018

Note that the ORB descriptors are the same; only the detector is different.
This means that, for a given pixel, the descriptor calculated by computeOrbDescriptor is the same as the one from computeOrbDescriptors in OpenCV. Features in a map built by ORB-SLAM2 should therefore be able to match features detected with OpenCV's ORB.

Note that calling OpenCV's feature2D::compute will give a different answer, because it blurs the image even when keypoints are provided.