Models: Details on DELF features

Created on 19 Apr 2018 · 7 Comments · Source: tensorflow/models

Hi @andrefaraujo, thanks for open sourcing your DELF project.
I want to confirm a few things about the DELF features. A DELF feature has five components:

locations: [N, 2] float array which denotes the selected keypoint
  locations. N is the number of features.
scales: [N] float array with feature scales.
descriptors: [N, depth] float array with DELF descriptors.
attention: [N] float array with attention scores.
orientations: [N] float array with orientations.

  1. For locations, is each row (x, y) or (y, x)? Does it follow OpenCV's coordinate convention, where the origin is the top-left corner, the horizontal axis is x, and the vertical axis is y?
  2. What do the scales mean? During feature extraction, the image is rescaled according to image_scales in the delf_config.pbtxt file. However, the values in scales are not equal to those settings. How are the scales calculated?
  3. Is the attention array sorted?
  4. All the orientations are zeros. Are they currently unused?

All 7 comments

I think you are taking this from the feature_io.py file, right?

  1. Each row is (y, x). And yes, the origin is the top-left corner. The features are visualized here, which shows how the convention is used: https://github.com/tensorflow/models/blob/master/research/delf/delf/python/examples/match_images.py#L87
  2. scale = 1/image_scale. This follows the usual convention for scale values, as in SIFT for example. It is done here: https://github.com/tensorflow/models/blob/master/research/delf/delf/python/feature_extractor.py#L185
  3. Usually, yes. Nothing in the feature_io.py file enforces this, but the features should already be sorted when you get them, as in https://github.com/tensorflow/models/blob/master/research/delf/delf/python/examples/extract_features.py#L109 (they are sorted by the box_list_ops.non_max_suppression function in https://github.com/tensorflow/models/blob/master/research/delf/delf/python/feature_extractor.py#L236)
  4. Correct, it is currently unused.
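
The first three points can be illustrated with a small NumPy sketch. The helper names (to_xy, feature_scale, is_sorted) and all array values are made up for illustration and are not part of the DELF code. Only the (y, x) row order and the scale = 1/image_scale relation come from the answer above. The sort direction is not stated in the thread, so the check accepts either direction.

```python
import numpy as np

def to_xy(locations_yx):
    """Swap (y, x) rows into (x, y) order, e.g. for OpenCV drawing calls."""
    locations_yx = np.asarray(locations_yx, dtype=np.float64)
    return locations_yx[:, ::-1].copy()

def feature_scale(image_scale):
    """Feature scale as described in the thread: the inverse of the image scale."""
    return 1.0 / image_scale

def is_sorted(values):
    """True if values are monotonic (non-increasing or non-decreasing)."""
    steps = np.diff(np.asarray(values, dtype=np.float64))
    return bool(np.all(steps <= 0) or np.all(steps >= 0))

# Hypothetical keypoints, one (y, x) pair per row.
locations = np.array([[10.0, 200.0], [35.5, 40.0]])
xy = to_xy(locations)

# Hypothetical image_scales values; the real ones live in delf_config.pbtxt.
image_scales = [0.25, 0.5, 1.0, 2.0]
feature_scales = [feature_scale(s) for s in image_scales]

# Hypothetical attention scores.
attention = np.array([0.9, 0.7, 0.7, 0.2])
attention_sorted = is_sorted(attention)
```

Since feature_io.py does not enforce any ordering, a check like is_sorted can be run on features loaded from disk before relying on the order.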

OK, Thanks!

A DELF feature is composed of these 5 parts. Which of them is the final feature that can be compared by cosine or Euclidean distance? Or how do I generate one vector from these 5 parts?
@andrefaraujo

@xiaozhi2015

The descriptors variable contains each of the N local features' descriptors. They can be used to measure similarity between local image regions. You can see an example of how features are matched here: https://github.com/tensorflow/models/blob/master/research/delf/delf/python/examples/match_images.py#L60
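
As a rough illustration of comparing local descriptors, the sketch below does a brute-force nearest-neighbor search by Euclidean distance with NumPy. It is not the code from the linked script. The function name match_descriptors, the max_distance threshold, and the toy arrays are all assumptions made for this example.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, max_distance):
    """For each descriptor in desc_b, find its nearest neighbor in desc_a.

    Returns a list of (index_in_a, index_in_b) pairs whose Euclidean
    distance is at most max_distance (a hypothetical threshold).
    """
    desc_a = np.asarray(desc_a, dtype=np.float64)
    desc_b = np.asarray(desc_b, dtype=np.float64)
    # Squared distances via |a|^2 + |b|^2 - 2 a.b, shape [len(b), len(a)].
    sq = (
        (desc_b ** 2).sum(axis=1)[:, None]
        + (desc_a ** 2).sum(axis=1)[None, :]
        - 2.0 * desc_b @ desc_a.T
    )
    dists = np.sqrt(np.maximum(sq, 0.0))  # clamp tiny negatives from rounding
    nearest = dists.argmin(axis=1)
    nearest_dist = dists[np.arange(len(desc_b)), nearest]
    keep = np.flatnonzero(nearest_dist <= max_distance)
    return [(int(nearest[j]), int(j)) for j in keep]

# Toy 2-D descriptors; real DELF descriptors have shape [N, depth].
desc_a = np.array([[1.0, 0.0], [0.0, 1.0]])
desc_b = np.array([[0.0, 0.9], [5.0, 5.0]])
matches = match_descriptors(desc_a, desc_b, max_distance=0.8)
```

Here the first row of desc_b matches the second row of desc_a, and the second row of desc_b is too far from everything to be kept. Putative matches like these are normally filtered further using the keypoint locations.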

Much thx for your reply! @andrefaraujo
Using the delf_v1_20171026 model you uploaded, I get 1000 local feature descriptors, each with 40 dimensions, i.e. the final feature has 1000 * 40 dims, far more than many other methods, right? Can I flatten the 1000 * 40 values into one 40000-dim vector?

No, you should not use the concatenated local features as a global descriptor (i.e., an embedding for the entire input image). That will not work.

I am wondering what exactly you are trying to do. If your goal is to obtain a global descriptor, you might want to disable PCA, then average pool the extracted 1024D descriptors. That would give you a better global descriptor.
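
The average-pooling suggestion could look roughly like the NumPy sketch below. It assumes the [N, 1024] descriptors have already been extracted with PCA disabled; random arrays stand in for them here. The function names and the L2 normalization step are assumptions added for the example, not something stated in the thread.

```python
import numpy as np

def average_pool(descriptors):
    """Average [N, depth] local descriptors into one [depth] vector.

    The result is L2-normalized (an assumption of this sketch) so that
    vectors from different images are on a comparable scale.
    """
    descriptors = np.asarray(descriptors, dtype=np.float64)
    pooled = descriptors.mean(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

def cosine_similarity(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Random stand-ins for the descriptors of two images (not real DELF output).
rng = np.random.default_rng(0)
local_a = rng.normal(size=(1000, 1024))
local_b = rng.normal(size=(1000, 1024))

global_a = average_pool(local_a)
global_b = average_pool(local_b)
similarity = cosine_similarity(global_a, global_b)
```

Unlike the flattened 40000-dim vector, the pooled vector has the same length for every image and does not depend on the order of the local features.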

Yes, I just want to obtain a global descriptor. I will try it as you say. ^_^
