Docker
Chrome 79 on Linux
When running a submodel using --sm-cluster where the filenames contain spaces, the names are transformed incorrectly during transfer, causing file name mismatches and subsequent submodel processing failures.
Here are the corresponding filenames from the main dataset:
./images: [origin]
F - DJI_0104.JPG.exif F - DJI_0129.JPG.exif F - DJI_0154.JPG.exif F - DJI_0179.JPG.exif
./opensfm/exif: [origin]
F - DJI_0104.JPG.exif F - DJI_0129.JPG.exif F - DJI_0154.JPG.exif F - DJI_0179.JPG.exif
Here are the corresponding filenames in the submodel directory on my master node:
./images: [symlinked]
F - DJI_0104.JPG.exif F - DJI_0129.JPG.exif F - DJI_0154.JPG.exif F - DJI_0179.JPG.exif
./opensfm/exif: [symlinked]
F - DJI_0104.JPG.exif F - DJI_0129.JPG.exif F - DJI_0154.JPG.exif F - DJI_0179.JPG.exif
And finally, here are the filenames from within the docker container running the submodel:
./images: [copied from origin, name sanitized]
F_-_DJI_0104.JPG.exif F_-_DJI_0129.JPG.exif F_-_DJI_0154.JPG.exif F_-_DJI_0179.JPG.exif
./opensfm/exif: [copied from origin, not sanitized]
F - DJI_0104.JPG.exif F - DJI_0129.JPG.exif F - DJI_0154.JPG.exif F - DJI_0179.JPG.exif
(The filename list has been shortened for brevity)
Note the filename change after transfer into the container.
This name change results in an error at the feature matching stage in the submodel:
2020-02-26 10:38:52,032 DEBUG: Found 37586 points in 27.7237670422s
2020-02-26 10:38:52,041 DEBUG: No segmentation for F_-_DJI_0179.JPG, no features masked.
2020-02-26 10:38:53,162 DEBUG: Found 37120 points in 29.470690012s
2020-02-26 10:38:53,169 DEBUG: No segmentation for F_-_DJI_0104.JPG, no features masked.
[INFO] running /usr/bin/env python2 /code/SuperBuild/src/opensfm/bin/opensfm match_features "/var/www/data/62a85002-8f4b-485f-8b02-52f37331711c/opensfm"
Traceback (most recent call last):
File "/code/SuperBuild/src/opensfm/bin/opensfm", line 34, in <module>
command.run(args)
File "/code/SuperBuild/src/opensfm/opensfm/commands/match_features.py", line 29, in run
pairs_matches, preport = matching.match_images(data, images, images)
File "/code/SuperBuild/src/opensfm/opensfm/matching.py", line 36, in match_images
exifs = {im: data.load_exif(im) for im in all_images}
File "/code/SuperBuild/src/opensfm/opensfm/matching.py", line 36, in <dictcomp>
exifs = {im: data.load_exif(im) for im in all_images}
File "/code/SuperBuild/src/opensfm/opensfm/dataset.py", line 269, in load_exif
with io.open_rt(self._exif_file(image)) as fin:
File "/code/SuperBuild/src/opensfm/opensfm/io.py", line 541, in open_rt
return io.open(path, 'r', encoding='utf-8')
IOError: [Errno 2] No such file or directory: '/var/www/data/62a85002-8f4b-485f-8b02-52f37331711c/opensfm/exif/F_-_DJI_0104.JPG.exif'
Traceback (most recent call last):
File "/code/run.py", line 57, in <module>
app.execute()
I considered this while looking around the codebase to find the error. In short, the name-mangling should most likely not be going on at all. You don't want mangled names in features, matches and the like that will no longer match the filenames in /images once all files are transferred back to the origin. It does not seem like any other component in ODM has a problem with unprocessed filenames, so the mangling doesn't seem to be useful either.
Currently the submodel image filenames are mangled at some point during the LRE transfer, but the exif data is copied from the original dataset, where the filenames remain unchanged.
NodeODM performs some filename sanitization that looks similar to what happens to the spaces in my filenames, so that may be the cause. I can't confirm it, though, so I thought it better to file this issue in the main ODM project: https://github.com/OpenDroneMap/NodeODM/blob/60615c6f91717604e8fb15e8e6076a3b92a58f22/libs/taskNew.js#L84
I would expect this to happen with any dataset whose filenames contain spaces or other special characters when it is run with the --sm-cluster option enabled.
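A quick way to check whether a dataset is likely to be affected is to list the image names that a sanitizer would change. The sketch below assumes the sanitizer replaces each whitespace character with an underscore, which matches the names observed in the container above but is not confirmed to be what NodeODM actually does. `assumed_sanitize` and `find_affected` are hypothetical helpers, not part of ODM or NodeODM.

```python
import re


def assumed_sanitize(name):
    # Assumption: each whitespace character becomes "_",
    # which reproduces "F - DJI_0104.JPG" -> "F_-_DJI_0104.JPG".
    return re.sub(r"\s", "_", name)


def find_affected(filenames):
    """Return {original: sanitized} for every name the sanitizer would change."""
    return {
        name: assumed_sanitize(name)
        for name in filenames
        if assumed_sanitize(name) != name
    }


if __name__ == "__main__":
    names = ["F - DJI_0104.JPG", "DJI_0200.JPG"]
    for original, sanitized in find_affected(names).items():
        print("%r -> %r" % (original, sanitized))
```

Any name this reports would end up under a different name in ./images than in ./opensfm/exif inside the container, which is the mismatch shown in the traceback.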
A basic test on any dataset of sufficient size would be:
docker run --network="host" -ti --rm opendronemap/nodeodm &
docker run --network="host" -ti --rm -v DATASET:/datasets/code opendronemap/odm --project-path="/datasets" --split 10 --split-overlap 5 --sm-cluster http://localhost:3000
I've found that the export_visualsfm stage in OpenSfM creates semi-broken nvm files that cannot be opened by mve if filenames contain spaces, further complicating the issue.
I proposed a resolution here: https://github.com/simonfuhrmann/mve/issues/501 which would work around it for ODM's usage of mve, but it's an inherent limitation of the nvm file format, which separates data elements by spaces without any escape character support.
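To illustrate why a space-separated format cannot carry such names, here is a minimal sketch of a whitespace-tokenizing reader. The 11-token camera line layout (name, focal length, four quaternion values, three camera-center values, radial distortion, trailing 0) is a simplified assumption for illustration, and `parse_camera_line` is a hypothetical function, not code from mve or OpenSfM.

```python
def parse_camera_line(line):
    # Assumed simplified layout: name, focal, 4 quaternion values,
    # 3 camera-center values, radial distortion, trailing 0 (11 tokens).
    tokens = line.split()
    if len(tokens) != 11:
        raise ValueError("expected 11 tokens, got %d" % len(tokens))
    # The first token is taken as the filename; the rest must be numeric.
    return tokens[0], [float(t) for t in tokens[1:]]


if __name__ == "__main__":
    good = "DJI_0104.JPG 3000 1 0 0 0 0 0 0 0 0"
    bad = "F - DJI_0104.JPG 3000 1 0 0 0 0 0 0 0 0"
    print(parse_camera_line(good)[0])
    try:
        parse_camera_line(bad)
    except ValueError as err:
        # The spaces in the name add two extra tokens to the line.
        print("rejected: %s" % err)
```

A reader has no way to tell where the filename ends, because nothing in the line marks the extra spaces as part of the name.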
Thus, I'm also proposing a resolution in OpenSfM, which is currently producing the invalid NVM files: https://github.com/mapillary/OpenSfM/issues/554 where the approach instead is to ensure OpenSfM creates consistent and readable output files.
Notably, this likely also breaks mvstex in the same way as mve.
Right, NVM files cannot contain spaces; for that reason I'm hesitant to handle it in ODM as well. You would also need to patch mvs-texturing.
Perhaps ODM should output a warning and quit (or implement a flag to automatically rename files that have spaces). Note the latter can get tricky to implement correctly, as GCPs could then be broken. There's also the possibility of collisions: "A A.JPG" and "AA.JPG" in the same folder would conflict if you simply remove the space, and if you use a replacement character the conflict moves to whatever the separator is, e.g. "A~A.JPG" and "A A.JPG".
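The collision risk can be checked up front. The following is a minimal sketch, assuming a renaming rule that replaces each space with a fixed string (empty by default); `find_collisions` is a hypothetical helper and not something ODM provides.

```python
from collections import defaultdict


def find_collisions(filenames, replacement=""):
    """Group original names by their renamed form and keep only the clashes."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name.replace(" ", replacement)].append(name)
    return {new: olds for new, olds in groups.items() if len(olds) > 1}


if __name__ == "__main__":
    # Removing the space makes both names identical.
    print(find_collisions(["A A.JPG", "AA.JPG", "B.JPG"]))
    # Using "~" as the separator just moves the clash.
    print(find_collisions(["A~A.JPG", "A A.JPG"], replacement="~"))
```

A rename flag would need to either refuse to run when this returns anything, or use a scheme that cannot collide.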
I use ReNamer to prevent image collisions on Windows.
My basic template is EXIF Date & Camera Model:
2020-02-27 10.26.53 [X-T1], for instance.
Something similar could be used with exiv2 to generate a more space/character friendly image (re)naming schema.
EXIF date and Make, maybe?
20200227102653_GoPro?
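As a rough sketch of that naming scheme, assuming the EXIF timestamp and camera make have already been read by some other tool: `exif_name` is a hypothetical helper, and the default ".JPG" extension is an assumption. EXIF stores timestamps in the form "YYYY:MM:DD HH:MM:SS".

```python
import re
from datetime import datetime


def exif_name(date_time_original, make, extension=".JPG"):
    # Parse the EXIF timestamp format "YYYY:MM:DD HH:MM:SS".
    stamp = datetime.strptime(date_time_original, "%Y:%m:%d %H:%M:%S")
    # Drop anything in the make that is not a letter or digit.
    safe_make = re.sub(r"[^A-Za-z0-9]", "", make)
    return "%s_%s%s" % (stamp.strftime("%Y%m%d%H%M%S"), safe_make, extension)


if __name__ == "__main__":
    print(exif_name("2020:02:27 10:26:53", "GoPro"))
```

Two images taken by the same camera within the same second would still get the same name, so a counter or sub-second field would be needed for burst captures.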
From a purely practical 'I want all inputs to work'-point of view, it seems the simplest place to implement a renaming strategy is OpenSfM, since it produces the broken NVM files. Otherwise we'll still have OpenSfM users finding out about this the hard way in the future.
To that end, I proposed https://github.com/mapillary/OpenSfM/pull/556 which would maintain existing filenames unless they cause issues, in which case spaces will be removed and the name will be suffixed with a hash of the original filename.
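The idea can be sketched roughly as follows. This is not the code from the PR: the choice of SHA-1, the 8-character digest length, and placing the suffix before the extension are all assumptions, and `safe_name` is a hypothetical name.

```python
import hashlib
import os


def safe_name(filename, digest_len=8):
    # Names without spaces are left untouched.
    if " " not in filename:
        return filename
    stem, ext = os.path.splitext(filename)
    # Hash the ORIGINAL name so that "A A.JPG" and "A  A.JPG" stay distinct
    # even though both collapse to the same stem once spaces are removed.
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()[:digest_len]
    return "%s_%s%s" % (stem.replace(" ", ""), digest, ext)


if __name__ == "__main__":
    for name in ["DJI_0104.JPG", "F - DJI_0104.JPG", "A A.JPG", "AA.JPG"]:
        print("%r -> %r" % (name, safe_name(name)))
```

Because untouched names keep their original form and renamed ones carry a hash suffix, "A A.JPG" no longer maps onto "AA.JPG".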
Mm, that could work! Let's see what the OpenSfM folks say.
@linusmartensson could you open a PR also for our fork at https://github.com/OpenDroneMap/OpenSfM targeting the 099 branch? It might take a while for OpenSfM to merge contributions (they are pretty busy these days), but we could test the changes in ODM sooner that way.
Since the PR is in on our fork and OpenSfM doesn't want it, should we perhaps add an explicit warning on our end that spaces in filenames are bad, then close this?
I think we should handle filenames with spaces, and I think our PR does it; the 099 branch will be included in the next release of ODM, so I'd wait to close this until we have released and confirmed it works.
We try to contribute as much as we can to OpenSfM, but there are features that sometimes are too specific to ODM (this wouldn't be the only one, multi-camera support is another example).
This should be fixed as of 0.9.9. Please update, and re-open this issue if the problem persists. :pray:
Thanks!