Natively, on a DO 16.04 droplet with 24 cores and 192GB RAM ([email protected] to access)
N/A
Split-merge crashes with MemoryError while computing the mask raster for the orthophoto of the first submodel (trace below). Somewhat earlier, there's a warning for a memory leak (see trace).
No crash!
The incantation to launch was:
nohup python run.py Msimbazi --split 0 --split-overlap 0 --ignore-gsd --depthmap-resolution 1000 --orthophoto-resolution 3 --dem-resolution 12 --dsm --texturing-nadir-weight 32 --mesh-size 600000 --mesh-octree-depth 11 --end-with split --gcp /mnt/odmphotos/Msimbazi/gcp_list.txt --camera-lens brown &
I got greedy and attempted to get a 3cm resolution on the orthophoto (in order to match the resolution of a Pix4D processing run of the same dataset).
The memory leak warning:
```
Building topology for vector map
Registering primitives...
Building areas...
 100%
Attaching islands...
 100%
Attaching centroids...
 100%
Using tiles processing for edge detection
Creating edge map
Finding cutlines in both directions
Creating vector polygons
WARNING: Categories will be unique sequence, raster values will be lost.
WARNING: Memory leak: 4 points are still in use
WARNING: Number of incorrect boundaries: 365
WARNING: Number of incorrect boundaries: 365
Erasing temporary files...
```
And the stack trace for the crash:
```
v.out.ogr complete. 4368 features (Polygon type) written to
format).
Execution of
Cleaning up default sqlite database ...
Cleaning up temporary files...
[INFO] Generated cutline file: /mnt/odmphotos/Msimbazi/submodels/submodel_0000/odm_orthophoto/grass_cutline_tmpdir/cutline.gpkg --> /mnt/odmphotos/Msimbazi/submodels/submodel_0000/odm_orthophoto/cutline.gpkg
[INFO] Computing mask raster: /mnt/odmphotos/Msimbazi/submodels/submodel_0000/odm_orthophoto/odm_orthophoto_cut.tif
Traceback (most recent call last):
File "/home/odm/ODM//run.py", line 57, in <module>
app.execute()
File "/home/odm/ODM/stages/odm_app.py", line 92, in execute
self.first_stage.run()
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 445, in run
self.process(self.args, outputs)
File "/home/odm/ODM/stages/odm_orthophoto.py", line 149, in process
blend_distance=20, only_max_coords_feature=True)
File "/home/odm/ODM/opendm/orthophoto.py", line 90, in compute_mask_raster
dist_t = ndimage.distance_transform_edt(alpha_band)
File "/usr/local/lib/python2.7/dist-packages/scipy/ndimage/morphology.py", line 2194, in distance_transform_edt
dt = ft - numpy.indices(input.shape, dtype=ft.dtype)
MemoryError
Traceback (most recent call last):
File "run.py", line 57, in <module>
app.execute()
File "/home/odm/ODM/stages/odm_app.py", line 92, in execute
self.first_stage.run()
File "/home/odm/ODM/opendm/types.py", line 464, in run
self.next_stage.run(outputs)
File "/home/odm/ODM/opendm/types.py", line 445, in run
self.process(self.args, outputs)
File "/home/odm/ODM/stages/splitmerge.py", line 150, in process
system.run(" ".join(map(quote, argv)), env_vars=os.environ.copy())
File "/home/odm/ODM/opendm/system.py", line 76, in run
raise Exception("Child returned {}".format(retcode))
Exception: Child returned 1
```
The full console log is available at [email protected]:/home/odm/ODM/nohup_run_4.out (you can pull it down with scp using the password shared during the discussion of #1085).
The dataset is in [email protected]:/mnt/odmdata/Msimbazi. This includes a folder of images, as well as a gcp_list.txt and an image_groups.txt file.
Note that the DO droplet has now been resized down to the smallest, cheapest one, and the volume (drive) the data is on is only 300GB; both can be resized up for testing if needed but I don't want to incur the ongoing expense otherwise.
The leak is related to GRASS, but the crash is due to numpy's calculation of the euclidean map for orthophoto merging:
dt = ft - numpy.indices(input.shape, dtype=ft.dtype)
MemoryError
I'm not sure what the memory requirements are for the indices function, but you might be hitting memory limits. Add more RAM.
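For a rough sense of scale, here is a back-of-the-envelope sketch. `numpy.indices(shape, dtype)` returns an array of shape `(ndim,) + shape`, so for a 2-D raster it holds two full-size grids. The 4-byte item size assumes `ft` is int32 (my reading of scipy's `distance_transform_edt`, not checked against this exact version), and the 50000 x 100000 raster is a made-up size, not the actual orthophoto.

```python
def indices_nbytes(shape, itemsize=4):
    """Bytes needed by numpy.indices(shape) for a given item size.

    numpy.indices returns one full-size grid per dimension, so the
    result has len(shape) * prod(shape) elements.
    """
    n_elements = len(shape)
    for dim in shape:
        n_elements *= dim
    return n_elements * itemsize


# Hypothetical raster size, for illustration only (assumption).
rows, cols = 50000, 100000
gib = indices_nbytes((rows, cols)) / float(2 ** 30)
print("numpy.indices alone: %.1f GiB" % gib)
```

If that reading is right, `ft` and the result of the subtraction would each be the same size again, and the later conversion to float64 would double it, so the peak could be several times the figure printed here.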
Ok, though it's not possible, as far as I know, to add more RAM than this to a Digital Ocean machine (already at 192GB, the max offered). Perhaps other cloud providers can go beyond 192GB.
Out of curiosity, what is the size in pixels of the orthophoto that's failing?
Seems to be 3cm for the orthophoto and 12cm for the DSM.
It could be what you are thinking, IMO, except for running the volume computation on the whole DSM.
Yes, 3cm ortho and 12cm DEM.
The same dataset with 5cm ortho and 15cm DEM does not trigger this issue.
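That would fit with pixel count growing as the inverse square of the resolution. A minimal sketch of the arithmetic follows; the 3000 m x 1500 m extent is a hypothetical placeholder, not the real footprint of this dataset, and both function names are made up for illustration.

```python
import math


def raster_shape(width_m, height_m, resolution_cm):
    """Return (rows, cols) for a raster covering the given extent."""
    # Work in centimetres to avoid float noise from dividing by 0.03 etc.
    cols = int(math.ceil(width_m * 100.0 / resolution_cm))
    rows = int(math.ceil(height_m * 100.0 / resolution_cm))
    return rows, cols


def pixel_count_ratio(coarse_cm, fine_cm):
    """How many times more pixels the finer resolution needs."""
    return (float(coarse_cm) / fine_cm) ** 2


# Hypothetical extent (assumption), compared at the two resolutions discussed.
for res_cm in (5, 3):
    rows, cols = raster_shape(3000, 1500, res_cm)
    print("%d cm: %d x %d = %d pixels" % (res_cm, rows, cols, rows * cols))
print("3 cm needs %.2fx the pixels of 5 cm" % pixel_count_ratio(5, 3))
```

Whatever the true extent is, the ratio between the two runs is the same: roughly 2.8 times as many pixels at 3cm as at 5cm, and any full-size array in the mask computation grows by that factor.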
I might be wrong here, but as far as I know there's no automatic calibration of the split size or overlap, so "--split 0 --split-overlap 0" will produce a division by zero (x/0) around https://github.com/mapillary/OpenSfM/blob/905550672d041f63c2d38258b6359ed3d9b473ae/opensfm/commands/create_submodels.py#L97, most likely leaving you with just a single submodel. That could be the cause of your memory problems, since the crash is seen in your first submodel: ".../Msimbazi/submodels/submodel_0000/..."
Perhaps you could confirm the number of splits you have in your submodels directory?
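One quick way to check, as a sketch: count the `submodel_*` directories. The path below is taken from the trace above; the helper name `list_submodels` is made up for illustration.

```python
from pathlib import Path


def list_submodels(submodels_dir):
    """Return the sorted names of submodel_* directories, or [] if none."""
    root = Path(submodels_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("submodel_*") if p.is_dir())


# Path as it appears in the trace; adjust for your project.
names = list_submodels("/mnt/odmphotos/Msimbazi/submodels")
print("%d submodel(s): %s" % (len(names), ", ".join(names)))
```

A single entry (`submodel_0000`) would point to the split not having taken effect.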
If you determine this might be the problem, try setting a --split that cuts your map into sizeable chunks, then a split overlap that ensures you can match between your splits. Note that too big or too small overlaps can both cause their own set of issues since the alignment between splits is rigid.
These data were run with an image_groups.txt file that specified which images to use in each submodel, negating the need for the --split and --split-overlap parameters (see https://docs.opendronemap.org/large.html#local-split-merge). Those values were set to 0 just to set them; in fact, there were plenty of images and plenty of overlap.
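To sanity-check the grouping, something like the sketch below could tally images per group. It assumes each line of image_groups.txt is an image name followed by a group name separated by whitespace, which is my understanding of the docs linked above; the sample lines are invented, not taken from this dataset.

```python
from collections import Counter


def count_images_per_group(lines):
    """Tally images per group from 'image_name group_name' lines."""
    counts = Counter()
    for line in lines:
        # Split on the last run of whitespace so the group is the final token.
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        counts[parts[1]] += 1
    return counts


# Invented sample content (assumption), standing in for image_groups.txt.
sample = ["1.JPG A", "2.JPG A", "3.JPG B", ""]
for group, n in sorted(count_images_per_group(sample).items()):
    print("%s: %d images" % (group, n))
```

For the real file, pass an open file handle instead of the sample list, for example `count_images_per_group(open("image_groups.txt"))`.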
Related: #1113