Describe the bug
Hi, we were using the following code to align multiple point cloud pairs in a loop. As far as I remember, the code was borrowed from one of open3d's examples. Everything worked before, but after some updates (e.g. PyTorch from 1.2 to 1.4, CUDA to 10.1), the same code now leaks memory: usage grows very fast, by about 1 GB within seconds for point clouds of around 14000 points. We did some tracking, and the issue seems to come from the function registration_ransac_based_on_feature_matching. Has anybody seen a similar issue, or can anyone provide some hints?
def align_pc(pcloud1, pcloud2, reflectance1=None, reflectance2=None):
    pc1 = open3d.geometry.PointCloud()
    pc1.points = open3d.utility.Vector3dVector(pcloud1)
    pc2 = open3d.geometry.PointCloud()
    pc2.points = open3d.utility.Vector3dVector(pcloud2)
    if reflectance1 is not None and reflectance2 is not None:
        max_ref = max(reflectance1.max(), reflectance2.max())
        reflectance1 = reflectance1 / float(max_ref)
        reflectance2 = reflectance2 / float(max_ref)
        ref1 = np.expand_dims(reflectance1, axis=1)
        ref1 = np.repeat(ref1, 3, axis=1)
        pc1.colors = open3d.utility.Vector3dVector(ref1)
        ref2 = np.expand_dims(reflectance2, axis=1)
        ref2 = np.repeat(ref2, 3, axis=1)
        pc2.colors = open3d.utility.Vector3dVector(ref2)
    voxel_size = 0.2
    normal_radius = 4 * voxel_size
    open3d.geometry.PointCloud.estimate_normals(
        pc1, search_param=open3d.geometry.KDTreeSearchParamHybrid(
            radius=normal_radius, max_nn=30))
    open3d.geometry.PointCloud.estimate_normals(
        pc2, search_param=open3d.geometry.KDTreeSearchParamHybrid(
            radius=normal_radius, max_nn=30))
    feature_radius = 10 * voxel_size
    pc1_fpfh = open3d.registration.compute_fpfh_feature(
        pc1, open3d.geometry.KDTreeSearchParamHybrid(
            radius=feature_radius, max_nn=100))
    pc2_fpfh = open3d.registration.compute_fpfh_feature(
        pc2, open3d.geometry.KDTreeSearchParamHybrid(
            radius=feature_radius, max_nn=100))
    # 3D feature based registration.
    dist_threshold = 3 * voxel_size
    result = open3d.registration.registration_ransac_based_on_feature_matching(
        source=pc1, target=pc2, source_feature=pc1_fpfh, target_feature=pc2_fpfh,
        max_correspondence_distance=dist_threshold,
        estimation_method=open3d.registration.TransformationEstimationPointToPoint(False),
        ransac_n=4,
        checkers=[open3d.registration.CorrespondenceCheckerBasedOnEdgeLength(0.9),
                  open3d.registration.CorrespondenceCheckerBasedOnDistance(dist_threshold)],
        criteria=open3d.registration.RANSACConvergenceCriteria(4000000, 500))
Expected behavior
When using the above function align_pc in a loop to align point cloud pairs, the memory should not keep increasing.
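One way to check whether memory keeps growing across iterations is to sample the process's peak resident set size after each call. The following is a minimal sketch using only the standard library; the helper names peak_rss_mib and track_peak_rss are hypothetical, the resource module is Unix-only, and the workload shown is a stand-in rather than the real align_pc call.

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(fn, iterations):
    """Call fn() repeatedly and record peak RSS (MiB) after each call."""
    samples = []
    for _ in range(iterations):
        fn()
        samples.append(peak_rss_mib())
    return samples


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with e.g.
    # lambda: align_pc(pcloud1, pcloud2) to watch the real loop.
    samples = track_peak_rss(lambda: sum(range(10000)), iterations=5)
    for i, mib in enumerate(samples):
        print(f"iteration {i}: peak RSS {mib:.1f} MiB")
```

Peak RSS never decreases, so a leak-free loop should plateau after the first few iterations, while a leak shows up as a steady climb.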
Environment (please complete the following information):
Could you provide a minimal self-contained script and data, where running the script can reproduce the issue?
Hi @yxlao ,
it seems the memory leak is related to the order in which open3d is imported. In my case, importing it before numpy causes no problem, while reversing the order makes memory grow. The following code reproduces this on my machine. As I mentioned, the issue appeared after some updates, so I am not sure it is reproducible on every machine. The data used in the example can be found here:
https://drive.google.com/drive/folders/1-sE1OcNgCIFYiX-p-NLluLy3NGkvou19?usp=sharing
import open3d
import numpy as np
# import numpy as np
# import open3d

def align(pcloud1, pcloud2):
    pc1 = open3d.geometry.PointCloud()
    pc1.points = open3d.utility.Vector3dVector(pcloud1)
    pc2 = open3d.geometry.PointCloud()
    pc2.points = open3d.utility.Vector3dVector(pcloud2)
    voxel_size = 0.2
    normal_radius = 4 * voxel_size
    open3d.geometry.PointCloud.estimate_normals(
        pc1, search_param=open3d.geometry.KDTreeSearchParamHybrid(
            radius=normal_radius, max_nn=30))
    open3d.geometry.PointCloud.estimate_normals(
        pc2, search_param=open3d.geometry.KDTreeSearchParamHybrid(
            radius=normal_radius, max_nn=30))
    feature_radius = 10 * voxel_size
    pc1_fpfh = open3d.registration.compute_fpfh_feature(
        pc1, open3d.geometry.KDTreeSearchParamHybrid(
            radius=feature_radius, max_nn=100))
    pc2_fpfh = open3d.registration.compute_fpfh_feature(
        pc2, open3d.geometry.KDTreeSearchParamHybrid(
            radius=feature_radius, max_nn=100))
    # 3D feature based registration.
    dist_threshold = 3 * voxel_size
    result = open3d.registration.registration_ransac_based_on_feature_matching(
        source=pc1, target=pc2, source_feature=pc1_fpfh, target_feature=pc2_fpfh,
        max_correspondence_distance=dist_threshold,
        estimation_method=open3d.registration.TransformationEstimationPointToPoint(False),
        ransac_n=4,
        checkers=[open3d.registration.CorrespondenceCheckerBasedOnEdgeLength(0.9),
                  open3d.registration.CorrespondenceCheckerBasedOnDistance(dist_threshold)],
        criteria=open3d.registration.RANSACConvergenceCriteria(4000000, 500))
    result_icp2 = open3d.registration.registration_icp(
        pc1, pc2, 0.25, result.transformation,
        open3d.registration.TransformationEstimationPointToPlane())
    return result_icp2.transformation, result_icp2.fitness, result_icp2.inlier_rmse

if __name__ == '__main__':
    pc1 = np.load('pc1.npy')
    pc2 = np.load('pc2.npy')
    for i in range(200):
        print(i)
        align(pc1, pc2)
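Since import order can only be set once per process, comparing the two orders is easiest when each runs in a fresh interpreter. Below is a rough sketch of that idea; run_with_import_order is a hypothetical helper, and the stdlib modules json and math are stand-ins for open3d and numpy so the sketch runs without either installed.

```python
import subprocess
import sys


def run_with_import_order(modules, body=""):
    """Import `modules` in the given order in a fresh interpreter, then run `body`."""
    lines = [f"import {name}" for name in modules]
    lines.append(body)
    source = "\n".join(lines)
    # A new process guarantees nothing was imported earlier in a different order.
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in modules (assumption); swap in ["open3d", "numpy"] and
    # ["numpy", "open3d"], with the alignment loop as `body`, to compare.
    for order in (["json", "math"], ["math", "json"]):
        result = run_with_import_order(order, body="print('ok')")
        print(order, result.returncode, result.stdout.strip())
```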
Hello, I got the same issue.
@yxlao Hey, is there any update on this issue? The problem still exists in the latest version 0.11.1.
I don't see a memory leak on my system with the above test code, whatever the import order is, but there seem to be more memory leaks elsewhere, as the "reconstruction system" tutorial fails on a machine with 256 GB of RAM (keeping the default config parameters). open3d.pipelines.integration.ScalableTSDFVolume.integrate seems to be responsible for this.
This is probably the same memory leak reported in https://github.com/intel-isl/Open3D/issues/2107
Note that 0.10.0 works fine, so if you have this issue, I would recommend downgrading to 0.10.0.
...
Fragment 002 / 013 :: integrate rgbd frame 287 (88 of 100).
Traceback (most recent call last):
File "run_system.py", line 68, in <module>
make_fragments.run(config)
File ".../Open3D/examples/python/reconstruction_system/make_fragments.py", line 182, in run
Parallel(n_jobs=MAX_THREAD)(delayed(process_single_fragment)(
File ".../lib/python3.8/site-packages/joblib/parallel.py", line 1061, in __call__
self.retrieve()
File ".../lib/python3.8/site-packages/joblib/parallel.py", line 940, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File ".../lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
return future.result(timeout=timeout)
File ".../lib/python3.8/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File ".../lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
raise self._exception
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
The exit codes of the workers are {SIGKILL(-9)}
This problem can be reproduced with open3d==0.11.1, numpy==1.19.1. Investigating.
Update:
With BUILD_CUDA_MODULE=OFF, the issue is still observable. For now, we cannot reproduce the problem if we create a new Conda environment. The problem is only partially reproducible in an old environment that was created before October (pip changed the way it resolves conflicts after October 2020).
@rui2016 @jbjeong @shiyoung77 please kindly let us know if you can still reproduce the problem in a newly created environment. Thanks!
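When comparing an old environment against a freshly created one, it can help to dump the installed versions of the packages involved. The sketch below uses the standard library's importlib.metadata (Python 3.8+); installed_versions is a hypothetical helper name, and the package list is only an example.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example package list (assumption); extend it with anything else of interest.
    for name, version in installed_versions(["open3d", "numpy"]).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this in both environments and diffing the output gives a quick view of which package versions differ.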
I observe no memory leak in a fresh pyenv with open3d-master installed, so I guess the leak was caused by some faulty numpy or something.
@rui2016 @jbjeong @shiyoung77 can you also test with the valgrind command below (if you're on linux)?
$ PYTHONMALLOC=malloc valgrind --tool=memcheck --leak-check=yes --leak-resolution=high ~/.pyenv/versions/open3d-master/bin/python3.8 open3d_issue_1787.py
...
==19826== LEAK SUMMARY:
==19826== definitely lost: 72,917 bytes in 323 blocks
==19826== indirectly lost: 520 bytes in 8 blocks
==19826== possibly lost: 1,216,903 bytes in 4,801 blocks
==19826== still reachable: 3,056,092 bytes in 26,984 blocks
==19826== of which reachable via heuristic:
==19826== stdstring : 14,844 bytes in 165 blocks
==19826== newarray : 32 bytes in 1 blocks
==19826== suppressed: 0 bytes in 0 blocks
@theNded @devernay I pip installed open3d in a new conda environment and now the issue is gone. It seems to be a problem of internal package management by the old version of pip/conda. Thanks for looking into that.
This issue will be closed for now. Please feel free to reopen, or continue discussions in the aforementioned issues that are still open.
I hit the same problem with open3d==0.9.0 and numpy==1.19.2. Reinstalling open3d and upgrading numpy from 1.19.2 to 1.19.4 solved it.