PyTorch3D: CUDA out of memory during rendering and optimisation of a textured mesh on an 11 GB GPU

Created on 5 Jul 2020 · 9 Comments · Source: facebookresearch/pytorch3d

Hi all,

I am running out of CUDA memory while rendering. I am using Colab, and nvidia-smi reports an 11 GB GPU.

I have tried the suggestions from several discussions, including the CUDA out-of-memory section of the PyTorch FAQ.

The batch size is only 1.
The rendered image size is 256.
The model has only 3 parameters, the same as in your tutorial on camera position optimisation, but I am using a textured mesh.

Here is the runtime error.

--------------------------------------------------------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)
in ()
9 print(i)
10 optimizer.zero_grad()
---> 11 loss, _ = model()
12 loss.backward()
13 optimizer.step()

6 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)

in forward(self)
34 T = -torch.bmm(R.transpose(1, 2), self.camera_position[None, :, None])[:, :, 0] # (1, 3)
35
---> 36 image = self.renderer(meshes_world=self.meshes.clone(), R=R, T=T)
37
38 # Calculate the silhouette loss

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/pytorch3d/renderer/mesh/renderer.py in forward(self, meshes_world, **kwargs)
65 pix_to_face=fragments.pix_to_face,
66 )
---> 67 images = self.shader(fragments, meshes_world, **kwargs)
68
69 return images

/usr/local/lib/python3.6/dist-packages/pytorch3d/renderer/mesh/shader.py in forward(self, fragments, meshes, **kwargs)
226
227 def forward(self, fragments, meshes, **kwargs) -> torch.Tensor:
--> 228 texels = interpolate_texture_map(fragments, meshes)
229 cameras = kwargs.get("cameras", self.cameras)
230 lights = kwargs.get("lights", self.lights)

/usr/local/lib/python3.6/dist-packages/pytorch3d/renderer/mesh/texturing.py in interpolate_texture_map(fragments, meshes)
75
76 pixel_uvs = pixel_uvs * 2.0 - 1.0
---> 77 texture_maps = torch.flip(texture_maps, [2]) # flip y axis of the texture map
78 if texture_maps.device != pixel_uvs.device:
79 texture_maps = texture_maps.to(pixel_uvs.device)

RuntimeError: CUDA out of memory. Tried to allocate 4.69 GiB (GPU 0; 11.17 GiB total capacity; 5.98 GiB already allocated; 629.88 MiB free; 10.15 GiB reserved in total by PyTorch)

----------------- on running nvidia-smi ------------------------

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+


rohitdavas
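
The figures in the RuntimeError above can be read with a little arithmetic. The sketch below only restates the numbers from the message; the variable names and the interpretation in the comments are my own assumptions about how PyTorch's caching allocator reports memory, so treat it as a rough reading rather than a diagnosis.

```python
GIB = 2**30
MIB = 2**20

# Figures copied from the RuntimeError in the post above.
total_capacity = 11.17 * GIB
allocated = 5.98 * GIB    # memory backing live tensors
reserved = 10.15 * GIB    # memory held by PyTorch (live tensors plus cache)
free = 629.88 * MIB       # memory on the device not reserved by PyTorch
requested = 4.69 * GIB    # size of the allocation that failed

cached_unused = reserved - allocated     # reserved but not backing live tensors
needed_in_total = allocated + requested  # live tensors plus the new request

print(f"cached but unused: {cached_unused / GIB:.2f} GiB")    # 4.17 GiB
print(f"live + requested:  {needed_in_total / GIB:.2f} GiB")  # 10.67 GiB
```

On this reading, the single 4.69 GiB request is larger than both the free memory and the reserved-but-unused memory (which may also be fragmented), even though live tensors plus the request would nominally fit in 11.17 GiB. The interesting question is therefore what asks for 4.69 GiB in one piece.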

Most helpful comment

Reducing the faces_per_pixel is indeed a way to reduce memory. However, with one image at 256x256 I don't think we expect to see OOM issues.
In this note we provide a formula to compute memory usage for forward and backward: https://github.com/facebookresearch/pytorch3d/blob/master/docs/notes/renderer.md. Can you verify that this formula and what you see are the same?

gkioxari on 7 Jul 2020

👍2
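
As a rough cross-check of the rasterizer's share of memory, here is a minimal sketch. It assumes the rasterizer stores, for every image, pixel and face slot, one int64 face index (pix_to_face) plus five float32 values (zbuf, dists and three barycentric coordinates), and that the backward pass needs a float32 gradient buffer of the same size for the float outputs. The function name and this accounting are my assumptions, not code taken from the linked note.

```python
def rasterizer_memory_bytes(n, h, w, k):
    """Rough forward + backward memory for the rasterizer outputs (assumed layout)."""
    elems = n * h * w * k      # one entry per image, pixel and face slot
    float_values = 5           # zbuf (1) + dists (1) + bary_coords (3), all float32
    forward = elems * (8 + float_values * 4)  # int64 pix_to_face + float outputs
    backward = elems * float_values * 4       # gradients for the float outputs only
    return forward + backward

# Settings from the question: batch size 1, 256x256 image, faces_per_pixel=100.
total = rasterizer_memory_bytes(n=1, h=256, w=256, k=100)
print(f"{total / 2**20:.0f} MiB")  # prints "300 MiB"
```

Under these assumptions the rasterizer buffers need about 300 MiB, far below the 4.69 GiB allocation that failed. That is consistent with the traceback, which fails later, in the texture lookup.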

All 9 comments

This does not sound right; it should not happen. I don't have access to your code, but from the snippets you provided it seems you are doing some unnecessary cloning, for example image = self.renderer(meshes_world=self.meshes.clone(), R=R, T=T). Every time you clone in PyTorch you create copies of tensors, and a mesh stores many tensors.

gkioxari on 6 Jul 2020

👍1
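
To illustrate the point about clone, here is a small sketch using plain tensors. The tensor is a hypothetical stand-in for mesh vertices and tensor_bytes is a helper I made up for the example; the same doubling would apply to each tensor a mesh holds.

```python
import torch

def tensor_bytes(t: torch.Tensor) -> int:
    """Bytes needed to store the tensor's elements."""
    return t.element_size() * t.nelement()

verts = torch.zeros(10_000, 3)  # float32 stand-in for mesh vertices
verts_copy = verts.clone()      # allocates new storage and copies the data

# The clone lives at a different address, so memory use for this tensor doubles.
same_storage = verts_copy.data_ptr() == verts.data_ptr()
print(same_storage)             # False
print(tensor_bytes(verts))      # 120000
```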

I got your point.

I removed the clone operation, but the issue remains.

One fix I found is to reduce faces_per_pixel from 100 to a lower value and to reduce the rendered image size.

This coming weekend I will sit down again and check whether I am using memory unnecessarily.

rohitdavas on 7 Jul 2020

👍1
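
One way to check where the memory goes is to measure the peak CUDA memory of a single step. The helper below is a minimal sketch: peak_memory_mib is a name I made up, and the commented usage line assumes a model that returns (loss, image) as in the traceback above.

```python
import torch

def peak_memory_mib(fn):
    """Run fn() and return the peak CUDA memory in MiB, or None without a GPU."""
    if not torch.cuda.is_available():
        return None
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()  # peak now starts at current usage
    fn()
    torch.cuda.synchronize()
    # The peak includes tensors that were already alive before fn() ran.
    return torch.cuda.max_memory_allocated() / 2**20

# Hypothetical usage: wrap one forward/backward step of the model.
# peak = peak_memory_mib(lambda: model()[0].backward())
```

Comparing this number for different values of faces_per_pixel and image size should show which setting dominates.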

gkioxari on 7 Jul 2020

👍2

@rohitdavas did you manage to resolve this issue?

nikhilaravi on 30 Jul 2020

@nikhilaravi Sorry, I got busy. I have not found time to start the project again. But as soon as I start, I will first look into this.

rohitdavas on 31 Jul 2020

👍1

@rohitdavas any updates on this issue? If not please close it!

nikhilaravi on 22 Aug 2020

Sorry, I am not able to start work on this. I will reopen if I find something useful. Thanks for your patience.

As I commented above, lowering the image resolution and faces_per_pixel helps.

rohitdavas on 2 Sep 2020
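
A possible explanation for why faces_per_pixel matters so much here: the traceback fails in interpolate_texture_map at torch.flip(texture_maps, ...), so the allocation that failed is a copy of the texture maps. The sketch below assumes, and this is not confirmed anywhere in the thread, that the texture map is replicated once per face slot before it is sampled. The 2048x2048 texture size is hypothetical, since the texture resolution was never stated.

```python
def replicated_texture_gib(k, tex_h, tex_w, channels=3, bytes_per_value=4):
    """Size in GiB of a float32 texture map copied k times (assumed behaviour)."""
    return k * tex_h * tex_w * channels * bytes_per_value / 2**30

# Hypothetical 2048x2048 RGB texture; faces_per_pixel=100 as in the thread.
print(replicated_texture_gib(k=100, tex_h=2048, tex_w=2048))  # 4.6875
print(replicated_texture_gib(k=1, tex_h=2048, tex_w=2048))    # 0.046875
```

With these assumed numbers the copy would be about 4.69 GiB, the same size as the failed allocation, and it would shrink linearly with faces_per_pixel.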

I have a very similar problem with a GPU of the same size, batch size 1, image size 256x256. Similar code worked on a larger GPU.

self.shader = SoftPhongShader(device=device, cameras=cameras, lights=lights)
img = self.shader(fragments, meshes_world, **kwargs)

The GPU runs out of memory when generating the image with this shader.

Reducing faces_per_pixel to 1 did not help.

natanielruiz on 19 Oct 2020

Hello, is there any way I can get help on the above? @nikhilaravi

Thank you :)

natanielruiz on 2 Dec 2020
