Pytorch3d: PointsRenderer gives error in loss.backward()

Created on 14 May 2020 · 6 Comments · Source: facebookresearch/pytorch3d

Hi,

I tried to use PointsRenderer to compute the loss for my model. I can visualize the result and compute the loss, but the "loss.backward()" step fails with the error "grad_distsmust be contiguous." To check whether I had done anything wrong, I modified the "Camera position optimization" tutorial to work with point clouds and PointsRenderer instead of MeshRenderer, but it still gives the same error. When I set the device to CPU, no error is raised, but all grads become NaN.

Also, PointsRenderer raises an error if the point cloud has no features, so I use "torch.ones_like" to set them.

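    import torch
    from pytorch3d.renderer import (
        AlphaCompositor,
        OpenGLPerspectiveCameras,
        PointsRasterizationSettings,
        PointsRasterizer,
        PointsRenderer,
        look_at_view_transform,
    )
    from pytorch3d.structures import Pointclouds

    # Assumed device index; the error appears on GPU, while on CPU the grads become NaN.
    device = torch.device("cuda:0")

    raster_settings = PointsRasterizationSettings(image_size=64, radius=0.06, points_per_pixel=8)
    camera = OpenGLPerspectiveCameras(device=device)
    compositor = AlphaCompositor()
    rasterizer = PointsRasterizer(camera, raster_settings)
    PCrenderer = PointsRenderer(rasterizer, compositor)

    distance = 1
    elevation = 30.0
    azimuth = 45.0
    R, T = look_at_view_transform(distance, elevation, azimuth, device=device)
    # My point cloud model has the same shape; this random cloud is just for testing.
    testPC = torch.rand([1, 1024, 3], device=device)
    # PointsRenderer needs per-point features, so use all-ones placeholders.
    PCs = Pointclouds(points=testPC, features=torch.ones_like(testPC, device=device))
    image = PCrenderer(PCs, R=R, T=T)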

All 6 comments

I looked through the issues and found a similar implementation that seems to have worked in the previous release. I tried that code in Colab and it gives the same error.
https://github.com/facebookresearch/pytorch3d/issues/143#issue-596037879

Hi @congun
First, the issue you linked here seems unrelated to yours, so I am not sure what parallel you are drawing.

I tried to reproduce your issue. This is what I ran:

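    import numpy as np
    import torch
    from PIL import Image
    from pytorch3d.renderer import (
        AlphaCompositor,
        OpenGLPerspectiveCameras,
        PointsRasterizationSettings,
        PointsRasterizer,
        PointsRenderer,
        look_at_view_transform,
    )
    from pytorch3d.structures import Pointclouds

    device = torch.device("cuda:1")
    torch.cuda.set_device(device)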

    raster_settings = PointsRasterizationSettings(image_size=64, radius=0.06, points_per_pixel=8)
    camera = OpenGLPerspectiveCameras(device=device)
    compositor = AlphaCompositor()
    rasterizer = PointsRasterizer(camera, raster_settings)
    PCrenderer = PointsRenderer(rasterizer, compositor)

    distance = 1
    elevation = 30.0
    azimuth = 45.0
    R, T = look_at_view_transform(distance, elevation, azimuth, device=device)
    testPC = torch.rand([1, 1024, 3], device=device)
    testPC.requires_grad = True
    PCs = Pointclouds(points=testPC , features=torch.ones_like(testPC, device=device))
    image = PCrenderer(PCs, R=R, T=T)
    loss = image.sum()
    loss.backward()
    assert torch.isfinite(testPC.grad).all()

    filename = "/tmp/output/github_issue.png"
    rgb = image[0]
    Image.fromarray((rgb.detach().cpu().numpy() * 255).astype(np.uint8)).save(filename)

The image from the forward pass is shown here:
[image: github_issue]

Since I can't reproduce your error I can't quite understand what is wrong as I am able to perform the forward and backward pass. If you can't run the code I pasted, maybe try to build from source instead of the release. Otherwise, the error might be in some other part of your code.

Thank you for the answer. Your code still gives the same error on my system and also in Colab. Here is the Colab notebook with outputs:
https://colab.research.google.com/drive/13gUWHHxorGdFYUsk6STtJ5X-i8ynr8N8?usp=sharing

About the issue I linked: the problems are unrelated. I tested the code from that issue because it managed to perform the backward pass with PointsRenderer. I only meant to show that the problem is not in my implementation. I should have been clearer.

I upgraded pytorch3d to the latest version and got the same problem.

I reproduced this on my system: https://colab.research.google.com/drive/13gUWHHxorGdFYUsk6STtJ5X-i8ynr8N8?usp=sharing

Same error: 'RuntimeError: grad_distsmust be contiguous.'

I downgraded to this commit and the problem was solved: https://github.com/facebookresearch/pytorch3d/tree/6207c359b11b7834a416eec7f2c6be317f078259
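
For anyone who wants to try the same downgrade, here is a minimal sketch of how the pip command for a pinned commit could be built. It assumes pip's "git+URL@commit" syntax and a machine that can build pytorch3d from source; the helper name pip_install_command is hypothetical and not part of pytorch3d.

```python
import sys
from typing import List

REPO_URL = "https://github.com/facebookresearch/pytorch3d.git"
# Commit reported in this thread as not showing the error.
KNOWN_GOOD_COMMIT = "6207c359b11b7834a416eec7f2c6be317f078259"


def pip_install_command(commit: str, repo_url: str = REPO_URL) -> List[str]:
    """Build (but do not run) a pip command that installs the repo at `commit`.

    Hypothetical helper for illustration only.
    """
    return [sys.executable, "-m", "pip", "install", f"git+{repo_url}@{commit}"]


# Print the command so it can be inspected before running it by hand
# (or passed to subprocess.run).
print(" ".join(pip_install_command(KNOWN_GOOD_COMMIT)))
```

The command is only printed here because a source build can take a long time and needs a compiler toolchain matching the installed PyTorch.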

I think you folks are correct. We recently added (https://github.com/facebookresearch/pytorch3d/commit/c3d636dc8c68cb2fd36b32d8dcc4bad27e2a551b) a CONTIGUOUS_CHECK for grad tensors in our CUDA kernels, even though the grad tensors are converted to be contiguous afterwards. My master was from before that commit, which is why I could not reproduce the error. This check is the source of the issue, and we are removing it shortly.
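
To illustrate what a contiguity check tests, here is a minimal pure-Python sketch of the row-major rule for a shape and its strides (in elements). It is an illustration only, not the pytorch3d or PyTorch implementation, and the function name is_row_major_contiguous is hypothetical. A gradient that is a broadcast view, for example one value expanded to the full image shape with stride 0, would fail such a check even though the values are valid.

```python
from typing import Sequence


def is_row_major_contiguous(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """Return True if `strides` describe a dense row-major layout for `shape`.

    Hypothetical helper for illustration; strides are counted in elements.
    """
    if len(shape) != len(strides):
        raise ValueError("shape and strides must have the same length")
    if any(size == 0 for size in shape):
        return True  # an empty tensor is treated as contiguous
    expected = 1  # stride the innermost dimension must have
    for size, stride in zip(reversed(shape), reversed(strides)):
        if size == 1:
            continue  # a dimension of size 1 may carry any stride
        if stride != expected:
            return False
        expected *= size
    return True


# Dense 64x64x4 image-like layout.
print(is_row_major_contiguous((64, 64, 4), (256, 4, 1)))
# Same shape, but one value broadcast everywhere (all strides 0).
print(is_row_major_contiguous((64, 64, 4), (0, 0, 0)))
# A 2x3 matrix viewed as its 3x2 transpose.
print(is_row_major_contiguous((3, 2), (1, 3)))
```

In PyTorch itself, the corresponding calls are Tensor.is_contiguous() and Tensor.contiguous(), the latter returning a dense copy when the layout is not already dense.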

This should now be fixed by the update in https://github.com/facebookresearch/pytorch3d/commit/3fef5068955e3628948236e7eea7d98f4e37b11e. Please reopen the issue if you have any further problems!
