I installed the latest pytorch3d 0.4 and ran the fit_textured_mesh tutorial. In the "Mesh prediction via silhouette rendering" section, the loss becomes NaN after around 200 iterations (I can reproduce this in 4 out of 5 runs).
I also tried pytorch3d 0.3 (built from source in December), and this issue never happened. The problem may therefore come from a recent change to the mesh rasterizer (MeshRasterizer).
Install pytorch 1.7.1
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
Install pytorch3d following the Linux wheels instructions
pip install pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu102_pyt171/download.html
Then simply run the fit_textured_mesh tutorial and you should be able to reproduce the problem. I get NaN in 4 out of 5 runs.
Best,
Songyou
Thanks @pengsongyou for reporting this issue! We'll look into it asap.
@pengsongyou I was able to reproduce the error. To resolve the issue in the tutorial add perspective_correct=False in the RasterizationSettings for the rasterizer. In v0.4 we changed this to be automatically inferred from the camera type but there seems to be some instability due to this. We will debug what is happening!
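For reference, the workaround could look roughly like the sketch below. The image size, blur radius and faces-per-pixel values are illustrative assumptions rather than values confirmed in this thread, and the variable name raster_settings_silhouette is hypothetical; the only change that matters is the explicit perspective_correct=False.

```python
import numpy as np
from pytorch3d.renderer import RasterizationSettings

# Illustrative values (assumptions); use whatever your silhouette renderer already has.
sigma = 1e-4

raster_settings_silhouette = RasterizationSettings(
    image_size=128,
    blur_radius=np.log(1.0 / 1e-4 - 1.0) * sigma,
    faces_per_pixel=50,
    # Workaround from this thread: do not let v0.4 infer this from the camera type.
    perspective_correct=False,
)
```

These settings are then passed to MeshRasterizer as usual; nothing else in the tutorial needs to change for the workaround.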
Great, it does seem to work now, thanks a lot! I have always used the perspective camera model, but with 0.3 I never needed to set perspective_correct=False because the issue did not occur. Could you explain why we need to set it explicitly to False in 0.4?
Thanks so much in advance!
Best,
Songyou
@pengsongyou the perspective_correct setting ensures that the barycentric coordinates are correct under a perspective camera. Other differentiable renderers like SoftRas/NMR/DIB-R do not apply this correction and assume that the perspective effects are small. In the previous version of PyTorch3D this was an optional setting, but in the most recent release we decided to set it based on the type of the camera. We will investigate why this is causing NaNs in the optimization.
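To make the correction concrete, here is a minimal NumPy sketch of the standard perspective-correct interpolation formula, in which each screen-space weight is divided by its vertex depth and the result is renormalised. This illustrates the general technique, not PyTorch3D's actual implementation; the function name perspective_correct_bary and the sample values are assumptions for the example.

```python
import numpy as np

def perspective_correct_bary(bary_screen, z_verts):
    """Convert screen-space barycentric weights to perspective-correct ones.

    bary_screen: (3,) weights measured in the image plane.
    z_verts:     (3,) camera-space depths of the triangle's vertices.
    """
    bary_screen = np.asarray(bary_screen, dtype=np.float64)
    z_verts = np.asarray(z_verts, dtype=np.float64)
    weighted = bary_screen / z_verts      # b_i / z_i
    return weighted / weighted.sum()      # renormalise so the weights sum to 1

# A pixel halfway (in the image) between a near vertex (z=1) and a far vertex (z=3).
bary_screen = np.array([0.5, 0.5, 0.0])
z_verts = np.array([1.0, 3.0, 2.0])

bary_3d = perspective_correct_bary(bary_screen, z_verts)
depth_corrected = float(bary_3d @ z_verts)    # depth using corrected weights
depth_naive = float(bary_screen @ z_verts)    # depth using uncorrected weights
print(bary_3d, depth_corrected, depth_naive)
```

With these numbers the corrected weights shift toward the near vertex and the interpolated depth is 1.5 instead of the uncorrected 2.0. Note that the renormalisation divides by a sum that depends on both the weights and the depths, so non-finite inputs propagate straight into the output.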
Hi, I have encountered a similar NaN error in the rasterizer :/. I just want to provide another example that might help the team debug. For now, setting perspective_correct=False or using an orthographic camera solves this particular case (thanks Nikhila and Georgia).
The NaN seems to happen when a rendered face is parallel to the ray (possibly related to the earlier issue #110).
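One plausible way such a face produces non-finite values can be sketched in plain NumPy. A triangle viewed exactly edge-on projects to a zero-area sliver, so barycentric weights computed as ratios of signed areas become 0/0. This is a simplified illustration under a pinhole-camera assumption, not PyTorch3D's rasterizer code, and the helper names (project, edge_fn, screen_bary) are hypothetical.

```python
import numpy as np

def project(verts, focal=1.0):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) image points."""
    return focal * verts[:, :2] / verts[:, 2:3]

def edge_fn(a, b, p):
    """Twice the signed area of the 2D triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def screen_bary(tri2d, p):
    """Barycentric weights of 2D point p with respect to a projected triangle."""
    v0, v1, v2 = tri2d
    area = edge_fn(v0, v1, v2)
    # Suppress NumPy warnings so the non-finite result can be inspected.
    with np.errstate(divide="ignore", invalid="ignore"):
        w0 = edge_fn(v1, v2, p) / area
        w1 = edge_fn(v2, v0, p) / area
        w2 = edge_fn(v0, v1, p) / area
    return np.array([w0, w1, w2])

# Triangle facing the camera: projects to a proper 2D triangle.
front = np.array([[-1.0, -1.0, 2.0], [1.0, -1.0, 2.0], [0.0, 1.0, 2.0]])
# Triangle in the plane x = 0, which contains the camera centre: seen edge-on.
edge_on = np.array([[0.0, -1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 0.0, 4.0]])

bary_front = screen_bary(project(front), np.array([0.0, 0.0]))
bary_edge_on = screen_bary(project(edge_on), np.array([0.0, 0.1]))
print(bary_front, bary_edge_on)
```

In this sketch the front-facing triangle yields finite weights that sum to 1, while the edge-on triangle yields NaN for every weight. A pixel slightly off the projected line would give infinities instead, which matters when a nonzero blur_radius pulls nearby pixels into the computation.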
I have attached the triangle that produced NaN fragments in the file triangle.pkl, together with my script:
import pickle
import numpy as np
from pytorch3d.renderer import (
    BlendParams,
    MeshRasterizer,
    PerspectiveCameras,
    RasterizationSettings,
)

fname = 'triangle.pkl'
device = 'cuda:0'
with open(fname, 'rb') as fp:
    obj = pickle.load(fp)
triangle = obj['tri']
triangle = triangle.to(device)
cameras = PerspectiveCameras(100., device=device)  # focal length 100
blend_params = BlendParams(sigma=1e-4, gamma=1e-4)
dist_eps = 1e-6
raster_settings = RasterizationSettings(
    image_size=224,
    blur_radius=np.log(1. / dist_eps - 1.) * blend_params.sigma,
    faces_per_pixel=100,
    # perspective_correct=False,  # this seems to solve the NaN error, at least for this case
)
rasterizer = MeshRasterizer(cameras=cameras, raster_settings=raster_settings).to(device)
fragments = rasterizer(triangle)
print(fragments.zbuf.isnan().any(), fragments.bary_coords.isnan().any())
# True, True for me
The triangle looks like this in 3D:

and this in screen space:

visualization code:
from mpl_toolkits import mplot3d  # registers the '3d' projection on older matplotlib
import numpy as np
import matplotlib.pyplot as plt
import pickle

fname = '/tmp/transfer/vis/triangle.pkl'
with open(fname, 'rb') as fp:
    triangle = pickle.load(fp)
verts = triangle['verts']
verts2d = triangle['verts_screen']

def refract_verts(verts):
    # Append the first vertex again so the plotted outline is closed.
    verts = np.vstack([verts, verts[0:1]])
    return verts

verts = refract_verts(verts)
verts2d = refract_verts(verts2d)

# Triangle in 3D
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.set_xlabel('X axis')
ax.set_ylabel('Y axis')
ax.set_zlabel('Z axis')
ax.plot3D(verts[:, 0], verts[:, 1], verts[:, 2], 'gray')

# Triangle in screen space
fig = plt.figure()
plt.plot(verts2d[:, 0], verts2d[:, 1])
plt.show()
Thanks and good luck.