Pytorch3d: object depth information

Created on 9 Feb 2020 · 13 Comments · Source: facebookresearch/pytorch3d

Hi,

Thanks for sharing your great work!
I was wondering - can one render a mesh and get per-pixel depth too, like e.g. mesh-renderer allows you to do?

Thanks a lot!
Z.

zozobozo

👍1

Most helpful comment

Hi @zozobozo, yes, you can get the per-pixel depth (for the top K faces which overlap with that pixel) from the output of the mesh rasterizer. fragments.zbuf is a tensor of shape (N, H, W, K).

To retrieve this output, you can initialize a rasterizer and use only that, e.g.

rasterizer = MeshRasterizer(
    cameras=cameras, 
    raster_settings=raster_settings
)

fragments = rasterizer(meshes)

Or, if you want the full image as well as the depth, you can extend the MeshRenderer class to create your own renderer that also returns fragments.zbuf, e.g.

from typing import Tuple

import torch
import torch.nn as nn

class MeshRendererWithDepth(nn.Module):
    def __init__(self, rasterizer, shader):
        super().__init__()
        self.rasterizer = rasterizer
        self.shader = shader

    def forward(self, meshes_world, **kwargs) -> Tuple[torch.Tensor, torch.Tensor]:
        # Rasterize once, then reuse the fragments for shading and for depth.
        fragments = self.rasterizer(meshes_world, **kwargs)
        images = self.shader(fragments, meshes_world, **kwargs)
        return images, fragments.zbuf

We also have a setting to enable perspective correct depth interpolation (set raster_settings.perspective_correct = True).
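
Whichever route produces the zbuf, a common next step is to reduce it to a single depth map per image. The sketch below is only an illustration: it uses a synthetic NumPy array in place of fragments.zbuf.cpu().numpy(), and the helper name nearest_depth is hypothetical, not part of PyTorch3D. It assumes that slot 0 along K holds the closest face and that pixels not covered by any face are filled with -1.

```python
import numpy as np

def nearest_depth(zbuf: np.ndarray, background: float = -1.0):
    """Return (depth, mask) for the closest face at each pixel.

    zbuf is expected to have shape (N, H, W, K). Slot 0 along K is assumed
    to be the closest face, and uncovered pixels are assumed to hold -1.
    """
    depth = zbuf[..., 0]        # (N, H, W): keep only the nearest face
    mask = depth > background   # True where some face covers the pixel
    return depth, mask

# Synthetic stand-in for fragments.zbuf: one 2x3 image with K = 2.
zbuf = np.full((1, 2, 3, 2), -1.0)
zbuf[0, 0, 1] = [2.5, 3.0]      # pixel covered by two faces
zbuf[0, 1, 2] = [1.7, -1.0]     # pixel covered by one face

depth, mask = nearest_depth(zbuf)
print(depth.shape, int(mask.sum()))
```

Keeping the mask alongside the depth makes it easy to ignore background pixels later, for example when computing a loss or a point cloud.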

If this answers your question, please close this issue! :)

nikhilaravi on 9 Feb 2020

❤4 👍1

All 13 comments

thanks for the quick answer! i'll give it a try!
Best,
Z.

zozobozo on 10 Feb 2020

@nikhilaravi I have tried rendering depth together with images, but found that they are not in the same coordinate frame; the y-axis seems to be flipped. Is this an issue, or should I change it manually?
Best

wangsen1312 on 24 Feb 2020

@wangsen1312 this y flip issue has now been fixed - see #78 for further discussion.

nikhilaravi on 26 Mar 2020

👍1

@nikhilaravi Got it, nice work!

wangsen1312 on 26 Mar 2020

Is this kind of depth image differentiable? @nikhilaravi

Bob-Yeah on 17 May 2020

@Bob-Yeah yes it should be differentiable.

nikhilaravi on 17 May 2020

I'm actually kind of curious now. For the zbuf output (and we optimize with respect to another 2.5D depth map target), is it differentiable ONLY at pixels where there is a face? Or is this like the SoftSilhouetteShader where the boundaries can also be optimized?

aluo-x on 31 Oct 2020

Following this.

I am using this shader

shader = SoftPhongShader(
    cameras=cameras,
    lights= lights,
    device=device
)

and this

rasterizer = MeshRendererWithDepth(rasterizer = rasterizer, shader = shader)
fragments = rasterizer(mesh)
fragments = fragments[1]

and my output is this.

I was wondering how I can extract something with a white gradient and a dark background, like this

What determines the color choice (purple and green), and how can I change it?

albertotono on 2 Nov 2020

👍1

When you do fragments = fragments[1] you have a tensor, not a picture. You have done _something_ to make the picture, and your question pertains to that _something_, not pytorch3d. My guess is you have plotted a one-channel image with matplotlib, and it has defaulted to the viridis color scheme. You can change to a different color scheme, or manually convert to a 3 channel RGB image. E.g. expand the tensor from (H,W) to (H,W,3) (I think) so it becomes a 3-channel grayscale image.
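
The manual conversion suggested above could look roughly like the following. This is a sketch under assumptions, not PyTorch3D functionality: the helper name depth_to_gray_rgb is hypothetical, the input is an (H, W) NumPy depth map, and background pixels are assumed to hold -1.

```python
import numpy as np

def depth_to_gray_rgb(depth: np.ndarray, background: float = -1.0) -> np.ndarray:
    """Map an (H, W) depth map to an (H, W, 3) uint8 grayscale image.

    Near surfaces come out bright, far surfaces dark, background black.
    """
    mask = depth > background
    gray = np.zeros(depth.shape, dtype=np.float64)
    if mask.any():
        near, far = depth[mask].min(), depth[mask].max()
        span = max(far - near, 1e-8)                # avoid division by zero
        gray[mask] = 1.0 - (depth[mask] - near) / span
    rgb = np.repeat(gray[..., None], 3, axis=-1)    # (H, W) -> (H, W, 3)
    return (rgb * 255).round().astype(np.uint8)

# Tiny example: one background pixel and three covered pixels.
depth = np.array([[-1.0, 2.0], [3.0, 4.0]])
image = depth_to_gray_rgb(depth)
```

Note that with this normalization the farthest covered pixel ends up as dark as the background. If the goal is only a different look in matplotlib, passing cmap="gray" to plt.imshow on the one-channel array avoids the conversion altogether.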

bottler on 2 Nov 2020

👍4

We are now landing a change that adds MeshRendererWithFragments to the renderer library. It returns images, fragments, and you can use it in the future.

theschnitz on 5 Nov 2020

❤2 👍1

Thank you so much

albertotono on 6 Nov 2020

Hi @nikhilaravi, sorry to bother you. I ran into a problem while trying to convert the zbuf to a point cloud. I would be very grateful if you could give me some advice.

The problem is that the point cloud calculated from the rendered zbuf is deformed. From my understanding, the zbuf is just like a depth image, so I should be able to convert it to a point cloud using the intrinsic matrix. But the result is wrong.

The original mesh is like this:

The point clouds generated from zbuf is like this:

Here is the code I use:

import numpy as np
import matplotlib.pyplot as plt
from pytorch3d.io import load_objs_as_meshes, load_obj
from pytorch3d.renderer import (
    FoVPerspectiveCameras, look_at_view_transform,
    RasterizationSettings, BlendParams,
    MeshRenderer, MeshRasterizer, HardPhongShader
)
import open3d as o3d

width = 512
height = 512
fov = 60
obj_path = './data/examples/models/model_normalized.obj'

verts, faces, aux = load_obj(obj_path)
meshes = load_objs_as_meshes([obj_path])

R, T = look_at_view_transform(2.7, 10, 20)
cameras = FoVPerspectiveCameras(R=R, T=T, fov=fov)

raster_settings = RasterizationSettings(
    image_size=(height, width),
    blur_radius=0.0,
    faces_per_pixel=1,
    # max_faces_per_bin=20000
)

rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=raster_settings
)

depth = rasterizer(meshes).zbuf.cpu().squeeze().numpy()

cx = width / 2
cy = height / 2
fx = cx / np.tan(fov / 2)
fy = cy / np.tan(fov / 2)

row = height
col = width
# TODO check whether u or v is the column. depth[v, u] ???
v = np.array(list(np.ndindex((row, col)))).reshape(row, col, 2)[:, :, 0]
u = np.array(list(np.ndindex((row, col)))).reshape(row, col, 2)[:, :, 1]

X_ = (u - cx) / fx
X_ = X_[depth > -1]  # exclude infinity
Y_ = (v - cy) / fy * depth
Y_ = Y_[depth > -1]  # exclude infinity
depth_ = depth[depth > -1]  # exclude infinity

X = X_ * depth_
Y = Y_ * depth_
Z = depth_

coords_g = np.stack([X, Y, Z])  # shape: num_points * 3
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(coords_g.T)
o3d.visualization.draw_geometries([pcd])

Any suggestion will be helpful. Please reply at your convenience.

Thanks!
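
Two details in the code above would each distort the result regardless of PyTorch3D's conventions: np.tan expects radians while fov is given in degrees, and Y_ is multiplied by depth once on its own line and then again when computing Y. The following is a minimal sketch of the pinhole back-projection the code seems to be aiming for, assuming a square image, a principal point at the image centre, the same field of view along both axes, and -1 in uncovered pixels. The function name backproject is hypothetical. The axis directions are also an assumption: PyTorch3D's camera frame has +X pointing left and +Y pointing up, so X and Y may need a sign flip to match the original mesh, which mirrors the cloud but does not deform it.

```python
import numpy as np

def backproject(depth: np.ndarray, fov_deg: float, background: float = -1.0) -> np.ndarray:
    """Back-project an (H, W) depth map to an (M, 3) array of points.

    Assumes a pinhole camera with the principal point at the image centre
    and the same field of view along both axes.
    """
    height, width = depth.shape
    cx, cy = width / 2.0, height / 2.0
    half_fov = np.deg2rad(fov_deg) / 2.0    # np.tan expects radians
    fx = cx / np.tan(half_fov)
    fy = cy / np.tan(half_fov)

    v, u = np.indices((height, width))      # v: row index, u: column index
    valid = depth > background              # uncovered pixels hold -1
    z = depth[valid]
    x = (u[valid] - cx) / fx * z            # scale by depth exactly once
    y = (v[valid] - cy) / fy * z
    return np.stack([x, y, z], axis=-1)     # shape (M, 3)

# Tiny example: 4x4 depth map with two covered pixels, 90 degree field of view.
depth = np.full((4, 4), -1.0)
depth[0, 0] = 2.0
depth[2, 2] = 2.0
points = backproject(depth, fov_deg=90.0)
```

The returned array already has one point per row, so it can be handed to o3d.utility.Vector3dVector without a transpose.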