Hi,
Thanks for sharing your great work!
I was wondering - can one render a mesh and get per-pixel depth too, like e.g. mesh-renderer allows you to do?
Thanks a lot!
Z.
Hi @zozobozo, yes, you can get the per-pixel depth (for the top K faces that overlap each pixel) from the output of the mesh rasterizer: fragments.zbuf is a tensor of shape (N, H, W, K).
To retrieve this output, you can initialize a rasterizer and use it on its own, e.g.
rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=raster_settings
)
fragments = rasterizer(meshes)
Or, if you want the full image as well as the depth, you can extend the MeshRenderer class to create your own renderer which also returns fragments.zbuf, e.g.
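from typing import Tuple

import torch
import torch.nn as nn

class MeshRendererWithDepth(nn.Module):
    def __init__(self, rasterizer, shader):
        super().__init__()
        self.rasterizer = rasterizer
        self.shader = shader

    def forward(self, meshes_world, **kwargs) -> Tuple[torch.Tensor, torch.Tensor]:
        # Rasterize once, shade the fragments, and return the image together with the depth buffer.
        fragments = self.rasterizer(meshes_world, **kwargs)
        images = self.shader(fragments, meshes_world, **kwargs)
        return images, fragments.zbuf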
We also have a setting to enable perspective correct depth interpolation (set raster_settings.perspective_correct = True).
If this answers your question, please close this issue! :)
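As a minimal sketch of working with the (N, H, W, K) zbuf described above: to my knowledge the K entries per pixel are ordered nearest-first and pixels with no face are filled with -1, so under those assumptions a single depth map plus a validity mask can be pulled out as below. The helper name nearest_depth is made up for illustration, and it works on a NumPy array (for a tensor, something like fragments.zbuf.cpu().numpy() first).

```python
import numpy as np

def nearest_depth(zbuf):
    """Return (depth, mask) for the closest face at each pixel.

    zbuf: array of shape (N, H, W, K), assumed sorted nearest-first along K,
          with -1 marking pixels that no face covers (assumption).
    """
    zbuf = np.asarray(zbuf)
    depth = zbuf[..., 0]   # closest face per pixel, shape (N, H, W)
    mask = depth >= 0      # True where a face was actually hit
    return depth, mask

# Tiny synthetic buffer: one pixel covered by two faces, the rest empty.
zbuf = np.full((1, 2, 2, 2), -1.0)
zbuf[0, 0, 0] = [1.5, 2.0]
depth, mask = nearest_depth(zbuf)
```

The mask is what you would use to exclude background pixels before comparing against a target depth map.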
thanks for the quick answer! i'll give it a try!
Best,
Z.
@nikhilaravi I tried rendering depth alongside the images, but found they are not in the same coordinate frame: the y-axis seems to be flipped. Is this a bug, or should I flip it manually?
Best
@wangsen1312 this y flip issue has now been fixed - see #78 for further discussion.
@nikhilaravi Got it, Nice work!
Is this kind of depth image differentiable? @nikhilaravi
@Bob-Yeah yes it should be differentiable.
I'm actually kind of curious now. For the zbuf output (and we optimize with respect to another 2.5D depth map target), is it differentiable ONLY at pixels where there is a face? Or is this like the SoftSilhouetteShader where the boundaries can also be optimized?
Following this.
I am using this shader
shader = SoftPhongShader(
    cameras=cameras,
    lights=lights,
    device=device
)
and this
rasterizer = MeshRendererWithDepth(rasterizer = rasterizer, shader = shader)
fragments = rasterizer(mesh)
fragments = fragments[1]
and my output is this.

I was wondering how I can get something with a white gradient on a dark background, like this:

What determines the color choice (purple and green), and how can I change it?
When you do fragments = fragments[1] you have a tensor, not a picture. You have done _something_ to make the picture, and your question pertains to that _something_, not pytorch3d. My guess is you have plotted a one-channel image with matplotlib, and it has defaulted to the viridis color scheme. You can change to a different color scheme, or manually convert to a 3 channel RGB image. E.g. expand the tensor from (H,W) to (H,W,3) (I think) so it becomes a 3-channel grayscale image.
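The conversion suggested above can be sketched in NumPy, assuming depth is a 2D array with -1 at background pixels. The function name depth_to_gray and the min_brightness parameter are made up for illustration; the nearest surface maps to white, the farthest to a dim gray, and the background to black.

```python
import numpy as np

def depth_to_gray(depth, background=-1.0, min_brightness=0.2):
    """Map a (H, W) depth map to a (H, W, 3) grayscale image in [0, 1]."""
    depth = np.asarray(depth, dtype=np.float32)
    valid = depth > background
    gray = np.zeros(depth.shape, dtype=np.float32)   # background stays black
    if valid.any():
        near, far = depth[valid].min(), depth[valid].max()
        scale = max(float(far - near), 1e-8)          # avoid division by zero
        # Nearest surface -> 1.0 (white), farthest -> min_brightness.
        gray[valid] = 1.0 - (1.0 - min_brightness) * (depth[valid] - near) / scale
    # Repeat the single channel three times to get an RGB-shaped image.
    return np.repeat(gray[..., None], 3, axis=2)

example = np.array([[-1.0, 1.0], [2.0, 3.0]])
image = depth_to_gray(example)
```

If the picture is only for display, passing the one-channel array to matplotlib with plt.imshow(depth, cmap="gray") changes the color scheme without any conversion.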
We are landing a change that adds MeshRendererWithFragments to the renderer library. It returns both images and fragments, and you will be able to use it in the future.
Thank you so much
Hi @nikhilaravi, sorry to bother you. I ran into a problem converting the zbuf to a point cloud, and I would be very grateful for any advice.
The problem is that the point cloud calculated from the rendered zbuf is deformed. My understanding is that the zbuf is just like a depth image, so I should be able to convert it to a point cloud using the intrinsic matrix, but the result comes out wrong.
The original mesh is like this:

The point cloud generated from the zbuf looks like this:


Here is the code I use:
import numpy as np
import matplotlib.pyplot as plt
from pytorch3d.io import load_objs_as_meshes, load_obj
from pytorch3d.renderer import (
    FoVPerspectiveCameras, look_at_view_transform,
    RasterizationSettings, BlendParams,
    MeshRenderer, MeshRasterizer, HardPhongShader
)
import open3d as o3d
width = 512
height = 512
fov = 60
obj_path = './data/examples/models/model_normalized.obj'
verts, faces, aux = load_obj(obj_path)
meshes = load_objs_as_meshes([obj_path])
R, T = look_at_view_transform(2.7, 10, 20)
cameras = FoVPerspectiveCameras(R=R, T=T, fov=fov)
raster_settings = RasterizationSettings(
    image_size=(height, width),
    blur_radius=0.0,
    faces_per_pixel=1,
    # max_faces_per_bin=20000
)
rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=raster_settings
)
depth = rasterizer(meshes).zbuf.cpu().squeeze().numpy()
cx = width / 2
cy = height / 2
fx = cx / np.tan(fov / 2)
fy = cy / np.tan(fov / 2)
row = height
col = width
# TODO check whether u or v is the column. depth[v, u] ???
v = np.array(list(np.ndindex((row, col)))).reshape(row, col, 2)[:, :, 0]
u = np.array(list(np.ndindex((row, col)))).reshape(row, col, 2)[:, :, 1]
X_ = (u - cx) / fx
X_ = X_[depth > -1] # exclude infinity
Y_ = (v - cy) / fy * depth
Y_ = Y_[depth > -1] # exclude infinity
depth_ = depth[depth > -1] # exclude infinity
X = X_ * depth_
Y = Y_ * depth_
Z = depth_
coords_g = np.stack([X, Y, Z]) # shape: num_points * 3
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(coords_g.T)
o3d.visualization.draw_geometries([pcd])
Any suggestion will be helpful. Please reply at your convenience.
Thanks!
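Two things in the code above look like likely causes of the deformation: np.tan expects radians but receives fov = 60 in degrees, and Y_ is multiplied by depth twice (once when Y_ is computed and again when Y is computed). Below is a minimal sketch of a plain pinhole back-projection with both points addressed. It assumes the zbuf holds camera-space z with -1 at empty pixels, the same field of view in both directions, pixel centers at half-integer offsets, and a conventional +x right / +y down image frame. PyTorch3D's own camera frame has +X pointing left and +Y up, so the signs of x and y may need flipping to line up with the original mesh. The function name depth_to_points is made up for illustration.

```python
import numpy as np

def depth_to_points(depth, fov_deg, background=-1.0):
    """Back-project a (H, W) depth map to an (M, 3) array of camera-space points."""
    depth = np.asarray(depth, dtype=np.float64)
    height, width = depth.shape
    cx, cy = width / 2.0, height / 2.0
    half_fov = np.radians(fov_deg) / 2.0        # np.tan works in radians
    fx = cx / np.tan(half_fov)
    fy = cy / np.tan(half_fov)
    # v indexes rows, u indexes columns, so depth[v, u] is the pixel at (u, v).
    v, u = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    valid = depth > background                  # drop pixels with no face
    z = depth[valid]
    x = (u[valid] + 0.5 - cx) / fx * z          # multiply by depth exactly once
    y = (v[valid] + 0.5 - cy) / fy * z
    return np.stack([x, y, z], axis=1)          # shape: (num_points, 3)

# Synthetic check: a flat plane at z = 2 with one empty pixel.
plane = np.full((4, 4), 2.0)
plane[0, 0] = -1.0
points = depth_to_points(plane, fov_deg=90.0)
```

The result is already (num_points, 3), so it can go into o3d.utility.Vector3dVector without a transpose. PyTorch3D cameras also expose an unproject_points method, which may be a more reliable route than hand-built intrinsics, since it follows the library's own conventions.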