Add MeshRasterizerOpenGL

Summary:
Adding MeshRasterizerOpenGL, a faster alternative to MeshRasterizer. The new rasterizer follows the ideas from "Differentiable Surface Rendering via Non-Differentiable Sampling".

The new rasterizer is 20x faster on a 2M-face mesh (try pose optimization on Nefertiti from https://www.cs.cmu.edu/~kmcrane/Projects/ModelRepository/!). The larger the mesh, the larger the speedup.

There are two main disadvantages:
* The new rasterizer works with an OpenGL backend, so it requires pycuda.gl and pyopengl to be installed (though we avoided writing any C++ code; everything is in Python!)
* The new rasterizer is non-differentiable. However, you can still differentiate the rendering function if you use it with the new SplatterPhongShader which we recently added to PyTorch3D (see the original paper cited above).

Reviewed By: patricklabatut, jcjohnson

Differential Revision: D37698816

fbshipit-source-id: 54d120639d3cb001f096237807e54aced0acda25
This commit is contained in:
Krzysztof Chalupka
2022-07-22 15:52:50 -07:00
committed by Facebook GitHub Bot
parent 36edf2b302
commit cb49550486
66 changed files with 1556 additions and 337 deletions


@@ -224,11 +224,13 @@ class EGLContext:
         """
         self.lock.acquire()
         egl.eglMakeCurrent(self.dpy, self.surface, self.surface, self.context)
-        yield
-        egl.eglMakeCurrent(
-            self.dpy, egl.EGL_NO_SURFACE, egl.EGL_NO_SURFACE, egl.EGL_NO_CONTEXT
-        )
-        self.lock.release()
+        try:
+            yield
+        finally:
+            egl.eglMakeCurrent(
+                self.dpy, egl.EGL_NO_SURFACE, egl.EGL_NO_SURFACE, egl.EGL_NO_CONTEXT
+            )
+            self.lock.release()

     def get_context_info(self) -> Dict[str, Any]:
         """
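The point of the hunk above is that, without try/finally, an exception raised inside the `with` block would skip the eglMakeCurrent reset and leave the lock held forever, deadlocking all later renders. A minimal sketch of the same pattern, with a plain threading.Lock standing in for the EGL context state (the name `active_and_locked` is an assumption for the illustration):

```python
import threading
from contextlib import contextmanager

lock = threading.Lock()

@contextmanager
def active_and_locked():
    # Acquire the lock, hand control to the caller's `with` body, then
    # always release - even if the body raises. Mirrors the try/finally
    # added around `yield` in the EGLContext diff above.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

# The lock is released even when the body raises.
try:
    with active_and_locked():
        raise RuntimeError("render failed")
except RuntimeError:
    pass
```

Without the try/finally, the `RuntimeError` would propagate before `lock.release()` ran, and every later caller would block on `lock.acquire()`.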
@@ -418,5 +420,29 @@ def _init_cuda_context(device_id: int = 0):
     return cuda_context


+def _torch_to_opengl(torch_tensor, cuda_context, cuda_buffer):
+    # CUDA access to the OpenGL buffer is only allowed within a map-unmap block.
+    cuda_context.push()
+    mapping_obj = cuda_buffer.map()
+
+    # data_ptr points to the OpenGL shader storage buffer memory.
+    data_ptr, sz = mapping_obj.device_ptr_and_size()
+
+    # Copy the torch tensor to the OpenGL buffer directly on device.
+    cuda_copy = cuda.Memcpy2D()
+    cuda_copy.set_src_device(torch_tensor.data_ptr())
+    cuda_copy.set_dst_device(data_ptr)
+    cuda_copy.width_in_bytes = cuda_copy.src_pitch = cuda_copy.dst_pitch = (
+        torch_tensor.shape[1] * 4
+    )
+    cuda_copy.height = torch_tensor.shape[0]
+    cuda_copy(False)
+
+    # Unmap and pop the cuda context to make sure OpenGL won't interfere with
+    # PyTorch ops down the line.
+    mapping_obj.unmap()
+    cuda_context.pop()
+
+
+# Initialize a global _DeviceContextStore. Almost always we will only need a single one.
+global_device_context_store = _DeviceContextStore()
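The Memcpy2D in the hunk above performs a pitched, row-by-row device-to-device copy: each of `height` rows transfers `width_in_bytes` bytes (here `shape[1]` float32 values at 4 bytes each), with `src_pitch`/`dst_pitch` giving the byte stride between consecutive rows. As a rough CPU-side illustration of that addressing (NumPy buffers standing in for device memory; `pitched_copy` is a hypothetical helper, not part of pycuda):

```python
import numpy as np

def pitched_copy(src, dst, height, width_in_bytes, src_pitch, dst_pitch):
    # Copy `height` rows of `width_in_bytes` bytes each from src to dst.
    # The pitches are the byte strides between consecutive rows; they can
    # exceed width_in_bytes when rows are padded for alignment, which is
    # why Memcpy2D takes them separately from the row width.
    for row in range(height):
        s = row * src_pitch
        d = row * dst_pitch
        dst[d:d + width_in_bytes] = src[s:s + width_in_bytes]

# A (3, 4) float32 tensor: width_in_bytes = 4 columns * 4 bytes = 16.
src = np.arange(12, dtype=np.float32)
src_bytes = src.tobytes()
dst = bytearray(len(src_bytes))
pitched_copy(src_bytes, dst, height=3, width_in_bytes=16,
             src_pitch=16, dst_pitch=16)
out = np.frombuffer(bytes(dst), dtype=np.float32)
```

In `_torch_to_opengl` the source and destination rows are contiguous, so all three of `width_in_bytes`, `src_pitch`, and `dst_pitch` collapse to the same value, which is why the diff assigns them in a single chained expression.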