Use a consistent case for PyTorch3D

Summary: Use a consistent case for PyTorch3D (matching the logo...): replace all occurrences of PyTorch3d with PyTorch3D across the codebase (including documentation and notebooks)

Reviewed By: wanyenlo, gkioxari

Differential Revision: D20427546

fbshipit-source-id: 8c7697f51434c51e99b7fe271935932c72a1d9b9
Author: Patrick Labatut
Date: 2020-03-17 12:45:38 -07:00
Committed by: Facebook GitHub Bot
Parent: 5d3cc3569a
Commit: 25d2e2c8b7
20 changed files with 54 additions and 54 deletions

View File

@@ -19,7 +19,7 @@ In order to experiment with different approaches, we wanted a modular implementa
Taking inspiration from existing work [[1](#1), [2](#2)], we have created a new, modular, differentiable renderer with **parallel implementations in PyTorch, C++ and CUDA**, as well as comprehensive documentation and tests, with the aim of helping to further research in this field.
-Our implementation decouples the rasterization and shading steps of rendering. The core rasterization step (based on [[2]](#2)) returns several intermediate variables and has an optimized implementation in CUDA. The rest of the pipeline is implemented purely in PyTorch, and is designed to be customized and extended. With this approach, the PyTorch3d differentiable renderer can be imported as a library.
+Our implementation decouples the rasterization and shading steps of rendering. The core rasterization step (based on [[2]](#2)) returns several intermediate variables and has an optimized implementation in CUDA. The rest of the pipeline is implemented purely in PyTorch, and is designed to be customized and extended. With this approach, the PyTorch3D differentiable renderer can be imported as a library.
## <u>Get started</u>
@@ -36,9 +36,9 @@ First, the image is divided into a coarse grid and mesh faces are allocated to t
We additionally introduce a parameter `faces_per_pixel` which allows users to specify the top K faces which should be returned per pixel in the image (as opposed to traditional rasterization which returns only the index of the closest face in the mesh per pixel). The top K face properties can then be aggregated using different methods (such as the sigmoid/softmax approach proposed by Li et al. in SoftRasterizer [[2]](#2)).
-We compared PyTorch3d with SoftRasterizer to measure the effect of both these design changes on the speed of rasterization. We selected a set of meshes of different sizes from ShapeNetV1 core, and rasterized one mesh in each batch to produce images of different sizes. We report the speed of the forward and backward passes.
+We compared PyTorch3D with SoftRasterizer to measure the effect of both these design changes on the speed of rasterization. We selected a set of meshes of different sizes from ShapeNetV1 core, and rasterized one mesh in each batch to produce images of different sizes. We report the speed of the forward and backward passes.
-**Fig 1: PyTorch3d Naive vs Coarse-to-fine**
+**Fig 1: PyTorch3D Naive vs Coarse-to-fine**
This figure shows how the coarse-to-fine strategy for rasterization results in significant speed up compared to naive rasterization for large image size and large mesh sizes.
@@ -49,9 +49,9 @@ For small mesh and image sizes, the naive approach is slightly faster. We advise
Setting `bin_size = 0` will enable naive rasterization. If `bin_size > 0`, the coarse-to-fine approach is used. The default is `bin_size = None`, in which case we set the bin size based on [heuristics](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/renderer/mesh/rasterize_meshes.py#L92).
-**Fig 2: PyTorch3d Coarse-to-fine vs SoftRasterizer**
+**Fig 2: PyTorch3D Coarse-to-fine vs SoftRasterizer**
-This figure shows the effect of the _combination_ of coarse-to-fine rasterization and caching the faces rasterized per pixel returned from the forward pass. For large meshes and image sizes, we again observe that the PyTorch3d rasterizer is significantly faster, noting that the speed is dominated by the forward pass and the backward pass is very fast.
+This figure shows the effect of the _combination_ of coarse-to-fine rasterization and caching the faces rasterized per pixel returned from the forward pass. For large meshes and image sizes, we again observe that the PyTorch3D rasterizer is significantly faster, noting that the speed is dominated by the forward pass and the backward pass is very fast.
In the SoftRasterizer implementation, in both the forward and backward pass, there is a loop over every single face in the mesh for every pixel in the image. Therefore, the time for the full forward plus backward pass is ~2x the time for the forward pass. For small mesh and image sizes, the SoftRasterizer approach is slightly faster.
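Both the `bin_size` and `faces_per_pixel` settings discussed in this hunk are exposed through the `RasterizationSettings` class in PyTorch3D; the following is a hedged sketch (the values are illustrative and defaults may vary across releases):

```python
from pytorch3d.renderer import RasterizationSettings

# Naive rasterization: every face is tested against every pixel.
naive_settings = RasterizationSettings(
    image_size=64,
    faces_per_pixel=100,  # return the top K=100 faces per pixel
    bin_size=0,           # 0 disables the coarse-to-fine path
)

# Coarse-to-fine rasterization with a heuristically chosen bin size.
c2f_settings = RasterizationSettings(
    image_size=512,
    faces_per_pixel=100,
    bin_size=None,        # None selects the bin size from the image size
)
```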
@@ -61,19 +61,19 @@ In the SoftRasterizer implementation, in both the forward and backward pass, the
### 2. Support for Heterogeneous Batches
-PyTorch3d supports efficient rendering of batches of meshes where each mesh has different numbers of vertices and faces. This is done without using padded inputs.
+PyTorch3D supports efficient rendering of batches of meshes where each mesh has different numbers of vertices and faces. This is done without using padded inputs.
-We again compare with SoftRasterizer, which only supports batches of homogeneous meshes. We test two workarounds: 1) a for loop over the meshes in the batch, and 2) padded inputs, and compare both with the native heterogeneous batching support in PyTorch3d.
+We again compare with SoftRasterizer, which only supports batches of homogeneous meshes. We test two workarounds: 1) a for loop over the meshes in the batch, and 2) padded inputs, and compare both with the native heterogeneous batching support in PyTorch3D.
We group meshes from ShapeNet into bins based on the number of faces in the mesh, and sample to compose a batch. We then render images of fixed size and measure the speed of the forward and backward passes.
We tested with a range of increasingly large meshes and bin sizes.
-**Fig 3: PyTorch3d heterogeneous batching compared with SoftRasterizer**
+**Fig 3: PyTorch3D heterogeneous batching compared with SoftRasterizer**
<img src="assets/fullset_batch_size_16.png" width="700"/>
-This shows that for large meshes and large bin width (i.e. more variation in mesh size in the batch) the heterogeneous batching approach in PyTorch3d is faster than either of the workarounds with SoftRasterizer.
+This shows that for large meshes and large bin width (i.e. more variation in mesh size in the batch) the heterogeneous batching approach in PyTorch3D is faster than either of the workarounds with SoftRasterizer.
(settings: batch size = 16, mesh sizes in bins ranging from 500-350k faces, image size = 64, faces per pixel = 100)
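The entry point for heterogeneous batching is the `Meshes` structure; a hedged sketch with two meshes of different sizes (random data for illustration):

```python
import torch
from pytorch3d.structures import Meshes

# Two meshes with different vertex and face counts in a single batch;
# no user-side padding is required.
verts_a = torch.rand(32, 3)
faces_a = torch.randint(32, (60, 3))
verts_b = torch.rand(500, 3)
faces_b = torch.randint(500, (996, 3))

meshes = Meshes(verts=[verts_a, verts_b], faces=[faces_a, faces_b])
print(meshes.num_verts_per_mesh())  # tensor([ 32, 500])
print(meshes.verts_packed().shape)  # torch.Size([532, 3]): packed form used internally
```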
@@ -81,14 +81,14 @@ This shows that for large meshes and large bin width (i.e. more variation in mes
**NOTE: CUDA Memory usage**
-The SoftRasterizer forward CUDA kernel outputs only one `(N, H, W, 4)` FloatTensor, compared with the PyTorch3d rasterizer forward CUDA kernel, which outputs 4 tensors:
+The SoftRasterizer forward CUDA kernel outputs only one `(N, H, W, 4)` FloatTensor, compared with the PyTorch3D rasterizer forward CUDA kernel, which outputs 4 tensors:
- `pix_to_face`, LongTensor `(N, H, W, K)`
- `zbuf`, FloatTensor `(N, H, W, K)`
- `dist`, FloatTensor `(N, H, W, K)`
- `bary_coords`, FloatTensor `(N, H, W, K, 3)`
-where **N** = batch size, **H/W** are image height/width, **K** is the faces per pixel. The PyTorch3d backward pass returns gradients for `zbuf`, `dist` and `bary_coords`.
+where **N** = batch size, **H/W** are image height/width, **K** is the faces per pixel. The PyTorch3D backward pass returns gradients for `zbuf`, `dist` and `bary_coords`.
Returning intermediate variables from rasterization has an associated memory cost. We can calculate the theoretical lower bound on the memory usage for the forward and backward pass as follows:
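A back-of-envelope sketch of that lower bound, using the shapes and dtypes listed above (the batch settings here are illustrative, not from the document):

```python
# Theoretical lower bound on rasterizer output memory for one batch.
N, H, W, K = 8, 256, 256, 100  # illustrative settings

bytes_per_pixel_entry = (
    8        # pix_to_face: int64
    + 4      # zbuf: float32
    + 4      # dist: float32
    + 3 * 4  # bary_coords: 3 float32 values
)
forward_bytes = N * H * W * K * bytes_per_pixel_entry
# The backward pass returns float32 gradients for zbuf, dist and bary_coords.
backward_bytes = N * H * W * K * (4 + 4 + 3 * 4)

print(f"forward : {forward_bytes / 2**30:.2f} GiB")   # ~1.37 GiB
print(f"backward: {backward_bytes / 2**30:.2f} GiB")  # ~0.98 GiB
```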

View File

@@ -34,7 +34,7 @@ The differentiable renderer API is experimental and subject to change!
### Coordinate transformation conventions
-Rendering requires transformations between several different coordinate frames: world space, view/camera space, NDC space and screen space. At each step it is important to know where the camera is located, how the +X, +Y, +Z axes are aligned and the possible range of values. The following figure outlines the conventions used in PyTorch3d.
+Rendering requires transformations between several different coordinate frames: world space, view/camera space, NDC space and screen space. At each step it is important to know where the camera is located, how the +X, +Y, +Z axes are aligned and the possible range of values. The following figure outlines the conventions used in PyTorch3D.
<img src="assets/transformations_overview.png" width="1000">
@@ -45,18 +45,18 @@ For example, given a teapot mesh, the world coordinate frame, camera coordinate
---
-**NOTE: PyTorch3d vs OpenGL**
+**NOTE: PyTorch3D vs OpenGL**
While we tried to emulate several aspects of OpenGL, there are differences in the coordinate frame conventions.
- The default world coordinate frame in PyTorch3D has +Z pointing into the screen whereas in OpenGL, +Z is pointing out of the screen. Both are right-handed.
-- The NDC coordinate system in PyTorch3d is **right-handed** compared with a **left-handed** NDC coordinate system in OpenGL (the projection matrix switches the handedness).
+- The NDC coordinate system in PyTorch3D is **right-handed** compared with a **left-handed** NDC coordinate system in OpenGL (the projection matrix switches the handedness).
<img align="center" src="assets/opengl_coordframes.png" width="300">
---
### A simple renderer
-A renderer in PyTorch3d is composed of a **rasterizer** and a **shader**. Create a renderer in a few simple steps:
+A renderer in PyTorch3D is composed of a **rasterizer** and a **shader**. Create a renderer in a few simple steps:
```
# Imports
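from pytorch3d.renderer import (
    FoVPerspectiveCameras, RasterizationSettings, MeshRenderer,
    MeshRasterizer, SoftPhongShader, PointLights,
)

# The original snippet is truncated in this hunk; what follows is a hedged
# sketch of the remaining steps, with class names as in recent PyTorch3D
# releases (they may differ in the version this commit targets).

# 1. Initialize a camera and rasterization settings.
cameras = FoVPerspectiveCameras()
raster_settings = RasterizationSettings(
    image_size=512, blur_radius=0.0, faces_per_pixel=1
)

# 2. Compose a rasterizer and a shader into a renderer.
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(cameras=cameras, raster_settings=raster_settings),
    shader=SoftPhongShader(cameras=cameras, lights=PointLights()),
)
```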

View File

@@ -1,10 +1,10 @@
---
hide_title: true
-sidebar_label: Why PyTorch3d
+sidebar_label: Why PyTorch3D
---
-# Why PyTorch3d
+# Why PyTorch3D
Our goal with PyTorch3D is to help accelerate research at the intersection of deep learning and 3D. 3D data is more complex than 2D images, and while working on projects such as [Mesh R-CNN](https://github.com/facebookresearch/meshrcnn) and [C3DPO](https://github.com/facebookresearch/c3dpo_nrsfm), we encountered several challenges including 3D data representation, batching, and speed. We have developed many useful operators and abstractions for working on 3D deep learning and want to share them with the community to drive novel research in this area.

View File

@@ -248,7 +248,7 @@
"\n",
"**`calc_camera_distance`** compares a pair of cameras. This function is important as it defines the loss that we are minimizing. The method utilizes the `so3_relative_angle` function from the SO3 API.\n",
"\n",
"**`get_relative_camera`** computes the parameters of a relative camera that maps between a pair of absolute cameras. Here we utilize the `compose` and `inverse` class methods from the PyTorch3d Transforms API."
"**`get_relative_camera`** computes the parameters of a relative camera that maps between a pair of absolute cameras. Here we utilize the `compose` and `inverse` class methods from the PyTorch3D Transforms API."
]
},
{
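A hedged sketch of the loss idea in that cell, using `so3_relative_angle` on random rotations (the relative-rotation convention shown is an assumption, not the notebook's exact code):

```python
import torch
from pytorch3d.transforms import random_rotations, so3_relative_angle

R_i = random_rotations(4)  # absolute rotations of cameras i
R_j = random_rotations(4)  # absolute rotations of cameras j

# Relative rotation mapping frame i to frame j (convention is an assumption).
R_rel = torch.bmm(R_i.transpose(1, 2), R_j)

# Geodesic angle between the two camera rotations, usable as a loss.
loss = so3_relative_angle(R_i, R_j).mean()
```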

View File

@@ -119,7 +119,7 @@
"source": [
"## 1. Load the Obj\n",
"\n",
"We will load an obj file and create a **Meshes** object. **Meshes** is a unique datastructure provided in PyTorch3d for working with **batches of meshes of different sizes**. It has several useful class methods which are used in the rendering pipeline. "
"We will load an obj file and create a **Meshes** object. **Meshes** is a unique datastructure provided in PyTorch3D for working with **batches of meshes of different sizes**. It has several useful class methods which are used in the rendering pipeline. "
]
},
{
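A hedged sketch of that loading step (the file path is illustrative, following the tutorial's `data/` convention):

```python
from pytorch3d.io import load_obj
from pytorch3d.structures import Meshes

# Load vertices and faces from an .obj file (path is illustrative).
verts, faces, aux = load_obj("data/teapot.obj")

# Build a batch-of-one Meshes object; faces.verts_idx holds the face indices.
mesh = Meshes(verts=[verts], faces=[faces.verts_idx])
```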
@@ -129,7 +129,7 @@
"id": "8d-oREfkrt_Z"
},
"source": [
"If you are running this notebook locally after cloning the PyTorch3d repository, the mesh will already be available. **If using Google Colab, fetch the mesh and save it at the path `data/`**:"
"If you are running this notebook locally after cloning the PyTorch3D repository, the mesh will already be available. **If using Google Colab, fetch the mesh and save it at the path `data/`**:"
]
},
{
@@ -202,7 +202,7 @@
"source": [
"### Create a renderer\n",
"\n",
"A **renderer** in PyTorch3d is composed of a **rasterizer** and a **shader** which each have a number of subcomponents such as a **camera** (orthgraphic/perspective). Here we initialize some of these components and use default values for the rest. \n",
"A **renderer** in PyTorch3D is composed of a **rasterizer** and a **shader** which each have a number of subcomponents such as a **camera** (orthgraphic/perspective). Here we initialize some of these components and use default values for the rest. \n",
"\n",
"For optimizing the camera position we will use a renderer which produces a **silhouette** of the object only and does not apply any **lighting** or **shading**. We will also initialize another renderer which applies full **phong shading** and use this for visualizing the outputs. "
]
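A hedged sketch of such a silhouette-only renderer (the blur and blend values are illustrative, and class names follow recent PyTorch3D releases):

```python
import numpy as np
from pytorch3d.renderer import (
    BlendParams, FoVPerspectiveCameras, MeshRasterizer,
    MeshRenderer, RasterizationSettings, SoftSilhouetteShader,
)

blend_params = BlendParams(sigma=1e-4, gamma=1e-4)
silhouette_renderer = MeshRenderer(
    rasterizer=MeshRasterizer(
        cameras=FoVPerspectiveCameras(),
        raster_settings=RasterizationSettings(
            image_size=256,
            # Blur face boundaries so the silhouette is differentiable
            # with respect to the camera position.
            blur_radius=np.log(1.0 / 1e-4 - 1.0) * blend_params.sigma,
            faces_per_pixel=100,
        ),
    ),
    shader=SoftSilhouetteShader(blend_params=blend_params),
)
```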
@@ -817,7 +817,7 @@
"source": [
"## 5. Conclusion \n",
"\n",
"In this tutorial we learnt how to **load** a mesh from an obj file, initialize a PyTorch3d datastructure called **Meshes**, set up an **Renderer** consisting of a **Rasterizer** and a **Shader**, set up an optimization loop including a **Model** and a **loss function**, and run the optimization. "
"In this tutorial we learnt how to **load** a mesh from an obj file, initialize a PyTorch3D datastructure called **Meshes**, set up an **Renderer** consisting of a **Rasterizer** and a **Shader**, set up an optimization loop including a **Model** and a **loss function**, and run the optimization. "
]
}
],

View File

@@ -37,8 +37,8 @@
"We will cover: \n",
"\n",
"- How to **load a mesh** from an `.obj` file\n",
"- How to use the PyTorch3d **Meshes** datastructure\n",
"- How to use 4 different PyTorch3d **mesh loss functions**\n",
"- How to use the PyTorch3D **Meshes** datastructure\n",
"- How to use 4 different PyTorch3D **mesh loss functions**\n",
"- How to set up an **optimization loop**\n",
"\n",
"\n",
@@ -654,7 +654,7 @@
"source": [
"## 6. Conclusion \n",
"\n",
"In this tutorial we learnt how to load a mesh from an obj file, initialize a PyTorch3d datastructure called **Meshes**, set up an optimization loop and use four different PyTorch3d mesh loss functions. "
"In this tutorial we learnt how to load a mesh from an obj file, initialize a PyTorch3D datastructure called **Meshes**, set up an optimization loop and use four different PyTorch3D mesh loss functions. "
]
}
],

View File

@@ -173,7 +173,7 @@
"\n",
"Load an `.obj` file and it's associated `.mtl` file and create a **Textures** and **Meshes** object. \n",
"\n",
"**Meshes** is a unique datastructure provided in PyTorch3d for working with batches of meshes of different sizes. \n",
"**Meshes** is a unique datastructure provided in PyTorch3D for working with batches of meshes of different sizes. \n",
"\n",
"**Textures** is an auxillary datastructure for storing texture information about meshes. \n",
"\n",
@@ -287,7 +287,7 @@
"source": [
"## 2. Create a renderer\n",
"\n",
"A renderer in PyTorch3d is composed of a **rasterizer** and a **shader** which each have a number of subcomponents such as a **camera** (orthographic/perspective). Here we initialize some of these components and use default values for the rest.\n",
"A renderer in PyTorch3D is composed of a **rasterizer** and a **shader** which each have a number of subcomponents such as a **camera** (orthographic/perspective). Here we initialize some of these components and use default values for the rest.\n",
"\n",
"In this example we will first create a **renderer** which uses a **perspective camera**, a **point light** and applies **phong shading**. Then we learn how to vary different components using the modular API. "
]
@@ -545,7 +545,7 @@
"source": [
"## 6. Batched Rendering\n",
"\n",
"One of the core design choices of the PyTorch3d API is to suport **batched inputs for all components**. \n",
"One of the core design choices of the PyTorch3D API is to suport **batched inputs for all components**. \n",
"The renderer and associated components can take batched inputs and **render a batch of output images in one forward pass**. We will now use this feature to render the mesh from many different viewpoints.\n"
]
},
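A hedged sketch of batching viewpoints for one forward pass (`mesh` and `renderer` are assumed to come from the earlier cells, so those lines are commented out):

```python
import torch
from pytorch3d.renderer import FoVPerspectiveCameras, look_at_view_transform

# One (R, T) pair per viewpoint; all downstream components accept batches.
batch_size = 8
elev = torch.linspace(0, 180, batch_size)
azim = torch.linspace(-180, 180, batch_size)
R, T = look_at_view_transform(dist=2.7, elev=elev, azim=azim)
cameras = FoVPerspectiveCameras(R=R, T=T)  # a batch of 8 cameras

# meshes = mesh.extend(batch_size)             # replicate the mesh per view
# images = renderer(meshes, cameras=cameras)   # -> (8, H, W, 4) in one pass
```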
@@ -628,7 +628,7 @@
},
"source": [
"## 7. Conclusion\n",
"In this tutorial we learnt how to **load** a textured mesh from an obj file, initialize a PyTorch3d datastructure called **Meshes**, set up an **Renderer** consisting of a **Rasterizer** and a **Shader**, and modify several components of the rendering pipeline. "
"In this tutorial we learnt how to **load** a textured mesh from an obj file, initialize a PyTorch3D datastructure called **Meshes**, set up an **Renderer** consisting of a **Rasterizer** and a **Shader**, and modify several components of the rendering pipeline. "
]
}
],