Set up website with Docusaurus (#11)
Summary: Set up the landing page, docs page, and HTML versions of the IPython notebook tutorials.

Pull Request resolved: https://github.com/fairinternal/pytorch3d/pull/11

Reviewed By: gkioxari

Differential Revision: D19730380

Pulled By: nikhilaravi

fbshipit-source-id: 5df8d3f2ac2f8dce4d51f5d14fc336508c2fd0ea
BIN docs/notes/assets/architecture_overview.png (new file, 130 KiB)
BIN docs/notes/assets/batch_intro.png (new file, 272 KiB)
BIN docs/notes/assets/batch_modes.gif (new file, 1.5 MiB)
BIN docs/notes/assets/fullset_batch_size_16.png (new file, 64 KiB)
BIN docs/notes/assets/meshrcnn.png (new file, 1.1 MiB)
BIN docs/notes/assets/opengl_coordframes.png (new file, 13 KiB)
BIN docs/notes/assets/p3d_naive_vs_coarse.png (new file, 314 KiB)
BIN docs/notes/assets/p3d_vs_softras.png (new file, 306 KiB)
BIN docs/notes/assets/transformations_overview.png (new file, 150 KiB)
@@ -1,8 +1,13 @@
---
hide_title: true
sidebar_label: Batching
---

# Batching

In deep learning, every optimization step operates on multiple input examples for robust training. Efficient batching is therefore crucial. For image inputs, batching is straightforward: N images are resized to the same height and width and stacked as a 4-dimensional tensor of shape `N x 3 x H x W`. For meshes, batching is less straightforward.

-<img src="../figs/batch_intro.png" alt="batch_intro" align="middle"/>
+<img src="assets/batch_intro.png" alt="batch_intro" align="middle"/>

## Batch modes for meshes
@@ -12,13 +17,13 @@ Assume you want to construct a batch containing two meshes, with `mesh1 = (v1: V
* Padded: The padded representation constructs a tensor by padding the extra values. Specifically, `meshes.verts_padded()` returns a tensor of shape `2 x max(V1, V2) x 3` and pads the extra vertices with `0`s. Similarly, `meshes.faces_padded()` returns a tensor of shape `2 x max(F1, F2) x 3` and pads the extra faces with `-1`s.
* Packed: The packed representation concatenates the examples in the batch into a tensor. In particular, `meshes.verts_packed()` returns a tensor of shape `(V1 + V2) x 3`. Similarly, `meshes.faces_packed()` returns a tensor of shape `(F1 + F2) x 3` for the faces. In packed mode, auxiliary variables are computed that enable efficient conversion between packed and padded or list modes.

-<img src="../figs/batch_modes.gif" alt="batch_modes" height="450" align="middle" />
+<img src="assets/batch_modes.gif" alt="batch_modes" height="450" align="middle" />
## Use cases for batch modes

The need for different mesh batch modes is inherent to the way PyTorch operators are implemented. To fully utilize the optimized PyTorch ops, the [Meshes][meshes] data structure allows for efficient conversion between the different batch modes. This is crucial when aiming for a fast and efficient training cycle. An example of this is [Mesh R-CNN][meshrcnn]: within the same forward pass, different parts of the network assume different inputs, which are computed by converting between the batch modes. In particular, [vert_align][vert_align] assumes a *padded* input tensor, while [graph_conv][graphconv], which immediately follows it, assumes a *packed* input tensor. A sketch of this pattern follows the figure below.

-<img src="../figs/meshrcnn.png" alt="meshrcnn" width="700" align="middle" />
+<img src="assets/meshrcnn.png" alt="meshrcnn" width="700" align="middle" />
[meshes]: https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/structures/meshes.py
@@ -1,3 +1,8 @@
---
sidebar_label: Loading from file
hide_title: true
---

# Meshes and IO

The Meshes object represents a batch of triangulated meshes, and is central to
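
For example, a minimal loading sketch; the `.obj` path here is a placeholder:

```python
# Load one OBJ file and wrap it in a Meshes object (a batch of size 1).
from pytorch3d.io import load_obj
from pytorch3d.structures import Meshes

verts, faces, aux = load_obj("model.obj")  # verts: (V, 3); faces: named tuple
mesh = Meshes(verts=[verts], faces=[faces.verts_idx])
```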
@@ -1,4 +1,9 @@
-# Differentiable Rendering

---
hide_title: true
sidebar_label: Overview
---

+# Rendering Overview

Differentiable rendering is a relatively new and exciting research area in computer vision, bridging the gap between 2D and 3D by allowing 2D image pixels to be related back to 3D properties of a scene.
@@ -18,7 +23,7 @@ Our implementation decouples the rasterization and shading steps of rendering. T
## <u>Get started</u>

-To learn about more the implementation and start using the renderer refer to [renderer_getting_started.md](renderer_getting_started.md), which also contains the [architecture overview](../figs/architecture_overview.png) and [coordinate transformation conventions](../figs/transformations_overview.png).
+To learn more about the implementation and to start using the renderer, refer to [renderer_getting_started.md](renderer_getting_started.md), which also contains the [architecture overview](assets/architecture_overview.png) and [coordinate transformation conventions](assets/transformations_overview.png).
## <u>Key features</u>
@@ -37,7 +42,7 @@ We compared PyTorch3d with SoftRasterizer to measure the effect of both these de
This figure shows how the coarse-to-fine strategy for rasterization results in a significant speedup compared to naive rasterization for large image sizes and large mesh sizes.

-<img src="../figs/p3d_naive_vs_coarse.png" width="1000">
+<img src="assets/p3d_naive_vs_coarse.png" width="1000">

For small mesh and image sizes, the naive approach is slightly faster. We advise that you understand the data you are using and choose the rasterization setting which suits your performance requirements. It is easy to switch between the naive and coarse-to-fine options by adjusting the `bin_size` value when initializing the [rasterization settings](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/renderer/mesh/rasterizer.py#L26).
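
For instance, a minimal sketch of the two settings; the values are illustrative, and the default and heuristic behavior of `bin_size` may vary between versions:

```python
# Switch between naive and coarse-to-fine rasterization via bin_size.
from pytorch3d.renderer import RasterizationSettings

# bin_size=0: naive rasterization, a per-pixel loop over all faces.
naive_settings = RasterizationSettings(image_size=512, bin_size=0)

# bin_size=None: let PyTorch3D choose a bin size heuristically,
# enabling the coarse-to-fine strategy.
coarse_settings = RasterizationSettings(image_size=512, bin_size=None)
```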
@@ -50,7 +55,7 @@ This figure shows the effect of the _combination_ of coarse-to-fine rasterizatio
In the SoftRasterizer implementation, both the forward and backward passes loop over every single face in the mesh for every pixel in the image. Therefore, the time for the full forward plus backward pass is ~2x the time for the forward pass. For small mesh and image sizes, the SoftRasterizer approach is slightly faster.

-<img src="../figs/p3d_vs_softras.png" width="1000">
+<img src="assets/p3d_vs_softras.png" width="1000">
@@ -66,7 +71,7 @@ We tested with a range of increasingly large meshes and bin sizes.
**Fig 3: PyTorch3D heterogeneous batching compared with SoftRasterizer**

-<img src="../figs/fullset_batch_size_16.png" width="700"/>
+<img src="assets/fullset_batch_size_16.png" width="700"/>

This shows that for large meshes and a large bin width (i.e. more variation in mesh size within the batch), the heterogeneous batching approach in PyTorch3D is faster than either of the workarounds with SoftRasterizer.
@@ -1,10 +1,15 @@
---
hide_title: true
sidebar_label: Getting Started
---

# Renderer Getting Started

### Architecture Overview

The renderer is designed to be modular, extensible, and to support batching and gradients for all inputs. The following figure describes all the components of the rendering pipeline.

-<img src="../figs/architecture_overview.png" width="1000">
+<img src="assets/architecture_overview.png" width="1000">
##### Fragments
@@ -31,7 +36,7 @@ The differentiable renderer API is experimental and subject to change!.
Rendering requires transformations between several different coordinate frames: world space, view/camera space, NDC space, and screen space. At each step it is important to know where the camera is located, how the x, y, z axes are aligned, and the possible range of values. The following figure outlines the conventions used in PyTorch3D.

-<img src="../figs/transformations_overview.png" width="1000">
+<img src="assets/transformations_overview.png" width="1000">
@@ -43,7 +48,7 @@ While we tried to emulate several aspects of OpenGL, the NDC coordinate system i
In OpenGL, the camera at the origin looks along the `-z` axis in camera space, but along the `+z` axis in NDC space.

-<img align="center" src="../figs/opengl_coordframes.png" width="300">
+<img align="center" src="assets/opengl_coordframes.png" width="300">
---

### A simple renderer
docs/notes/why_pytorch3d.md (new file, 13 lines)
@@ -0,0 +1,13 @@
---
hide_title: true
sidebar_label: Why PyTorch3D
---

# Why PyTorch3D

Our goal with PyTorch3D is to help accelerate research at the intersection of deep learning and 3D. 3D data is more complex than 2D images, and while working on projects such as [Mesh R-CNN](https://github.com/facebookresearch/meshrcnn) and [C3DPO](https://github.com/facebookresearch/c3dpo_nrsfm), we encountered several challenges, including 3D data representation, batching, and speed. We have developed many useful operators and abstractions for working on 3D deep learning and want to share them with the community to drive novel research in this area.

In PyTorch3D we have included efficient 3D operators, heterogeneous batching capabilities, and a modular differentiable rendering API, equipping researchers in this field with a much-needed toolkit to implement cutting-edge research with complex 3D inputs.