Summary: Collection of spelling things, mostly in docs / tutorials.

Reviewed By: gkioxari

Differential Revision: D26101323

fbshipit-source-id: 652f62bc9d71a4ff872efa21141225e43191353a
Jeremy Reizenstein
2021-04-09 09:57:55 -07:00
committed by Facebook GitHub Bot
parent c2e62a5087
commit 124bb5e391
75 changed files with 220 additions and 217 deletions

@@ -5,7 +5,7 @@ sidebar_label: Batching
# Batching
-In deep learning, every optimization step operates on multiple input examples for robust training. Thus, efficient batching is crucial. For image inputs, batching is straighforward; N images are resized to the same height and width and stacked as a 4 dimensional tensor of shape `N x 3 x H x W`. For meshes, batching is less straighforward.
+In deep learning, every optimization step operates on multiple input examples for robust training. Thus, efficient batching is crucial. For image inputs, batching is straightforward; N images are resized to the same height and width and stacked as a 4 dimensional tensor of shape `N x 3 x H x W`. For meshes, batching is less straightforward.
<img src="assets/batch_intro.png" alt="batch_intro" align="middle"/>
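As a quick illustration of that contrast (a sketch, not part of the diff; the shapes are made up), a heterogeneous batch of meshes is built directly from per-mesh tensors:
```
import torch
from pytorch3d.structures import Meshes

# Two meshes with different numbers of vertices and faces.
verts1 = torch.rand(4, 3)                                 # (V1, 3) vertex positions
faces1 = torch.tensor([[0, 1, 2], [0, 2, 3]])             # (F1, 3) triangle indices
verts2 = torch.rand(5, 3)                                 # (V2, 3)
faces2 = torch.tensor([[0, 1, 2], [1, 2, 3], [2, 3, 4]])  # (F2, 3)

# Unlike images, the meshes are not resized to a common size;
# the batch keeps each mesh's own vertex and face counts.
meshes = Meshes(verts=[verts1, verts2], faces=[faces1, faces2])
```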
@@ -21,7 +21,7 @@ Assume you want to construct a batch containing two meshes, with `mesh1 = (v1: V
## Use cases for batch modes
-The need for different mesh batch modes is inherent to the way pytorch operators are implemented. To fully utilize the optimized pytorch ops, the [Meshes][meshes] data structure allows for efficient conversion between the different batch modes. This is crucial when aiming for a fast and efficient training cycle. An example of this is [Mesh R-CNN][meshrcnn]. Here, in the same forward pass different parts of the network assume different inputs, which are computed by converting between the different batch modes. In particular, [vert_align][vert_align] assumes a *padded* input tensor while immediately after [graph_conv][graphconv] assumes a *packed* input tensor.
+The need for different mesh batch modes is inherent to the way PyTorch operators are implemented. To fully utilize the optimized PyTorch ops, the [Meshes][meshes] data structure allows for efficient conversion between the different batch modes. This is crucial when aiming for a fast and efficient training cycle. An example of this is [Mesh R-CNN][meshrcnn]. Here, in the same forward pass different parts of the network assume different inputs, which are computed by converting between the different batch modes. In particular, [vert_align][vert_align] assumes a *padded* input tensor while immediately after [graph_conv][graphconv] assumes a *packed* input tensor.
<img src="assets/meshrcnn.png" alt="meshrcnn" width="700" align="middle" />
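For reference, a minimal sketch (not from the commit; the dimensions are illustrative) of the conversions described above, ending with a packed-mode graph convolution:
```
import torch
from pytorch3d.ops import GraphConv
from pytorch3d.structures import Meshes

verts = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [0, 2, 3]])
meshes = Meshes(verts=[verts, verts], faces=[faces, faces])

verts_padded = meshes.verts_padded()   # (N, V_max, 3), zero-padded per mesh
verts_packed = meshes.verts_packed()   # (sum(V_i), 3), meshes concatenated
edges_packed = meshes.edges_packed()   # (sum(E_i), 2) packed edge indices

# Graph convolutions consume the packed representation directly.
gconv = GraphConv(input_dim=3, output_dim=16)
out = gconv(verts_packed, edges_packed)  # (sum(V_i), 16)
```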

View File

@@ -13,7 +13,7 @@ This is the system the object/scene lives - the world.
* **Camera view coordinate system**
This is the system that has its origin on the image plane and the `Z`-axis perpendicular to the image plane. In PyTorch3D, we assume that `+X` points left, `+Y` points up, and `+Z` points out from the image plane. The transformation from world to view happens after applying a rotation (`R`) and translation (`T`).
* **NDC coordinate system**
-This is the normalized coordinate system that confines in a volume the renderered part of the object/scene. Also known as view volume. Under the PyTorch3D convention, `(+1, +1, znear)` is the top left near corner, and `(-1, -1, zfar)` is the bottom right far corner of the volume. The transformation from view to NDC happens after applying the camera projection matrix (`P`).
+This is the normalized coordinate system that confines in a volume the rendered part of the object/scene. Also known as view volume. Under the PyTorch3D convention, `(+1, +1, znear)` is the top left near corner, and `(-1, -1, zfar)` is the bottom right far corner of the volume. The transformation from view to NDC happens after applying the camera projection matrix (`P`).
* **Screen coordinate system**
This is another representation of the view volume with the `XY` coordinates defined in pixel space instead of a normalized space.
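A minimal sketch of that chain (not from the diff; it assumes `FoVPerspectiveCameras` and `look_at_view_transform`, and the point values are arbitrary):
```
import torch
from pytorch3d.renderer import FoVPerspectiveCameras, look_at_view_transform

# World -> view: a rotation R and translation T place the camera.
R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
cameras = FoVPerspectiveCameras(R=R, T=T)

# World -> NDC: transform_points composes the view transform with the
# camera projection matrix P.
points_world = torch.rand(1, 5, 3)   # (N, P, 3) points in world coordinates
points_ndc = cameras.transform_points(points_world)
```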

@@ -20,5 +20,5 @@ mesh = IO().load_mesh("mymesh.ply", device=device)
and to save a pointcloud you might do
```
pcl = Pointclouds(...)
-IO().save_point_cloud(pcl, "output_poincloud.obj")
+IO().save_point_cloud(pcl, "output_pointcloud.obj")
```
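Filled out as a runnable round trip (a sketch: the filenames are placeholders, the points are random, and the method name below follows the `pytorch3d.io.IO` API, `save_pointcloud`, with a `.ply` output rather than the snippet's spelling and extension):
```
import torch
from pytorch3d.io import IO
from pytorch3d.structures import Pointclouds

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

mesh = IO().load_mesh("mymesh.ply", device=device)

# Build a point cloud from arbitrary points and write it out.
pcl = Pointclouds(points=[torch.rand(100, 3, device=device)])
IO().save_pointcloud(pcl, "output_pointcloud.ply")
```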

@@ -6,13 +6,13 @@ hide_title: true
# Meshes and IO
The Meshes object represents a batch of triangulated meshes, and is central to
-much of the functionality of pytorch3d. There is no insistence that each mesh in
+much of the functionality of PyTorch3D. There is no insistence that each mesh in
the batch has the same number of vertices or faces. When available, it can store
other data which pertains to the mesh, for example face normals, face areas
and textures.
Two common file formats for storing single meshes are ".obj" and ".ply" files,
-and pytorch3d has functions for reading these.
+and PyTorch3D has functions for reading these.
## OBJ
@@ -60,7 +60,7 @@ The `load_objs_as_meshes` function provides this procedure.
## PLY
-Ply files are flexible in the way they store additional information, pytorch3d
+Ply files are flexible in the way they store additional information. PyTorch3D
provides a function just to read the vertices and faces from a ply file.
The call
```
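For context, a minimal sketch of that call (not from the diff; the filename is a placeholder):
```
from pytorch3d.io import load_ply
from pytorch3d.structures import Meshes

# load_ply returns only vertex positions and face indices.
verts, faces = load_ply("mymesh.ply")   # (V, 3) float, (F, 3) long
mesh = Meshes(verts=[verts], faces=[faces])
```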

@@ -84,7 +84,7 @@ For mesh texturing we offer several options (in `pytorch3d/renderer/mesh/texturi
1. **Vertex Textures**: D dimensional textures for each vertex (for example an RGB color) which can be interpolated across the face. This can be represented as an `(N, V, D)` tensor. This is a fairly simple representation though and cannot model complex textures if the mesh faces are large.
2. **UV Textures**: vertex UV coordinates and **one** texture map for the whole mesh. For a point on a face with given barycentric coordinates, the face color can be computed by interpolating the vertex uv coordinates and then sampling from the texture map. This representation requires two tensors (UVs: `(N, V, 2)`, Texture map: `(N, H, W, 3)`), and is limited to only support one texture map per mesh.
-3. **Face Textures**: In more complex cases such as ShapeNet meshes, there are multiple texture maps per mesh and some faces have texture while other do not. For these cases, a more flexible representation is a texture atlas, where each face is represented as an `(RxR)` texture map where R is the texture resolution. For a given point on the face, the texture value can be sampled from the per face texture map using the barycentric coordinates of the point. This representation requires one tensor of shape `(N, F, R, R, 3)`. This texturing method is inspired by the SoftRasterizer implementation. For more details refer to the [`make_material_atlas`](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/io/mtl_io.py#L123) and [`sample_textures`](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/renderer/mesh/textures.py#L452) functions. **NOTE:**: The `TextureAtlas` texture sampling is only differentiable with respect to the texture atlas but not differentiable with respect to the barycentric coordinates.
+3. **Face Textures**: In more complex cases such as ShapeNet meshes, there are multiple texture maps per mesh and some faces have texture while other do not. For these cases, a more flexible representation is a texture atlas, where each face is represented as an `(RxR)` texture map where R is the texture resolution. For a given point on the face, the texture value can be sampled from the per face texture map using the barycentric coordinates of the point. This representation requires one tensor of shape `(N, F, R, R, 3)`. This texturing method is inspired by the SoftRasterizer implementation. For more details refer to the [`make_material_atlas`](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/io/mtl_io.py#L123) and [`sample_textures`](https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/renderer/mesh/textures.py#L452) functions. **NOTE:**: The `TexturesAtlas` texture sampling is only differentiable with respect to the texture atlas but not differentiable with respect to the barycentric coordinates.
<img src="assets/texturing.jpg" width="1000">
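The three representations map onto three texture classes; a sketch (not from the diff; all dimensions are illustrative):
```
import torch
from pytorch3d.renderer import TexturesAtlas, TexturesUV, TexturesVertex

N, V, F, H, W, R = 1, 4, 2, 64, 64, 8

# 1. Per-vertex colors, interpolated across each face: (N, V, D).
tex_vertex = TexturesVertex(verts_features=torch.rand(N, V, 3))

# 2. One texture map per mesh, sampled via per-vertex UVs.
tex_uv = TexturesUV(
    maps=torch.rand(N, H, W, 3),                        # (N, H, W, 3)
    faces_uvs=torch.zeros(N, F, 3, dtype=torch.int64),  # (N, F, 3)
    verts_uvs=torch.rand(N, V, 2),                      # (N, V, 2)
)

# 3. A per-face RxR texture atlas: (N, F, R, R, 3).
tex_atlas = TexturesAtlas(atlas=torch.rand(N, F, R, R, 3))
```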
@@ -116,7 +116,7 @@ raster_settings = RasterizationSettings(
faces_per_pixel=1,
)
-# Create a phong renderer by composing a rasterizer and a shader. Here we can use a predefined
+# Create a Phong renderer by composing a rasterizer and a shader. Here we can use a predefined
# PhongShader, passing in the device on which to initialize the default parameters
renderer = MeshRenderer(
rasterizer=MeshRasterizer(cameras=cameras, raster_settings=raster_settings),
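A hedged completion of that composition (a sketch assuming `HardPhongShader`; the camera and light setup is illustrative, not from the diff):
```
import torch
from pytorch3d.renderer import (
    FoVPerspectiveCameras, HardPhongShader, MeshRasterizer,
    MeshRenderer, PointLights, RasterizationSettings, look_at_view_transform,
)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=20.0)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)
lights = PointLights(device=device, location=[[0.0, 0.0, -3.0]])
raster_settings = RasterizationSettings(
    image_size=512, blur_radius=0.0, faces_per_pixel=1,
)

# Rasterizer and shader compose into a single renderer module.
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(cameras=cameras, raster_settings=raster_settings),
    shader=HardPhongShader(device=device, cameras=cameras, lights=lights),
)
```
Because the two stages are separate modules, the shader can be swapped (e.g. for `SoftPhongShader`) without touching the rasterization settings.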