88 Commits

Author SHA1 Message Date
Jeremy Reizenstein
74659aef26 CPU implementation for point_mesh functions
Summary:
point_mesh functions were missing CPU implementations.
The indices returned are not always matching, possibly due to numerical instability.

Reviewed By: gkioxari

Differential Revision: D21594264

fbshipit-source-id: 3016930e2a9a0f3cd8b3ac4c94a92c9411c0989d
2020-06-15 10:11:26 -07:00
Georgia Gkioxari
d689baac5e fix alpha compositing
Summary:
Fix division by zero when alpha is 1.0
In this case the numerator is already 0, so we need to make sure that division by 0 does not occur, since it would produce NaNs.
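
The guard described above can be illustrated with a minimal plain-Python sketch. The helper name `guarded_ratio` and the `eps` clamp are assumptions made for illustration; this is not the CUDA code changed in this commit.

```python
def guarded_ratio(alphas, k, eps=1e-9):
    """Divide the full product prod_j (1 - alphas[j]) by (1 - alphas[k]).

    The denominator is clamped to eps (a hypothetical value), so that
    alphas[k] == 1.0 yields 0.0 instead of a 0/0. In float arithmetic on
    the GPU 0/0 is NaN; in plain Python it raises ZeroDivisionError.
    """
    full = 1.0
    for a in alphas:
        full *= 1.0 - a
    return full / max(1.0 - alphas[k], eps)
```

With `alphas[k] == 1.0` the numerator is exactly 0, so the clamped division returns 0 and no NaN propagates into later arithmetic.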

Reviewed By: nikhilaravi

Differential Revision: D21650478

fbshipit-source-id: bc457105b3050fef1c8bd4e58e7d6d15c0c81ffd
2020-05-20 09:27:42 -07:00
Nikhila Ravi
3fef506895 Make cuda tensors contiguous in host function and remove contiguous check
Summary:
Update the cuda kernels to:
- remove contiguous checks for the grad tensors and for cpu functions which use accessors
- for cuda implementations call `.contiguous()` on all tensors in the host function before invoking the kernel

Reviewed By: gkioxari

Differential Revision: D21598008

fbshipit-source-id: 9b97bda4582fd4269c8a00999874d4552a1aea2d
2020-05-15 15:00:25 -07:00
Jeremy Reizenstein
728179e848 avoid converting a TensorOptions from float to integer
Summary: pytorch is adding checks that mean integer tensors with requires_grad=True need to be avoided. Fix accidentally creating them.

Reviewed By: jcjohnson, gkioxari

Differential Revision: D21576712

fbshipit-source-id: 008218997986800a36d93caa1a032ee91f2bffcd
2020-05-14 13:16:05 -07:00
Jeremy Reizenstein
8fc28baa27 Looser gradient check in test_rasterize_meshes
Summary: This has been failing intermittently

Reviewed By: nikhilaravi

Differential Revision: D21403157

fbshipit-source-id: 51b74d6c813b52effe72d14b565e250fcabbb463
2020-05-05 09:26:47 -07:00
Michele Sanna
f8acecb6b3 a formula for bin size for images over 64x64 (#90)
Summary:
Signed-off-by: Michele Sanna <sanna@arrival.com>

fixes the bin_size calculation with a formula for any image_size > 64. Matches the values chosen so far.

simple test:

```
import numpy as np
import matplotlib.pyplot as plt

image_size = np.arange(64, 2048)
bin_size = np.where(image_size <= 64, 8, (2 ** np.maximum(np.ceil(np.log2(image_size)) - 4, 4)).astype(int))

print(image_size)
print(bin_size)

for ims, bins in zip(image_size, bin_size):
    if ims <= 64:
        assert bins == 8
    elif ims <= 256:
        assert bins == 16
    elif ims <= 512:
        assert bins == 32
    elif ims <= 1024:
        assert bins == 64
    elif ims <= 2048:
        assert bins == 128

    assert (ims + bins - 1) // bins < 22

plt.plot(image_size, bin_size)
plt.grid()
plt.show()
```

![img](https://user-images.githubusercontent.com/54891577/75464693-795bcf00-597f-11ea-9061-26440211691c.png)
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/90

Reviewed By: jcjohnson

Differential Revision: D21160372

Pulled By: nikhilaravi

fbshipit-source-id: 660cf5832f4ca5be243c435a6bed969596fc0188
2020-04-24 14:56:41 -07:00
Nikhila Ravi
c3d636dc8c Cuda updates
Summary:
Updates to:
- enable cuda kernel launches on any GPU (not just the default)
- cuda and contiguous checks for all kernels
- checks to ensure all tensors are on the same device
- error reporting in the cuda kernels
- cuda tests now run on a random device not just the default

Reviewed By: jcjohnson, gkioxari

Differential Revision: D21215280

fbshipit-source-id: 1bedc9fe6c35e9e920bdc4d78ed12865b1005519
2020-04-24 09:11:04 -07:00
Jeremy Reizenstein
85c396f822 avoid using torch/extension.h in cuda
Summary:
Use aten instead of torch interface in all cuda code. This allows the cuda build to work with pytorch 1.5 with GCC 5 (e.g. the compiler of ubuntu 16.04LTS). This wasn't working. It has been failing with errors like the below, perhaps due to a bug in nvcc.

```
torch/include/torch/csrc/api/include/torch/nn/cloneable.h:68:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >
```

Reviewed By: nikhilaravi

Differential Revision: D21204029

fbshipit-source-id: ca6bdbcecf42493365e1c23a33fe35e1759fe8b6
2020-04-23 10:26:17 -07:00
Justin Johnson
9f31a4fd46 Expose knn_check_version in python
Summary:
We have multiple KNN CUDA implementations. From python, users can currently request a particular implementation via the `version` flag, but they have no way of knowing which implementations can be used for a given problem.

This diff exposes a function `pytorch3d._C.knn_check_version(version, D, K)` that returns whether a particular version can be used.

Reviewed By: nikhilaravi

Differential Revision: D21162573

fbshipit-source-id: 6061960bdcecba454fd920b00036f4e9ff3fdbc0
2020-04-22 14:30:52 -07:00
Nikhila Ravi
4bf30593ff back face culling in rasterization
Summary:
Added backface culling as an option to the `raster_settings`. This is needed for the full forward rendering of ShapeNet meshes with texture (some meshes contain multiple overlapping segments which have different textures).

For a triangle (v0, v1, v2) define the vectors A = (v1 - v0) and B = (v2 - v0) and use these to calculate the signed area of the triangle as:
```
area = 0.5 * (A x B)
area = 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
```
The area will be positive if (v0, v1, v2) are oriented counterclockwise (a front face), and negative if (v0, v1, v2) are oriented clockwise (a back face).

We can reuse the `edge_function` as it already calculates the triangle area.
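
The signed-area test above can be sketched in plain Python as follows. The function names are hypothetical and this is not the `edge_function` kernel itself. The sketch assumes 2D vertices in a frame where counterclockwise winding gives a positive area, and it keeps zero-area triangles, which is an arbitrary choice made here.

```python
def signed_area(v0, v1, v2):
    """Signed area of the 2D triangle (v0, v1, v2); positive when counterclockwise."""
    (x0, y0), (x1, y1), (x2, y2) = v0, v1, v2
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def cull_back_faces(triangles):
    """Keep only triangles whose signed area is non-negative (front faces)."""
    return [t for t in triangles if signed_area(*t) >= 0.0]


# The same triangle with two winding orders.
ccw = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
cw = (ccw[0], ccw[2], ccw[1])
front_only = cull_back_faces([ccw, cw])  # only `ccw` survives
```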

Reviewed By: jcjohnson

Differential Revision: D20960115

fbshipit-source-id: 2d8a4b9ccfb653df18e79aed8d05c7ec0f057ab1
2020-04-22 08:22:46 -07:00
Nikhila Ravi
9ef1ee8455 coarse rasterization bug fix
Summary:
Fix a bug which resulted in rendering artifacts if the image size was not a multiple of 16.
Fix: Revert coarse rasterization to the original implementation and only update fine rasterization to reverse the ordering of the Y and X axes. This is much simpler than the previous approach!

Additional changes:
- updated mesh rendering end-to-end tests to check outputs from both naive and coarse-to-fine rasterization.
- added pointcloud rendering end-to-end tests

Reviewed By: gkioxari

Differential Revision: D21102725

fbshipit-source-id: 2e7e1b013dd6dd12b3a00b79eb8167deddb2e89a
2020-04-20 14:54:16 -07:00
Jeremy Reizenstein
6207c359b1 spelling and flake
Summary: mostly recent lintish things

Reviewed By: nikhilaravi

Differential Revision: D21089003

fbshipit-source-id: 028733c1d875268f1879e4481da475b7100ba0b6
2020-04-17 10:50:22 -07:00
Jeremy Reizenstein
9397cd872d torch C API warnings
Summary: This is mostly replacing the old PackedTensorAccessor with the new PackedTensorAccessor64.

Reviewed By: gkioxari

Differential Revision: D21088773

fbshipit-source-id: 5973e5a29d934eafb7c70ec5ec154ca076b64d27
2020-04-17 10:46:31 -07:00
Jeremy Reizenstein
e19df58766 remove final nearest_neighbor files
Summary: A couple of files for the removed nearest_neighbor functionality are left behind.

Reviewed By: nikhilaravi

Differential Revision: D21088624

fbshipit-source-id: 4bb29016b4e5f63102765b384c363733b60032fa
2020-04-17 09:27:17 -07:00
Nikhila Ravi
3794f6753f remove nearest_neighbors
Summary: knn is more general and faster than the nearest_neighbor code, so remove the latter.

Reviewed By: gkioxari

Differential Revision: D20816424

fbshipit-source-id: 75d6c44d17180752d0c9859814bbdf7892558158
2020-04-15 20:51:41 -07:00
Georgia Gkioxari
b2b0c5a442 knn autograd
Summary:
Adds knn backward to return `grad_pts1` and `grad_pts2`. Adds `knn_gather` to return the nearest neighbors in pts2.
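
To make the relationship between the returned indices and the gathered neighbors concrete, here is a brute-force sketch in plain Python. It is unbatched, has no autograd, and the names `knn` and `gather` are illustrative stand-ins rather than the library functions.

```python
def knn(pts1, pts2, k):
    """For each point in pts1, return the squared distances and the indices
    of its k nearest points in pts2 (brute force, nearest first)."""
    dists, idxs = [], []
    for p in pts1:
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(pts2)
        )[:k]
        dists.append([d for d, _ in scored])
        idxs.append([j for _, j in scored])
    return dists, idxs


def gather(pts2, idxs):
    """Look up the neighbor coordinates selected by idxs."""
    return [[pts2[j] for j in row] for row in idxs]


pts1 = [(0.0, 0.0)]
pts2 = [(3.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
dists, idxs = knn(pts1, pts2, k=2)
neighbors = gather(pts2, idxs)
```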

The benchmark tests include the backward pass and were run on an M40.
```
Benchmark                               Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
KNN_SQUARE_32_256_128_3_24_cpu              39558           43485             13
KNN_SQUARE_32_256_128_3_24_cuda:0            1080            1404            463
KNN_SQUARE_32_256_512_3_24_cpu              81950           85781              7
KNN_SQUARE_32_256_512_3_24_cuda:0            1519            1641            330
--------------------------------------------------------------------------------

Benchmark                               Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
KNN_RAGGED_32_256_128_3_24_cpu              13798           14650             37
KNN_RAGGED_32_256_128_3_24_cuda:0            1576            1713            318
KNN_RAGGED_32_256_512_3_24_cpu              31255           32210             16
KNN_RAGGED_32_256_512_3_24_cuda:0            2024            2162            248
--------------------------------------------------------------------------------
```

Reviewed By: jcjohnson

Differential Revision: D20945556

fbshipit-source-id: a16f616029c6b5f8c2afceb5f2bc12c5c20d2f3c
2020-04-14 17:22:56 -07:00
Georgia Gkioxari
487d4d6607 point mesh distances
Summary:
Implementation of point to mesh distances. The current diff contains two types:
(a) Point to Edge
(b) Point to Face

```

Benchmark                                       Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
POINT_MESH_EDGE_4_100_300_5000_cuda:0                2745            3138            183
POINT_MESH_EDGE_4_100_300_10000_cuda:0               4408            4499            114
POINT_MESH_EDGE_4_100_3000_5000_cuda:0               4978            5070            101
POINT_MESH_EDGE_4_100_3000_10000_cuda:0              9076            9187             56
POINT_MESH_EDGE_4_1000_300_5000_cuda:0               1411            1487            355
POINT_MESH_EDGE_4_1000_300_10000_cuda:0              4829            5030            104
POINT_MESH_EDGE_4_1000_3000_5000_cuda:0              7539            7620             67
POINT_MESH_EDGE_4_1000_3000_10000_cuda:0            12088           12272             42
POINT_MESH_EDGE_8_100_300_5000_cuda:0                3106            3222            161
POINT_MESH_EDGE_8_100_300_10000_cuda:0               8561            8648             59
POINT_MESH_EDGE_8_100_3000_5000_cuda:0               6932            7021             73
POINT_MESH_EDGE_8_100_3000_10000_cuda:0             24032           24176             21
POINT_MESH_EDGE_8_1000_300_5000_cuda:0               5272            5399             95
POINT_MESH_EDGE_8_1000_300_10000_cuda:0             11348           11430             45
POINT_MESH_EDGE_8_1000_3000_5000_cuda:0             17478           17683             29
POINT_MESH_EDGE_8_1000_3000_10000_cuda:0            25961           26236             20
POINT_MESH_EDGE_16_100_300_5000_cuda:0               8244            8323             61
POINT_MESH_EDGE_16_100_300_10000_cuda:0             18018           18071             28
POINT_MESH_EDGE_16_100_3000_5000_cuda:0             19428           19544             26
POINT_MESH_EDGE_16_100_3000_10000_cuda:0            44967           45135             12
POINT_MESH_EDGE_16_1000_300_5000_cuda:0              7825            7937             64
POINT_MESH_EDGE_16_1000_300_10000_cuda:0            18504           18571             28
POINT_MESH_EDGE_16_1000_3000_5000_cuda:0            65805           66132              8
POINT_MESH_EDGE_16_1000_3000_10000_cuda:0           90885           91089              6
--------------------------------------------------------------------------------

Benchmark                                       Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
POINT_MESH_FACE_4_100_300_5000_cuda:0                1561            1685            321
POINT_MESH_FACE_4_100_300_10000_cuda:0               2818            2954            178
POINT_MESH_FACE_4_100_3000_5000_cuda:0              15893           16018             32
POINT_MESH_FACE_4_100_3000_10000_cuda:0             16350           16439             31
POINT_MESH_FACE_4_1000_300_5000_cuda:0               3179            3278            158
POINT_MESH_FACE_4_1000_300_10000_cuda:0              2353            2436            213
POINT_MESH_FACE_4_1000_3000_5000_cuda:0             16262           16336             31
POINT_MESH_FACE_4_1000_3000_10000_cuda:0             9334            9448             54
POINT_MESH_FACE_8_100_300_5000_cuda:0                4377            4493            115
POINT_MESH_FACE_8_100_300_10000_cuda:0               9728            9822             52
POINT_MESH_FACE_8_100_3000_5000_cuda:0              26428           26544             19
POINT_MESH_FACE_8_100_3000_10000_cuda:0             42238           43031             12
POINT_MESH_FACE_8_1000_300_5000_cuda:0               3891            3982            129
POINT_MESH_FACE_8_1000_300_10000_cuda:0              5363            5429             94
POINT_MESH_FACE_8_1000_3000_5000_cuda:0             20998           21084             24
POINT_MESH_FACE_8_1000_3000_10000_cuda:0            39711           39897             13
POINT_MESH_FACE_16_100_300_5000_cuda:0               5955            6001             84
POINT_MESH_FACE_16_100_300_10000_cuda:0             12082           12144             42
POINT_MESH_FACE_16_100_3000_5000_cuda:0             44996           45176             12
POINT_MESH_FACE_16_100_3000_10000_cuda:0            73042           73197              7
POINT_MESH_FACE_16_1000_300_5000_cuda:0              8292            8374             61
POINT_MESH_FACE_16_1000_300_10000_cuda:0            19442           19506             26
POINT_MESH_FACE_16_1000_3000_5000_cuda:0            36059           36194             14
POINT_MESH_FACE_16_1000_3000_10000_cuda:0           64644           64822              8
--------------------------------------------------------------------------------
```

Reviewed By: jcjohnson

Differential Revision: D20590462

fbshipit-source-id: 42a39837b514a546ac9471bfaff60eefe7fae829
2020-04-11 00:21:24 -07:00
Jeremy Reizenstein
01b5f7b228 heterogenous KNN
Summary: Interface and working implementation of ragged KNN. Benchmarks (which aren't ragged) haven't slowed. New benchmark shows that ragged is faster than non-ragged of the same shape.
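
One way to picture the ragged case is a padded batch with per-cloud lengths, where entries past each length are ignored. The sketch below is an assumption about the general idea, not the interface added in this commit; `ragged_nearest` and its arguments are hypothetical, and it only finds the single nearest neighbor.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ragged_nearest(pts1, pts2, lengths1, lengths2):
    """Batched 1-nearest-neighbor over padded clouds.

    pts1[n] and pts2[n] are padded lists; only the first lengths1[n] and
    lengths2[n] entries are real points. Returns, per batch element, the
    index into pts2[n] of the nearest valid point for each valid query.
    """
    out = []
    for cloud1, cloud2, n1, n2 in zip(pts1, pts2, lengths1, lengths2):
        valid2 = cloud2[:n2]  # drop padding before searching
        out.append([
            min(range(n2), key=lambda j: sq_dist(p, valid2[j]))
            for p in cloud1[:n1]
        ])
    return out


pts1 = [[(0.0, 0.0), (5.0, 5.0)], [(1.0, 1.0), (0.0, 0.0)]]
pts2 = [[(4.0, 4.0), (0.0, 1.0)], [(2.0, 2.0), (1.0, 1.0)]]
result = ragged_nearest(pts1, pts2, lengths1=[2, 1], lengths2=[2, 1])
```

In the second batch element the padded entry `(1.0, 1.0)` of `pts2` is skipped even though it coincides with the query, because it lies beyond `lengths2[1]`.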

Reviewed By: jcjohnson

Differential Revision: D20696507

fbshipit-source-id: 21b80f71343a3475c8d3ee0ce2680f92f0fae4de
2020-04-07 01:47:37 -07:00
Jeremy Reizenstein
37c5c8e0b6 Linter, deprecated type()
Summary: Run linter after recent changes. Fix long comment in knn.h which clang-format has reflowed badly. Add crude test that code doesn't call deprecated `.type()` or `.data()`.

Reviewed By: nikhilaravi

Differential Revision: D20692935

fbshipit-source-id: 28ce0308adae79a870cb41a810b7cf8744f41ab8
2020-03-29 14:02:58 -07:00
Justin Johnson
870290df34 Implement K-Nearest Neighbors
Summary:
Implements K-Nearest Neighbors with C++ and CUDA versions.

KNN in CUDA is highly nontrivial. I've implemented a few different versions of the kernel, and we heuristically dispatch to different kernels based on the problem size. Some of the kernels rely on template specialization on either D or K, so we use template metaprogramming to compile specialized versions for ranges of D and K.

These kernels are up to 3x faster than our existing 1-nearest-neighbor kernels, so we should also consider swapping out `nn_points_idx` to use these kernels in the backend.

I've been working mostly on the CUDA kernels, and haven't converged on the correct Python API.

I still want to benchmark against FAISS to see how far away we are from their performance.

Reviewed By: bottler

Differential Revision: D19729286

fbshipit-source-id: 608ffbb7030c21fe4008f330522f4890f0c3c21a
2020-03-26 13:40:26 -07:00
Jeremy Reizenstein
81a4aa18ad type() deprecated
Summary:
Replace `tensor.type().is_cuda()` with the preferred `tensor.is_cuda()`.
Replace `AT_DISPATCH_FLOATING_TYPES(tensor.type(), ...` with `AT_DISPATCH_FLOATING_TYPES(tensor.scalar_type(), ...`.
These avoid deprecation warnings in future pytorch.

Reviewed By: nikhilaravi

Differential Revision: D20646565

fbshipit-source-id: 1a0c15978c871af816b1dd7d4a7ea78242abd95e
2020-03-26 04:01:41 -07:00
Jeremy Reizenstein
e22d431e5b data() deprecated
Summary: replace `data()` with preferred `data_ptr()`, avoiding some deprecation warnings in future pytorch.

Reviewed By: nikhilaravi

Differential Revision: D20645738

fbshipit-source-id: 8f6e02d292729b804fa2a66f94dd0517bbaf7887
2020-03-26 03:21:48 -07:00
Jeremy Reizenstein
8fa7678614 fix CPU-only hiding of cuda calls
Summary: CPU-only builds should be fixed by this change

Reviewed By: nikhilaravi

Differential Revision: D20598014

fbshipit-source-id: df098ec4c6c93d38515172805fe57cac7463c506
2020-03-24 05:04:32 -07:00
Olivia
53599770dd Accumulate points (#4)
Summary:
Code for accumulating points in the z-buffer in three ways:
1. weighted sum
2. normalised weighted sum
3. alpha compositing
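
The three accumulation modes can be sketched for a single pixel as follows. The sketch assumes scalar features and weights listed front to back, and the function names and the `eps` guard are illustrative, not the kernels in this PR.

```python
def weighted_sum(features, alphas):
    """Sum of features weighted by their alphas."""
    return sum(a * f for f, a in zip(features, alphas))


def normalised_weighted_sum(features, alphas, eps=1e-10):
    """Weighted sum divided by the total weight (eps guards an all-zero pixel)."""
    return weighted_sum(features, alphas) / max(sum(alphas), eps)


def alpha_composite(features, alphas):
    """Front-to-back compositing: each point is attenuated by the
    transmittance prod(1 - alpha) of the points in front of it."""
    out, transmittance = 0.0, 1.0
    for f, a in zip(features, alphas):
        out += transmittance * a * f
        transmittance *= 1.0 - a
    return out


features, alphas = [1.0, 2.0], [0.5, 0.5]
```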

Pull Request resolved: https://github.com/fairinternal/pytorch3d/pull/4

Reviewed By: nikhilaravi

Differential Revision: D20522422

Pulled By: gkioxari

fbshipit-source-id: 5023baa05f15e338f3821ef08f5552c2dcbfc06c
2020-03-19 11:23:12 -07:00
Jeremy Reizenstein
2361845548 squared distance in comments
Summary: Comments were describing squared distance as absolute distance in a few places.

Reviewed By: nikhilaravi

Differential Revision: D20426020

fbshipit-source-id: 009946867c4a98f61f5ce7158542d41e22bf8346
2020-03-13 04:35:25 -07:00
Nikhila Ravi
d01e722849 Fix coordinate system conventions in point cloud renderer
Summary:
Applying the changes added for mesh rasterization to ensure that +Y is up and +X is left so that the coordinate system is right handed.

Also updated the diagram in the docs to indicate that (0,0) is in the top left hand corner.

Reviewed By: gkioxari

Differential Revision: D20394849

fbshipit-source-id: cfb7c79090eb1f55ad38b92327a74a70a8dc541e
2020-03-12 07:48:29 -07:00
Nikhila Ravi
32ad869dea Update point cloud rasterizer to support heterogeneous point clouds
Summary:
Update the point cloud rasterizer to:
- use the pointcloud datastructure (rebased on top of D19791851.)
- support rasterization of heterogeneous point clouds in the same way as with Meshes.

The main changes to the API will be as follows:
- The input to `rasterize_points` will be a `Pointclouds` object instead of a tensor. This will be easy to update e.g.
```
# Before: `rasterize_points` took a padded tensor of shape (N, P, 3).
points = torch.randn(N, P, 3)
idx2, zbuf2, dists2 = rasterize_points(points, image_size, radius, points_per_pixel)

# After: wrap the same tensor in a `Pointclouds` object first.
points = torch.randn(N, P, 3)
pointclouds = Pointclouds(points=points)
idx2, zbuf2, dists2 = rasterize_points(pointclouds, image_size, radius, points_per_pixel)
```

- The indices output from rasterization will now refer to points in `pointclouds.points_packed()`. This may require changes to the functions which consume the outputs of rasterization if they previously assumed that the indices ranged from 0 to P, where P is the number of points in each pointcloud.
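
For consumers that still need a per-cloud index, a packed index can be mapped back as in the sketch below. `packed_to_local` is a hypothetical helper, and it assumes a non-negative index plus a known list of point counts per cloud; any sentinel value for empty pixels would have to be filtered out by the caller first.

```python
from bisect import bisect_right
from itertools import accumulate


def packed_to_local(packed_idx, num_points_per_cloud):
    """Map an index into the packed (concatenated) points to a pair
    (cloud index, index within that cloud)."""
    # Start offset of each cloud in the packed list, e.g. [3, 2, 4] -> [0, 3, 5].
    starts = [0] + list(accumulate(num_points_per_cloud))[:-1]
    cloud = bisect_right(starts, packed_idx) - 1
    return cloud, packed_idx - starts[cloud]


counts = [3, 2, 4]
```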

Making this change now so that Olivia can update her PR accordingly.

Reviewed By: gkioxari

Differential Revision: D20088651

fbshipit-source-id: 833ed659909712bcbbb6a50e2ec0189839f0413a
2020-03-12 07:48:29 -07:00
Nikhila Ravi
15c72be444 Fix coordinate system conventions in renderer
Summary:
## Updates

- Defined the world and camera coordinates according to this figure. The world coordinates are defined as having +Y up, +X left and +Z in.

{F230888499}

- Removed all flipping from blending functions.
- Updated the rasterizer to return images with +Y up and +X left.
- Updated all the mesh rasterizer tests
    - The expected values are now defined in terms of the default +Y up, +X left
    - Added tests where the triangles in the meshes are non-symmetrical so that it is clear which directions +X and +Y point in

## Questions:
- Should we have **scene settings** instead of raster settings?
    - To be more correct we should be [z clipping in the rasterizer based on the far/near clipping planes](https://github.com/ShichenLiu/SoftRas/blob/master/soft_renderer/cuda/soft_rasterize_cuda_kernel.cu#L400) - these values are also required in the blending functions so should we make these scene level parameters and have a scene settings tuple which is available to the rasterizer and shader?

Reviewed By: gkioxari

Differential Revision: D20208604

fbshipit-source-id: 55787301b1bffa0afa9618f0a0886cc681da51f3
2020-03-06 06:51:05 -08:00
takiyu
f358b9b14d Fix squared distance for CPU impl. (#83)
Summary:
`PointLineDistanceForward()` should return the squared distance. However, the CPU implementation returned the non-squared distance when `v0` was very close to `v1`.
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/83
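
The invariant being fixed (every branch returns a squared distance, including the degenerate one) can be sketched in plain Python. The function name, the clamping to the segment, and the `eps` threshold are assumptions for illustration, not the C++ code in this PR.

```python
def point_segment_sq_dist(p, v0, v1, eps=1e-8):
    """Squared Euclidean distance from the 2D point p to the segment (v0, v1)."""
    dx, dy = v1[0] - v0[0], v1[1] - v0[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq <= eps:
        # Degenerate segment: use the distance to v1, still squared (no sqrt).
        return (p[0] - v1[0]) ** 2 + (p[1] - v1[1]) ** 2
    # Project p onto the line through v0 and v1, then clamp to the segment.
    t = ((p[0] - v0[0]) * dx + (p[1] - v0[1]) * dy) / seg_len_sq
    t = min(1.0, max(0.0, t))
    cx, cy = v0[0] + t * dx, v0[1] + t * dy
    return (p[0] - cx) ** 2 + (p[1] - cy) ** 2
```

For `p = (3, 4)` and `v0 == v1 == (0, 0)` this returns 25.0, whereas the bug described above corresponds to returning 5.0.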

Reviewed By: bottler

Differential Revision: D20097181

Pulled By: nikhilaravi

fbshipit-source-id: 7ea851c0837ab89364e42d283c999df21ff5ff02
2020-02-25 14:00:00 -08:00
merayxu
9e21659fc5 Fixed windows MSVC build compatibility (#9)
Summary:
Fixed a few MSVC compiler (Visual Studio 2019, MSVC 19.16.27034) compatibility issues:
1. Replaced `long` with `int64_t`. `aten::data_ptr<long>` is not supported in MSVC.
2. `pytorch3d/csrc/rasterize_points/rasterize_points_cpu.cpp`: an inline function is not correctly recognized by MSVC.
3. `pytorch3d/csrc/rasterize_meshes/geometry_utils.cuh`: `const auto kEpsilon = 1e-30;` MSVC does not compile this const into both host and device code, so it was changed to a macro.
4. `pytorch3d/csrc/rasterize_meshes/geometry_utils.cuh`: in `const float area2 = pow(area, 2.0);`, the `2.0` is treated as a double by MSVC, which raised an error.
5. `pytorch3d/csrc/rasterize_points/rasterize_points_cpu.cpp`: the return type of `std::tuple<torch::Tensor, torch::Tensor> RasterizePointsCoarseCpu()` does not match the declaration in `rasterize_points_cpu.h`.
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/9

Reviewed By: nikhilaravi

Differential Revision: D19986567

Pulled By: yuanluxu

fbshipit-source-id: f4d98525d088c99c513b85193db6f0fc69c7f017
2020-02-20 18:43:19 -08:00
Georgia Gkioxari
a3baa367e3 face areas backward
Summary:
Added backward for mesh face areas & normals. Exposed it as a layer. Replaced the computation with the new op in Meshes and in Sample Points.
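
For reference, the forward quantity for a single face can be sketched with a cross product in plain Python. `face_area_normal` is a hypothetical name, and the zero normal returned for a degenerate triangle is a convention chosen for this sketch only.

```python
import math


def face_area_normal(v0, v1, v2):
    """Return (area, unit normal) of the 3D triangle (v0, v1, v2)."""
    a = [v1[i] - v0[i] for i in range(3)]
    b = [v2[i] - v0[i] for i in range(3)]
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    norm = math.sqrt(sum(c * c for c in cross))
    if norm == 0.0:
        # Degenerate triangle: the normal is undefined, return zeros here.
        return 0.0, (0.0, 0.0, 0.0)
    # The cross product's length is twice the triangle area.
    return 0.5 * norm, tuple(c / norm for c in cross)


area, normal = face_area_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```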

Current issue: Circular imports. I moved the import of the op in meshes inside the function scope.

Reviewed By: jcjohnson

Differential Revision: D19920082

fbshipit-source-id: d213226d5e1d19a0c8452f4d32771d07e8b91c0a
2020-02-20 11:11:33 -08:00
Georgia Gkioxari
60f3c4e7d2 cpp support for packed to padded
Summary:
CPU implementation for packed to padded, with gradients added.
```
Benchmark                                     Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
PACKED_TO_PADDED_2_100_300_1_cpu                    138             221           3625
PACKED_TO_PADDED_2_100_300_1_cuda:0                 184             261           2716
PACKED_TO_PADDED_2_100_300_16_cpu                   555             726            901
PACKED_TO_PADDED_2_100_300_16_cuda:0                179             260           2794
PACKED_TO_PADDED_2_100_3000_1_cpu                   396             519           1262
PACKED_TO_PADDED_2_100_3000_1_cuda:0                181             274           2764
PACKED_TO_PADDED_2_100_3000_16_cpu                 4517            5003            111
PACKED_TO_PADDED_2_100_3000_16_cuda:0               224             397           2235
PACKED_TO_PADDED_2_1000_300_1_cpu                   138             212           3616
PACKED_TO_PADDED_2_1000_300_1_cuda:0                180             282           2775
PACKED_TO_PADDED_2_1000_300_16_cpu                  565             711            885
PACKED_TO_PADDED_2_1000_300_16_cuda:0               179             264           2797
PACKED_TO_PADDED_2_1000_3000_1_cpu                  389             494           1287
PACKED_TO_PADDED_2_1000_3000_1_cuda:0               180             271           2777
PACKED_TO_PADDED_2_1000_3000_16_cpu                4522            5170            111
PACKED_TO_PADDED_2_1000_3000_16_cuda:0              216             286           2313
PACKED_TO_PADDED_10_100_300_1_cpu                   251             345           1995
PACKED_TO_PADDED_10_100_300_1_cuda:0                178             262           2806
PACKED_TO_PADDED_10_100_300_16_cpu                 2354            2750            213
PACKED_TO_PADDED_10_100_300_16_cuda:0               178             291           2814
PACKED_TO_PADDED_10_100_3000_1_cpu                 1519            1786            330
PACKED_TO_PADDED_10_100_3000_1_cuda:0               179             237           2791
PACKED_TO_PADDED_10_100_3000_16_cpu               24705           25879             21
PACKED_TO_PADDED_10_100_3000_16_cuda:0              228             316           2191
PACKED_TO_PADDED_10_1000_300_1_cpu                  261             432           1919
PACKED_TO_PADDED_10_1000_300_1_cuda:0               181             261           2756
PACKED_TO_PADDED_10_1000_300_16_cpu                2349            2770            213
PACKED_TO_PADDED_10_1000_300_16_cuda:0              180             256           2782
PACKED_TO_PADDED_10_1000_3000_1_cpu                1613            1929            310
PACKED_TO_PADDED_10_1000_3000_1_cuda:0              183             253           2739
PACKED_TO_PADDED_10_1000_3000_16_cpu              22041           23653             23
PACKED_TO_PADDED_10_1000_3000_16_cuda:0             220             343           2270
PACKED_TO_PADDED_32_100_300_1_cpu                   555             750            901
PACKED_TO_PADDED_32_100_300_1_cuda:0                188             282           2661
PACKED_TO_PADDED_32_100_300_16_cpu                 7550            8131             67
PACKED_TO_PADDED_32_100_300_16_cuda:0               181             272           2770
PACKED_TO_PADDED_32_100_3000_1_cpu                 4574            6327            110
PACKED_TO_PADDED_32_100_3000_1_cuda:0               173             254           2884
PACKED_TO_PADDED_32_100_3000_16_cpu               70366           72563              8
PACKED_TO_PADDED_32_100_3000_16_cuda:0              349             654           1433
PACKED_TO_PADDED_32_1000_300_1_cpu                  612             728            818
PACKED_TO_PADDED_32_1000_300_1_cuda:0               189             295           2647
PACKED_TO_PADDED_32_1000_300_16_cpu                7699            8254             65
PACKED_TO_PADDED_32_1000_300_16_cuda:0              189             311           2646
PACKED_TO_PADDED_32_1000_3000_1_cpu                5105            5261             98
PACKED_TO_PADDED_32_1000_3000_1_cuda:0              191             260           2625
PACKED_TO_PADDED_32_1000_3000_16_cpu              87073           92708              6
PACKED_TO_PADDED_32_1000_3000_16_cuda:0             344             425           1455
--------------------------------------------------------------------------------

Benchmark                                           Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
PACKED_TO_PADDED_TORCH_2_100_300_1_cpu                    492             627           1016
PACKED_TO_PADDED_TORCH_2_100_300_1_cuda:0                 768             975            652
PACKED_TO_PADDED_TORCH_2_100_300_16_cpu                   659             804            760
PACKED_TO_PADDED_TORCH_2_100_300_16_cuda:0                781             918            641
PACKED_TO_PADDED_TORCH_2_100_3000_1_cpu                   624             734            802
PACKED_TO_PADDED_TORCH_2_100_3000_1_cuda:0                778             929            643
PACKED_TO_PADDED_TORCH_2_100_3000_16_cpu                 2609            2850            192
PACKED_TO_PADDED_TORCH_2_100_3000_16_cuda:0               758             901            660
PACKED_TO_PADDED_TORCH_2_1000_300_1_cpu                   467             612           1072
PACKED_TO_PADDED_TORCH_2_1000_300_1_cuda:0                772             905            648
PACKED_TO_PADDED_TORCH_2_1000_300_16_cpu                  689             839            726
PACKED_TO_PADDED_TORCH_2_1000_300_16_cuda:0               789            1143            635
PACKED_TO_PADDED_TORCH_2_1000_3000_1_cpu                  629             735            795
PACKED_TO_PADDED_TORCH_2_1000_3000_1_cuda:0               812             916            616
PACKED_TO_PADDED_TORCH_2_1000_3000_16_cpu                2716            3117            185
PACKED_TO_PADDED_TORCH_2_1000_3000_16_cuda:0              844            1288            593
PACKED_TO_PADDED_TORCH_10_100_300_1_cpu                  2387            2557            210
PACKED_TO_PADDED_TORCH_10_100_300_1_cuda:0               4112            4993            122
PACKED_TO_PADDED_TORCH_10_100_300_16_cpu                 3385            4254            148
PACKED_TO_PADDED_TORCH_10_100_300_16_cuda:0              3959            4902            127
PACKED_TO_PADDED_TORCH_10_100_3000_1_cpu                 2918            3105            172
PACKED_TO_PADDED_TORCH_10_100_3000_1_cuda:0              4054            4450            124
PACKED_TO_PADDED_TORCH_10_100_3000_16_cpu               12748           13623             40
PACKED_TO_PADDED_TORCH_10_100_3000_16_cuda:0             4023            4395            125
PACKED_TO_PADDED_TORCH_10_1000_300_1_cpu                 2258            2492            222
PACKED_TO_PADDED_TORCH_10_1000_300_1_cuda:0              3997            4312            126
PACKED_TO_PADDED_TORCH_10_1000_300_16_cpu                3404            3597            147
PACKED_TO_PADDED_TORCH_10_1000_300_16_cuda:0             3877            4227            129
PACKED_TO_PADDED_TORCH_10_1000_3000_1_cpu                2789            3054            180
PACKED_TO_PADDED_TORCH_10_1000_3000_1_cuda:0             3821            4402            131
PACKED_TO_PADDED_TORCH_10_1000_3000_16_cpu              11967           12963             42
PACKED_TO_PADDED_TORCH_10_1000_3000_16_cuda:0            3729            4290            135
PACKED_TO_PADDED_TORCH_32_100_300_1_cpu                  6933            8152             73
PACKED_TO_PADDED_TORCH_32_100_300_1_cuda:0              11856           12287             43
PACKED_TO_PADDED_TORCH_32_100_300_16_cpu                 9895           11205             51
PACKED_TO_PADDED_TORCH_32_100_300_16_cuda:0             12354           13596             41
PACKED_TO_PADDED_TORCH_32_100_3000_1_cpu                 9516           10128             53
PACKED_TO_PADDED_TORCH_32_100_3000_1_cuda:0             12917           13597             39
PACKED_TO_PADDED_TORCH_32_100_3000_16_cpu               41209           43783             13
PACKED_TO_PADDED_TORCH_32_100_3000_16_cuda:0            12210           13288             41
PACKED_TO_PADDED_TORCH_32_1000_300_1_cpu                 7179            7689             70
PACKED_TO_PADDED_TORCH_32_1000_300_1_cuda:0             11896           12381             43
PACKED_TO_PADDED_TORCH_32_1000_300_16_cpu               10127           15494             50
PACKED_TO_PADDED_TORCH_32_1000_300_16_cuda:0            12034           12817             42
PACKED_TO_PADDED_TORCH_32_1000_3000_1_cpu                8743           10251             58
PACKED_TO_PADDED_TORCH_32_1000_3000_1_cuda:0            12023           12908             42
PACKED_TO_PADDED_TORCH_32_1000_3000_16_cpu              39071           41777             13
PACKED_TO_PADDED_TORCH_32_1000_3000_16_cuda:0           11999           13690             42
--------------------------------------------------------------------------------
```
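
What the operation benchmarked above does can be sketched on plain Python lists. This is an illustrative reference only, not the C++ implementation; the `pad_value` argument is an assumption, and the last element is assumed to run to the end of `packed`.

```python
def packed_to_padded(packed, first_idxs, max_size, pad_value=0.0):
    """Split a packed list into one row per batch element, each padded to max_size.

    packed:     values for all batch elements, concatenated.
    first_idxs: start offset in `packed` of each batch element.
    """
    padded = []
    for n, start in enumerate(first_idxs):
        # Each element ends where the next begins; the last ends with `packed`.
        end = first_idxs[n + 1] if n + 1 < len(first_idxs) else len(packed)
        row = list(packed[start:end])
        padded.append(row + [pad_value] * (max_size - len(row)))
    return padded


rows = packed_to_padded([1.0, 2.0, 3.0, 4.0, 5.0], first_idxs=[0, 3], max_size=3)
```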

Reviewed By: bottler, nikhilaravi, jcjohnson

Differential Revision: D19870575

fbshipit-source-id: 23a2477b73373c411899633386c87ab034c3702a
2020-02-19 10:48:54 -08:00
Jeremy Reizenstein
bdc2bb578c MACOSX_DEPLOYMENT_TARGET=10.14
Summary:
pybind now seems to need C++17 on a Mac, so advise people to set `MACOSX_DEPLOYMENT_TARGET=10.14`. (Also delete an unused variable to silence a warning I got on a Mac build.)

Reported in github issue #68.

Reviewed By: nikhilaravi

Differential Revision: D19970512

fbshipit-source-id: f9be20c8ed425bd6ba8d009a7d62dad658dccdb1
2020-02-19 08:43:50 -08:00
Nikhila Ravi
97acf16de2 lint fixes
Summary: Ran `dev/linter.sh`.

Reviewed By: bottler

Differential Revision: D19761062

fbshipit-source-id: 1a49abe4a5f2bc7641b2b46e254aa77e6a48aa7d
2020-02-13 20:50:48 -08:00
Georgia Gkioxari
29cd181a83 CPU implem for face areas normals
Summary:
Added a CPU implementation for face areas and normals. Moved the test and the benchmark to separate functions.

```
Benchmark                                   Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
FACE_AREAS_NORMALS_2_100_300_False                196             268           2550
FACE_AREAS_NORMALS_2_100_300_True                 106             179           4733
FACE_AREAS_NORMALS_2_100_3000_False              1447            1630            346
FACE_AREAS_NORMALS_2_100_3000_True                107             178           4674
FACE_AREAS_NORMALS_2_1000_300_False               201             309           2486
FACE_AREAS_NORMALS_2_1000_300_True                107             186           4673
FACE_AREAS_NORMALS_2_1000_3000_False             1451            1636            345
FACE_AREAS_NORMALS_2_1000_3000_True               107             186           4655
FACE_AREAS_NORMALS_10_100_300_False               767             918            653
FACE_AREAS_NORMALS_10_100_300_True                106             167           4712
FACE_AREAS_NORMALS_10_100_3000_False             7036            7754             72
FACE_AREAS_NORMALS_10_100_3000_True               113             164           4445
FACE_AREAS_NORMALS_10_1000_300_False              748             947            669
FACE_AREAS_NORMALS_10_1000_300_True               108             169           4638
FACE_AREAS_NORMALS_10_1000_3000_False            7069            7783             71
FACE_AREAS_NORMALS_10_1000_3000_True              108             172           4646
FACE_AREAS_NORMALS_32_100_300_False              2286            2496            219
FACE_AREAS_NORMALS_32_100_300_True                108             180           4631
FACE_AREAS_NORMALS_32_100_3000_False            23184           24369             22
FACE_AREAS_NORMALS_32_100_3000_True               159             213           3147
FACE_AREAS_NORMALS_32_1000_300_False             2414            2645            208
FACE_AREAS_NORMALS_32_1000_300_True               112             197           4480
FACE_AREAS_NORMALS_32_1000_3000_False           21687           22964             24
FACE_AREAS_NORMALS_32_1000_3000_True              141             211           3540
--------------------------------------------------------------------------------

Benchmark                                         Avg Time(μs)      Peak Time(μs) Iterations
--------------------------------------------------------------------------------
FACE_AREAS_NORMALS_TORCH_2_100_300_False               5465            5782             92
FACE_AREAS_NORMALS_TORCH_2_100_300_True                1198            1351            418
FACE_AREAS_NORMALS_TORCH_2_100_3000_False             48228           48869             11
FACE_AREAS_NORMALS_TORCH_2_100_3000_True               1186            1304            422
FACE_AREAS_NORMALS_TORCH_2_1000_300_False              5556            6097             90
FACE_AREAS_NORMALS_TORCH_2_1000_300_True               1200            1328            417
FACE_AREAS_NORMALS_TORCH_2_1000_3000_False            48683           50016             11
FACE_AREAS_NORMALS_TORCH_2_1000_3000_True              1185            1306            422
FACE_AREAS_NORMALS_TORCH_10_100_300_False             24215           25097             21
FACE_AREAS_NORMALS_TORCH_10_100_300_True               1150            1314            435
FACE_AREAS_NORMALS_TORCH_10_100_3000_False           232605          234952              3
FACE_AREAS_NORMALS_TORCH_10_100_3000_True              1193            1314            420
FACE_AREAS_NORMALS_TORCH_10_1000_300_False            24912           25343             21
FACE_AREAS_NORMALS_TORCH_10_1000_300_True              1216            1330            412
FACE_AREAS_NORMALS_TORCH_10_1000_3000_False          239907          241253              3
FACE_AREAS_NORMALS_TORCH_10_1000_3000_True             1226            1333            408
FACE_AREAS_NORMALS_TORCH_32_100_300_False             73991           75776              7
FACE_AREAS_NORMALS_TORCH_32_100_300_True               1193            1339            420
FACE_AREAS_NORMALS_TORCH_32_100_3000_False           728932          728932              1
FACE_AREAS_NORMALS_TORCH_32_100_3000_True              1186            1359            422
FACE_AREAS_NORMALS_TORCH_32_1000_300_False            76385           79129              7
FACE_AREAS_NORMALS_TORCH_32_1000_300_True              1165            1310            430
FACE_AREAS_NORMALS_TORCH_32_1000_3000_False          753276          753276              1
FACE_AREAS_NORMALS_TORCH_32_1000_3000_True             1205            1340            415
--------------------------------------------------------------------------------
```
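
For readers unfamiliar with the operation being benchmarked, here is a minimal pure-Python sketch of computing per-face areas and unit normals for a triangle mesh. It is not the code from this commit: the function name `face_areas_normals`, the list-based inputs, and the zero-normal handling for degenerate faces are assumptions made for illustration only.

```python
import math


def face_areas_normals(verts, faces):
    """Return (areas, normals) for a triangle mesh.

    verts: list of (x, y, z) vertex coordinates.
    faces: list of (i0, i1, i2) vertex indices, one triple per triangle.
    Hypothetical helper for illustration; not the library's implementation.
    """
    areas, normals = [], []
    for i0, i1, i2 in faces:
        v0, v1, v2 = verts[i0], verts[i1], verts[i2]
        # Two edge vectors sharing vertex v0.
        e1 = [v1[k] - v0[k] for k in range(3)]
        e2 = [v2[k] - v0[k] for k in range(3)]
        # Their cross product is perpendicular to the face and has
        # length equal to twice the triangle's area.
        cross = [
            e1[1] * e2[2] - e1[2] * e2[1],
            e1[2] * e2[0] - e1[0] * e2[2],
            e1[0] * e2[1] - e1[1] * e2[0],
        ]
        norm = math.sqrt(sum(c * c for c in cross))
        areas.append(0.5 * norm)
        # Assumption: degenerate (zero-area) faces get a zero normal.
        normals.append([c / norm if norm > 0 else 0.0 for c in cross])
    return areas, normals


# A single right triangle in the z = 0 plane.
example_verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
example_faces = [(0, 1, 2)]
example_areas, example_normals = face_areas_normals(example_verts, example_faces)
```

A per-face loop like this is easy to check by hand, which makes it useful as a reference when testing a faster implementation.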

Reviewed By: bottler, jcjohnson

Differential Revision: D19864385

fbshipit-source-id: 3a87ae41a8e3ab5560febcb94961798f2e09dfb8
2020-02-13 11:42:48 -08:00
Nikhila Ravi
dcb094800f ignore cuda for cpu only installation
Summary:
Added `WITH_CUDA` checks for points/mesh rasterization. Without these checks, a CPU-only installation hits `Undefined symbol` errors when trying to import pytorch3d.

We had these checks for all the other cuda files but not the rasterization files.

Thanks ppwwyyxx for the tip!

Reviewed By: ppwwyyxx, gkioxari

Differential Revision: D19801495

fbshipit-source-id: 20e7adccfdb33ac731c00a89414b2beaf0a35529
2020-02-08 09:14:47 -08:00
Justin Johnson
e290f87ca9 Add CPU implementation for nearest neighbor
Summary:
Adds a CPU implementation for `pytorch3d.ops.nn_points_idx`.

Also renames the associated C++ and CUDA functions to follow the `AllCaps` naming style used in other C++ / CUDA code.
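
As a rough illustration of what a nearest-neighbor index lookup computes, the following is a brute-force pure-Python sketch. It is not the C++ code added in this commit and does not reproduce the signature of `pytorch3d.ops.nn_points_idx`; the function name `nearest_neighbor_idx` and the unbatched list inputs are assumptions for illustration.

```python
def nearest_neighbor_idx(p1, p2):
    """For each point in p1, return the index of its closest point in p2.

    p1, p2: lists of equal-dimension coordinate tuples.
    Hypothetical helper for illustration; brute force, O(len(p1) * len(p2)).
    """
    idx = []
    for p in p1:
        best_j, best_d = -1, float("inf")
        for j, q in enumerate(p2):
            # Squared Euclidean distance; the square root is not needed
            # because it does not change which point is closest.
            d = sum((a - b) ** 2 for a, b in zip(p, q))
            if d < best_d:
                best_j, best_d = j, d
        idx.append(best_j)
    return idx


query_points = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
reference_points = [(4.0, 4.0, 4.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.1)]
nn_idx = nearest_neighbor_idx(query_points, reference_points)
```

With the strict `<` comparison, ties resolve to the lowest index in `p2`; a different tie-breaking rule is one way two implementations can return different indices for the same input.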

Reviewed By: gkioxari

Differential Revision: D19670491

fbshipit-source-id: 1b6409404025bf05e6a93f5d847e35afc9062f05
2020-02-03 10:06:10 -08:00
facebook-github-bot
dbf06b504b Initial commit
fbshipit-source-id: ad58e416e3ceeca85fae0583308968d04e78fe0d
2020-01-23 11:53:46 -08:00