pytorch3d

mirror of https://github.com/facebookresearch/pytorch3d.git synced 2025-10-20 10:08:09 +08:00

Author	SHA1	Message	Date
Nikhila Ravi	3fef506895	Make cuda tensors contiguous in host function and remove contiguous check Summary: Update the cuda kernels to: - remove contiguous checks for the grad tensors and for cpu functions which use accessors - for cuda implementations call `.contiguous()` on all tensors in the host function before invoking the kernel Reviewed By: gkioxari Differential Revision: D21598008 fbshipit-source-id: 9b97bda4582fd4269c8a00999874d4552a1aea2d	2020-05-15 15:00:25 -07:00
Jeremy Reizenstein	728179e848	avoid converting a TensorOptions from float to integer Summary: pytorch is adding checks that mean integer tensors with requires_grad=True need to be avoided. Fix accidentally creating them. Reviewed By: jcjohnson, gkioxari Differential Revision: D21576712 fbshipit-source-id: 008218997986800a36d93caa1a032ee91f2bffcd	2020-05-14 13:16:05 -07:00
Jeremy Reizenstein	8fc28baa27	Looser gradient check in test_rasterize_meshes Summary: This has been failing intermittently Reviewed By: nikhilaravi Differential Revision: D21403157 fbshipit-source-id: 51b74d6c813b52effe72d14b565e250fcabbb463	2020-05-05 09:26:47 -07:00
Michele Sanna	f8acecb6b3	a formula for bin size for images over 64x64 (#90 ) Summary: Signed-off-by: Michele Sanna <sanna@arrival.com> fixes the bin_size calculation with a formula for any image_size > 64. Matches the values chosen so far. simple test: ``` import numpy as np import matplotlib.pyplot as plt image_size = np.arange(64, 2048) bin_size = np.where(image_size <= 64, 8, (2 ** np.maximum(np.ceil(np.log2(image_size)) - 4, 4)).astype(int)) print(image_size) print(bin_size) for ims, bins in zip(image_size, bin_size): if ims <= 64: assert bins == 8 elif ims <= 256: assert bins == 16 elif ims <= 512: assert bins == 32 elif ims <= 1024: assert bins == 64 elif ims <= 2048: assert bins == 128 assert (ims + bins - 1) // bins < 22 plt.plot(image_size, bin_size) plt.grid() plt.show() ``` ![img](https://user-images.githubusercontent.com/54891577/75464693-795bcf00-597f-11ea-9061-26440211691c.png) Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/90 Reviewed By: jcjohnson Differential Revision: D21160372 Pulled By: nikhilaravi fbshipit-source-id: 660cf5832f4ca5be243c435a6bed969596fc0188	2020-04-24 14:56:41 -07:00
Nikhila Ravi	c3d636dc8c	Cuda updates Summary: Updates to: - enable cuda kernel launches on any GPU (not just the default) - cuda and contiguous checks for all kernels - checks to ensure all tensors are on the same device - error reporting in the cuda kernels - cuda tests now run on a random device not just the default Reviewed By: jcjohnson, gkioxari Differential Revision: D21215280 fbshipit-source-id: 1bedc9fe6c35e9e920bdc4d78ed12865b1005519	2020-04-24 09:11:04 -07:00
Jeremy Reizenstein	85c396f822	avoid using torch/extension.h in cuda Summary: Use aten instead of torch interface in all cuda code. This allows the cuda build to work with pytorch 1.5 with GCC 5 (e.g. the compiler of ubuntu 16.04LTS). This wasn't working. It has been failing with errors like the below, perhaps due to a bug in nvcc. ``` torch/include/torch/csrc/api/include/torch/nn/cloneable.h:68:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> > ``` Reviewed By: nikhilaravi Differential Revision: D21204029 fbshipit-source-id: ca6bdbcecf42493365e1c23a33fe35e1759fe8b6	2020-04-23 10:26:17 -07:00
Justin Johnson	9f31a4fd46	Expose knn_check_version in python Summary: We have multiple KNN CUDA implementations. From python, users can currently request a particular implementation via the `version` flag, but they have no way of knowing which implementations can be used for a given problem. This diff exposes a function `pytorch3d._C.knn_check_version(version, D, K)` that returns whether a particular version can be used. Reviewed By: nikhilaravi Differential Revision: D21162573 fbshipit-source-id: 6061960bdcecba454fd920b00036f4e9ff3fdbc0	2020-04-22 14:30:52 -07:00
Nikhila Ravi	4bf30593ff	back face culling in rasterization Summary: Added backface culling as an option to the `raster_settings`. This is needed for the full forward rendering of shapenet meshes with texture (some meshes contain multiple overlapping segments which have different textures). For a triangle (v0, v1, v2) define the vectors A = (v1 - v0) and B = (v2 − v0) and use this to calculate the area of the triangle as: ``` area = 0.5 * A x B area = 0.5 * ((x1 − x0)(y2 − y0) − (x2 − x0)(y1 − y0)) ``` The area will be positive if (v0, v1, v2) are oriented counterclockwise (a front face), and negative if (v0, v1, v2) are oriented clockwise (a back face). We can reuse the `edge_function` as it already calculates the triangle area. Reviewed By: jcjohnson Differential Revision: D20960115 fbshipit-source-id: 2d8a4b9ccfb653df18e79aed8d05c7ec0f057ab1	2020-04-22 08:22:46 -07:00
Nikhila Ravi	9ef1ee8455	coarse rasterization bug fix Summary: Fix a bug which resulted in a rendering artifacts if the image size was not a multiple of 16. Fix: Revert coarse rasterization to original implementation and only update fine rasterization to reverse the ordering of Y and X axis. This is much simpler than the previous approach! Additional changes: - updated mesh rendering end-end tests to check outputs from both naive and coarse to fine rasterization. - added pointcloud rendering end-end tests Reviewed By: gkioxari Differential Revision: D21102725 fbshipit-source-id: 2e7e1b013dd6dd12b3a00b79eb8167deddb2e89a	2020-04-20 14:54:16 -07:00
Jeremy Reizenstein	6207c359b1	spelling and flake Summary: mostly recent lintish things Reviewed By: nikhilaravi Differential Revision: D21089003 fbshipit-source-id: 028733c1d875268f1879e4481da475b7100ba0b6	2020-04-17 10:50:22 -07:00
Jeremy Reizenstein	9397cd872d	torch C API warnings Summary: This is mostly replacing the old PackedTensorAccessor with the new PackedTensorAccessor64. Reviewed By: gkioxari Differential Revision: D21088773 fbshipit-source-id: 5973e5a29d934eafb7c70ec5ec154ca076b64d27	2020-04-17 10:46:31 -07:00
Jeremy Reizenstein	e19df58766	remove final nearest_neighbor files Summary: A couple of files for the removed nearest_neighbor functionality are left behind. Reviewed By: nikhilaravi Differential Revision: D21088624 fbshipit-source-id: 4bb29016b4e5f63102765b384c363733b60032fa	2020-04-17 09:27:17 -07:00
Nikhila Ravi	3794f6753f	remove nearest_neighbors Summary: knn is more general and faster than the nearest_neighbor code, so remove the latter. Reviewed By: gkioxari Differential Revision: D20816424 fbshipit-source-id: 75d6c44d17180752d0c9859814bbdf7892558158	2020-04-15 20:51:41 -07:00
Georgia Gkioxari	b2b0c5a442	knn autograd Summary: Adds knn backward to return `grad_pts1` and `grad_pts2`. Adds `knn_gather` to return the nearest neighbors in pts2. The BM tests include backward pass and are ran on an M40. ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- KNN_SQUARE_32_256_128_3_24_cpu 39558 43485 13 KNN_SQUARE_32_256_128_3_24_cuda:0 1080 1404 463 KNN_SQUARE_32_256_512_3_24_cpu 81950 85781 7 KNN_SQUARE_32_256_512_3_24_cuda:0 1519 1641 330 -------------------------------------------------------------------------------- Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- KNN_RAGGED_32_256_128_3_24_cpu 13798 14650 37 KNN_RAGGED_32_256_128_3_24_cuda:0 1576 1713 318 KNN_RAGGED_32_256_512_3_24_cpu 31255 32210 16 KNN_RAGGED_32_256_512_3_24_cuda:0 2024 2162 248 -------------------------------------------------------------------------------- ``` Reviewed By: jcjohnson Differential Revision: D20945556 fbshipit-source-id: a16f616029c6b5f8c2afceb5f2bc12c5c20d2f3c	2020-04-14 17:22:56 -07:00
Georgia Gkioxari	487d4d6607	point mesh distances Summary: Implementation of point to mesh distances. The current diff contains two types: (a) Point to Edge (b) Point to Face ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- POINT_MESH_EDGE_4_100_300_5000_cuda:0 2745 3138 183 POINT_MESH_EDGE_4_100_300_10000_cuda:0 4408 4499 114 POINT_MESH_EDGE_4_100_3000_5000_cuda:0 4978 5070 101 POINT_MESH_EDGE_4_100_3000_10000_cuda:0 9076 9187 56 POINT_MESH_EDGE_4_1000_300_5000_cuda:0 1411 1487 355 POINT_MESH_EDGE_4_1000_300_10000_cuda:0 4829 5030 104 POINT_MESH_EDGE_4_1000_3000_5000_cuda:0 7539 7620 67 POINT_MESH_EDGE_4_1000_3000_10000_cuda:0 12088 12272 42 POINT_MESH_EDGE_8_100_300_5000_cuda:0 3106 3222 161 POINT_MESH_EDGE_8_100_300_10000_cuda:0 8561 8648 59 POINT_MESH_EDGE_8_100_3000_5000_cuda:0 6932 7021 73 POINT_MESH_EDGE_8_100_3000_10000_cuda:0 24032 24176 21 POINT_MESH_EDGE_8_1000_300_5000_cuda:0 5272 5399 95 POINT_MESH_EDGE_8_1000_300_10000_cuda:0 11348 11430 45 POINT_MESH_EDGE_8_1000_3000_5000_cuda:0 17478 17683 29 POINT_MESH_EDGE_8_1000_3000_10000_cuda:0 25961 26236 20 POINT_MESH_EDGE_16_100_300_5000_cuda:0 8244 8323 61 POINT_MESH_EDGE_16_100_300_10000_cuda:0 18018 18071 28 POINT_MESH_EDGE_16_100_3000_5000_cuda:0 19428 19544 26 POINT_MESH_EDGE_16_100_3000_10000_cuda:0 44967 45135 12 POINT_MESH_EDGE_16_1000_300_5000_cuda:0 7825 7937 64 POINT_MESH_EDGE_16_1000_300_10000_cuda:0 18504 18571 28 POINT_MESH_EDGE_16_1000_3000_5000_cuda:0 65805 66132 8 POINT_MESH_EDGE_16_1000_3000_10000_cuda:0 90885 91089 6 -------------------------------------------------------------------------------- Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- POINT_MESH_FACE_4_100_300_5000_cuda:0 1561 1685 321 POINT_MESH_FACE_4_100_300_10000_cuda:0 2818 2954 178 POINT_MESH_FACE_4_100_3000_5000_cuda:0 15893 16018 32 POINT_MESH_FACE_4_100_3000_10000_cuda:0 16350 16439 31 POINT_MESH_FACE_4_1000_300_5000_cuda:0 3179 3278 158 POINT_MESH_FACE_4_1000_300_10000_cuda:0 2353 2436 213 POINT_MESH_FACE_4_1000_3000_5000_cuda:0 16262 16336 31 POINT_MESH_FACE_4_1000_3000_10000_cuda:0 9334 9448 54 POINT_MESH_FACE_8_100_300_5000_cuda:0 4377 4493 115 POINT_MESH_FACE_8_100_300_10000_cuda:0 9728 9822 52 POINT_MESH_FACE_8_100_3000_5000_cuda:0 26428 26544 19 POINT_MESH_FACE_8_100_3000_10000_cuda:0 42238 43031 12 POINT_MESH_FACE_8_1000_300_5000_cuda:0 3891 3982 129 POINT_MESH_FACE_8_1000_300_10000_cuda:0 5363 5429 94 POINT_MESH_FACE_8_1000_3000_5000_cuda:0 20998 21084 24 POINT_MESH_FACE_8_1000_3000_10000_cuda:0 39711 39897 13 POINT_MESH_FACE_16_100_300_5000_cuda:0 5955 6001 84 POINT_MESH_FACE_16_100_300_10000_cuda:0 12082 12144 42 POINT_MESH_FACE_16_100_3000_5000_cuda:0 44996 45176 12 POINT_MESH_FACE_16_100_3000_10000_cuda:0 73042 73197 7 POINT_MESH_FACE_16_1000_300_5000_cuda:0 8292 8374 61 POINT_MESH_FACE_16_1000_300_10000_cuda:0 19442 19506 26 POINT_MESH_FACE_16_1000_3000_5000_cuda:0 36059 36194 14 POINT_MESH_FACE_16_1000_3000_10000_cuda:0 64644 64822 8 -------------------------------------------------------------------------------- ``` Reviewed By: jcjohnson Differential Revision: D20590462 fbshipit-source-id: 42a39837b514a546ac9471bfaff60eefe7fae829	2020-04-11 00:21:24 -07:00
Jeremy Reizenstein	01b5f7b228	heterogenous KNN Summary: Interface and working implementation of ragged KNN. Benchmarks (which aren't ragged) haven't slowed. New benchmark shows that ragged is faster than non-ragged of the same shape. Reviewed By: jcjohnson Differential Revision: D20696507 fbshipit-source-id: 21b80f71343a3475c8d3ee0ce2680f92f0fae4de	2020-04-07 01:47:37 -07:00
Jeremy Reizenstein	37c5c8e0b6	Linter, deprecated type() Summary: Run linter after recent changes. Fix long comment in knn.h which clang-format has reflowed badly. Add crude test that code doesn't call deprecated `.type()` or `.data()`. Reviewed By: nikhilaravi Differential Revision: D20692935 fbshipit-source-id: 28ce0308adae79a870cb41a810b7cf8744f41ab8	2020-03-29 14:02:58 -07:00
Justin Johnson	870290df34	Implement K-Nearest Neighbors Summary: Implements K-Nearest Neighbors with C++ and CUDA versions. KNN in CUDA is highly nontrivial. I've implemented a few different versions of the kernel, and we heuristically dispatch to different kernels based on the problem size. Some of the kernels rely on template specialization on either D or K, so we use template metaprogramming to compile specialized versions for ranges of D and K. These kernels are up to 3x faster than our existing 1-nearest-neighbor kernels, so we should also consider swapping out `nn_points_idx` to use these kernels in the backend. I've been working mostly on the CUDA kernels, and haven't converged on the correct Python API. I still want to benchmark against FAISS to see how far away we are from their performance. Reviewed By: bottler Differential Revision: D19729286 fbshipit-source-id: 608ffbb7030c21fe4008f330522f4890f0c3c21a	2020-03-26 13:40:26 -07:00
Jeremy Reizenstein	81a4aa18ad	type() deprecated Summary: Replace `tensor.type().is_cuda()` with the preferred `tensor.is_cuda()`. Replace `AT_DISPATCH_FLOATING_TYPES(tensor.type(), ...` with `AT_DISPATCH_FLOATING_TYPES(tensor.scalar_type(), ...`. These avoid deprecation warnings in future pytorch. Reviewed By: nikhilaravi Differential Revision: D20646565 fbshipit-source-id: 1a0c15978c871af816b1dd7d4a7ea78242abd95e	2020-03-26 04:01:41 -07:00
Jeremy Reizenstein	e22d431e5b	data() deprecated Summary: replace `data()` with preferred `data_ptr()`, avoiding some deprecation warnings in future pytorch. Reviewed By: nikhilaravi Differential Revision: D20645738 fbshipit-source-id: 8f6e02d292729b804fa2a66f94dd0517bbaf7887	2020-03-26 03:21:48 -07:00
Jeremy Reizenstein	8fa7678614	fix CPU-only hiding of cuda calls Summary: CPU-only builds should be fixed by this change Reviewed By: nikhilaravi Differential Revision: D20598014 fbshipit-source-id: df098ec4c6c93d38515172805fe57cac7463c506	2020-03-24 05:04:32 -07:00
Olivia	53599770dd	Accumulate points (#4 ) Summary: Code for accumulating points in the z-buffer in three ways: 1. weighted sum 2. normalised weighted sum 3. alpha compositing Pull Request resolved: https://github.com/fairinternal/pytorch3d/pull/4 Reviewed By: nikhilaravi Differential Revision: D20522422 Pulled By: gkioxari fbshipit-source-id: 5023baa05f15e338f3821ef08f5552c2dcbfc06c	2020-03-19 11:23:12 -07:00
Jeremy Reizenstein	2361845548	squared distance in comments Summary: Comments were describing squared distance as absolute distance in a few places. Reviewed By: nikhilaravi Differential Revision: D20426020 fbshipit-source-id: 009946867c4a98f61f5ce7158542d41e22bf8346	2020-03-13 04:35:25 -07:00
Nikhila Ravi	d01e722849	Fix coordinate system conventions in point cloud renderer Summary: Applying the changes added for mesh rasterization to ensure that +Y is up and +X is left so that the coordinate system is right handed. Also updated the diagram in the docs to indicate that (0,0) is in the top left hand corner. Reviewed By: gkioxari Differential Revision: D20394849 fbshipit-source-id: cfb7c79090eb1f55ad38b92327a74a70a8dc541e	2020-03-12 07:48:29 -07:00
Nikhila Ravi	32ad869dea	Update point cloud rasterizer to support heterogeneous point clouds Summary: Update the point cloud rasterizer to: - use the pointcloud datastructure (rebased on top of D19791851.) - support rasterization of heterogeneous point clouds in the same way as with Meshes. The main changes to the API will be as follows: - The input to `rasterize_points` will be a `Pointclouds` object instead of a tensor. This will be easy to update e.g. ``` points = torch.randn(N, P, 3) idx2, zbuf2, dists2 = rasterize_points(points, image_size, radius, points_per_pixel) points = torch.randn(N, P, 3) pointclouds = Pointclouds(points=points) idx2, zbuf2, dists2 = rasterize_points(pointclouds, image_size, radius, points_per_pixel) ``` - The indices output from rasterization will now refer to points in `poinclouds.points_packed()`. This may require some changes to the functions which consume the outputs of rasterization if they were previously assuming that the indices ranged from 0 to P where P is the number of points in each pointcloud. Making this change now so that Olivia can update her PR accordingly. Reviewed By: gkioxari Differential Revision: D20088651 fbshipit-source-id: 833ed659909712bcbbb6a50e2ec0189839f0413a	2020-03-12 07:48:29 -07:00
Nikhila Ravi	15c72be444	Fix coordinate system conventions in renderer Summary: ## Updates - Defined the world and camera coordinates according to this figure. The world coordinates are defined as having +Y up, +X left and +Z in. {F230888499} - Removed all flipping from blending functions. - Updated the rasterizer to return images with +Y up and +X left. - Updated all the mesh rasterizer tests - The expected values are now defined in terms of the default +Y up, +X left - Added tests where the triangles in the meshes are non symmetrical so that it is clear which direction +X and +Y are ## Questions: - Should we have scene settings instead of raster settings? - To be more correct we should be [z clipping in the rasterizer based on the far/near clipping planes](https://github.com/ShichenLiu/SoftRas/blob/master/soft_renderer/cuda/soft_rasterize_cuda_kernel.cu#L400) - these values are also required in the blending functions so should we make these scene level parameters and have a scene settings tuple which is available to the rasterizer and shader? Reviewed By: gkioxari Differential Revision: D20208604 fbshipit-source-id: 55787301b1bffa0afa9618f0a0886cc681da51f3	2020-03-06 06:51:05 -08:00
takiyu	f358b9b14d	Fix squared distance for CPU impl. (#83 ) Summary: `PointLineDistanceForward()` should return squared distance. However, it seems that it returned non-squared distance when `v0` was near by `v1` in CPU implementation. Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/83 Reviewed By: bottler Differential Revision: D20097181 Pulled By: nikhilaravi fbshipit-source-id: 7ea851c0837ab89364e42d283c999df21ff5ff02	2020-02-25 14:00:00 -08:00
merayxu	9e21659fc5	Fixed windows MSVC build compatibility (#9 ) Summary: Fixed a few MSVC compiler (visual studio 2019, MSVC 19.16.27034) compatibility issues 1. Replaced long with int64_t. aten::data_ptr\<long\> is not supported in MSVC 2. pytorch3d/csrc/rasterize_points/rasterize_points_cpu.cpp, inline function is not correctly recognized by MSVC. 3. pytorch3d/csrc/rasterize_meshes/geometry_utils.cuh const auto kEpsilon = 1e-30; MSVC does not compile this const into both host and device, change to a MACRO. 4. pytorch3d/csrc/rasterize_meshes/geometry_utils.cuh, const float area2 = pow(area, 2.0); 2.0 is considered as double by MSVC and raised an error 5. pytorch3d/csrc/rasterize_points/rasterize_points_cpu.cpp std::tuple<torch::Tensor, torch::Tensor> RasterizePointsCoarseCpu() return type does not match the declaration in rasterize_points_cpu.h. Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/9 Reviewed By: nikhilaravi Differential Revision: D19986567 Pulled By: yuanluxu fbshipit-source-id: f4d98525d088c99c513b85193db6f0fc69c7f017	2020-02-20 18:43:19 -08:00
Georgia Gkioxari	a3baa367e3	face areas backward Summary: Added backward for mesh face areas & normals. Exposed it as a layer. Replaced the computation with the new op in Meshes and in Sample Points. Current issue: Circular imports. I moved the import of the op in meshes inside the function scope. Reviewed By: jcjohnson Differential Revision: D19920082 fbshipit-source-id: d213226d5e1d19a0c8452f4d32771d07e8b91c0a	2020-02-20 11:11:33 -08:00
Georgia Gkioxari	60f3c4e7d2	cpp support for packed to padded Summary: Cpu implementation for packed to padded and added gradients ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- PACKED_TO_PADDED_2_100_300_1_cpu 138 221 3625 PACKED_TO_PADDED_2_100_300_1_cuda:0 184 261 2716 PACKED_TO_PADDED_2_100_300_16_cpu 555 726 901 PACKED_TO_PADDED_2_100_300_16_cuda:0 179 260 2794 PACKED_TO_PADDED_2_100_3000_1_cpu 396 519 1262 PACKED_TO_PADDED_2_100_3000_1_cuda:0 181 274 2764 PACKED_TO_PADDED_2_100_3000_16_cpu 4517 5003 111 PACKED_TO_PADDED_2_100_3000_16_cuda:0 224 397 2235 PACKED_TO_PADDED_2_1000_300_1_cpu 138 212 3616 PACKED_TO_PADDED_2_1000_300_1_cuda:0 180 282 2775 PACKED_TO_PADDED_2_1000_300_16_cpu 565 711 885 PACKED_TO_PADDED_2_1000_300_16_cuda:0 179 264 2797 PACKED_TO_PADDED_2_1000_3000_1_cpu 389 494 1287 PACKED_TO_PADDED_2_1000_3000_1_cuda:0 180 271 2777 PACKED_TO_PADDED_2_1000_3000_16_cpu 4522 5170 111 PACKED_TO_PADDED_2_1000_3000_16_cuda:0 216 286 2313 PACKED_TO_PADDED_10_100_300_1_cpu 251 345 1995 PACKED_TO_PADDED_10_100_300_1_cuda:0 178 262 2806 PACKED_TO_PADDED_10_100_300_16_cpu 2354 2750 213 PACKED_TO_PADDED_10_100_300_16_cuda:0 178 291 2814 PACKED_TO_PADDED_10_100_3000_1_cpu 1519 1786 330 PACKED_TO_PADDED_10_100_3000_1_cuda:0 179 237 2791 PACKED_TO_PADDED_10_100_3000_16_cpu 24705 25879 21 PACKED_TO_PADDED_10_100_3000_16_cuda:0 228 316 2191 PACKED_TO_PADDED_10_1000_300_1_cpu 261 432 1919 PACKED_TO_PADDED_10_1000_300_1_cuda:0 181 261 2756 PACKED_TO_PADDED_10_1000_300_16_cpu 2349 2770 213 PACKED_TO_PADDED_10_1000_300_16_cuda:0 180 256 2782 PACKED_TO_PADDED_10_1000_3000_1_cpu 1613 1929 310 PACKED_TO_PADDED_10_1000_3000_1_cuda:0 183 253 2739 PACKED_TO_PADDED_10_1000_3000_16_cpu 22041 23653 23 PACKED_TO_PADDED_10_1000_3000_16_cuda:0 220 343 2270 PACKED_TO_PADDED_32_100_300_1_cpu 555 750 901 PACKED_TO_PADDED_32_100_300_1_cuda:0 188 282 2661 PACKED_TO_PADDED_32_100_300_16_cpu 7550 8131 67 PACKED_TO_PADDED_32_100_300_16_cuda:0 181 272 2770 PACKED_TO_PADDED_32_100_3000_1_cpu 4574 6327 110 PACKED_TO_PADDED_32_100_3000_1_cuda:0 173 254 2884 PACKED_TO_PADDED_32_100_3000_16_cpu 70366 72563 8 PACKED_TO_PADDED_32_100_3000_16_cuda:0 349 654 1433 PACKED_TO_PADDED_32_1000_300_1_cpu 612 728 818 PACKED_TO_PADDED_32_1000_300_1_cuda:0 189 295 2647 PACKED_TO_PADDED_32_1000_300_16_cpu 7699 8254 65 PACKED_TO_PADDED_32_1000_300_16_cuda:0 189 311 2646 PACKED_TO_PADDED_32_1000_3000_1_cpu 5105 5261 98 PACKED_TO_PADDED_32_1000_3000_1_cuda:0 191 260 2625 PACKED_TO_PADDED_32_1000_3000_16_cpu 87073 92708 6 PACKED_TO_PADDED_32_1000_3000_16_cuda:0 344 425 1455 -------------------------------------------------------------------------------- Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- PACKED_TO_PADDED_TORCH_2_100_300_1_cpu 492 627 1016 PACKED_TO_PADDED_TORCH_2_100_300_1_cuda:0 768 975 652 PACKED_TO_PADDED_TORCH_2_100_300_16_cpu 659 804 760 PACKED_TO_PADDED_TORCH_2_100_300_16_cuda:0 781 918 641 PACKED_TO_PADDED_TORCH_2_100_3000_1_cpu 624 734 802 PACKED_TO_PADDED_TORCH_2_100_3000_1_cuda:0 778 929 643 PACKED_TO_PADDED_TORCH_2_100_3000_16_cpu 2609 2850 192 PACKED_TO_PADDED_TORCH_2_100_3000_16_cuda:0 758 901 660 PACKED_TO_PADDED_TORCH_2_1000_300_1_cpu 467 612 1072 PACKED_TO_PADDED_TORCH_2_1000_300_1_cuda:0 772 905 648 PACKED_TO_PADDED_TORCH_2_1000_300_16_cpu 689 839 726 PACKED_TO_PADDED_TORCH_2_1000_300_16_cuda:0 789 1143 635 PACKED_TO_PADDED_TORCH_2_1000_3000_1_cpu 629 735 795 PACKED_TO_PADDED_TORCH_2_1000_3000_1_cuda:0 812 916 616 PACKED_TO_PADDED_TORCH_2_1000_3000_16_cpu 2716 3117 185 PACKED_TO_PADDED_TORCH_2_1000_3000_16_cuda:0 844 1288 593 PACKED_TO_PADDED_TORCH_10_100_300_1_cpu 2387 2557 210 PACKED_TO_PADDED_TORCH_10_100_300_1_cuda:0 4112 4993 122 PACKED_TO_PADDED_TORCH_10_100_300_16_cpu 3385 4254 148 PACKED_TO_PADDED_TORCH_10_100_300_16_cuda:0 3959 4902 127 PACKED_TO_PADDED_TORCH_10_100_3000_1_cpu 2918 3105 172 PACKED_TO_PADDED_TORCH_10_100_3000_1_cuda:0 4054 4450 124 PACKED_TO_PADDED_TORCH_10_100_3000_16_cpu 12748 13623 40 PACKED_TO_PADDED_TORCH_10_100_3000_16_cuda:0 4023 4395 125 PACKED_TO_PADDED_TORCH_10_1000_300_1_cpu 2258 2492 222 PACKED_TO_PADDED_TORCH_10_1000_300_1_cuda:0 3997 4312 126 PACKED_TO_PADDED_TORCH_10_1000_300_16_cpu 3404 3597 147 PACKED_TO_PADDED_TORCH_10_1000_300_16_cuda:0 3877 4227 129 PACKED_TO_PADDED_TORCH_10_1000_3000_1_cpu 2789 3054 180 PACKED_TO_PADDED_TORCH_10_1000_3000_1_cuda:0 3821 4402 131 PACKED_TO_PADDED_TORCH_10_1000_3000_16_cpu 11967 12963 42 PACKED_TO_PADDED_TORCH_10_1000_3000_16_cuda:0 3729 4290 135 PACKED_TO_PADDED_TORCH_32_100_300_1_cpu 6933 8152 73 PACKED_TO_PADDED_TORCH_32_100_300_1_cuda:0 11856 12287 43 PACKED_TO_PADDED_TORCH_32_100_300_16_cpu 9895 11205 51 PACKED_TO_PADDED_TORCH_32_100_300_16_cuda:0 12354 13596 41 PACKED_TO_PADDED_TORCH_32_100_3000_1_cpu 9516 10128 53 PACKED_TO_PADDED_TORCH_32_100_3000_1_cuda:0 12917 13597 39 PACKED_TO_PADDED_TORCH_32_100_3000_16_cpu 41209 43783 13 PACKED_TO_PADDED_TORCH_32_100_3000_16_cuda:0 12210 13288 41 PACKED_TO_PADDED_TORCH_32_1000_300_1_cpu 7179 7689 70 PACKED_TO_PADDED_TORCH_32_1000_300_1_cuda:0 11896 12381 43 PACKED_TO_PADDED_TORCH_32_1000_300_16_cpu 10127 15494 50 PACKED_TO_PADDED_TORCH_32_1000_300_16_cuda:0 12034 12817 42 PACKED_TO_PADDED_TORCH_32_1000_3000_1_cpu 8743 10251 58 PACKED_TO_PADDED_TORCH_32_1000_3000_1_cuda:0 12023 12908 42 PACKED_TO_PADDED_TORCH_32_1000_3000_16_cpu 39071 41777 13 PACKED_TO_PADDED_TORCH_32_1000_3000_16_cuda:0 11999 13690 42 -------------------------------------------------------------------------------- ``` Reviewed By: bottler, nikhilaravi, jcjohnson Differential Revision: D19870575 fbshipit-source-id: 23a2477b73373c411899633386c87ab034c3702a	2020-02-19 10:48:54 -08:00
Jeremy Reizenstein	bdc2bb578c	MACOSX_DEPLOYMENT_TARGET=10.14 Summary: pybind now seems to need C++17 on a mac, so advise people to use it. (Also delete an unused variable to silence a warning I got on a mac build.) Reported in github issue #68. Reviewed By: nikhilaravi Differential Revision: D19970512 fbshipit-source-id: f9be20c8ed425bd6ba8d009a7d62dad658dccdb1	2020-02-19 08:43:50 -08:00
Nikhila Ravi	97acf16de2	lint fixes Summary: Ran `dev/linter.sh`. Reviewed By: bottler Differential Revision: D19761062 fbshipit-source-id: 1a49abe4a5f2bc7641b2b46e254aa77e6a48aa7d	2020-02-13 20:50:48 -08:00
Georgia Gkioxari	29cd181a83	CPU implem for face areas normals Summary: Added cpu implementation for face areas normals. Moved test and bm to separate functions. ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- FACE_AREAS_NORMALS_2_100_300_False 196 268 2550 FACE_AREAS_NORMALS_2_100_300_True 106 179 4733 FACE_AREAS_NORMALS_2_100_3000_False 1447 1630 346 FACE_AREAS_NORMALS_2_100_3000_True 107 178 4674 FACE_AREAS_NORMALS_2_1000_300_False 201 309 2486 FACE_AREAS_NORMALS_2_1000_300_True 107 186 4673 FACE_AREAS_NORMALS_2_1000_3000_False 1451 1636 345 FACE_AREAS_NORMALS_2_1000_3000_True 107 186 4655 FACE_AREAS_NORMALS_10_100_300_False 767 918 653 FACE_AREAS_NORMALS_10_100_300_True 106 167 4712 FACE_AREAS_NORMALS_10_100_3000_False 7036 7754 72 FACE_AREAS_NORMALS_10_100_3000_True 113 164 4445 FACE_AREAS_NORMALS_10_1000_300_False 748 947 669 FACE_AREAS_NORMALS_10_1000_300_True 108 169 4638 FACE_AREAS_NORMALS_10_1000_3000_False 7069 7783 71 FACE_AREAS_NORMALS_10_1000_3000_True 108 172 4646 FACE_AREAS_NORMALS_32_100_300_False 2286 2496 219 FACE_AREAS_NORMALS_32_100_300_True 108 180 4631 FACE_AREAS_NORMALS_32_100_3000_False 23184 24369 22 FACE_AREAS_NORMALS_32_100_3000_True 159 213 3147 FACE_AREAS_NORMALS_32_1000_300_False 2414 2645 208 FACE_AREAS_NORMALS_32_1000_300_True 112 197 4480 FACE_AREAS_NORMALS_32_1000_3000_False 21687 22964 24 FACE_AREAS_NORMALS_32_1000_3000_True 141 211 3540 -------------------------------------------------------------------------------- Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- FACE_AREAS_NORMALS_TORCH_2_100_300_False 5465 5782 92 FACE_AREAS_NORMALS_TORCH_2_100_300_True 1198 1351 418 FACE_AREAS_NORMALS_TORCH_2_100_3000_False 48228 48869 11 FACE_AREAS_NORMALS_TORCH_2_100_3000_True 1186 1304 422 FACE_AREAS_NORMALS_TORCH_2_1000_300_False 5556 6097 90 FACE_AREAS_NORMALS_TORCH_2_1000_300_True 1200 1328 417 FACE_AREAS_NORMALS_TORCH_2_1000_3000_False 48683 50016 11 FACE_AREAS_NORMALS_TORCH_2_1000_3000_True 1185 1306 422 FACE_AREAS_NORMALS_TORCH_10_100_300_False 24215 25097 21 FACE_AREAS_NORMALS_TORCH_10_100_300_True 1150 1314 435 FACE_AREAS_NORMALS_TORCH_10_100_3000_False 232605 234952 3 FACE_AREAS_NORMALS_TORCH_10_100_3000_True 1193 1314 420 FACE_AREAS_NORMALS_TORCH_10_1000_300_False 24912 25343 21 FACE_AREAS_NORMALS_TORCH_10_1000_300_True 1216 1330 412 FACE_AREAS_NORMALS_TORCH_10_1000_3000_False 239907 241253 3 FACE_AREAS_NORMALS_TORCH_10_1000_3000_True 1226 1333 408 FACE_AREAS_NORMALS_TORCH_32_100_300_False 73991 75776 7 FACE_AREAS_NORMALS_TORCH_32_100_300_True 1193 1339 420 FACE_AREAS_NORMALS_TORCH_32_100_3000_False 728932 728932 1 FACE_AREAS_NORMALS_TORCH_32_100_3000_True 1186 1359 422 FACE_AREAS_NORMALS_TORCH_32_1000_300_False 76385 79129 7 FACE_AREAS_NORMALS_TORCH_32_1000_300_True 1165 1310 430 FACE_AREAS_NORMALS_TORCH_32_1000_3000_False 753276 753276 1 FACE_AREAS_NORMALS_TORCH_32_1000_3000_True 1205 1340 415 -------------------------------------------------------------------------------- ``` Reviewed By: bottler, jcjohnson Differential Revision: D19864385 fbshipit-source-id: 3a87ae41a8e3ab5560febcb94961798f2e09dfb8	2020-02-13 11:42:48 -08:00
Nikhila Ravi	dcb094800f	ignore cuda for cpu only installation Summary: Added if `WITH_CUDA` checks for points/mesh rasterization. If installing on cpu only then this causes `Undefined symbol` errors when trying to import pytorch3d. We had these checks for all the other cuda files but not the rasterization files. Thanks ppwwyyxx for the tip! Reviewed By: ppwwyyxx, gkioxari Differential Revision: D19801495 fbshipit-source-id: 20e7adccfdb33ac731c00a89414b2beaf0a35529	2020-02-08 09:14:47 -08:00
Justin Johnson	e290f87ca9	Add CPU implementation for nearest neighbor Summary: Adds a CPU implementation for `pytorch3d.ops.nn_points_idx`. Also renames the associated C++ and CUDA functions to use `AllCaps` names used in other C++ / CUDA code. Reviewed By: gkioxari Differential Revision: D19670491 fbshipit-source-id: 1b6409404025bf05e6a93f5d847e35afc9062f05	2020-02-03 10:06:10 -08:00
facebook-github-bot	dbf06b504b	Initial commit fbshipit-source-id: ad58e416e3ceeca85fae0583308968d04e78fe0d	2020-01-23 11:53:46 -08:00

36 Commits