pytorch3d

mirror of https://github.com/facebookresearch/pytorch3d.git synced 2026-06-17 12:38:53 +08:00

Author	SHA1	Message	Date
Jeff Daily	b73d735ecf	Port pytorch3d (#2039 ) Summary: Enables building pytorch3d's `_C` extension against a ROCm-built PyTorch and running the test suite on AMD GPUs, including the pulsar subrenderer. Verified on AMD Instinct MI250X (gfx90a, warpSize=64), HIP 7.2, PyTorch 2.13. ## Mechanics `torch.utils.cpp_extension.BuildExtension` auto-hipifies `.cu` sources of a `CUDAExtension` against a HIP-built torch (`cuda_runtime.h → hip/hip_runtime.h`, `cub:: → hipcub::`, `cudaStream_t → hipStream_t`, etc.), so most of the lift is build-system glue and a small number of CUDA intrinsics that don't have HIP equivalents. - `setup.py`: detect ROCm via `torch.version.hip is not None`; treat `ROCM_HOME` as the GPU-toolkit-root analogue of `CUDA_HOME` (without this, `CUDA_HOME is None` silently demoted the build to a CPU-only `CppExtension`); skip `CUB_HOME`, CUDA-13 visibility flags, and `-ccbin=` on ROCm. - `pytorch3d/csrc/pulsar/gpu/commands.h`: CUDA's `_rn`-suffixed FP rounding intrinsics (`__fadd_rn`, `__fdiv_rn`, `__fsqrt_rn`, `__fmaf_rn`, `__frcp_rn`) and `__saturatef` have no HIP equivalents — AMD's GPU ISA has no instruction-level rounding-mode override, so they expand to plain operators / `sqrtf` / `fmaf` / `1.0f/x` / `fmaxf(0,fminf(1,x))` on the `USE_ROCM` arm, which are rounding-mode-equivalent (both round-to-nearest-even). The HIP compiler may fuse `a+b*c` into a single-rounding FMA where CUDA's `_rn` would have prevented it; if FMA-fusion drift ever becomes a numerical issue, add `-ffp-contract=off` to pulsar's HIPCC flags. `__powf` is replaced with `powf`. `atomicAdd_block` has no HIP function-name equivalent — the semantic equivalent is `__hip_atomic_fetch_add(ptr, val, __ATOMIC_RELAXED, __HIP_MEMORY_SCOPE_WORKGROUP)` (plain HIP `atomicAdd` is device-scope, strictly stronger than block-scope and forces L2-coherent atomics). - `tests/test_point_mesh_distance.py`: loosen `grad_faces` tolerance in `test_point_face_distance` from `5e-7` to `5e-6` to match the sibling `test_face_point_distance`. The backward kernel uses `atomicAdd` and calls `alertNotDeterministic`; FP add order varies by wavefront width. - The X_t / camera-R/T equality checks in `test_points_alignment.py` and `test_cameras_alignment.py` are now skipped when `n_points <= dim` (resp. `batch_size <= 3` for camera-center alignment in 3D). Mean-centering renders the SVD rank-deficient in those cases, so the rotation around the degenerate axis is non-unique and different BLAS implementations (rocBLAS RDNA vs CDNA, cuBLAS) pick different valid null-space directions. The center-alignment check still runs and verifies the well-defined part of the transformation. Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/2039 Test Plan: All GPU tests pass on both AMD Instinct MI250X (gfx90a, wave64, HIP 7.2) and AMD Radeon Pro W7800 (gfx1100, wave32, HIP 7.2.53211, torch 2.13.0a0). \| Module \| Result \| \|---\|---\| \| knn, ball_query, sample_farthest_points, face_areas_normals \| all pass \| \| rasterize_points, rasterize_meshes, chamfer, packed_to_padded \| all pass \| \| interpolate_face_attributes, blending, compositing, sample_pdf, mesh_normal_consistency \| all pass \| \| point_mesh_distance \| 9/9 pass (with tolerance fix in this PR) \| \| pulsar/test_forward, test_channels, test_depth, test_hands, test_ortho, test_small_spheres \| 10 passed (FB_TEST=1) \| \| test_render_points pulsar tests, test_camera_conversions::test_pulsar_conversion \| 3 passed \| \| points_to_volumes, iou_box3d, marching_cubes \| 20 failures, all env-only \| The 20 env-only failures are `torch.inverse()` on CPU tensors in test reference paths; this verification host's PyTorch was built with `USE_LAPACK: 0` (only `mkl-static` `.a` archives in the conda env; PyTorch's `FindBLAS` looks for `libmkl_intel_lp64.so`). Unrelated to the port — re-verifying with a LAPACK-linked PyTorch is left to upstream. Reviewed By: MichaelRamamonjisoa Differential Revision: D106825690 Pulled By: bottler fbshipit-source-id: f7a9b6028e6fb555f3b8c0f9792e88b818327166	2026-06-01 06:08:12 -07:00
Jeremy Reizenstein	34f648ede0	move targets Summary: Move testing targets from pytorch3d/tests/TARGETS to pytorch3d/TARGETS. Reviewed By: shapovalov Differential Revision: D36186940 fbshipit-source-id: a4c52c4d99351f885e2b0bf870532d530324039b	2022-05-25 06:16:03 -07:00
Tim Hatch	34bbb3ad32	apply import merging for fbcode/vision/fair (2 of 2) Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best-effort to move associated comments to match merged elements, but there are known limitations due to the diynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: bottler Differential Revision: D35553814 fbshipit-source-id: be49bdb6a4c25264ff8d4db3a601f18736d17be1	2022-04-13 06:51:33 -07:00
Winnie Lin	471b126818	add min_triangle_area argument to IsInsideTriangle Summary: 1. changed IsInsideTriangle in geometry_utils to take in min_triangle_area parameter instead of hardcoded value 2. updated point_mesh_cpu.cpp and point_mesh_cuda.[h/cu] to adapt to changes in geometry_utils function signatures 3. updated point_mesh_distance.py and test_point_mesh_distance.py to modify _C. calls Reviewed By: bottler Differential Revision: D34459764 fbshipit-source-id: 0549e78713c6d68f03d85fb597a13dd88e09b686	2022-02-25 12:43:04 -08:00
Jeremy Reizenstein	4d043fc9ac	PyTorch 1.7 compatibility Summary: Small changes discovered based on circleCI failures. Reviewed By: patricklabatut Differential Revision: D34426807 fbshipit-source-id: 819860f34b2f367dd24057ca7490284204180a13	2022-02-25 07:53:34 -08:00
Georgia Gkioxari	ee71c7c447	small numerical fix to point_mesh Summary: Small fix by adjusting the area `eps` to account for really small faces when computing point to face distances Reviewed By: bottler Differential Revision: D34331336 fbshipit-source-id: 51c4888ea46fefa4e31d5b0bb494a9f9d77813cd	2022-02-21 09:26:38 -08:00
Jeremy Reizenstein	9eeb456e82	Update license for company name Summary: Update all FB license strings to the new format. Reviewed By: patricklabatut Differential Revision: D33403538 fbshipit-source-id: 97a4596c5c888f3c54f44456dc07e718a387a02c	2022-01-04 11:43:38 -08:00
Patrick Labatut	af93f34834	License lint codebase Summary: License lint codebase Reviewed By: theschnitz Differential Revision: D29001799 fbshipit-source-id: 5c59869911785b0181b1663bbf430bc8b7fb2909	2021-06-22 03:45:27 -07:00
Georgia Gkioxari	88f5d79088	fix small face issue for ptmeshdist Summary: Fix small face issue for point_mesh distance computation. The issue lies in the computation of `IsInsideTriangle` which is unstable and non-symmetrical when faces with small areas are given as input. This diff fixes the issue by returning `False` for `IsInsideTriangle` when small faces are given as input. Reviewed By: bottler Differential Revision: D29163052 fbshipit-source-id: be297002f26b5e6eded9394fde00553a37406bee	2021-06-18 09:29:55 -07:00
Jeremy Reizenstein	74659aef26	CPU implementation for point_mesh functions Summary: point_mesh functions were missing CPU implementations. The indices returned are not always matching, possibly due to numerical instability. Reviewed By: gkioxari Differential Revision: D21594264 fbshipit-source-id: 3016930e2a9a0f3cd8b3ac4c94a92c9411c0989d	2020-06-15 10:11:26 -07:00
Jeremy Reizenstein	7f1e63aed1	Take care with single integers on gpu Summary: Pytorch seems to be becoming stricter about integer tensors of shape `(1,)` on GPU and not allowing them to be used as `int`s. For example the following no longer works on pytorch master, foo = torch.tensor([3, 5, 3], device="cuda:0") torch.arange(10) + foo[0] because this is the sum of tensors on different devices. Here fix tests which recently broke because of this. Reviewed By: nikhilaravi Differential Revision: D21929745 fbshipit-source-id: 25374f70468d1c895372766f1a9dd61df0833957	2020-06-10 14:13:30 -07:00
Nikhila Ravi	0eca74fa5f	lint fixes Summary: Ran the linter. TODO: need to update the linter as per D21353065. Reviewed By: bottler Differential Revision: D21362270 fbshipit-source-id: ad0e781de0a29f565ad25c43bc94a19b1828c020	2020-05-04 09:56:44 -07:00
Nikhila Ravi	c3d636dc8c	Cuda updates Summary: Updates to: - enable cuda kernel launches on any GPU (not just the default) - cuda and contiguous checks for all kernels - checks to ensure all tensors are on the same device - error reporting in the cuda kernels - cuda tests now run on a random device not just the default Reviewed By: jcjohnson, gkioxari Differential Revision: D21215280 fbshipit-source-id: 1bedc9fe6c35e9e920bdc4d78ed12865b1005519	2020-04-24 09:11:04 -07:00
Nikhila Ravi	b530b0af32	lint fixes Summary: Resolved trailing whitespace warnings. Reviewed By: gkioxari Differential Revision: D21023982 fbshipit-source-id: 14ea2ca372c13cfa987acc260264ca99ce44c461	2020-04-15 21:58:59 -07:00
Georgia Gkioxari	487d4d6607	point mesh distances Summary: Implementation of point to mesh distances. The current diff contains two types: (a) Point to Edge (b) Point to Face ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- POINT_MESH_EDGE_4_100_300_5000_cuda:0 2745 3138 183 POINT_MESH_EDGE_4_100_300_10000_cuda:0 4408 4499 114 POINT_MESH_EDGE_4_100_3000_5000_cuda:0 4978 5070 101 POINT_MESH_EDGE_4_100_3000_10000_cuda:0 9076 9187 56 POINT_MESH_EDGE_4_1000_300_5000_cuda:0 1411 1487 355 POINT_MESH_EDGE_4_1000_300_10000_cuda:0 4829 5030 104 POINT_MESH_EDGE_4_1000_3000_5000_cuda:0 7539 7620 67 POINT_MESH_EDGE_4_1000_3000_10000_cuda:0 12088 12272 42 POINT_MESH_EDGE_8_100_300_5000_cuda:0 3106 3222 161 POINT_MESH_EDGE_8_100_300_10000_cuda:0 8561 8648 59 POINT_MESH_EDGE_8_100_3000_5000_cuda:0 6932 7021 73 POINT_MESH_EDGE_8_100_3000_10000_cuda:0 24032 24176 21 POINT_MESH_EDGE_8_1000_300_5000_cuda:0 5272 5399 95 POINT_MESH_EDGE_8_1000_300_10000_cuda:0 11348 11430 45 POINT_MESH_EDGE_8_1000_3000_5000_cuda:0 17478 17683 29 POINT_MESH_EDGE_8_1000_3000_10000_cuda:0 25961 26236 20 POINT_MESH_EDGE_16_100_300_5000_cuda:0 8244 8323 61 POINT_MESH_EDGE_16_100_300_10000_cuda:0 18018 18071 28 POINT_MESH_EDGE_16_100_3000_5000_cuda:0 19428 19544 26 POINT_MESH_EDGE_16_100_3000_10000_cuda:0 44967 45135 12 POINT_MESH_EDGE_16_1000_300_5000_cuda:0 7825 7937 64 POINT_MESH_EDGE_16_1000_300_10000_cuda:0 18504 18571 28 POINT_MESH_EDGE_16_1000_3000_5000_cuda:0 65805 66132 8 POINT_MESH_EDGE_16_1000_3000_10000_cuda:0 90885 91089 6 -------------------------------------------------------------------------------- Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- POINT_MESH_FACE_4_100_300_5000_cuda:0 1561 1685 321 POINT_MESH_FACE_4_100_300_10000_cuda:0 2818 2954 178 POINT_MESH_FACE_4_100_3000_5000_cuda:0 15893 16018 32 POINT_MESH_FACE_4_100_3000_10000_cuda:0 16350 16439 31 POINT_MESH_FACE_4_1000_300_5000_cuda:0 3179 3278 158 POINT_MESH_FACE_4_1000_300_10000_cuda:0 2353 2436 213 POINT_MESH_FACE_4_1000_3000_5000_cuda:0 16262 16336 31 POINT_MESH_FACE_4_1000_3000_10000_cuda:0 9334 9448 54 POINT_MESH_FACE_8_100_300_5000_cuda:0 4377 4493 115 POINT_MESH_FACE_8_100_300_10000_cuda:0 9728 9822 52 POINT_MESH_FACE_8_100_3000_5000_cuda:0 26428 26544 19 POINT_MESH_FACE_8_100_3000_10000_cuda:0 42238 43031 12 POINT_MESH_FACE_8_1000_300_5000_cuda:0 3891 3982 129 POINT_MESH_FACE_8_1000_300_10000_cuda:0 5363 5429 94 POINT_MESH_FACE_8_1000_3000_5000_cuda:0 20998 21084 24 POINT_MESH_FACE_8_1000_3000_10000_cuda:0 39711 39897 13 POINT_MESH_FACE_16_100_300_5000_cuda:0 5955 6001 84 POINT_MESH_FACE_16_100_300_10000_cuda:0 12082 12144 42 POINT_MESH_FACE_16_100_3000_5000_cuda:0 44996 45176 12 POINT_MESH_FACE_16_100_3000_10000_cuda:0 73042 73197 7 POINT_MESH_FACE_16_1000_300_5000_cuda:0 8292 8374 61 POINT_MESH_FACE_16_1000_300_10000_cuda:0 19442 19506 26 POINT_MESH_FACE_16_1000_3000_5000_cuda:0 36059 36194 14 POINT_MESH_FACE_16_1000_3000_10000_cuda:0 64644 64822 8 -------------------------------------------------------------------------------- ``` Reviewed By: jcjohnson Differential Revision: D20590462 fbshipit-source-id: 42a39837b514a546ac9471bfaff60eefe7fae829	2020-04-11 00:21:24 -07:00

15 Commits