pytorch3d

mirror of https://github.com/facebookresearch/pytorch3d.git synced 2026-06-17 20:48:55 +08:00

Author	SHA1	Message	Date
Jeff Daily	b73d735ecf	Port pytorch3d (#2039 ) Summary: Enables building pytorch3d's `_C` extension against a ROCm-built PyTorch and running the test suite on AMD GPUs, including the pulsar subrenderer. Verified on AMD Instinct MI250X (gfx90a, warpSize=64), HIP 7.2, PyTorch 2.13. ## Mechanics `torch.utils.cpp_extension.BuildExtension` auto-hipifies `.cu` sources of a `CUDAExtension` against a HIP-built torch (`cuda_runtime.h → hip/hip_runtime.h`, `cub:: → hipcub::`, `cudaStream_t → hipStream_t`, etc.), so most of the lift is build-system glue and a small number of CUDA intrinsics that don't have HIP equivalents. - `setup.py`: detect ROCm via `torch.version.hip is not None`; treat `ROCM_HOME` as the GPU-toolkit-root analogue of `CUDA_HOME` (without this, `CUDA_HOME is None` silently demoted the build to a CPU-only `CppExtension`); skip `CUB_HOME`, CUDA-13 visibility flags, and `-ccbin=` on ROCm. - `pytorch3d/csrc/pulsar/gpu/commands.h`: CUDA's `_rn`-suffixed FP rounding intrinsics (`__fadd_rn`, `__fdiv_rn`, `__fsqrt_rn`, `__fmaf_rn`, `__frcp_rn`) and `__saturatef` have no HIP equivalents — AMD's GPU ISA has no instruction-level rounding-mode override, so they expand to plain operators / `sqrtf` / `fmaf` / `1.0f/x` / `fmaxf(0,fminf(1,x))` on the `USE_ROCM` arm, which are rounding-mode-equivalent (both round-to-nearest-even). The HIP compiler may fuse `a+b*c` into a single-rounding FMA where CUDA's `_rn` would have prevented it; if FMA-fusion drift ever becomes a numerical issue, add `-ffp-contract=off` to pulsar's HIPCC flags. `__powf` is replaced with `powf`. `atomicAdd_block` has no HIP function-name equivalent — the semantic equivalent is `__hip_atomic_fetch_add(ptr, val, __ATOMIC_RELAXED, __HIP_MEMORY_SCOPE_WORKGROUP)` (plain HIP `atomicAdd` is device-scope, strictly stronger than block-scope and forces L2-coherent atomics). - `tests/test_point_mesh_distance.py`: loosen `grad_faces` tolerance in `test_point_face_distance` from `5e-7` to `5e-6` to match the sibling `test_face_point_distance`. The backward kernel uses `atomicAdd` and calls `alertNotDeterministic`; FP add order varies by wavefront width. - The X_t / camera-R/T equality checks in `test_points_alignment.py` and `test_cameras_alignment.py` are now skipped when `n_points <= dim` (resp. `batch_size <= 3` for camera-center alignment in 3D). Mean-centering renders the SVD rank-deficient in those cases, so the rotation around the degenerate axis is non-unique and different BLAS implementations (rocBLAS RDNA vs CDNA, cuBLAS) pick different valid null-space directions. The center-alignment check still runs and verifies the well-defined part of the transformation. Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/2039 Test Plan: All GPU tests pass on both AMD Instinct MI250X (gfx90a, wave64, HIP 7.2) and AMD Radeon Pro W7800 (gfx1100, wave32, HIP 7.2.53211, torch 2.13.0a0). \| Module \| Result \| \|---\|---\| \| knn, ball_query, sample_farthest_points, face_areas_normals \| all pass \| \| rasterize_points, rasterize_meshes, chamfer, packed_to_padded \| all pass \| \| interpolate_face_attributes, blending, compositing, sample_pdf, mesh_normal_consistency \| all pass \| \| point_mesh_distance \| 9/9 pass (with tolerance fix in this PR) \| \| pulsar/test_forward, test_channels, test_depth, test_hands, test_ortho, test_small_spheres \| 10 passed (FB_TEST=1) \| \| test_render_points pulsar tests, test_camera_conversions::test_pulsar_conversion \| 3 passed \| \| points_to_volumes, iou_box3d, marching_cubes \| 20 failures, all env-only \| The 20 env-only failures are `torch.inverse()` on CPU tensors in test reference paths; this verification host's PyTorch was built with `USE_LAPACK: 0` (only `mkl-static` `.a` archives in the conda env; PyTorch's `FindBLAS` looks for `libmkl_intel_lp64.so`). Unrelated to the port — re-verifying with a LAPACK-linked PyTorch is left to upstream. Reviewed By: MichaelRamamonjisoa Differential Revision: D106825690 Pulled By: bottler fbshipit-source-id: f7a9b6028e6fb555f3b8c0f9792e88b818327166	2026-06-01 06:08:12 -07:00
Jeremy Reizenstein	34f648ede0	move targets Summary: Move testing targets from pytorch3d/tests/TARGETS to pytorch3d/TARGETS. Reviewed By: shapovalov Differential Revision: D36186940 fbshipit-source-id: a4c52c4d99351f885e2b0bf870532d530324039b	2022-05-25 06:16:03 -07:00
Jeremy Reizenstein	9eeb456e82	Update license for company name Summary: Update all FB license strings to the new format. Reviewed By: patricklabatut Differential Revision: D33403538 fbshipit-source-id: 97a4596c5c888f3c54f44456dc07e718a387a02c	2022-01-04 11:43:38 -08:00
Jeremy Reizenstein	b26f4bc33a	test tolerance loosenings Summary: Increase some test tolerances so that they pass in more situations, and re-enable two tests. Reviewed By: nikhilaravi Differential Revision: D31379717 fbshipit-source-id: 06a25470cc7b6d71cd639d9fd7df500d4b84c079	2021-10-07 10:48:12 -07:00
Patrick Labatut	5284de6e97	Deprecate so3_exponential_map Summary: Deprecate the `so3_exponential_map()` function in favor of its alias `so3_exp_map()`: this aligns with the naming of `so3_log_map()` and the recently introduced `se3_exp_map()` / `se3_log_map()` pair. Reviewed By: bottler Differential Revision: D29329966 fbshipit-source-id: b6f60b9e86b2995f70b1fbeb16f9feea05c55de9	2021-06-28 04:28:06 -07:00
Patrick Labatut	af93f34834	License lint codebase Summary: License lint codebase Reviewed By: theschnitz Differential Revision: D29001799 fbshipit-source-id: 5c59869911785b0181b1663bbf430bc8b7fb2909	2021-06-22 03:45:27 -07:00
Jeremy Reizenstein	cd9786e787	Disable random-dependent tests Summary: These two tests fail (with non-small differences) when the seed is changed or if certain environmental changes are made. We disable them pending investigation. A small change to the tolerance at the failing assertion doesn't help. The change in common_testing helps diagnose this. Reviewed By: shapovalov Differential Revision: D26233419 fbshipit-source-id: 357afc1786825256c9bade101fb15707e4dea5ed	2021-02-03 18:24:27 -08:00
David Novotny	316b77782e	Camera alignment Summary: adds `corresponding_cameras_alignment` function that estimates a similarity transformation between two sets of cameras. The function is essential for computing camera errors in SfM pipelines. ``` Benchmark Avg Time(μs) Peak Time(μs) Iterations -------------------------------------------------------------------------------- CORRESPONDING_CAMERAS_ALIGNMENT_10_centers_False 32219 36211 16 CORRESPONDING_CAMERAS_ALIGNMENT_10_centers_True 32429 36063 16 CORRESPONDING_CAMERAS_ALIGNMENT_10_extrinsics_False 5548 8782 91 CORRESPONDING_CAMERAS_ALIGNMENT_10_extrinsics_True 6153 9752 82 CORRESPONDING_CAMERAS_ALIGNMENT_100_centers_False 33344 40398 16 CORRESPONDING_CAMERAS_ALIGNMENT_100_centers_True 34528 37095 15 CORRESPONDING_CAMERAS_ALIGNMENT_100_extrinsics_False 5576 7187 90 CORRESPONDING_CAMERAS_ALIGNMENT_100_extrinsics_True 6256 9166 80 CORRESPONDING_CAMERAS_ALIGNMENT_1000_centers_False 32020 37247 16 CORRESPONDING_CAMERAS_ALIGNMENT_1000_centers_True 32776 37644 16 CORRESPONDING_CAMERAS_ALIGNMENT_1000_extrinsics_False 5336 8795 94 CORRESPONDING_CAMERAS_ALIGNMENT_1000_extrinsics_True 6266 9929 80 -------------------------------------------------------------------------------- ``` Reviewed By: shapovalov Differential Revision: D22946415 fbshipit-source-id: 8caae7ee365b304d8aa1f8133cf0dd92c35bc0dd	2020-09-03 13:27:14 -07:00

8 Commits