Summary:
Pillow 10 removed `Image.ANTIALIAS` (the built 10.4.0 wheel raises `AttributeError`). Replace it with `Image.Resampling.LANCZOS`, the documented successor, which is dual-compat across Pillow 9.4.0 / 10.4.0 / 11.3.0 / 12.2.0. This lands cleanly under the current 9.4.0 pin ahead of the fleet pin bump. Pure constant rename, no behavior change.
PyTorch3D is Meta-authored OSS (fbcode is the source of truth); this change is exported to github.com/facebookresearch/pytorch3d, where the `ANTIALIAS` -> `Resampling.LANCZOS` modernization is equally valid.
Part of the Pillow 9.x -> 10.x migration; see `third-party/pypi/pillow/.agents/migrate-9-to-10.md`.
___
Differential Revision: D108493452
fbshipit-source-id: bb1588e9b2057c6cc27a3d6c382faa4b2ac65f7f
Summary:
This diff was automatically generated by the Pyre per-target upgrade tool.
It adds `# pyre-fixme` or `pyrefly: ignore` comments to suppress type errors that will be introduced by an upcoming Pyre or Pyrefly release. These suppressions allow the upgrade to proceed without breaking existing code.
wed - upgrade new suppression fix
#pyreupgrade
Differential Revision: D108188975
fbshipit-source-id: 65fb6fb0dbd6ade15bd8d85da912413eab2e41f9
Summary:
Add explicit dtype checks for input (torch.float32) and edges (torch.int64) in GatherScatter.forward and gather_scatter_python to match C++ TensorAccessor<float,2> and TensorAccessor<int64_t,2> expectations.
Python previously validated ndim, shape, and input dtype in forward but not the edges dtype, and gather_scatter_python lacked dtype checks entirely, relying on the ATen error raised by the accessor. This change makes the errors Python-friendly and guards the C++ path before TensorAccessor construction.
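A minimal sketch of the kind of check described above. The helper name `check_gather_scatter_dtypes` and the choice of `ValueError` are assumptions for illustration; the actual diff may raise a different exception type or inline the checks.

```python
import torch


def check_gather_scatter_dtypes(input: torch.Tensor, edges: torch.Tensor) -> None:
    """Reject dtypes that the C++ accessors cannot read (hypothetical helper)."""
    # The C++ side builds TensorAccessor<float, 2>, so input must be float32.
    if input.dtype != torch.float32:
        raise ValueError(f"input must be torch.float32, got {input.dtype}")
    # The C++ side builds TensorAccessor<int64_t, 2>, so edges must be int64.
    if edges.dtype != torch.int64:
        raise ValueError(f"edges must be torch.int64, got {edges.dtype}")


# Hypothetical shapes: 4 vertices with 3 features, 5 edges.
good_input = torch.zeros(4, 3, dtype=torch.float32)
good_edges = torch.zeros(5, 2, dtype=torch.int64)
check_gather_scatter_dtypes(good_input, good_edges)  # passes silently
```

Passing `good_edges.int()` instead would fail in Python with a readable message rather than inside the accessor.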
___
Differential Revision: D108140422
fbshipit-source-id: ba54e857279a480a02e2c8f27e316f2e23cc6092
Summary:
Automated migration to enable Pyrefly type checking for `fbcode/vision/fair`.
- Added `python.set_pyrefly(True)` to PACKAGE file
- Suppressed pre-existing type errors
Pyrefly is Meta's next-generation Python type checker, replacing Pyre.
If you encounter issues, you can revert the PACKAGE change by removing
the `python.set_pyrefly(True)` line.
#pyreupgrade
Differential Revision: D107142434
fbshipit-source-id: 25929bb3d5a310d00dab11a46c5395df94357feb
Summary:
Enables building pytorch3d's `_C` extension against a ROCm-built PyTorch and running the test suite on AMD GPUs, including the pulsar subrenderer. Verified on AMD Instinct MI250X (gfx90a, warpSize=64), HIP 7.2, PyTorch 2.13.
## Mechanics
`torch.utils.cpp_extension.BuildExtension` auto-hipifies `.cu` sources of a `CUDAExtension` against a HIP-built torch (`cuda_runtime.h → hip/hip_runtime.h`, `cub:: → hipcub::`, `cudaStream_t → hipStream_t`, etc.), so most of the lift is build-system glue and a small number of CUDA intrinsics that don't have HIP equivalents.
- `setup.py`: detect ROCm via `torch.version.hip is not None`; treat `ROCM_HOME` as the GPU-toolkit-root analogue of `CUDA_HOME` (without this, `CUDA_HOME is None` silently demoted the build to a CPU-only `CppExtension`); skip `CUB_HOME`, CUDA-13 visibility flags, and `-ccbin=` on ROCm.
- `pytorch3d/csrc/pulsar/gpu/commands.h`: CUDA's `_rn`-suffixed FP rounding intrinsics (`__fadd_rn`, `__fdiv_rn`, `__fsqrt_rn`, `__fmaf_rn`, `__frcp_rn`) and `__saturatef` have no HIP equivalents — AMD's GPU ISA has no instruction-level rounding-mode override, so they expand to plain operators / `sqrtf` / `fmaf` / `1.0f/x` / `fmaxf(0,fminf(1,x))` on the `USE_ROCM` arm, which are rounding-mode-equivalent (both round-to-nearest-even). The HIP compiler may fuse `a+b*c` into a single-rounding FMA where CUDA's `_rn` would have prevented it; if FMA-fusion drift ever becomes a numerical issue, add `-ffp-contract=off` to pulsar's HIPCC flags. `__powf` is replaced with `powf`. `atomicAdd_block` has no HIP function-name equivalent — the semantic equivalent is `__hip_atomic_fetch_add(ptr, val, __ATOMIC_RELAXED, __HIP_MEMORY_SCOPE_WORKGROUP)` (plain HIP `atomicAdd` is device-scope, strictly stronger than block-scope and forces L2-coherent atomics).
- `tests/test_point_mesh_distance.py`: loosen `grad_faces` tolerance in `test_point_face_distance` from `5e-7` to `5e-6` to match the sibling `test_face_point_distance`. The backward kernel uses `atomicAdd` and calls `alertNotDeterministic`; FP add order varies by wavefront width.
- The X_t / camera-R/T equality checks in `test_points_alignment.py` and `test_cameras_alignment.py` are now skipped when `n_points <= dim` (resp. `batch_size <= 3` for camera-center alignment in 3D). Mean-centering renders the SVD rank-deficient in those cases, so the rotation around the degenerate axis is non-unique and different BLAS implementations (rocBLAS RDNA vs CDNA, cuBLAS) pick different valid null-space directions. The center-alignment check still runs and verifies the well-defined part of the transformation.
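The rank deficiency behind the skipped equality checks can be reproduced in isolation. This is a minimal sketch with randomly generated points, not the test code itself; `centered_rank` is a hypothetical helper and the explicit `atol` is an assumption chosen to ignore round-off.

```python
import torch


def centered_rank(points: torch.Tensor) -> int:
    """Rank of a point cloud of shape (n_points, dim) after mean-centering."""
    centered = points - points.mean(dim=0, keepdim=True)
    # An explicit absolute tolerance keeps round-off from counting as rank.
    return int(torch.linalg.matrix_rank(centered, atol=1e-9))


torch.manual_seed(0)
dim = 3
few = torch.randn(dim, dim, dtype=torch.float64)   # n_points == dim
many = torch.randn(10, dim, dtype=torch.float64)   # n_points > dim

rank_few = centered_rank(few)    # rows sum to zero, so the rank is below dim
rank_many = centered_rank(many)  # enough points to span all of R^3
```

With `n_points <= dim` the centered rows always sum to zero, so at least one direction carries no information and the rotation about it is not determined by the data.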
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/2039
Test Plan:
All GPU tests pass on both AMD Instinct MI250X (gfx90a, wave64, HIP 7.2) and AMD Radeon Pro W7800 (gfx1100, wave32, HIP 7.2.53211, torch 2.13.0a0).
| Module | Result |
|---|---|
| knn, ball_query, sample_farthest_points, face_areas_normals | all pass |
| rasterize_points, rasterize_meshes, chamfer, packed_to_padded | all pass |
| interpolate_face_attributes, blending, compositing, sample_pdf, mesh_normal_consistency | all pass |
| point_mesh_distance | 9/9 pass (with tolerance fix in this PR) |
| pulsar/test_forward, test_channels, test_depth, test_hands, test_ortho, test_small_spheres | 10 passed (FB_TEST=1) |
| test_render_points pulsar tests, test_camera_conversions::test_pulsar_conversion | 3 passed |
| points_to_volumes, iou_box3d, marching_cubes | 20 failures, all env-only |
The 20 env-only failures are `torch.inverse()` on CPU tensors in test reference paths; this verification host's PyTorch was built with `USE_LAPACK: 0` (only `mkl-static` `.a` archives in the conda env; PyTorch's `FindBLAS` looks for `libmkl_intel_lp64.so`). Unrelated to the port — re-verifying with a LAPACK-linked PyTorch is left to upstream.
Reviewed By: MichaelRamamonjisoa
Differential Revision: D106825690
Pulled By: bottler
fbshipit-source-id: f7a9b6028e6fb555f3b8c0f9792e88b818327166
Summary:
This diff was automatically generated by the Pyre per-target upgrade tool.
It adds `# pyre-fixme` or `pyrefly: ignore` comments to suppress type errors that will be introduced by an upcoming Pyre or Pyrefly release. These suppressions allow the upgrade to proceed without breaking existing code.
Pyrefly Upgrade - f-string fix
#pyreupgrade
Differential Revision: D105268300
fbshipit-source-id: 2f19758e20755944509fe14fc256002c652052a5
Summary:
No one is using these.
(The minify part has been broken for a couple of years, too)
Reviewed By: patricklabatut
Differential Revision: D96977684
fbshipit-source-id: 4708dfd37b14d1930f1370677eb126a61a0d9d3c
Summary: Remove the Support Ukraine banner component and its usage from the PyTorch3D website homepage.
Reviewed By: bottler
Differential Revision: D96559642
fbshipit-source-id: fd716cde7145d5c0105b2d2fb569375395b9b5de
Summary:
Replace boolean indexing and torch.is_grad_enabled() control flow in _sqrt_positive_part with a pure torch.where implementation. The old code used ret[positive_mask] = torch.sqrt(x[positive_mask]) which produces an incorrect ONNX Where/index_put node with mismatched broadcast shapes when the model is exported via torch.onnx.export.
The new implementation substitutes 1.0 for non-positive values before sqrt (avoiding infinite gradient at sqrt(0)) and masks the result back to 0, preserving the zero-subgradient-at-zero property.
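A minimal sketch of the `torch.where` approach described above, written from this description rather than copied from the diff; the function name without the leading underscore marks it as illustrative.

```python
import torch


def sqrt_positive_part(x: torch.Tensor) -> torch.Tensor:
    """Return sqrt(max(0, x)) with a zero subgradient wherever x <= 0."""
    positive = x > 0
    # Substitute 1.0 for non-positive entries so sqrt never sees 0 or a
    # negative value, which keeps the backward pass finite.
    safe_x = torch.where(positive, x, torch.ones_like(x))
    # Mask the substituted entries back to 0 in the output.
    return torch.where(positive, torch.sqrt(safe_x), torch.zeros_like(x))


x = torch.tensor([0.0, 4.0, -1.0], requires_grad=True)
y = sqrt_positive_part(x)
y.sum().backward()
```

Because there is no in-place masked assignment and no branch on `torch.is_grad_enabled()`, the function contains only elementwise ops with static shapes.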
Fixes https://github.com/facebookresearch/pytorch3d/issues/2020
Reviewed By: sgrigory
Differential Revision: D94365479
fbshipit-source-id: a1ebe8dc077573f83efc262520b6669159b83ef0
Summary:
Added `atol=1e-4` tolerance parameter to the `assertClose` calls on lines 682 and 683 in the `test_inverse` method of `TestTranslate` class.
This is a retry of D90225548
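To show why an absolute tolerance matters for values near zero, here is a small sketch using `torch.allclose`. It assumes `assertClose` follows the same `atol`/`rtol` convention, that is `|a - b| <= atol + rtol * |b|`; the tensors are made up for illustration.

```python
import torch

# Hypothetical values: one entry is exactly zero in the reference but
# carries a small floating-point residue in the computed result.
expected = torch.tensor([0.0, 1.0])
computed = torch.tensor([3e-5, 1.0])

# Defaults are rtol=1e-5 and atol=1e-8; the relative term vanishes near
# zero, so the 3e-5 residue is rejected.
within_default = torch.allclose(expected, computed)

# With atol=1e-4 the same residue is inside the tolerance.
within_loosened = torch.allclose(expected, computed, atol=1e-4)
```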
Reviewed By: sgrigory
Differential Revision: D90682979
fbshipit-source-id: ac13f000174dd9962326296e1c3116d0d39c7751
Summary:
## LLM-generated Summary:
Replaces self.assertTrue(torch.allclose(...)) with self.assertClose(...) throughout fbcode/vision/fair/pytorch3d/tests/test_transforms.py. This standardizes numeric closeness assertions for clearer failures and consistency while preserving tolerances and test behavior.
---
Session: DEV34970678
Reviewed By: shapovalov
Differential Revision: D90251428
fbshipit-source-id: cdae842be82f0ba548802e6977be272134e8508c
Summary:
CUDA 13.0 introduced breaking changes that cause build failures in pytorch3d:
**1. Symbol Visibility Changes (pulsar)**
- NVCC now forces `__global__` functions to have hidden ELF visibility by default
- `__global__` function template stubs now have internal linkage
**Fix:** Added NVCC flags (`--device-entity-has-hidden-visibility=false` and `-static-global-template-stub=false`) for fbcode builds with CUDA 13.0+.
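As a rough sketch of how such a version gate could be expressed in a Python build script (the helper name and the version-string parsing are assumptions; the fbcode build configuration is not shown in this message):

```python
def cuda13_visibility_flags(cuda_version: str) -> list[str]:
    """Return the extra NVCC flags for CUDA 13.0+ (hypothetical helper)."""
    major = int(cuda_version.split(".")[0])
    if major < 13:
        # Older toolkits keep the previous visibility and linkage defaults.
        return []
    return [
        "--device-entity-has-hidden-visibility=false",
        "-static-global-template-stub=false",
    ]


flags_cuda12 = cuda13_visibility_flags("12.8")
flags_cuda13 = cuda13_visibility_flags("13.0")
```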
**2. cuCtxCreate API Change (pycuda)**
- CUDA 13.0 changed `cuCtxCreate` from 3 to 4 arguments
- pycuda 2022.2 (current default) uses the old signature and fails to compile
- pycuda 2025.1.2 (D83501913) includes the CUDA 13.0 fix
**Fix:** Added CUDA 13.0 constraint to pycuda alias to auto-select pycuda 2025.1.2.
**NCCL Compatibility Note:**
- Current stable NCCL (2.25) is NOT compatible with CUDA 13.0 (`cudaTypedefs.h` removed)
- NCCL 2.27+ works with CUDA 13.0 and will become stable in early January 2026 (per HPC Comms team)
- Until then, CUDA 13.0 builds require `-c hpc_comms.use_nccl=2.27`
References:
- GitHub issue: https://github.com/facebookresearch/pytorch3d/issues/2011
- NVIDIA blog: https://developer.nvidia.com/blog/cuda-c-compiler-updates-impacting-elf-visibility-and-linkage/
- FBGEMM_GPU fix: D86474263
- pycuda 2025.1.2 buckification: D83501913
Reviewed By: bottler
Differential Revision: D88816596
fbshipit-source-id: 1ba666dab8c0e06d1286b8d5bc5d84cfc55c86e6
Summary: Calling `sample_farthest_points` with `lengths` on GPU throws an error because of a device mismatch between `lengths` and `torch.rand(lengths.size())`, which is created on the CPU by default.
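A minimal sketch of the usual remedy for this kind of mismatch, assuming the random start indices are derived by scaling uniform samples by `lengths`; the helper name is hypothetical and the actual diff may differ.

```python
import torch


def random_start_indices(lengths: torch.Tensor) -> torch.Tensor:
    """Draw one start index in [0, lengths[n]) per cloud (hypothetical helper)."""
    # Create the random tensor on the same device as lengths; without the
    # device argument torch.rand allocates on the CPU.
    u = torch.rand(lengths.size(), device=lengths.device)
    return (u * lengths).long()


# CPU example; the same call works unchanged when lengths lives on a GPU.
lengths = torch.tensor([3, 5, 1])
start = random_start_indices(lengths)
```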
Reviewed By: bottler
Differential Revision: D82378997
fbshipit-source-id: 8e929256177d543d1dd1249e8488f70e03e4101f
Summary: Some random seed changes. Skip multigpu tests when there's only one gpu. This is a better fix for what AI is doing in D80600882.
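One common way to express such a skip, shown as a sketch with a hypothetical test class rather than the tests touched by this diff:

```python
import unittest

import torch


class TestMultiGPU(unittest.TestCase):
    # Skip instead of failing when fewer than two CUDA devices are visible.
    @unittest.skipIf(torch.cuda.device_count() < 2, "requires at least two GPUs")
    def test_tensors_on_two_devices(self) -> None:
        a = torch.zeros(1, device="cuda:0")
        b = torch.zeros(1, device="cuda:1")
        self.assertNotEqual(a.device, b.device)


def run_suite() -> unittest.TestResult:
    """Run the hypothetical test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMultiGPU)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
```

On a single-GPU or CPU-only host the test is reported as skipped, so the run still counts as successful.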
Reviewed By: MichaelRamamonjisoa
Differential Revision: D80625966
fbshipit-source-id: ac3952e7144125fd3a05ad6e4e6e5976ae10a8ef
Summary:
Optimize `sample_farthest_points` by reducing CPU/GPU synchronization:
1. Replace the iterative `randint` calls for the starting indices with a single call when the length is constant.
2. Avoid the sync when fetching the maximum number of sample points if every cloud samples the same amount.
3. Initialize one tensor for samples and indices.
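The first item can be sketched as follows. Both helpers are hypothetical and written from the description above; they only contrast a per-cloud loop that calls `.item()` with a single batched draw.

```python
import torch


def start_indices_per_cloud(lengths: torch.Tensor) -> torch.Tensor:
    """One randint call per cloud; int() and .item() force a sync on GPU."""
    idx = torch.zeros_like(lengths)
    for n in range(lengths.shape[0]):
        idx[n] = torch.randint(high=int(lengths[n]), size=(1,)).item()
    return idx


def start_indices_batched(
    num_clouds: int, num_points: int, device: torch.device
) -> torch.Tensor:
    """Single randint call, usable when every cloud has num_points points."""
    return torch.randint(high=num_points, size=(num_clouds,), device=device)


looped = start_indices_per_cloud(torch.tensor([10, 10, 10, 10]))
batched = start_indices_batched(4, 10, torch.device("cpu"))
```

The batched variant stays on the target device and never reads a value back to the host.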
Comparison:
https://fburl.com/mlhub/7wk0xi98
Before
{F1980383703}
After
{F1980383707}
The histograms match closely.
{F1980464338}
Reviewed By: bottler
Differential Revision: D78731869
fbshipit-source-id: 060528ae7a1e0fbbd005d129c151eaf9405841de
Summary:
Fixes hard crashes (bus errors) when using an MPS device (Apple Silicon) by adding CPU checks throughout the files in the csrc subdirectories, so that tensors are verified to be on a CPU device.
Note that this is the fourth and final part of a larger change spanning multiple files and directories.
Reviewed By: bottler
Differential Revision: D77698176
fbshipit-source-id: 5bc9e3c5cea61afd486aed7396f390d92775ec6d
Summary:
Adds `CHECK_CPU` macro calls, which check whether a tensor is on the CPU device, throughout the csrc directories and subdirectories up to `pulsar`.
Note that this is the third part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77696998
fbshipit-source-id: 470ca65b23d9965483b5bdd30c712da8e1131787
Summary:
Adds `CHECK_CPU` macro calls, which check whether a tensor is on the CPU device, throughout the csrc directories up to `marching_cubes`. The directories updated are `gather_scatter`, `interp_face_attrs`, `iou_box3d`, `knn`, and `marching_cubes`.
Note that this is the second part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77558550
fbshipit-source-id: 762a0fe88548dc8d0901b198a11c40d0c36e173f
Summary:
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1986
Adds device checks to prevent crashes on unsupported devices in PyTorch3D. Updates the `pytorch3d_cutils.h` file to include a new macro, `CHECK_CPU`, that checks whether a tensor is on the CPU device. This macro is then used in the directories from `ball_query` to `face_areas_normals` to ensure that tensors are not on unsupported devices like MPS.
Note that this is the first part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77473296
fbshipit-source-id: 13dc84620dee667bddebad1dade2d2cb5a59c737
Summary:
The current implementation of `matrix_to_quaternion` and `_sqrt_positive_part` uses boolean indexing, which can slow down performance and cause incompatibility with `torch.compile` unless `torch._dynamo.config.capture_dynamic_output_shape_ops` is set to `True`.
To enhance performance and compatibility, I recommend using `torch.gather` to select the best-conditioned quaternions and `F.relu` instead of `x>0` (bottler's suggestion)
For a detailed comparison of the implementation differences when using `torch.compile`, please refer to my Bento notebook
N7438339.
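A minimal sketch of the `torch.gather` selection pattern, written from the description above with hypothetical names and shapes; it picks one candidate row per batch element by score without boolean indexing.

```python
import torch


def select_best(candidates: torch.Tensor, scores: torch.Tensor) -> torch.Tensor:
    """Pick the highest-scoring candidate per batch element.

    candidates has shape (..., K, D) and scores has shape (..., K).
    """
    idx = scores.argmax(dim=-1, keepdim=True)  # (..., 1)
    # Broadcast the index over the last dimension so gather copies whole rows.
    idx = idx.unsqueeze(-1).expand(*idx.shape, candidates.shape[-1])  # (..., 1, D)
    return torch.gather(candidates, -2, idx).squeeze(-2)  # (..., D)


# Hypothetical data: 2 batch elements, 3 candidates each, 4 components.
candidates = torch.arange(24.0).reshape(2, 3, 4)
scores = torch.tensor([[0.1, 0.9, 0.2], [0.5, 0.1, 0.7]])
best = select_best(candidates, scores)
```

The output shape depends only on the input shapes, not on the data, which is what lets this pattern trace without dynamic-shape support.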
Reviewed By: bottler
Differential Revision: D77176230
fbshipit-source-id: 9a6a2e0015b5865056297d5f45badc3c425b93ce