Summary:
Optimizing sample_farthest_points by reducing CPU/GPU sync:
1. Replacing the iterative randint calls for starting indexes with 1 function call, if the length is constant
2. Avoiding a sync when fetching the maximum number of sample points, if we sample the same amount
3. Initializing 1 tensor for samples and indices
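A minimal sketch of the idea in item 1, not the code from this diff. The function names and arguments are hypothetical. It contrasts one `torch.randint` call per batch element, where reading each length on the host can force a sync for GPU tensors, with a single batched call that only applies when all lengths are equal.

```python
import torch


def start_indices_per_item(lengths: torch.Tensor) -> torch.Tensor:
    """One randint call per batch element (the slower pattern)."""
    # int(n) reads the length back to the host; on a GPU tensor this syncs.
    picks = [torch.randint(int(n), (1,), device=lengths.device) for n in lengths]
    return torch.cat(picks)


def start_indices_batched(batch_size: int, length: int, device=None) -> torch.Tensor:
    """A single randint call; valid only when every item has the same length."""
    return torch.randint(length, (batch_size,), device=device)
```

Both functions return one start index per batch element; only the second avoids the Python loop.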
Comparison:
https://fburl.com/mlhub/7wk0xi98
Before
{F1980383703}
After
{F1980383707}
Histograms match pretty closely
{F1980464338}
Reviewed By: bottler
Differential Revision: D78731869
fbshipit-source-id: 060528ae7a1e0fbbd005d129c151eaf9405841de
Summary:
Fixes hard crashes (bus errors) when using the MPS device (Apple Silicon) by adding CPU checks throughout the files in the csrc subdirectories to verify that inputs are on a CPU device.
Note that this is the fourth and final part of a larger change spanning multiple files and directories.
Reviewed By: bottler
Differential Revision: D77698176
fbshipit-source-id: 5bc9e3c5cea61afd486aed7396f390d92775ec6d
Summary:
Adds uses of the CHECK_CPU macro, which checks whether a tensor is on the CPU device, throughout the csrc directories and subdirectories up to `pulsar`.
Note that this is the third part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77696998
fbshipit-source-id: 470ca65b23d9965483b5bdd30c712da8e1131787
Summary:
Adds uses of the CHECK_CPU macro, which checks whether a tensor is on the CPU device, throughout the csrc directories up to `marching_cubes`. Directories updated include `gather_scatter`, `interp_face_attrs`, `iou_box3d`, `knn`, and `marching_cubes`.
Note that this is the second part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77558550
fbshipit-source-id: 762a0fe88548dc8d0901b198a11c40d0c36e173f
Summary:
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1986
Adds device checks to prevent crashes on unsupported devices in PyTorch3D. Updates the `pytorch3d_cutils.h` file to include a new macro, CHECK_CPU, that checks whether a tensor is on the CPU device. This macro is then used in the directories from `ball_query` to `face_area_normals` to ensure that tensors are not on unsupported devices like MPS.
Note that this is the first part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.
Reviewed By: bottler
Differential Revision: D77473296
fbshipit-source-id: 13dc84620dee667bddebad1dade2d2cb5a59c737
Summary:
The current implementation of `matrix_to_quaternion` and `_sqrt_positive_part` uses boolean indexing, which can slow down performance and cause incompatibility with `torch.compile` unless `torch._dynamo.config.capture_dynamic_output_shape_ops` is set to `True`.
To enhance performance and compatibility, I recommend using `torch.gather` to select the best-conditioned quaternions and `F.relu` instead of `x>0` (bottler's suggestion).
For a detailed comparison of the implementation differences when using `torch.compile`, please refer to my Bento notebook N7438339.
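A rough sketch of the two replacements described above, not the code that landed. The function names and tensor shapes are assumptions. It shows a `sqrt` of the positive part written with `F.relu`, and a per-item selection written with `torch.gather`. Both keep output shapes static, unlike boolean indexing.

```python
import torch
import torch.nn.functional as F


def sqrt_positive_part(x: torch.Tensor) -> torch.Tensor:
    """sqrt(max(0, x)) without boolean indexing (forward pass only in this sketch)."""
    return torch.sqrt(F.relu(x))


def select_best(candidates: torch.Tensor, scores: torch.Tensor) -> torch.Tensor:
    """Pick, per batch item, the candidate row with the highest score.

    candidates: (B, 4, 4), scores: (B, 4) -> result: (B, 4)
    """
    idx = scores.argmax(dim=-1)  # (B,)
    # Expand the row index so gather picks all 4 components of that row.
    idx = idx[:, None, None].expand(-1, 1, candidates.shape[-1])  # (B, 1, 4)
    return torch.gather(candidates, 1, idx).squeeze(1)
```

This sketch only covers the forward pass; gradients of `sqrt` at exactly zero may need separate handling.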
Reviewed By: bottler
Differential Revision: D77176230
fbshipit-source-id: 9a6a2e0015b5865056297d5f45badc3c425b93ce
Summary: Resolved self-assignment warnings in the `renderer.forward.device.h` file by removing redundant assignments of the `stream` variable to itself in `cub::DeviceSelect::Flagged` function calls. This change eliminates compiler errors and ensures cleaner, more efficient code execution.
Reviewed By: bottler
Differential Revision: D76554140
fbshipit-source-id: 28eae0186246f51a8ac8002644f184349aa49560
Summary:
I could not access https://github.com/NVlabs/cub/issues/172 to understand whether IntWrapper was still necessary but the comment is from 5 years ago and causes problems for the ROCm build.
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1964
Reviewed By: MichaelRamamonjisoa
Differential Revision: D71937895
Pulled By: bottler
fbshipit-source-id: 5e0351e1bd8599b670436cd3464796eca33156f6
Summary:
CUDA kernel variables matching the type `(thread|block|grid).(Idx|Dim).(x|y|z)` [have the data type `uint`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#built-in-variables).
Many programmers mistakenly use implicit casts to turn these data types into `int`. In fact, the [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/) itself is inconsistent and incorrect in its use of data types in programming examples.
The result of these implicit casts is that our kernels may give unexpected results when exposed to large datasets, i.e., those exceeding ~2B items.
While we now have linters in place to prevent simple mistakes (D71236150), our codebase has many problematic instances. This diff fixes some of them.
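To illustrate the failure mode, here is a small Python emulation of the 32-bit arithmetic. It is not CUDA code and not taken from this diff, and the helper names are hypothetical. It mimics computing `blockIdx.x * blockDim.x + threadIdx.x` in unsigned 32-bit math and then storing the result in a signed 32-bit `int`.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def global_index_int32(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Emulate `int i = blockIdx.x * blockDim.x + threadIdx.x;`."""
    unsigned = (block_idx * block_dim + thread_idx) & 0xFFFFFFFF  # uint32 math
    return to_int32(unsigned)  # implicit cast to int


small = global_index_int32(10, 1024, 5)      # well within int range
large = global_index_int32(2**21, 1024, 0)   # 2**31 items: wraps negative
```

With 1024 threads per block, the index goes negative once the block index reaches `2**21`, which is about 2B items.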
Reviewed By: dtolnay
Differential Revision: D71355356
fbshipit-source-id: cea44891416d9efd2f466d6c45df4e36008fa036
Summary:
A continuation of https://github.com/facebookresearch/pytorch3d/issues/1948 -- this commit fixes a small numerical issue with `matrix_to_axis_angle(..., fast=True)` near `pi`.
bottler feel free to check this out, it's a single-line change.
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1953
Reviewed By: MichaelRamamonjisoa
Differential Revision: D70088251
Pulled By: bottler
fbshipit-source-id: 54cc7f946283db700cec2cd5575cf918456b7f32
Summary:
Remove headers flagged by facebook-unused-include-check over fbcode.vision.
+ format and autodeps
This is a codemod. It was automatically generated and will be landed once it is approved and tests are passing in sandcastle.
You have been added as a reviewer by Sentinel or Butterfly.
Autodiff project: uiv
Autodiff partition: fbcode.vision
Autodiff bookmark: ad.uiv.fbcode.vision
Reviewed By: dtolnay
Differential Revision: D70403619
fbshipit-source-id: d109c15774eeb3d809875f75fa2a26ed20d7f9a6
Summary:
This is an extension of https://github.com/facebookresearch/pytorch3d/issues/1544 with various speed, stability, and readability improvements. (I could not find a way to make a commit to the existing PR.) This PR is still based on [Rodrigues' rotation formula](https://en.wikipedia.org/wiki/Rotation_formalisms_in_three_dimensions#Rotation_matrix_%E2%86%94_Euler_axis/angle).
The motivation is the same; this change speeds up the conversions by up to 10x, depending on the device, batch size, etc.
### Notes
- As the angles get very close to `π`, the existing implementation and the proposed one start to differ. However, (my understanding is that) this is not a problem, as in general the axis cannot be stably inferred from the rotation matrix in this case.
- bottler, I tried to follow conventions similar to the existing functions for dealing with weird angles; let me know if something needs to be changed to merge this.
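For reference, a plain-Python sketch of the Rodrigues relation for a single 3x3 matrix. It is not the batched implementation in this PR, and the function names are chosen for illustration. It recovers the angle from the trace and the axis from the skew-symmetric part, which is the step that becomes ill-conditioned near `π` because `sin(angle)` approaches zero.

```python
import math


def axis_angle_to_matrix(axis, angle):
    """Rodrigues' formula R = cos(a) I + (1 - cos(a)) n n^T + sin(a) [n]_x, unit axis n."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def matrix_to_axis_angle(R):
    """Recover (axis, angle) from the trace and the skew-symmetric part of R."""
    trace = R[0][0] + R[1][1] + R[2][2]
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    # R - R^T = 2 sin(angle) [n]_x, so this vector has norm 2 sin(angle).
    v = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    norm = math.sqrt(sum(c * c for c in v))
    if norm < 1e-12:
        # Axis is not determined by v here (angle near 0 or pi); arbitrary choice.
        return (1.0, 0.0, 0.0), angle
    return tuple(c / norm for c in v), angle
```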
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1948
Reviewed By: MichaelRamamonjisoa
Differential Revision: D69193009
Pulled By: bottler
fbshipit-source-id: e5ed34b45b625114ec4419bb89e22a6aefad4eeb
Summary:
This change is not fully backward compatible (BC): some None paths will be replaced by metadata paths, even when they were not used for data loading.
It also removes the legacy fix to the paths in the old CO3D release.
Reviewed By: bottler
Differential Revision: D69048238
fbshipit-source-id: 2a8b26d7b9f5e2adf39c65888b5863a5a9de1996
Summary: Update PyTorch3D to be able to run assetgen (see later diffs in the stack)
Reviewed By: shapovalov
Differential Revision: D65942513
fbshipit-source-id: 1d01141c9f7e106608fa591be6e0d3262cb5944f
Summary: We did not often extend sequence-level metadata but now for applications like text-to-3D/video, we need to store captions and similar.
Reviewed By: bottler
Differential Revision: D68269926
fbshipit-source-id: f8af308adce51863d719a335d85cd2558943bd4c
Summary:
It is often easier to store the mask together with RGB, especially for renders. The logic in this diff:
* if load_mask and mask_path are provided, take the mask from mask_path,
* otherwise, check if the image has an alpha channel and take it as the mask.
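Read literally, the two rules above amount to the following decision. This is a hypothetical sketch: the function name, the return values, and the assumption that a 4-channel image is RGBA are illustrative, not taken from the diff.

```python
def choose_mask_source(load_mask, mask_path, num_image_channels):
    """Return where the mask should come from: "mask_path", "alpha" or None."""
    if load_mask and mask_path is not None:
        return "mask_path"
    if num_image_channels == 4:  # assumption: the 4th channel is alpha
        return "alpha"
    return None
```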
Reviewed By: antoinetlc
Differential Revision: D68160212
fbshipit-source-id: d9b6779f90027a4987ba96800983f441edff9c74
Summary: This function makes it easier to extend FrameData class with new channels; brushing it up a bit.
Reviewed By: bottler
Differential Revision: D67816470
fbshipit-source-id: 6575415c864d0f539e283889760cd2331bf226a7
Summary: Now that we have SQLAlchemy 2.0, we can fully use them.
Reviewed By: bottler
Differential Revision: D66920096
fbshipit-source-id: 25c0ea1c4f7361e66348035519627dc961b9e6e6
Summary:
Converts the directory specified to use the Ruff formatter in pyfmt
ruff_dog
If this diff causes merge conflicts when rebasing, please run
`hg status -n -0 --change . -I '**/*.{py,pyi}' | xargs -0 arc pyfmt`
on your diff, and amend any changes before rebasing onto latest.
That should help reduce or eliminate any merge conflicts.
allow-large-files
Reviewed By: bottler
Differential Revision: D66472063
fbshipit-source-id: 35841cb397e4f8e066e2159550d2f56b403b1bef
Summary:
- Hipified PyTorch Pulsar
- Created a separate target for Pulsar tests and enabled RE testing
- The full PyTorch3D test suite requires additional work, like fixing EGL dependencies on AMD
Reviewed By: danzimm
Differential Revision: D61339912
fbshipit-source-id: 0d10bc966e4de4a959f3834a386bad24e449dc1f
Summary: `c10::optional` is an alias for `std::optional`. Let's remove the alias and use the real thing.
Reviewed By: meyering
Differential Revision: D63402341
fbshipit-source-id: 241383e7ca4b2f3f1f9cac3af083056123dfd02b
Summary: `c10::optional` is an alias for `std::optional`. Let's remove the alias and use the real thing.
Reviewed By: palmje
Differential Revision: D63409387
fbshipit-source-id: fb6db59a14db9e897e2e6b6ad378f33bf2af86e8