pytorch3d

mirror of https://github.com/facebookresearch/pytorch3d.git synced 2026-02-06 05:52:17 +08:00

Author	SHA1	Message	Date
Bowie Chen	0c3b204375	apply Black 25.11.0 style in fbcode (70/92) Summary: Formats the covered files with pyfmt. paintitblack Reviewed By: itamaro Differential Revision: D90476295 fbshipit-source-id: 5101d4aae980a9f8955a4cb10bae23997c48837f	2026-01-12 02:54:36 -08:00
Eugene Park	2d4d345b6f	Improve `ball_query()` runtime for large-scale cases (#2006) Summary: ### Overview The current C++ code for `pytorch3d.ops.ball_query()` performs floating point multiplication for every coordinate of every pair of points (up until the maximum number of neighbor points is reached). This PR modifies the code (for both CPU and CUDA versions) to implement the idea presented [here](https://stackoverflow.com/a/3939525): a `D`-cube around the `D`-ball is first constructed, and any point pairs falling outside the cube are skipped, without explicitly computing the squared distances. This change is especially useful when the dimension `D` and the number of points `P2` are large and the radius is much smaller than the overall volume of space occupied by the point clouds; as much as ~2.5x speedup (CPU case; ~1.8x speedup in CUDA case) is observed when `D = 10` and `radius = 0.01`. In all benchmark cases, points were distributed uniformly at random inside a unit `D`-cube. The benchmark code used was different from `tests/benchmarks/bm_ball_query.py` (only the forward part is benchmarked, larger input sizes were used) and is stored in `tests/benchmarks/bm_ball_query_large.py`. 
### Average time comparisons [8 plots omitted: cpu and cuda for cases 03-0 01, 03-0 10, 10-0 01, 10-0 10 (avg)] ### Peak time comparisons [8 plots omitted: cpu and cuda for the same four cases (peak)] ### Full benchmark logs [benchmark-before-change.txt](https://github.com/user-attachments/files/22978300/benchmark-before-change.txt) [benchmark-after-change.txt](https://github.com/user-attachments/files/22978299/benchmark-after-change.txt) Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/2006 Reviewed By: shapovalov Differential Revision: D85356394 Pulled By: bottler fbshipit-source-id: 9b3ce5fc87bb73d4323cc5b4190fc38ae42f41b2	2025-10-30 05:01:32 -07:00
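The cube-before-ball idea described in the commit above can be sketched in plain Python. This is only an illustration, not the C++/CUDA code from the PR: the function name `ball_query_sketch`, its list-of-tuples inputs, and the strict `<` comparison against the squared radius are all assumptions made for this example.

```python
def ball_query_sketch(p1, p2, radius, K):
    """For each point in p1, collect up to K indices of p2 points inside the ball.

    Hypothetical helper for illustration; not the pytorch3d API.
    """
    radius2 = radius * radius
    neighbors = []
    for a in p1:
        found = []
        for j, b in enumerate(p2):
            # Cube test: a point outside the D-cube of half-width `radius`
            # cannot lie inside the D-ball, so skip it without any multiplication.
            if any(abs(x - y) > radius for x, y in zip(a, b)):
                continue
            # Only points inside the cube pay for the squared distance.
            dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
            if dist2 < radius2:  # strict inequality is an assumption
                found.append(j)
                if len(found) == K:
                    break
        neighbors.append(found)
    return neighbors
```

The cube test rejects a pair as soon as one coordinate differs by more than the radius, which is why the saving grows when `D` is large and the radius is small relative to the occupied volume.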
Thomas Polasek	055ab3a2e3	Convert directory fbcode/vision to use the Ruff Formatter Summary: Converts the directory specified to use the Ruff formatter in pyfmt ruff_dog If this diff causes merge conflicts when rebasing, please run `hg status -n -0 --change . -I '*/.{py,pyi}' \| xargs -0 arc pyfmt` on your diff, and amend any changes before rebasing onto latest. That should help reduce or eliminate any merge conflicts. allow-large-files Reviewed By: bottler Differential Revision: D66472063 fbshipit-source-id: 35841cb397e4f8e066e2159550d2f56b403b1bef	2024-11-26 02:38:20 -08:00
generatedunixname89002005287564	1f92c4e9d2	vision/fair Reviewed By: zsol Differential Revision: D53258682 fbshipit-source-id: 3f006b5f31a2b1ffdc6323d3a3b08ac46c3162ce	2024-01-31 07:43:49 -08:00
Jiali Duan	8b8291830e	Marching Cubes cuda extension Summary: Torch CUDA extension for Marching Cubes
- MC involving 3 steps:
  - 1st forward pass to collect vertices and occupied state for each voxel
  - Compute compactVoxelArray to skip non-empty voxels
  - 2nd pass to generate interpolated vertex positions and faces by marching through the grid
- In contrast to existing MC:
  - Bind each interpolated vertex with a global edge_id to address floating-point precision
  - Added deduplication process to remove redundant vertices and faces

Benchmarks (ms):
| N / V(^3) | python | C++ | CUDA | Speedup |
| 2 / 20 | 12176873 | 24338 | 4363 | 2790x/5x |
| 1 / 100 | - | 3070511 | 27126 | 113x |
| 2 / 100 | - | 5968934 | 53129 | 112x |
| 1 / 256 | - | 61278092 | 430900 | 142x |
| 2 / 256 | - | 125687930 | 856941 | 146x |

Reviewed By: kjchalup Differential Revision: D39644248 fbshipit-source-id: d679c0c79d67b98b235d12296f383d760a00042a	2022-11-15 19:42:04 -08:00
Jiali Duan	0d8608b9f9	Marching Cubes C++ torch extension Summary: Torch C++ extension for Marching Cubes - Add torch C++ extension for marching cubes. Observe a speedup of ~255x-324x (over varying batch sizes and spatial resolutions) - Add C++ impl in existing unit-tests. (Note: this ignores all push blocking failures!) Reviewed By: kjchalup Differential Revision: D39590638 fbshipit-source-id: e44d2852a24c2c398e5ea9db20f0dfaa1817e457	2022-10-06 11:13:53 -07:00
Gavin Peng	6471893f59	Multithread CPU naive mesh rasterization Summary: Threaded the for loop: ``` for (int yi = 0; yi < H; ++yi) {...} ``` in function `RasterizeMeshesNaiveCpu()`. Chunk sizes are approximately equal. Reviewed By: bottler Differential Revision: D40063604 fbshipit-source-id: 09150269405538119b0f1b029892179501421e68	2022-10-06 06:42:58 -07:00
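Splitting a row loop into near-equal chunks, one per worker, might look roughly like the sketch below. It is written in Python rather than the C++ of `RasterizeMeshesNaiveCpu()`, and the names `split_rows`, `process_rows`, and the `shade` callback are hypothetical. Python threads will not reproduce the C++ speedup; the sketch only shows how the rows are partitioned.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(H, num_threads):
    """Split range(H) into contiguous (start, end) chunks of near-equal size."""
    # Ceiling division; max(1, ...) keeps the range step valid when H == 0.
    chunk = max(1, (H + num_threads - 1) // num_threads)
    return [(start, min(start + chunk, H)) for start in range(0, H, chunk)]


def process_rows(H, W, num_threads, shade):
    """Fill an H x W image, with each worker handling one chunk of rows."""
    image = [[None] * W for _ in range(H)]

    def work(bounds):
        y0, y1 = bounds
        # Each worker writes only to its own rows, so no locking is needed.
        for yi in range(y0, y1):
            for xi in range(W):
                image[yi][xi] = shade(yi, xi)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, split_rows(H, num_threads)))
    return image
```

For example, `split_rows(10, 4)` yields chunks of three rows with a shorter final chunk, which matches the "approximately equal" wording in the commit message.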
Jiali Duan	03562d87f5	Benchmark Cameras Summary: Address comments by adding benchmarks for cameras and the new fisheye cameras. The dependency functions in test_cameras have been updated in Diff 1. The following two snapshots show benchmarking results. Reviewed By: kjchalup Differential Revision: D38991914 fbshipit-source-id: 51fe9bb7237543e4ee112c9f5068a4cf12a9d482	2022-08-28 11:43:46 -07:00
Jeremy Reizenstein	34f648ede0	move targets Summary: Move testing targets from pytorch3d/tests/TARGETS to pytorch3d/TARGETS. Reviewed By: shapovalov Differential Revision: D36186940 fbshipit-source-id: a4c52c4d99351f885e2b0bf870532d530324039b	2022-05-25 06:16:03 -07:00
Krzysztof Chalupka	7c25d34d22	SplatterPhongShader Benchmarks Summary: Benchmarking. We only use num_faces=2 for splatter, because as far as I can see one would never need to use more. Pose optimization and mesh optimization experiments (see next two diffs) showed that Splatter with 2 faces beats Softmax with 50 and 100 faces in terms of accuracy. Results: We're slower at 64px^2. At 128px and 256px, we're slower than Softmax+50faces, but faster than Softmax+100faces. We're also slower at 10 faces/pix, but expectation as well as results show that more than 2 faces shouldn't be necessary. See also more results at https://fburl.com/gdoc/ttv7u7hp Reviewed By: jcjohnson Differential Revision: D36210575 fbshipit-source-id: c8de28c8a59ce5fe21a47263bd43d2757b15d123	2022-05-24 22:31:12 -07:00
John Reese	bef959c755	formatting changes from black 22.3.0 Summary: Applies the black-fbsource codemod with the new build of pyfmt. paintitblack Reviewed By: lisroach Differential Revision: D36324783 fbshipit-source-id: 280c09e88257e5e569ab729691165d8dedd767bc	2022-05-11 19:55:56 -07:00
Jeremy Reizenstein	c2862ff427	use workaround for points_normals Summary: Use existing workaround for batched 3x3 symeig because it is faster than torch.symeig. Added benchmark showing speedup. True = workaround.
```
Benchmark             Avg Time(μs)   Peak Time(μs)   Iterations
--------------------------------------------------------------------------------
normals_True_3000            16237           17233           31
normals_True_6000            33028           33391           16
normals_False_3000        18623069        18623069            1
normals_False_6000        36535475        36535475            1
```
Should help https://github.com/facebookresearch/pytorch3d/issues/988 Reviewed By: nikhilaravi Differential Revision: D33660585 fbshipit-source-id: d1162b277f5d61ed67e367057a61f25e03888dce	2022-01-24 11:41:55 -08:00
Jeremy Reizenstein	3eb4233844	New raysamplers Summary: New MultinomialRaysampler succeeds GridRaysampler bringing masking and subsampling. Correspondingly, NDCMultinomialRaysampler succeeds NDCGridRaysampler. Reviewed By: nikhilaravi, shapovalov Differential Revision: D33256897 fbshipit-source-id: cd80ec6f35b110d1d20a75c62f4e889ba8fa5d45	2022-01-24 10:52:23 -08:00
Jeremy Reizenstein	741777b5b5	More company name & License Summary: Manual adjustments for license changes. Reviewed By: patricklabatut Differential Revision: D33405657 fbshipit-source-id: 8a21735726f3aece9f9164da9e3b272b27db8032	2022-01-04 11:43:38 -08:00
Jeremy Reizenstein	9eeb456e82	Update license for company name Summary: Update all FB license strings to the new format. Reviewed By: patricklabatut Differential Revision: D33403538 fbshipit-source-id: 97a4596c5c888f3c54f44456dc07e718a387a02c	2022-01-04 11:43:38 -08:00
Jeremy Reizenstein	a0e2d2e3c3	move benchmarks to separate directory Summary: Move benchmarks to a separate directory as tests/ is getting big. Reviewed By: nikhilaravi Differential Revision: D32885462 fbshipit-source-id: a832662a494ee341ab77d95493c95b0af0a83f43	2021-12-07 10:26:50 -08:00

16 Commits