67 Commits

Author SHA1 Message Date
generatedunixname915440834509264
3ba2030aa4 Fix CQS signal readability-braces-around-statements in fbcode/vision/fair
Reviewed By: bottler

Differential Revision: D94068738

fbshipit-source-id: cd47c67d4269ac7461acb73da6de9e4373da9d4c
2026-02-23 05:18:38 -08:00
generatedunixname1262449429094718
79a7fcf02b fbcode/vision/fair/pytorch3d/pytorch3d/csrc/rasterize_meshes/rasterize_meshes_cpu.cpp
Reviewed By: bottler

Differential Revision: D94062914

fbshipit-source-id: 9147dc68d115ce5761ebb7d07c035ac4b664da0b
2026-02-23 05:10:19 -08:00
generatedunixname1417043136753450
e43ed8c76e fbcode/vision/fair/pytorch3d/pytorch3d/transforms/rotation_conversions.py
Reviewed By: bottler

Differential Revision: D93712828

fbshipit-source-id: 3465af450104bb1e5f491e3c0ee0259698cf8ceb
2026-02-22 07:53:20 -08:00
generatedunixname1417043136753450
49f43402c6 fbcode/vision/fair/pytorch3d/pytorch3d/renderer/mesh/textures.py
Reviewed By: bottler

Differential Revision: D93710616

fbshipit-source-id: 599fe7425066bc85c0999765168788f8df7e34ce
2026-02-22 07:13:45 -08:00
generatedunixname1417043136753450
90646d93ab fbcode/vision/fair/pytorch3d/pytorch3d/renderer/mesh/clip.py
Reviewed By: bottler

Differential Revision: D93715239

fbshipit-source-id: 7417015251fe96be72daf4894e946edd43bb9c46
2026-02-22 07:13:09 -08:00
generatedunixname1417043136753450
eabb511410 fbcode/vision/fair/pytorch3d/pytorch3d/loss/mesh_laplacian_smoothing.py
Reviewed By: bottler

Differential Revision: D93709347

fbshipit-source-id: 69710e6082a0785126a121e26f1d96a571360f1d
2026-02-22 07:08:02 -08:00
generatedunixname1417043136753450
e70188ebbc fbcode/vision/fair/pytorch3d/pytorch3d/transforms/transform3d.py
Reviewed By: bottler

Differential Revision: D93713606

fbshipit-source-id: a8aa52328a76d95d3985daec529cdce04ba12bd4
2026-02-22 07:06:34 -08:00
generatedunixname1417043136753450
1bd911d534 fbcode/vision/fair/pytorch3d/pytorch3d/renderer/cameras.py
Reviewed By: bottler

Differential Revision: D93712137

fbshipit-source-id: 3457f0f9fb7d7baa29be2eaf731074a49bdbb0c8
2026-02-22 07:05:45 -08:00
generatedunixname1417043136753450
3aadd19a2b fbcode/vision/fair/pytorch3d/pytorch3d/ops/laplacian_matrices.py
Reviewed By: bottler

Differential Revision: D93708383

fbshipit-source-id: 7576f0c9800ed3d28795e521be5c63799b7e6676
2026-02-22 06:57:57 -08:00
generatedunixname1417043136753450
42d66c1145 fbcode/vision/fair/pytorch3d/pytorch3d/loss/point_mesh_distance.py
Reviewed By: bottler

Differential Revision: D93708351

fbshipit-source-id: 06a877777e4cb72a497a44ff55db0b6222bda83b
2026-02-22 06:55:36 -08:00
generatedunixname1417043136753450
e9ed1cb178 fbcode/vision/fair/pytorch3d/pytorch3d/renderer/utils.py
Reviewed By: bottler

Differential Revision: D93708316

fbshipit-source-id: f8ae2432ad34116278b3f7f7de5146b89c3fe63e
2026-02-22 04:09:20 -08:00
Jeremy Reizenstein
cbcae096a0 Add atol=1e-4 to assertClose calls in test_inverse for Translate
Summary:
Added `atol=1e-4` tolerance parameter to the `assertClose` calls on lines 682 and 683 in the `test_inverse` method of `TestTranslate` class.

This is a retry of D90225548
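
Why an absolute tolerance matters here can be sketched with the standard library. This is only an illustration: `math.isclose` stands in for the project's `assertClose` helper (the two use slightly different formulas), and the values below are made up rather than taken from `test_inverse`.

```python
import math

# A value that should be exactly 0.0 but carries float32-sized round-off.
expected = 0.0
actual = 3e-5  # illustrative value, not from the actual test

# A relative tolerance scales with the magnitudes being compared,
# so on its own it cannot accept a nonzero value against an exact zero.
rel_only = math.isclose(actual, expected, rel_tol=1e-5)

# Adding an absolute tolerance of 1e-4 accepts the same pair.
with_atol = math.isclose(actual, expected, rel_tol=1e-5, abs_tol=1e-4)

print(rel_only, with_atol)
```

Translation components of an inverse transform often land near zero, which is the regime where a purely relative comparison is most fragile.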

Reviewed By: sgrigory

Differential Revision: D90682979

fbshipit-source-id: ac13f000174dd9962326296e1c3116d0d39c7751
2026-01-14 08:57:43 -08:00
generatedunixname537391475639613
5b1cce56bc Fix for T251460511 ("Your diff, D90498281, broke one test")
Reviewed By: sgrigory

Differential Revision: D90649493

fbshipit-source-id: 2a77c45ec8e6e5aa0a20437a765fbb9f0b566406
2026-01-14 08:53:26 -08:00
Bowie Chen
0c3b204375 apply Black 25.11.0 style in fbcode (70/92)
Summary:
Formats the covered files with pyfmt.

paintitblack

Reviewed By: itamaro

Differential Revision: D90476295

fbshipit-source-id: 5101d4aae980a9f8955a4cb10bae23997c48837f
2026-01-12 02:54:36 -08:00
Jeremy Reizenstein
6be5e2da06 Replace assertTrue(torch.allclose(...)) with assertClose in test_transforms.py
Summary:
## LLM-generated Summary:
Replaces self.assertTrue(torch.allclose(...)) with self.assertClose(...) throughout fbcode/vision/fair/pytorch3d/tests/test_transforms.py. This standardizes numeric closeness assertions for clearer failures and consistency while preserving tolerances and test behavior.
 ---
Session: DEV34970678

Reviewed By: shapovalov

Differential Revision: D90251428

fbshipit-source-id: cdae842be82f0ba548802e6977be272134e8508c
2026-01-08 04:35:40 -08:00
Guilherme Albertini
f5f6b78e70 Add initial CUDA 13.0 support for pulsar and pycuda modules
Summary:
CUDA 13.0 introduced breaking changes that cause build failures in pytorch3d:

**1. Symbol Visibility Changes (pulsar)**
- NVCC now forces `__global__` functions to have hidden ELF visibility by default
- `__global__` function template stubs now have internal linkage

**Fix:** Added NVCC flags (`--device-entity-has-hidden-visibility=false` and `-static-global-template-stub=false`) for fbcode builds with CUDA 13.0+.

**2. cuCtxCreate API Change (pycuda)**
- CUDA 13.0 changed `cuCtxCreate` from 3 to 4 arguments
- pycuda 2022.2 (current default) uses the old signature and fails to compile
- pycuda 2025.1.2 (D83501913) includes the CUDA 13.0 fix

**Fix:** Added CUDA 13.0 constraint to pycuda alias to auto-select pycuda 2025.1.2.

**NCCL Compatibility Note:**
- Current stable NCCL (2.25) is NOT compatible with CUDA 13.0 (`cudaTypedefs.h` removed)
- NCCL 2.27+ works with CUDA 13.0 and will become stable in early January 2026 (per HPC Comms team)
- Until then, CUDA 13.0 builds require `-c hpc_comms.use_nccl=2.27`

References:
- GitHub issue: https://github.com/facebookresearch/pytorch3d/issues/2011
- NVIDIA blog: https://developer.nvidia.com/blog/cuda-c-compiler-updates-impacting-elf-visibility-and-linkage/
- FBGEMM_GPU fix: D86474263
- pycuda 2025.1.2 buckification: D83501913

Reviewed By: bottler

Differential Revision: D88816596

fbshipit-source-id: 1ba666dab8c0e06d1286b8d5bc5d84cfc55c86e6
2025-12-17 10:02:10 -08:00
Jeremy Reizenstein
33824be3cb version 0.7.9
Reviewed By: shapovalov

Differential Revision: D87984194

fbshipit-source-id: dee8123a2c3f5cc34ada52f4663c9bbb329e03a7
2025-11-27 09:52:08 -08:00
Eugene Park
2d4d345b6f Improve ball_query() runtime for large-scale cases (#2006)
Summary:
### Overview
The current C++ code for `pytorch3d.ops.ball_query()` performs floating point multiplication for every coordinate of every pair of points (up until the maximum number of neighbor points is reached). This PR modifies the code (for both CPU and CUDA versions) to implement the idea presented [here](https://stackoverflow.com/a/3939525): a `D`-cube around the `D`-ball is first constructed, and any point pairs falling outside the cube are skipped, without explicitly computing the squared distances. This change is especially useful when the dimension `D` and the number of points `P2` are large and the radius is much smaller than the overall volume of space occupied by the point clouds; as much as **~2.5x speedup** (CPU case; ~1.8x speedup in CUDA case) is observed when `D = 10` and `radius = 0.01`. In all benchmark cases, points were uniformly randomly distributed inside a unit `D`-cube.

The benchmark code used was different from `tests/benchmarks/bm_ball_query.py` (only the forward part is benchmarked, larger input sizes were used) and is stored in `tests/benchmarks/bm_ball_query_large.py`.
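
The cube pre-check described in the overview can be sketched in plain Python. This is not the C++/CUDA code from the PR: the function names are hypothetical, and the strict `<` comparison against the squared radius is an assumption.

```python
def within_radius(p, q, radius):
    """Return True if q lies inside the ball of `radius` around p."""
    # Cheap rejection: if any coordinate differs by more than the radius,
    # q is outside the bounding cube and therefore outside the ball,
    # so the squared distance is never computed.
    for a, b in zip(p, q):
        if abs(a - b) > radius:
            return False
    # Only points inside the cube pay for the multiplications.
    dist2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return dist2 < radius * radius  # strictness is an assumption


def ball_query_sketch(points1, points2, radius, max_neighbors):
    """For each point in points1, collect up to max_neighbors indices into points2."""
    result = []
    for p in points1:
        idxs = []
        for j, q in enumerate(points2):
            if within_radius(p, q, radius):
                idxs.append(j)
                if len(idxs) == max_neighbors:
                    break
        result.append(idxs)
    return result


p1 = [(0.0, 0.0, 0.0)]
p2 = [
    (0.005, 0.0, 0.0),      # inside the ball
    (0.5, 0.5, 0.5),        # outside the cube: rejected without multiplying
    (0.009, 0.009, 0.009),  # inside the cube but outside the ball
]
neighbors = ball_query_sketch(p1, p2, radius=0.01, max_neighbors=8)
print(neighbors)
```

With a small radius and widely spread points, most pairs take the early-return path, which is consistent with the larger speedups reported for `radius = 0.01`.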

### Average time comparisons

<img width="360" height="270" alt="cpu-03-0 01-avg" src="https://github.com/user-attachments/assets/6cc79893-7921-44af-9366-1766c3caf142" />
<img width="360" height="270" alt="cuda-03-0 01-avg" src="https://github.com/user-attachments/assets/5151647d-0273-40a3-aac6-8b9399ede18a" />
<img width="360" height="270" alt="cpu-03-0 10-avg" src="https://github.com/user-attachments/assets/a87bc150-a5eb-47cd-a4ba-83c2ec81edaf" />
<img width="360" height="270" alt="cuda-03-0 10-avg" src="https://github.com/user-attachments/assets/e3699a9f-dfd3-4dd3-b3c9-619296186d43" />
<img width="360" height="270" alt="cpu-10-0 01-avg" src="https://github.com/user-attachments/assets/5ec8c32d-8e4d-4ced-a94e-1b816b1cb0f8" />
<img width="360" height="270" alt="cuda-10-0 01-avg" src="https://github.com/user-attachments/assets/168a3dfc-777a-4fb3-8023-1ac8c13985b8" />
<img width="360" height="270" alt="cpu-10-0 10-avg" src="https://github.com/user-attachments/assets/43a57fd6-1e01-4c5e-87a9-8ef604ef5fa0" />
<img width="360" height="270" alt="cuda-10-0 10-avg" src="https://github.com/user-attachments/assets/a7c7cc69-f273-493e-95b8-3ba2bb2e32da" />

### Peak time comparisons

<img width="360" height="270" alt="cpu-03-0 01-peak" src="https://github.com/user-attachments/assets/5bbbea3f-ef9b-490d-ab0d-ce551711d74f" />
<img width="360" height="270" alt="cuda-03-0 01-peak" src="https://github.com/user-attachments/assets/30b5ab9b-45cb-4057-b69f-bda6e76bd1dc" />
<img width="360" height="270" alt="cpu-03-0 10-peak" src="https://github.com/user-attachments/assets/db69c333-e5ac-4305-8a86-a26a8a9fe80d" />
<img width="360" height="270" alt="cuda-03-0 10-peak" src="https://github.com/user-attachments/assets/82549656-1f12-409e-8160-dd4c4c9d14f7" />
<img width="360" height="270" alt="cpu-10-0 01-peak" src="https://github.com/user-attachments/assets/d0be8ef1-535e-47bc-b773-b87fad625bf0" />
<img width="360" height="270" alt="cuda-10-0 01-peak" src="https://github.com/user-attachments/assets/e308e66e-ae30-400f-8ad2-015517f6e1af" />
<img width="360" height="270" alt="cpu-10-0 10-peak" src="https://github.com/user-attachments/assets/c9b5bf59-9cc2-465c-ad5d-d4e23bdd138a" />
<img width="360" height="270" alt="cuda-10-0 10-peak" src="https://github.com/user-attachments/assets/311354d4-b488-400c-a1dc-c85a21917aa9" />

### Full benchmark logs

[benchmark-before-change.txt](https://github.com/user-attachments/files/22978300/benchmark-before-change.txt)
[benchmark-after-change.txt](https://github.com/user-attachments/files/22978299/benchmark-after-change.txt)

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/2006

Reviewed By: shapovalov

Differential Revision: D85356394

Pulled By: bottler

fbshipit-source-id: 9b3ce5fc87bb73d4323cc5b4190fc38ae42f41b2
2025-10-30 05:01:32 -07:00
Nikita Lutsenko
45df20e9e2 clang-format | Format fbsource with clang-format 21.
Reviewed By: ChristianK275

Differential Revision: D85317706

fbshipit-source-id: b399c5c4b75252999442b7d7d2778e7a241b0025
2025-10-26 23:40:59 -07:00
Jeremy Reizenstein
fc6a6b8951 separate multigpu tests
Reviewed By: MichaelRamamonjisoa

Differential Revision: D83477594

fbshipit-source-id: 5ea67543e288e9a06ee5141f436e879aa5cfb7f3
2025-10-09 08:17:20 -07:00
Kihyuk Sohn
7711bf34a8 fix device error
Summary: When using `sample_farthest_points` with `lengths`, it throws an error because of the device mismatch between `lengths` and `torch.rand(lengths.size())` on GPU.

Reviewed By: bottler

Differential Revision: D82378997

fbshipit-source-id: 8e929256177d543d1dd1249e8488f70e03e4101f
2025-09-15 06:41:00 -07:00
Jeremy Reizenstein
d098beb7a7 allow python 3.12
Summary: Remove use of distutils

Reviewed By: MichaelRamamonjisoa

Differential Revision: D81594552

fbshipit-source-id: 4e979d5e03ea873bd09bc2b674b7e6480b9c6d65
2025-09-04 08:31:32 -07:00
Jeremy Reizenstein
dd068703d1 test fixes
Summary: Some random seed changes. Skip multigpu tests when there's only one gpu. This is a better fix for what AI is doing in D80600882.

Reviewed By: MichaelRamamonjisoa

Differential Revision: D80625966

fbshipit-source-id: ac3952e7144125fd3a05ad6e4e6e5976ae10a8ef
2025-08-27 06:55:50 -07:00
Antoine Dumoulin
50f8efa1cb Use sparse_coo_tensor in laplacian_matrices.py (#1991)
Summary:
update obsolete torch.sparse.FloatTensor to torch.sparse_coo_tensor

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1991

Reviewed By: MichaelRamamonjisoa

Differential Revision: D80084359

Pulled By: bottler

fbshipit-source-id: dc6c7a90211113d1ce5338a92c8c0030bfe12e65
2025-08-13 07:55:57 -07:00
Olga Gerasimova
5043d15361 avoid CPU/GPU sync in sample_farthest_points
Summary:
Optimizing sample_farthest_points by reducing CPU/GPU sync:
1. Replacing the iterative randint for starting indices with one function call, if length is constant
2. Avoiding a sync when fetching the maximum number of sample points, if we sample the same amount
3. Initializing one tensor for samples and indices

compare
https://fburl.com/mlhub/7wk0xi98
Before
{F1980383703}
after
{F1980383707}

Histograms match pretty closely
{F1980464338}

Reviewed By: bottler

Differential Revision: D78731869

fbshipit-source-id: 060528ae7a1e0fbbd005d129c151eaf9405841de
2025-07-23 10:23:40 -07:00
Stone Tao
e3d3a67a89 Clamp matrices in matrix_to_euler_angles function (#1989)
Summary:
Closes https://github.com/facebookresearch/pytorch3d/issues/1988

Credit goes to tylerlum for raising this issue and suggesting this fix in https://github.com/haosulab/ManiSkill/pull/1090
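
The kind of failure that clamping guards against can be sketched as follows. This is a stand-alone illustration under the assumption that the problem is a matrix entry drifting just outside `[-1, 1]` before an inverse trigonometric call; `safe_asin` is a hypothetical helper, not the function changed in this commit.

```python
import math


def safe_asin(x):
    """asin with the argument clamped to its valid domain [-1, 1]."""
    return math.asin(max(-1.0, min(1.0, x)))


# A rotation-matrix entry that should be exactly 1.0 but picked up round-off.
entry = 1.0 + 1e-7

try:
    math.asin(entry)  # outside the domain: the stdlib raises ValueError
    unclamped_ok = True
except ValueError:
    unclamped_ok = False

angle = safe_asin(entry)  # clamped to 1.0, so the result is pi/2
print(unclamped_ok, angle)
```

The standard library raises on an out-of-domain argument, whereas tensor libraries typically return NaN, which then propagates silently through later computations.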

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1989

Reviewed By: MichaelRamamonjisoa

Differential Revision: D78021983

Pulled By: bottler

fbshipit-source-id: d723f1924a399f4d7fd072e96ea740ae73cf280f
2025-07-10 06:08:19 -07:00
Jeremy Reizenstein
e55ea90609 disable import tests
Summary: these tests don't work, aren't needed right now

Reviewed By: MichaelRamamonjisoa

Differential Revision: D78084742

fbshipit-source-id: 9cff2b30427dec314e34e81179816af4073bbe23
2025-07-10 05:20:22 -07:00
Melvin He
3aee2a6005 Fixes bus error hard crashes on Apple Silicon MPS devices
Summary:
Fixes hard crashes (bus errors) when using an MPS device (Apple Silicon) by adding CPU checks throughout files in the csrc subdirectories to check that tensors are on a CPU device.

Note that this is the fourth and final part of a larger change spanning multiple files and directories.

Reviewed By: bottler

Differential Revision: D77698176

fbshipit-source-id: 5bc9e3c5cea61afd486aed7396f390d92775ec6d
2025-07-03 12:34:37 -07:00
Melvin He
c5ea8fa49e Adds CHECK_CPU macros checks for tensors not on CPU
Summary:
Adds CHECK_CPU macro checks, which verify that a tensor is on the CPU device, throughout the csrc directories and subdirectories up to `pulsar`.

Note that this is the third part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.

Reviewed By: bottler

Differential Revision: D77696998

fbshipit-source-id: 470ca65b23d9965483b5bdd30c712da8e1131787
2025-07-03 08:29:36 -07:00
Melvin He
3ff6c5ab85 Error instead of crash for tensors on exotic devices
Summary:
Adds CHECK_CPU macro checks, which verify that a tensor is on the CPU device, throughout the csrc directories up to `marching_cubes`. Directories updated include `gather_scatter`, `interp_face_attrs`, `iou_box3d`, `knn`, and `marching_cubes`.

Note that this is the second part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.

Reviewed By: bottler

Differential Revision: D77558550

fbshipit-source-id: 762a0fe88548dc8d0901b198a11c40d0c36e173f
2025-07-01 09:14:38 -07:00
Srivathsan Govindarajan
267bd8ef87 Revert _sqrt_positive_part change
Reviewed By: bottler

Differential Revision: D77549647

fbshipit-source-id: a0ef0bc015c643ad7416c781886e2e23b5105bdd
2025-06-30 14:13:27 -07:00
Melvin He
177eec6378 Error instead of crash for tensors on exotic devices (#1986)
Summary:
Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1986

Adds device checks to prevent crashes on unsupported devices in PyTorch3D. Updates the `pytorch3d_cutils.h` file to include new macro CHECK_CPU that checks if a tensor is on the CPU device. This macro is then used in the directories from `ball_query` to `face_area_normals` to ensure that tensors are not on unsupported devices like MPS.

Note that this is the first part of a larger change, and to keep diffs better organized, subsequent diffs will update the remaining directories.

Reviewed By: bottler

Differential Revision: D77473296

fbshipit-source-id: 13dc84620dee667bddebad1dade2d2cb5a59c737
2025-06-30 12:27:38 -07:00
Srivathsan Govindarajan
71db7a0ea2 Removing dynamic shape ops and boolean indexing in matrix_to_quaternion
Summary:
The current implementation of `matrix_to_quaternion` and `_sqrt_positive_part` uses boolean indexing, which can slow down performance and cause incompatibility with `torch.compile` unless `torch._dynamo.config.capture_dynamic_output_shape_ops` is set to `True`.

To enhance performance and compatibility, I recommend using `torch.gather` to select the best-conditioned quaternions and `F.relu` instead of `x>0` (bottler's suggestion).

For a detailed comparison of the implementation differences when using `torch.compile`, please refer to my Bento notebook
N7438339.

Reviewed By: bottler

Differential Revision: D77176230

fbshipit-source-id: 9a6a2e0015b5865056297d5f45badc3c425b93ce
2025-06-25 01:18:46 -07:00
Grace Cheng
6020323d94 Fix Self-Assignment in CUDA Stream Parameter in renderer.forward.device.h
Summary: Resolved self-assignment warnings in the `renderer.forward.device.h` file by removing redundant assignments of the `stream` variable to itself in `cub::DeviceSelect::Flagged` function calls. This change eliminates compiler errors and ensures cleaner, more efficient code execution.

Reviewed By: bottler

Differential Revision: D76554140

fbshipit-source-id: 28eae0186246f51a8ac8002644f184349aa49560
2025-06-13 11:00:16 -07:00
Emmanuel Ferdman
182e845c19 Resolve logger warnings (#1981)
Summary:
# PR Summary
This small PR resolves the annoying deprecation warnings of the `logging` library:
```python
DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
```
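
A minimal sketch of the replacement, using only the standard library. The logger name and message are made up for illustration, and the handler setup exists only so that the output can be captured.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("pt3d_example")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.WARNING)
logger.propagate = False  # keep the example's output out of the root logger

# Before: logger.warn("...")  (deprecated alias that triggers the warning above)
logger.warning("texture atlas missing, using default")

output = stream.getvalue()
print(output, end="")
```

Both spellings log at the `WARNING` level, so the change affects only the deprecation noise, not what gets logged.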

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1981

Reviewed By: MichaelRamamonjisoa

Differential Revision: D75287169

Pulled By: bottler

fbshipit-source-id: 9ff9f5dd648aca8d8bb5d33577909da711d18647
2025-06-10 02:27:54 -07:00
generatedunixname89002005287564
f315ac131b Fix CQS signal facebook-unused-include-check in fbcode/vision/fair/pytorch3d/pytorch3d/csrc
Reviewed By: dtolnay

Differential Revision: D75938951

fbshipit-source-id: 8e4f9ce82ec988a30e4c8d54881b78560ceab0e0
2025-06-04 13:09:58 -07:00
Nick Riasanovsky
fc08621879 Fix distutils failure in Triton Beta testing
Summary: Fixes the distutils issues similar to D73934713

Reviewed By: bottler

Differential Revision: D75631611

fbshipit-source-id: 09c354d8cc51ff2c46f4688d7f674370e3f48f1e
2025-05-29 18:18:49 -07:00
generatedunixname89002005287564
3f327a516b Fix CQS signal facebook-unused-include-check in fbcode/vision/fair/pytorch3d/pytorch3d/csrc/pulsar
Reviewed By: dtolnay

Differential Revision: D75209078

fbshipit-source-id: 6b67d3354091d18b8171a6f4b38465ffcc9e17c5
2025-05-26 19:14:57 -07:00
Ting Xu
366eff21d9 Fix PyTorch3D build failure on windows
Summary: Replace #defines with typedefs, following the instructions at https://github.com/facebookresearch/pytorch3d/issues/1970#issuecomment-2894339456

Reviewed By: bottler

Differential Revision: D75083182

fbshipit-source-id: 7131fe555bb0da615b341e77ddd8761ebce9d7eb
2025-05-21 07:46:49 -07:00
Jeff Daily
0a59450f0e remove IntWrapper (#1964)
Summary:
I could not access https://github.com/NVlabs/cub/issues/172 to understand whether IntWrapper was still necessary but the comment is from 5 years ago and causes problems for the ROCm build.

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1964

Reviewed By: MichaelRamamonjisoa

Differential Revision: D71937895

Pulled By: bottler

fbshipit-source-id: 5e0351e1bd8599b670436cd3464796eca33156f6
2025-03-28 08:16:54 -07:00
Richard Barnes
3987612062 Fix CUDA kernel index data type in vision/fair/pytorch3d/pytorch3d/csrc/compositing/alpha_composite.cu +10
Summary:
CUDA kernel variables matching the type `(thread|block|grid).(Idx|Dim).(x|y|z)` [have the data type `uint`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#built-in-variables).

Many programmers mistakenly use implicit casts to turn these data types into `int`. In fact, the [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/) itself is inconsistent and incorrect in its use of data types in programming examples.

The result of these implicit casts is that our kernels may give unexpected results when exposed to large datasets, i.e., those exceeding ~2B items.

While we now have linters in place to prevent simple mistakes (D71236150), our codebase has many problematic instances. This diff fixes some of them.
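
The failure mode can be simulated on the host with `ctypes`. This is a sketch of the arithmetic only, not of the kernels touched by this diff: the helper name is hypothetical, and it assumes the common behavior where converting an out-of-range unsigned value to `int` reinterprets the bits.

```python
import ctypes


def global_index_as_int32(block_idx, block_dim, thread_idx):
    """Simulate `int i = blockIdx.x * blockDim.x + threadIdx.x;`."""
    # The arithmetic is carried out in unsigned 32-bit, as with the built-ins.
    unsigned = ctypes.c_uint32(block_idx * block_dim + thread_idx).value
    # The implicit conversion to a signed 32-bit int reinterprets those bits.
    return ctypes.c_int32(unsigned).value


small = global_index_as_int32(10, 256, 3)
large = global_index_as_int32(9_000_000, 256, 0)  # true index is 2,304,000,000
print(small, large)
```

Once the true index passes 2^31 the signed value turns negative, which is the "~2B items" threshold mentioned above.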

Reviewed By: dtolnay

Differential Revision: D71355356

fbshipit-source-id: cea44891416d9efd2f466d6c45df4e36008fa036
2025-03-19 13:21:43 -07:00
Alexandros Benetatos
06a76ef8dd Correct "fast" matrix_to_axis_angle near pi (#1953)
Summary:
A continuation of https://github.com/facebookresearch/pytorch3d/issues/1948 -- this commit fixes a small numerical issue with `matrix_to_axis_angle(..., fast=True)` near `pi`.
bottler feel free to check this out, it's a single-line change.

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1953

Reviewed By: MichaelRamamonjisoa

Differential Revision: D70088251

Pulled By: bottler

fbshipit-source-id: 54cc7f946283db700cec2cd5575cf918456b7f32
2025-03-11 12:25:59 -07:00
Richard Barnes
21205730d9 Fix unused-variable issues, mostly relating to AMD/HIP
Reviewed By: meyering

Differential Revision: D70845538

fbshipit-source-id: 8e52b5e1f1d96b86404fc3b8cbc6fb952e2cb1a6
2025-03-08 13:03:17 -08:00
Richard Barnes
7e09505538 Enable -Wunused-value in vision/PACKAGE +1
Summary:
This diff enables compilation warning flags for the directory in question. Further details are in [this workplace post](https://fb.workplace.com/permalink.php?story_fbid=pfbid02XaWNiCVk69r1ghfvDVpujB8Hr9Y61uDvNakxiZFa2jwiPHscVdEQwCBHrmWZSyMRl&id=100051201402394).

This is a low-risk diff. There are **no run-time effects** and the diff has already been observed to compile locally. **If the code compiles, it works; test errors are spurious.**

Differential Revision: D70282347

fbshipit-source-id: e2fa55c002d7124b13450c812165d244b8a53f4e
2025-03-04 17:49:30 -08:00
Nicholas Ormrod
20bd8b33f6 facebook-unused-include-check in fbcode/vision
Summary:
Remove headers flagged by facebook-unused-include-check over fbcode.vision.

+ format and autodeps

This is a codemod. It was automatically generated and will be landed once it is approved and tests are passing in sandcastle.
You have been added as a reviewer by Sentinel or Butterfly.

Autodiff project: uiv
Autodiff partition: fbcode.vision
Autodiff bookmark: ad.uiv.fbcode.vision

Reviewed By: dtolnay

Differential Revision: D70403619

fbshipit-source-id: d109c15774eeb3d809875f75fa2a26ed20d7f9a6
2025-02-28 18:08:12 -08:00
alex-bene
7a3c0cbc9d Increase performance for conversions including axis angles (#1948)
Summary:
This is an extension of https://github.com/facebookresearch/pytorch3d/issues/1544 with various speed, stability, and readability improvements. (I could not find a way to make a commit to the existing PR). This PR is still based on the [Rodrigues' rotation formula](https://en.wikipedia.org/wiki/Rotation_formalisms_in_three_dimensions#Rotation_matrix_%E2%86%94_Euler_axis/angle).

The motivation is the same; this change speeds up the conversions up to 10x, depending on the device, batch size, etc.

### Notes
- As the angles get very close to `π`, the existing implementation and the proposed one start to differ. However, (my understanding is that) this is not a problem as the axis can not be stably inferred from the rotation matrix in this case in general.
- bottler , I tried to follow similar conventions as existing functions to deal with weird angles, let me know if something needs to be changed to merge this.
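
For reference, Rodrigues' rotation formula for the axis-angle to matrix direction can be sketched in plain Python. This is not the batched implementation from the PR: the function name mirrors the library's but the body is a textbook expansion, and the small-angle threshold is an arbitrary assumption.

```python
import math


def axis_angle_to_matrix(axis_angle):
    """Rotation matrix from an axis-angle 3-vector (direction = axis, norm = angle)."""
    x, y, z = axis_angle
    theta = math.sqrt(x * x + y * y + z * z)
    if theta < 1e-12:  # assumed cutoff: treat a near-zero rotation as identity
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = x / theta, y / theta, z / theta
    s, c = math.sin(theta), math.cos(theta)
    v = 1.0 - c
    # R = I + sin(theta) K + (1 - cos(theta)) K^2, written out entry by entry,
    # where K is the cross-product matrix of the unit axis (kx, ky, kz).
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


# A quarter turn about the z axis maps the x axis onto the y axis.
R = axis_angle_to_matrix((0.0, 0.0, math.pi / 2))
print(R)
```

Going the other way, the axis is recovered from the antisymmetric part of the matrix, which shrinks towards zero as the angle approaches `π`; that is consistent with the first note above.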

Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1948

Reviewed By: MichaelRamamonjisoa

Differential Revision: D69193009

Pulled By: bottler

fbshipit-source-id: e5ed34b45b625114ec4419bb89e22a6aefad4eeb
2025-02-07 07:37:42 -08:00
Roman Shapovalov
215590b497 In FrameDataBuilder, set all path even if we don’t load blobs
Summary:
This change is not fully backward compatible (BC): some None paths will be replaced by metadata paths, even when they were not used for data loading.

Moreover, removing the legacy fix to the paths in the old CO3D release.

Reviewed By: bottler

Differential Revision: D69048238

fbshipit-source-id: 2a8b26d7b9f5e2adf39c65888b5863a5a9de1996
2025-02-06 09:41:44 -08:00
Antoine Toisoul
43cd681d4f Updates to Implicitron dataset, metrics and tools
Summary: Update Pytorch3D to be able to run assetgen (see later diffs in the stack)

Reviewed By: shapovalov

Differential Revision: D65942513

fbshipit-source-id: 1d01141c9f7e106608fa591be6e0d3262cb5944f
2025-01-27 09:43:42 -08:00
Roman Shapovalov
42a4a7d432 Generalising SqlIndexDataset to support subtypes of SqlSequenceAnnotation
Summary: We did not often extend sequence-level metadata but now for applications like text-to-3D/video, we need to store captions and similar.

Reviewed By: bottler

Differential Revision: D68269926

fbshipit-source-id: f8af308adce51863d719a335d85cd2558943bd4c
2025-01-20 03:39:06 -08:00
generatedunixname89002005307016
699bc671ca Add missing Pyre mode headers] [batch:3/1531] [shard:41/N]
Differential Revision: D68316763

fbshipit-source-id: fb3e1e1a17786f6f681f1b11b48b4efd7a8ac311
2025-01-17 12:41:56 -08:00
Roman Shapovalov
49cf5a0f37 Loading fg probability from the alpha channel of image_rgb
Summary:
It is often easier to store the mask together with RGB, especially for renders. The logic in this diff:
* if load_mask and mask_path provided, take the mask from mask_path,
* otherwise, check if the image has the alpha channel and take it as a mask.
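
The two rules above can be sketched as a small decision function. All names here are hypothetical and the return values are labels only; the actual FrameData loading code is not shown in this log.

```python
def choose_mask_source(load_mask, mask_path, image_channels):
    """Decide where the foreground mask should come from.

    Returns "mask_file", "alpha_channel", or None (hypothetical labels).
    """
    # Rule 1: an explicitly requested mask with a path takes priority.
    if load_mask and mask_path is not None:
        return "mask_file"
    # Rule 2: otherwise fall back to the alpha channel of an RGBA image.
    if image_channels == 4:
        return "alpha_channel"
    return None


print(choose_mask_source(True, "masks/0001.png", 4))
print(choose_mask_source(True, None, 4))
print(choose_mask_source(False, None, 3))
```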

Reviewed By: antoinetlc

Differential Revision: D68160212

fbshipit-source-id: d9b6779f90027a4987ba96800983f441edff9c74
2025-01-15 11:53:30 -08:00
Roman Shapovalov
89b851e64c Refactor a utility function for bbox conversion
Summary: This function makes it easier to extend FrameData class with new channels; brushing it up a bit.

Reviewed By: bottler

Differential Revision: D67816470

fbshipit-source-id: 6575415c864d0f539e283889760cd2331bf226a7
2025-01-06 04:17:57 -08:00
Roman Shapovalov
5247f6ad74 Fixing type hints in FrameData
Summary: As subj

Reviewed By: bottler

Differential Revision: D67791200

fbshipit-source-id: c2db01c94718102618f4c8bc5c5130c65ee1d81f
2025-01-06 04:17:57 -08:00
Roman Shapovalov
e41aff47db Adding default values to FrameData for internal usage
Summary: Ensuring all fields in FrameData have defaults.

Reviewed By: bottler

Differential Revision: D67762780

fbshipit-source-id: b680d29a1a11689850905978df544cdb4eb7ddcd
2025-01-06 04:17:57 -08:00
Roman Shapovalov
64a5bfadc8 Adding SQL Dataset related files to the build script
Summary: Now that we have SQLAlchemy 2.0, we can fully use them.

Reviewed By: bottler

Differential Revision: D66920096

fbshipit-source-id: 25c0ea1c4f7361e66348035519627dc961b9e6e6
2024-12-23 16:05:26 -08:00
Thomas Polasek
055ab3a2e3 Convert directory fbcode/vision to use the Ruff Formatter
Summary:
Converts the directory specified to use the Ruff formatter in pyfmt

ruff_dog

If this diff causes merge conflicts when rebasing, please run
`hg status -n -0 --change . -I '**/*.{py,pyi}' | xargs -0 arc pyfmt`
on your diff, and amend any changes before rebasing onto latest.
That should help reduce or eliminate any merge conflicts.

allow-large-files

Reviewed By: bottler

Differential Revision: D66472063

fbshipit-source-id: 35841cb397e4f8e066e2159550d2f56b403b1bef
2024-11-26 02:38:20 -08:00
Edward Yang
f6c2ca6bfc Prepare for "Fix type-safety of torch.nn.Module instances": wave 2
Summary: See D52890934

Reviewed By: malfet, r-barnes

Differential Revision: D66245100

fbshipit-source-id: 019058106ac7eaacf29c1c55912922ea55894d23
2024-11-21 11:08:51 -08:00
Jeremy Reizenstein
e20cbe9b0e test fixes and lints
Summary:
- followup recent pyre change D63415925
- make tests remove temporary files
- weights_only=True in torch.load
- lint fixes

3 test fixes from VRehnberg in https://github.com/facebookresearch/pytorch3d/issues/1914
- imageio channels fix
- frozen decorator in test_config
- load_blobs positional

Reviewed By: MichaelRamamonjisoa

Differential Revision: D66162167

fbshipit-source-id: 7737e174691b62f1708443a4fae07343cec5bfeb
2024-11-20 09:15:51 -08:00
Jeremy Reizenstein
c17e6f947a run CI tests on main
Reviewed By: MichaelRamamonjisoa

Differential Revision: D66162168

fbshipit-source-id: 90268c1925fa9439b876df143035c9d3c3a74632
2024-11-20 05:06:52 -08:00
Yann Noutary
91c9f34137 Add safeguard in case num_tris diverges
Summary:
This PR adds a safeguard preventing `num_tris` from overflowing `MAX_TRIS`-length arrays. The update rule of `num_tris` is bounded:

 - max(num_tris(t)) = 2*num_tris(t-1)
 - num_tris(0) = 12
 - t <= 6

So :
 - max(num_tris) = 2^6*12
 - max(num_tris) = 768
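
The bound can be checked with a few lines of arithmetic. The variable names are illustrative; only the starting value of 12, the doubling rule, and the limit of 6 steps come from the summary above.

```python
INITIAL_TRIS = 12  # num_tris(0)
MAX_STEPS = 6      # t <= 6

num_tris = INITIAL_TRIS
for _ in range(MAX_STEPS):
    num_tris *= 2  # worst case: the count doubles at every step

print(num_tris)
```

This reproduces the closed form `2^6 * 12`.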

Reviewed By: bottler

Differential Revision: D66162573

fbshipit-source-id: e269a79c75c6cc33306986b1f1256cffbe96c730
2024-11-20 01:24:28 -08:00
Jeremy Reizenstein
81d82980bc Fix ogl test hang
Summary: See https://github.com/facebookresearch/pytorch3d/issues/1908

Reviewed By: MichaelRamamonjisoa

Differential Revision: D65280253

fbshipit-source-id: ec05902c5f2f7eb9ddd92bda0045cc3564b8c091
2024-11-06 11:40:42 -08:00
Jeremy Reizenstein
8fe6934885 fix subdivide_meshes with empty mesh #1788
Summary:
Simplify code

fixes https://github.com/facebookresearch/pytorch3d/issues/1788

Reviewed By: MichaelRamamonjisoa

Differential Revision: D61847675

fbshipit-source-id: 48400875d1d885bb3615bc9f4b3c7c3d822b67e7
2024-11-06 11:40:26 -08:00
bottler
c434957b2a Run tests in github action (#1896)
Summary: Pull Request resolved: https://github.com/facebookresearch/pytorch3d/pull/1896

Reviewed By: MichaelRamamonjisoa

Differential Revision: D65272512

Pulled By: bottler

fbshipit-source-id: 3bcfab43acd2d6be5444ff25178381510ddac015
2024-11-06 11:15:34 -08:00
Jeremy Reizenstein
dd2a11b5fc Fix OFF for new numpy errors
Summary: Error messages have changed around numpy version 2, making existing code fail.

Reviewed By: MichaelRamamonjisoa

Differential Revision: D65280674

fbshipit-source-id: b3ae613ea8f0f4ae20fb6e5e816314b8c10e6c65
2024-11-06 11:13:59 -08:00
Richard Barnes
9563ef79ca c10::optional -> std::optional in some files
Reviewed By: jermenkoo

Differential Revision: D65425234

fbshipit-source-id: 1e7707d6b6aab640cc1fdd3bd71a3b50f77a0909
2024-11-04 12:03:51 -08:00
generatedunixname89002005287564
008c7ab58c Pre-silence Pyre Errors for upcoming upgrade] [batch:67/603] [shard:3/N]
Reviewed By: MaggieMoss

Differential Revision: D65290095

fbshipit-source-id: ced87d096aa8939700de5599ce6984cd7ae93912
2024-10-31 16:26:25 -07:00
Jeremy Reizenstein
9eaed4c495 Fix K>1 in multimap UV sampling
Summary:
Fixes https://github.com/facebookresearch/pytorch3d/issues/1897
"Wrong dimension on gather".

Reviewed By: cijose

Differential Revision: D65280675

fbshipit-source-id: 1d587036887972bb2a2ea56d40df19cbf1aeb6cc
2024-10-31 16:05:10 -07:00
243 changed files with 1717 additions and 1141 deletions

@@ -88,7 +88,6 @@ def workflow_pair(
upload=False,
filter_branch,
):
w = []
py = python_version.replace(".", "")
pyt = pytorch_version.replace(".", "")
@@ -127,7 +126,6 @@ def generate_base_workflow(
btype,
filter_branch=None,
):
d = {
"name": base_workflow_name,
"python_version": python_version,

.github/workflows/build.yml (new file, 23 lines)

@@ -0,0 +1,23 @@
name: facebookresearch/pytorch3d/build_and_test
on:
pull_request:
branches:
- main
push:
branches:
- main
jobs:
binary_linux_conda_cuda:
runs-on: 4-core-ubuntu-gpu-t4
env:
PYTHON_VERSION: "3.12"
BUILD_VERSION: "${{ github.run_number }}"
PYTORCH_VERSION: "2.4.1"
CU_VERSION: "cu121"
JUST_TESTRUN: 1
steps:
- uses: actions/checkout@v4
- name: Build and run tests
run: |-
conda create --name env --yes --quiet conda-build
conda run --no-capture-output --name env python3 ./packaging/build_conda.py --use-conda-cuda

View File

@@ -10,7 +10,7 @@
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
DIR=$(dirname "${DIR}")
if [[ -f "${DIR}/TARGETS" ]]
if [[ -f "${DIR}/BUCK" ]]
then
pyfmt "${DIR}"
else
@@ -36,5 +36,5 @@ then
echo "Running pyre..."
echo "To restart/kill pyre server, run 'pyre restart' or 'pyre kill' in fbcode/"
( cd ~/fbsource/fbcode; pyre -l vision/fair/pytorch3d/ )
( cd ~/fbsource/fbcode; arc pyre check //vision/fair/pytorch3d/... )
fi

View File

@@ -19,7 +19,6 @@
#
import os
import sys
import unittest.mock as mock
from recommonmark.parser import CommonMarkParser

View File

@@ -10,6 +10,7 @@ This example demonstrates the most trivial, direct interface of the pulsar
sphere renderer. It renders and saves an image with 10 random spheres.
Output: basic.png.
"""
import logging
import math
from os import path

View File

@@ -11,6 +11,7 @@ interface for sphere renderering. It renders and saves an image with
10 random spheres.
Output: basic-pt3d.png.
"""
import logging
from os import path

View File

@@ -14,6 +14,7 @@ distorted. Gradient-based optimization is used to converge towards the
original camera parameters.
Output: cam.gif.
"""
import logging
import math
from os import path

View File

@@ -14,6 +14,7 @@ distorted. Gradient-based optimization is used to converge towards the
original camera parameters.
Output: cam-pt3d.gif
"""
import logging
from os import path

View File

@@ -18,6 +18,7 @@ This example is not available yet through the 'unified' interface,
because opacity support has not landed in PyTorch3D for general data
structures yet.
"""
import logging
import math
from os import path

View File

@@ -13,6 +13,7 @@ The scene is initialized with random spheres. Gradient-based
optimization is used to converge towards a faithful
scene representation.
"""
import logging
import math

View File

@@ -13,6 +13,7 @@ The scene is initialized with random spheres. Gradient-based
optimization is used to converge towards a faithful
scene representation.
"""
import logging
import math

View File

@@ -4,10 +4,11 @@
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
import argparse
import os.path
import runpy
import subprocess
from typing import List
from typing import List, Tuple
# required env vars:
# CU_VERSION: E.g. cu112
@@ -23,7 +24,7 @@ pytorch_major_minor = tuple(int(i) for i in PYTORCH_VERSION.split(".")[:2])
source_root_dir = os.environ["PWD"]
def version_constraint(version):
def version_constraint(version) -> str:
"""
Given version "11.3" returns " >=11.3,<11.4"
"""
@@ -32,7 +33,7 @@ def version_constraint(version):
return f" >={version},<{upper}"
def get_cuda_major_minor():
def get_cuda_major_minor() -> Tuple[str, str]:
if CU_VERSION == "cpu":
raise ValueError("fn only for cuda builds")
if len(CU_VERSION) != 5 or CU_VERSION[:2] != "cu":
@@ -42,11 +43,10 @@ def get_cuda_major_minor():
return major, minor
def setup_cuda():
def setup_cuda(use_conda_cuda: bool) -> List[str]:
if CU_VERSION == "cpu":
return
return []
major, minor = get_cuda_major_minor()
os.environ["CUDA_HOME"] = f"/usr/local/cuda-{major}.{minor}/"
os.environ["FORCE_CUDA"] = "1"
basic_nvcc_flags = (
@@ -75,6 +75,15 @@ def setup_cuda():
if os.environ.get("JUST_TESTRUN", "0") != "1":
os.environ["NVCC_FLAGS"] = nvcc_flags
if use_conda_cuda:
os.environ["CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT1"] = "- cuda-toolkit"
os.environ["CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT2"] = (
f"- cuda-version={major}.{minor}"
)
return ["-c", f"nvidia/label/cuda-{major}.{minor}.0"]
else:
os.environ["CUDA_HOME"] = f"/usr/local/cuda-{major}.{minor}/"
return []
def setup_conda_pytorch_constraint() -> List[str]:
@@ -95,7 +104,7 @@ def setup_conda_pytorch_constraint() -> List[str]:
return ["-c", "pytorch", "-c", "nvidia"]
def setup_conda_cudatoolkit_constraint():
def setup_conda_cudatoolkit_constraint() -> None:
if CU_VERSION == "cpu":
os.environ["CONDA_CPUONLY_FEATURE"] = "- cpuonly"
os.environ["CONDA_CUDATOOLKIT_CONSTRAINT"] = ""
@@ -116,7 +125,7 @@ def setup_conda_cudatoolkit_constraint():
os.environ["CONDA_CUDATOOLKIT_CONSTRAINT"] = toolkit
def do_build(start_args: List[str]):
def do_build(start_args: List[str]) -> None:
args = start_args.copy()
test_flag = os.environ.get("TEST_FLAG")
@@ -132,8 +141,16 @@ def do_build(start_args: List[str]):
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Build the conda package.")
parser.add_argument(
"--use-conda-cuda",
action="store_true",
help="get cuda from conda ignoring local cuda",
)
our_args = parser.parse_args()
args = ["conda", "build"]
setup_cuda()
args += setup_cuda(use_conda_cuda=our_args.use_conda_cuda)
init_path = source_root_dir + "/pytorch3d/__init__.py"
build_version = runpy.run_path(init_path)["__version__"]
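
The CUDA handling in the hunks above can be summarised with a small standalone sketch. It is not the script itself: `cuda_major_minor` and `cuda_build_settings` are hypothetical names, the settings are returned as a dictionary instead of being written to `os.environ`, and two details are assumptions because the diff elides them (how `upper` is computed in `version_constraint`, and how `CU_VERSION` is sliced into major and minor).

```python
from typing import Dict, List, Tuple


def version_constraint(version: str) -> str:
    """Given "11.3", return " >=11.3,<11.4" (behaviour taken from the docstring)."""
    # Assumption: the upper bound bumps the last dotted component by one.
    parts = version.split(".")
    upper = ".".join(parts[:-1] + [str(int(parts[-1]) + 1)])
    return f" >={version},<{upper}"


def cuda_major_minor(cu_version: str) -> Tuple[str, str]:
    """Split e.g. "cu121" into ("12", "1")."""
    if len(cu_version) != 5 or cu_version[:2] != "cu":
        raise ValueError(f"Bad CU_VERSION {cu_version!r}")
    # Assumption: two digits of major version followed by one digit of minor.
    return cu_version[2:4], cu_version[4]


def cuda_build_settings(
    cu_version: str, use_conda_cuda: bool
) -> Tuple[Dict[str, str], List[str]]:
    """Return (environment variables to set, extra `conda build` arguments)."""
    if cu_version == "cpu":
        return {}, []
    major, minor = cuda_major_minor(cu_version)
    env = {"FORCE_CUDA": "1"}
    if use_conda_cuda:
        # Take the toolkit from conda; these values are read by meta.yaml.
        env["CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT1"] = "- cuda-toolkit"
        env["CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT2"] = (
            f"- cuda-version={major}.{minor}"
        )
        return env, ["-c", f"nvidia/label/cuda-{major}.{minor}.0"]
    # Otherwise rely on a CUDA installation already present on the machine.
    env["CUDA_HOME"] = f"/usr/local/cuda-{major}.{minor}/"
    return env, []
```

With `CU_VERSION` set to `cu121`, as in the new workflow file, the conda branch adds the `nvidia/label/cuda-12.1.0` channel and leaves `CUDA_HOME` unset; without `--use-conda-cuda` the earlier behaviour of pointing `CUDA_HOME` at `/usr/local/cuda-12.1/` is kept.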

View File

@@ -8,10 +8,13 @@ source:
requirements:
build:
- {{ compiler('c') }} # [win]
{{ environ.get('CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT1', '') }}
{{ environ.get('CONDA_CUDA_TOOLKIT_BUILD_CONSTRAINT2', '') }}
{{ environ.get('CONDA_CUB_CONSTRAINT') }}
host:
- python
- mkl =2023 # [x86_64]
{{ environ.get('SETUPTOOLS_CONSTRAINT') }}
{{ environ.get('CONDA_PYTORCH_BUILD_CONSTRAINT') }}
{{ environ.get('CONDA_PYTORCH_MKL_CONSTRAINT') }}
@@ -22,6 +25,7 @@ requirements:
- python
- numpy >=1.11
- torchvision >=0.5
- mkl =2023 # [x86_64]
- iopath
{{ environ.get('CONDA_PYTORCH_CONSTRAINT') }}
{{ environ.get('CONDA_CUDATOOLKIT_CONSTRAINT') }}
@@ -47,8 +51,11 @@ test:
- imageio
- hydra-core
- accelerate
- matplotlib
- tabulate
- pandas
- sqlalchemy
commands:
#pytest .
python -m unittest discover -v -s tests -t .

View File

@@ -7,7 +7,7 @@
# pyre-unsafe
""""
""" "
This file is the entry point for launching experiments with Implicitron.
Launch Training
@@ -44,25 +44,22 @@ The outputs of the experiment are saved and logged in multiple ways:
config file.
"""
import logging
import os
import warnings
from dataclasses import field
import hydra
import torch
from accelerate import Accelerator
from omegaconf import DictConfig, OmegaConf
from packaging import version
from pytorch3d.implicitron.dataset.data_source import (
DataSourceBase,
ImplicitronDataSource,
)
from pytorch3d.implicitron.models.base_model import ImplicitronModelBase
from pytorch3d.implicitron.models.renderer.multipass_ea import (
MultiPassEmissionAbsorptionRenderer,
)

View File

@@ -11,7 +11,6 @@ import os
from typing import Optional
import torch.optim
from accelerate import Accelerator
from pytorch3d.implicitron.models.base_model import ImplicitronModelBase
from pytorch3d.implicitron.tools import model_io
@@ -26,7 +25,6 @@ logger = logging.getLogger(__name__)
class ModelFactoryBase(ReplaceableBase):
resume: bool = True # resume from the last checkpoint
def __call__(self, **kwargs) -> ImplicitronModelBase:
@@ -116,7 +114,9 @@ class ImplicitronModelFactory(ModelFactoryBase):
"cuda:%d" % 0: "cuda:%d" % accelerator.local_process_index
}
model_state_dict = torch.load(
model_io.get_model_path(model_path), map_location=map_location
model_io.get_model_path(model_path),
map_location=map_location,
weights_only=True,
)
try:

View File

@@ -14,9 +14,7 @@ from dataclasses import field
from typing import Any, Dict, List, Optional, Tuple
import torch.optim
from accelerate import Accelerator
from pytorch3d.implicitron.models.base_model import ImplicitronModelBase
from pytorch3d.implicitron.tools import model_io
from pytorch3d.implicitron.tools.config import (
@@ -123,6 +121,7 @@ class ImplicitronOptimizerFactory(OptimizerFactoryBase):
"""
# Get the parameters to optimize
if hasattr(model, "_get_param_groups"): # use the model function
# pyre-fixme[29]: `Union[Tensor, Module]` is not a function.
p_groups = model._get_param_groups(self.lr, wd=self.weight_decay)
else:
p_groups = [
@@ -241,7 +240,7 @@ class ImplicitronOptimizerFactory(OptimizerFactoryBase):
map_location = {
"cuda:%d" % 0: "cuda:%d" % accelerator.local_process_index
}
optimizer_state = torch.load(opt_path, map_location)
optimizer_state = torch.load(opt_path, map_location, weights_only=True)
else:
raise FileNotFoundError(f"Optimizer state {opt_path} does not exist.")
return optimizer_state
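
Several hunks in this change add `weights_only=True` to `torch.load`, which restricts the unpickler to tensors, primitive types, containers and explicitly allow-listed types. The idea can be illustrated with the standard library alone. This is a sketch of the general technique of restricting globals during unpickling, not PyTorch's implementation; `RestrictedUnpickler`, `restricted_loads` and the allow-list are hypothetical names chosen for the example.

```python
import io
import pickle
from collections import OrderedDict

# Only these (module, name) pairs may be resolved while unpickling.
_ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global
        # (a class or function); everything else is plain data.
        if (module, name) in _ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# A state-dict-like payload made of primitives loads normally.
state = OrderedDict(lr=0.1, step=3)
loaded = restricted_loads(pickle.dumps(state))

# A payload that references an arbitrary callable is rejected.
try:
    restricted_loads(pickle.dumps(print))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

A checkpoint that only holds state dictionaries is unaffected by such a restriction, while a file that tries to smuggle in an arbitrary callable fails at load time instead of executing it.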

View File

@@ -161,7 +161,6 @@ class ImplicitronTrainingLoop(TrainingLoopBase):
for epoch in range(start_epoch, self.max_epochs):
# automatic new_epoch and plotting of stats at every epoch start
with stats:
# Make sure to re-seed random generators to ensure reproducibility
# even after restart.
seed_all_random_engines(seed + epoch)
@@ -395,6 +394,7 @@ class ImplicitronTrainingLoop(TrainingLoopBase):
):
prefix = f"e{stats.epoch}_it{stats.it[trainmode]}"
if hasattr(model, "visualize"):
# pyre-fixme[29]: `Union[Tensor, Module]` is not a function.
model.visualize(
viz,
visdom_env_imgs,

View File

@@ -12,7 +12,6 @@ import unittest
from pathlib import Path
import torch
from hydra import compose, initialize_config_dir
from omegaconf import OmegaConf
from projects.implicitron_trainer.impl.optimizer_factory import (
@@ -53,12 +52,8 @@ class TestExperiment(unittest.TestCase):
cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_class_type = (
"JsonIndexDatasetMapProvider"
)
dataset_args = (
cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
)
dataloader_args = (
cfg.data_source_ImplicitronDataSource_args.data_loader_map_provider_SequenceDataLoaderMapProvider_args
)
dataset_args = cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
dataloader_args = cfg.data_source_ImplicitronDataSource_args.data_loader_map_provider_SequenceDataLoaderMapProvider_args
dataset_args.category = "skateboard"
dataset_args.test_restrict_sequence_id = 0
dataset_args.dataset_root = "manifold://co3d/tree/extracted"
@@ -94,12 +89,8 @@ class TestExperiment(unittest.TestCase):
cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_class_type = (
"JsonIndexDatasetMapProvider"
)
dataset_args = (
cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
)
dataloader_args = (
cfg.data_source_ImplicitronDataSource_args.data_loader_map_provider_SequenceDataLoaderMapProvider_args
)
dataset_args = cfg.data_source_ImplicitronDataSource_args.dataset_map_provider_JsonIndexDatasetMapProvider_args
dataloader_args = cfg.data_source_ImplicitronDataSource_args.data_loader_map_provider_SequenceDataLoaderMapProvider_args
dataset_args.category = "skateboard"
dataset_args.test_restrict_sequence_id = 0
dataset_args.dataset_root = "manifold://co3d/tree/extracted"
@@ -111,9 +102,7 @@ class TestExperiment(unittest.TestCase):
cfg.training_loop_ImplicitronTrainingLoop_args.max_epochs = 2
cfg.training_loop_ImplicitronTrainingLoop_args.store_checkpoints = False
cfg.optimizer_factory_ImplicitronOptimizerFactory_args.lr_policy = "Exponential"
cfg.optimizer_factory_ImplicitronOptimizerFactory_args.exponential_lr_step_size = (
2
)
cfg.optimizer_factory_ImplicitronOptimizerFactory_args.exponential_lr_step_size = 2
if DEBUG:
experiment.dump_cfg(cfg)

View File

@@ -81,8 +81,9 @@ class TestOptimizerFactory(unittest.TestCase):
def test_param_overrides_self_param_group_assignment(self):
pa, pb, pc = [torch.nn.Parameter(data=torch.tensor(i * 1.0)) for i in range(3)]
na, nb = Node(params=[pa]), Node(
params=[pb], param_groups={"self": "pb_self", "p1": "pb_param"}
na, nb = (
Node(params=[pa]),
Node(params=[pb], param_groups={"self": "pb_self", "p1": "pb_param"}),
)
root = Node(children=[na, nb], params=[pc], param_groups={"m1": "pb_member"})
param_groups = self._get_param_groups(root)

View File

@@ -84,9 +84,9 @@ def get_nerf_datasets(
if autodownload and any(not os.path.isfile(p) for p in (cameras_path, image_path)):
# Automatically download the data files if missing.
download_data((dataset_name,), data_root=data_root)
download_data([dataset_name], data_root=data_root)
train_data = torch.load(cameras_path)
train_data = torch.load(cameras_path, weights_only=True)
n_cameras = train_data["cameras"]["R"].shape[0]
_image_max_image_pixels = Image.MAX_IMAGE_PIXELS

View File

@@ -194,7 +194,6 @@ class Stats:
it = self.it[stat_set]
for stat in self.log_vars:
if stat not in self.stats[stat_set]:
self.stats[stat_set][stat] = AverageMeter()

View File

@@ -24,7 +24,6 @@ CONFIG_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), "configs"
@hydra.main(config_path=CONFIG_DIR, config_name="lego")
def main(cfg: DictConfig):
# Device on which to run.
if torch.cuda.is_available():
device = "cuda"
@@ -63,7 +62,7 @@ def main(cfg: DictConfig):
raise ValueError(f"Model checkpoint {checkpoint_path} does not exist!")
print(f"Loading checkpoint {checkpoint_path}.")
loaded_data = torch.load(checkpoint_path)
loaded_data = torch.load(checkpoint_path, weights_only=True)
# Do not load the cached xy grid.
# - this allows setting an arbitrary evaluation image size.
state_dict = {

View File

@@ -42,7 +42,6 @@ class TestRaysampler(unittest.TestCase):
cameras, rays = [], []
for _ in range(batch_size):
R = random_rotations(1)
T = torch.randn(1, 3)
focal_length = torch.rand(1, 2) + 0.5

View File

@@ -25,7 +25,6 @@ CONFIG_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), "configs"
@hydra.main(config_path=CONFIG_DIR, config_name="lego")
def main(cfg: DictConfig):
# Set the relevant seeds for reproducibility.
np.random.seed(cfg.seed)
torch.manual_seed(cfg.seed)
@@ -77,7 +76,7 @@ def main(cfg: DictConfig):
# Resume training if requested.
if cfg.resume and os.path.isfile(checkpoint_path):
print(f"Resuming from checkpoint {checkpoint_path}.")
loaded_data = torch.load(checkpoint_path)
loaded_data = torch.load(checkpoint_path, weights_only=True)
model.load_state_dict(loaded_data["model"])
stats = pickle.loads(loaded_data["stats"])
print(f" => resuming from epoch {stats.epoch}.")
@@ -219,7 +218,6 @@ def main(cfg: DictConfig):
# Validation
if epoch % cfg.validation_epoch_interval == 0 and epoch > 0:
# Sample a validation camera/image.
val_batch = next(val_dataloader.__iter__())
val_image, val_camera, camera_idx = val_batch[0].values()

View File

@@ -6,4 +6,4 @@
# pyre-unsafe
__version__ = "0.7.8"
__version__ = "0.7.9"

View File

@@ -17,7 +17,7 @@ Some functions which depend on PyTorch or Python versions.
def meshgrid_ij(
*A: Union[torch.Tensor, Sequence[torch.Tensor]]
*A: Union[torch.Tensor, Sequence[torch.Tensor]],
) -> Tuple[torch.Tensor, ...]: # pragma: no cover
"""
Like torch.meshgrid was before PyTorch 1.10.0, i.e. with indexing set to ij

View File

@@ -32,7 +32,9 @@ __global__ void BallQueryKernel(
at::PackedTensorAccessor64<int64_t, 3, at::RestrictPtrTraits> idxs,
at::PackedTensorAccessor64<scalar_t, 3, at::RestrictPtrTraits> dists,
const int64_t K,
const float radius2) {
const float radius,
const float radius2,
const bool skip_points_outside_cube) {
const int64_t N = p1.size(0);
const int64_t chunks_per_cloud = (1 + (p1.size(1) - 1) / blockDim.x);
const int64_t chunks_to_do = N * chunks_per_cloud;
@@ -51,7 +53,19 @@ __global__ void BallQueryKernel(
// Iterate over points in p2 until desired count is reached or
// all points have been considered
for (int64_t j = 0, count = 0; j < lengths2[n] && count < K; ++j) {
// Calculate the distance between the points
if (skip_points_outside_cube) {
bool is_within_radius = true;
// Filter when any one coordinate is already outside the radius
for (int d = 0; is_within_radius && d < D; ++d) {
scalar_t abs_diff = fabs(p1[n][i][d] - p2[n][j][d]);
is_within_radius = (abs_diff <= radius);
}
if (!is_within_radius) {
continue;
}
}
// Else, calculate the distance between the points and compare
scalar_t dist2 = 0.0;
for (int d = 0; d < D; ++d) {
scalar_t diff = p1[n][i][d] - p2[n][j][d];
@@ -77,7 +91,8 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCuda(
const at::Tensor& lengths1, // (N,)
const at::Tensor& lengths2, // (N,)
int K,
float radius) {
float radius,
bool skip_points_outside_cube) {
// Check inputs are on the same device
at::TensorArg p1_t{p1, "p1", 1}, p2_t{p2, "p2", 2},
lengths1_t{lengths1, "lengths1", 3}, lengths2_t{lengths2, "lengths2", 4};
@@ -120,7 +135,9 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCuda(
idxs.packed_accessor64<int64_t, 3, at::RestrictPtrTraits>(),
dists.packed_accessor64<float, 3, at::RestrictPtrTraits>(),
K_64,
radius2);
radius,
radius2,
skip_points_outside_cube);
}));
AT_CUDA_CHECK(cudaGetLastError());

View File

@@ -25,6 +25,9 @@
// within the radius
// radius: the radius around each point within which the neighbors need to be
// located
// skip_points_outside_cube: If true, reduce multiplications of float values
// by not explicitly calculating distances to points that fall outside the
// D-cube with side length (2*radius) centered at each point in p1.
//
// Returns:
// p1_neighbor_idx: LongTensor of shape (N, P1, K), where
@@ -46,7 +49,8 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCpu(
const at::Tensor& lengths1,
const at::Tensor& lengths2,
const int K,
const float radius);
const float radius,
const bool skip_points_outside_cube);
// CUDA implementation
std::tuple<at::Tensor, at::Tensor> BallQueryCuda(
@@ -55,7 +59,8 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCuda(
const at::Tensor& lengths1,
const at::Tensor& lengths2,
const int K,
const float radius);
const float radius,
const bool skip_points_outside_cube);
// Implementation which is exposed
// Note: the backward pass reuses the KNearestNeighborBackward kernel
@@ -65,7 +70,8 @@ inline std::tuple<at::Tensor, at::Tensor> BallQuery(
const at::Tensor& lengths1,
const at::Tensor& lengths2,
int K,
float radius) {
float radius,
bool skip_points_outside_cube) {
if (p1.is_cuda() || p2.is_cuda()) {
#ifdef WITH_CUDA
CHECK_CUDA(p1);
@@ -76,16 +82,20 @@ inline std::tuple<at::Tensor, at::Tensor> BallQuery(
lengths1.contiguous(),
lengths2.contiguous(),
K,
radius);
radius,
skip_points_outside_cube);
#else
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(p1);
CHECK_CPU(p2);
return BallQueryCpu(
p1.contiguous(),
p2.contiguous(),
lengths1.contiguous(),
lengths2.contiguous(),
K,
radius);
radius,
skip_points_outside_cube);
}

View File

@@ -6,8 +6,8 @@
* LICENSE file in the root directory of this source tree.
*/
#include <math.h>
#include <torch/extension.h>
#include <queue>
#include <tuple>
std::tuple<at::Tensor, at::Tensor> BallQueryCpu(
@@ -16,7 +16,8 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCpu(
const at::Tensor& lengths1,
const at::Tensor& lengths2,
int K,
float radius) {
float radius,
bool skip_points_outside_cube) {
const int N = p1.size(0);
const int P1 = p1.size(1);
const int D = p1.size(2);
@@ -38,6 +39,16 @@ std::tuple<at::Tensor, at::Tensor> BallQueryCpu(
const int64_t length2 = lengths2_a[n];
for (int64_t i = 0; i < length1; ++i) {
for (int64_t j = 0, count = 0; j < length2 && count < K; ++j) {
if (skip_points_outside_cube) {
bool is_within_radius = true;
for (int d = 0; is_within_radius && d < D; ++d) {
float abs_diff = fabs(p1_a[n][i][d] - p2_a[n][j][d]);
is_within_radius = (abs_diff <= radius);
}
if (!is_within_radius) {
continue;
}
}
float dist2 = 0;
for (int d = 0; d < D; ++d) {
float diff = p1_a[n][i][d] - p2_a[n][j][d];
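
The effect of `skip_points_outside_cube` can be shown with a pure-Python sketch for a single point cloud. `ball_query_single` is a hypothetical helper, not part of the library, and the final `dist2 < radius2` acceptance test and the `-1` padding are assumptions, since those lines fall outside the hunks shown. The point is that the per-coordinate check is only a cheaper rejection path: any point with one coordinate further than `radius` away is necessarily outside the ball, so the result does not change.

```python
from typing import List, Sequence


def ball_query_single(
    p1: Sequence[Sequence[float]],
    p2: Sequence[Sequence[float]],
    K: int,
    radius: float,
    skip_points_outside_cube: bool = False,
) -> List[List[int]]:
    """For each point in p1, collect up to K indices of p2 within radius."""
    radius2 = radius * radius
    result = []
    for a in p1:
        found: List[int] = []
        for j, b in enumerate(p2):
            if len(found) >= K:
                break
            if skip_points_outside_cube and any(
                abs(x - y) > radius for x, y in zip(a, b)
            ):
                # One coordinate alone already exceeds the radius, so the
                # squared distance does not need to be computed.
                continue
            dist2 = sum((x - y) ** 2 for x, y in zip(a, b))
            if dist2 < radius2:  # assumed acceptance test
                found.append(j)
        # Pad with -1 so every row has exactly K entries (assumed convention).
        result.append(found + [-1] * (K - len(found)))
    return result


p1 = [(0.0, 0.0, 0.0)]
p2 = [
    (0.5, 0.0, 0.0),  # inside the ball
    (2.0, 0.0, 0.0),  # outside the cube, rejected early when the flag is set
    (0.6, 0.6, 0.6),  # inside the cube but outside the ball
    (0.0, 0.9, 0.0),  # inside the ball
]
plain = ball_query_single(p1, p2, K=3, radius=1.0)
filtered = ball_query_single(p1, p2, K=3, radius=1.0, skip_points_outside_cube=True)
```

Both calls select the same neighbours; the flag only changes how much arithmetic is spent on points that are far away along a single axis.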

View File

@@ -98,6 +98,11 @@ at::Tensor SigmoidAlphaBlendBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(distances);
CHECK_CPU(pix_to_face);
CHECK_CPU(alphas);
CHECK_CPU(grad_alphas);
return SigmoidAlphaBlendBackwardCpu(
grad_alphas, alphas, distances, pix_to_face, sigma);
}

View File

@@ -28,17 +28,16 @@ __global__ void alphaCompositeCudaForwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = result.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * H * W;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Iterate over each feature in each pixel
for (int pid = tid; pid < num_pixels; pid += num_threads) {
@@ -79,17 +78,16 @@ __global__ void alphaCompositeCudaBackwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = points_idx.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * H * W;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Parallelize over each feature in each pixel in images of size H * W,
// for each image in the batch of size batch_size

View File

@@ -74,6 +74,9 @@ torch::Tensor alphaCompositeForward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return alphaCompositeCpuForward(features, alphas, points_idx);
}
}
@@ -101,6 +104,11 @@ std::tuple<torch::Tensor, torch::Tensor> alphaCompositeBackward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(grad_outputs);
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return alphaCompositeCpuBackward(
grad_outputs, features, alphas, points_idx);
}

View File

@@ -28,17 +28,16 @@ __global__ void weightedSumNormCudaForwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = result.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * H * W;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Parallelize over each feature in each pixel in images of size H * W,
// for each image in the batch of size batch_size
@@ -92,17 +91,16 @@ __global__ void weightedSumNormCudaBackwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = points_idx.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * W * H;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Parallelize over each feature in each pixel in images of size H * W,
// for each image in the batch of size batch_size

View File

@@ -73,6 +73,10 @@ torch::Tensor weightedSumNormForward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return weightedSumNormCpuForward(features, alphas, points_idx);
}
}
@@ -100,6 +104,11 @@ std::tuple<torch::Tensor, torch::Tensor> weightedSumNormBackward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(grad_outputs);
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return weightedSumNormCpuBackward(
grad_outputs, features, alphas, points_idx);
}

View File

@@ -26,17 +26,16 @@ __global__ void weightedSumCudaForwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = result.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * H * W;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Parallelize over each feature in each pixel in images of size H * W,
// for each image in the batch of size batch_size
@@ -74,17 +73,16 @@ __global__ void weightedSumCudaBackwardKernel(
const at::PackedTensorAccessor64<float, 4, at::RestrictPtrTraits> alphas,
const at::PackedTensorAccessor64<int64_t, 4, at::RestrictPtrTraits> points_idx) {
// clang-format on
const int64_t batch_size = points_idx.size(0);
const int64_t C = features.size(0);
const int64_t H = points_idx.size(2);
const int64_t W = points_idx.size(3);
// Get the batch and index
const int batch = blockIdx.x;
const auto batch = blockIdx.x;
const int num_pixels = C * H * W;
const int num_threads = gridDim.y * blockDim.x;
const int tid = blockIdx.y * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.y * blockDim.x;
const auto tid = blockIdx.y * blockDim.x + threadIdx.x;
// Iterate over each pixel to compute the contribution to the
// gradient for the features and weights

View File

@@ -72,6 +72,9 @@ torch::Tensor weightedSumForward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return weightedSumCpuForward(features, alphas, points_idx);
}
}
@@ -98,6 +101,11 @@ std::tuple<torch::Tensor, torch::Tensor> weightedSumBackward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(grad_outputs);
CHECK_CPU(features);
CHECK_CPU(alphas);
CHECK_CPU(points_idx);
return weightedSumCpuBackward(grad_outputs, features, alphas, points_idx);
}
}

View File

@@ -8,7 +8,6 @@
// clang-format off
#include "./pulsar/global.h" // Include before <torch/extension.h>.
#include <torch/extension.h>
// clang-format on
#include "./pulsar/pytorch/renderer.h"
#include "./pulsar/pytorch/tensor_util.h"
@@ -106,15 +105,16 @@ PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
py::class_<
pulsar::pytorch::Renderer,
std::shared_ptr<pulsar::pytorch::Renderer>>(m, "PulsarRenderer")
.def(py::init<
const uint&,
const uint&,
const uint&,
const bool&,
const bool&,
const float&,
const uint&,
const uint&>())
.def(
py::init<
const uint&,
const uint&,
const uint&,
const bool&,
const bool&,
const float&,
const uint&,
const uint&>())
.def(
"__eq__",
[](const pulsar::pytorch::Renderer& a,
@@ -149,10 +149,10 @@ PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
py::arg("gamma"),
py::arg("max_depth"),
py::arg("min_depth") /* = 0.f*/,
py::arg(
"bg_col") /* = at::nullopt not exposed properly in pytorch 1.1. */
py::arg("bg_col") /* = std::nullopt not exposed properly in
pytorch 1.1. */
,
py::arg("opacity") /* = at::nullopt ... */,
py::arg("opacity") /* = std::nullopt ... */,
py::arg("percent_allowed_difference") = 0.01f,
py::arg("max_n_hits") = MAX_UINT,
py::arg("mode") = 0)

View File

@@ -60,6 +60,8 @@ std::tuple<at::Tensor, at::Tensor> FaceAreasNormalsForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(verts);
CHECK_CPU(faces);
return FaceAreasNormalsForwardCpu(verts, faces);
}
@@ -80,5 +82,9 @@ at::Tensor FaceAreasNormalsBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(grad_areas);
CHECK_CPU(grad_normals);
CHECK_CPU(verts);
CHECK_CPU(faces);
return FaceAreasNormalsBackwardCpu(grad_areas, grad_normals, verts, faces);
}

View File

@@ -20,14 +20,14 @@ __global__ void GatherScatterCudaKernel(
const size_t V,
const size_t D,
const size_t E) {
const int tid = threadIdx.x;
const auto tid = threadIdx.x;
// Reverse the vertex order if backward.
const int v0_idx = backward ? 1 : 0;
const int v1_idx = backward ? 0 : 1;
// Edges are split evenly across the blocks.
for (int e = blockIdx.x; e < E; e += gridDim.x) {
for (auto e = blockIdx.x; e < E; e += gridDim.x) {
// Get indices of vertices which form the edge.
const int64_t v0 = edges[2 * e + v0_idx];
const int64_t v1 = edges[2 * e + v1_idx];
@@ -35,7 +35,7 @@ __global__ void GatherScatterCudaKernel(
// Split vertex features evenly across threads.
// This implementation will be quite wasteful when D<128 since there will be
// a lot of threads doing nothing.
for (int d = tid; d < D; d += blockDim.x) {
for (auto d = tid; d < D; d += blockDim.x) {
const float val = input[v1 * D + d];
float* address = output + v0 * D + d;
atomicAdd(address, val);

View File

@@ -53,5 +53,7 @@ at::Tensor GatherScatter(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(input);
CHECK_CPU(edges);
return GatherScatterCpu(input, edges, directed, backward);
}

View File

@@ -20,8 +20,8 @@ __global__ void InterpFaceAttrsForwardKernel(
const size_t P,
const size_t F,
const size_t D) {
const int tid = threadIdx.x + blockIdx.x * blockDim.x;
const int num_threads = blockDim.x * gridDim.x;
const auto tid = threadIdx.x + blockIdx.x * blockDim.x;
const auto num_threads = blockDim.x * gridDim.x;
for (int pd = tid; pd < P * D; pd += num_threads) {
const int p = pd / D;
const int d = pd % D;
@@ -93,8 +93,8 @@ __global__ void InterpFaceAttrsBackwardKernel(
const size_t P,
const size_t F,
const size_t D) {
const int tid = threadIdx.x + blockIdx.x * blockDim.x;
const int num_threads = blockDim.x * gridDim.x;
const auto tid = threadIdx.x + blockIdx.x * blockDim.x;
const auto num_threads = blockDim.x * gridDim.x;
for (int pd = tid; pd < P * D; pd += num_threads) {
const int p = pd / D;
const int d = pd % D;

View File

@@ -57,6 +57,8 @@ at::Tensor InterpFaceAttrsForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(face_attrs);
CHECK_CPU(barycentric_coords);
return InterpFaceAttrsForwardCpu(pix_to_face, barycentric_coords, face_attrs);
}
@@ -106,6 +108,9 @@ std::tuple<at::Tensor, at::Tensor> InterpFaceAttrsBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(face_attrs);
CHECK_CPU(barycentric_coords);
CHECK_CPU(grad_pix_attrs);
return InterpFaceAttrsBackwardCpu(
pix_to_face, barycentric_coords, face_attrs, grad_pix_attrs);
}

View File

@@ -44,5 +44,7 @@ inline std::tuple<at::Tensor, at::Tensor> IoUBox3D(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(boxes1);
CHECK_CPU(boxes2);
return IoUBox3DCpu(boxes1.contiguous(), boxes2.contiguous());
}

View File

@@ -7,10 +7,7 @@
*/
#include <torch/extension.h>
#include <torch/torch.h>
#include <list>
#include <numeric>
#include <queue>
#include <tuple>
#include "iou_box3d/iou_utils.h"

View File

@@ -461,10 +461,8 @@ __device__ inline std::tuple<float3, float3> ArgMaxVerts(
__device__ inline bool IsCoplanarTriTri(
const FaceVerts& tri1,
const FaceVerts& tri2) {
const float3 tri1_ctr = FaceCenter({tri1.v0, tri1.v1, tri1.v2});
const float3 tri1_n = FaceNormal({tri1.v0, tri1.v1, tri1.v2});
const float3 tri2_ctr = FaceCenter({tri2.v0, tri2.v1, tri2.v2});
const float3 tri2_n = FaceNormal({tri2.v0, tri2.v1, tri2.v2});
// Check if parallel
@@ -500,7 +498,6 @@ __device__ inline bool IsCoplanarTriPlane(
const FaceVerts& tri,
const FaceVerts& plane,
const float3& normal) {
const float3 tri_ctr = FaceCenter({tri.v0, tri.v1, tri.v2});
const float3 nt = FaceNormal({tri.v0, tri.v1, tri.v2});
// check if parallel
@@ -728,7 +725,7 @@ __device__ inline int BoxIntersections(
}
}
// Update the face_verts_out tris
num_tris = offset;
num_tris = min(MAX_TRIS, offset);
for (int j = 0; j < num_tris; ++j) {
face_verts_out[j] = tri_verts_updated[j];
}

@@ -74,6 +74,8 @@ std::tuple<at::Tensor, at::Tensor> KNearestNeighborIdx(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(p1);
CHECK_CPU(p2);
return KNearestNeighborIdxCpu(p1, p2, lengths1, lengths2, norm, K);
}
@@ -140,6 +142,8 @@ std::tuple<at::Tensor, at::Tensor> KNearestNeighborBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(p1);
CHECK_CPU(p2);
return KNearestNeighborBackwardCpu(
p1, p2, lengths1, lengths2, idxs, norm, grad_dists);
}

@@ -58,5 +58,6 @@ inline std::tuple<at::Tensor, at::Tensor, at::Tensor> MarchingCubes(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(vol);
return MarchingCubesCpu(vol.contiguous(), isolevel);
}

@@ -88,6 +88,8 @@ at::Tensor PackedToPadded(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(inputs_packed);
CHECK_CPU(first_idxs);
return PackedToPaddedCpu(inputs_packed, first_idxs, max_size);
}
@@ -105,5 +107,7 @@ at::Tensor PaddedToPacked(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(inputs_padded);
CHECK_CPU(first_idxs);
return PaddedToPackedCpu(inputs_padded, first_idxs, num_inputs);
}

@@ -174,8 +174,8 @@ std::tuple<at::Tensor, at::Tensor> HullHullDistanceForwardCpu(
at::Tensor idxs = at::zeros({A_N,}, as_first_idx.options());
// clang-format on
auto as_a = as.accessor < float, H1 == 1 ? 2 : 3 > ();
auto bs_a = bs.accessor < float, H2 == 1 ? 2 : 3 > ();
auto as_a = as.accessor<float, H1 == 1 ? 2 : 3>();
auto bs_a = bs.accessor<float, H2 == 1 ? 2 : 3>();
auto as_first_idx_a = as_first_idx.accessor<int64_t, 1>();
auto bs_first_idx_a = bs_first_idx.accessor<int64_t, 1>();
auto dists_a = dists.accessor<float, 1>();
@@ -230,10 +230,10 @@ std::tuple<at::Tensor, at::Tensor> HullHullDistanceBackwardCpu(
at::Tensor grad_as = at::zeros_like(as);
at::Tensor grad_bs = at::zeros_like(bs);
auto as_a = as.accessor < float, H1 == 1 ? 2 : 3 > ();
auto bs_a = bs.accessor < float, H2 == 1 ? 2 : 3 > ();
auto grad_as_a = grad_as.accessor < float, H1 == 1 ? 2 : 3 > ();
auto grad_bs_a = grad_bs.accessor < float, H2 == 1 ? 2 : 3 > ();
auto as_a = as.accessor<float, H1 == 1 ? 2 : 3>();
auto bs_a = bs.accessor<float, H2 == 1 ? 2 : 3>();
auto grad_as_a = grad_as.accessor<float, H1 == 1 ? 2 : 3>();
auto grad_bs_a = grad_bs.accessor<float, H2 == 1 ? 2 : 3>();
auto idx_bs_a = idx_bs.accessor<int64_t, 1>();
auto grad_dists_a = grad_dists.accessor<float, 1>();

@@ -110,7 +110,7 @@ __global__ void DistanceForwardKernel(
__syncthreads();
// Perform reduction in shared memory.
for (int s = blockDim.x / 2; s > 32; s >>= 1) {
for (auto s = blockDim.x / 2; s > 32; s >>= 1) {
if (tid < s) {
if (min_dists[tid] > min_dists[tid + s]) {
min_dists[tid] = min_dists[tid + s];
@@ -502,8 +502,8 @@ __global__ void PointFaceArrayForwardKernel(
const float3* tris_f3 = (float3*)tris;
// Parallelize over P * T computations
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int t_i = tid; t_i < P * T; t_i += num_threads) {
const int t = t_i / P; // triangle index.
@@ -576,8 +576,8 @@ __global__ void PointFaceArrayBackwardKernel(
const float3* tris_f3 = (float3*)tris;
// Parallelize over P * T computations
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int t_i = tid; t_i < P * T; t_i += num_threads) {
const int t = t_i / P; // triangle index.
@@ -683,8 +683,8 @@ __global__ void PointEdgeArrayForwardKernel(
float3* segms_f3 = (float3*)segms;
// Parallelize over P * S computations
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int t_i = tid; t_i < P * S; t_i += num_threads) {
const int s = t_i / P; // segment index.
@@ -752,8 +752,8 @@ __global__ void PointEdgeArrayBackwardKernel(
float3* segms_f3 = (float3*)segms;
// Parallelize over P * S computations
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int t_i = tid; t_i < P * S; t_i += num_threads) {
const int s = t_i / P; // segment index.
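
Note on the `int` to `auto` changes above: CUDA's built-in `blockIdx`, `blockDim`, `gridDim`, and `threadIdx` have `unsigned int` members, so `auto` deduces `unsigned int` for `tid` and `num_threads`, while the loop counters in these kernels stay `int`. The host-side sketch below emulates one thread's grid-stride loop to show the deduced type and the indices visited. `Dim3` and `GridStrideIndices` are hypothetical stand-ins written for this illustration, not CUDA or PyTorch3D names.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Stand-in for CUDA's built-in index types, whose members are unsigned int.
struct Dim3 {
  unsigned int x, y, z;
};

// Host-side emulation of the grid-stride loop run by a single thread.
inline std::vector<int> GridStrideIndices(
    Dim3 blockIdx,
    Dim3 blockDim,
    Dim3 gridDim,
    Dim3 threadIdx,
    int total) {
  assert(total >= 0);
  const auto num_threads = gridDim.x * blockDim.x;
  const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
  // `auto` keeps the unsigned type of the index arithmetic.
  static_assert(
      std::is_same<decltype(tid), const unsigned int>::value,
      "tid is deduced as unsigned int");
  std::vector<int> visited;
  // As in the kernels, the loop counter itself stays a signed int.
  for (int i = tid; i < total; i += num_threads) {
    visited.push_back(i);
  }
  return visited;
}
```

For example, thread 2 of block 1 in a grid of two blocks of four threads starts at index 6 and advances in steps of 8.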

@@ -88,6 +88,10 @@ std::tuple<torch::Tensor, torch::Tensor> PointFaceDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(points_first_idx);
CHECK_CPU(tris);
CHECK_CPU(tris_first_idx);
return PointFaceDistanceForwardCpu(
points, points_first_idx, tris, tris_first_idx, min_triangle_area);
}
@@ -143,6 +147,10 @@ std::tuple<torch::Tensor, torch::Tensor> PointFaceDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(tris);
CHECK_CPU(idx_points);
CHECK_CPU(grad_dists);
return PointFaceDistanceBackwardCpu(
points, tris, idx_points, grad_dists, min_triangle_area);
}
@@ -221,6 +229,10 @@ std::tuple<torch::Tensor, torch::Tensor> FacePointDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(points_first_idx);
CHECK_CPU(tris);
CHECK_CPU(tris_first_idx);
return FacePointDistanceForwardCpu(
points, points_first_idx, tris, tris_first_idx, min_triangle_area);
}
@@ -277,6 +289,10 @@ std::tuple<torch::Tensor, torch::Tensor> FacePointDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(tris);
CHECK_CPU(idx_tris);
CHECK_CPU(grad_dists);
return FacePointDistanceBackwardCpu(
points, tris, idx_tris, grad_dists, min_triangle_area);
}
@@ -346,6 +362,10 @@ std::tuple<torch::Tensor, torch::Tensor> PointEdgeDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(points_first_idx);
CHECK_CPU(segms);
CHECK_CPU(segms_first_idx);
return PointEdgeDistanceForwardCpu(
points, points_first_idx, segms, segms_first_idx, max_points);
}
@@ -396,6 +416,10 @@ std::tuple<torch::Tensor, torch::Tensor> PointEdgeDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(segms);
CHECK_CPU(idx_points);
CHECK_CPU(grad_dists);
return PointEdgeDistanceBackwardCpu(points, segms, idx_points, grad_dists);
}
@@ -464,6 +488,10 @@ std::tuple<torch::Tensor, torch::Tensor> EdgePointDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(points_first_idx);
CHECK_CPU(segms);
CHECK_CPU(segms_first_idx);
return EdgePointDistanceForwardCpu(
points, points_first_idx, segms, segms_first_idx, max_segms);
}
@@ -514,6 +542,10 @@ std::tuple<torch::Tensor, torch::Tensor> EdgePointDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(segms);
CHECK_CPU(idx_segms);
CHECK_CPU(grad_dists);
return EdgePointDistanceBackwardCpu(points, segms, idx_segms, grad_dists);
}
@@ -567,6 +599,8 @@ torch::Tensor PointFaceArrayDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(tris);
return PointFaceArrayDistanceForwardCpu(points, tris, min_triangle_area);
}
@@ -613,6 +647,9 @@ std::tuple<torch::Tensor, torch::Tensor> PointFaceArrayDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(tris);
CHECK_CPU(grad_dists);
return PointFaceArrayDistanceBackwardCpu(
points, tris, grad_dists, min_triangle_area);
}
@@ -661,6 +698,8 @@ torch::Tensor PointEdgeArrayDistanceForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(segms);
return PointEdgeArrayDistanceForwardCpu(points, segms);
}
@@ -703,5 +742,8 @@ std::tuple<torch::Tensor, torch::Tensor> PointEdgeArrayDistanceBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(segms);
CHECK_CPU(grad_dists);
return PointEdgeArrayDistanceBackwardCpu(points, segms, grad_dists);
}

@@ -104,6 +104,12 @@ inline void PointsToVolumesForward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points_3d);
CHECK_CPU(points_features);
CHECK_CPU(volume_densities);
CHECK_CPU(volume_features);
CHECK_CPU(grid_sizes);
CHECK_CPU(mask);
PointsToVolumesForwardCpu(
points_3d,
points_features,
@@ -183,6 +189,14 @@ inline void PointsToVolumesBackward(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points_3d);
CHECK_CPU(points_features);
CHECK_CPU(grid_sizes);
CHECK_CPU(mask);
CHECK_CPU(grad_volume_densities);
CHECK_CPU(grad_volume_features);
CHECK_CPU(grad_points_3d);
CHECK_CPU(grad_points_features);
PointsToVolumesBackwardCpu(
points_3d,
points_features,

@@ -8,9 +8,7 @@
#include <torch/csrc/autograd/VariableTypeUtils.h>
#include <torch/extension.h>
#include <algorithm>
#include <cmath>
#include <thread>
#include <vector>
// In the x direction, the location {0, ..., grid_size_x - 1} correspond to

@@ -15,8 +15,8 @@
#endif
#if defined(_WIN64) || defined(_WIN32)
#define uint unsigned int
#define ushort unsigned short
using uint = unsigned int;
using ushort = unsigned short;
#endif
#include "./logging.h" // <- include before torch/extension.h

@@ -417,7 +417,7 @@ __device__ static float atomicMin(float* address, float val) {
(OUT_PTR), \
(NUM_SELECTED_PTR), \
(NUM_ITEMS), \
stream = (STREAM));
(STREAM));
#define COPY_HOST_DEV(PTR_D, PTR_H, TYPE, SIZE) \
HANDLECUDA(cudaMemcpy( \

@@ -357,11 +357,11 @@ void MAX_WS(
//
//
#define END_PARALLEL() \
end_parallel :; \
end_parallel:; \
}
#define END_PARALLEL_NORET() }
#define END_PARALLEL_2D() \
end_parallel :; \
end_parallel:; \
} \
}
#define END_PARALLEL_2D_NORET() \

@@ -70,11 +70,6 @@ struct CamGradInfo {
float3 pixel_dir_y;
};
// TODO: remove once https://github.com/NVlabs/cub/issues/172 is resolved.
struct IntWrapper {
int val;
};
} // namespace pulsar
#endif

@@ -149,11 +149,6 @@ IHD CamGradInfo operator*(const CamGradInfo& a, const float& b) {
return res;
}
IHD IntWrapper operator+(const IntWrapper& a, const IntWrapper& b) {
IntWrapper res;
res.val = a.val + b.val;
return res;
}
} // namespace pulsar
#endif

@@ -155,8 +155,8 @@ void backward(
stream);
CHECKLAUNCH();
SUM_WS(
(IntWrapper*)(self->ids_sorted_d),
(IntWrapper*)(self->n_grad_contributions_d),
self->ids_sorted_d,
self->n_grad_contributions_d,
static_cast<int>(num_balls),
self->workspace_d,
self->workspace_size,

@@ -52,7 +52,7 @@ HOST void construct(
self->cam.film_width = width;
self->cam.film_height = height;
self->max_num_balls = max_num_balls;
MALLOC(self->result_d, float, width* height* n_channels);
MALLOC(self->result_d, float, width * height * n_channels);
self->cam.orthogonal_projection = orthogonal_projection;
self->cam.right_handed = right_handed_system;
self->cam.background_normalization_depth = background_normalization_depth;
@@ -93,7 +93,7 @@ HOST void construct(
MALLOC(self->di_sorted_d, DrawInfo, max_num_balls);
MALLOC(self->region_flags_d, char, max_num_balls);
MALLOC(self->num_selected_d, size_t, 1);
MALLOC(self->forw_info_d, float, width* height * (3 + 2 * n_track));
MALLOC(self->forw_info_d, float, width * height * (3 + 2 * n_track));
MALLOC(self->min_max_pixels_d, IntersectInfo, 1);
MALLOC(self->grad_pos_d, float3, max_num_balls);
MALLOC(self->grad_col_d, float, max_num_balls* n_channels);

@@ -18,68 +18,89 @@ namespace Renderer {
template <bool DEV>
HOST void destruct(Renderer* self) {
if (self->result_d != NULL)
if (self->result_d != NULL) {
FREE(self->result_d);
}
self->result_d = NULL;
if (self->min_depth_d != NULL)
if (self->min_depth_d != NULL) {
FREE(self->min_depth_d);
}
self->min_depth_d = NULL;
if (self->min_depth_sorted_d != NULL)
if (self->min_depth_sorted_d != NULL) {
FREE(self->min_depth_sorted_d);
}
self->min_depth_sorted_d = NULL;
if (self->ii_d != NULL)
if (self->ii_d != NULL) {
FREE(self->ii_d);
}
self->ii_d = NULL;
if (self->ii_sorted_d != NULL)
if (self->ii_sorted_d != NULL) {
FREE(self->ii_sorted_d);
}
self->ii_sorted_d = NULL;
if (self->ids_d != NULL)
if (self->ids_d != NULL) {
FREE(self->ids_d);
}
self->ids_d = NULL;
if (self->ids_sorted_d != NULL)
if (self->ids_sorted_d != NULL) {
FREE(self->ids_sorted_d);
}
self->ids_sorted_d = NULL;
if (self->workspace_d != NULL)
if (self->workspace_d != NULL) {
FREE(self->workspace_d);
}
self->workspace_d = NULL;
if (self->di_d != NULL)
if (self->di_d != NULL) {
FREE(self->di_d);
}
self->di_d = NULL;
if (self->di_sorted_d != NULL)
if (self->di_sorted_d != NULL) {
FREE(self->di_sorted_d);
}
self->di_sorted_d = NULL;
if (self->region_flags_d != NULL)
if (self->region_flags_d != NULL) {
FREE(self->region_flags_d);
}
self->region_flags_d = NULL;
if (self->num_selected_d != NULL)
if (self->num_selected_d != NULL) {
FREE(self->num_selected_d);
}
self->num_selected_d = NULL;
if (self->forw_info_d != NULL)
if (self->forw_info_d != NULL) {
FREE(self->forw_info_d);
}
self->forw_info_d = NULL;
if (self->min_max_pixels_d != NULL)
if (self->min_max_pixels_d != NULL) {
FREE(self->min_max_pixels_d);
}
self->min_max_pixels_d = NULL;
if (self->grad_pos_d != NULL)
if (self->grad_pos_d != NULL) {
FREE(self->grad_pos_d);
}
self->grad_pos_d = NULL;
if (self->grad_col_d != NULL)
if (self->grad_col_d != NULL) {
FREE(self->grad_col_d);
}
self->grad_col_d = NULL;
if (self->grad_rad_d != NULL)
if (self->grad_rad_d != NULL) {
FREE(self->grad_rad_d);
}
self->grad_rad_d = NULL;
if (self->grad_cam_d != NULL)
if (self->grad_cam_d != NULL) {
FREE(self->grad_cam_d);
}
self->grad_cam_d = NULL;
if (self->grad_cam_buf_d != NULL)
if (self->grad_cam_buf_d != NULL) {
FREE(self->grad_cam_buf_d);
}
self->grad_cam_buf_d = NULL;
if (self->grad_opy_d != NULL)
if (self->grad_opy_d != NULL) {
FREE(self->grad_opy_d);
}
self->grad_opy_d = NULL;
if (self->n_grad_contributions_d != NULL)
if (self->n_grad_contributions_d != NULL) {
FREE(self->n_grad_contributions_d);
}
self->n_grad_contributions_d = NULL;
}

@@ -255,7 +255,7 @@ GLOBAL void calc_signature(
* for every iteration through the loading loop every thread could add a
* 'hit' to the buffer.
*/
#define RENDER_BUFFER_SIZE RENDER_BLOCK_SIZE* RENDER_BLOCK_SIZE * 2
#define RENDER_BUFFER_SIZE RENDER_BLOCK_SIZE * RENDER_BLOCK_SIZE * 2
/**
* The threshold after which the spheres that are in the render buffer
* are rendered and the buffer is flushed.

@@ -64,8 +64,9 @@ GLOBAL void norm_sphere_gradients(Renderer renderer, const int num_balls) {
// The sphere only contributes to the camera gradients if it is
// large enough in screen space.
if (renderer.ids_sorted_d[idx] > 0 && ii.max.x >= ii.min.x + 3 &&
ii.max.y >= ii.min.y + 3)
ii.max.y >= ii.min.y + 3) {
renderer.ids_sorted_d[idx] = 1;
}
END_PARALLEL_NORET();
};

@@ -139,8 +139,9 @@ GLOBAL void render(
coord_y < cam_norm.film_border_top + cam_norm.film_height) {
// Initialize the result.
if (mode == 0u) {
for (uint c_id = 0; c_id < cam_norm.n_channels; ++c_id)
for (uint c_id = 0; c_id < cam_norm.n_channels; ++c_id) {
result[c_id] = bg_col[c_id];
}
} else {
result[0] = 0.f;
}
@@ -190,20 +191,22 @@ GLOBAL void render(
"render|found intersection with sphere %u.\n",
sphere_id_l[write_idx]);
}
if (ii.min.x == MAX_USHORT)
if (ii.min.x == MAX_USHORT) {
// This is an invalid sphere (out of image). These spheres have
// maximum depth. Since we ordered the spheres by earliest possible
// intersection depth we're certain that there will be no other sphere
// that is relevant after this one.
loading_done = true;
}
}
// Reset n_pixels_done.
n_pixels_done = 0;
thread_block.sync(); // Make sure n_loaded is updated.
if (n_loaded > RENDER_BUFFER_LOAD_THRESH) {
// The load buffer is full enough. Draw.
if (thread_block.thread_rank() == 0)
if (thread_block.thread_rank() == 0) {
n_balls_loaded += n_loaded;
}
max_closest_possible_intersection = 0.f;
// This excludes threads outside of the image boundary. Also, it reduces
// block artifacts.
@@ -290,8 +293,9 @@ GLOBAL void render(
uint warp_done = thread_warp.ballot(done);
int warp_done_bit_cnt = POPC(warp_done);
#endif //__CUDACC__ && __HIP_PLATFORM_AMD__
if (thread_warp.thread_rank() == 0)
if (thread_warp.thread_rank() == 0) {
ATOMICADD_B(&n_pixels_done, warp_done_bit_cnt);
}
// This sync is necessary to keep n_loaded until all threads are done with
// painting.
thread_block.sync();
@@ -299,8 +303,9 @@ GLOBAL void render(
}
thread_block.sync();
}
if (thread_block.thread_rank() == 0)
if (thread_block.thread_rank() == 0) {
n_balls_loaded += n_loaded;
}
PULSAR_LOG_DEV_PIX(
PULSAR_LOG_RENDER_PIX,
"render|loaded %d balls in total.\n",
@@ -386,8 +391,9 @@ GLOBAL void render(
static_cast<float>(tracker.get_n_hits());
} else {
float sm_d_normfac = FRCP(FMAX(sm_d, FEPS));
for (uint c_id = 0; c_id < cam_norm.n_channels; ++c_id)
for (uint c_id = 0; c_id < cam_norm.n_channels; ++c_id) {
result[c_id] *= sm_d_normfac;
}
int write_loc = (coord_y - cam_norm.film_border_top) * cam_norm.film_width *
(3 + 2 * n_track) +
(coord_x - cam_norm.film_border_left) * (3 + 2 * n_track);

@@ -860,8 +860,9 @@ std::tuple<torch::Tensor, torch::Tensor> Renderer::forward(
? (cudaStream_t) nullptr
#endif
: (cudaStream_t) nullptr);
if (mode == 1)
if (mode == 1) {
results[batch_i] = results[batch_i].slice(2, 0, 1, 1);
}
forw_infos[batch_i] = from_blob(
this->renderer_vec[batch_i].forw_info_d,
{this->renderer_vec[0].cam.film_height,

@@ -128,8 +128,9 @@ struct Renderer {
stream << "pulsar::Renderer[";
// Device info.
stream << self.device_type;
if (self.device_index != -1)
if (self.device_index != -1) {
stream << ", ID " << self.device_index;
}
stream << "]";
return stream;
}

@@ -8,6 +8,7 @@
#ifdef WITH_CUDA
#include <ATen/cuda/CUDAContext.h>
#include <c10/cuda/CUDAException.h>
#include <cuda_runtime_api.h>
#endif
#include <torch/extension.h>
@@ -33,13 +34,13 @@ torch::Tensor sphere_ids_from_result_info_nograd(
.contiguous();
if (forw_info.device().type() == c10::DeviceType::CUDA) {
#ifdef WITH_CUDA
cudaMemcpyAsync(
C10_CUDA_CHECK(cudaMemcpyAsync(
result.data_ptr(),
tmp.data_ptr(),
sizeof(uint32_t) * tmp.size(0) * tmp.size(1) * tmp.size(2) *
tmp.size(3),
cudaMemcpyDeviceToDevice,
at::cuda::getCurrentCUDAStream());
at::cuda::getCurrentCUDAStream()));
#else
throw std::runtime_error(
"Copy on CUDA device initiated but built "

@@ -7,6 +7,7 @@
*/
#ifdef WITH_CUDA
#include <c10/cuda/CUDAException.h>
#include <cuda_runtime_api.h>
namespace pulsar {
@@ -17,7 +18,8 @@ void cudaDevToDev(
const void* src,
const int& size,
const cudaStream_t& stream) {
cudaMemcpyAsync(trg, src, size, cudaMemcpyDeviceToDevice, stream);
C10_CUDA_CHECK(
cudaMemcpyAsync(trg, src, size, cudaMemcpyDeviceToDevice, stream));
}
void cudaDevToHost(
@@ -25,7 +27,8 @@ void cudaDevToHost(
const void* src,
const int& size,
const cudaStream_t& stream) {
cudaMemcpyAsync(trg, src, size, cudaMemcpyDeviceToHost, stream);
C10_CUDA_CHECK(
cudaMemcpyAsync(trg, src, size, cudaMemcpyDeviceToHost, stream));
}
} // namespace pytorch

@@ -6,9 +6,6 @@
* LICENSE file in the root directory of this source tree.
*/
#include "./global.h"
#include "./logging.h"
/**
* A compilation unit to provide warnings about the code and avoid
* repeated messages.

@@ -25,7 +25,7 @@ class BitMask {
// Use all threads in the current block to clear all bits of this BitMask
__device__ void block_clear() {
for (int i = threadIdx.x; i < H * W * D; i += blockDim.x) {
for (auto i = threadIdx.x; i < H * W * D; i += blockDim.x) {
data[i] = 0;
}
__syncthreads();

@@ -23,8 +23,8 @@ __global__ void TriangleBoundingBoxKernel(
const float blur_radius,
float* bboxes, // (4, F)
bool* skip_face) { // (F,)
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const int num_threads = blockDim.x * gridDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = blockDim.x * gridDim.x;
const float sqrt_radius = sqrt(blur_radius);
for (int f = tid; f < F; f += num_threads) {
const float v0x = face_verts[f * 9 + 0 * 3 + 0];
@@ -56,8 +56,8 @@ __global__ void PointBoundingBoxKernel(
const int P,
float* bboxes, // (4, P)
bool* skip_points) {
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const int num_threads = blockDim.x * gridDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = blockDim.x * gridDim.x;
for (int p = tid; p < P; p += num_threads) {
const float x = points[p * 3 + 0];
const float y = points[p * 3 + 1];
@@ -113,7 +113,7 @@ __global__ void RasterizeCoarseCudaKernel(
const int chunks_per_batch = 1 + (E - 1) / chunk_size;
const int num_chunks = N * chunks_per_batch;
for (int chunk = blockIdx.x; chunk < num_chunks; chunk += gridDim.x) {
for (auto chunk = blockIdx.x; chunk < num_chunks; chunk += gridDim.x) {
const int batch_idx = chunk / chunks_per_batch; // batch index
const int chunk_idx = chunk % chunks_per_batch;
const int elem_chunk_start_idx = chunk_idx * chunk_size;
@@ -123,7 +123,7 @@ __global__ void RasterizeCoarseCudaKernel(
const int64_t elem_stop_idx = elem_start_idx + elems_per_batch[batch_idx];
// Have each thread handle a different face within the chunk
for (int e = threadIdx.x; e < chunk_size; e += blockDim.x) {
for (auto e = threadIdx.x; e < chunk_size; e += blockDim.x) {
const int e_idx = elem_chunk_start_idx + e;
// Check that we are still within the same element of the batch
@@ -170,7 +170,7 @@ __global__ void RasterizeCoarseCudaKernel(
// Now we have processed every elem in the current chunk. We need to
// count the number of elems in each bin so we can write the indices
// out to global memory. We have each thread handle a different bin.
for (int byx = threadIdx.x; byx < num_bins_y * num_bins_x;
for (auto byx = threadIdx.x; byx < num_bins_y * num_bins_x;
byx += blockDim.x) {
const int by = byx / num_bins_x;
const int bx = byx % num_bins_x;

@@ -260,8 +260,8 @@ __global__ void RasterizeMeshesNaiveCudaKernel(
float* pix_dists,
float* bary) {
// Simple version: One thread per output pixel
int num_threads = gridDim.x * blockDim.x;
int tid = blockDim.x * blockIdx.x + threadIdx.x;
auto num_threads = gridDim.x * blockDim.x;
auto tid = blockDim.x * blockIdx.x + threadIdx.x;
for (int i = tid; i < N * H * W; i += num_threads) {
// Convert linear index to 3D index
@@ -446,8 +446,8 @@ __global__ void RasterizeMeshesBackwardCudaKernel(
// Parallelize over each pixel in images of
// size H * W, for each image in the batch of size N.
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int t_i = tid; t_i < N * H * W; t_i += num_threads) {
// Convert linear index to 3D index
@@ -650,8 +650,8 @@ __global__ void RasterizeMeshesFineCudaKernel(
) {
// This can be more than H * W if H or W are not divisible by bin_size.
int num_pixels = N * BH * BW * bin_size * bin_size;
int num_threads = gridDim.x * blockDim.x;
int tid = blockIdx.x * blockDim.x + threadIdx.x;
auto num_threads = gridDim.x * blockDim.x;
auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int pid = tid; pid < num_pixels; pid += num_threads) {
// Convert linear index into bin and pixel indices. We make the within

@@ -138,6 +138,9 @@ RasterizeMeshesNaive(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(face_verts);
CHECK_CPU(mesh_to_face_first_idx);
CHECK_CPU(num_faces_per_mesh);
return RasterizeMeshesNaiveCpu(
face_verts,
mesh_to_face_first_idx,
@@ -232,6 +235,11 @@ torch::Tensor RasterizeMeshesBackward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(face_verts);
CHECK_CPU(pix_to_face);
CHECK_CPU(grad_zbuf);
CHECK_CPU(grad_bary);
CHECK_CPU(grad_dists);
return RasterizeMeshesBackwardCpu(
face_verts,
pix_to_face,
@@ -306,6 +314,9 @@ torch::Tensor RasterizeMeshesCoarse(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(face_verts);
CHECK_CPU(mesh_to_face_first_idx);
CHECK_CPU(num_faces_per_mesh);
return RasterizeMeshesCoarseCpu(
face_verts,
mesh_to_face_first_idx,
@@ -423,6 +434,8 @@ RasterizeMeshesFine(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(face_verts);
CHECK_CPU(bin_faces);
AT_ERROR("NOT IMPLEMENTED");
}
}

@@ -9,7 +9,6 @@
#include <torch/extension.h>
#include <algorithm>
#include <list>
#include <queue>
#include <thread>
#include <tuple>
#include "ATen/core/TensorAccessor.h"
@@ -107,6 +106,8 @@ auto ComputeFaceAreas(const torch::Tensor& face_verts) {
return face_areas;
}
namespace {
// Helper function to use with std::find_if to find the index of any
// values in the top k struct which match a given idx.
struct IsNeighbor {
@@ -119,7 +120,6 @@ struct IsNeighbor {
int neighbor_idx;
};
namespace {
void RasterizeMeshesNaiveCpu_worker(
const int start_yi,
const int end_yi,

@@ -97,8 +97,8 @@ __global__ void RasterizePointsNaiveCudaKernel(
float* zbuf, // (N, H, W, K)
float* pix_dists) { // (N, H, W, K)
// Simple version: One thread per output pixel
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockDim.x * blockIdx.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockDim.x * blockIdx.x + threadIdx.x;
for (int i = tid; i < N * H * W; i += num_threads) {
// Convert linear index to 3D index
const int n = i / (H * W); // Batch index
@@ -237,8 +237,8 @@ __global__ void RasterizePointsFineCudaKernel(
float* pix_dists) { // (N, H, W, K)
// This can be more than H * W if H or W are not divisible by bin_size.
const int num_pixels = N * BH * BW * bin_size * bin_size;
const int num_threads = gridDim.x * blockDim.x;
const int tid = blockIdx.x * blockDim.x + threadIdx.x;
const auto num_threads = gridDim.x * blockDim.x;
const auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int pid = tid; pid < num_pixels; pid += num_threads) {
// Convert linear index into bin and pixel indices. We make the within
@@ -376,8 +376,8 @@ __global__ void RasterizePointsBackwardCudaKernel(
float* grad_points) { // (P, 3)
// Parallelized over each of K points per pixel, for each pixel in images of
// size H * W, for each image in the batch of size N.
int num_threads = gridDim.x * blockDim.x;
int tid = blockIdx.x * blockDim.x + threadIdx.x;
auto num_threads = gridDim.x * blockDim.x;
auto tid = blockIdx.x * blockDim.x + threadIdx.x;
for (int i = tid; i < N * H * W * K; i += num_threads) {
// const int n = i / (H * W * K); // batch index (not needed).
const int yxk = i % (H * W * K);

@@ -91,6 +91,10 @@ std::tuple<torch::Tensor, torch::Tensor, torch::Tensor> RasterizePointsNaive(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(points);
CHECK_CPU(cloud_to_packed_first_idx);
CHECK_CPU(num_points_per_cloud);
CHECK_CPU(radius);
return RasterizePointsNaiveCpu(
points,
cloud_to_packed_first_idx,
@@ -166,6 +170,10 @@ torch::Tensor RasterizePointsCoarse(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(points);
CHECK_CPU(cloud_to_packed_first_idx);
CHECK_CPU(num_points_per_cloud);
CHECK_CPU(radius);
return RasterizePointsCoarseCpu(
points,
cloud_to_packed_first_idx,
@@ -232,6 +240,8 @@ std::tuple<torch::Tensor, torch::Tensor, torch::Tensor> RasterizePointsFine(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(points);
CHECK_CPU(bin_points);
AT_ERROR("NOT IMPLEMENTED");
}
}
@@ -284,6 +294,10 @@ torch::Tensor RasterizePointsBackward(
AT_ERROR("Not compiled with GPU support");
#endif
} else {
CHECK_CPU(points);
CHECK_CPU(idxs);
CHECK_CPU(grad_zbuf);
CHECK_CPU(grad_dists);
return RasterizePointsBackwardCpu(points, idxs, grad_zbuf, grad_dists);
}
}

@@ -35,8 +35,6 @@ __global__ void FarthestPointSamplingKernel(
__shared__ int64_t selected_store;
// Get constants
const int64_t N = points.size(0);
const int64_t P = points.size(1);
const int64_t D = points.size(2);
// Get batch index and thread index
@@ -109,7 +107,8 @@ at::Tensor FarthestPointSamplingCuda(
const at::Tensor& points, // (N, P, 3)
const at::Tensor& lengths, // (N,)
const at::Tensor& K, // (N,)
const at::Tensor& start_idxs) {
const at::Tensor& start_idxs,
const int64_t max_K_known = -1) {
// Check inputs are on the same device
at::TensorArg p_t{points, "points", 1}, lengths_t{lengths, "lengths", 2},
k_t{K, "K", 3}, start_idxs_t{start_idxs, "start_idxs", 4};
@@ -131,7 +130,12 @@ at::Tensor FarthestPointSamplingCuda(
const int64_t N = points.size(0);
const int64_t P = points.size(1);
const int64_t max_K = at::max(K).item<int64_t>();
int64_t max_K;
if (max_K_known > 0) {
max_K = max_K_known;
} else {
max_K = at::max(K).item<int64_t>();
}
// Initialize the output tensor with the sampled indices
auto idxs = at::full({N, max_K}, -1, lengths.options());
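
Note on the hunk above: calling `.item<int64_t>()` on a CUDA tensor copies the value to the host and therefore waits for the device, so the new `max_K_known` argument lets a caller who already knows the maximum skip that reduction. The selection logic can be sketched without any tensor library as follows; `ResolveMaxK` and the use of `std::vector` in place of the tensor `K` are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the caller-supplied maximum when it is positive; otherwise falls
// back to scanning K, which stands in for `at::max(K).item<int64_t>()`.
inline int64_t ResolveMaxK(
    const std::vector<int64_t>& K,
    int64_t max_K_known = -1) {
  if (max_K_known > 0) {
    return max_K_known;
  }
  assert(!K.empty()); // the fallback scan needs at least one entry
  return *std::max_element(K.begin(), K.end());
}
```

As in the diff, the hint is trusted as given: a value smaller than the true maximum would size the output too small, so it is the caller's job to pass a correct bound.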

@@ -43,7 +43,8 @@ at::Tensor FarthestPointSamplingCuda(
const at::Tensor& points,
const at::Tensor& lengths,
const at::Tensor& K,
const at::Tensor& start_idxs);
const at::Tensor& start_idxs,
const int64_t max_K_known = -1);
at::Tensor FarthestPointSamplingCpu(
const at::Tensor& points,
@@ -56,17 +57,23 @@ at::Tensor FarthestPointSampling(
const at::Tensor& points,
const at::Tensor& lengths,
const at::Tensor& K,
const at::Tensor& start_idxs) {
const at::Tensor& start_idxs,
const int64_t max_K_known = -1) {
if (points.is_cuda() || lengths.is_cuda() || K.is_cuda()) {
#ifdef WITH_CUDA
CHECK_CUDA(points);
CHECK_CUDA(lengths);
CHECK_CUDA(K);
CHECK_CUDA(start_idxs);
return FarthestPointSamplingCuda(points, lengths, K, start_idxs);
return FarthestPointSamplingCuda(
points, lengths, K, start_idxs, max_K_known);
#else
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(points);
CHECK_CPU(lengths);
CHECK_CPU(K);
CHECK_CPU(start_idxs);
return FarthestPointSamplingCpu(points, lengths, K, start_idxs);
}

@@ -71,6 +71,8 @@ inline void SamplePdf(
AT_ERROR("Not compiled with GPU support.");
#endif
}
CHECK_CPU(weights);
CHECK_CPU(outputs);
CHECK_CONTIGUOUS(outputs);
SamplePdfCpu(bins, weights, outputs, eps);
}

@@ -99,8 +99,7 @@ namespace {
// and increment it via template recursion until it is equal to the run-time
// argument N.
template <
template <typename, int64_t>
class Kernel,
template <typename, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -124,8 +123,7 @@ struct DispatchKernelHelper1D {
// 1D dispatch: Specialization when curN == maxN
// We need this base case to avoid infinite template recursion.
template <
template <typename, int64_t>
class Kernel,
template <typename, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -145,8 +143,7 @@ struct DispatchKernelHelper1D<Kernel, T, minN, maxN, maxN, Args...> {
// the run-time values of N and M, at which point we dispatch to the run
// method of the kernel.
template <
template <typename, int64_t, int64_t>
class Kernel,
template <typename, int64_t, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -203,8 +200,7 @@ struct DispatchKernelHelper2D {
// 2D dispatch, specialization for curN == maxN
template <
template <typename, int64_t, int64_t>
class Kernel,
template <typename, int64_t, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -243,8 +239,7 @@ struct DispatchKernelHelper2D<
// 2D dispatch, specialization for curM == maxM
template <
template <typename, int64_t, int64_t>
class Kernel,
template <typename, int64_t, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -283,8 +278,7 @@ struct DispatchKernelHelper2D<
// 2D dispatch, specialization for curN == maxN, curM == maxM
template <
template <typename, int64_t, int64_t>
class Kernel,
template <typename, int64_t, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -313,8 +307,7 @@ struct DispatchKernelHelper2D<
// This is the function we expect users to call to dispatch to 1D functions
template <
template <typename, int64_t>
class Kernel,
template <typename, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,
@@ -330,8 +323,7 @@ void DispatchKernel1D(const int64_t N, Args... args) {
// This is the function we expect users to call to dispatch to 2D functions
template <
template <typename, int64_t, int64_t>
class Kernel,
template <typename, int64_t, int64_t> class Kernel,
typename T,
int64_t minN,
int64_t maxN,

@@ -376,8 +376,6 @@ PointLineDistanceBackward(
float tt = t_top / t_bot;
tt = __saturatef(tt);
const float2 p_proj = (1.0f - tt) * v0 + tt * v1;
const float2 d = p - p_proj;
const float dist = sqrt(dot(d, d));
const float2 grad_p = -1.0f * grad_dist * 2.0f * (p_proj - p);
const float2 grad_v0 = grad_dist * (1.0f - tt) * 2.0f * (p_proj - p);

@@ -15,3 +15,7 @@
#define CHECK_CONTIGUOUS_CUDA(x) \
CHECK_CUDA(x); \
CHECK_CONTIGUOUS(x)
#define CHECK_CPU(x) \
TORCH_CHECK( \
x.device().type() == torch::kCPU, \
"Cannot use CPU implementation: " #x " not on CPU.")

@@ -83,7 +83,7 @@ class ShapeNetCore(ShapeNetBase): # pragma: no cover
):
synset_set.add(synset)
elif (synset in self.synset_inv.keys()) and (
(path.isdir(path.join(data_dir, self.synset_inv[synset])))
path.isdir(path.join(data_dir, self.synset_inv[synset]))
):
synset_set.add(self.synset_inv[synset])
else:

@@ -36,7 +36,6 @@ def collate_batched_meshes(batch: List[Dict]): # pragma: no cover
collated_dict["mesh"] = None
if {"verts", "faces"}.issubset(collated_dict.keys()):
textures = None
if "textures" in collated_dict:
textures = TexturesAtlas(atlas=collated_dict["textures"])

@@ -21,7 +21,6 @@ from typing import (
)
import torch
from pytorch3d.implicitron.dataset.frame_data import FrameData
from pytorch3d.implicitron.dataset.utils import GenericWorkaround

@@ -25,8 +25,7 @@ from typing import (
import numpy as np
import torch
from pytorch3d.implicitron.dataset import types
from pytorch3d.implicitron.dataset import orm_types, types
from pytorch3d.implicitron.dataset.utils import (
adjust_camera_to_bbox_crop_,
adjust_camera_to_image_scale_,
@@ -48,8 +47,12 @@ from pytorch3d.implicitron.dataset.utils import (
from pytorch3d.implicitron.tools.config import registry, ReplaceableBase
from pytorch3d.renderer.camera_utils import join_cameras_as_batch
from pytorch3d.renderer.cameras import CamerasBase, PerspectiveCameras
from pytorch3d.structures.meshes import join_meshes_as_batch, Meshes
from pytorch3d.structures.pointclouds import join_pointclouds_as_batch, Pointclouds
FrameAnnotationT = types.FrameAnnotation | orm_types.SqlFrameAnnotation
SequenceAnnotationT = types.SequenceAnnotation | orm_types.SqlSequenceAnnotation
@dataclass
class FrameData(Mapping[str, Any]):
@@ -122,9 +125,9 @@ class FrameData(Mapping[str, Any]):
meta: A dict for storing additional frame information.
"""
frame_number: Optional[torch.LongTensor]
sequence_name: Union[str, List[str]]
sequence_category: Union[str, List[str]]
frame_number: Optional[torch.LongTensor] = None
sequence_name: Union[str, List[str]] = ""
sequence_category: Union[str, List[str]] = ""
frame_timestamp: Optional[torch.Tensor] = None
image_size_hw: Optional[torch.LongTensor] = None
effective_image_size_hw: Optional[torch.LongTensor] = None
@@ -155,7 +158,7 @@ class FrameData(Mapping[str, Any]):
new_params = {}
for field_name in iter(self):
value = getattr(self, field_name)
if isinstance(value, (torch.Tensor, Pointclouds, CamerasBase)):
if isinstance(value, (torch.Tensor, Pointclouds, CamerasBase, Meshes)):
new_params[field_name] = value.to(*args, **kwargs)
else:
new_params[field_name] = value
@@ -417,7 +420,6 @@ class FrameData(Mapping[str, Any]):
for f in fields(elem):
if not f.init:
continue
list_values = override_fields.get(
f.name, [getattr(d, f.name) for d in batch]
)
@@ -426,7 +428,7 @@ class FrameData(Mapping[str, Any]):
if all(list_value is not None for list_value in list_values)
else None
)
return cls(**collated)
return type(elem)(**collated)
elif isinstance(elem, Pointclouds):
return join_pointclouds_as_batch(batch)
@@ -434,6 +436,8 @@ class FrameData(Mapping[str, Any]):
elif isinstance(elem, CamerasBase):
# TODO: don't store K; enforce working in NDC space
return join_cameras_as_batch(batch)
elif isinstance(elem, Meshes):
return join_meshes_as_batch(batch)
else:
return torch.utils.data.dataloader.default_collate(batch)
@@ -454,8 +458,8 @@ class FrameDataBuilderBase(ReplaceableBase, Generic[FrameDataSubtype], ABC):
@abstractmethod
def build(
self,
frame_annotation: types.FrameAnnotation,
sequence_annotation: types.SequenceAnnotation,
frame_annotation: FrameAnnotationT,
sequence_annotation: SequenceAnnotationT,
*,
load_blobs: bool = True,
**kwargs,
@@ -541,8 +545,8 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
def build(
self,
frame_annotation: types.FrameAnnotation,
sequence_annotation: types.SequenceAnnotation,
frame_annotation: FrameAnnotationT,
sequence_annotation: SequenceAnnotationT,
*,
load_blobs: bool = True,
**kwargs,
@@ -586,58 +590,81 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
),
)
fg_mask_np: Optional[np.ndarray] = None
dataset_root = self.dataset_root
mask_annotation = frame_annotation.mask
if mask_annotation is not None:
if load_blobs and self.load_masks:
fg_mask_np, mask_path = self._load_fg_probability(frame_annotation)
depth_annotation = frame_annotation.depth
image_path: str | None = None
mask_path: str | None = None
depth_path: str | None = None
pcl_path: str | None = None
if dataset_root is not None: # set all paths even if we won't load blobs
if frame_annotation.image.path is not None:
image_path = os.path.join(dataset_root, frame_annotation.image.path)
frame_data.image_path = image_path
if mask_annotation is not None and mask_annotation.path:
mask_path = os.path.join(dataset_root, mask_annotation.path)
frame_data.mask_path = mask_path
if depth_annotation is not None and depth_annotation.path is not None:
depth_path = os.path.join(dataset_root, depth_annotation.path)
frame_data.depth_path = depth_path
if point_cloud is not None:
pcl_path = os.path.join(dataset_root, point_cloud.path)
frame_data.sequence_point_cloud_path = pcl_path
fg_mask_np: np.ndarray | None = None
bbox_xywh: tuple[float, float, float, float] | None = None
if mask_annotation is not None:
if load_blobs and self.load_masks and mask_path:
fg_mask_np = self._load_fg_probability(frame_annotation, mask_path)
frame_data.fg_probability = safe_as_tensor(fg_mask_np, torch.float)
bbox_xywh = mask_annotation.bounding_box_xywh
if bbox_xywh is None and fg_mask_np is not None:
bbox_xywh = get_bbox_from_mask(fg_mask_np, self.box_crop_mask_thr)
frame_data.bbox_xywh = safe_as_tensor(bbox_xywh, torch.float)
if frame_annotation.image is not None:
image_size_hw = safe_as_tensor(frame_annotation.image.size, torch.long)
frame_data.image_size_hw = image_size_hw # original image size
# image size after crop/resize
frame_data.effective_image_size_hw = image_size_hw
image_path = None
dataset_root = self.dataset_root
if frame_annotation.image.path is not None and dataset_root is not None:
image_path = os.path.join(dataset_root, frame_annotation.image.path)
frame_data.image_path = image_path
if load_blobs and self.load_images:
if image_path is None:
raise ValueError("Image path is required to load images.")
image_np = load_image(self._local_path(image_path))
no_mask = fg_mask_np is None # didn't read the mask file
image_np = load_image(
self._local_path(image_path), try_read_alpha=no_mask
)
if image_np.shape[0] == 4: # RGBA image
if no_mask:
fg_mask_np = image_np[3:]
frame_data.fg_probability = safe_as_tensor(
fg_mask_np, torch.float
)
image_np = image_np[:3]
frame_data.image_rgb = self._postprocess_image(
image_np, frame_annotation.image.size, frame_data.fg_probability
)
if (
load_blobs
and self.load_depths
and frame_annotation.depth is not None
and frame_annotation.depth.path is not None
):
(
frame_data.depth_map,
frame_data.depth_path,
frame_data.depth_mask,
) = self._load_mask_depth(frame_annotation, fg_mask_np)
if bbox_xywh is None and fg_mask_np is not None:
bbox_xywh = get_bbox_from_mask(fg_mask_np, self.box_crop_mask_thr)
frame_data.bbox_xywh = safe_as_tensor(bbox_xywh, torch.float)
if load_blobs and self.load_depths and depth_path is not None:
frame_data.depth_map, frame_data.depth_mask = self._load_mask_depth(
frame_annotation, depth_path, fg_mask_np
)
if load_blobs and self.load_point_clouds and point_cloud is not None:
pcl_path = self._fix_point_cloud_path(point_cloud.path)
assert pcl_path is not None
frame_data.sequence_point_cloud = load_pointcloud(
self._local_path(pcl_path), max_points=self.max_points
)
frame_data.sequence_point_cloud_path = pcl_path
if frame_annotation.viewpoint is not None:
frame_data.camera = self._get_pytorch3d_camera(frame_annotation)
@@ -653,18 +680,14 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
return frame_data
def _load_fg_probability(
self, entry: types.FrameAnnotation
) -> Tuple[np.ndarray, str]:
assert self.dataset_root is not None and entry.mask is not None
full_path = os.path.join(self.dataset_root, entry.mask.path)
fg_probability = load_mask(self._local_path(full_path))
def _load_fg_probability(self, entry: FrameAnnotationT, path: str) -> np.ndarray:
fg_probability = load_mask(self._local_path(path))
if fg_probability.shape[-2:] != entry.image.size:
raise ValueError(
f"bad mask size: {fg_probability.shape[-2:]} vs {entry.image.size}!"
)
return fg_probability, full_path
return fg_probability
def _postprocess_image(
self,
@@ -685,14 +708,14 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
def _load_mask_depth(
self,
entry: types.FrameAnnotation,
entry: FrameAnnotationT,
path: str,
fg_mask: Optional[np.ndarray],
) -> Tuple[torch.Tensor, str, torch.Tensor]:
) -> tuple[torch.Tensor, torch.Tensor]:
entry_depth = entry.depth
dataset_root = self.dataset_root
assert dataset_root is not None
assert entry_depth is not None and entry_depth.path is not None
path = os.path.join(dataset_root, entry_depth.path)
assert entry_depth is not None
depth_map = load_depth(self._local_path(path), entry_depth.scale_adjustment)
if self.mask_depths:
@@ -706,11 +729,11 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
else:
depth_mask = (depth_map > 0.0).astype(np.float32)
return torch.tensor(depth_map), path, torch.tensor(depth_mask)
return torch.tensor(depth_map), torch.tensor(depth_mask)
def _get_pytorch3d_camera(
self,
entry: types.FrameAnnotation,
entry: FrameAnnotationT,
) -> PerspectiveCameras:
entry_viewpoint = entry.viewpoint
assert entry_viewpoint is not None
@@ -739,19 +762,6 @@ class GenericFrameDataBuilder(FrameDataBuilderBase[FrameDataSubtype], ABC):
T=torch.tensor(entry_viewpoint.T, dtype=torch.float)[None],
)
def _fix_point_cloud_path(self, path: str) -> str:
"""
Fix up a point cloud path from the dataset.
Some files in Co3Dv2 have an accidental absolute path stored.
"""
unwanted_prefix = (
"/large_experiments/p3/replay/datasets/co3d/co3d45k_220512/export_v23/"
)
if path.startswith(unwanted_prefix):
path = path[len(unwanted_prefix) :]
assert self.dataset_root is not None
return os.path.join(self.dataset_root, path)
def _local_path(self, path: str) -> str:
if self.path_manager is None:
return path

@@ -38,7 +38,6 @@ from pytorch3d.implicitron.dataset.utils import is_known_frame_scalar
from pytorch3d.implicitron.tools.config import registry, ReplaceableBase
from pytorch3d.renderer.camera_utils import join_cameras_as_batch
from pytorch3d.renderer.cameras import CamerasBase
from tqdm import tqdm
@@ -327,9 +326,9 @@ class JsonIndexDataset(DatasetBase, ReplaceableBase):
assert os.path.normpath(
# pyre-ignore[16]
self.frame_annots[idx]["frame_annotation"].image.path
) == os.path.normpath(
path
), f"Inconsistent frame indices {seq_name, frame_no, path}."
) == os.path.normpath(path), (
f"Inconsistent frame indices {seq_name, frame_no, path}."
)
return idx
dataset_idx = [

@@ -21,7 +21,6 @@ from pytorch3d.renderer.cameras import CamerasBase
from .dataset_map_provider import DatasetMap, DatasetMapProviderBase, PathManagerFactory
from .json_index_dataset import JsonIndexDataset
from .utils import (
DATASET_TYPE_KNOWN,
DATASET_TYPE_TEST,

@@ -18,7 +18,6 @@ from typing import Dict, List, Optional, Tuple, Type, Union
import numpy as np
from iopath.common.file_io import PathManager
from omegaconf import DictConfig
from pytorch3d.implicitron.dataset.dataset_map_provider import (
DatasetMap,
@@ -31,7 +30,6 @@ from pytorch3d.implicitron.tools.config import (
registry,
run_auto_creation,
)
from pytorch3d.renderer.cameras import CamerasBase
from tqdm import tqdm
@@ -222,7 +220,6 @@ class JsonIndexDatasetMapProviderV2(DatasetMapProviderBase):
self.dataset_map = dataset_map
def _load_category(self, category: str) -> DatasetMap:
frame_file = os.path.join(self.dataset_root, category, "frame_annotations.jgz")
sequence_file = os.path.join(
self.dataset_root, category, "sequence_annotations.jgz"

@@ -12,7 +12,6 @@ import torch
from pytorch3d.implicitron.tools.config import registry
from .load_llff import load_llff_data
from .single_sequence_dataset import (
_interpret_blender_cameras,
SingleSceneDatasetMapProviderBase,

@@ -8,7 +8,6 @@ import os
import warnings
import numpy as np
from PIL import Image
@@ -75,7 +74,6 @@ def _minify(basedir, path_manager, factors=(), resolutions=()):
def _load_data(
basedir, factor=None, width=None, height=None, load_imgs=True, path_manager=None
):
poses_arr = np.load(
_local_path(path_manager, os.path.join(basedir, "poses_bounds.npy"))
)
@@ -164,7 +162,6 @@ def ptstocam(pts, c2w):
def poses_avg(poses):
hwf = poses[0, :3, -1:]
center = poses[:, :3, 3].mean(0)
@@ -192,7 +189,6 @@ def render_path_spiral(c2w, up, rads, focal, zdelta, zrate, rots, N):
def recenter_poses(poses):
poses_ = poses + 0
bottom = np.reshape([0, 0, 0, 1.0], [1, 4])
c2w = poses_avg(poses)
@@ -256,7 +252,6 @@ def spherify_poses(poses, bds):
new_poses = []
for th in np.linspace(0.0, 2.0 * np.pi, 120):
camorigin = np.array([radcircle * np.cos(th), radcircle * np.sin(th), zh])
up = np.array([0, 0, -1.0])
@@ -311,7 +306,6 @@ def load_llff_data(
path_zflat=False,
path_manager=None,
):
poses, bds, imgs = _load_data(
basedir, factor=factor, path_manager=path_manager
) # factor=8 downsamples original imgs by 8x

@@ -4,6 +4,8 @@
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
# pyre-unsafe
# This functionality requires SQLAlchemy 2.0 or later.
import math
@@ -11,7 +13,6 @@ import struct
from typing import Optional, Tuple
import numpy as np
from pytorch3d.implicitron.dataset.types import (
DepthAnnotation,
ImageAnnotation,
@@ -20,7 +21,6 @@ from pytorch3d.implicitron.dataset.types import (
VideoAnnotation,
ViewpointAnnotation,
)
from sqlalchemy import LargeBinary
from sqlalchemy.orm import (
composite,

@@ -4,11 +4,14 @@
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
# pyre-unsafe
import hashlib
import json
import logging
import os
from dataclasses import dataclass
import urllib
from dataclasses import dataclass, Field, field
from typing import (
Any,
ClassVar,
@@ -28,10 +31,9 @@ import pandas as pd
import sqlalchemy as sa
import torch
from pytorch3d.implicitron.dataset.dataset_base import DatasetBase
from pytorch3d.implicitron.dataset.frame_data import ( # noqa
from pytorch3d.implicitron.dataset.frame_data import (
FrameData,
FrameDataBuilder,
FrameDataBuilder, # noqa
FrameDataBuilderBase,
)
from pytorch3d.implicitron.tools.config import (
@@ -39,7 +41,7 @@ from pytorch3d.implicitron.tools.config import (
ReplaceableBase,
run_auto_creation,
)
from sqlalchemy.orm import Session
from sqlalchemy.orm import scoped_session, Session, sessionmaker
from .orm_types import SqlFrameAnnotation, SqlSequenceAnnotation
@@ -51,7 +53,7 @@ _SET_LISTS_TABLE: str = "set_lists"
@registry.register
class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
class SqlIndexDataset(DatasetBase, ReplaceableBase):
"""
A dataset with annotations stored as SQLite tables. This is an index-based dataset.
The length is returned after all sequence and frame filters are applied (see param
@@ -88,6 +90,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
engine verbatim. Don't expose it to end users of your application!
pick_categories: Restrict the dataset to the given list of categories.
pick_sequences: A Sequence of sequence names to restrict the dataset to.
pick_sequences_sql_clause: Custom SQL WHERE clause to constrain sequence annotations.
exclude_sequences: A Sequence of the names of the sequences to exclude.
limit_sequences_per_category_to: Limit the dataset to the first up to N
sequences within each category (applies after all other sequence filters
@@ -102,9 +105,16 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
more frames than that; applied after other frame-level filters.
seed: The seed of the random generator sampling `n_frames_per_sequence`
random frames per sequence.
preload_metadata: If True, the metadata is preloaded into memory.
precompute_seq_to_idx: If True, precomputes the mapping from sequence name to indices.
scoped_session: If True, allows different parts of the code to share
a global session to access the database.
"""
frame_annotations_type: ClassVar[Type[SqlFrameAnnotation]] = SqlFrameAnnotation
sequence_annotations_type: ClassVar[Type[SqlSequenceAnnotation]] = (
SqlSequenceAnnotation
)
sqlite_metadata_file: str = ""
dataset_root: Optional[str] = None
@@ -117,6 +127,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
pick_categories: Tuple[str, ...] = ()
pick_sequences: Tuple[str, ...] = ()
pick_sequences_sql_clause: Optional[str] = None
exclude_sequences: Tuple[str, ...] = ()
limit_sequences_per_category_to: int = 0
limit_sequences_to: int = 0
@@ -124,12 +135,22 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
n_frames_per_sequence: int = -1
seed: int = 0
remove_empty_masks_poll_whole_table_threshold: int = 300_000
preload_metadata: bool = False
precompute_seq_to_idx: bool = False
# we set it manually in the constructor
# _index: pd.DataFrame = field(init=False)
_index: pd.DataFrame = field(init=False, metadata={"omegaconf_ignore": True})
_sql_engine: sa.engine.Engine = field(
init=False, metadata={"omegaconf_ignore": True}
)
eval_batches: Optional[List[Any]] = field(
init=False, metadata={"omegaconf_ignore": True}
)
frame_data_builder: FrameDataBuilderBase
frame_data_builder: FrameDataBuilderBase # pyre-ignore[13]
frame_data_builder_class_type: str = "FrameDataBuilder"
scoped_session: bool = False
def __post_init__(self) -> None:
if sa.__version__ < "2.0":
raise ImportError("This class requires SQL Alchemy 2.0 or later")
@@ -138,19 +159,28 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
raise ValueError("sqlite_metadata_file must be set")
if self.dataset_root:
frame_builder_type = self.frame_data_builder_class_type
getattr(self, f"frame_data_builder_{frame_builder_type}_args")[
"dataset_root"
] = self.dataset_root
frame_args = f"frame_data_builder_{self.frame_data_builder_class_type}_args"
getattr(self, frame_args)["dataset_root"] = self.dataset_root
getattr(self, frame_args)["path_manager"] = self.path_manager
run_auto_creation(self)
self.frame_data_builder.path_manager = self.path_manager
# pyre-ignore # NOTE: sqlite-specific args (read-only mode).
if self.path_manager is not None:
self.sqlite_metadata_file = self.path_manager.get_local_path(
self.sqlite_metadata_file
)
self.subset_lists_file = self.path_manager.get_local_path(
self.subset_lists_file
)
# NOTE: sqlite-specific args (read-only mode).
self._sql_engine = sa.create_engine(
f"sqlite:///file:{self.sqlite_metadata_file}?mode=ro&uri=true"
f"sqlite:///file:{urllib.parse.quote(self.sqlite_metadata_file)}?mode=ro&uri=true"
)
if self.preload_metadata:
self._sql_engine = self._preload_database(self._sql_engine)
sequences = self._get_filtered_sequences_if_any()
if self.subsets:
@@ -166,16 +196,29 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
if len(index) == 0:
raise ValueError(f"There are no frames in the subsets: {self.subsets}!")
self._index = index.set_index(["sequence_name", "frame_number"]) # pyre-ignore
self._index = index.set_index(["sequence_name", "frame_number"])
self.eval_batches = None # pyre-ignore
self.eval_batches = None
if self.eval_batches_file:
self.eval_batches = self._load_filter_eval_batches()
logger.info(str(self))
if self.scoped_session:
self._session_factory = sessionmaker(bind=self._sql_engine) # pyre-ignore
if self.precompute_seq_to_idx:
# This is deprecated and will be removed in the future.
# After we backport https://github.com/facebookresearch/uco3d/pull/3
logger.warning(
"Using precompute_seq_to_idx is deprecated and will be removed in the future."
)
self._index["rowid"] = np.arange(len(self._index))
groupby = self._index.groupby("sequence_name", sort=False)["rowid"]
self._seq_to_indices = dict(groupby.apply(list)) # pyre-ignore
del self._index["rowid"]
def __len__(self) -> int:
# pyre-ignore[16]
return len(self._index)
def __getitem__(self, frame_idx: Union[int, Tuple[str, int]]) -> FrameData:
@@ -232,12 +275,18 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
self.frame_annotations_type.frame_number
== int(frame), # cast from np.int64
)
seq_stmt = sa.select(SqlSequenceAnnotation).where(
SqlSequenceAnnotation.sequence_name == seq
seq_stmt = sa.select(self.sequence_annotations_type).where(
self.sequence_annotations_type.sequence_name == seq
)
with Session(self._sql_engine) as session:
entry = session.scalars(stmt).one()
seq_metadata = session.scalars(seq_stmt).one()
if self.scoped_session:
# pyre-ignore
with scoped_session(self._session_factory)() as session:
entry = session.scalars(stmt).one()
seq_metadata = session.scalars(seq_stmt).one()
else:
with Session(self._sql_engine) as session:
entry = session.scalars(stmt).one()
seq_metadata = session.scalars(seq_stmt).one()
assert entry.image.path == self._index.loc[(seq, frame), "_image_path"]
@@ -250,7 +299,6 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
return frame_data
def __str__(self) -> str:
# pyre-ignore[16]
return f"SqlIndexDataset #frames={len(self._index)}"
def sequence_names(self) -> Iterable[str]:
@@ -260,9 +308,10 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
# override
def category_to_sequence_names(self) -> Dict[str, List[str]]:
stmt = sa.select(
SqlSequenceAnnotation.category, SqlSequenceAnnotation.sequence_name
self.sequence_annotations_type.category,
self.sequence_annotations_type.sequence_name,
).where( # we limit results to sequences that have frames after all filters
SqlSequenceAnnotation.sequence_name.in_(self.sequence_names())
self.sequence_annotations_type.sequence_name.in_(self.sequence_names())
)
with self._sql_engine.connect() as connection:
cat_to_seqs = pd.read_sql(stmt, connection)
@@ -335,17 +384,31 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
rows = self._index.index.get_loc(seq_name)
if isinstance(rows, slice):
assert rows.stop is not None, "Unexpected result from pandas"
rows = range(rows.start or 0, rows.stop, rows.step or 1)
rows_seq = range(rows.start or 0, rows.stop, rows.step or 1)
else:
rows = np.where(rows)[0]
rows_seq = list(np.where(rows)[0])
index_slice, idx = self._get_frame_no_coalesced_ts_by_row_indices(
rows, seq_name, subset_filter
rows_seq, seq_name, subset_filter
)
index_slice["idx"] = idx
yield from index_slice.itertuples(index=False)
# override
def sequence_indices_in_order(
self, seq_name: str, subset_filter: Optional[Sequence[str]] = None
) -> Iterator[int]:
"""Same as `sequence_frames_in_order` but returns the iterator over
only dataset indices.
"""
if self.precompute_seq_to_idx and subset_filter is None:
# pyre-ignore
yield from self._seq_to_indices[seq_name]
else:
for _, _, idx in self.sequence_frames_in_order(seq_name, subset_filter):
yield idx
# override
def get_eval_batches(self) -> Optional[List[Any]]:
"""
@@ -379,11 +442,35 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
or self.limit_sequences_to > 0
or self.limit_sequences_per_category_to > 0
or len(self.pick_sequences) > 0
or self.pick_sequences_sql_clause is not None
or len(self.exclude_sequences) > 0
or len(self.pick_categories) > 0
or self.n_frames_per_sequence > 0
)
def _preload_database(
self, source_engine: sa.engine.base.Engine
) -> sa.engine.base.Engine:
destination_engine = sa.create_engine("sqlite:///:memory:")
metadata = sa.MetaData()
metadata.reflect(bind=source_engine)
metadata.create_all(bind=destination_engine)
with source_engine.connect() as source_conn:
with destination_engine.connect() as destination_conn:
for table_obj in metadata.tables.values():
# Select all rows from the source table
source_rows = source_conn.execute(table_obj.select())
# Insert rows into the destination table
for row in source_rows:
destination_conn.execute(table_obj.insert().values(row))
# Commit the changes for each table
destination_conn.commit()
return destination_engine
def _get_filtered_sequences_if_any(self) -> Optional[pd.Series]:
# maximum possible filter (if limit_sequences_per_category_to == 0):
# WHERE category IN 'self.pick_categories'
@@ -396,25 +483,30 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
*self._get_pick_filters(),
*self._get_exclude_filters(),
]
if pick_sequences_sql_clause := self.pick_sequences_sql_clause:
print("Applying the custom SQL clause.")
# pyre-ignore[6]: TextClause is compatible with where conditions
where_conditions.append(sa.text(pick_sequences_sql_clause))
def add_where(stmt):
return stmt.where(*where_conditions) if where_conditions else stmt
if self.limit_sequences_per_category_to <= 0:
stmt = add_where(sa.select(SqlSequenceAnnotation.sequence_name))
stmt = add_where(sa.select(self.sequence_annotations_type.sequence_name))
else:
subquery = sa.select(
SqlSequenceAnnotation.sequence_name,
self.sequence_annotations_type.sequence_name,
sa.func.row_number()
.over(
order_by=sa.text("ROWID"), # NOTE: ROWID is SQLite-specific
partition_by=SqlSequenceAnnotation.category,
partition_by=self.sequence_annotations_type.category,
)
.label("row_number"),
)
subquery = add_where(subquery).subquery()
stmt = sa.select(subquery.c.sequence_name).where(
# pyre-ignore[6]: SQLAlchemy column comparison returns ColumnElement, not bool
subquery.c.row_number <= self.limit_sequences_per_category_to
)
@@ -444,31 +536,34 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
return []
logger.info(f"Limiting dataset to categories: {self.pick_categories}")
return [SqlSequenceAnnotation.category.in_(self.pick_categories)]
return [self.sequence_annotations_type.category.in_(self.pick_categories)]
def _get_pick_filters(self) -> List[sa.ColumnElement]:
if not self.pick_sequences:
return []
logger.info(f"Limiting dataset to sequences: {self.pick_sequences}")
return [SqlSequenceAnnotation.sequence_name.in_(self.pick_sequences)]
return [self.sequence_annotations_type.sequence_name.in_(self.pick_sequences)]
def _get_exclude_filters(self) -> List[sa.ColumnOperators]:
if not self.exclude_sequences:
return []
logger.info(f"Removing sequences from the dataset: {self.exclude_sequences}")
return [SqlSequenceAnnotation.sequence_name.notin_(self.exclude_sequences)]
return [
self.sequence_annotations_type.sequence_name.notin_(self.exclude_sequences)
]
def _load_subsets_from_json(self, subset_lists_path: str) -> pd.DataFrame:
assert self.subsets is not None
subsets = self.subsets
assert subsets is not None
with open(subset_lists_path, "r") as f:
subset_to_seq_frame = json.load(f)
seq_frame_list = sum(
(
[(*row, subset) for row in subset_to_seq_frame[subset]]
for subset in self.subsets
for subset in subsets
),
[],
)
@@ -522,7 +617,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
stmt = sa.select(
self.frame_annotations_type.sequence_name,
self.frame_annotations_type.frame_number,
).where(self.frame_annotations_type._mask_mass == 0)
).where(self.frame_annotations_type._mask_mass == 0) # pyre-ignore[16]
with Session(self._sql_engine) as session:
to_remove = session.execute(stmt).all()
@@ -540,9 +635,10 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
)
)
if self.pick_frames_sql_clause:
if pick_frames_sql_clause := self.pick_frames_sql_clause:
logger.info("Applying the custom SQL clause.")
pick_frames_criteria.append(sa.text(self.pick_frames_sql_clause))
# pyre-ignore[6]: TextClause is compatible with where conditions
pick_frames_criteria.append(sa.text(pick_frames_sql_clause))
if pick_frames_criteria:
index = self._pick_frames_by_criteria(index, pick_frames_criteria)
@@ -586,7 +682,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
stmt = sa.select(
self.frame_annotations_type.sequence_name,
self.frame_annotations_type.frame_number,
self.frame_annotations_type._image_path,
self.frame_annotations_type._image_path, # pyre-ignore[16]
sa.null().label("subset"),
)
where_conditions = []
@@ -600,14 +696,15 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
logger.info(" excluding samples with empty masks")
where_conditions.append(
sa.or_(
self.frame_annotations_type._mask_mass.is_(None),
self.frame_annotations_type._mask_mass.is_(None), # pyre-ignore[16]
self.frame_annotations_type._mask_mass != 0,
)
)
if self.pick_frames_sql_clause:
if pick_frames_sql_clause := self.pick_frames_sql_clause:
logger.info(" applying custom SQL clause")
where_conditions.append(sa.text(self.pick_frames_sql_clause))
# pyre-ignore[6]: TextClause is compatible with where conditions
where_conditions.append(sa.text(pick_frames_sql_clause))
if where_conditions:
stmt = stmt.where(*where_conditions)
@@ -634,7 +731,9 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
assert self.eval_batches_file
logger.info(f"Loading eval batches from {self.eval_batches_file}")
if not os.path.isfile(self.eval_batches_file):
if (
self.path_manager and not self.path_manager.isfile(self.eval_batches_file)
) or (not self.path_manager and not os.path.isfile(self.eval_batches_file)):
# The batch indices file does not exist.
# Most probably the user has not specified the root folder.
raise ValueError(
@@ -642,7 +741,8 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
+ "Please specify a correct dataset_root folder."
)
with open(self.eval_batches_file, "r") as f:
eval_batches_file = self._local_path(self.eval_batches_file)
with open(eval_batches_file, "r") as f:
eval_batches = json.load(f)
# limit the dataset to sequences to allow multiple evaluations in one file
@@ -656,7 +756,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
if pick_sequences:
old_len = len(eval_batches)
eval_batches = [b for b in eval_batches if b[0][0] in pick_sequences]
logger.warn(
logger.warning(
f"Picked eval batches by sequence/cat: {old_len} -> {len(eval_batches)}"
)
@@ -664,7 +764,7 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
old_len = len(eval_batches)
exclude_sequences = set(self.exclude_sequences)
eval_batches = [b for b in eval_batches if b[0][0] not in exclude_sequences]
logger.warn(
logger.warning(
f"Excluded eval batches by sequence: {old_len} -> {len(eval_batches)}"
)
@@ -726,9 +826,15 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
self.frame_annotations_type.sequence_name == seq_name,
self.frame_annotations_type.frame_number.in_(frames),
)
frame_no_ts = None
with self._sql_engine.connect() as connection:
frame_no_ts = pd.read_sql_query(stmt, connection)
if self.scoped_session:
stmt_text = str(stmt.compile(compile_kwargs={"literal_binds": True}))
with scoped_session(self._session_factory)() as session: # pyre-ignore
frame_no_ts = pd.read_sql_query(stmt_text, session.connection())
else:
with self._sql_engine.connect() as connection:
frame_no_ts = pd.read_sql_query(stmt, connection)
if len(frame_no_ts) != len(index_slice):
raise ValueError(
@@ -758,11 +864,18 @@ class SqlIndexDataset(DatasetBase, ReplaceableBase): # pyre-ignore
prefixes=["TEMP"], # NOTE SQLite specific!
)
@classmethod
def pre_expand(cls) -> None:
# remove dataclass annotations that are not meant to be init params
# because they cause troubles for OmegaConf
for attr, attr_value in list(cls.__dict__.items()): # need to copy as we mutate
if isinstance(attr_value, Field) and attr_value.metadata.get(
"omegaconf_ignore", False
):
delattr(cls, attr)
del cls.__annotations__[attr]
def _seq_name_to_seed(seq_name) -> int:
"""Generates numbers in [0, 2 ** 28)"""
return int(hashlib.sha1(seq_name.encode("utf-8")).hexdigest()[:7], 16)
def _safe_as_tensor(data, dtype):
return torch.tensor(data, dtype=dtype) if data is not None else None
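Note on the eval-batches existence check above: the new condition asks `path_manager.isfile` when a path manager is set and falls back to `os.path.isfile` otherwise. A minimal sketch of an equivalent, flatter formulation follows. `file_exists` and `FakePathManager` are hypothetical names introduced only for this illustration, and the `remote://` path is made up; none of them appear in the diff.

```python
import os
import tempfile
from typing import Any, Optional


def file_exists(path: str, path_manager: Optional[Any] = None) -> bool:
    """Return True if `path` is a file, asking the path manager when one is given.

    Assumption: the diff tests `self.path_manager` for truthiness; this sketch
    compares against None instead, which behaves the same for ordinary objects.
    """
    if path_manager is not None:
        return bool(path_manager.isfile(path))
    return os.path.isfile(path)


class FakePathManager:
    """Hypothetical stand-in exposing only the `isfile` method used in the diff."""

    def __init__(self, known_files):
        self._known = set(known_files)

    def isfile(self, path: str) -> bool:
        return path in self._known


# Local file: no path manager, so os.path.isfile is consulted.
with tempfile.NamedTemporaryFile(suffix=".json") as tmp:
    local_ok = file_exists(tmp.name)

# "Remote" file: only the path manager knows about it.
remote_path = "remote://bucket/eval_batches.json"  # made-up path for illustration
remote_ok = file_exists(remote_path, FakePathManager({remote_path}))
remote_missing = file_exists("missing.json", FakePathManager(set()))
```

Pulling the branch into one helper keeps the `raise ValueError(...)` site down to a single `if not file_exists(...)` test, at the cost of one extra function.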


@@ -4,15 +4,15 @@
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
# pyre-unsafe
import logging
import os
from typing import List, Optional, Tuple, Type
import numpy as np
from omegaconf import DictConfig, OmegaConf
from pytorch3d.implicitron.dataset.dataset_map_provider import (
DatasetMap,
DatasetMapProviderBase,
@@ -43,7 +43,7 @@ logger = logging.getLogger(__name__)
@registry.register
class SqlIndexDatasetMapProvider(DatasetMapProviderBase): # pyre-ignore [13]
class SqlIndexDatasetMapProvider(DatasetMapProviderBase):
"""
Generates the training, validation, and testing dataset objects for
a dataset laid out on disk like SQL-CO3D, with annotations in an SQLite data base.
@@ -193,9 +193,9 @@ class SqlIndexDatasetMapProvider(DatasetMapProviderBase): # pyre-ignore [13]
# this is a mould that is never constructed, used to build self._dataset_map values
dataset_class_type: str = "SqlIndexDataset"
dataset: SqlIndexDataset
dataset: SqlIndexDataset # pyre-ignore [13]
path_manager_factory: PathManagerFactory
path_manager_factory: PathManagerFactory # pyre-ignore [13]
path_manager_factory_class_type: str = "PathManagerFactory"
def __post_init__(self):
@@ -282,8 +282,14 @@ class SqlIndexDatasetMapProvider(DatasetMapProviderBase): # pyre-ignore [13]
logger.info(f"Val dataset: {str(val_dataset)}")
logger.debug("Extracting test dataset.")
eval_batches_file = self._get_lists_file("eval_batches")
del common_dataset_kwargs["eval_batches_file"]
if self.eval_batches_path is None:
eval_batches_file = None
else:
eval_batches_file = self._get_lists_file("eval_batches")
if "eval_batches_file" in common_dataset_kwargs:
common_dataset_kwargs.pop("eval_batches_file", None)
test_dataset = dataset_type(
**common_dataset_kwargs,
subsets=self._get_subsets(self.test_subsets, True),
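Note on the `del` to `pop` change above: `del d[key]` raises `KeyError` when the key is absent, while `d.pop(key, None)` silently returns the default. The sketch below illustrates the difference with a made-up kwargs dictionary; the values are assumptions for illustration only.

```python
# Hypothetical kwargs, standing in for common_dataset_kwargs in the diff.
common_dataset_kwargs = {"subsets": ["train"], "eval_batches_file": "eval.json"}
kwargs_without_key = {"subsets": ["train"]}

# `del` fails loudly when the key is missing.
try:
    del kwargs_without_key["eval_batches_file"]
    del_raised = False
except KeyError:
    del_raised = True

# `pop` with a default removes the key if present and is a no-op otherwise.
removed = common_dataset_kwargs.pop("eval_batches_file", None)
removed_again = common_dataset_kwargs.pop("eval_batches_file", None)
```

Because `pop(..., None)` already tolerates a missing key, the surrounding `if "eval_batches_file" in common_dataset_kwargs:` guard in the hunk looks redundant, though harmless.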


@@ -18,7 +18,6 @@ from pytorch3d.implicitron.dataset.dataset_base import DatasetBase
from pytorch3d.implicitron.dataset.dataset_map_provider import DatasetMap
from pytorch3d.implicitron.dataset.frame_data import FrameData
from pytorch3d.implicitron.tools.config import registry, run_auto_creation
from torch.utils.data import DataLoader
logger = logging.getLogger(__name__)


@@ -15,7 +15,6 @@ from typing import List, Optional, Tuple, TypeVar, Union
import numpy as np
import torch
from PIL import Image
from pytorch3d.io import IO
from pytorch3d.renderer.cameras import PerspectiveCameras
from pytorch3d.structures.pointclouds import Pointclouds
@@ -87,6 +86,15 @@ def is_train_frame(
def get_bbox_from_mask(
mask: np.ndarray, thr: float, decrease_quant: float = 0.05
) -> Tuple[int, int, int, int]:
# these corner cases need to be handled in order to avoid an infinite loop
if mask.size == 0:
warnings.warn("Empty mask is provided for bbox extraction.", stacklevel=1)
return 0, 0, 1, 1
if not mask.min() >= 0.0:
warnings.warn("Negative values in the mask for bbox extraction.", stacklevel=1)
mask = mask.clip(min=0.0)
# bbox in xywh
masks_for_box = np.zeros_like(mask)
while masks_for_box.sum() <= 1.0:
@@ -134,7 +142,15 @@ T = TypeVar("T", bound=torch.Tensor)
def bbox_xyxy_to_xywh(xyxy: T) -> T:
wh = xyxy[2:] - xyxy[:2]
xywh = torch.cat([xyxy[:2], wh])
return xywh # pyre-ignore
return xywh # pyre-ignore[7]
def bbox_xywh_to_xyxy(xywh: T, clamp_size: float | int | None = None) -> T:
wh = xywh[2:]
if clamp_size is not None:
wh = wh.clamp(min=clamp_size)
xyxy = torch.cat([xywh[:2], xywh[:2] + wh])
return xyxy # pyre-ignore[7]
def get_clamp_bbox(
@@ -180,16 +196,6 @@ def rescale_bbox(
return bbox * rel_size
def bbox_xywh_to_xyxy(
xywh: torch.Tensor, clamp_size: Optional[int] = None
) -> torch.Tensor:
xyxy = xywh.clone()
if clamp_size is not None:
xyxy[2:] = torch.clamp(xyxy[2:], clamp_size)
xyxy[2:] += xyxy[:2]
return xyxy
def get_1d_bounds(arr: np.ndarray) -> Tuple[int, int]:
nz = np.flatnonzero(arr)
return nz[0], nz[-1] + 1
@@ -201,18 +207,24 @@ def resize_image(
image_width: Optional[int],
mode: str = "bilinear",
) -> Tuple[torch.Tensor, float, torch.Tensor]:
if isinstance(image, np.ndarray):
image = torch.from_numpy(image)
if image_height is None or image_width is None:
if (
image_height is None
or image_width is None
or image.shape[-2] == 0
or image.shape[-1] == 0
):
# skip the resizing
return image, 1.0, torch.ones_like(image[:1])
# takes numpy array or tensor, returns pytorch tensor
minscale = min(
image_height / image.shape[-2],
image_width / image.shape[-1],
)
imre = torch.nn.functional.interpolate(
image[None],
scale_factor=minscale,
@@ -220,6 +232,7 @@ def resize_image(
align_corners=False if mode == "bilinear" else None,
recompute_scale_factor=True,
)[0]
imre_ = torch.zeros(image.shape[0], image_height, image_width)
imre_[:, 0 : imre.shape[1], 0 : imre.shape[2]] = imre
mask = torch.zeros(1, image_height, image_width)
@@ -232,9 +245,21 @@ def transpose_normalize_image(image: np.ndarray) -> np.ndarray:
return im.astype(np.float32) / 255.0
def load_image(path: str) -> np.ndarray:
def load_image(
path: str, try_read_alpha: bool = False, pil_format: str = "RGB"
) -> np.ndarray:
"""
Load an image from a path and return it as a numpy array.
If try_read_alpha is True, the image is read as RGBA and the alpha channel is
returned as the fourth channel.
Otherwise, the image is read as RGB and a three-channel image is returned.
"""
with Image.open(path) as pil_im:
im = np.array(pil_im.convert("RGB"))
# Check if the image has an alpha channel
if try_read_alpha and pil_im.mode == "RGBA":
im = np.array(pil_im)
else:
im = np.array(pil_im.convert(pil_format))
return transpose_normalize_image(im)
@@ -329,6 +354,7 @@ def adjust_camera_to_bbox_crop_(
focal_length_px, principal_point_px = _convert_ndc_to_pixels(
camera.focal_length[0],
# pyre-fixme[29]: `Union[(self: TensorBase, indices: Union[None, slice[Any, A...
camera.principal_point[0],
image_size_wh,
)
@@ -341,6 +367,7 @@ def adjust_camera_to_bbox_crop_(
)
camera.focal_length = focal_length[None]
# pyre-fixme[16]: `PerspectiveCameras` has no attribute `principal_point`.
camera.principal_point = principal_point_cropped[None]
@@ -352,6 +379,7 @@ def adjust_camera_to_image_scale_(
) -> PerspectiveCameras:
focal_length_px, principal_point_px = _convert_ndc_to_pixels(
camera.focal_length[0],
# pyre-fixme[29]: `Union[(self: TensorBase, indices: Union[None, slice[Any, A...
camera.principal_point[0],
original_size_wh,
)
@@ -368,7 +396,8 @@ def adjust_camera_to_image_scale_(
image_size_wh_output,
)
camera.focal_length = focal_length_scaled[None]
camera.principal_point = principal_point_scaled[None]
# pyre-fixme[16]: `PerspectiveCameras` has no attribute `principal_point`.
camera.principal_point = principal_point_scaled[None] # pyre-ignore[16]
# NOTE this cache is per-worker; they are implemented as processes.
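Note on the bbox helpers in this file: the rewritten `bbox_xywh_to_xyxy` clamps the width and height to a minimum size and then builds a new result instead of mutating a clone in place. The sketch below mirrors that logic on plain tuples so it runs without torch. The `_py` function names are hypothetical and are not part of the diff.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]


def bbox_xywh_to_xyxy_py(xywh: Box, clamp_size: Optional[float] = None) -> Box:
    """Convert (x, y, w, h) to (x0, y0, x1, y1), optionally enforcing a minimum size."""
    x, y, w, h = xywh
    if clamp_size is not None:
        # Same effect as wh.clamp(min=clamp_size) in the tensor version.
        w = max(w, clamp_size)
        h = max(h, clamp_size)
    return (x, y, x + w, y + h)


def bbox_xyxy_to_xywh_py(xyxy: Box) -> Box:
    """Convert (x0, y0, x1, y1) back to (x, y, w, h)."""
    x0, y0, x1, y1 = xyxy
    return (x0, y0, x1 - x0, y1 - y0)


degenerate = (10, 20, 0, 5)  # zero-width box
unclamped = bbox_xywh_to_xyxy_py(degenerate)
clamped = bbox_xywh_to_xyxy_py(degenerate, clamp_size=2)
round_trip = bbox_xyxy_to_xywh_py(bbox_xywh_to_xyxy_py((1, 2, 3, 4)))
```

Clamping before the addition means a degenerate box still yields `x1 > x0`, which is presumably why the clamp is applied to the width and height rather than to the corner coordinates.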


@@ -299,7 +299,6 @@ def eval_batch(
)
for loss_fg_mask, name_postfix in zip((mask_crop, mask_fg), ("_masked", "_fg")):
loss_mask_now = mask_crop * loss_fg_mask
for rgb_metric_name, rgb_metric_fun in zip(


@@ -14,7 +14,6 @@ import warnings
from typing import Any, Dict, List, Optional, Tuple
import torch
import tqdm
from pytorch3d.implicitron.evaluation import evaluate_new_view_synthesis as evaluate
from pytorch3d.implicitron.models.base_model import EvaluationMode, ImplicitronModelBase


@@ -10,7 +10,6 @@ from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
import torch
from pytorch3d.implicitron.models.renderer.base import EvaluationMode
from pytorch3d.implicitron.tools.config import ReplaceableBase
from pytorch3d.renderer.cameras import CamerasBase


@@ -106,7 +106,7 @@ class ResNetFeatureExtractor(FeatureExtractorBase):
self.layers = torch.nn.ModuleList()
self.proj_layers = torch.nn.ModuleList()
for stage in range(self.max_stage):
stage_name = f"layer{stage+1}"
stage_name = f"layer{stage + 1}"
feature_name = self._get_resnet_stage_feature_name(stage)
if (stage + 1) in self.stages:
if (
@@ -139,12 +139,18 @@ class ResNetFeatureExtractor(FeatureExtractorBase):
self.stages = set(self.stages) # convert to set for faster "in"
def _get_resnet_stage_feature_name(self, stage) -> str:
return f"res_layer_{stage+1}"
return f"res_layer_{stage + 1}"
def _resnet_normalize_image(self, img: torch.Tensor) -> torch.Tensor:
# pyre-fixme[58]: `-` is not supported for operand types `Tensor` and
# `Union[Tensor, Module]`.
# pyre-fixme[58]: `/` is not supported for operand types `Tensor` and
# `Union[Tensor, Module]`.
return (img - self._resnet_mean) / self._resnet_std
def get_feat_dims(self) -> int:
# pyre-fixme[29]: `Union[(self: TensorBase) -> Tensor, Tensor, Module]` is
# not a function.
return sum(self._feat_dim.values())
def forward(
@@ -183,7 +189,12 @@ class ResNetFeatureExtractor(FeatureExtractorBase):
else:
imgs_normed = imgs_resized
# is not a function.
# pyre-fixme[29]: `Union[Tensor, Module]` is not a function.
feats = self.stem(imgs_normed)
# pyre-fixme[6]: For 1st argument expected `Iterable[_T1]` but got
# `Union[Tensor, Module]`.
# pyre-fixme[6]: For 2nd argument expected `Iterable[_T2]` but got
# `Union[Tensor, Module]`.
for stage, (layer, proj) in enumerate(zip(self.layers, self.proj_layers)):
feats = layer(feats)
# just a sanity check below

Some files were not shown because too many files have changed in this diff.