14 Commits

Author SHA1 Message Date
Jeremy Reizenstein
d3b7f5f421 fix trainer test
Summary: After recent accelerate change D37543870 (aa8b03f31d), update interactive trainer test.

Reviewed By: shapovalov

Differential Revision: D37785932

fbshipit-source-id: 9211374323b6cfd80f6c5ff3a4fc1c0ca04b54ba
2022-07-12 07:20:21 -07:00
Nikhila Ravi
aa8b03f31d Updates to support Accelerate and multigpu training (#37)
Summary:
## Changes:
- Added Accelerate Library and refactored experiment.py to use it
- Needed to move `init_optimizer` and `ExperimentConfig` to a separate file to be compatible with submitit/hydra
- Needed to make some modifications to data loaders etc to work well with the accelerate ddp wrappers
- Loading/saving checkpoints incorporates an unwrapping step so remove the ddp wrapped model

## Tests

Tested with both `torchrun` and `submitit/hydra` on two gpus locally. Here are the commands:

**Torchrun**

Modules loaded:
```sh
1) anaconda3/2021.05   2) cuda/11.3   3) NCCL/2.9.8-3-cuda.11.3   4) gcc/5.2.0. (but unload gcc when using submit)
```

```sh
torchrun --nnodes=1 --nproc_per_node=2 experiment.py --config-path ./configs --config-name repro_singleseq_nerf_test
```

**Submitit/Hydra Local test**

```sh
~/pytorch3d/projects/implicitron_trainer$ HYDRA_FULL_ERROR=1 python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_local hydra.launcher.gpus_per_node=2 hydra.launcher.tasks_per_node=2 hydra.launcher.nodes=1
```

**Submitit/Hydra distributed test**

```sh
~/implicitron/pytorch3d$ python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs  hydra/launcher=submitit_slurm hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=8 hydra.launcher.nodes=1 hydra.launcher.partition=learnlab hydra.launcher.timeout_min=4320
```

## TODOS:
- Fix distributed evaluation: currently this doesn't work as the input format to the evaluation function is not suitable for gathering across gpus (needs to be nested list/tuple/dicts of objects that satisfy `is_torch_tensor`) and currently `frame_data`  contains `Cameras` type.
- Refactor the `accelerator` object to be accessible by all functions instead of needing to pass it around everywhere? Maybe have a `Trainer` class and add it as a method?
- Update readme with installation instructions for accelerate and also commands for running jobs with torchrun and submitit/hydra

X-link: https://github.com/fairinternal/pytorch3d/pull/37

Reviewed By: davnov134, kjchalup

Differential Revision: D37543870

Pulled By: bottler

fbshipit-source-id: be9eb4e91244d4fe3740d87dafec622ae1e0cf76
2022-07-11 19:29:58 -07:00
Jeremy Reizenstein
efb721320a extract camera_difficulty_bin_breaks
Summary: As part of removing Task, move camera difficulty bin breaks from hard code to the top level.

Reviewed By: davnov134

Differential Revision: D37491040

fbshipit-source-id: f2d6775ebc490f6f75020d13f37f6b588cc07a0b
2022-07-06 07:13:41 -07:00
Jeremy Reizenstein
771cf8a328 more padding options in Dataloader
Summary: Add facilities for dataloading non-sequential scenes.

Reviewed By: shapovalov

Differential Revision: D37291277

fbshipit-source-id: 0a33e3727b44c4f0cba3a2abe9b12f40d2a20447
2022-07-06 07:13:41 -07:00
David Novotny
0dce883241 Refactor autodecoders
Summary: Refactors autodecoders. Tests pass.

Reviewed By: bottler

Differential Revision: D37592429

fbshipit-source-id: 8f5c9eac254e1fdf0704d5ec5f69eb42f6225113
2022-07-04 07:18:03 -07:00
Krzysztof Chalupka
ae35824f21 Refactor ViewMetrics
Summary:
Make ViewMetrics easy to replace by putting them into an OmegaConf dataclass.

Also, re-word a few variable names and fix minor TODOs.

Reviewed By: bottler

Differential Revision: D37327157

fbshipit-source-id: 78d8e39bbb3548b952f10abbe05688409fb987cc
2022-06-30 09:22:01 -07:00
Jeremy Reizenstein
5c1ca757bb srn/idr followups
Summary: small followup to D37172537 (cba26506b6) and D37209012 (81d63c6382): changing default #harmonics and improving a test

Reviewed By: shapovalov

Differential Revision: D37412357

fbshipit-source-id: 1af1005a129425fd24fa6dd213d69c71632099a0
2022-06-24 04:07:15 -07:00
Jeremy Reizenstein
81d63c6382 idr harmonic_fns and doc
Summary: Document the inputs of idr functions and distinguish n_harmonic_functions to be 0 (simple embedding) versus -1 (no embedding).

Reviewed By: davnov134

Differential Revision: D37209012

fbshipit-source-id: 6e5c3eae54c4e5e8c3f76cad1caf162c6c222d52
2022-06-20 13:48:34 -07:00
Jeremy Reizenstein
cba26506b6 bg_color for lstm renderer
Summary: Allow specifying a color for non-opaque pixels in LSTMRenderer.

Reviewed By: davnov134

Differential Revision: D37172537

fbshipit-source-id: 6039726678bb7947f7d8cd04035b5023b2d5398c
2022-06-20 13:46:35 -07:00
Jeremy Reizenstein
65f667fd2e loading llff and blender datasets
Summary: Copy code from NeRF for loading LLFF data and blender synthetic data, and create dataset objects for them

Reviewed By: shapovalov

Differential Revision: D35581039

fbshipit-source-id: af7a6f3e9a42499700693381b5b147c991f57e5d
2022-06-16 03:09:15 -07:00
Jeremy Reizenstein
023a2369ae test configs are loadable
Summary: Add test that the yaml files deserialize.

Reviewed By: davnov134

Differential Revision: D36830673

fbshipit-source-id: b785d8db97b676686036760bfa2dd3fa638bda57
2022-06-10 12:22:46 -07:00
Jeremy Reizenstein
6275283202 pluggable JsonIndexDataset
Summary: Make dataset type and args configurable on JsonIndexDatasetMapProvider.

Reviewed By: davnov134

Differential Revision: D36666705

fbshipit-source-id: 4d0a3781d9a956504f51f1c7134c04edf1eb2846
2022-06-10 12:22:46 -07:00
Jeremy Reizenstein
1d43251391 PathManagerFactory
Summary: Allow access to manifold internally by default.

Reviewed By: davnov134

Differential Revision: D36760481

fbshipit-source-id: 2a16bd40e81ef526085ac1b3f4606b63c1841428
2022-06-10 12:22:46 -07:00
Jeremy Reizenstein
c31bf85a23 test runner for experiment.py
Summary: Add simple interactive testrunner for experiment.py

Reviewed By: shapovalov

Differential Revision: D35316221

fbshipit-source-id: d424bcba632eef89eefb56e18e536edb58ec6f85
2022-05-26 05:33:03 -07:00