Summary:
Addresses the following issue:
https://github.com/facebookresearch/pytorch3d/issues/1345#issuecomment-1272881244
I.e., when installed from conda, the `pytorch3d_implicitron_visualizer` entry point crashes because it invokes `main()` with no arguments, while `main` requires a single positional argument, `argv`.
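A minimal sketch of the kind of fix this implies, assuming a console-script entry point; the wrapper name and body here are hypothetical, not the actual visualizer code:
```
# Hypothetical sketch: a zero-argument wrapper that a console_scripts entry
# point can call, forwarding sys.argv to the argv-taking main().
import sys


def main(argv) -> None:
    # placeholder for the real visualizer logic
    print(f"visualizer invoked with: {argv[1:]}")


def entry_point() -> None:
    # suitable as a console_scripts target, unlike main itself
    main(sys.argv)


if __name__ == "__main__":
    entry_point()
```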
Reviewed By: shapovalov
Differential Revision: D41533497
fbshipit-source-id: e53a923eb8b2f0f9c0e92e9c0866d9cb310c4799
Summary:
Enum fields cause the following to crash since they are loaded as strings:
```
from omegaconf import OmegaConf

config = OmegaConf.load(autodumped_cfg_file)  # enum fields are loaded back as plain strings
Experiment(**config)  # crashes: fields typed as enums receive str values
```
It would be good to come up with a general solution, but for now this just fixes the visualisation script (a sketch of the string-to-enum coercion is below).
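A minimal sketch of that coercion, with a hypothetical `Task` enum standing in for the real config fields:
```
from enum import Enum

from omegaconf import OmegaConf


class Task(Enum):  # hypothetical enum, not the actual Implicitron field
    SINGLE_SEQUENCE = "singlesequence"
    MULTI_SEQUENCE = "multisequence"


cfg = OmegaConf.create({"task": "singlesequence"})  # a plain YAML load yields a str
task = Task(cfg.task)  # coerce the string back into the enum member before instantiation
assert task is Task.SINGLE_SEQUENCE
```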
Reviewed By: bottler
Differential Revision: D41140426
fbshipit-source-id: 71c1c6b1fffe3b5ab1ca0114cfa3f0d81160278f
Summary: Various fixes to get visualize_reconstruction running, and an interactive test for it.
Reviewed By: kjchalup
Differential Revision: D39286691
fbshipit-source-id: 88735034cc01736b24735bcb024577e6ab7ed336
Summary:
Move the flyaround rendering function into core implicitron.
This unblocks an example in the facebookresearch/co3d repo.
Reviewed By: bottler
Differential Revision: D39257801
fbshipit-source-id: 6841a88a43d4aa364dd86ba83ca2d4c3cf0435a4
Summary:
generic_model_args no longer exists. Update some references to it, mostly in docs.
This fixes the testing of all the yaml files in test_forward_pass.
Reviewed By: shapovalov
Differential Revision: D38789202
fbshipit-source-id: f11417efe772d7f86368b3598aa66c52b1309dbf
Summary:
Stats are logically connected to the training loop, not to the model. Hence, moving to the training loop.
Also removing resume_epoch from OptimizerFactory so that it lives in a single place, ModelFactory. This removes the need for config consistency checks etc.
Reviewed By: kjchalup
Differential Revision: D38313475
fbshipit-source-id: a1d188a63e28459df381ff98ad8acdcdb14887b7
Summary:
This large diff rewrites a significant portion of Implicitron's config hierarchy. The new hierarchy, and some of the default implementation classes, are as follows:
```
Experiment
data_source: ImplicitronDataSource
dataset_map_provider
data_loader_map_provider
model_factory: ImplicitronModelFactory
model: GenericModel
optimizer_factory: ImplicitronOptimizerFactory
training_loop: ImplicitronTrainingLoop
evaluator: ImplicitronEvaluator
```
1) Experiment (formerly ExperimentConfig) is now a top-level Configurable whose members are mostly new high-level factory Configurables.
2) Experiment's job is to run the factories, do some Accelerate setup, and then pass the results to the main training loop.
3) ImplicitronOptimizerFactory and ImplicitronModelFactory are new high-level factories that create the optimizer, scheduler, model, and stats objects.
4) TrainingLoop is a new configurable that runs the main training loop and the inner train-validate step.
5) Evaluator is a new configurable that TrainingLoop uses to run validation/test steps.
6) GenericModel is no longer the only model choice. Instead, ImplicitronModelBase (instantiated with GenericModel by default) is a member of Experiment and can easily be replaced by a custom user implementation.
All the new Configurables are children of ReplaceableBase, and can be easily replaced with custom implementations.
In addition, I added support for the exponential LR schedule, updated the config files and the test, as well as added a config file that reproduces NERF results and a test to run the repro experiment.
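Point 6 above means a user can plug a custom model in through the registry. A minimal sketch, assuming Implicitron's `registry`/`ReplaceableBase` API; `MyModel` and its field are hypothetical:
```
from pytorch3d.implicitron.models.base_model import ImplicitronModelBase
from pytorch3d.implicitron.tools.config import registry


@registry.register
class MyModel(ImplicitronModelBase):  # hypothetical replacement for GenericModel
    # Annotated class attributes like this become config/constructor arguments.
    my_hyperparameter: float = 1.0

    def forward(self, **kwargs):
        # Implement the same forward contract as ImplicitronModelBase here.
        raise NotImplementedError
```
Once registered, the class can be selected for the `model` member in the experiment config in place of GenericModel.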
Reviewed By: bottler
Differential Revision: D37723227
fbshipit-source-id: b36bee880d6aa53efdd2abfaae4489d8ab1e8a27
Summary:
1. Respecting the `visdom_show_preds` parameter when it is False.
2. Clipping the images pre-visualisation, which is important for methods like SRN that are not aware of the pixel value range (see the sketch below).
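A minimal sketch of the kind of clipping meant in point 2, with a hypothetical batch of predicted images:
```
import torch

images = torch.randn(2, 3, 8, 8)     # raw predictions, possibly outside [0, 1]
images_vis = images.clamp(0.0, 1.0)  # clip into the displayable range before sending to Visdom
```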
Reviewed By: bottler
Differential Revision: D37786439
fbshipit-source-id: 8dbb5104290bcc5c2829716b663cae17edc911bd
Summary:
## Changes:
- Added Accelerate Library and refactored experiment.py to use it
- Needed to move `init_optimizer` and `ExperimentConfig` to a separate file to be compatible with submitit/hydra
- Needed to make some modifications to data loaders etc. to work well with the Accelerate DDP wrappers
- Loading/saving checkpoints incorporates an unwrapping step to remove the DDP wrapper from the model (sketched below)
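A minimal sketch of that unwrapping step, using a placeholder model and checkpoint path rather than the actual trainer code:
```
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(4, 4)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model, optimizer = accelerator.prepare(model, optimizer)  # may wrap the model in DDP

# Saving: unwrap so the checkpoint holds plain module weights, not the DDP wrapper.
unwrapped = accelerator.unwrap_model(model)
accelerator.save(unwrapped.state_dict(), "checkpoint.pt")

# Loading: restore into the unwrapped module; the wrapper shares its parameters.
unwrapped.load_state_dict(torch.load("checkpoint.pt", map_location="cpu"))
```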
## Tests
Tested with both `torchrun` and `submitit/hydra` on two gpus locally. Here are the commands:
**Torchrun**
Modules loaded:
```sh
1) anaconda3/2021.05 2) cuda/11.3 3) NCCL/2.9.8-3-cuda.11.3 4) gcc/5.2.0 (but unload gcc when using submitit)
```
```sh
torchrun --nnodes=1 --nproc_per_node=2 experiment.py --config-path ./configs --config-name repro_singleseq_nerf_test
```
**Submitit/Hydra Local test**
```sh
~/pytorch3d/projects/implicitron_trainer$ HYDRA_FULL_ERROR=1 python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs hydra/launcher=submitit_local hydra.launcher.gpus_per_node=2 hydra.launcher.tasks_per_node=2 hydra.launcher.nodes=1
```
**Submitit/Hydra distributed test**
```sh
~/implicitron/pytorch3d$ python3.9 experiment.py --config-name repro_singleseq_nerf_test --multirun --config-path ./configs hydra/launcher=submitit_slurm hydra.launcher.gpus_per_node=8 hydra.launcher.tasks_per_node=8 hydra.launcher.nodes=1 hydra.launcher.partition=learnlab hydra.launcher.timeout_min=4320
```
## TODOS:
- Fix distributed evaluation: currently this doesn't work because the input to the evaluation function is not in a format suitable for gathering across GPUs (it needs to be nested lists/tuples/dicts of objects that satisfy `is_torch_tensor`), and `frame_data` currently contains the `Cameras` type.
- Refactor the `accelerator` object to be accessible by all functions instead of needing to pass it around everywhere? Maybe have a `Trainer` class and store it as a member?
- Update the README with installation instructions for Accelerate and commands for running jobs with torchrun and submitit/hydra
X-link: https://github.com/fairinternal/pytorch3d/pull/37
Reviewed By: davnov134, kjchalup
Differential Revision: D37543870
Pulled By: bottler
fbshipit-source-id: be9eb4e91244d4fe3740d87dafec622ae1e0cf76
Summary: The ImplicitronDataset class corresponds to JsonIndexDatasetMapProvider
Reviewed By: shapovalov
Differential Revision: D36661396
fbshipit-source-id: 80ca2ff81ef9ecc2e3d1f4e1cd14b6f66a7ec34d
Summary: replace dataset_zoo with a pluggable DatasetMapProvider. The logic is now in annotated_file_dataset_map_provider.
Reviewed By: shapovalov
Differential Revision: D36443965
fbshipit-source-id: 9087649802810055e150b2fbfcc3c197a761f28a
Summary: Separate ImplicitronDatasetBase and FrameData (to be used by all data sources) from ImplicitronDataset (which is specific).
Reviewed By: shapovalov
Differential Revision: D36413111
fbshipit-source-id: 3725744cde2e08baa11aff4048237ba10c7efbc6
Summary:
Move dataset_args and dataloader_args from ExperimentConfig into a new member called datasource so that it can contain replaceables.
Also add an enum `Task` for the task type.
Reviewed By: shapovalov
Differential Revision: D36201719
fbshipit-source-id: 47d6967bfea3b7b146b6bbd1572e0457c9365871
Summary:
To avoid model_zoo, we need to make GenericModel pluggable.
I also align creation APIs for convenience.
Reviewed By: bottler, davnov134
Differential Revision: D35933093
fbshipit-source-id: 8228926528eb41a795fbfbe32304b8019197e2b1
Summary: Using the API from D35012121 everywhere.
Reviewed By: bottler
Differential Revision: D35045870
fbshipit-source-id: dab112b5e04160334859bbe8fa2366344b6e0f70