Summary:
CUDA kernel built-in variables matching the pattern `(thread|block|grid).(Idx|Dim).(x|y|z)` [have the data type `uint`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#built-in-variables).
Many programmers mistakenly rely on implicit casts to turn these values into `int`. In fact, the [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/) itself is inconsistent and incorrect in its use of data types in programming examples.
As a result of these implicit casts, our kernels may give unexpected results when exposed to large datasets, i.e., those exceeding ~2B items.
While we now have linters in place to prevent simple mistakes (D71236150), our codebase has many problematic instances. This diff fixes some of them.
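A minimal sketch of the failure mode and the fix (a hypothetical kernel, not one from this diff):
```
__global__ void CopyKernel(const float* in, float* out, const int64_t n) {
  // Bad: `blockIdx.x * blockDim.x + threadIdx.x` is computed in 32-bit
  // unsigned arithmetic, and an implicit cast to `int` gives wrong
  // (negative) indices past ~2.1B items:
  //   int idx = blockIdx.x * blockDim.x + threadIdx.x;
  // Good: widen to int64_t before the multiply so the index stays exact.
  const int64_t idx =
      static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
  if (idx < n) {
    out[idx] = in[idx];
  }
}
```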
Reviewed By: dtolnay
Differential Revision: D71355356
fbshipit-source-id: cea44891416d9efd2f466d6c45df4e36008fa036
Summary: Update all FB license strings to the new format.
Reviewed By: patricklabatut
Differential Revision: D33403538
fbshipit-source-id: 97a4596c5c888f3c54f44456dc07e718a387a02c
Summary:
Following the non-square image rasterization support for meshes, apply the same updates to the pointcloud rasterizer.
Main API Change:
- PointRasterizationSettings now accepts a tuple/list of (H, W) for the image size (see the sketch below).
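At the kernel level, the main consequence is that the pixel-to-NDC conversion is applied per axis. A minimal sketch (hypothetical helper; it assumes each axis maps independently to (-1, 1), whereas the actual non-square NDC convention may scale relative to the shorter side):
```
// Map pixel index i in [0, S) to the NDC coordinate of its center.
__device__ inline float PixToNdc(const int i, const int S) {
  return -1.0f + (2.0f * i + 1.0f) / S;
}

// Inside the kernel, the y and x axes now use H and W separately:
//   const float yf = PixToNdc(yi, H);
//   const float xf = PixToNdc(xi, W);
```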
Reviewed By: jcjohnson
Differential Revision: D25465206
fbshipit-source-id: 7370d83c431af1b972158cecae19d82364623380
Summary:
Fix division by zero when alpha is 1.0.
In this case the numerator is already 0, so we must ensure that the division by 0 (which would produce NaNs) does not occur.
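A minimal sketch of the guard (a hypothetical device helper with illustrative names, not the exact kernel code):
```
// When alpha == 1.0, the numerator is already 0, so return 0 directly
// instead of computing 0.0f / 0.0f, which would yield NaN.
__device__ inline float SafeDiv(const float numer, const float alpha) {
  const float denom = 1.0f - alpha;
  return (denom == 0.0f) ? 0.0f : numer / denom;
}
```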
Reviewed By: nikhilaravi
Differential Revision: D21650478
fbshipit-source-id: bc457105b3050fef1c8bd4e58e7d6d15c0c81ffd
Summary:
Update the CUDA kernels to:
- remove contiguous checks for the grad tensors and for CPU functions which use accessors
- for CUDA implementations, call `.contiguous()` on all tensors in the host function before launching the kernel (see the sketch below)
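A minimal sketch of the host-side pattern (hypothetical op and kernel names):
```
#include <ATen/ATen.h>

at::Tensor ScaleCuda(const at::Tensor& input, const float s) {
  // No-op if `input` is already contiguous; otherwise makes a dense
  // copy, so the kernel can assume a dense layout without its own checks.
  const at::Tensor input_c = input.contiguous();
  at::Tensor out = at::empty_like(input_c);
  // ScaleKernel<<<blocks, threads>>>(
  //     input_c.data_ptr<float>(), out.data_ptr<float>(), input_c.numel(), s);
  return out;
}
```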
Reviewed By: gkioxari
Differential Revision: D21598008
fbshipit-source-id: 9b97bda4582fd4269c8a00999874d4552a1aea2d
Summary:
Updates to:
- enable CUDA kernel launches on any GPU (not just the default)
- CUDA and contiguous checks for all kernels
- checks to ensure all tensors are on the same device
- error reporting in the CUDA kernels (see the sketch below)
- CUDA tests now run on a random device, not just the default
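A minimal sketch combining the first four items in a host function (hypothetical op name; TORCH_CHECK, CUDAGuard, and AT_CUDA_CHECK are the standard ATen/c10 facilities):
```
#include <ATen/cuda/CUDAContext.h>
#include <c10/cuda/CUDAGuard.h>

at::Tensor MyOpCuda(const at::Tensor& a, const at::Tensor& b) {
  TORCH_CHECK(a.is_cuda() && b.is_cuda(), "inputs must be CUDA tensors");
  TORCH_CHECK(a.device() == b.device(), "inputs must be on the same device");
  // Launch on the inputs' GPU rather than the current default device.
  at::cuda::CUDAGuard device_guard(a.device());
  at::Tensor out = at::empty_like(a);
  // MyOpKernel<<<blocks, threads, 0, at::cuda::getCurrentCUDAStream()>>>(...);
  AT_CUDA_CHECK(cudaGetLastError());  // surface kernel launch errors
  return out;
}
```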
Reviewed By: jcjohnson, gkioxari
Differential Revision: D21215280
fbshipit-source-id: 1bedc9fe6c35e9e920bdc4d78ed12865b1005519
Summary:
Use the ATen (`at::`) interface instead of the `torch::` interface in all CUDA code. This allows the CUDA build to work with PyTorch 1.5 under GCC 5 (e.g. the compiler of Ubuntu 16.04 LTS), which had previously been failing with errors like the one below, perhaps due to a bug in nvcc.
```
torch/include/torch/csrc/api/include/torch/nn/cloneable.h:68:61: error: invalid static_cast from type ‘const torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’ to type ‘torch::OrderedDict<std::basic_string<char>, std::shared_ptr<torch::nn::Module> >’
```
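An illustrative before/after of the interface change (hypothetical helper; `at::zeros` is the ATen equivalent of `torch::zeros`):
```
#include <ATen/ATen.h>

// Before (torch:: interface, pulling in headers that GCC 5 + nvcc
// failed to compile):
//   torch::Tensor out = torch::zeros({n}, points.options());
// After (ATen interface):
at::Tensor MakeOutput(const at::Tensor& points, const int64_t n) {
  return at::zeros({n}, points.options());
}
```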
Reviewed By: nikhilaravi
Differential Revision: D21204029
fbshipit-source-id: ca6bdbcecf42493365e1c23a33fe35e1759fe8b6
Summary: This is mostly replacing the old PackedTensorAccessor with the new PackedTensorAccessor64.
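A minimal sketch of the change (hypothetical kernel; the host side switches from `packed_accessor` to `packed_accessor64` accordingly):
```
#include <ATen/ATen.h>

// PackedTensorAccessor64 always indexes with 64-bit integers, so it
// stays safe on tensors with more than 2^31 elements.
__global__ void ScalePointsKernel(
    at::PackedTensorAccessor64<float, 2> points, // was PackedTensorAccessor
    const float s) {
  const int64_t i =
      static_cast<int64_t>(blockIdx.x) * blockDim.x + threadIdx.x;
  if (i < points.size(0)) {
    points[i][0] *= s;
    points[i][1] *= s;
  }
}
```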
Reviewed By: gkioxari
Differential Revision: D21088773
fbshipit-source-id: 5973e5a29d934eafb7c70ec5ec154ca076b64d27