Mirror of https://github.com/facebookresearch/pytorch3d.git (synced 2026-03-04 03:05:59 +08:00)
Make cuda tensors contiguous in host function and remove contiguous check
Summary: Update the CUDA kernels to:
- remove contiguous checks for the grad tensors and for CPU functions which use accessors
- for CUDA implementations, call `.contiguous()` on all tensors in the host function before invoking the kernel

Reviewed By: gkioxari
Differential Revision: D21598008
fbshipit-source-id: 9b97bda4582fd4269c8a00999874d4552a1aea2d
Committed by Facebook GitHub Bot
Parent: a8377f1f06
Commit: 3fef506895
@@ -168,6 +168,8 @@ at::Tensor alphaCompositeCudaForward(
   // doubles. Currently, support is for floats only.
   alphaCompositeCudaForwardKernel<<<numBlocks, threadsPerBlock, 0, stream>>>(
       // clang-format off
+      // As we are using packed accessors here the tensors
+      // do not need to be made contiguous.
       result.packed_accessor64<float, 4, at::RestrictPtrTraits>(),
       features.packed_accessor64<float, 2, at::RestrictPtrTraits>(),
       alphas.packed_accessor64<float, 4, at::RestrictPtrTraits>(),
@@ -211,6 +213,8 @@ std::tuple<at::Tensor, at::Tensor> alphaCompositeCudaBackward(
   // doubles. Currently, support is for floats only.
   alphaCompositeCudaBackwardKernel<<<numBlocks, threadsPerBlock, 0, stream>>>(
       // clang-format off
+      // As we are using packed accessors here the tensors
+      // do not need to be made contiguous.
       grad_features.packed_accessor64<float, 2, at::RestrictPtrTraits>(),
       grad_alphas.packed_accessor64<float, 4, at::RestrictPtrTraits>(),
       grad_outputs.packed_accessor64<float, 4, at::RestrictPtrTraits>(),
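
Note: the comment about packed accessors, and the `.contiguous()` call described in the summary, both concern how memory is indexed. The following is a minimal plain C++ sketch of that idea, not code from this commit. `StridedView2D`, `at_contiguous`, `make_contiguous` and `transpose` are hypothetical names chosen for illustration; the view type only mimics the way an accessor carries sizes and strides alongside the data pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a packed accessor: a data pointer plus
// per-dimension sizes and strides (in elements). Because indexing goes
// through the strides, it is correct for any memory layout.
struct StridedView2D {
  const float* data;
  std::size_t sizes[2];
  std::size_t strides[2];

  float at(std::size_t i, std::size_t j) const {
    return data[i * strides[0] + j * strides[1]];
  }
};

// Raw-pointer indexing that assumes dense row-major storage.
// This is only correct if the underlying data is contiguous.
float at_contiguous(const float* data, std::size_t cols,
                    std::size_t i, std::size_t j) {
  return data[i * cols + j];
}

// Transpose without copying: swap sizes and strides. The result is a
// non-contiguous view over the same storage.
StridedView2D transpose(const StridedView2D& v) {
  return StridedView2D{v.data,
                       {v.sizes[1], v.sizes[0]},
                       {v.strides[1], v.strides[0]}};
}

// Analogue of calling .contiguous() in a host function: copy the view
// into dense row-major storage so raw-pointer indexing becomes valid.
std::vector<float> make_contiguous(const StridedView2D& v) {
  std::vector<float> out(v.sizes[0] * v.sizes[1]);
  for (std::size_t i = 0; i < v.sizes[0]; ++i) {
    for (std::size_t j = 0; j < v.sizes[1]; ++j) {
      out[i * v.sizes[1] + j] = v.at(i, j);
    }
  }
  return out;
}
```

In this sketch, a transposed view read through `at` returns the right element with no copy, which corresponds to the kernels above that take packed accessors. Code that indexes a raw pointer with `at_contiguous` needs the `make_contiguous` copy first, which corresponds to the host-side `.contiguous()` call described in the summary.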