screen cameras lose -1

Summary: All the renderers in PyTorch3D (pointclouds including pulsar, meshes, raysampling) use the align_corners=False convention: NDC space extends between the outer edges of the outermost pixels. For a non-square image with W>H, the vertical NDC range is -1 to 1 and the horizontal range is -W/H to W/H.

However, it was recently pointed out that the functionality which deals with screen space inside the camera classes is inconsistent with this: it unintentionally uses align_corners=True. This commit fixes that.

This changes the behaviour of the following:
- If you create a camera in screen coordinates, i.e. setting in_ndc=False, then anything you do with the camera which touches NDC space may be affected, including trying to use renderers. The transform_points_screen function will not be affected.
- If you call the function transform_points_screen on a camera defined in NDC space, the results will be different. I have illustrated in the diff how to get the old results from the new results, but this probably isn't the right long-term solution.

Reviewed By: gkioxari
Differential Revision: D32536305
fbshipit-source-id: 377325a9137282971dcb7ca11a6cba3fc700c9ce
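As a rough illustration of the convention change described in the summary, the sketch below maps an NDC point to screen coordinates with and without the "-1". It is plain Python and not the PyTorch3D implementation: the helper names `ndc_to_screen` and `old_screen_from_new` are hypothetical, and the x/y sign flip (NDC +X left, +Y up; screen +X right, +Y down) is an assumption about the convention rather than something shown in this diff. The back-conversion is one possible way to recover old values and may differ from what the commit's tests do.

```python
def ndc_to_screen(x_ndc, y_ndc, width, height, align_corners=False):
    """Map an NDC point to screen coordinates (hypothetical helper).

    align_corners=True reproduces the old behaviour (with the "-1");
    align_corners=False reproduces the behaviour after this commit.
    """
    offset = 1.0 if align_corners else 0.0
    # The smallest image side spans [-1, 1] in NDC.
    scale = (min(width, height) - offset) / 2.0
    cx = (width - offset) / 2.0
    cy = (height - offset) / 2.0
    # Assumed flip: NDC +X points left and +Y up, screen +X right and +Y down.
    return cx - scale * x_ndc, cy - scale * y_ndc


def old_screen_from_new(x_new, y_new, width, height):
    """Recover old-style screen coordinates from new-style ones (a sketch)."""
    scale = min(width, height) / 2.0
    # Invert the new mapping to get back to NDC ...
    x_ndc = (width / 2.0 - x_new) / scale
    y_ndc = (height / 2.0 - y_new) / scale
    # ... then apply the old mapping.
    return ndc_to_screen(x_ndc, y_ndc, width, height, align_corners=True)


if __name__ == "__main__":
    W, H = 4, 2
    # New convention: the NDC corner (W/H, 1) lands on the image edge.
    print(ndc_to_screen(2.0, 1.0, W, H))                      # (0.0, 0.0)
    print(ndc_to_screen(-2.0, -1.0, W, H))                    # (4.0, 2.0)
    # The NDC origin moves by half a pixel between the two conventions.
    print(ndc_to_screen(0.0, 0.0, W, H))                      # (2.0, 1.0)
    print(ndc_to_screen(0.0, 0.0, W, H, align_corners=True))  # (1.5, 0.5)
```

With a 4x2 image, the NDC origin maps to (2.0, 1.0) under the new convention and (1.5, 0.5) under the old one, which is the half-pixel shift the commit title refers to.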
2025-12-20 14:20:38 +08:00 · 2021-12-07 15:02:46 -08:00
parent cff4876131
commit bf3bc6f8e3
5 changed files with 34 additions and 37 deletions
--- a/pytorch3d/renderer/camera_conversions.py
+++ b/pytorch3d/renderer/camera_conversions.py
@@ -33,9 +33,9 @@ def _cameras_from_opencv_projection(
    # has range [-1, 1] and the largest side has range [-u, u], with u > 1.
    # This convention is consistent with the PyTorch3D renderer, as well as
    # the transformation function `get_ndc_to_screen_transform`.
-    scale = (image_size_wh.to(R).min(dim=1, keepdim=True)[0] - 1) / 2.0
+    scale = image_size_wh.to(R).min(dim=1, keepdim=True)[0] / 2.0
    scale = scale.expand(-1, 2)
-    c0 = (image_size_wh - 1) / 2.0
+    c0 = image_size_wh / 2.0

    # Get the PyTorch3D focal length and principal point.
    focal_pytorch3d = focal_length / scale
@@ -75,9 +75,9 @@ def _opencv_from_cameras_projection(
    image_size_wh = image_size.to(R).flip(dims=(1,))

    # NDC to screen conversion.
-    scale = (image_size_wh.to(R).min(dim=1, keepdim=True)[0] - 1) / 2.0
+    scale = image_size_wh.to(R).min(dim=1, keepdim=True)[0] / 2.0
    scale = scale.expand(-1, 2)
-    c0 = (image_size_wh - 1) / 2.0
+    c0 = image_size_wh / 2.0

    # pyre-fixme[29]: `Union[BoundMethod[typing.Callable(torch.Tensor.__neg__)[[Named...
    principal_point = -p0_pytorch3d * scale + c0
--- a/pytorch3d/renderer/cameras.py
+++ b/pytorch3d/renderer/cameras.py
@@ -36,8 +36,9 @@ class CamerasBase(TensorProperties):
        and translation (T)
    - NDC coordinate system: This is the normalized coordinate system that confines
        in a volume the rendered part of the object or scene. Also known as view volume.
-        For square images, given the PyTorch3D convention, (+1, +1, znear) is the top left near corner,
-        and (-1, -1, zfar) is the bottom right far corner of the volume.
+        For square images, given the PyTorch3D convention, (+1, +1, znear)
+        is the top left near corner, and (-1, -1, zfar) is the bottom right far
+        corner of the volume.
        The transformation from view --> NDC happens after applying the camera
        projection matrix (P) if defined in NDC space.
        For non square images, we scale the points such that smallest side
@@ -1623,12 +1624,12 @@ def get_ndc_to_screen_transform(
    # For non square images, we scale the points such that smallest side
    # has range [-1, 1] and the largest side has range [-u, u], with u > 1.
    # This convention is consistent with the PyTorch3D renderer
-    scale = (image_size.min(dim=1).values - 1.0) / 2.0
+    scale = (image_size.min(dim=1).values - 0.0) / 2.0

    K[:, 0, 0] = scale
    K[:, 1, 1] = scale
-    K[:, 0, 3] = -1.0 * (width - 1.0) / 2.0
-    K[:, 1, 3] = -1.0 * (height - 1.0) / 2.0
+    K[:, 0, 3] = -1.0 * (width - 0.0) / 2.0
+    K[:, 1, 3] = -1.0 * (height - 0.0) / 2.0
    K[:, 2, 2] = 1.0
    K[:, 3, 3] = 1.0