Incorrect rendering of object compared to Ground Truth image on LineMod Dataset #934

Closed
manasburagohain opened this issue Nov 13, 2021 · 10 comments
Labels
how to How to use PyTorch3D in my project

Comments

@manasburagohain

manasburagohain commented Nov 13, 2021

🐛 Bugs / Unexpected behaviors

Given a ground truth image and a ground truth pose, I attempt to render the object with the same pose to align with GT image.

Initially, setting the camera pose to the GT pose results in the following rendering:

test_no_change

However, since PyTorch3D follows a different axes convention, I attempted to use the fix suggested in #522 by negating f to -f, which results in the following render:

test_negate_f

Additionally, I attempted to use the fix mentioned in #294 by rotating about z by 180 degrees, which results in the following render:

test_rotate_z

As we can see, neither fix results in the rendered object exactly aligning with the GT pose. This has been the case across multiple GT poses and images.

Instructions To Reproduce the Issue:

Please include the following (depending on what the issue is):

  1. Any changes you made (git diff) or code you wrote
import os
from random import randint  # used below when cropping the background

import numpy as np
import cv2
import torch
from pytorch3d.structures import Meshes
from pytorch3d.io import IO
from pytorch3d.renderer import (
    look_at_view_transform,
    look_at_rotation,
    OpenGLPerspectiveCameras,
    PerspectiveCameras,
    PointLights,
    DirectionalLights,
    AmbientLights,
    RasterizationSettings,
    MeshRenderer,
    MeshRasterizer,
    SoftPhongShader,
    SoftSilhouetteShader,
    TexturesVertex
)

#Params
K = np.array([[572.4114, 0.,         325.2611],
            [0.,        573.57043,  242.04899],
            [0.,        0.,         1.]])

f_x, f_y = K[0,0], K[1,1]
p_x, p_y = K[0,2], K[1,2]
h = 480
w = 640

#Load mesh
device = torch.device("cuda:0")
mesh = IO().load_mesh("mesh.ply").to(device)

#GT Pose for instance 176
R=  torch.tensor([0.66307002, 0.74850100, 0.00921593, 0.50728703, -0.44026601, -0.74082798, -0.55045301, 0.49589601, -0.67163098])
T= torch.tensor([42.36749640, 1.84263252, 768.28001229])

#Apply fix #294
RT = torch.zeros((4,4))
RT[3,3] = 1
RT[:3,:3] = R.reshape(3,3)
RT[:3,3] = T/1000.0

Rz = torch.tensor([[-1, 0, 0, 0],
                   [0, -1, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]]).float()

RT = torch.matmul(Rz, RT)

R = RT[:3,:3].reshape(1,3,3)
T = RT[:3,3].reshape(1,3)

f = torch.tensor((f_x, f_y), dtype=torch.float32).unsqueeze(0)
p = torch.tensor((p_x, p_y), dtype=torch.float32).unsqueeze(0)
img_size= torch.tensor((h, w), dtype=torch.float32).unsqueeze(0)

lights = AmbientLights(device=device)

camera = PerspectiveCameras(
    R=R, T=T,
    focal_length=f,
    principal_point=p,
    image_size=((h, w),),
    device=device,
    in_ndc=False)

# Set Renderer Parameters
raster_settings = RasterizationSettings(
    image_size=(h,w), 
    blur_radius=0.0, 
    faces_per_pixel=1,
    max_faces_per_bin = mesh.faces_packed().shape[0],
    perspective_correct = True
)

rasterizer = MeshRasterizer(
    cameras=camera,
    raster_settings=raster_settings
)

renderer = MeshRenderer(
    rasterizer,
    shader=SoftPhongShader(
        device=device, 
        cameras=camera,
        lights=lights,
    )
)

# Generate rendered image
target_images= renderer(mesh, cameras=camera, lights=lights)

img = target_images[0, ..., :3]

imgray = cv2.cvtColor(img.cpu().numpy(), cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(imgray, 1, 255, 0)
bg = None
while bg is None or len(bg.shape) < 3:
    bg_pth = "bg.png"
    bg = cv2.cvtColor(cv2.imread(bg_pth), cv2.COLOR_BGR2RGB)
    if len(bg.shape) < 3:
        bg = None
        continue
    bg_h, bg_w, _ = bg.shape
    if bg_h < h:
        new_w = int(float(h) / bg_h * bg_w)
        bg = cv2.resize(bg, (new_w, h))
    bg_h, bg_w, _ = bg.shape
    if bg_w < w:
        new_h = int(float(w) / bg_w * bg_h)
        bg = cv2.resize(bg, (w, new_h))
    bg_h, bg_w, _ = bg.shape
    if bg_h > h:
        sh = randint(0, bg_h - h)
        bg = bg[sh : sh + h, :, :]
    bg_h, bg_w, _ = bg.shape
    if bg_w > w:
        sw = randint(0, bg_w - w)
        bg = bg[:, sw : sw + w, :]

    bg = cv2.cvtColor(bg, cv2.COLOR_RGB2BGR)

msk_3c = np.repeat(mask[:, :, None], 3, axis=2)

msk_3c = msk_3c>0

rgb = img.cpu().numpy() * msk_3c.astype(bg.dtype) + bg * (msk_3c == 0).astype(bg.dtype)

cv2.imwrite('test.jpg', rgb)
  2. The exact command(s) you ran:
    python test.py

Files Used:

GT/Background image: bg

Mesh: https://drive.google.com/file/d/1kIw9dypvxsJ-xgI573RL6s-ncf7e5L2a/view?usp=sharing

@bottler
Contributor

bottler commented Nov 14, 2021

I think you need to pay close attention to camera conventions in PyTorch3D and the dataset. Issue #18 may be useful.

@nikhilaravi nikhilaravi added the how to How to use PyTorch3D in my project label Nov 15, 2021
@manasburagohain
Author

manasburagohain commented Nov 16, 2021

Hi @bottler and @nikhilaravi. Thanks for the quick response. I do understand that it is a camera convention issue, and I have the following observations regarding that:

  • Both the ground truth for LineMod and PyTorch3D conventions follow the right hand coordinate system.
  • Since the PyTorch3D convention assumes the +ve x-axis points left and the z-axis goes into the screen, I attempted to rotate the ground truth pose by pre-multiplying with a 180° rotation about the y-axis. However, this places the object behind the camera's field of view.
  • I adapted the solution in Issue "How to use SfMPerspectiveCameras for renderer?" #18 to the latest pytorch3d version, but it also does not resolve the misalignment between the ground truth pose and the rendered object.

Any input from your end would be greatly appreciated.

@sailor-z

sailor-z commented Nov 18, 2021

Hi,
May I ask why you divided T by 1000?

I also noticed something strange.
When I was using your code, the rendered result looked good apart from the pose issue.
image
But I got this strange result after slightly changing the code:
image

cameras = PerspectiveCameras(
    focal_length=(focal_length,),
    principal_point=(principal_point,),
    image_size=(image_size,),
    device=device,
    in_ndc=False)

raster_settings = RasterizationSettings(
    image_size=image_size,
    blur_radius=0.0,
    faces_per_pixel=1,
    perspective_correct=True
)

lights = AmbientLights(device=device)

phong_renderer = MeshRendererWithDepth(
    rasterizer=MeshRasterizer(
        cameras=cameras,
        raster_settings=raster_settings
    ),
    shader=SoftPhongShader(device=device, cameras=cameras, lights=lights)#,HardPhongShader
)

image, depth = phong_renderer(meshes_world=meshes[id], R=R, T=T.squeeze(-1)/1000, max_faces_per_bin=meshes[id].faces_packed().shape[0])

I have no idea what the difference is.

@manasburagohain
Author

manasburagohain commented Nov 18, 2021

The division by 1000 is for changing the scale. The vertices for the mesh are defined in meters while the translation vector is defined in millimeters.

The weird rendering is due to overflowing the max_faces_per_bin limit in the rasterization settings. Setting it to the total number of faces in the mesh fixed it for me.

@sailor-z

sailor-z commented Nov 19, 2021

Thanks a lot! It seems like we cannot perform a batched rendering because max_faces_per_bin has to be specified for each mesh independently.

@manasburagohain
Author

@sailor-z Maybe you can set the attribute to be the max faces across the whole batch which would allow it to render correctly.
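For what it's worth, a minimal sketch of that idea (hedged: `max_faces_across_batch` is a hypothetical helper name, and it assumes each element exposes a pytorch3d-style `faces_packed()` returning a tensor of shape `(F, 3)`):

```python
# Sketch: pick the largest face count across the batch so that no single
# mesh overflows the rasterizer's per-bin face budget. Assumes `meshes`
# is an iterable of objects with a faces_packed() method whose result
# has a .shape attribute of the form (num_faces, 3).
def max_faces_across_batch(meshes):
    return max(m.faces_packed().shape[0] for m in meshes)
```

The resulting value can then be passed once as `max_faces_per_bin` when building the `RasterizationSettings` for the whole batch.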

Were you able to figure out the conventions issue with the renderer?

@sailor-z

@manasburagohain When I set the attribute like this

image, depth = phong_renderer(meshes_world=meshes[id], R=R, T=T.squeeze(-1)/1000, max_faces_per_bin=meshes[id].faces_packed().shape[0]),

it doesn't work. I have used the solution in #18 with pytorch3d 0.6.0, but the pose issue still exists.

@wangg12

wangg12 commented Nov 23, 2021

Hey @manasburagohain , I think you can just change one line:

R = RT[:3, :3].t().reshape(1, 3, 3)

This is simply because PyTorch3D treats points as row vectors and post-multiplies them by the rotation (x' = x @ R), whereas OpenCV pre-multiplies column vectors (x' = R @ x). So the rotation and transformation matrices in PyTorch3D are transposed compared with those in OpenCV.
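The transpose relationship can be checked with a small NumPy sketch, independent of the renderer (the rotation here is just an example matrix, not a LineMod pose):

```python
import numpy as np

# OpenCV-style pose: column-vector convention, x_cam = R_cv @ x + t
R_cv = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])   # example: 90 deg rotation about z
t = np.array([0.1, 0.2, 0.3])
x = np.array([1.0, 2.0, 3.0])

x_cam_cv = R_cv @ x + t

# PyTorch3D-style: row-vector convention, x_cam = x @ R_p3d + T,
# which agrees with the OpenCV result exactly when R_p3d = R_cv.T
R_p3d = R_cv.T
x_cam_p3d = x @ R_p3d + t

assert np.allclose(x_cam_cv, x_cam_p3d)
```

Note this only covers the transpose; the additional 180° rotation handling PyTorch3D's +x-left/+y-up screen convention is applied separately (the Rz matrix earlier in this thread).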

Shown below is what I obtained after this change (I additionally changed the background color for rendering as black, my full code is available here https://github.com/wangg12/pytorch3d_render_linemod/blob/master/test.py):
(rendered vs real vs rendered+real)
image

@sailor-z

@wangg12 It works for me. Thanks a lot!

@manasburagohain
Author

@wangg12 Thanks for the help!


5 participants