OpenCV camera to PyTorch3D PerspectiveCameras #522

Comments
Hi, I figured out the solution myself after getting stuck here for quite some time :) I post my answer below.

First, the OpenCV coordinate system is X-right, Y-down, Z-out, while PyTorch3D's is X-left, Y-up, Z-out. You can notice that we need to flip the X and Y axes. However, instead of what I was doing above (I still did not make that work), one can actually simply pass a negative focal length to `PerspectiveCameras`. Here I provide an example to help you understand:

```python
# Assume we have the following parameters from OpenCV
fx, fy  # focal lengths in the x and y axes
px, py  # principal point in the x and y axes
R, t    # rotation matrix and translation vector

# First, (X, Y, Z) = R @ p_world + t, where p_world is a 3D coordinate in the world system.
# To go from a coordinate (X, Y, Z) in the view system to screen space, the perspective
# camera model applies the following transformation, giving screen coordinates in the
# ranges [0, W-1] and [0, H-1]:
x_screen = fx * X / Z + px
y_screen = fy * Y / Z + py

# In PyTorch3D, we need to build the inputs first in order to define the camera.
# Note that we consider batch size N = 1.
RR = torch.from_numpy(R).permute(1, 0).unsqueeze(0)           # dim = (1, 3, 3)
tt = torch.from_numpy(t).permute(1, 0)                        # dim = (1, 3)
f = torch.tensor((fx, fy), dtype=torch.float32).unsqueeze(0)  # dim = (1, 2)
p = torch.tensor((px, py), dtype=torch.float32).unsqueeze(0)  # dim = (1, 2)
img_size = (W, H)  # (width, height) of the image

# Now we can define the perspective camera model.
# NOTE: you should pass the NEGATIVE focal length as input!
camera = PerspectiveCameras(R=RR, T=tt, focal_length=-f, principal_point=p, image_size=(img_size,))

p_world = torch.tensor([X, Y, Z], dtype=torch.float32)[None, None]  # dim = (1, 1, 3)
out_screen = camera.transform_points_screen(p_world, (img_size,))
```

The proof for the negative focal length. PyTorch3D computes NDC coordinates as

```
x_ndc = (fx * 2 / W) * X / Z - (px - W / 2) * 2 / W
y_ndc = (fy * 2 / H) * Y / Z - (py - H / 2) * 2 / H
```

Then, if you check the NDC-to-screen transform, the screen coordinates are

```
x_screen = (W - 1) / 2 * (1 - x_ndc)
y_screen = (H - 1) / 2 * (1 - y_ndc)
```

Now, if you substitute the NDC expressions into the screen-space formulas:

```
x_screen = (-fx * (W - 1) / W) * X / Z + (W - 1) / W * px
y_screen = (-fy * (H - 1) / H) * Y / Z + (H - 1) / H * py
```

So passing -fx and -fy recovers the OpenCV formula, up to the scale factors (W - 1) / W and (H - 1) / H. Proved. @nikhilaravi I am wondering why the negative focal length is not incorporated directly, so that people would not spend a very long time figuring all this out, like I did. Best,
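The algebra above can also be checked numerically. The sketch below (pure Python, no PyTorch3D; the intrinsics and the view-space point are made up for illustration) compares the OpenCV pinhole projection against the NDC-to-screen chain with a negated focal length:

```python
# Hypothetical intrinsics and a view-space point, chosen only for illustration.
W, H = 640, 480
fx, fy, px, py = 500.0, 520.0, 320.0, 240.0
X, Y, Z = 0.3, -0.2, 2.0

# OpenCV pinhole projection.
x_cv = fx * X / Z + px
y_cv = fy * Y / Z + py

# PyTorch3D-style chain with the NEGATED focal length:
# view -> NDC, then NDC -> screen via (W - 1) / 2 * (1 - x_ndc).
x_ndc = (-fx * 2 / W) * X / Z - (px - W / 2) * 2 / W
y_ndc = (-fy * 2 / H) * Y / Z - (py - H / 2) * 2 / H
x_p3d = (W - 1) / 2 * (1 - x_ndc)
y_p3d = (H - 1) / 2 * (1 - y_ndc)

# The two agree up to the (W-1)/W and (H-1)/H scale factors from the proof.
assert abs(x_p3d - (W - 1) / W * x_cv) < 1e-9
assert abs(y_p3d - (H - 1) / H * y_cv) < 1e-9
```

With a positive focal length the first term would flip sign and the two projections would disagree, which is exactly the mismatch described in the comments below.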
@pengsongyou thank you for providing your detailed solution on this issue to help others. We are considering providing helper functions for converting from different coordinate system conventions to PyTorch3D, as this is a common source of confusion. cc @davnov134 @gkioxari
@nikhilaravi, hi, I would like to ask: do we have this conversion method now?
@pengsongyou Hi, I tried using -f; the rendered result is very close to the OpenCV camera's, but still differs a little. I don't know what could be wrong, any clue?
Solved...
That is strange, because I can input R and t directly. I guess @nikhilaravi could provide some insights here.
@pengsongyou That's just so strange... I just spent hours examining all these parameters (using a synthetic camera R/t and SMPL parameters), including the camera R, t, f, c and the SMPL t. When using -f, it works only when R is np.eye(3). @nikhilaravi I think the camera conversion is really needed, because in many CV fields we really have to use the OpenCV camera :(
Hi, I'm trying to render images using some specific extrinsics, such as the ground truth provided on some public datasets, instead of
The rendering part is as follows:
I got some strange rendered images using both f and -f. Does anyone know how to perform rendering with a specific extrinsic? @nikhilaravi Could you please give me a clue?
Solved, using MengXinChengXuYuan's solution.
Hi everyone! We integrated a function to convert camera descriptions in 75432a0! Now you can just use that function. Good luck with your projects!
@pengsongyou Hi, how do you compute the focal length and principal point from an intrinsic matrix? I directly use the raw OpenCV intrinsics, which does not work.
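For a standard OpenCV intrinsic matrix K = [[fx, s, px], [0, fy, py], [0, 0, 1]], the focal lengths and principal point are read straight off the matrix entries; the raw 3x3 matrix itself cannot be passed to `PerspectiveCameras`. A minimal sketch (the matrix values here are made up):

```python
# Hypothetical 3x3 OpenCV intrinsic matrix (row-major nested lists).
K = [[500.0,   0.0, 320.0],
     [  0.0, 520.0, 240.0],
     [  0.0,   0.0,   1.0]]

fx, fy = K[0][0], K[1][1]  # focal lengths in pixels
px, py = K[0][2], K[1][2]  # principal point in pixels
```

These pixel-unit values are what the earlier example feeds to `PerspectiveCameras` (with the focal length negated).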
Pass
Any chance to add opengl? |
You can easily convert an OpenCV pose matrix to an OpenGL pose matrix by flipping the Y-axis and Z-axis:

```python
c2w = torch.eye(4)    # camera-to-world (i.e., camera pose) in the OpenCV coordinate system
c2w[0:3, 1:3] *= -1   # convert to the OpenGL coordinate system (negate the Y and Z columns)
```
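For readers without torch at hand, the same flip can be written with plain nested lists; it simply negates the second and third columns (the camera's Y and Z axes) of the upper 3x4 part of a camera-to-world matrix:

```python
def opencv_to_opengl_pose(c2w):
    """Negate the Y and Z camera axes (columns 1 and 2) of a 4x4 camera-to-world matrix."""
    out = [row[:] for row in c2w]  # copy each row of the 4x4 nested list
    for i in range(3):             # only the top three rows hold rotation/translation
        out[i][1] = -out[i][1]
        out[i][2] = -out[i][2]
    return out

# Identity pose: the flipped pose gets -1 on the Y and Z diagonal entries.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
pose_gl = opencv_to_opengl_pose(identity)
```

Applying the function twice returns the original pose, since the flip is its own inverse.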
Hello everyone! I am more concerned about how to verify whether the converted camera is correct. The method I am using now is back-projecting a depth map to a point cloud, but I don't know if this is the standard way.
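The back-projection check mentioned above can be sketched in a few lines: unproject a pixel with known depth through the intrinsics, then verify it reprojects to the same pixel. All numbers below are made up for illustration; a real check would do this for the full depth map and compare the resulting point cloud against points projected by the converted camera.

```python
# Hypothetical intrinsics and one pixel with a known depth.
fx, fy, px, py = 500.0, 520.0, 320.0, 240.0
u, v, depth = 400.0, 100.0, 2.0

# Back-project the pixel to a 3D point in the camera frame (OpenCV convention).
X = (u - px) / fx * depth
Y = (v - py) / fy * depth
Z = depth

# Re-project and confirm we land on the same pixel; a converted camera is
# consistent if projecting (X, Y, Z) through it also returns (u, v).
u2 = fx * X / Z + px
v2 = fy * Y / Z + py
assert abs(u2 - u) < 1e-9 and abs(v2 - v) < 1e-9
```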
Dear PyTorch3D team,

First of all, thanks so much for releasing this amazing library!

I have some camera intrinsic and extrinsic parameters from OpenCV, and I am trying to convert them to PyTorch3D PerspectiveCameras. I have been carefully following this amazing page. However, the calculated pixels in the PyTorch3D screen coordinate system are always incorrect. I provide my code snippet below:

In my case, `p_pix_p3d` is always different from the GT pixel `p_pix`, no matter whether I use `T1` or `T2` as the transformation matrix. I am wondering if someone can kindly guide me on this? Thanks so much in advance for the help!

Best,
Songyou