Inaccurate camera centre obtained by Pytorch3d #294
I have solved this problem!
Hi @Eckert-ZJB
Sorry about my poor description. Here is a demo that calculates the camera centre using the two different methods.
@gkioxari @nikhilaravi Firstly, I need to render a mesh from a given camera pose. The information I have includes: the mesh (an obj file with texture), the camera intrinsics, the camera extrinsics (a world2cam matrix, i.e. R and T), and a real photo of the object taken from this viewpoint. I then render the mesh with two different renderers. One is Pyrender (based on OpenGL, used as the reference result); the other is PyTorch3D. Noting the difference between their camera coordinate system definitions, I rotate the world2cam matrix 180 degrees around the z axis before it is sent to the PyTorch3D renderer (a sketch of this step is below). After this, I rendered the mesh with the PyTorch3D renderer.
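A minimal sketch of that rotation step, assuming `world2cam` is a 4x4 column-vector extrinsic matrix (X_cam = R @ X_world + T); the variable names are illustrative, not from the original demo:

```python
import numpy as np

# Rotate the extrinsics 180 degrees around the camera z axis before handing
# them to PyTorch3D. Rot_z(180) negates the x and y axes of the camera frame.
rot_z_180 = np.diag([-1.0, -1.0, 1.0, 1.0])

# Left-multiplying applies the flip in camera coordinates.
world2cam_pytorch3d = rot_z_180 @ world2cam
```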
In fact, I got a wrong rendering result after performing the previous process. But why? I then carefully checked the intermediate steps of the program, and I found there is an approximate transposition step in the sub-function get_world_to_view_transform(R, T) of OpenGLPerspectiveCameras(). (1) Let's look at the first call:
It seems right. However, "this is right" assumes that the returned w2v_trans is right; in other words, w2v_trans should be the transpose of the world2cam matrix. But it isn't! (2) The second call to the sub-function get_world_to_view_transform proceeds like the first. Both calls produce a wrong result because get_world_to_view_transform returns the wrong matrix. Based on this, I have a suggestion for solving the problem above.
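For illustration, a minimal sketch of the layout that get_world_to_view_transform actually returns (toy values, not the original demo):

```python
import torch
from pytorch3d.renderer.cameras import get_world_to_view_transform

# With the identity rotation, the returned 4x4 matrix places T in the bottom
# row, i.e. the row-vector layout [[R, 0], [T, 1]]. It is therefore not the
# transpose of a column-vector world2cam unless R is transposed beforehand.
R = torch.eye(3)[None]                # (1, 3, 3)
T = torch.tensor([[1.0, 2.0, 3.0]])   # (1, 3)
tfm = get_world_to_view_transform(R=R, T=T)
print(tfm.get_matrix())               # bottom row is [1., 2., 3., 1.]
```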
Hi @Eckert-ZJB I think your confusion stems from the fact that PyTorch3D transforms act on row vectors, i.e. points are post-multiplied as X @ M (see pytorch3d/transforms/transform3d.py, lines 114 to 116 at commit b1eee57).
Hence, before sending your R to PyTorch3D, you need to transpose it. As a side note, 3D is a complicated business. When you switch between libraries (in this case OpenGL and PyTorch3D) you need to make sure that the data you transfer are transformed to obey the same world coordinate conventions, and that the transformation matrices assume the same vector convention (row vs column).
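A minimal sketch of that fix, assuming `world2cam` is the 4x4 column-vector extrinsic from the demo above:

```python
import numpy as np
import torch

# PyTorch3D post-multiplies row vectors (X @ M), so the rotation block must
# be transposed before it is passed in; the translation stays unchanged.
R_p3d = torch.tensor(world2cam[:3, :3].T[None]).float()  # (1, 3, 3)
T_p3d = torch.tensor(world2cam[:3, 3][None]).float()     # (1, 3)
```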
Closing this!
Hi @gkioxari @Eckert-ZJB I have reproduced your demo and I believe I've received the required results.

```python
import numpy as np
import torch
from pytorch3d.renderer import PointLights
from pytorch3d.renderer.cameras import get_world_to_view_transform

# Unified camera coordinate system
# (w2c, world2cam, T, device, mesh, renderer and camera are set up earlier
# in the demo; their definitions are not shown here.)
print(w2c)
print('The true camera position is:\n', np.linalg.inv(world2cam)[:3, 3])

R = torch.tensor(np.transpose(world2cam[:3, :3]).reshape(1, 3, 3))

# Calculate the camera position using the PyTorch3D method
world_to_view_transform = get_world_to_view_transform(R=R, T=T)
# camera centre = translation row of the inverse view matrix (row-vector layout)
camera_position_pytorch3d = world_to_view_transform.inverse().get_matrix()[:, 3, :3]
print('The camera position calculated by Pytorch3d:\n', camera_position_pytorch3d)

img_size = (480, 640)
lights = PointLights(device=device, location=[[0.0, 0.0, 5.0]])

# Render the cow mesh from each viewing angle
target_images = renderer(mesh, cameras=camera.cuda(), lights=lights)
```
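Another way to verify the agreement is through PyTorch3D's public get_camera_center accessor; a sketch reusing the same world2cam as above:

```python
import numpy as np
import torch
from pytorch3d.renderer import OpenGLPerspectiveCameras

# With R transposed for the row-vector convention, PyTorch3D's camera centre
# should match the classical inv(world2cam)[:3, 3], i.e. -R.T @ T.
R = torch.tensor(np.transpose(world2cam[:3, :3])[None]).float()
T = torch.tensor(world2cam[:3, 3][None]).float()
cameras = OpenGLPerspectiveCameras(R=R, T=T)
print('PyTorch3D centre:', cameras.get_camera_center())
print('Reference centre:', np.linalg.inv(world2cam)[:3, 3])
```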
@3a1b2c3 This is an old, closed issue. If you want help, especially with a new question, please open a new issue. I don't understand your question; it does look like the difference between the views in the diagram is approximately a rotation around y.
Thanks, is there a place for feature requests?
Just create a new issue asking for the feature :).
I have some files: an obj, camera extrinsics (a world2cam matrix, i.e. R and T), and a photo of the object from the same viewpoint. I tried to render this obj with PyTorch3D, but the rendered result is shifted by a small change in view.
For comparison, I show the original photo, the OpenGL rendering result, and the PyTorch3D rendering result.
The original photo in the same viewpoint:
https://i.loli.net/2020/07/30/ysbBakuDZFrSi9C.png
The OpenGL rendering result:
https://i.loli.net/2020/07/30/Lk7w6SOqiAWyoMF.png
The Pytorch3d rendering result:
https://i.loli.net/2020/07/30/lWmNhfPxIDU3JyG.png
For comparison, I also made a GIF combining the original photo, the OpenGL rendering result, and the PyTorch3D rendering result.
https://s1.ax1x.com/2020/07/30/aM9ZlR.gif
I think maybe there is some difference in the camera centre. I noticed that when PyTorch3D calculates the camera centre, it is obtained by inverting the world2cam_pytorch3d matrix, where:
world2cam_pytorch3d = [[R, 0], [T, 1]], which acts on row vectors: X_cam = X_world @ R + T
While OpenGL calculates the camera centre by inverting the world2cam matrix, where:
world2cam = [[R, T], [0, 1]], which acts on column vectors: X_cam = R @ X_world + T
I output the results of the two calculation methods and found that they are not exactly the same:
https://i.loli.net/2020/07/30/yVYuoPgceEpIfKd.png
So I am confused about this issue: is it correct to transform R and T into the PyTorch3D form in this way?
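For reference, a minimal numerical check of the two conventions (a sketch with synthetic R and T, not the data from the screenshots above):

```python
import numpy as np

# Column-vector (OpenGL-style) extrinsic:  world2cam = [[R, T], [0, 1]]
#   camera centre = inv(world2cam)[:3, 3] = -R.T @ T
# Row-vector (PyTorch3D-style) extrinsic:  [[R', 0], [T, 1]]
#   camera centre = -T @ inv(R')
# The two centres coincide only when R' = R.T, i.e. when R is transposed
# before being handed to PyTorch3D.
rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
T = rng.normal(size=3)

world2cam = np.eye(4)
world2cam[:3, :3], world2cam[:3, 3] = R, T
centre_gl = np.linalg.inv(world2cam)[:3, 3]

centre_p3d_wrong = -T @ np.linalg.inv(R)     # R passed as-is
centre_p3d_right = -T @ np.linalg.inv(R.T)   # R transposed first

print(np.allclose(centre_gl, centre_p3d_right))  # True
print(np.allclose(centre_gl, centre_p3d_wrong))  # False in general
```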