Skip to content

[AO migration] async rendering#1229

Merged
aclegg3 merged 13 commits into
masterfrom
ao-migration-async-render
Jun 25, 2021
Merged

[AO migration] async rendering#1229
aclegg3 merged 13 commits into
masterfrom
ao-migration-async-render

Conversation

@aclegg3

@aclegg3 aclegg3 commented May 11, 2021

Copy link
Copy Markdown
Contributor

Motivation and Context

This PR introduces async rendering in a background thread allowing overlap of CPU physics simulation and GPU rendering operations resulting in a significant performance improvement. For training scenarios, this does result in delayed observation (the visual observations passed to the model will be one step behind the physical state) which is, however, analogous with sensor latency on physical systems.

Note: this feature is not enabled by default due to instability on some systems and potential for memory leaks if not properly used. See examples/tutorials/async_rendering.py for details.

Thanks @erikwijmans for building the vast majority of this feature.

How Has This Been Tested

Adds a python example with CI testing.

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label May 11, 2021
@aclegg3

aclegg3 commented May 11, 2021

Copy link
Copy Markdown
Contributor Author

@erikwijmans I'd appreciate your input on this PR since you created this feature. 🙂
@bigbike I think this falls into your area of expertise, I'd appreciate your thoughts on fitting this into the new/upcoming sensor rendering paradigm.

@aclegg3 aclegg3 requested a review from eundersander May 11, 2021 00:38

#include <atomic_wait.h>

contended_t contention[256];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about some comments here, and let's avoid unnamed magic numbers like 256 and 255.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, didn't notice this is a dependency from Nvidia. Would be nice to document where exactly we got this (since it's not a submodule). For example, do they version this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a reference implementation for cpp20's atomic wait, it isn't part of a library so that's why it's not a submodule (it isn't intended to be compiled by another project) and it is not versioned (as this is the only version).

@eundersander eundersander left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still consider it an open question whether we should merge this to master. It would be nice if the PR description listed the known caveats/issues with this approach. (@aclegg3 mentioned that it was unsafe to reset the scene or something like that).

Anyway I left some comments.


#include <atomic_wait.h>

contended_t contention[256];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, didn't notice this is a dependency from Nvidia. Would be nice to document where exactly we got this (since it's not a submodule). For example, do they version this?

Comment thread src/esp/bindings/GfxBindings.cpp Outdated
self.drawAsync(visualSensor, sceneGraph, view,
RenderCamera::Flags{flags});
},
R"(Draw given scene using the visual sensor)", "visualSensor"_a,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need better comments for all these new bindings.

Comment thread src/esp/bindings/SimBindings.cpp Outdated
.def("reconfigure", &Simulator::reconfigure, "configuration"_a)
.def("reset", &Simulator::reset)
.def("close", &Simulator::close)
.def("close", &Simulator::close, "destroy"_a = false)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment. What is close versus close-and-destroy?

Comment thread src/esp/gfx/Renderer.cpp Outdated
#include "esp/sensor/VisualSensor.h"
#include "esp/sim/Simulator.h"

#if !defined(CORRADE_TARGET_EMSCRIPTEN)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to see #ifdef ENABLE_ASYNC_RENDERING everywhere instead of !defined(CORRADE_TARGET_EMSCRIPTEN) everywhere. It would be more readable. And we may want to enable/disable this code based on other reasons in the future.

Comment thread src/esp/gfx/Renderer.cpp Outdated

#if !defined(CORRADE_TARGET_EMSCRIPTEN)

struct BackgroundRenderThread {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big struct with lots of code so let's move it to its own file.

Comment thread src/esp/gfx/Renderer.cpp Outdated
Mn::GL::Context::makeCurrent(threadContext_);
threadOwnsContext_ = true;

Mn::GL::Renderer::enable(Mn::GL::Renderer::Feature::DepthTest);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this bit of render-state logic copy-pasted from Renderer::Impl constructor? I'm worried about maintaining this. Can we avoid the code duplication?


bool makeCurrentPlatform() {
if (!windowlessGLContext_.makeCurrent())
Mn::Fatal{} << "Failed to make platform current";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as detecting/throwing fatal errors, we also have ESP_CHECK and CORRADE_ASSERT. This style here is new to me. Is it correct?

@eundersander

Copy link
Copy Markdown
Contributor

Have we tested this on Mac?

@erikwijmans

Copy link
Copy Markdown
Contributor

@eundersander @aclegg3 @bigbike I refactored this and tried to make the API a little more user friendly, i.e. force people to set physics immediately after starting the async draw.

I also made initialization of the background render thread lazy, so that if people never make async render calls, it will never be initialized and everything will work exactly like before.

@erikwijmans erikwijmans requested a review from eundersander June 2, 2021 22:42
@aclegg3

aclegg3 commented Jun 23, 2021

Copy link
Copy Markdown
Contributor Author

@erikwijmans Thanks for the work on this. I think it can land if we add a short python smoke test / tutorial script which can be checked on CI by test_examples.py and show users how to use this feature.

@aclegg3

aclegg3 commented Jun 24, 2021

Copy link
Copy Markdown
Contributor Author

@erikwijmans Thanks for adding the example! Agreed with @eundersander that some docstrings for new functions and python bindings would be ideal, but I think this is ready.

@eundersander

Copy link
Copy Markdown
Contributor

@erikwijmans thanks for adding async_rendering.py. I tried running this on my Mac, Fedora, and Ubuntu machines just out of curiosity. It ran fine on Mac but crashed on both Fedora and Ubuntu. Both machines have Nvidia GPUs. Segfault in glDrawElements in the background renderer for apartment_1.glb. Any ideas?

I'm surprised this passed our CI (which is also Ubuntu, I think). How confident are we that this python actually ran as part of https://app.circleci.com/pipelines/github/facebookresearch/habitat-sim/6961/workflows/26578358-7348-4169-ad02-d840053f266c/jobs/34364 ? Is it possible the CI machine ran it with a single hardware thread or otherwise somehow avoided the crash I hit?

@eundersander eundersander left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aclegg3 aclegg3 merged commit 81043c5 into master Jun 25, 2021
@aclegg3 aclegg3 deleted the ao-migration-async-render branch June 25, 2021 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed Do not delete this pull request or issue due to inactivity.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants