
Add test to compare encoder inference on input with and without padding #1770


Merged: 4 commits into main on Jun 7, 2022

Conversation

@erichan1 (Contributor) commented Jun 7, 2022

Encoder inference should output identical embeddings for a padded vs. non-padded input (assuming the underlying non-padded data is the same). This PR adds a test to check that this is the case.

We make this check for four paths, given by the combinations of the two flags below.

  • with_no_grad = [True, False] -> runs inference with and without torch.no_grad(). When torch.no_grad() is active, we hit the fast path of torch.nn.TransformerEncoder (BetterTransformer).
  • return_all_layers = [True, False] -> when set to False, torch.nn.TransformerEncoder is called directly; when set to True, the torchtext encoder loops over its torch.nn.TransformerEncoderLayer modules one at a time to save the intermediate output of each layer.
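The equivalence the test asserts can be sketched with plain PyTorch (a minimal sketch, not the torchtext test itself; the model sizes, sequence lengths, and tolerance here are assumptions):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny encoder in eval mode so dropout is disabled (sizes are arbitrary).
layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2).eval()

seq = torch.randn(1, 3, 16)                               # non-padded input, length 3
padded = torch.cat([seq, torch.zeros(1, 2, 16)], dim=1)   # same data, padded to length 5
mask = torch.tensor([[False, False, False, True, True]])  # True marks padding positions

with torch.no_grad():  # eval + no_grad enables the fast path
    out_plain = encoder(seq)
    out_padded = encoder(padded, src_key_padding_mask=mask)

# Embeddings at the real (non-padded) positions should match.
assert torch.allclose(out_plain, out_padded[:, :3], atol=1e-4)
```

Padded positions are masked out of attention, and the per-position feed-forward and layer norm never mix positions, so the first three output vectors are unaffected by the trailing padding.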

The review comment below refers to this snippet from the test:

    )
    model = RobertaModel(encoder_conf)
    model = model.eval()
    # TODO: make return_all_layers a property of RobertaEncoderConf so it can be passed as arg
@erichan1 (Contributor, Author) commented Jun 7, 2022
Currently, return_all_layers cannot be set from RobertaEncoderConf because it's not a property of the config. Thus we need to manually set it below. Not a big deal right now, but something to consider changing.
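The workaround described above can be illustrated with hypothetical stand-ins (EncoderConf and Encoder below are simplified placeholders, not torchtext's actual RobertaEncoderConf / RobertaModel):

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-ins: the conf carries no
# return_all_layers field, so the flag has to be flipped on the
# constructed module rather than passed through the config.
@dataclass
class EncoderConf:
    embedding_dim: int = 16
    num_layers: int = 2

class Encoder:
    def __init__(self, conf: EncoderConf):
        self.conf = conf
        # Defaults to False and cannot be set via EncoderConf.
        self.return_all_layers = False

encoder = Encoder(EncoderConf())
encoder.return_all_layers = True  # manual override, as done in the test
```

Promoting the flag to a config field, as the TODO suggests, would remove the need for this post-construction mutation.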

@parmeet (Contributor) left a comment:

Overall, LGTM! Could you please fix the Python lint issue? You can do so by installing pre-commit and running "pre-commit run --all-files" from the command line.

@erichan1 erichan1 merged commit 1e9d731 into main Jun 7, 2022
@erichan1 erichan1 deleted the erichan1/add-padding-input-test branch June 7, 2022 18:54
Nayef211 pushed a commit to Nayef211/text that referenced this pull request Jun 7, 2022
…ng (pytorch#1770)

* add test to compare encoder inference on input with and without padding

* remove bottom whitespace

* remove unnecessary import

* lint
Nayef211 added a commit that referenced this pull request Jun 8, 2022
…ng (#1770) (#1772)

* add test to compare encoder inference on input with and without padding

* remove bottom whitespace

* remove unnecessary import

* lint

Co-authored-by: erichan1 <[email protected]>
3 participants