Food101 new dataset api #5584

Merged 33 commits on Apr 4, 2022

yassineAlouini
Contributor

In this PR, I migrate the Food101 dataset to the new datasets API. For the code, I took some inspiration from the new dataset in dtd.py, particularly for the Demultiplexer usage.

The pre-commit hooks and the food101 tests pass locally; I hope that's also the case in CI.

Any reviews are welcome, many thanks in advance. 😺
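
For readers unfamiliar with the demultiplexing pattern mentioned above, here is a minimal plain-Python sketch of the idea: one stream of archive entries is routed into several streams by a classifier function. It is an illustration only, not the torchdata Demultiplexer implementation and not the code in this PR; the names `demux` and `classify`, the eager list-based buckets, and the example paths are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def demux(
    items: Iterable[T],
    num_instances: int,
    classifier_fn: Callable[[T], Optional[int]],
) -> Tuple[List[T], ...]:
    """Route each item into one of `num_instances` buckets.

    Items for which `classifier_fn` returns None are dropped.
    This eager, list-based version is a simplification; a streaming
    implementation would buffer lazily instead.
    """
    buckets: Tuple[List[T], ...] = tuple([] for _ in range(num_instances))
    for item in items:
        index = classifier_fn(item)
        if index is None:
            continue
        buckets[index].append(item)
    return buckets


# Hypothetical archive layout: split files under meta/, images under images/.
SPLIT_FILES, IMAGES = 0, 1


def classify(path: str) -> Optional[int]:
    if path.startswith("meta/") and path.endswith(".txt"):
        return SPLIT_FILES
    if path.startswith("images/") and path.endswith(".jpg"):
        return IMAGES
    return None  # anything else (e.g. a README) is ignored


paths = [
    "meta/train.txt",
    "images/apple_pie/1005649.jpg",
    "README.txt",
    "meta/test.txt",
]
split_files, images = demux(paths, 2, classify)
```

Returning `None` for unmatched entries keeps the classifier total over all paths, so unexpected files in the archive are skipped rather than raising an error.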

@facebook-github-bot

Hi @yassineAlouini!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

facebook-github-bot commented Mar 10, 2022

💊 CI failures summary and remediations

As of commit facd139 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Collaborator

@pmeier pmeier left a comment

Hey @yassineAlouini, thanks a lot for the PR. I did an initial pass and have some comments inline. In general, this already looks quite good and we only need to iron out some wrinkles.

@yassineAlouini
Contributor Author

Thanks @pmeier for the code review, will try to finish the fixes by Wednesday. I should have the CLA by then as well. 👌

Do you have a new PR to suggest? Should I work on another dataset migration since I am now familiar with the design or is there something more urgent to work on? Thanks for any pointer.

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@yassineAlouini
Contributor Author

@pmeier CLA is uploaded, let me know if I need to finish anything else. Thanks for your help and guidance.

Collaborator

@pmeier pmeier left a comment

Hey @yassineAlouini, I have some suggestions inline. I have yet to review the mock data for the tests. Will do that in the next review round.

@pmeier
Collaborator

pmeier commented Mar 28, 2022

@yassineAlouini I see you are updating the PR from the current main branch. Please let me know if the PR is ready for another round of review.

@yassineAlouini
Contributor Author

@pmeier All is resolved as far as I know. The only part that needs reviewing is the test mock I guess? Let me know if anything else needs more work. 👌

Collaborator

@pmeier pmeier left a comment

One comment about the metadata files. Mock data generation looks good in general. I'll have another look after the comment is resolved, since it affects the generation.

@pmeier
Collaborator

pmeier commented Mar 28, 2022

Also, don't worry about CI at the moment. There is a larger outage.

Collaborator

@pmeier pmeier left a comment

Hey @yassineAlouini. I've added some suggestions and simplifications inline. Note that this doesn't mean that your version didn't work (it did!). Thanks a lot for the continued interest.

@yassineAlouini
Contributor Author

@pmeier Mock data should be good now. Ready for a final check I guess. 👌

Collaborator

@pmeier pmeier left a comment

LGTM, thanks a ton @yassineAlouini!

@pmeier pmeier requested a review from NicolasHug March 28, 2022 16:57
Member

@NicolasHug NicolasHug left a comment

Stamping

@pmeier pmeier merged commit 31e503f into pytorch:main Apr 4, 2022
@yassineAlouini yassineAlouini deleted the food101_new_dataset_api branch April 4, 2022 12:01
facebook-github-bot pushed a commit that referenced this pull request Apr 6, 2022
Summary:
* [FEAT] Start implementing Food101 using the new datasets API. WIP.

* [FEAT] Generate Food101 categories and start the test mock.

* [FEAT] food101 dataset code seems to work now.

* [TEST] food101 mock update.

* [FIX] Some fixes thanks to running food101 tests.

* [FIX] Fix mypy checks for the food101 file.

* [FIX] Remove unused numpy.

* [FIX] Some changes thanks to code review.

* [ENH] More idiomatic dataset code thanks to code review.

* [FIX] Remove unused cast.

* [ENH] Set decompress and extract to True for some performance gains.

* [FEAT] Use the preprocess=decompress keyword.

* [ENH] Use the train.txt and test.txt files instead of the .json variants and simplify code + update mock data.

* [ENH] Better food101 mock data generation.

* [FIX] Remove a useless print.

Reviewed By: NicolasHug

Differential Revision: D35393170

fbshipit-source-id: c7f51bdfb2e05913593cdcba9e30994557afaf87

Co-authored-by: Philip Meier <[email protected]>
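
The switch to the train.txt and test.txt files mentioned in the summary can be illustrated with a small sketch. It assumes each line of a split file has the form `<category>/<image_id>` (for example `apple_pie/1005649`); that line format, the helper name `parse_split_file`, and the sample contents are assumptions made for illustration, not code taken from this PR.

```python
from typing import Iterable, List, Tuple


def parse_split_file(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Turn lines like 'apple_pie/1005649' into (category, image_id) pairs."""
    samples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first slash so the image id is kept whole.
        category, image_id = line.split("/", 1)
        samples.append((category, image_id))
    return samples


# Hypothetical contents of a split file, as read line by line.
sample_lines = ["apple_pie/1005649\n", "\n", "baklava/2016023\n"]
samples = parse_split_file(sample_lines)
```

A line-oriented text file like this can be consumed one entry at a time, whereas a .json variant generally has to be parsed as a whole before any entry is available, which may be one reason the plain-text files simplify the dataset code.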
@pmeier pmeier mentioned this pull request Apr 6, 2022
4 participants