update urls for kinetics dataset #5578

Merged
merged 9 commits into from
Mar 22, 2022
Conversation

sahilg06
Contributor

@sahilg06 sahilg06 commented Mar 9, 2022

Resolves issue #5564

@facebook-github-bot

facebook-github-bot commented Mar 9, 2022

💊 CI failures summary and remediations

As of commit 3d36b88 (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



🚧 3 ongoing upstream failures:

These were probably caused by upstream breakages that are not fixed yet.


Contributor

@bjuncek bjuncek left a comment

Looks good to me - LGTM

Question, should we patch the .csv vs .txt issue as well?

@sahilg06
Contributor Author

Yes, we can. I have the .csv files (which I have uploaded here). Could you upload them somewhere and give me the corresponding URLs? The code needs the URLs, so I can only update it once I have them.

@pmeier pmeier requested a review from NicolasHug March 17, 2022 21:58
@sahilg06
Contributor Author

Now the only thing left is to update the URLs for val.csv and train.csv of the k600 dataset. I have already uploaded the correct files here.
@bjuncek @pmeier Could you please generate URLs for those files and share them with me so that I can update them in the code?

@sahilg06
Contributor Author

sahilg06 commented Mar 18, 2022

Why are these tests failing? I made a very small change in the last commit:
I just added the test option to the valid values of split. All the required URLs work for the 'test' split.

@pmeier
Collaborator

pmeier commented Mar 18, 2022

Tests fail due to #5635. It is unrelated to your PR. You can ignore them. We'll have a final look before merge that your PR doesn't break anything.

@sahilg06
Contributor Author

Thank you @pmeier.
Please also have a look at this comment, so that nothing is missed.

@pmeier
Collaborator

pmeier commented Mar 18, 2022

@bjuncek is handling the PR. I was under the impression everything is done with his approval?

@sahilg06
Contributor Author

sahilg06 commented Mar 18, 2022

Yes, he has approved all the previous changes. He also asked me here whether we can patch the .csv vs .txt issue in this PR as well.
That is only possible if someone provides me with the new URLs for the correct files.

@pmeier pmeier removed the request for review from NicolasHug March 18, 2022 08:53
@pmeier
Collaborator

pmeier commented Mar 18, 2022

I'm out of the loop here. Please wait for @bjuncek's review.

@bjuncek
Contributor

bjuncek commented Mar 18, 2022

Hi @sahilg06

I've just got Joao to merge the new CSVs into the Kinetics S3 bucket, so the old CSVs should now work, i.e. all of the following should now have the same format:

https://s3.amazonaws.com/kinetics/600/annotations/train.csv
https://s3.amazonaws.com/kinetics/600/annotations/val.csv
https://s3.amazonaws.com/kinetics/600/annotations/test.csv
https://s3.amazonaws.com/kinetics/400/annotations/train.csv
https://s3.amazonaws.com/kinetics/400/annotations/val.csv
https://s3.amazonaws.com/kinetics/400/annotations/test.csv

Could you please verify this? If that's OK, no further action w.r.t. the URLs is needed.
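
One way to run the check requested above is to compare the header row of each annotation file. The following is a minimal sketch, not part of the PR: the helper names `csv_header` and `same_format` are hypothetical, the sample contents are made up, and only column names are compared, not row contents. The texts are passed in directly so the comparison can be tried offline; in practice they could be fetched from the URLs above, for example with `urllib.request.urlopen`.

```python
import csv
import io


def csv_header(text):
    """Return the first row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(text))
    # An empty document has no header row, so fall back to an empty list.
    return next(reader, [])


def same_format(texts):
    """Return True if every CSV text in the mapping has the same header row."""
    headers = {tuple(csv_header(body)) for body in texts.values()}
    return len(headers) <= 1


# Made-up sample contents for illustration (NOT the real annotation files).
samples = {
    "train.csv": "label,video_id,start,end\nclass_a,abc,0,10\n",
    "val.csv": "label,video_id,start,end\nclass_b,def,5,15\n",
}
print(same_format(samples))
```

A header comparison will not catch differences further down a file, so it is only a first sanity check.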

@pmeier
Collaborator

pmeier commented Mar 18, 2022

Cross-posting #5564 (comment)

Just to clear some confusion for someone else reading this thread: When @bjuncek talks about "Joao", he doesn't mean @jdsgomes, but rather Joao Carreira operating under @kinetics-cvdf. The links were updated in cvdfoundation/kinetics-dataset@48a523e.

@sahilg06
Contributor Author

sahilg06 commented Mar 18, 2022

Thanks @bjuncek. These URLs are working!
I have changed .txt to .csv in the k600 annotation URL. Now everything seems good.
We can finally merge this PR after a final review.
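
The six URLs confirmed in this thread follow a single pattern, so the annotation URL for a variant and split can be derived from two values. A minimal sketch, assuming that pattern holds: `annotation_url`, `VALID_SPLITS`, and `VALID_NUM_CLASSES` are hypothetical names for illustration and are not torchvision's implementation. Only the 400 and 600 variants appear in this thread, so only those are accepted here.

```python
BASE_URL = "https://s3.amazonaws.com/kinetics"
VALID_SPLITS = ("train", "val", "test")  # "test" is the split added in this PR
VALID_NUM_CLASSES = ("400", "600")       # assumption: only the variants listed in this thread


def annotation_url(num_classes, split):
    """Build the annotation CSV URL for a Kinetics variant and split."""
    if num_classes not in VALID_NUM_CLASSES:
        raise ValueError(
            f"num_classes must be one of {VALID_NUM_CLASSES}, got {num_classes!r}"
        )
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    return f"{BASE_URL}/{num_classes}/annotations/{split}.csv"


print(annotation_url("600", "val"))
```

Validating both arguments before building the URL means a typo in `split` fails immediately instead of surfacing later as a download error.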

Contributor

@bjuncek bjuncek left a comment

LGTM!
Thanks for the work @sahilg06 and sorry for the delays

@bjuncek bjuncek requested a review from NicolasHug March 18, 2022 10:12
@sahilg06
Contributor Author

@NicolasHug Could you please review and merge it as soon as possible? I need to use this class in my work.

@NicolasHug
Member

@sahilg06 when running your initial command

 ds = torchvision.datasets.Kinetics(root='ktest',frames_per_clip=100, num_classes='600', split='val', download=True, num_download_workers=1000, num_workers=2)

I'm getting

Downloading https://s3.amazonaws.com/kinetics/600/val/k600_val_path.txt to ktest/files/k600_val_path.txt
38912it [00:00, 445546.97it/s]                                                                                         
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-2-a444dcfd311f> in <module>
----> 1 ds = torchvision.datasets.Kinetics(root='ktest',frames_per_clip=100, num_classes='600', split='val', download=True, num_download_workers=1000, num_workers=2)

~/dev/vision/torchvision/datasets/kinetics.py in __init__(self, root, frames_per_clip, num_classes, split, frame_rate, step_between_clips, transform, extensions, download, num_download_workers, num_workers, _precomputed_metadata, _video_width, _video_height, _video_min_dimension, _audio_samples, _audio_channels, _legacy)

~/dev/vision/torchvision/datasets/kinetics.py in download_and_process_videos(self)

~/dev/vision/torchvision/datasets/kinetics.py in _download_videos(self)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/context.py in Pool(self, processes, initializer, initargs, maxtasksperchild)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/pool.py in __init__(self, processes, initializer, initargs, maxtasksperchild, context)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/pool.py in _repopulate_pool(self)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/pool.py in _repopulate_pool_static(ctx, Process, processes, pool, inqueue, outqueue, initializer, initargs, maxtasksperchild, wrap_exception)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/process.py in start(self)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/context.py in _Popen(process_obj)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/popen_fork.py in __init__(self, process_obj)

~/.miniconda3/envs/pt/lib/python3.9/multiprocessing/popen_fork.py in _launch(self, process_obj)

OSError: [Errno 24] Too many open files

@sahilg06
Contributor Author

sahilg06 commented Mar 22, 2022

@NicolasHug This error is not due to the changes I made. When I ran the command with the Kinetics class from the main branch, I got the same error locally on macOS.
I think it's caused by the large value of num_download_workers: when I reduced it to 10, it worked perfectly.
So does this error depend on the system being used? I tried 1000 on a different system (an Ubuntu remote server) and it worked perfectly there.
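
`OSError: [Errno 24] Too many open files` is raised when a process reaches its per-process limit on open file descriptors. That limit is a system setting, which would be consistent with the same command behaving differently on two machines. A rough sketch for inspecting the limit and capping the worker count follows. It is Unix-only (the `resource` module is not available on Windows), `capped_workers` is a hypothetical helper, and the per-worker descriptor cost and the reserve are assumed placeholder values, not measurements of `multiprocessing.Pool`.

```python
import resource


def capped_workers(requested, soft_limit, fds_per_worker=2, reserve=64):
    """Cap a requested worker count so it fits within a descriptor budget.

    fds_per_worker and reserve are illustrative assumptions, not measured values.
    """
    if soft_limit == resource.RLIM_INFINITY:
        # No limit configured, so nothing to cap.
        return requested
    # Keep some descriptors back for stdio, downloads, and open files.
    budget = max(soft_limit - reserve, fds_per_worker)
    return max(1, min(requested, budget // fds_per_worker))


# Query the current limits for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
print(f"workers to use: {capped_workers(1000, soft)}")
```

The result could then be passed as `num_download_workers` in place of a fixed large value. Alternatively, the soft limit can be raised in the shell with `ulimit -n` before starting Python.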

@NicolasHug NicolasHug merged commit 375e4ab into pytorch:main Mar 22, 2022
@NicolasHug
Member

Thanks @sahilg06

@github-actions

Hey @NicolasHug!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

Member

@NicolasHug NicolasHug left a comment

Thanks @sahilg06 and @bjuncek for the review

@NicolasHug NicolasHug added the bug label Mar 22, 2022
lezwon pushed a commit to lezwon/vision that referenced this pull request Mar 23, 2022
* update urls for kinetics dataset

* update urls for kinetics dataset

* remove errors

* update the changes and add test option to split

* added test to valid values for split arg

* change .txt to .csv for annotation url of k600

Co-authored-by: Nicolas Hug <[email protected]>
pmeier added a commit that referenced this pull request Mar 25, 2022
* added usps dataset

* fixed type issues

* fix mobilnet norm layer test (#5643)

* xfail mobilnet norm layer test

* fix test

* More robust check in tests for 16 bits images (#5652)

* Prefer nvidia channel for conda builds (#5648)

To mitigate missing `libcupti.so` dependency

* fix torchdata CI installation (#5657)

* update urls for kinetics dataset (#5578)

* update urls for kinetics dataset

* update urls for kinetics dataset

* remove errors

* update the changes and add test option to split

* added test to valid values for split arg

* change .txt to .csv for annotation url of k600

Co-authored-by: Nicolas Hug <[email protected]>

* Port Multi-weight support from prototype to main (#5618)

* Moving basefiles outside of prototype and porting Alexnet, ConvNext, Densenet and EfficientNet.

* Porting googlenet

* Porting inception

* Porting mnasnet

* Porting mobilenetv2

* Porting mobilenetv3

* Porting regnet

* Porting resnet

* Porting shufflenetv2

* Porting squeezenet

* Porting vgg

* Porting vit

* Fix docstrings

* Fixing imports

* Adding missing import

* Fix mobilenet imports

* Fix tests

* Fix prototype tests

* Exclude get_weight from models on test

* Fix init files

* Porting googlenet

* Porting inception

* porting mobilenetv2

* porting mobilenetv3

* porting resnet

* porting shufflenetv2

* Fix test and linter

* Fixing docs.

* Porting Detection models (#5617)

* fix inits

* fix docs

* Port faster_rcnn

* Port fcos

* Port keypoint_rcnn

* Port mask_rcnn

* Port retinanet

* Port ssd

* Port ssdlite

* Fix linter

* Fixing tests

* Fixing tests

* Fixing vgg test

* Porting Optical Flow, Segmentation, Video models (#5619)

* Porting raft

* Porting video resnet

* Porting deeplabv3

* Porting fcn and lraspp

* Fixing the tests and linter

* Porting docs, examples, tutorials and galleries (#5620)

* Fix examples, tutorials and gallery

* Update gallery/plot_optical_flow.py

Co-authored-by: Nicolas Hug <[email protected]>

* Fix import

* Revert hardcoded normalization

* fix uncommitted changes

* Fix bug

* Fix more bugs

* Making resize optional for segmentation

* Fixing preset

* Fix mypy

* Fixing documentation strings

* Fix flake8

* minor refactoring

Co-authored-by: Nicolas Hug <[email protected]>

* Resolve conflict

* Porting model tests (#5622)

* Porting tests

* Remove unnecessary variable

* Fix linter

* Move prototype to extended tests

* Fix download models job

* Update CI on Multiweight branch to use the new weight download approach (#5628)

* port Pad to prototype transforms (#5621)

* port Pad to prototype transforms

* use literal

* Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624)

Co-authored-by: Anton Thomma <[email protected]>
Co-authored-by: Vasilis Vryniotis <[email protected]>

* pre-download model weights in CI docs build (#5625)

* pre-download model weights in CI docs build

* move changes into template

* change docs image

* Regenerated config.yml

Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>

* Porting reference scripts and updating presets (#5629)

* Making _preset.py classes

* Remove support of targets on presets.

* Rewriting the video preset

* Adding tests to check that the bundled transforms are JIT scriptable

* Rename all presets from *Eval to *Inference

* Minor refactoring

* Remove --prototype and --pretrained from reference scripts

* remove  pretained_backbone refs

* Corrections and simplifications

* Fixing bug

* Fixing linter

* Fix flake8

* restore documentation example

* minor fixes

* fix optical flow missing param

* Fixing commands

* Adding weights_backbone support in detection and segmentation

* Updating the commands for InceptionV3

* Setting `weights_backbone` to its fully BC value (#5653)

* Replace default `weights_backbone=None` with its BC values.

* Fixing tests

* Fix linter

* Update docs.

* Update preprocessing on reference scripts.

* Change qat/ptq to their full values.

* Refactoring preprocessing

* Fix video preset

* No initialization on VGG if pretrained

* Fix warning messages for backbone utils.

* Adding star to all preset constructors.

* Fix mypy.

Co-authored-by: Nicolas Hug <[email protected]>
Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>

* Apply suggestions from code review

Co-authored-by: Philip Meier <[email protected]>

* use decompressor for extracting bz2

* Apply suggestions from code review

Co-authored-by: Philip Meier <[email protected]>

* Apply suggestions from code review

Co-authored-by: Philip Meier <[email protected]>

* fixed lint fails

* added tests for USPS

* check image shape

* fix tests

* check shape on image directly

* Apply suggestions from code review

Co-authored-by: Philip Meier <[email protected]>

* removed test and comments

* Update test/test_prototype_builtin_datasets.py

Co-authored-by: Nicolas Hug <[email protected]>

Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: Nicolas Hug <[email protected]>
Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: Sahil Goyal <[email protected]>
Co-authored-by: Vasilis Vryniotis <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>
facebook-github-bot pushed a commit that referenced this pull request Apr 5, 2022
Summary:
* update urls for kinetics dataset

* update urls for kinetics dataset

* remove errors

* update the changes and add test option to split

* added test to valid values for split arg

* change .txt to .csv for annotation url of k600

(Note: this ignores all push blocking failures!)

Reviewed By: datumbox

Differential Revision: D35216772

fbshipit-source-id: 558aad2137bdb6808cbbe863f2d01e7b490fa329

Co-authored-by: Nicolas Hug <[email protected]>
facebook-github-bot pushed a commit that referenced this pull request Apr 5, 2022
Summary:
* added usps dataset

* fixed type issues

* fix mobilnet norm layer test (#5643)

* xfail mobilnet norm layer test

* fix test

* More robust check in tests for 16 bits images (#5652)

* Prefer nvidia channel for conda builds (#5648)

To mitigate missing `libcupti.so` dependency

* fix torchdata CI installation (#5657)

* update urls for kinetics dataset (#5578)

* update urls for kinetics dataset

* update urls for kinetics dataset

* remove errors

* update the changes and add test option to split

* added test to valid values for split arg

* change .txt to .csv for annotation url of k600

* Port Multi-weight support from prototype to main (#5618)

* Moving basefiles outside of prototype and porting Alexnet, ConvNext, Densenet and EfficientNet.

* Porting googlenet

* Porting inception

* Porting mnasnet

* Porting mobilenetv2

* Porting mobilenetv3

* Porting regnet

* Porting resnet

* Porting shufflenetv2

* Porting squeezenet

* Porting vgg

* Porting vit

* Fix docstrings

* Fixing imports

* Adding missing import

* Fix mobilenet imports

* Fix tests

* Fix prototype tests

* Exclude get_weight from models on test

* Fix init files

* Porting googlenet

* Porting inception

* porting mobilenetv2

* porting mobilenetv3

* porting resnet

* porting shufflenetv2

* Fix test and linter

* Fixing docs.

* Porting Detection models (#5617)

* fix inits

* fix docs

* Port faster_rcnn

* Port fcos

* Port keypoint_rcnn

* Port mask_rcnn

* Port retinanet

* Port ssd

* Port ssdlite

* Fix linter

* Fixing tests

* Fixing tests

* Fixing vgg test

* Porting Optical Flow, Segmentation, Video models (#5619)

* Porting raft

* Porting video resnet

* Porting deeplabv3

* Porting fcn and lraspp

* Fixing the tests and linter

* Porting docs, examples, tutorials and galleries (#5620)

* Fix examples, tutorials and gallery

* Update gallery/plot_optical_flow.py

* Fix import

* Revert hardcoded normalization

* fix uncommitted changes

* Fix bug

* Fix more bugs

* Making resize optional for segmentation

* Fixing preset

* Fix mypy

* Fixing documentation strings

* Fix flake8

* minor refactoring

* Resolve conflict

* Porting model tests (#5622)

* Porting tests

* Remove unnecessary variable

* Fix linter

* Move prototype to extended tests

* Fix download models job

* Update CI on Multiweight branch to use the new weight download approach (#5628)

* port Pad to prototype transforms (#5621)

* port Pad to prototype transforms

* use literal

* Bump up LibTorchvision version number for Podspec to release Cocoapods (#5624)

* pre-download model weights in CI docs build (#5625)

* pre-download model weights in CI docs build

* move changes into template

* change docs image

* Regenerated config.yml

* Porting reference scripts and updating presets (#5629)

* Making _preset.py classes

* Remove support of targets on presets.

* Rewriting the video preset

* Adding tests to check that the bundled transforms are JIT scriptable

* Rename all presets from *Eval to *Inference

* Minor refactoring

* Remove --prototype and --pretrained from reference scripts

* remove  pretained_backbone refs

* Corrections and simplifications

* Fixing bug

* Fixing linter

* Fix flake8

* restore documentation example

* minor fixes

* fix optical flow missing param

* Fixing commands

* Adding weights_backbone support in detection and segmentation

* Updating the commands for InceptionV3

* Setting `weights_backbone` to its fully BC value (#5653)

* Replace default `weights_backbone=None` with its BC values.

* Fixing tests

* Fix linter

* Update docs.

* Update preprocessing on reference scripts.

* Change qat/ptq to their full values.

* Refactoring preprocessing

* Fix video preset

* No initialization on VGG if pretrained

* Fix warning messages for backbone utils.

* Adding star to all preset constructors.

* Fix mypy.

* Apply suggestions from code review

* use decompressor for extracting bz2

* Apply suggestions from code review

* Apply suggestions from code review

* fixed lint fails

* added tests for USPS

* check image shape

* fix tests

* check shape on image directly

* Apply suggestions from code review

* removed test and comments

* Update test/test_prototype_builtin_datasets.py

(Note: this ignores all push blocking failures!)

Reviewed By: datumbox

Differential Revision: D35216783

fbshipit-source-id: 556a63a89f15d1541ac2b479244a7b6c564eff14

Co-authored-by: Nicolas Hug <[email protected]>
Co-authored-by: Anton Thomma <[email protected]>
Co-authored-by: Vasilis Vryniotis <[email protected]>
Co-authored-by: Philip Meier <[email protected]>
Co-authored-by: Nikita Shulga <[email protected]>
Co-authored-by: Sahil Goyal <[email protected]>