PCAM dataset download fails #5800

Closed
@benbo

🐛 Describe the bug

PCAM download fails for both the train and test splits with a gzip error. I deleted all the files that were created and tried again from scratch, but the same error persists.

>>> dataset = datasets.PCAM(dpath, split = 'train', download = True)
2245it [00:00, 20206464.55it/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/site-packages/torchvision/datasets/pcam.py", line 92, in __init__
    self._download()
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/site-packages/torchvision/datasets/pcam.py", line 130, in _download
    _decompress(str(self._base_folder / archive_name))
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/site-packages/torchvision/datasets/utils.py", line 372, in _decompress
    wfh.write(rfh.read())
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/gzip.py", line 300, in read
    return self._buffer.read(size)
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/gzip.py", line 487, in read
    if not self._read_gzip_header():
  File "/user/miniconda3/envs/gpu24-2/lib/python3.9/gzip.py", line 435, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<!')
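
The `b'<!'` in the last line is the first two bytes of the downloaded file. A gzip stream always starts with `\x1f\x8b`, so this suggests the server returned something HTML-like (for example a page beginning with `<!DOCTYPE html>`) instead of the archive. A minimal sketch for checking the files left on disk follows; `describe_download` is a hypothetical helper written for illustration, not part of torchvision, and the `"html-like"` label is only a guess based on the leading bytes.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def describe_download(path):
    """Classify a downloaded file by its first two bytes.

    Hypothetical helper for illustration; not a torchvision API.
    """
    with open(path, "rb") as fh:
        head = fh.read(2)
    if head == GZIP_MAGIC:
        return "gzip"
    if head == b"<!":
        # Same bytes that BadGzipFile reported in the traceback above;
        # typical of a file starting with "<!DOCTYPE html>".
        return "html-like"
    return "unknown"
```

Calling `describe_download` on each `.gz` file in the dataset folder should show whether the download step saved real archives or placeholder pages.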

Versions

Collecting environment information...
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Springdale Linux release 8.5 (Modena) (x86_64)
GCC version: (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4)
Clang version: Could not collect
CMake version: version 3.20.2
Libc version: glibc-2.28

Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.18.0-348.2.1.el8_5.x86_64-x86_64-with-glibc2.28
Is CUDA available: True
CUDA runtime version: 11.5.119
GPU models and configuration:
GPU 0: NVIDIA RTX A6000

Nvidia driver version: 495.29.05
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.5.8
[pip3] torch==1.11.0
[pip3] torch-fidelity==0.3.0
[pip3] torchinfo==1.6.3
[pip3] torchmetrics==0.6.2
[pip3] torchvision==0.12.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] mypy-extensions 0.4.3 pypi_0 pypi
[conda] mypy_extensions 0.4.3 py39h06a4308_1
[conda] numpy 1.19.5 pypi_0 pypi
[conda] numpy-base 1.21.2 py39h79a1101_0
[conda] pytorch 1.11.0 py3.9_cuda11.3_cudnn8.2.0_0 pytorch
[conda] pytorch-lightning 1.5.8 pyhd8ed1ab_0 conda-forge
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torch-fidelity 0.3.0 pypi_0 pypi
[conda] torchinfo 1.6.3 pyhd8ed1ab_0 conda-forge
[conda] torchmetrics 0.6.2 pyhd8ed1ab_0 conda-forge
[conda] torchvision 0.12.0 py39_cu113 pytorch

cc @pmeier @YosuaMichael
